432 Commits

Author SHA1 Message Date
Michal Wozniak
c12bcf4e94 Refactor enactJobFinished util function for Job controller 2024-06-20 13:02:54 +02:00
Kubernetes Prow Robot
bd88faee8b
Merge pull request #125520 from mimowo/cleanup-success-policy-check
Remove redundant check in Job success policy code
2024-06-18 12:40:09 -07:00
Michal Wozniak
7a3d73d234 Remove redundant check in Job success policy code 2024-06-14 19:59:40 +02:00
Michal Wozniak
18a14bcff9 Remove unused parameter in Job controller function 2024-06-14 19:43:19 +02:00
Yuki Iwai
be3316e2e1 Job: Fix a bug that the SuccessCriteriaMet condition is added to the Job with successPolicy even if the JobSuccessPolicy featureGate is disabled
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-06-12 00:36:36 +09:00
Kubernetes Prow Robot
c49b140c45
Merge pull request #125175 from dejanzele/feat/count-terminating-for-failed-jobs
Count terminating pods when deleting active pods for failed jobs
2024-06-10 16:56:37 -07:00
Kubernetes Prow Robot
cfd949e321
Merge pull request #124942 from AxeZhan/getFinishTimeFromContainers
[Sidecar Containers] Sidecar containers finish time needs to be accounted for in job controller
2024-06-06 12:23:05 -07:00
Dejan Pejchev
01536f5a84
add additional tests to make sure job controller logic is correct when counting terminating pods with enabled and disabled PodReplacementPolicy feature 2024-06-06 11:40:54 +02:00
AxeZhan
d97282052e check sidecar featuregate in getFinishedTime 2024-06-06 15:46:41 +08:00
Dejan Pejchev
7dd2948620
count terminating pods when deleting active pods for failed jobs 2024-06-04 11:31:00 +02:00
Tomas Tormo
ce56b2ca58 Remove JobReadyPods feature flag 2024-05-27 13:09:52 +00:00
AxeZhan
3a2a500182 check restartpolicy in getFinishTimeFromContainers 2024-05-21 11:58:06 +08:00
AxeZhan
e4348a1210 consider sidecar containers in getFinishTimeFromContainers 2024-05-20 18:31:23 +08:00
Michal Wozniak
a6c9d5ba00 Do not remove Job's finalizer from Pod owned by a non-batch/v1 Job 2024-05-14 17:29:23 +02:00
Kubernetes Prow Robot
d2e6c51b05
Merge pull request #123537 from kaisoz/commonize-job-util-functions
Add the util pkg to commonize job util functions
2024-05-07 16:59:28 -07:00
Tomas Tormo
c856c412b9 Add util pkg to commonize job util functions 2024-05-07 09:27:46 +00:00
Alvaro Aleman
6d0ac8c561 Use the generic/typed workqueue throughout
This change makes us use the generic workqueue throughout the project in
order to improve type safety and readability of the code.
2024-05-04 14:33:12 -04:00
Marek Siarkowicz
3ee8178768 Cleanup defer from SetFeatureGateDuringTest function call 2024-04-24 20:25:29 +02:00
Hiroki Takatsuka
84ffc2fc3d
fix(job_controller): add delay duration to log message when enqueueing job 2024-04-07 16:42:48 +09:00
Yuki Iwai
f2508df279 Job: Use the fake clock in TestTrackJobStatusAndRemoveFinalizers
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-03-09 06:09:05 +09:00
Yuki Iwai
e216742672 Job: Support for the JobSuccessPolicy (alpha)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-03-08 05:49:09 +09:00
Michał Woźniak
e568a77a93
Support for the Job managedBy field (alpha) (#123273)
* support for the managed-by label in Job

* Use managedBy field instead of managed-by label

* Additional review remarks

* Review remarks 2

* review remarks 3

* Skip cleanup of finalizers for job with custom managedBy

* Drop the performance optimization

* improve logs
2024-03-05 09:25:15 -08:00
Kubernetes Prow Robot
df366107d1
Merge pull request #123529 from thockin/go-workspaces
Go workspaces for k/k and k/staging/*
2024-03-01 08:43:03 -08:00
Tim Hockin
6dbc754ed6
Retool typecheck to be simpler
Instead of walking paths ourselves, just let Go's packages library do
it.  This is a slight CLI change - it wants "./foo" rather than "foo".

This also flagged a few things which seem to be legit failures.
2024-02-29 22:07:00 -08:00
Mengjiao Liu
b584b87a94 kube-controller-manager: readjust log verbosity
- Increase the global level for broadcaster's logging to 3 so that users can ignore event messages by lowering the logging level. It reduces information noise.
- Making sure the context is properly injected into the broadcaster, this will allow the -v flag value to be used also in that broadcaster, rather than the above global value.
- test: use cancellation from ktesting
- golangci-hints: checked error return value
2024-02-26 14:51:56 +08:00
Michal Wozniak
f84d643c20 Use the Defer for pod replacement policy 2024-02-15 17:37:31 +01:00
Michal Wozniak
115dc90633 Increase accuracy of the pods_creation_total metric and improve test exec time 2024-02-15 10:59:01 +01:00
Kubernetes Prow Robot
d687dc4772
Merge pull request #122261 from mimowo/unit-tests-for-job-controller-fix
Add unit test for Job Controller for panic when PodFailurePolicy is used on 1.28
2023-12-14 07:27:48 +01:00
Kubernetes Prow Robot
17823e00d1
Merge pull request #121935 from tenzen-y/job-use-builtin-integer
Job: Use built-in min function instead of integer package
2023-12-14 02:42:39 +01:00
Dejan Zele Pejchev
31710a799d
fix: Refactor TestFinalizerCleanup unit test by asserting job and pod informer state (#121856)
* fix: change informer resync period in TestFinalizerCleanup unit test

* refactor TestFinalizerCleanup to avoid flakiness by asserting job and pod informer state

* minor cleanups in TestFinalizerCleanup test in job_controller_test.go

another minor cleanup in TestFinalizerCleanup test in job_controller_test.go

* reduce poll time to 10ms in TestFinalizerCleanup in job_controller_test.go when waiting for podStore cache to get updated

* remove a whitespace from TestFinalizerCleanup to keep diff smaller
2023-12-13 23:55:28 +01:00
Michal Wozniak
34bc590418 Add unit test for Job Controller for panic when PodFailurePolicy is used on 1.28 2023-12-11 11:08:46 +01:00
Yuki Iwai
a85f587984 Job: Use built-in min function instead of integer package
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-11-17 14:10:00 +09:00
Kevin Hannon
2645b22003
Self nominate Kevin Hannon for reviewer for job controller
I led the PodReplacementPolicy KEP for alpha and helped review/fix some issues in beta.

https://github.com/kubernetes/kubernetes/pulls?q=+is%3Apr+reviewed-by%3Akannon92+label%3Asig%2Fapps+

I have also been an active reviewer and helped GA job tracking last release. I hope to continue reviewing Job-related code.
2023-11-07 13:21:02 -05:00
Dejan Pejchev
e98c33bfaf
switch feature flag to beta for pod replacement policy and add e2e test
update pod replacement policy feature flag comment and refactor the e2e test for pod replacement policy

minor fixes for pod replacement policy and e2e test

fix wrong assertions for pod replacement policy e2e test

more fixes to pod replacement policy e2e test

refactor PodReplacementPolicy e2e test to use finalizers

fix unit tests when pod replacement policy feature flag is promoted to beta

fix podgc controller unit tests when pod replacement feature is enabled

fix lint issue in pod replacement policy e2e test

assert no error in defer function for removing finalizer in pod replacement policy e2e test

implement test using a sh trap for pod replacement policy

reduce sleep after SIGTERM in pod replacement policy e2e test to 5s
2023-10-26 21:50:37 +02:00
Dejan Pejchev
88c0a8be1b
feat: add job_pods_creation_total metric 2023-10-24 17:49:04 +02:00
Dejan Zele Pejchev
f8a4e343a1
Fix tracking of terminating Pods when nothing else changes (#121342)
* cleanup: refactor pod replacement policy integration test into staged assertion

* cleanup: remove typo in job_test.go

* refactor PodReplacementPolicy test and remove test for defaulting the policy

* fix issue with missing update in job controller for terminating status and refactor pod replacement policy integration test

* use t.Cleanup instead of defer in PodReplacementPolicy integration tests

* revert t.Cleanup to defer for resetting feature flag in PodReplacementPolicy integration tests
2023-10-24 15:04:46 +02:00
Kubernetes Prow Robot
8149ab3f3f
Merge pull request #121356 from mimowo/backoff-limit-per-index-beta
Graduate BackoffLimitPerIndex to Beta
2023-10-23 18:39:58 +02:00
Kubernetes Prow Robot
1fc3d10f7e
Merge pull request #121292 from mimowo/backoff-limit-per-index-metrics
Introduce the job_finished_indexes_total metric
2023-10-20 23:50:57 +02:00
Anton Stuchinskii
34294cd67f locking feature-gate for ready pods job status 2023-10-20 16:08:54 +02:00
Michal Wozniak
b0d04d933b Introduce the job_finished_indexes_total metric 2023-10-20 15:19:04 +02:00
Michal Wozniak
6dd0ad5c0f Graduate BackoffLimitPerIndex to Beta 2023-10-19 12:18:36 +02:00
Yuki Iwai
d7556769e7 Job: Replace deprecated wait functions with supported one
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-19 00:14:35 +09:00
Kubernetes Prow Robot
6d70013af5
Merge pull request #121147 from kannon92/rm-at-least-no-terminating-count
Remove terminating count from rmAtLeast
2023-10-18 00:44:51 +02:00
Kubernetes Prow Robot
27ff547a14
Merge pull request #121011 from kannon92/job-pod-replacement-policy-feature-on-but-api-specified
Fix panic when enablement of pod replacement policy is skewed
2023-10-17 21:28:48 +02:00
Yuki Iwai
201c30fba8
Job: Handle error returned from AddEventHandler function (#119917)
* Job: Handle error returned from AddEventHandler function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Use an error message similar to CronJob's

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Clean up error messages

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.T on the second place in the args for the newControllerFromClient function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.T on the second place in the args for the newControllerFromClientWithClock function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Call t.Helper()

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the createJobControllerWithSharedInformers function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the startJobControllerAndWaitForCaches function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Adapt TestFinalizerCleanup to the eventhandler error

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

---------

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-17 21:28:34 +02:00
Kevin Hannon
7a1ac18bc8 Fix panic if there are more terminating pods than active pods
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-10-17 14:50:38 -04:00
Kevin Hannon
d7ee6b9d1b fix possible panic if pod replacement policy is turned on and jobs do not set pod replacement policy 2023-10-11 08:37:50 -04:00
Kevin Hannon
b96a074bcd convert pointer to ptr for job controller 2023-10-05 09:30:01 -04:00
Kevin Hannon
a62eb45ae2 Rename job reasons to JobReasons as part of api review 2023-09-19 13:10:22 -04:00
Kevin Hannon
c6e9fba79b move reasons to api package for job controller 2023-09-14 13:24:29 -04:00