90 Commits

Author SHA1 Message Date
Michal Wozniak
168e016947 Benchmark job with backoff limit per index 2023-10-31 17:35:39 +01:00
Dejan Pejchev
e98c33bfaf
switch feature flag to beta for pod replacement policy and add e2e test
update pod replacement policy feature flag comment and refactor the e2e test for pod replacement policy

minor fixes for pod replacement policy and e2e test

fix wrong assertions for pod replacement policy e2e test

more fixes to pod replacement policy e2e test

refactor PodReplacementPolicy e2e test to use finalizers

fix unit tests when pod replacement policy feature flag is promoted to beta

fix podgc controller unit tests when pod replacement feature is enabled

fix lint issue in pod replacement policy e2e test

assert no error in defer function for removing finalizer in pod replacement policy e2e test

implement test using a sh trap for pod replacement policy

reduce sleep after SIGTERM in pod replacement policy e2e test to 5s
2023-10-26 21:50:37 +02:00
Kubernetes Prow Robot
6fed03ea91
Merge pull request #121408 from alculquicondor/merge-job-metric-tests
Remove independent tests for job metrics
2023-10-25 19:02:50 +02:00
Kubernetes Prow Robot
6817e6a7cc
Merge pull request #119912 from kannon92/pod-replacement-policy-integration-tests
Add a missing integration test for PodReplacementPolicy
2023-10-25 02:09:49 +02:00
kannon92
aeceec72bb add integration tests 2023-10-24 17:09:40 -04:00
Aldo Culquicondor
97e72d792c
Remove independent tests for metrics
Change-Id: Ibefebf95df47c68e6752e85c61fface9f06cbd38
2023-10-24 16:29:08 -04:00
Dejan Pejchev
9e2821d585
revert changes to TestMetricsOnSuccesses for job pods creation total metric 2023-10-24 19:41:14 +02:00
Dejan Pejchev
88c0a8be1b
feat: add job_pods_creation_total metric 2023-10-24 17:49:04 +02:00
Dejan Zele Pejchev
f8a4e343a1
Fix tracking of terminating Pods when nothing else changes (#121342)
* cleanup: refactor pod replacement policy integration test into staged assertion

* cleanup: remove typo in job_test.go

* refactor PodReplacementPolicy test and remove test for defaulting the policy

* fix issue with missing update in job controller for terminating status and refactor pod replacement policy integration test

* use t.Cleanup instead of defer in PodReplacementPolicy integration tests

* revert t.Cleanup to defer for resetting feature flag in PodReplacementPolicy integration tests
2023-10-24 15:04:46 +02:00
Kubernetes Prow Robot
1fc3d10f7e
Merge pull request #121292 from mimowo/backoff-limit-per-index-metrics
Introduce the job_finished_indexes_total metric
2023-10-20 23:50:57 +02:00
Anton Stuchinskii
34294cd67f locking feature-gate for ready pods job status 2023-10-20 16:08:54 +02:00
Michal Wozniak
b0d04d933b Introduce the job_finished_indexes_total metric 2023-10-20 15:19:04 +02:00
Kevin Hannon
1a41ed394d convert pointer to ptr for sig-apps integration tests 2023-10-19 10:35:38 -04:00
Yuki Iwai
d7556769e7 Job: Replace deprecated wait functions with supported one
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-19 00:14:35 +09:00
Kubernetes Prow Robot
613f75926e
Merge pull request #121274 from dejanzele/fix/pod-replacement-policy-int-tests
cleanup: improve assertions for Failed PodReplacementPolicy integration test cases
2023-10-17 23:28:48 +02:00
Yuki Iwai
201c30fba8
Job: Handle error returned from AddEventHandler function (#119917)
* Job: Handle error returned from AddEventHandler function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Use an error message similar to CronJob's

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Clean up error messages

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.T on the second place in the args for the newControllerFromClient function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.T on the second place in the args for the newControllerFromClientWithClock function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Call t.Helper()

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the createJobControllerWithSharedInformers function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the startJobControllerAndWaitForCaches function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Adapt TestFinalizerCleanup to the eventhandler error

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

---------

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-17 21:28:34 +02:00
Dejan Pejchev
fad4430f9e
cleanup: remove redundant logic in PodReplacementPolicy integration tests 2023-10-17 20:30:54 +02:00
Dejan Pejchev
2ccf7e8e49
fix: minor lint issues and redundant check 2023-10-17 20:07:20 +02:00
Dejan Pejchev
056b25dfca
fix: improve PodReplacementPolicy integration test case names and update deprecated methods 2023-10-17 19:18:58 +02:00
Dejan Pejchev
e73edf7764
fix: typo in indexed & non-indexed completion policies for failed pod replacement policy integration tests 2023-10-17 18:15:55 +02:00
Dejan Pejchev
bcf1c113f4
cleanup: add new test cases for failed pod replacement policy instead of editing existing ones 2023-10-17 18:04:25 +02:00
Dejan Pejchev
f2b723a130
fix: improve assertion for Failed PodReplacementPolicy integration test cases 2023-10-16 21:16:17 +02:00
kannon92
74fcf3e766 implementation of PodReplacementPolicy kep in the job controller 2023-07-21 00:44:53 +00:00
Michał Woźniak
a15c27661e
Job controller implementation of backoff limit per index (#118009) 2023-07-18 13:44:11 -07:00
Aldo Culquicondor
f7a1fb76f4
Only declare job as finished after removing all finalizers
Change-Id: Id4b01b0e6fabe24134e57e687356e0fc613cead4
2023-07-07 14:08:19 -04:00
Patrick Ohly
dfd646e0a8 scheduler_perf: fix namespace deletion
Merely deleting the namespace is not enough:
- Workloads might rely on the garbage collector to get rid of obsolete objects,
  so we should run it to be on the safe side.
- Pods must be force-deleted because kubelet is not running.
- Finally, the namespace controller is needed to get rid of
  deleted namespaces.
2023-06-28 09:22:25 +02:00
Kubernetes Prow Robot
162034db85
Merge pull request #118744 from mimowo/job-it-tests-small-default-backoff
Set small DefaultJobPodFailureBackOff in Job integration tests
2023-06-19 08:50:22 -07:00
Michal Wozniak
3dd1bac4dc Set small DefaultJobPodFailureBackOff in Job integration tests 2023-06-19 16:52:38 +02:00
Michal Wozniak
2596245f5a Replace deprecated sets.Int with sets.Set[int] in Job integration tests 2023-06-19 13:55:54 +02:00
Michal Wozniak
74c5ff97f1 Lower the constants for the rate limiter in Job controller 2023-06-16 17:00:04 +02:00
Ziqi Zhao
7bc449d7e0 add contextual logging to job-controller
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-06-14 13:40:02 +08:00
Jongwoo Han
1dec97436c
Fix typo at job_test.go
Signed-off-by: jongwooo <jongwooo.han@gmail.com>
2023-04-09 01:47:42 +09:00
Sathyanarayanan Saravanamuthu
c84c8add70
Decouple batch/job back-off logic from workqueues (#114768)
* batch/job: decouple backoff from workqueue

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

* Resolving review comments

* Resolving more review comments

* Resolving review comments

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

* Computing finish time to now when FinishedAt is unix epoch

* Addressing review comments

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>

---------

Signed-off-by: Sathyanarayanan Saravanamuthu <sathyanarays@vmware.com>
2023-03-16 10:15:21 -07:00
Kubernetes Prow Robot
cb00077cd3
Merge pull request #113471 from ncdc/gc-contextual-logging
garbagecollector: use contextual logging
2023-03-10 04:34:39 -08:00
Andy Goldstein
26e3dab78b garbagecollector: use contextual logging
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
2023-03-08 08:37:56 -05:00
ahg-g
2ecd24011a Graduate JobMutableNodeSchedulingDirectives feature to GA 2023-02-28 15:47:13 +00:00
Yuan Chen
a24aef6510 Replace a function closure
Replace more closures with pointer conversion

Replace deprecated Int32Ptr to Int32
2023-02-27 09:13:36 -08:00
Daniel Vega-Myhre
c63f448451 change test names and address other comments 2023-02-23 03:25:17 +00:00
Daniel Vega-Myhre
b0b0959b92 address comments 2023-02-23 03:25:16 +00:00
Daniel Vega-Myhre
d41302312e update validation logic so completions is mutable iff completions is modified in tandem with parallelism so that completions == parallelism 2023-02-23 03:25:16 +00:00
kannon92
6dfaeff33c Remove Legacy Job Tracking 2023-01-10 14:52:54 +00:00
Aldo Culquicondor
61fe6114b3
Reduce load of Job integration test
Change-Id: If99856aa6640375a8a9feff13fa213d4f974a99a
2022-12-02 12:58:28 -05:00
Michal Wozniak
c803892bd8 Enable the feature into beta 2022-11-09 09:02:40 +01:00
Aldo Culquicondor
4948918155
Graduate JobTrackingWithFinalizers to stable
Change-Id: Ifc749a85b1270c0155ac511b91d4681d53236820
2022-11-04 17:05:53 -04:00
Aldo Culquicondor
5e03865f65
Add benchmark for large indexed job
Change-Id: I556f0cce5842699c98654cfb5a66e7c8d63b2e2e
2022-11-02 11:56:26 -04:00
Michał Woźniak
3628532311
Extend metrics with the new labels (#113324)
* Extend job metrics

* Refactor TestMetrics to extract its checks into dedicated tests per feature
2022-10-31 08:50:45 -07:00
Aldo Culquicondor
12d308f5c4 Add metric for terminated pods with tracking finalizer
Change-Id: I26f3169588c30ed82250cb7baff8e277f8d13bb7
2022-10-20 11:35:20 -04:00
Aldo Culquicondor
b8bd168180 Simplify tests for job metrics by resetting them
Change-Id: I20a0acbbb179bf895953b9d7af72625a2191b8eb
2022-10-19 13:52:00 -04:00
Kubernetes Prow Robot
bf14677914
Merge pull request #112546 from oscr/the-the
grammar: replace all occurrences of "the the" with "the"
2022-10-19 10:03:02 -07:00
Oscar Utbult
e4f776f230 grammar: replace all occurrences of "the the" with "the" 2022-10-14 09:03:14 +02:00