Commit Graph

6449 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
755b4e2bc4
Merge pull request #121576 from huww98/opti-aswp
ad controller: optimize populateActualStateOfWorld
2023-12-19 17:43:45 +01:00
Kubernetes Prow Robot
2b5c0c281d
Merge pull request #122310 from weilaaa/use_buildin_max_min_instead
use build-in max and min func to instead of k8s.io/utils/integer funcs
2023-12-18 19:25:34 +01:00
weilaaa
eb8f3f194f use build-in max and min func to instead of k8s.io/utils/integer funcs 2023-12-15 15:09:11 +08:00
Kubernetes Prow Robot
d687dc4772
Merge pull request #122261 from mimowo/unit-tests-for-job-controller-fix
Add unit test for Job Controller for panic when PodFailurePolicy is used on 1.28
2023-12-14 07:27:48 +01:00
Kubernetes Prow Robot
734fdc775c
Merge pull request #122143 from ii2day/typo_registerCidrsetMetrics
Fix typos: update registerCidrsetMetrics to registerCIDRSetMetrics
2023-12-14 06:18:43 +01:00
Kubernetes Prow Robot
17823e00d1
Merge pull request #121935 from tenzen-y/job-use-builtin-integer
Job: Use built-in min function instead of integer package
2023-12-14 02:42:39 +01:00
Dejan Zele Pejchev
31710a799d
fix: Refactor TestFinalizerCleanup unit test by asserting job and pod informer state (#121856)
* fix: change informer resync period in TestFinalizerCleanup unit test

* refactor TestFinalizerCleanup to avoid flakiness by asserting job and pod informer state

* minor cleanups in TestFinalizerCleanup test in job_controller_test.go

another minor cleanup in TestFinalizerCleanup test in job_controller_test.go

* reduce poll time to 10ms in TestFinalizerCleanup in job_controller_test.go when waiting for podStore cache to get updated

* remove a whitespace from TestFinalizerCleanup to keep diff smaller
2023-12-13 23:55:28 +01:00
Kubernetes Prow Robot
c24e8ee403
Merge pull request #121697 from carlory/clean-111
add comments for switch case of syncUnboundClaim
2023-12-13 22:35:00 +01:00
Kubernetes Prow Robot
89c314ed8b
Merge pull request #121669 from xigang/daemonsetcontroller_eventhandler
node labels and taints do not change, node events shoule be ignored in daemonset controller.
2023-12-13 22:34:42 +01:00
Kubernetes Prow Robot
2f0aca1fc2
Merge pull request #121663 from carlory/clean-010
remove EventRecorder from ControllerParameters of pv base controller
2023-12-13 22:34:33 +01:00
Daniel Henkel
2e388dd7e9
keep existing PDB conditions when updating status
When the disruption controller updates the PDB status, it removes all conditions from the new status object and then re-adds the sufficient pods condition. Unfortunately, this behavior removes conditions set by other controllers, leading to multiple consecutive updates.
Therefore, this commit ensures that conditions are preserved during updates.
2023-12-13 19:57:42 +01:00
Michal Wozniak
34bc590418 Add unit test for Job Controller for panic when PodFailurePolicy is used on 1.28 2023-12-11 11:08:46 +01:00
ii2day
4b1408c2e7 Fix typos: update registerCidrsetMetrics to registerCIDRSetMetrics
Signed-off-by: ii2day <ji.li@daocloud.io>
2023-12-04 10:09:52 +08:00
huweiwen
a3e27aa1ad ad controller: optimize populateActualStateOfWorld
Reduce overall complexity from O(n^2) to O(n)
Run time of the new benchmark is reduced from hours to 13s
2023-11-17 16:48:30 +08:00
huweiwen
854347ac81 ad controller populator: check for inUse sync in test 2023-11-17 16:48:30 +08:00
Yuki Iwai
a85f587984 Job: Use built-in min function instead of integer package
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-11-17 14:10:00 +09:00
carlory
4a4940694f remove stale comments 2023-11-09 11:58:50 +08:00
carlory
ae24846c48 add comments for switch of syncUnboundClaim 2023-11-08 17:15:27 +08:00
Kevin Hannon
2645b22003
Self nominate Kevin Hannon for reviewer for job controller
I have been lead the PodReplacementPolicy KEP for alpha and I helped review/fix some issues in beta.  

https://github.com/kubernetes/kubernetes/pulls?q=+is%3Apr+reviewed-by%3Akannon92+label%3Asig%2Fapps+

I have also been an active reviewer and helped GA job tracking last release.  I hope to continue reviewing Job related code.
2023-11-07 13:21:02 -05:00
carlory
957e9a7f1a adc remove redundant check 2023-11-03 16:17:39 +08:00
xigang
6b3476b79f node labels and taints do not change, node events are ignored in daemonset controller
Signed-off-by: xigang <wangxigang2014@gmail.com>
2023-11-02 17:53:44 +08:00
carlory
58236aa3eb remove EventRecorder from ControllerParameters of pv base controller 2023-11-01 17:54:00 +08:00
Antonio Ojea
3b69bd6a9b servicecidrs controller clarify condition false reevaluation
Change-Id: I0eb8d39abe9b7b0ce6472ff426e9a62e7155aae1
2023-10-31 21:05:58 +00:00
Antonio Ojea
3edcce52e3 service cidr controller manager: use new ServiceCIDR API 2023-10-31 21:05:50 +00:00
Antonio Ojea
599597ca65 fix race on ServiceCIDR deletion
When a ServiceCIDR is deleted, the service CIDR controller on the
controller manager verifies that is safe to be deleted before removing
the finalizer, howerver, since the information of deletion takes time to
propragate, there can be a race where the apiserver allocators didn't
receive the information of deletion and assign an IP address that will
be orphan.

To avoid this race, the service cidr controller waits a grace period
before removing the finalizer to ensure the allocators do not assign any
new IP Address from that range before is completely deleted.

Change-Id: Ib34d32c0bdde91c6e84f1d056db9374589b25c0b
2023-10-31 21:05:06 +00:00
Antonio Ojea
4ff80864e1 service cidr controller manager
Controls the lifecycle of the ServiceCIDRs adding finalizers and
setting the Ready condition in status when they are created, and
removing the finalizers once it is safe to remove (no orphan IPAddresses)

An IPAddress is orphan if there are no ServiceCIDR containing it.

Change-Id: Icbe31e1ed8525fa04df3b741c8a817e5f2a49e80
2023-10-31 21:05:05 +00:00
James Munnelly
76463e21d4 KEP-4193: bound service account token improvements 2023-10-30 21:15:10 +00:00
Kubernetes Prow Robot
05765a851c
Merge pull request #121389 from aleksandra-malinowska/sts-restart-always
Resubmit "Make StatefulSet restart pods with phase Succeeded"
2023-10-30 21:11:51 +01:00
Kubernetes Prow Robot
e4212878dd
Merge pull request #119208 from atosatto/separate-taint-manager
Decouple TaintManager from NodeLifeCycleController (KEP-3902)
2023-10-30 21:11:33 +01:00
Kubernetes Prow Robot
ceea5fd0cb
Merge pull request #119109 from jiahuif-forks/feature/validating-admission-policy/crd-typechecking
ValidatingAdmissionPolicy - Type Checking for API Expensions types
2023-10-30 21:11:19 +01:00
Andrea Tosatto
ccda2d6fd4 kube-controller-manager: Decouple TaintManager from NodeLifeCycleController (KEP-3902) 2023-10-30 12:23:56 +00:00
carlory
5a20ff1617 fix wrong controller name for ephemeralController 2023-10-30 18:45:13 +08:00
Kubernetes Prow Robot
74098ab5ad
Merge pull request #119500 from JackTroy/fix-threshold-arg
Add explanation for large-cluster-size-threshold arg
2023-10-30 02:50:10 +01:00
Kubernetes Prow Robot
99bf6a674c
Merge pull request #121039 from josselin-c/master
hpa: always update status metrics when updating the replica count
2023-10-28 19:35:01 +02:00
Kubernetes Prow Robot
848de697d8
Merge pull request #115711 from sourcelliu/improve
Improve lock performance
2023-10-27 23:41:32 +02:00
Kubernetes Prow Robot
fe21e4d749
Merge pull request #120682 from yt2985/cleanSA
LegacyServiceAccountTokenCleanUp beta
2023-10-27 19:08:05 +02:00
huweiwen
63b3085f2a fix ad controller populators test
The informer is not initialized, so no assertion performed before. Fixed this now.

Then fixed the test failure by using NewAttachDetachController to initialize adc.
2023-10-27 23:35:45 +08:00
tinatingyu
5925dc0775 LegacyServiceAccountTokenCleanUp beta 2023-10-27 03:52:06 +00:00
Dejan Pejchev
e98c33bfaf
switch feature flag to beta for pod replacement policy and add e2e test
update pod replacement policy feature flag comment and refactor the e2e test for pod replacement policy

minor fixes for pod replacement policy and e2e test

fix wrong assertions for pod replacement policy e2e test

more fixes to pod replacement policy e2e test

refactor PodReplacementPolicy e2e test to use finalizers

fix unit tests when pod replacement policy feature flag is promoted to beta

fix podgc controller unit tests when pod replacement feature is enabled

fix lint issue in pod replacement policy e2e test

assert no error in defer function for removing finalizer in pod replacement policy e2e test

implement test using a sh trap for pod replacement policy

reduce sleep after SIGTERM in pod replacement policy e2e test to 5s
2023-10-26 21:50:37 +02:00
Jiahui Feng
fd132665a8 extend VAP status controller for extensions type checking. 2023-10-26 10:26:03 -07:00
Aleksandra Malinowska
e07d898cfd Make StatefulSet restart pods with phase Succeeded 2023-10-26 15:34:01 +02:00
Dejan Pejchev
88c0a8be1b
feat: add job_pods_creation_total metric 2023-10-24 17:49:04 +02:00
Dejan Zele Pejchev
f8a4e343a1
Fix tracking of terminating Pods when nothing else changes (#121342)
* cleanup: refactor pod replacement policy integration test into staged assertion

* cleanup: remove typo in job_test.go

* refactor PodReplacementPolicy test and remove test for defaulting the policy

* fix issue with missing update in job controller for terminating status and refactor pod replacement policy integration test

* use t.Cleanup instead of defer in PodReplacementPolicy integration tests

* revert t.Cleanup to defer for reseting feature flag in PodReplacementPolicy integration tests
2023-10-24 15:04:46 +02:00
Kubernetes Prow Robot
cdd20eebb7
Merge pull request #118381 from SataQiu/fix-controller-20230601
controller: fix the help information format of sorting_deletion_age_ratio metric
2023-10-24 15:04:25 +02:00
Kubernetes Prow Robot
015297a577
Merge pull request #121327 from soltysh/fix_nextScheduleTimeDuration
Fix next schedule time duration
2023-10-24 12:18:35 +02:00
Maciej Szulik
bf2f640ea2
Add more test cases ensuring nextScheduleTimeDuration is never < 0 2023-10-24 11:08:02 +02:00
Kubernetes Prow Robot
ccca58aa36
Merge pull request #120075 from lowang-bh/enhancement
Call getPodRevision once
2023-10-23 19:51:40 +02:00
Kubernetes Prow Robot
8149ab3f3f
Merge pull request #121356 from mimowo/backoff-limit-per-index-beta
Graduate BackoffLimitPerIndex to Beta
2023-10-23 18:39:58 +02:00
Kubernetes Prow Robot
2b16f7b6bb
Merge pull request #120001 from qingwave/hpa-sidecar
HPA: calculate sidecar container resource in pod autoscaler
2023-10-23 18:39:31 +02:00
Kubernetes Prow Robot
1fc3d10f7e
Merge pull request #121292 from mimowo/backoff-limit-per-index-metrics
Introduce the job_finished_indexes_total metric
2023-10-20 23:50:57 +02:00
Anton Stuchinskii
34294cd67f locking feature-gate for ready pods job status 2023-10-20 16:08:54 +02:00
Michal Wozniak
b0d04d933b Introduce the job_finished_indexes_total metric 2023-10-20 15:19:04 +02:00
Kubernetes Prow Robot
568aee16e8
Merge pull request #120731 from Nordix/sts_issue/adil
Fixing CurrentReplicas and CurrentRevision in completeRollingUpdate
2023-10-20 14:42:08 +02:00
Michal Wozniak
32fdb55192 Use Patch instead of SSA for Pod Disruption condition 2023-10-19 21:00:19 +02:00
adil ghaffar
00c21ced3a
Fixing CurrentReplicas and CurrentRevision in completeRollingUpdate 2023-10-19 14:17:42 +03:00
Michal Wozniak
6dd0ad5c0f Graduate BackoffLimitPerIndex to Beta 2023-10-19 12:18:36 +02:00
Maciej Szulik
db8b303156
Modify mostRecentScheduleTime to return more detailed information about missed schedules
Initially this method was returning a number of missed schedules, but
that turned out to be not reliable for some complex schedules. For
example, those which are being run only during week days. The second
approach was to only return a boolean indicating the too many missed
information. It turns out that we need to return all three values:
none missed, few missed and many missed, to let consumers know what to
do, but don't leak the wrong number out of mostRecentScheduleTime.
2023-10-18 20:03:03 +02:00
Maciej Szulik
6c4f71b31c
Fix spelling 2023-10-18 19:15:34 +02:00
Yuki Iwai
d7556769e7 Job: Replace deprecated wait functions with supported one
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-19 00:14:35 +09:00
Kubernetes Prow Robot
6d70013af5
Merge pull request #121147 from kannon92/rm-at-least-no-terminating-count
Remove terminating count from rmAtLeast
2023-10-18 00:44:51 +02:00
Kubernetes Prow Robot
27ff547a14
Merge pull request #121011 from kannon92/job-pod-replacement-policy-feature-on-but-api-specified
Fix panic when enablement of pod replacement policy is skewed
2023-10-17 21:28:48 +02:00
Yuki Iwai
201c30fba8
Job: Handle error returned from AddEventHandler function (#119917)
* Job: Handle error returned from AddEventHandler function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Use the error message the similar to CronJob

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Clean up error messages

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the tesing.T on the second place in the args for the newControllerFromClient function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.T on the second place in the args for the newControllerFromClientWithClock function

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Call t.Helper()

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the createJobControllerWithSharedInformers function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Put the testing.TB on the second place in the args for the startJobControllerAndWaitForCaches function and call tb.Helper() there

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

* Adapt TestFinializerCleanup to the eventhandler error

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>

---------

Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2023-10-17 21:28:34 +02:00
Kevin Hannon
7a1ac18bc8 Fix panic if there are more terminating pods than active pods
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-10-17 14:50:38 -04:00
Antonio Ojea
c2d473f0d4 remove ClusterCIDR
KEP-2593 proposed to expand the existing node-ipam controller
to be configurable via a ClusterCIDR objects, however, there
were reasonable doubts on the SIG about the feature and after
several months of dicussions we decided to not move forward
with the KEP intree, hence, we are going to remove the existing
code, that is still in alpha.

https://groups.google.com/g/kubernetes-sig-network/c/nts1xEZ--gQ/m/2aTOUNFFAAAJ

Change-Id: Ieaf2007b0b23c296cde333247bfb672441fe6dfc
2023-10-14 19:06:22 +00:00
Kubernetes Prow Robot
bae6911b11
Merge pull request #121142 from aleksandra-malinowska/sts-concurrent-write-fix
Fix concurrent map writes on missing PVC creation in StatefulSet controller
2023-10-12 17:11:19 +02:00
Kubernetes Prow Robot
07029999f9
Merge pull request #120666 from b8kings0ga/feature/fix-comment-correction
AttachDetachControllerConfiguration.ReconcilerSyncLoopPeriod default value comment fix
2023-10-11 22:51:49 +02:00
Aleksandra Malinowska
7989400bef Fix concurrent write when filling PVC labels 2023-10-11 15:07:55 +02:00
Aleksandra Malinowska
54714686bc Modify test PVC to detect concurrent map write bug 2023-10-11 15:07:50 +02:00
Kevin Hannon
d7ee6b9d1b fix possible panic if pod replacement policy is turned on and jobs do not set pod replacement policy 2023-10-11 08:37:50 -04:00
Kubernetes Prow Robot
d3559bf77f
Merge pull request #120595 from jsafrane/fix-detach-uncertain
Mark a volume as uncertain-attached after detach error
2023-10-08 05:54:01 +02:00
Josselin Costanzi
3c4512c6cc hpa: always update status metrics when updating the replica count
Have hpa always update both the metrics and replica count. This fix an
edge case behavior bug where the metrics would not be updated if a
custom metrics was unavailable.
2023-10-06 21:34:09 +00:00
Lukasz Stankiewicz
1b489963c8 Add nil checks for hpa object target type values 2023-10-05 17:15:51 -07:00
Kevin Hannon
b96a074bcd convert pointer to ptr for job controller 2023-10-05 09:30:01 -04:00
Abhishek Srivastav
5f8fc30b2c
Added locks on request tracker before accessing fields (#120599)
* Added locks on request tracker before accessing fields

Unit test StatefulSetAutoDeletePVCEnabled has been
flaking with DATARACE. Added lock on request tracker
before accessing err field.

* Addressed review comments for PR : Added locks on request tracker before accessing fields
2023-10-03 16:38:08 +02:00
Kubernetes Prow Robot
622509830c
Merge pull request #120716 from xrstf/fix-typos
Fix typos
2023-09-30 00:25:56 -07:00
b8kings0ga
9345da51ac fix comment mistake, run "make update" 2023-09-22 16:37:55 +08:00
Filip Křepinský
c816601d83 reintroduce resourcequota.NewMonitor
- this function is used by other packages and  was mistakenly removed
  in 397cc73dc9
- let resource quota controller use this constructor instead of an
  object instantiation
2023-09-20 17:18:55 +02:00
Kubernetes Prow Robot
fd5f36e6a0
Merge pull request #120175 from kannon92/move-pod-failure-policy-constant
move reasons to api package for job controller
2023-09-20 03:06:00 -07:00
Kubernetes Prow Robot
355feb21fd
Merge pull request #120649 from andrewsykim/fix-cronjob-controller-already-exists-err
cronjob controller: ensure already existing jobs are added to active list
2023-09-20 02:00:00 -07:00
Kubernetes Prow Robot
963c9b3cb9
Merge pull request #119317 from mochizuki875/fix_ds_rolling_update_118823
Exclude nodes from rolling update depending on tolerations
2023-09-19 16:50:17 -07:00
Kevin Hannon
a62eb45ae2 Rename job reasons to JobReasons as part of api review 2023-09-19 13:10:22 -04:00
Andrew Sy Kim
301aa69fec cronjob controller: ensure already existing jobs are added to Active list of cronjobs
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2023-09-19 15:18:44 +00:00
Aleksandra Malinowska
5ed60a72f6
Revert "Make StatefulSet restart pods with phase Succeeded" 2023-09-19 15:49:36 +02:00
mochizuki875
2a82776745 change rolling update logic to exclude sunsetting nodes 2023-09-19 11:39:32 +00:00
Christoph Mewes
79a7833ade fix typo Mininum => Minimum 2023-09-17 11:24:29 +02:00
Stephen Kitt
567fca7baa
Use copy() instead of a for loop
Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-15 09:20:08 +02:00
Kevin Hannon
c6e9fba79b move reasons to api package for job controller 2023-09-14 13:24:29 -04:00
Kubernetes Prow Robot
a68093a3ff
Merge pull request #120506 from alexzielenski/import-restrictions
Update e2e import restrictions
2023-09-13 21:56:22 -07:00
Kubernetes Prow Robot
3eca0a5f78
Merge pull request #120398 from aleksandra-malinowska/sts-restart-always
Make StatefulSet restart pods with phase Succeeded
2023-09-13 12:40:12 -07:00
Jan Safranek
7fc11f47ff Mark a volume as uncertain-attached after detach error
Volume that failed Detach() should not be marked as attached, CSI
external-attacher is probably still trying to detach it.

Mark it uncertain instead and wait for Detach() to succeed.
2023-09-13 10:03:28 +02:00
Kubernetes Prow Robot
db49b13ccd
Merge pull request #120252 from kerthcet/cleanup/framework-import
Move framework testing libraries to the right place
2023-09-12 17:44:11 -07:00
kerthcet
6fbb8ec7e4 Move scheduler testing utils to /scheduler/testing
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-09-12 13:42:38 +08:00
Aldo Culquicondor
6b4ab616a2
Increase range of job_sync_duration_seconds
Change-Id: I7ed4b006faecf0a7e6e583c42b4d6bc4b786a164
2023-09-11 18:01:33 -04:00
Kubernetes Prow Robot
aa4ec3c5b0
Merge pull request #119944 from Sharpz7/jm/backup-finalizers
Adding backup code for removing finalizers to more Job End States.
2023-09-11 09:30:30 -07:00
Alexander Zielenski
f135eed37b update codegen 2023-09-08 09:49:35 -07:00
Aleksandra Malinowska
d7264d0af0 Make StatefulSet restart pods with phase Succeeded 2023-09-08 17:47:17 +02:00
Sharpz7
7e4b5d0d49 Final Fix 2023-09-08 14:44:22 +00:00
Stephen Kitt
aa89e6dc97
Use ptr.To to retrieve intstr addresses
This uses the generic ptr.To in k8s.io/utils to replace functions and
code constructs which only serve to return pointers to intstr
values. Other uses of the deprecated pointer package are updated in
modified files.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-08 11:10:50 +02:00
Sharpz7
43fc6b5bdb Added suggests changes 2023-09-06 03:05:14 +00:00
Kubernetes Prow Robot
73580b2038
Merge pull request #120336 from pohly/dra-generated-name-hyphen
resource claim controller: separate generated suffix from base
2023-09-05 11:22:51 -07:00