kubernetes/pkg/controller at fd8d92a29d188c55b8e21e1eb2742122c8bed7ea - kubernetes

github/kubernetes

Nikhita Raghunath fd8d92a29d pkg/controller/job: re-honor exponential backoff

This commit makes the job controller re-honor exponential backoff for
failed pods. Before this commit, the controller created pods without any
backoff. This is a regression: the controller previously created pods
with an exponential backoff delay (10s, 20s, 40s, ...).

The issue occurs only when the JobTrackingWithFinalizers feature is
enabled (which is enabled by default right now). With this feature, we
get an extra pod update event when the finalizer of a failed pod is
removed.

Note that the pod failure detection and new pod creation happen in the
same reconcile loop so the 2nd pod is created immediately after the 1st
pod fails. The backoff is only applied on the 2nd pod failure, which
means that the 3rd pod is created 10s after the 2nd pod, the 4th pod is
created 20s after the 3rd pod, and so on.

This commit fixes a few bugs:

1. Right now, each time `uncounted != nil` and the job does not see a
_new_ failure, `forget` is set to true and the job is removed from the
queue. This condition is also triggered each time the finalizer for a
failed pod is removed, so `NumRequeues` is reset, which results in a
backoff of 0s.

2. Updates `updatePod` to only apply backoff when we see a particular
pod failed for the first time. This is necessary to ensure that the
controller does not apply backoff when it sees a pod update event
for finalizer removal of a failed pod.

3. If `JobsReadyPods` feature is enabled and backoff is 0s, the job is
now enqueued after `podUpdateBatchPeriod` seconds, instead of 0s. The
unit test for this check also had a few bugs:
    - `DefaultJobBackOff` was overwritten to 0 in certain unit tests,
    which meant that `DefaultJobBackOff` was considered to be 0, so the
    tests effectively did not run any meaningful checks.
    - `JobsReadyPods` was not enabled for test cases that required the
    feature gate to be enabled.
    - The check for expected and actual backoff had incorrect
    calculations.

2023-01-12 20:34:10 +05:30

refactor: remove deprecated flags

2022-04-22 20:28:12 +08:00

remove rate limiter metric as it is not in use

2022-10-13 13:07:11 -07:00

kubelet: add key encipherment usage only if it is rsa key

2022-12-27 16:04:25 +08:00

clusterroleaggregation

Lock ServerSideApply feature to true

2022-09-27 13:48:28 +02:00

Fix indentation/spacing in comments to render correctly in godoc

2022-12-17 23:27:38 -05:00

Update daemonSet status even if syncDaemonSet fails

2022-12-10 11:45:56 +09:00

Fix indentation/spacing in comments to render correctly in godoc

2022-12-17 23:27:38 -05:00

Fix clearing rate limiter in disruption controller

2023-01-03 15:06:06 +01:00

Merge pull request #111178 from lucming/cleanup

2022-12-16 19:17:52 -08:00

Merge pull request #111178 from lucming/cleanup

2022-12-16 19:17:52 -08:00

endpointslicemirroring

endpointslicemirroring handle endpoints with multiple subsets

2022-12-10 11:44:10 +00:00

garbagecollector

pkg/controller: Replace deprecated func usage from the k8s.io/utils/pointer pkg

2022-11-23 17:40:23 +02:00

convert int32 to pointer using library function

2022-07-01 14:58:26 +08:00

pkg/controller/job: re-honor exponential backoff

2023-01-12 20:34:10 +05:30

remove rate limiter metric as it is not in use

2022-10-13 13:07:11 -07:00

add metric for max no. of CIDRs that can be allocated from MultiCIDRSet

2022-12-05 15:18:45 +00:00

pkg/controller: Replace deprecated func usage from the k8s.io/utils/pointer pkg

2022-11-23 17:40:23 +02:00

spelling mistake rectified

2022-12-29 17:55:17 +00:00

Enable the feature into beta

2022-11-09 09:02:40 +01:00

Merge pull request #110747 from harshanarayana/cleanup/GIT-110737/logging-improvements

2022-11-03 00:49:34 -07:00

Enable propagration of HasSynced

2022-12-14 18:43:33 +00:00

kube-controller-manager: add ResourceClaim controller

2022-11-10 20:23:50 +01:00

quota: add an update filter

2022-07-08 18:39:55 -04:00

lock LegacyServiceAccountTokenNoAutoGeneration

2022-12-16 10:45:35 -08:00

Merge pull request #114870 from mattcary/mutation

2023-01-05 23:16:09 -08:00

storageversiongc

pkg/controller/storageversiongc: add constructor function newKubeApiserverLease

2022-11-09 15:52:47 -05:00

Wait for Pods to finish before considering Failed in Job (#113860)

2022-11-15 09:44:53 -08:00

Reduce number of buckets in ttl controller for 2k+ nodes clusters

2022-05-05 12:26:36 +00:00

ttlafterfinished

pkg/controller: Replace deprecated func usage from the k8s.io/utils/pointer pkg

2022-11-23 17:40:23 +02:00

endpoints: remove obsolete ServiceSelectorCache

2022-12-12 08:00:48 -08:00

Fix indentation/spacing in comments to render correctly in godoc

2022-12-17 23:27:38 -05:00

controller_ref_manager_test.go

Merge pull request #101250 from evertrain/master

2021-11-10 09:19:26 -08:00

controller_ref_manager.go

Fix indentation/spacing in comments to render correctly in godoc

2022-12-17 23:27:38 -05:00

controller_utils_test.go

NodeLifecycleController: Remove race condition

2022-10-24 19:36:58 +00:00

controller_utils.go

Merge pull request #111683 from lucming/code-cleanup5

2022-12-09 15:42:21 -08:00

doc.go

Use Go canonical import paths

2016-07-16 13:48:21 -04:00

lookup_cache.go

Use fnv.New32a() in hash instead adler32

2017-02-15 14:03:54 +08:00

OWNERS

add myself as approver to pkg/controller

2022-01-12 19:33:02 -05:00