5982 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
5550064bc2
Merge pull request #115063 from kannon92/tracking-remove-comments
tracking with finalizers is the default for the job controller, so comments saying that we are tracking with finalizers are not needed
2023-01-17 07:56:44 -08:00
Kubernetes Prow Robot
7b01daba71
Merge pull request #115074 from yangjunmyfm192085/deleteklogv0-controller
use klog instead of klog.V(0)--controller manager part
2023-01-16 09:58:50 -08:00
Kubernetes Prow Robot
ed8cad1e80
Merge pull request #115056 from mimowo/podgc-do-not-add-condition-for-terminated-pods
PodGC should not add DisruptionTarget condition for pods which are in terminal phase
2023-01-16 03:04:50 -08:00
JunYang
29086e2b04 use klog instead of klog.V(0) 2023-01-14 21:15:50 +08:00
Kubernetes Prow Robot
9af5ae0365
Merge pull request #115030 from kannon92/remove-pod-error-job-tracking
Update SyncJob with PodControllerError updates in job unit tests
2023-01-13 12:08:14 -08:00
Kubernetes Prow Robot
70217a4083
Merge pull request #114944 from mimowo/fix-active-deadline-test
Fix the job controller unit test for enforcing ActiveDeadlineSeconds
2023-01-13 10:46:26 -08:00
Michal Wozniak
3833c0c349 PodGC should not add DisruptionTarget condition for pods which are in terminal phase 2023-01-13 18:28:44 +01:00
kannon92
4890928b78 tracking with finalizers is the default way for the job controller 2023-01-13 16:48:35 +00:00
kannon92
3a838033f8 Update SyncJob with PodControllerError updates in job unit tests 2023-01-13 16:39:18 +00:00
Michal Wozniak
7065b42bb2 Fix the job controller unit test for enforcing ActiveDeadlineSeconds 2023-01-13 16:48:15 +01:00
Kubernetes Prow Robot
c0c386b9c9
Merge pull request #114516 from nikhita/job-backoff-fix
pkg/controller/job: re-honor exponential backoff delay
2023-01-13 07:36:40 -08:00
Kubernetes Prow Robot
1b8692ce46
Merge pull request #114296 from cbroglie/concurrent-monitor-node-health
controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
2023-01-12 12:42:54 -08:00
Nikhita Raghunath
fd8d92a29d pkg/controller/job: re-honor exponential backoff
This commit makes the job controller re-honor exponential backoff for
failed pods. Before this commit, the controller created pods without any
backoff. This is a regression because the controller used to
create pods with an exponential backoff delay before (10s, 20s, 40s ...).

The issue occurs only when the JobTrackingWithFinalizers feature is
enabled (which is enabled by default right now). With this feature, we
get an extra pod update event when the finalizer of a failed pod is
removed.

Note that the pod failure detection and new pod creation happen in the
same reconcile loop, so the 2nd pod is created immediately after the 1st
pod fails. The backoff is only applied on the 2nd pod failure, which means
that the 3rd pod is created 10s after the 2nd pod, the 4th pod is created
20s after the 3rd pod, and so on.

This commit fixes a few bugs:

1. Right now, each time `uncounted != nil` and the job does not see a
_new_ failure, `forget` is set to true and the job is removed from the
queue. This condition is therefore also triggered each time the
finalizer for a failed pod is removed, and `NumRequeues` is reset, which
results in a backoff of 0s.

2. Updates `updatePod` to only apply backoff when we see a particular
pod failed for the first time. This is necessary to ensure that the
controller does not apply backoff when it sees a pod update event
for finalizer removal of a failed pod.

3. If `JobsReadyPods` feature is enabled and backoff is 0s, the job is
now enqueued after `podUpdateBatchPeriod` seconds, instead of 0s. The
unit test for this check also had a few bugs:
    - `DefaultJobBackOff` is overwritten to 0 in certain unit tests,
    which meant that `DefaultJobBackOff` was considered to be 0,
    effectively not running any meaningful checks.
    - `JobsReadyPods` was not enabled for test cases that ran tests
    which required the feature gate to be enabled.
    - The check for expected and actual backoff had incorrect
    calculations.
2023-01-12 20:34:10 +05:30
Christopher Broglie
3c88de52c8 controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
Marking the pods not ready on a node requires looping over them and
updating each pod's status one at a time. This is performed serially,
and can take a while if we're processing each node serially as well.

Since the time is spent waiting on I/O, there's an opportunity to go
faster by processing multiple nodes concurrently. This change modifies
the loop to process nodes in parallel, using the same number of workers
as doNodeProcessingPassWorker.

This change also introduces histogram metrics to better observe
monitorNodeHealth.
2023-01-11 12:34:39 -08:00
kannon92
6dfaeff33c Remove Legacy Job Tracking 2023-01-10 14:52:54 +00:00
Kubernetes Prow Robot
e7549eae87
Merge pull request #114905 from kannon92/sync-job-test-fix
Fix SyncPastDeadlineJobFinished for enabling finalizer path
2023-01-09 12:47:28 -08:00
kannon92
0362c67859 Fix SyncPastDeadlineJobFinished for enabling finalizer path 2023-01-09 17:12:52 +00:00
Aldo Culquicondor
4c1b95ddfa
Ensure job is up to date in informer cache in test
The fake client doesn't guarantee that the informer cache is updated.
If it's not up-to-date, the controller always tries to set the
StartTime, leading to a broken test.

Change-Id: I71f26d46ea44beff88f0d03517985348654aec95
2023-01-09 10:53:19 -05:00
Kubernetes Prow Robot
901c1de5ea
Merge pull request #114870 from mattcary/mutation
Avoid mutation of PVC in stateful set controller shared cache
2023-01-05 23:16:09 -08:00
Matthew Cary
ed18ab54ba Avoid mutation of PVC in stateful set controller shared cache
Change-Id: Ieb8e443e460150d16524ca1c1fb3770f546b2c28
2023-01-05 18:09:05 -08:00
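The general pattern behind commit ed18ab54ba, copying an object obtained from a shared cache before changing it, can be sketched as follows. The `claim` type and `withLabel` function are stand-ins invented for this example and are not the stateful set controller's types.

```go
package main

import "fmt"

// claim is a hypothetical stand-in for an object held in a shared cache.
type claim struct {
	Name   string
	Labels map[string]string
}

// DeepCopy returns a copy that shares no mutable state with the receiver.
func (c *claim) DeepCopy() *claim {
	out := &claim{Name: c.Name, Labels: make(map[string]string, len(c.Labels))}
	for k, v := range c.Labels {
		out.Labels[k] = v
	}
	return out
}

// withLabel returns a modified copy; the cached object is left untouched,
// so other readers of the cache never observe a half-applied change.
func withLabel(cached *claim, key, value string) *claim {
	updated := cached.DeepCopy()
	updated.Labels[key] = value
	return updated
}

func main() {
	cached := &claim{Name: "data-web-0", Labels: map[string]string{"app": "web"}}
	updated := withLabel(cached, "tier", "storage")
	fmt.Println(len(cached.Labels), len(updated.Labels))
}
```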
Kubernetes Prow Robot
492637878f
Merge pull request #111660 from pacoxu/key-encipherment-v1.26
Key encipherment usage v1.27
2023-01-04 15:51:57 -08:00
Michal Wozniak
c3d0e8ff05 Fix clearing rate limiter in disruption controller 2023-01-03 15:06:06 +01:00
Kushagra
80384bbb55 spelling mistake rectified 2022-12-29 17:55:17 +00:00
Kushagra
f380ef8b61 Misleading message when there are no metrics. 2022-12-29 10:57:43 +00:00
Paco Xu
160f015ef4 kubelet: add key encipherment usage only if it is rsa key
remove allowOmittingUsageKeyEncipherment as it is always true

Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-12-27 16:04:25 +08:00
Kubernetes Prow Robot
45f14a93f1
Merge pull request #113787 from gjkim42/update-daemonset-status-despite-error
Update daemonSet status even if syncDaemonSet fails
2022-12-22 15:49:25 -08:00
Kubernetes Prow Robot
d1c715a982
Merge pull request #113834 from atiratree/sts-handle-delete-pod-error
statefulset: handle API error on pod deletion
2022-12-22 08:17:26 -08:00
Harsha Narayana
208c3868cf
job controller: refactored job controller to be able to inject FakeClock for Unit Test 2022-12-20 21:29:24 +05:30
Jordan Liggitt
78cb3862f1
Fix indentation/spacing in comments to render correctly in godoc 2022-12-17 23:27:38 -05:00
Kubernetes Prow Robot
7f7bf68c7c
Merge pull request #111178 from lucming/cleanup
clean up code
2022-12-16 19:17:52 -08:00
Kubernetes Prow Robot
9edd4d86c8
Merge pull request #114522 from zshihang/master
lock LegacyServiceAccountTokenNoAutoGeneration
2022-12-16 14:24:32 -08:00
Kubernetes Prow Robot
f11e9aaf10
Merge pull request #113929 from howardjohn/endpointslice/use-optimized-set
endpoints: remove obsolete ServiceSelectorCache
2022-12-16 14:24:09 -08:00
Shihang Zhang
4fd09a06d6 lock LegacyServiceAccountTokenNoAutoGeneration 2022-12-16 10:45:35 -08:00
Daniel Smith
8100efc7b3 Enable propagation of HasSynced
* Add tracker types and tests
* Modify ResourceEventHandler interface's OnAdd member
* Add additional ResourceEventHandlerDetailedFuncs struct
* Fix SharedInformer to let users track HasSynced for their handlers
* Fix in-tree controllers which weren't computing HasSynced correctly
* Deprecate the cache.Pop function
2022-12-14 18:43:33 +00:00
Kubernetes Prow Robot
741bd5c382
Merge pull request #113947 from mowangdk/chore/change_adcontroller_log_level
Lower volume attached touch log level
2022-12-12 17:41:51 -08:00
John Howard
d9f2cc0c95 endpoints: remove obsolete ServiceSelectorCache
Since https://github.com/kubernetes/kubernetes/pull/112648, we can
efficiently handle selectors from pre-existing `map[string]string`,
making the cache obsolete.

Benchmark:

```
name                         old time/op    new time/op    delta
GetPodServiceMemberships-48     189µs ± 1%     193µs ± 1%  +2.10%  (p=0.000 n=10+10)

name                         old alloc/op   new alloc/op   delta
GetPodServiceMemberships-48    59.0kB ± 0%    58.9kB ± 0%  -0.09%  (p=0.000 n=9+9)

name                         old allocs/op  new allocs/op  delta
GetPodServiceMemberships-48     1.02k ± 0%     1.02k ± 0%    ~     (all equal)
```
2022-12-12 08:00:48 -08:00
Kubernetes Prow Robot
2118bc8aec
Merge pull request #114155 from aojea/mirroring_repack
endpointslicemirroring handle endpoints with multiple subsets
2022-12-10 07:53:42 -08:00
Kubernetes Prow Robot
9303ea836f
Merge pull request #114076 from akhilles/remove-unused-var
Remove unused `numExistingEndpoints` variable
2022-12-10 06:04:25 -08:00
Kubernetes Prow Robot
92ffe94592
Merge pull request #114033 from Octopusjust/k8s-pr14
pkg/controller/deployment/util/deployment_util.go:Improving test cove…
2022-12-10 06:03:55 -08:00
Antonio Ojea
ef6d9edea5 endpointslicemirroring handle endpoints with multiple subsets
Endpoints generated by the endpoints controller are in canonical
form; however, custom endpoints may not be in canonical form
(there was a time when they were canonicalized in the apiserver, but this
caused performance issues because the endpoints controller kept
updating them, since the created endpoints were different from the
stored ones due to the canonicalization).

There are cases where a custom endpoint may generate multiple slices
due to the controller, for example, when the same address is present
in different subsets.

The endpointslice mirroring controller should canonicalize the
endpoints subsets before processing them, so that the slices it
generates are consistent. There is no risk of hotlooping because
the endpoint is only used as input.

Change-Id: I2a8cd53c658a640aea559a88ce33e857fa98cc5c
2022-12-10 11:44:10 +00:00
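One way to picture the canonicalization that commit ef6d9edea5 refers to is to regroup subsets so that ports sharing exactly the same set of addresses end up in a single subset. The sketch below is a simplified illustration under that assumption; the `subset` type (plain strings and port numbers, no readiness or protocol) and the `canonicalize` function are hypothetical and are not the controller's implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// subset is a hypothetical, simplified stand-in for an endpoints subset.
type subset struct {
	Addresses []string
	Ports     []int
}

// canonicalize expands the subsets into port -> address set, then groups
// ports that share an identical address set. Output order is deterministic.
func canonicalize(subsets []subset) []subset {
	addrsByPort := map[int]map[string]bool{}
	for _, s := range subsets {
		for _, p := range s.Ports {
			if addrsByPort[p] == nil {
				addrsByPort[p] = map[string]bool{}
			}
			for _, a := range s.Addresses {
				addrsByPort[p][a] = true
			}
		}
	}
	grouped := map[string]*subset{}
	for p, set := range addrsByPort {
		addrs := make([]string, 0, len(set))
		for a := range set {
			addrs = append(addrs, a)
		}
		sort.Strings(addrs)
		key := strings.Join(addrs, ",")
		if grouped[key] == nil {
			grouped[key] = &subset{Addresses: addrs}
		}
		grouped[key].Ports = append(grouped[key].Ports, p)
	}
	keys := make([]string, 0, len(grouped))
	for k := range grouped {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]subset, 0, len(keys))
	for _, k := range keys {
		sort.Ints(grouped[k].Ports)
		out = append(out, *grouped[k])
	}
	return out
}

func main() {
	// The same address appears in two subsets that expose the same port.
	in := []subset{
		{Addresses: []string{"10.0.0.1"}, Ports: []int{80}},
		{Addresses: []string{"10.0.0.1", "10.0.0.2"}, Ports: []int{80}},
	}
	fmt.Printf("%+v\n", canonicalize(in))
}
```

Because the function only reads its input and returns a new slice, running it before processing does not modify the stored object, which matches the commit's remark that the endpoint is only used as input.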
Kubernetes Prow Robot
0cd13e573c
Merge pull request #113196 from mimowo/job-controller-reviewer
Self-nominate mimowo as a reviewer for pkg/controller/job & test/integration/job packages
2022-12-10 02:01:39 -08:00
Gunju Kim
69fcde750a
Update daemonSet status even if syncDaemonSet fails
This ensures that the daemonset controller updates daemonset statuses in
a best-effort manner even if syncDaemonSet fails.

In order to add an integration test, this also replaces
`cmd/kube-apiserver/app/testing.StartTestServer` with
`test/integration/framework.StartTestServer` and adds
`setupWithServerSetup` to configure the admission control of the
apiserver.
2022-12-10 11:45:56 +09:00
Kubernetes Prow Robot
63a01a5465
Merge pull request #112260 from aryan9600/cidr-metrics
Add metric for max no. of CIDRs available
2022-12-09 15:42:59 -08:00
Kubernetes Prow Robot
da3d98277b
Merge pull request #111839 from ialidzhikov/cleanup/pkg-controller
pkg/controller: Replace deprecated func usage from the `k8s.io/utils/pointer` pkg
2022-12-09 15:42:37 -08:00
Kubernetes Prow Robot
4557c694ef
Merge pull request #111683 from lucming/code-cleanup5
reorganize some logic of controller_utils.go
2022-12-09 15:42:21 -08:00
Kubernetes Prow Robot
5fe12aae11
Merge pull request #111207 from lucming/code-cleanup2
Reduce indentation in daemonset controller code
2022-12-09 15:41:41 -08:00
Sanskar Jaiswal
b501d6036a add metric for max no. of CIDRs that can be allocated from MultiCIDRSet
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-12-05 15:18:45 +00:00
Sanskar Jaiswal
37f4d4624b add metric for max no. of CIDRs that can be allocated from CidrSet
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-12-05 15:18:45 +00:00
ialidzhikov
aede3fbf40 pkg/controller: Replace deprecated func usage from the k8s.io/utils/pointer pkg 2022-11-23 17:40:23 +02:00
Akhil Velagapudi
70d31ea917 Remove unused numExistingEndpoints variable 2022-11-23 00:54:50 +00:00