46270 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
e51fe4a61c Merge pull request #114492 from SataQiu/update-prefered-storageversion-20221215
apiserver: update serialization version priority for flowcontrol API
2023-01-13 08:42:24 -08:00
kannon92
3a838033f8 Update SyncJob with PodControllerError updates in job unit tests 2023-01-13 16:39:18 +00:00
Michal Wozniak
7065b42bb2 Fix the job controller unit test for enforcing ActiveDeadlineSeconds 2023-01-13 16:48:15 +01:00
Kubernetes Prow Robot
c0c386b9c9 Merge pull request #114516 from nikhita/job-backoff-fix
pkg/controller/job: re-honor exponential backoff delay
2023-01-13 07:36:40 -08:00
Kubernetes Prow Robot
696701b9fd Merge pull request #114086 from xmcqueen/113935
block ephemeral container addition to static pods
2023-01-13 07:36:28 -08:00
SataQiu
950c147db5 apiserver: update serialization version priority for flowcontrol API 2023-01-13 22:19:39 +08:00
Kubernetes Prow Robot
6ce055d62d Merge pull request #114947 from saschagrunert/seccomp-ga-cleanup
Make seccomp annotations non-functional
2023-01-12 13:48:54 -08:00
xing-yang
07a1bc5b3e Update warnings for removed in-tree plugins 2023-01-12 16:25:00 -05:00
Kubernetes Prow Robot
1b8692ce46 Merge pull request #114296 from cbroglie/concurrent-monitor-node-health
controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
2023-01-12 12:42:54 -08:00
Kubernetes Prow Robot
3e049c5e68 Merge pull request #114883 from bobbypage/cadvisor_v047
deps: Bump cAdvisor to v0.47.1
2023-01-12 09:04:54 -08:00
Sascha Grunert
af1f6a230b Make seccomp annotations non-functional
This cleanup has been planned to finish the corresponding KEP:
https://github.com/kubernetes/kubernetes/issues/91286

As a follow-up to the partial removal of the seccomp annotations in
https://github.com/kubernetes/kubernetes/pull/109819, we now drop
the version skew handling completely, but still warn as well as keep
the validation in place if both (annotation and field) are set.

The Pod Security Admission code has already been changed in
https://github.com/kubernetes/kubernetes/pull/114846.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-01-12 17:11:52 +01:00
Nikhita Raghunath
fd8d92a29d pkg/controller/job: re-honor exponential backoff
This commit makes the job controller re-honor exponential backoff for
failed pods. Before this commit, the controller created pods without any
backoff. This is a regression because the controller used to
create pods with an exponential backoff delay before (10s, 20s, 40s ...).

The issue occurs only when the JobTrackingWithFinalizers feature is
enabled (which is enabled by default right now). With this feature, we
get an extra pod update event when the finalizer of a failed pod is
removed.

Note that the pod failure detection and new pod creation happen in the
same reconcile loop so the 2nd pod is created immediately after the 1st
pod fails. The backoff is only applied on 2nd pod failure, which means
that the 3rd pod is created 10s after the 2nd pod, the 4th pod is created 20s
after the 3rd pod and so on.

This commit fixes a few bugs:

1. Right now, each time `uncounted != nil` and the job does not see a
_new_ failure, `forget` is set to true and the job is removed from the
queue. This condition is also triggered each time the finalizer for a
failed pod is removed, so `NumRequeues` is reset, which results in a
backoff of 0s.

2. Updates `updatePod` to only apply backoff when we see a particular
pod failed for the first time. This is necessary to ensure that the
controller does not apply backoff when it sees a pod update event
for finalizer removal of a failed pod.

3. If `JobsReadyPods` feature is enabled and backoff is 0s, the job is
now enqueued after `podUpdateBatchPeriod` seconds, instead of 0s. The
unit test for this check also had a few bugs:
    - `DefaultJobBackOff` is overwritten to 0 in certain unit tests,
    which meant that `DefaultJobBackOff` was considered to be 0,
    effectively not running any meaningful checks.
    - `JobsReadyPods` was not enabled for test cases that ran tests
    which required the feature gate to be enabled.
    - The check for expected and actual backoff had incorrect
    calculations.
2023-01-12 20:34:10 +05:30
Kubernetes Prow Robot
457341c3d4 Merge pull request #114647 from kannon92/remove-legacy-job-tracking-job-controller
Removing Legacy Job Tracking Code
2023-01-12 04:38:53 -08:00
David Porter
8e3a02efa8 Remove AcceleratorUsageMetrics from kubelet
The feature gate is GA'd and enabled by default and the metrics have
been removed from cAdvisor.

Signed-off-by: David Porter <david@porter.me>
2023-01-11 16:07:39 -08:00
Kubernetes Prow Robot
08d9a0ef5b Merge pull request #113467 from pacoxu/psp-cleanup
Remove PodSecurityPolicy related code except client-go & API type
2023-01-11 14:28:07 -08:00
Christopher Broglie
3c88de52c8 controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently
Marking the pods not ready on a node requires looping over them and
updating each pod's status one at a time. This is performed serially,
and can take a while if we're processing each node serially as well.

Since the time is spent waiting on I/O, there's an opportunity to go
faster by processing multiple nodes concurrently. This change modifies
the loop to process nodes in parallel, using the same number of workers
as doNodeProcessingPassWorker.

This change also introduces histogram metrics to better observe
monitorNodeHealth.
2023-01-11 12:34:39 -08:00
Kubernetes Prow Robot
7372e7e807 Merge pull request #114724 from tnqn/fix-lb-svc-delete-error
Do not log errors when ServiceHealthServer is closed normally
2023-01-11 10:31:45 -08:00
Kubernetes Prow Robot
14c2d7b39b Merge pull request #114980 from mimowo/do-not-include-scheduler-name-in-event
Do not include scheduler name in the preemption event message
2023-01-11 06:43:56 -08:00
Kubernetes Prow Robot
6f6c468168 Merge pull request #114802 from moshe010/pod-resource-metrics
kubelet podresource: fix GetAllocatableResources metrics
2023-01-11 06:43:44 -08:00
Michal Wozniak
437179afc3 Do not include scheduler name in the preemption event message 2023-01-11 09:32:21 +01:00
Kubernetes Prow Robot
6882e76c60 Merge pull request #114063 from ruquanzhao/fixNetworkTypesDoc
fix doc of types.go of network v1, v1alpha1, v1beta1
2023-01-10 23:47:56 -08:00
Kubernetes Prow Robot
2a2f994c24 Merge pull request #114187 from claudiubelu/refactor-platform-deps-3
Refactors kubelet's plugin watcher
2023-01-10 15:25:26 -08:00
Kubernetes Prow Robot
564f438892 Merge pull request #114691 from thockin/fix-pod-warning-string
Make the warning about pod name clearer
2023-01-10 13:47:38 -08:00
Kubernetes Prow Robot
5a896bf379 Merge pull request #114677 from kl52752/epd-warning-address-type
Generate warning for EndpointSlice AddressType FQDN
2023-01-10 13:47:27 -08:00
Kubernetes Prow Robot
2d08117e9e Merge pull request #114065 from ruquanzhao/fixNodeTypesDoc
fix doc of types.go of node
2023-01-10 10:39:25 -08:00
kannon92
6dfaeff33c Remove Legacy Job Tracking 2023-01-10 14:52:54 +00:00
Monis Khan
0b22cb0b72 Prevent CSIMigrationAzureFile gate from being disabled
Signed-off-by: Monis Khan <mok@microsoft.com>
2023-01-10 09:43:35 -05:00
RuquanZhao
d5b4644d23 fix doc of types.go of network v1, v1alpha1, v1beta1
Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>
2023-01-10 20:24:51 +08:00
Kubernetes Prow Robot
1d6ae20301 Merge pull request #114798 from kerthcet/cleanup/code-refactor
Code refactor for readability in `RunFilterPlugins`
2023-01-09 17:31:12 -08:00
Kubernetes Prow Robot
1e3946ce9d Merge pull request #114923 from mimowo/do-not-leak-pod-name-in-event
Adjust preemption event message to do not include preemptor pod metadata
2023-01-09 13:51:28 -08:00
Kubernetes Prow Robot
b3138ba1b3 Merge pull request #114907 from haoruan/doc-fix-typo
fix a typo in pkg/proxy/ipvs/proxier.go
2023-01-09 12:47:39 -08:00
Kubernetes Prow Robot
e7549eae87 Merge pull request #114905 from kannon92/sync-job-test-fix
Fix SyncPastDeadlineJobFinished for enabling finalizer path
2023-01-09 12:47:28 -08:00
Kubernetes Prow Robot
eb7fd7f51c Merge pull request #114914 from mimowo/do-not-leak-pod-name
Adjust DisruptionTarget condition message to do not include preemptor pod metadata
2023-01-09 11:15:40 -08:00
Kubernetes Prow Robot
bea405c581 Merge pull request #114876 from alculquicondor/fix-deadline-test
Ensure job is up to date in informer cache in test
2023-01-09 10:07:28 -08:00
Michal Wozniak
f79a34d267 Do not leak cross namespace pod metadata in preemption events 2023-01-09 18:30:19 +01:00
kannon92
0362c67859 Fix SyncPastDeadlineJobFinished for enabling finalizer path 2023-01-09 17:12:52 +00:00
Aldo Culquicondor
4c1b95ddfa Ensure job is up to date in informer cache in test
The fake client doesn't guarantee that the informer cache is updated.
If it's not up-to-date, the controller always tries to set the
StartTime, leading to a broken test.

Change-Id: I71f26d46ea44beff88f0d03517985348654aec95
2023-01-09 10:53:19 -05:00
Michal Wozniak
bdf58ce2eb Adjust DisruptionTarget condition message to do not include preemptor metadata 2023-01-09 12:22:19 +01:00
Hao Ruan
7f3de6e53a fix a typo in pkg/proxy/ipvs/proxier.go 2023-01-09 09:29:22 +08:00
kidddddddddddddddddddddd
733d5695f2 always run filter in test 2023-01-08 11:13:16 +08:00
kidddddddddddddddddddddd
5411c05460 save state data for reserve 2023-01-08 10:49:35 +08:00
kidddddddddddddddddddddd
0abdf6abc2 revert check in filter 2023-01-07 22:30:16 +08:00
kidddddddddddddddddddddd
059d520537 return skip 2023-01-07 21:58:54 +08:00
kidddddddddddddddddddddd
de7c8db7cb return skip 2023-01-07 21:48:30 +08:00
Kensei Nakada
570c2d7036 cleanup(nodeaffinity): remove impossible scenario from test cases 2023-01-07 08:46:35 +00:00
Kante Yin
2ceadfe885 Code refactor for readability
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-01-07 11:31:46 +08:00
Ian K. Coolidge
5533e49e2c cpuset: Add package comment
Describe use cases (node IDs, HT siblings, etc)

Call out novelty (Linux CPU list parse/dump)

Describe future work (relax immutable, refactor to use 'set')
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
cbb985a310 cpuset: Delete 'builder' methods
All usage of builder pattern is convertible to cpuset.New()
with the same or fewer lines of code.

Migrate Builder.Add to a private method of CPUSet, with a comment
that it is only intended for internal use to preserve the immutable
property of the exported interface.

This also removes 'require' library dependency, which avoids
non-standard library usage.
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
f3829c4be3 cpuset: Rename 'NewCPUSet' to 'New' 2023-01-06 23:32:51 +00:00
Ian K. Coolidge
768b1ecfb6 cpuset: hide 'Filter' API
FilterNot is only used in this file, and is trivially converted to a
'filter' call site by inverting the predicate.

Filter is only used in this file, so don't export it.
2023-01-06 23:32:51 +00:00
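The cpuset commits above describe three ideas: constructing a set directly with `New` instead of a builder, keeping the filter helper unexported, and replacing a "filter not" call by inverting the predicate. A simplified stand-in type shows how these fit together. The type below and its method set are assumptions made for illustration and do not reproduce the real package.

```go
package main

import (
	"fmt"
	"sort"
)

// CPUSet is a simplified, immutable set of CPU IDs (illustrative only).
type CPUSet struct {
	elems map[int]struct{}
}

// New builds a set in one call, so no builder type is needed.
func New(cpus ...int) CPUSet {
	s := CPUSet{elems: make(map[int]struct{}, len(cpus))}
	for _, c := range cpus {
		s.elems[c] = struct{}{}
	}
	return s
}

// Contains reports whether cpu is in the set.
func (s CPUSet) Contains(cpu int) bool {
	_, ok := s.elems[cpu]
	return ok
}

// filter is unexported: it returns a new set with the elements for
// which keep returns true, leaving the receiver unchanged.
func (s CPUSet) filter(keep func(int) bool) CPUSet {
	out := New()
	for c := range s.elems {
		if keep(c) {
			out.elems[c] = struct{}{}
		}
	}
	return out
}

// Intersection keeps the elements that are also in other.
func (s CPUSet) Intersection(other CPUSet) CPUSet {
	return s.filter(other.Contains)
}

// Difference inverts the predicate instead of needing a "filter not".
func (s CPUSet) Difference(other CPUSet) CPUSet {
	return s.filter(func(c int) bool { return !other.Contains(c) })
}

// List returns the elements in ascending order.
func (s CPUSet) List() []int {
	out := make([]int, 0, len(s.elems))
	for c := range s.elems {
		out = append(out, c)
	}
	sort.Ints(out)
	return out
}

func main() {
	all := New(0, 1, 2, 3)
	reserved := New(1, 3)
	fmt.Println(all.Difference(reserved).List())
	fmt.Println(all.Intersection(reserved).List())
}
```

Only the internal `filter` writes to a set's map, and only to a freshly created one, which is how the exported methods can stay side-effect free.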