Commit Graph

10678 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
7fe63a9b1a
Merge pull request #116902 from sourcelliu/cast
Remove unnecessary int type conversion
2023-04-20 19:15:10 -07:00
Kubernetes Prow Robot
9d34ca5b66
Merge pull request #117276 from pacoxu/image-pull-event
kubelet: truncate the precision at a millisecond for image pull event message
2023-04-20 17:45:10 -07:00
Kubernetes Prow Robot
53cccbe4f9
Merge pull request #117019 from bobbypage/gh_116925
kubelet: Mark new terminal pods as non-finished in pod worker
2023-04-17 15:10:58 -07:00
Todd Neal
453f81d1ca
kubelet: pass context to VolumeManager.WaitFor*
This allows us to return with a timeout error as soon as the
context is canceled.  Previously in cases where the mount will
never succeed pods can get stuck deleting for 2 minutes.

In the Sync*Pod methods that call VolumeManager.WaitFor*, we
must filter out wait.Interrupted errors from being logged as
they are part of control flow, not runtime problems. Any
early interruption should result in exiting the Sync*Pod method
as quickly as possible without logging intermediate errors.
2023-04-17 11:53:28 -05:00
mantuliu
3b7c14e8cf Remove unnecessary int type conversion 2023-04-14 16:41:44 +08:00
Dan Winship
2bb35e08f4 Clarify kubelet/kube-proxy iptables rule skew constraints 2023-04-13 14:05:58 -04:00
Paco Xu
c042837a76 truncate the precision at a millisecond for image pull event message 2023-04-13 15:56:16 +08:00
Tim Hockin
bc302fa414
Replace uses of ObjectReflectDiff with cmp.Diff
ObjectReflectDiff is already a shim over cmp.Diff, so no actual output
or behavior changes
2023-04-12 08:48:03 -07:00
Tim Hockin
29c0b73d64
Replace uses of diff.ObjectDiff with cmp.Diff
ObjectDiff is already a shim over cmp.Diff, so no actual output or
behavior changes
2023-04-12 08:46:12 -07:00
Tim Hockin
dd7af241c1
Replace diff.ObjectDiff with cmp.Equal
More obvious and cheaper, and ObjectDiff is already written in terms of
cmp.
2023-04-12 08:45:32 -07:00
Kubernetes Prow Robot
2f1db33dd5
Merge pull request #116482 from smarterclayton/no_mutate
kubelet: Do not mutate pods in the pod manager
2023-04-12 02:22:32 -07:00
Kubernetes Prow Robot
74ad7c397d
Merge pull request #116723 from SergeyKanzhelev/ExperimentalHostUserNamespaceDefaulting
deprecate ExperimentalHostUserNamespaceDefaulting
2023-04-11 21:16:57 -07:00
Kubernetes Prow Robot
006ad0576e
Merge pull request #116560 from bart0sh/PR107-DRA-get-rid-of-extra-loops
DRA: get rid of unneeded loops over pod containers
2023-04-11 21:16:50 -07:00
Kubernetes Prow Robot
ce56fd7c8b
Merge pull request #117152 from samuelkarp/godoc-typo
cpumanager: fix typo in godoc
2023-04-11 20:22:14 -07:00
Kubernetes Prow Robot
e7426a00c3
Merge pull request #117020 from cji/cji-seccomplocalhost
Fix seccomp localhost error handling
2023-04-11 19:18:15 -07:00
Kubernetes Prow Robot
036807ae35
Merge pull request #116995 from smarterclayton/pending_update
kubelet: Ensure pods that have not started track a pendingUpdate
2023-04-11 19:17:37 -07:00
Kubernetes Prow Robot
f46626364f
Merge pull request #116833 from mpatlasov/fix-memleak-in-kubelet-volumemanager
Fix memory leak in kubelet volume_manager populator processedPods
2023-04-11 18:19:58 -07:00
Kubernetes Prow Robot
dcf3792310
Merge pull request #116730 from danwinship/network-owners
sig-network OWNERS fixups
2023-04-11 18:19:44 -07:00
Kubernetes Prow Robot
d48c883372
Merge pull request #116690 from smarterclayton/handle_twice
kubelet: HandlePodCleanups takes an extra sync to restart pods
2023-04-11 18:19:23 -07:00
Kubernetes Prow Robot
4893c66a48
Merge pull request #116134 from cvvz/fix-111933
fix: After a Node goes down and takes some time to come back up, the mount points of the evicted Pods cannot be cleaned up successfully.
2023-04-11 15:35:41 -07:00
Kubernetes Prow Robot
779abe6ebe
Merge pull request #115399 from 3u13r/feat/documentTLS13Exception
Add note about TLS 1.3 cipher suites
2023-04-11 15:35:27 -07:00
Kubernetes Prow Robot
0c969ad660
Merge pull request #115133 from ffromani/podresources-windows
node: create podresources endpoint also on windows
2023-04-11 15:35:19 -07:00
Kubernetes Prow Robot
d0fc9d16ce
Merge pull request #114800 from haoruan/feature-8976-spew-sprintf-refactor
Capture spew.Sprintf() with all our favorite config into a util func
2023-04-11 15:34:57 -07:00
David Porter
d04d7ffa6e kubelet: Mark new terminal pods as non-finished in pod worker
The pod worker may receive a new pod which is marked as terminal in
the runtime cache. This can occur if a pod is marked as terminal and the
kubelet is restarted.

The kubelet needs to drive these pods through the termination state
machine. If upon restart, the kubelet receives a pod which is terminal
based on runtime cache, it indicates that pod finished
`SyncTerminatingPod`, but it did not complete `SyncTerminatedPod`. The
pod worker needs to ensure that `SyncTerminatedPod` will run on these pods.
To accomplish this, set `finished=False`, on the pod sync status, to
drive the pod through the rest of the state machine.

This will ensure that status manager and other kubelet subcomponents
(e.g. volume manager), will be aware of this pod and properly clean up
all of the resources of the pod after the kubelet is restarted.

While making this change, also update the comments to provide a bit more
background around why the kubelet needs to read the runtime pod cache
for newly synced terminal pods.

Signed-off-by: David Porter <david@porter.me>
2023-04-11 01:39:05 -07:00
Samuel Karp
ea74a2d877
cpumanager: fix typo in godoc
Signed-off-by: Samuel Karp <samuelkarp@google.com>
2023-04-06 16:48:24 -07:00
Craig Ingram
3d3686b9cf Return error for localhost seccomp type with no localhost profile defined 2023-04-04 14:53:46 +00:00
Clayton Coleman
ed48dcd2d7
kubelet: Ensure pods that have not started track a pendingUpdate
A pod that cannot be started yet (due to static pod fullname
exclusion when UIDs are reused) must be accounted for in the
pod worker since it is considered to have been admitted and will
eventually start.

Due to a bug we accidentally cleared pendingUpdate for pods that
cannot start yet which means we can't report the right metric to
users in kubelet_working_pods and in theory we might fail to start
the pod in the future (although we currently have not observed
that in tests that should catch such an error). Describe, implement,
and test the invariant that on every path where startPodSync returns,
either activeUpdate OR pendingUpdate is set on the status, but
never both, and is only nil when the pod can never start.

This bug was detected by a "programmer error" assertion we added
on metrics that were not being reported, suggesting that we should
be more aggressive on using log assertions and automating detection
in tests.
2023-03-29 15:29:59 -04:00
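
The invariant stated in commit ed48dcd2d7 can be expressed as a small check. This is a sketch only: `update`, `podSyncStatus`, and `checkInvariant` are simplified, hypothetical types and names, and it assumes one reading of the message, namely that both fields may be nil only when the pod can never start.

```go
package main

import (
	"errors"
	"fmt"
)

// update is a hypothetical placeholder for a queued pod update.
type update struct{ name string }

// podSyncStatus keeps only the two fields named in the commit message.
type podSyncStatus struct {
	activeUpdate  *update
	pendingUpdate *update
}

// checkInvariant returns an error when the status breaks the rule:
// exactly one of activeUpdate or pendingUpdate is set, unless the pod
// can never start, in which case both may be nil (assumed reading).
func checkInvariant(s podSyncStatus, canNeverStart bool) error {
	switch {
	case s.activeUpdate != nil && s.pendingUpdate != nil:
		return errors.New("both activeUpdate and pendingUpdate are set")
	case s.activeUpdate == nil && s.pendingUpdate == nil && !canNeverStart:
		return errors.New("no update is tracked for a pod that can still start")
	}
	return nil
}

func main() {
	// A pod that cannot start yet should still carry a pendingUpdate.
	waiting := podSyncStatus{pendingUpdate: &update{name: "static-pod"}}
	fmt.Println("waiting pod:", checkInvariant(waiting, false))

	// The bug described above corresponds to a cleared pendingUpdate.
	cleared := podSyncStatus{}
	fmt.Println("cleared pod:", checkInvariant(cleared, false))
}
```

A check of this shape could back the kind of "programmer error" assertion the message refers to, although the actual assertion in the kubelet is not shown here.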
Hao Ruan
f638e2849f replaced spew.Sprintf with a util pretty print function 2023-03-27 09:24:22 +08:00
mantuliu
99ad88a261 Remove unnecessary int type conversion
Signed-off-by: mantuliu <240951888@qq.com>
2023-03-24 15:43:25 +08:00
Maxim Patlasov
fbf33e32e6 Fix memory leak in kubelet volume_manager populator processedPods
`findAndRemoveDeletedPods()` processes only pods from volume_manager cache: `dswp.desiredStateOfWorld.GetVolumesToMount()`. `podWorker` calls volume_manager `WaitForUnmount()` asynchronously. If it happens after populator cleaned up resources, an entry is added to `processedPods` and will never be seen. Let's clean up such entries if they don't have a pod and are marked for deletion.
2023-03-21 20:16:02 -07:00
Kubernetes Prow Robot
c7cc7886e2
Merge pull request #116702 from vinaykul/restart-free-pod-vertical-scaling-podmutation-fix
Fix pod object update that may cause data race
2023-03-21 19:26:36 -07:00
Kubernetes Prow Robot
9c6414cdfe
Merge pull request #116792 from pacoxu/fix-safe-sysctl-windows
safe-sysctl: skip checking for windows
2023-03-21 17:39:59 -07:00
vinay kulkarni
f41702b8d2 Return updatedPod if resize upon successful checkpointing of allocated resources 2023-03-22 00:24:00 +00:00
Paco Xu
e154b73535 safe-sysctl: skip checking for windows 2023-03-22 07:40:29 +08:00
Claudiu Belu
c68bc27f73 kubelet: Read DNS Config options from file for Windows
A previous commit added the capability to read the DNS configuration options
from a Windows host, while removing the capability to read from a resolv.conf-like
file.

This commit addresses this issue: if the given ``--resolv-conf`` option is not set to
``Host``, it will consider it as a file, preserving the previous behavior.
2023-03-21 22:21:57 +00:00
vinay kulkarni
d753893260 Do not modify original pod object when processing pod resource resize 2023-03-18 17:57:25 +00:00
Dan Winship
db5590a194 Remove sig-network-driver-approvers alias
This previously existed so the dockershim networking stuff had
different approvers than general sig-network stuff, but that code is
gone now, and the remaining kubelet DNS code should be owned by
sig-network in general (and sig-node via inheritance).
2023-03-18 11:29:38 -04:00
Sergey Kanzhelev
eb60dce33b deprecate ExperimentalHostUserNamespaceDefaulting 2023-03-17 22:07:25 +00:00
vinay kulkarni
358474b71d Explicitly return from checkpoint update failures. SyncPod will retry 2023-03-17 18:00:04 +00:00
Paco Xu
5134520a3b add lock in volume manager reconciler to avoid data race
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2023-03-17 21:29:10 +08:00
vinay kulkarni
f66e8848ee Fix pod object update that may cause data race 2023-03-17 08:50:52 +00:00
Paco Xu
7afcfe1826 kubelet: use filepath.Clean before init, validate it in setupDataDirs 2023-03-17 15:45:39 +08:00
Clayton Coleman
d25572c389
kubelet: HandlePodCleanups takes an extra sync to restart pods
HandlePodCleanups is responsible for restarting pods that are no
longer running (usually due to delete and recreation with the same
UID in quick succession). We have to filter the list of pods to
restart from podManager to get the list of admitted pods, which
uses filterOutInactivePods on the kubelet. That method excludes
pods the pod worker has already terminated. Since a restarted
pod will be in the terminated state before HandlePodCleanups
calls SyncKnownPods, we have to call filterOutInactivePods after
SyncKnownPods, otherwise the to-be-restarted pod is ignored and
we have to wait for the next housekeeping cycle to restart it.

Since static pods are often critical system components, this
extra 2s wait is undesirable and we should restart as soon as
we can. Add a failing test that passes after we move the filter
call after SyncKnownPods.
2023-03-16 15:18:44 -06:00
Michal Wozniak
3d68f362c3 Give terminal phase correctly to all pods that will not be restarted 2023-03-16 21:25:29 +01:00
Clayton Coleman
58d1dc669f kubelet: Remove status manager channel
The status manager channel forces all container status to be
processed, even if multiple updates are generated in succession.
Instead of queueing the updates, just remember which ones changed
and process them in a batch. This should reduce QPS load from
the Kubelet for status, reduce latency of status propagation to
the API in general, and is easier to reason about.

This also prevents status from being lost when the channel is
full - all updates sent by SetPodStatus are guaranteed to be
recorded. Changing to remove the channel allows us to set a
marker flag when the pod worker state machine completes that
avoids the status manager having to call into the pod worker
directly.
2023-03-16 21:22:43 +01:00
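
The idea in commit 58d1dc669f of remembering which statuses changed, rather than queueing every update on a channel, can be sketched as below. This is an illustrative pattern and not the status manager's code: `statusBatcher`, `markChanged`, and `drain` are hypothetical names, and pod UIDs are plain strings here.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// statusBatcher records which pods have a changed status and wakes a
// consumer at most once per batch.
type statusBatcher struct {
	mu      sync.Mutex
	changed map[string]struct{}
	signal  chan struct{} // capacity 1: a pending wake-up, never a queue of updates
}

func newStatusBatcher() *statusBatcher {
	return &statusBatcher{
		changed: make(map[string]struct{}),
		signal:  make(chan struct{}, 1),
	}
}

// markChanged never blocks and never drops a pod: repeated updates for the
// same UID collapse into one entry.
func (b *statusBatcher) markChanged(uid string) {
	b.mu.Lock()
	b.changed[uid] = struct{}{}
	b.mu.Unlock()
	select {
	case b.signal <- struct{}{}:
	default: // a wake-up is already pending
	}
}

// drain returns the UIDs changed since the last call, sorted for
// deterministic output, and resets the set.
func (b *statusBatcher) drain() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	uids := make([]string, 0, len(b.changed))
	for uid := range b.changed {
		uids = append(uids, uid)
	}
	b.changed = make(map[string]struct{})
	sort.Strings(uids)
	return uids
}

func main() {
	b := newStatusBatcher()
	b.markChanged("pod-a")
	b.markChanged("pod-b")
	b.markChanged("pod-a") // second update for pod-a is coalesced

	<-b.signal // the consumer wakes once for the whole batch
	fmt.Println(b.drain())
}
```

Because the set is keyed by pod, a burst of updates costs one entry per pod, which matches the reduced QPS and "no lost status" goals stated in the message.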
Dan Winship
7605163620 Split up PreferNodeIP into legacy and non-legacy versions
Though not obvious as currently written, PreferNodeIP() has different
semantics with legacy and external cloud providers, since one kind of
node IP value never gets passed in the external cloud provider case.
Split it into two functions to make this clearer (and to prepare for
adding new external-cloud-only semantics, and to make it clearer that
some of the code can be deleted when legacy cloud providers go away).
2023-03-15 14:50:17 -04:00
Ed Bartosh
1aeec10efb DRA: get rid of unneeded loops over pod containers 2023-03-15 09:41:30 +02:00
Kubernetes Prow Robot
37937bb227
Merge pull request #110566 from claudiubelu/unittests-5
Adds Pod DNS Policies support for Windows pods
2023-03-14 23:54:14 -07:00
Kubernetes Prow Robot
74123a7341
Merge pull request #116621 from moshe010/dra-lock
kubelet dra: add lock to addCDIDevices
2023-03-14 19:27:28 -07:00
Kubernetes Prow Robot
815b1bf0d8
Merge pull request #116558 from klueska/update-dra-kubeletplugin-v1alpha2
Update kubeletplugin API for DRA to v1alpha2
2023-03-14 19:27:06 -07:00