Commit Graph

10700 Commits

Author SHA1 Message Date
mantuliu
99ad88a261 Remove unnecessary int type conversion
Signed-off-by: mantuliu <240951888@qq.com>
2023-03-24 15:43:25 +08:00
Maxim Patlasov
fbf33e32e6 Fix memory leak in kubelet volume_manager populator processedPods
`findAndRemoveDeletedPods()` processes only pods from the volume_manager cache: `dswp.desiredStateOfWorld.GetVolumesToMount()`. `podWorker` calls volume_manager `WaitForUnmount()` asynchronously. If that happens after the populator has cleaned up resources, an entry is added to `processedPods` and will never be seen again. Let's clean up such entries if they don't have a pod and are marked for deletion.
2023-03-21 20:16:02 -07:00
Kubernetes Prow Robot
c7cc7886e2 Merge pull request #116702 from vinaykul/restart-free-pod-vertical-scaling-podmutation-fix
Fix pod object update that may cause data race
2023-03-21 19:26:36 -07:00
Kubernetes Prow Robot
9c6414cdfe Merge pull request #116792 from pacoxu/fix-safe-sysctl-windows
safe-sysctl: skip checking for windows
2023-03-21 17:39:59 -07:00
vinay kulkarni
f41702b8d2 Return updatedPod if resize upon successful checkpointing of allocated resources 2023-03-22 00:24:00 +00:00
Paco Xu
e154b73535 safe-sysctl: skip checking for windows 2023-03-22 07:40:29 +08:00
Claudiu Belu
c68bc27f73 kubelet: Read DNS Config options from file for Windows
A previous commit added the capability to read the DNS configuration options
from a Windows host, while removing the capability to read from a resolv.conf-like
file.

This commit addresses this issue: if the given ``--resolv-conf`` option is not set to
``Host``, it will consider it as a file, preserving the previous behavior.
2023-03-21 22:21:57 +00:00
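The behavior described in the commit above (use the host's DNS settings when the option is `Host`, otherwise treat the value as a resolv.conf-like file) can be sketched roughly as follows. This is only an illustration, not the kubelet's code: `dnsConfig`, `resolveDNSConfig`, `parseResolvConf`, and the `hostLookup` callback are hypothetical names, and returning an empty configuration for an empty option value is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// dnsConfig is a hypothetical, simplified DNS configuration.
type dnsConfig struct {
	Servers  []string
	Searches []string
}

// resolveDNSConfig dispatches on the option value: "Host" asks the
// supplied hostLookup callback (a stand-in for querying the OS), any
// other non-empty value is read as a resolv.conf-like file.
func resolveDNSConfig(resolverConfig string, hostLookup func() (dnsConfig, error)) (dnsConfig, error) {
	switch resolverConfig {
	case "":
		// Assumption: an empty value means "no host DNS configuration".
		return dnsConfig{}, nil
	case "Host":
		return hostLookup()
	default:
		data, err := os.ReadFile(resolverConfig)
		if err != nil {
			return dnsConfig{}, err
		}
		return parseResolvConf(string(data)), nil
	}
}

// parseResolvConf extracts "nameserver" and "search" entries and
// ignores every other line. The last "search" line wins.
func parseResolvConf(content string) dnsConfig {
	var cfg dnsConfig
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		switch fields[0] {
		case "nameserver":
			cfg.Servers = append(cfg.Servers, fields[1])
		case "search":
			cfg.Searches = append([]string(nil), fields[1:]...)
		}
	}
	return cfg
}

func main() {
	// Stubbed host lookup; a real implementation would query the OS.
	stub := func() (dnsConfig, error) {
		return dnsConfig{Servers: []string{"192.0.2.1"}}, nil
	}
	cfg, err := resolveDNSConfig("Host", stub)
	fmt.Println(cfg.Servers, err)
}
```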
vinay kulkarni
d753893260 Do not modify original pod object when processing pod resource resize 2023-03-18 17:57:25 +00:00
Dan Winship
db5590a194 Remove sig-network-driver-approvers alias
This previously existed so the dockershim networking stuff had
different approvers than general sig-network stuff, but that code is
gone now, and the remaining kubelet DNS code should be owned by
sig-network in general (and sig-node via inheritance).
2023-03-18 11:29:38 -04:00
Sergey Kanzhelev
eb60dce33b deprecate ExperimentalHostUserNamespaceDefaulting 2023-03-17 22:07:25 +00:00
vinay kulkarni
358474b71d Explicitly return from checkpoint update failures. SyncPod will retry 2023-03-17 18:00:04 +00:00
Paco Xu
5134520a3b add lock in volume manager reconciler to avoid data race
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2023-03-17 21:29:10 +08:00
vinay kulkarni
f66e8848ee Fix pod object update that may cause data race 2023-03-17 08:50:52 +00:00
Paco Xu
7afcfe1826 kubelet: use filepath.Clean before init, validate it in setupDataDirs 2023-03-17 15:45:39 +08:00
Clayton Coleman
d25572c389 kubelet: HandlePodCleanups takes an extra sync to restart pods
HandlePodCleanups is responsible for restarting pods that are no
longer running (usually due to delete and recreation with the same
UID in quick succession). We have to filter the list of pods to
restart from podManager to get the list of admitted pods, which
uses filterOutInactivePods on the kubelet. That method excludes
pods the pod worker has already terminated. Since a restarted
pod will be in the terminated state before HandlePodCleanups
calls SyncKnownPods, we have to call filterOutInactivePods after
SyncKnownPods, otherwise the to-be-restarted pod is ignored and
we have to wait for the next housekeeping cycle to restart it.

Since static pods are often critical system components, this
extra 2s wait is undesirable and we should restart as soon as
we can. Add a failing test that passes after we move the filter
call after SyncKnownPods.
2023-03-16 15:18:44 -06:00
Michal Wozniak
3d68f362c3 Give terminal phase correctly to all pods that will not be restarted 2023-03-16 21:25:29 +01:00
Clayton Coleman
58d1dc669f kubelet: Remove status manager channel
The status manager channel forces all container status to be
processed, even if multiple updates are generated in succession.
Instead of queueing the updates, just remember which ones changed
and process them in a batch. This should reduce QPS load from
the Kubelet for status, reduce latency of status propagation to
the API in general, and is easier to reason about.

This also prevents status from being lost when the channel is
full - all updates sent by SetPodStatus are guaranteed to be
recorded. Changing to remove the channel allows us to set a
marker flag when the pod worker state machine completes that
avoids the status manager having to call into the pod worker
directly.
2023-03-16 21:22:43 +01:00
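The idea in the commit above, remembering which pods changed and processing them in a batch instead of queueing every update on a channel, can be sketched as below. This is a minimal illustration under assumptions, not the status manager's actual implementation: `podStatus`, `statusBatcher`, `Set`, and `Drain` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// podStatus is a hypothetical stand-in for a pod's status.
type podStatus struct {
	Phase string
}

// statusBatcher keeps only the latest status per pod UID. Unlike a
// bounded channel it cannot fill up, so no update is dropped.
type statusBatcher struct {
	mu      sync.Mutex
	pending map[string]podStatus // keyed by pod UID
}

func newStatusBatcher() *statusBatcher {
	return &statusBatcher{pending: make(map[string]podStatus)}
}

// Set records the newest status for uid, overwriting any status for
// the same pod that has not been processed yet.
func (b *statusBatcher) Set(uid string, s podStatus) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending[uid] = s
}

// Drain returns everything that changed since the last call and
// starts a fresh batch.
func (b *statusBatcher) Drain() map[string]podStatus {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := b.pending
	b.pending = make(map[string]podStatus)
	return out
}

func main() {
	b := newStatusBatcher()
	b.Set("pod-a", podStatus{Phase: "Pending"})
	b.Set("pod-a", podStatus{Phase: "Running"}) // coalesced with the line above
	b.Set("pod-b", podStatus{Phase: "Succeeded"})

	batch := b.Drain()
	uids := make([]string, 0, len(batch))
	for uid := range batch {
		uids = append(uids, uid)
	}
	sort.Strings(uids)
	for _, uid := range uids {
		fmt.Println(uid, batch[uid].Phase)
	}
}
```

In this sketch, two successive updates for the same pod result in a single write of the newest status, which is one way the reduced QPS load mentioned in the commit message could come about.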
Dan Winship
7605163620 Split up PreferNodeIP into legacy and non-legacy versions
Though not obvious as currently written, PreferNodeIP() has different
semantics with legacy and external cloud providers, since one kind of
node IP value never gets passed in the external cloud provider case.
Split it into two functions to make this clearer (and to prepare for
adding new external-cloud-only semantics, and to make it clearer that
some of the code can be deleted when legacy cloud providers go away).
2023-03-15 14:50:17 -04:00
Ed Bartosh
1aeec10efb DRA: get rid of unneeded loops over pod containers 2023-03-15 09:41:30 +02:00
Kubernetes Prow Robot
37937bb227 Merge pull request #110566 from claudiubelu/unittests-5
Adds Pod DNS Policies support for Windows pods
2023-03-14 23:54:14 -07:00
Kubernetes Prow Robot
74123a7341 Merge pull request #116621 from moshe010/dra-lock
kubelet dra: add lock to addCDIDevices
2023-03-14 19:27:28 -07:00
Kubernetes Prow Robot
815b1bf0d8 Merge pull request #116558 from klueska/update-dra-kubeletplugin-v1alpha2
Update kubeletplugin API for DRA to v1alpha2
2023-03-14 19:27:06 -07:00
Kubernetes Prow Robot
9ddf1a02bd Merge pull request #116504 from vinaykul/restart-free-pod-vertical-scaling-kubeletonly-fix
Fix null pointer access in doPodResizeAction for kubeletonly mode
2023-03-14 19:26:59 -07:00
Kubernetes Prow Robot
ae36991498 Merge pull request #116332 from klueska/extend-resourceclaimstatus
Update resource.AllocationResult with a slice of ResourceHandlers
2023-03-14 19:26:50 -07:00
Kubernetes Prow Robot
9053b5dc2c Merge pull request #116119 from vinaykul/restart-free-pod-vertical-scaling-fixes
Restructure resize policy naming and set default resize policy values
2023-03-14 19:26:42 -07:00
Kevin Klues
579295e727 Update kubeletplugin API for DynamicResourceAllocation to v1alpha2
This PR makes the NodePrepareResources() and NodeUnprepareResource()
calls of the kubeletplugin API for DynamicResourceAllocation
symmetrical. It wasn't clear how one would use the set of CDIDevices
passed back in the NodeUnprepareResource() of the v1alpha1 API, and the
new API now passes back the full ResourceHandle that was originally
passed to the Prepare() call. Passing the ResourceHandle is strictly
more informative and a plugin could always (re)derive the set of
CDIDevice from it.

This is a breaking change, but this release is scheduled to break
multiple APIs for DynamicResourceAllocation, so it makes sense to do
this now instead of later.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-14 23:09:44 +00:00
Moshe Levi
ffb07d1e78 kubelet dra: add lock to addCDIDevices
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-15 00:50:45 +02:00
Kevin Klues
74d634a028 Update kubelet support for recent changes to resource.k8s.io/v1alpha2
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-14 22:34:18 +00:00
Kubernetes Prow Robot
715e957084 Merge pull request #115374 from pacoxu/add-net.ipv4.ip_local_reserved_ports
add net.ipv4.ip_local_reserved_ports to safe sysctls
2023-03-14 15:14:14 -07:00
Claudiu Belu
f335812719 unittests: Fixes unit tests for Windows (part 5)
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- getHostDNSConfig is reading a resolv.conf file. However, we don't have
  that on Windows. Instead, we can get the DNS server list and the DNS
  suffix list from Windows itself.

On Windows, getHostDNSConfig will now return the host's DNS configuration
if the given resolverConfig is "Host". If it's not "Host" or an empty string,
an error will be returned.

Based on the code from kubernetes/test/images/agnhost/dns/dns_windows.go
2023-03-14 22:11:29 +00:00
Francesco Romani
6b4ffdb9f7 node: re-implement Localendpoint on windows
this will allow us to move forward with the podresources
endpoint GA graduation.

xref: https://github.com/kubernetes/kubernetes/issues/78628

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-14 22:39:56 +01:00
Francesco Romani
195fc2f516 kubelet: podresources: rename variable
on unix, the podresources endpoint is a unix domain socket;
on windows, the podresources endpoint is a named pipe;
rename the variables to convey this fact. No changes in behavior.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-14 22:39:54 +01:00
Kubernetes Prow Robot
34acfb877a Merge pull request #116546 from marosset/winstats-10-seconds
Updating perfCounterUpdatePeriod for Windows to 10 seconds
2023-03-14 14:13:11 -07:00
Kubernetes Prow Robot
28fa3cbbf1 Merge pull request #115847 from moshe010/pod-resource-api-dra-upstream
Extend the PodResources API to include resources allocated by DRA
2023-03-14 14:12:26 -07:00
Kubernetes Prow Robot
89a9c0c8bb Merge pull request #96120 from LorbusChris/kubelet-journal-logs
KEP 2258: add node log query
2023-03-14 14:12:14 -07:00
vinay kulkarni
86efc8bd79 Add isInPlacePodVerticalScalingAllowed for restart check block 2023-03-14 20:30:02 +00:00
vinay kulkarni
5b2682ac04 Make in-place resize exclusion conditions (such as static pods) very obvious 2023-03-14 19:37:35 +00:00
Kubernetes Prow Robot
6a111bebe2 Merge pull request #116377 from kinvolk/rata/userns
KEP-127: user namespace support for stateless pods
2023-03-14 10:40:43 -07:00
Kubernetes Prow Robot
49649c89ea Merge pull request #113584 from yangjunmyfm192085/volume-contextual-logging
volume: use contextual logging
2023-03-14 10:40:16 -07:00
Moshe Levi
67a71c0bd7 kubelet podresources: add unit tests for DynamicResource and Get method
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
2a568bcfc8 kubelet podresources: extend List to support Dynamic Resources and implement Get API
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
9c57613912 Add ClassName to checkpoint state and in-memory cache
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
71d6e4d53c kubelet metrics: add pod resources get metrics
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:03 +02:00
Francesco Romani
5e03998991 kubelet: podresources: pack parameters in a struct
To enable rate limiting, needed for GA graduation,
we need to pass more parameters to the already crowded
`ListenAndServePodresources` function.

To tidy up a bit, pack the parameters in a helper struct,
with no intended changes in behavior.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-14 19:33:01 +02:00
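The refactoring described in the commit above, packing a growing parameter list into a helper struct, follows a common Go pattern. The sketch below shows the general shape only; `serverParams`, its fields, and `describe` are hypothetical and do not reflect the real `ListenAndServePodresources` signature.

```go
package main

import "fmt"

// serverParams is a hypothetical helper struct. Adding a field here
// (for example a rate limit) does not change any function signature.
type serverParams struct {
	Endpoint string
	QPS      float64
	Burst    int
}

// describe stands in for a function that would otherwise take each
// setting as a separate positional argument.
func describe(p serverParams) string {
	return fmt.Sprintf("endpoint=%s qps=%.0f burst=%d", p.Endpoint, p.QPS, p.Burst)
}

func main() {
	p := serverParams{
		Endpoint: "unix:///tmp/example.sock", // illustrative value
		QPS:      100,
		Burst:    10,
	}
	fmt.Println(describe(p))
}
```

Named fields also make call sites self-documenting, which matters once several parameters share the same type.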
Kubernetes Prow Robot
e192a7dbcc Merge pull request #116330 from SataQiu/clean-kubelet-20230307
Followup 112643: remove residual code associated with DynamicKubeletConfig
2023-03-14 09:39:51 -07:00
Kubernetes Prow Robot
cb5ad1e044 Merge pull request #115576 from silenceshell/fix-fake-os-files-concurrent-map-write
fix concurrent-map-write of FakeOS.Create
2023-03-14 09:39:26 -07:00
Kubernetes Prow Robot
8bf7805e05 Merge pull request #115397 from sourcelliu/boottime
Add test for pkg/kubelet/util
2023-03-14 09:39:18 -07:00
Kubernetes Prow Robot
af97bb9ac5 Merge pull request #115053 from qingwave/remove-unuse-code
Remove unused code in pkg/kubelet/util
2023-03-14 09:39:10 -07:00
Kubernetes Prow Robot
898143a96a Merge pull request #114904 from TommyStarK/kubelet/pod_startup_latency_tracker
kubelet: fix recording when pulling image did finish
2023-03-14 09:39:02 -07:00
Kubernetes Prow Robot
aa49f001bc Merge pull request #114701 from goushicui/vlm
update comment
2023-03-14 09:38:53 -07:00