10612 Commits

Author SHA1 Message Date
Moshe Levi
ffb07d1e78 kubelet dra: add lock to addCDIDevices
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-15 00:50:45 +02:00
Kubernetes Prow Robot
715e957084 Merge pull request #115374 from pacoxu/add-net.ipv4.ip_local_reserved_ports
add net.ipv4.ip_local_reserved_ports to safe sysctls
2023-03-14 15:14:14 -07:00
Kubernetes Prow Robot
34acfb877a Merge pull request #116546 from marosset/winstats-10-seconds
Updating perfCounterUpdatePeriod for Windows to 10 seconds
2023-03-14 14:13:11 -07:00
Kubernetes Prow Robot
28fa3cbbf1 Merge pull request #115847 from moshe010/pod-resource-api-dra-upstream
Extend the PodResources API to include resources allocated by DRA
2023-03-14 14:12:26 -07:00
Kubernetes Prow Robot
89a9c0c8bb Merge pull request #96120 from LorbusChris/kubelet-journal-logs
KEP 2258: add node log query
2023-03-14 14:12:14 -07:00
Kubernetes Prow Robot
6a111bebe2 Merge pull request #116377 from kinvolk/rata/userns
KEP-127: user namespace support for stateless pods
2023-03-14 10:40:43 -07:00
Kubernetes Prow Robot
49649c89ea Merge pull request #113584 from yangjunmyfm192085/volume-contextual-logging
volume: use contextual logging
2023-03-14 10:40:16 -07:00
Moshe Levi
67a71c0bd7 kubelet podresources: add unit tests for DynamicResource and Get method
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
2a568bcfc8 kubelet podresources: extend List to support Dynamic Resources and implement Get API
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
9c57613912 Add ClassName to checkpoint state and in-memory cache
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
71d6e4d53c kubelet metrics: add pod resources get metrics
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:03 +02:00
Francesco Romani
5e03998991 kubelet: podresources: pack parameters in a struct
To enable rate limiting, needed for GA graduation,
we need to pass more parameters to the already crowded
`ListenAndServePodresources` function.

To tidy up a bit, pack the parameters in a helper struct,
with no intended changes in behavior.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-14 19:33:01 +02:00
Kubernetes Prow Robot
e192a7dbcc Merge pull request #116330 from SataQiu/clean-kubelet-20230307
Followup 112643: remove residual code associated with DynamicKubeletConfig
2023-03-14 09:39:51 -07:00
Kubernetes Prow Robot
cb5ad1e044 Merge pull request #115576 from silenceshell/fix-fake-os-files-concurrent-map-write
fix concurrent-map-write of FakeOS.Create
2023-03-14 09:39:26 -07:00
Kubernetes Prow Robot
8bf7805e05 Merge pull request #115397 from sourcelliu/boottime
Add test for pkg/kubelet/util
2023-03-14 09:39:18 -07:00
Kubernetes Prow Robot
af97bb9ac5 Merge pull request #115053 from qingwave/remove-unuse-code
Remove unused code in pkg/kubelet/util
2023-03-14 09:39:10 -07:00
Kubernetes Prow Robot
898143a96a Merge pull request #114904 from TommyStarK/kubelet/pod_startup_latency_tracker
kubelet: fix recording when pulling image did finish
2023-03-14 09:39:02 -07:00
Kubernetes Prow Robot
aa49f001bc Merge pull request #114701 from goushicui/vlm
update comment
2023-03-14 09:38:53 -07:00
Kubernetes Prow Robot
b623fcc181 Merge pull request #114634 from TommyStarK/unit-tests/pkg-kubelet-cloudresource
kubelet/cloudresource: Improving test coverage
2023-03-14 09:38:45 -07:00
kunkunhaohao
a772691165 Update pod_container_manager_linux.go (#114598)
* Update pod_container_manager_linux.go

This is a simple optimization that avoids repeated calls to the GetPodContainerName function.

* Update pod_container_manager_linux.go

Move podContainerName, _ := m.GetPodContainerName(pod) closer to where the podContainerName variable is used
2023-03-14 09:38:36 -07:00
Aravindh Puthiyaparambil
d12696c20f kubelet: Expose simple journald and Get-WinEvent shims on the logs endpoint
Provide an administrator a streaming view of journal logs on Linux
systems using journalctl, and event logs on Windows systems using the
Get-WinEvent PowerShell cmdlet without them having to implement a client
side reader.

Only available to cluster admins.

The implementation for journald on Linux was originally done by Clayton
Coleman.

Introduce a heuristics approach to query logs

The logs query for node objects will follow a heuristics approach
when asked to query for logs from a service. If asked to get the
logs from a service foobar, it will first check if foobar logs to the
native OS service log provider. If unable to get logs from these, it
will attempt to get logs from /var/foobar, /var/log/foobar.log or
/var/log/foobar/foobar.log in that order.
The logs sub-command can also directly serve a file if the query looks
like a file.

Co-authored-by: Clayton Coleman <ccoleman@redhat.com>
Co-authored-by: Christian Glombek <cglombek@redhat.com>
2023-03-14 08:54:36 -07:00
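The file fallback order described in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the kubelet implementation: the function names and the `root` parameter are hypothetical, the check against the native OS service log provider is left out, and only the three paths named in the message are used.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidateLogPaths returns the fallback file locations for a service, in the
// order listed in the commit message. The function name and the root
// parameter are illustrative assumptions, not taken from the kubelet source.
func candidateLogPaths(root, service string) []string {
	return []string{
		filepath.Join(root, "var", service),
		filepath.Join(root, "var", "log", service+".log"),
		filepath.Join(root, "var", "log", service, service+".log"),
	}
}

// firstExistingLog returns the first candidate that exists as a regular file.
func firstExistingLog(root, service string) (string, bool) {
	for _, p := range candidateLogPaths(root, service) {
		if info, err := os.Stat(p); err == nil && info.Mode().IsRegular() {
			return p, true
		}
	}
	return "", false
}

func main() {
	if p, ok := firstExistingLog("/", "foobar"); ok {
		fmt.Println("would serve", p)
	} else {
		fmt.Println("no log file found for foobar")
	}
}
```

Keeping the candidate list in one function makes the lookup order easy to test separately from the filesystem access.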
Aravindh Puthiyaparambil
26279a5282 kubelet: Add validation for EnableNodeLogQuery
2023-03-14 08:45:20 -07:00
Aravindh Puthiyaparambil
aadad09410 api: Add EnableNodeLogQuery to KubeletConfiguration
Added EnableNodeLogQuery field to kubelet/apis/config/types.go and
staging/src/k8s.io/kubelet/config/v1beta1/types.go, then executed
`hack/update-codegen.sh`.

This new field will default to off and will need to be explicitly
enabled in addition to the NodeLogQuery gate to use the feature.
2023-03-14 08:45:19 -07:00
Kubernetes Prow Robot
a9008b502d Merge pull request #116577 from jsafrane/fix-standalone-mode
Fix volume reconstruction in standalone mode
2023-03-14 08:37:02 -07:00
Kubernetes Prow Robot
204a9a1f17 Merge pull request #116459 from ffromani/podresources-ratelimit-minimal
add podresources DOS prevention using rate limit
2023-03-14 08:36:45 -07:00
Kubernetes Prow Robot
2bd69db8d7 Merge pull request #116351 from vinaykul/restart-free-pod-vertical-scaling-kubelet-fix-followup
Initialize pod resource allocation checkpoint manager to noop
2023-03-14 08:36:37 -07:00
Jan Safranek
c4f8c3f628 Fix volume reconstruction in standalone mode
Kubelet in standalone mode won't have a kubeclient, so it cannot get
node.status and read devices from it. Such a kubelet cannot mount
attachable volumes anyway.
2023-03-14 12:32:21 +01:00
Kubernetes Prow Robot
c8f001d798 Merge pull request #114504 from vrutkovs/tracing-kubelet-toplevel
kubelet: create top-level traces for pod sync and GC
2023-03-14 03:12:16 -07:00
Patrick Ohly
29941b8d3e api: resource.k8s.io v1alpha1 -> v1alpha2
For Kubernetes 1.27, we intend to make some breaking API changes:
- rename PodScheduling -> PodSchedulingHints (https://github.com/kubernetes/kubernetes/issues/114283)
- extend ResourceClaimStatus (https://github.com/kubernetes/enhancements/pull/3802)

We need to switch from v1alpha1 to v1alpha2 for that.
2023-03-14 07:52:03 +01:00
Kubernetes Prow Robot
dfc63f218c Merge pull request #116557 from smarterclayton/sync_known_race
kubelet: TestSyncKnownPods should not race
2023-03-13 22:27:24 -07:00
Kubernetes Prow Robot
e998b09bc4 Merge pull request #116555 from bart0sh/PR106-dra-plugin-constant
DRA: add constant PluginClientTimeout
2023-03-13 17:51:31 -07:00
杨军10092085
361e4ff0fa volume: use contextual logging
2023-03-14 08:37:30 +08:00
Kubernetes Prow Robot
4641604be8 Merge pull request #116513 from klueska/dra-update-multipleplugins
Update DRAManager to allow multiple plugins to process a single claim
2023-03-13 16:49:14 -07:00
Ed Bartosh
50cb3268b6 DRA: add constant PluginClientTimeout
2023-03-14 00:37:43 +02:00
Clayton Coleman
71a36529d1 kubelet: TestSyncKnownPods should not race
SyncKnownPods began triggering UpdatePod() for pods that have been
orphaned by desired config to ensure pods run to termination. This
test reads a mutex protected value while pod workers are running
in the background and as a consequence triggers a data race.

Wait for the workers to stabilize before reading the value. Other
tests validate that the correct sync events are triggered (see
kubelet_pods_test.go#TestKubelet_HandlePodCleanups for full
verification of this behavior).

It is slightly concerning that I was unable to recreate the race
locally even under stress testing, but I cannot identify why.
2023-03-13 16:24:37 -06:00
Rodrigo Campos
ec0410a266 kubelet: Move userns manager to its own package
To that end, we need to add one kubelet getter listPodsFromDisk(). Other
than that, it is a pretty trivial move.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-03-13 22:28:04 +01:00
Rodrigo Campos
16d76f6813 kubelet: Don't reserve mapping for userns phase II
Latest changes to KEP-127 removed that phase, so let's stop reserving
those IDs for that.

While we are there, we replace 0 with 0*65536. We previously had a bug
where the index was not multiplied, and writing the multiplication
explicitly helps avoid similar bugs in the future.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-03-13 22:28:04 +01:00
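The index multiplication mentioned in the commit message above can be illustrated with a small sketch. It assumes each index owns a block of 65536 IDs (the value named in the message). The helper and constant names are hypothetical and are not the userns manager's actual code.

```go
package main

import "fmt"

// idsPerBlock is the size of each ID block; 65536 is the value named in the
// commit message. The constant name is an assumption made for this sketch.
const idsPerBlock = 65536

// firstIDForIndex returns the first ID of the block with the given index.
// Writing the multiplication explicitly, even for index 0, mirrors the
// "0*65536" form the commit message describes. Hypothetical helper.
func firstIDForIndex(index uint32) uint32 {
	return index * idsPerBlock
}

func main() {
	for _, i := range []uint32{0, 1, 2} {
		fmt.Printf("block %d starts at ID %d\n", i, firstIDForIndex(i))
	}
}
```

With `uint32` arithmetic the result wraps for indexes of 65536 and above, so a real allocator would need to bound the index.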
Rodrigo Campos
8af3cce7fe kubelet: remove GetHostIDsForPod()
Now KEP-127 relies on idmap mounts to do the ID translation and we won't
do any chowns in the kubelet.

This patch just removes the usage of GetHostIDsForPod() in
operationexecutor to do the chown, and also removes the
GetHostIDsForPod() method from the kubelet volume interface.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-03-13 22:28:03 +01:00
Giuseppe Scrivano
9075404dc4 kubelet: use idmapped mounts for all volumes
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-03-13 22:28:03 +01:00
Mark Rossetti
b3a67abec7 Updating perfCounterUpdatePeriod for Windows to 10 seconds
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2023-03-13 12:13:24 -07:00
Kubernetes Prow Robot
3106a5c553 Merge pull request #116301 from andyzhangx/remove-azuredisk-code
Remove Azure disk in-tree storage plugin
2023-03-13 10:38:48 -07:00
Kevin Klues
685688c703 Update DRAManager to allow multiple plugins to process a single claim
Right now, the v1alpha1 API only passes enough information for one plugin to
process a claim, but the v1alpha2 API will allow for multiple plugins to
process a claim. This commit prepares the code for this upcoming change.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-13 12:52:41 +00:00
Kevin Klues
569ed33d78 Add additional tests to DRAManager checkpointing
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-13 12:52:41 +00:00
Kevin Klues
fd7370b84d Update DRAManager checkpoint to store a map for CDIDevices
The key of the map is the KubeletPluginName where the CDIDevices originate.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-13 12:52:41 +00:00
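A map of CDI devices keyed by plugin name, as the commit message above describes, might look roughly like the sketch below. The mutex reflects the later commit in this list (ffb07d1e78) that adds a lock to addCDIDevices. The `claimInfo` type, its fields, and the device strings are hypothetical stand-ins and not the DRAManager's real data structures.

```go
package main

import (
	"fmt"
	"sync"
)

// claimInfo is a hypothetical stand-in for per-claim state.
type claimInfo struct {
	mu sync.Mutex
	// cdiDevices maps a plugin name to the CDI devices that plugin provided.
	cdiDevices map[string][]string
}

func newClaimInfo() *claimInfo {
	return &claimInfo{cdiDevices: make(map[string][]string)}
}

// addCDIDevices appends devices under the given plugin name while holding
// the lock, so concurrent callers do not race on the map.
func (c *claimInfo) addCDIDevices(pluginName string, devices []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cdiDevices[pluginName] = append(c.cdiDevices[pluginName], devices...)
}

// devicesFor returns a copy of the devices recorded for one plugin.
func (c *claimInfo) devicesFor(pluginName string) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]string(nil), c.cdiDevices[pluginName]...)
}

func main() {
	info := newClaimInfo()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// The device names are made up for the example.
			info.addCDIDevices("plugin-a", []string{fmt.Sprintf("vendor.example/class=dev%d", i)})
		}(i)
	}
	wg.Wait()
	fmt.Println("devices recorded for plugin-a:", len(info.devicesFor("plugin-a")))
}
```

Keying by plugin name lets several plugins contribute devices to the same claim without overwriting each other's entries.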
Kevin Klues
273a8ffad1 Rename CdiDevices to CDIDevices in dramanager checkpoint
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-13 12:52:41 +00:00
Claudiu Belu
e3edf13486 unittests: Adds winstats unittests
The module pkg/kubelet/winstats has almost no coverage for Windows. This
commit adds unit tests to cover the mentioned module.
2023-03-13 12:08:15 +00:00
Saza
d34b0275a3 dynamic resource allocation: add timeouts for communication with plugin (#114844)
* add timeouts for communication with dra plugin

* move timeout constant to k8s.io/kubernetes/pkg/kubelet/cm/util

* move settings of timeout to pkg/kubelet/plugin/dra/plugin/client.go

* remove timeout constant
2023-03-13 04:34:56 -07:00
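A common way to bound a plugin call in Go is `context.WithTimeout`. The sketch below shows that pattern only. The constant name echoes the PluginClientTimeout constant mentioned elsewhere in this list, but its value here is arbitrary, and `callWithTimeout`, `slowCall`, and `runDemo` are hypothetical helpers and not the DRA plugin client.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pluginClientTimeout is an arbitrary value chosen for this sketch.
const pluginClientTimeout = 50 * time.Millisecond

// callWithTimeout runs call with a context that is cancelled once the
// timeout elapses. The call parameter stands in for a plugin RPC.
func callWithTimeout(parent context.Context, timeout time.Duration, call func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return call(ctx)
}

// slowCall simulates work that takes d to finish but honors cancellation.
func slowCall(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// runDemo issues either a fast or a slow simulated call under the timeout.
func runDemo(slow bool) error {
	d := time.Millisecond
	if slow {
		d = time.Second
	}
	return callWithTimeout(context.Background(), pluginClientTimeout, slowCall(d))
}

func main() {
	fmt.Println("fast call error:", runDemo(false))
	err := runDemo(true)
	fmt.Println("slow call timed out:", errors.Is(err, context.DeadlineExceeded))
}
```

The timeout only helps if the callee watches the context, which is why the simulated call selects on `ctx.Done()`.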
John Kwiatkoski
69465d2949 Adding test coverage for NewPodContainerManager() (#110220)
2023-03-13 02:08:44 -07:00
vinay kulkarni
1e01358ea2 Initialize pod resource allocation checkpoint manager to noop
This avoids accidentally introducing null pointer access if manager
functions were called outside of InPlacePodVerticalScaling feature gate.
2023-03-13 00:16:44 +00:00
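The idea in the commit message above, defaulting a manager to a no-op implementation so that calls are safe even when the feature is off, can be sketched as follows. The interface, the types, and the constructor are hypothetical and much simpler than the kubelet's actual checkpoint manager.

```go
package main

import "fmt"

// allocationManager is a hypothetical, cut-down manager interface.
type allocationManager interface {
	GetAllocation(podUID string) (string, bool)
	SetAllocation(podUID, value string)
}

// noopManager satisfies the interface but stores nothing.
type noopManager struct{}

func (noopManager) GetAllocation(string) (string, bool) { return "", false }
func (noopManager) SetAllocation(string, string)        {}

// mapManager is a simple in-memory implementation for the enabled case.
type mapManager struct{ data map[string]string }

func (m *mapManager) GetAllocation(podUID string) (string, bool) {
	v, ok := m.data[podUID]
	return v, ok
}

func (m *mapManager) SetAllocation(podUID, value string) { m.data[podUID] = value }

// kubelet is a stand-in struct holding the manager.
type kubelet struct{ alloc allocationManager }

// newKubelet always sets a usable manager: a no-op by default, replaced by a
// real one only when the feature is enabled, so the field is never nil.
func newKubelet(featureEnabled bool) *kubelet {
	k := &kubelet{alloc: noopManager{}}
	if featureEnabled {
		k.alloc = &mapManager{data: make(map[string]string)}
	}
	return k
}

func main() {
	for _, enabled := range []bool{false, true} {
		k := newKubelet(enabled)
		k.alloc.SetAllocation("pod-1", "cpu=500m")
		v, ok := k.alloc.GetAllocation("pod-1")
		fmt.Printf("enabled=%v value=%q found=%v\n", enabled, v, ok)
	}
}
```

Callers can then use the manager unconditionally instead of guarding every call with a nil check.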
Kubernetes Prow Robot
3c6e419cc3 Merge pull request #116450 from vinaykul/restart-free-pod-vertical-scaling-api
Rename ContainerStatus.ResourcesAllocated to ContainerStatus.AllocatedResources
2023-03-12 16:06:40 -07:00