kubernetes

Author	SHA1	Message	Date
carlory	3836d58744	fix handle terminating pvc when kubelet rebuild dsw Signed-off-by: carlory <baofa.fan@daocloud.io>	2025-03-10 18:59:59 +08:00
Richa Banker	19ebee96b2	Add tests	2025-02-10 14:39:06 -08:00
Tim Allclair	4272f7016c	Kubelet server handler cleanup	2025-02-06 11:04:01 -08:00
Kubernetes Prow Robot	d7fc7e30cb	Merge pull request #129519 from kishen-v/automated-cherry-pick-of-#127422-upstream-release-1.31 Automated cherry pick of #127422: Fix Go vet errors for master golang	2025-01-22 11:10:37 -08:00
Aravindh Puthiyaparambil	c94919d68b	kubelet: use env vars in node log query PS command - Use environment variables to pass string arguments in the node log query PS command - Split getLoggingCmd into getLoggingCmdEnv and getLoggingCmdArgs for better modularization	2025-01-13 14:46:05 -08:00
Abhishek Kr Srivastav	9d10ddb060	Fix Go vet errors for master golang Co-authored-by: Rajalakshmi-Girish <rajalakshmi.girish1@ibm.com> Co-authored-by: Abhishek Kr Srivastav <Abhishek.kr.srivastav@ibm.com>	2025-01-08 15:11:34 +05:30
carlory	04f5b20388	kubelet: Fix the volume manager didn't check the device mount state in the actual state of the world before marking the volume as detached. It may cause a pod to be stuck in the Terminating state due to the above issue when it was deleted.	2024-12-03 09:47:51 +08:00
Kubernetes Prow Robot	a8a78f0da6	Merge pull request #127212 from SergeyKanzhelev/automated-cherry-pick-of-#126543-upstream-release-1.31 Automated cherry pick of #126543: Restart the init container to not be stuck in created state	2024-09-09 23:15:07 +01:00
Kubernetes Prow Robot	939edc7c6b	Merge pull request #127207 from SergeyKanzhelev/automated-cherry-pick-of-#126343-upstream-release-1.31 Automated cherry pick of #126343: Terminated pod should not be re-admitted	2024-09-09 22:05:54 +01:00
Gunju Kim	fc5d752394	Restart the init container to not be stuck in created state The main sync loop should have created and started the container in one step. If the init container is in the 'created' state, it's likely that the container runtime failed to start it. To prevent the container from getting stuck in the 'created' state, restart it.	2024-09-06 20:00:48 +00:00
Sergey Kanzhelev	8a28b17c3a	succeeded pod is being re-admitted	2024-09-06 18:27:57 +00:00
Gunju Kim	8469207728	Avoid SidecarContainers code path for non-sidecar pods This fixes a regression in the SidecarContainers feature by minimizing the impact of the new code path. Use the old code path for pods without restartable init containers, and apply the new code path only to pods with restartable init containers.	2024-09-06 16:37:09 +00:00
James Sturtevant	2454d8d4c3	Revert "fix: handle socket file detection on Windows" This reverts commit `4060ee60c1`.	2024-09-03 17:40:06 +00:00
Jordan Liggitt	d8da86b16d	Switch DisableNodeKubeProxyVersion back to disabled-by-default This is clearing a stable API field, so the 1 year from announcement to change period applies	2024-08-15 13:16:30 -04:00
Kubernetes Prow Robot	b5b21717ca	Merge pull request #126427 from pacoxu/fix-TestUpdateAllocatedResourcesStatus ignore order of containers status allocated resources	2024-07-29 15:54:07 -07:00
Sascha Grunert	50e430b3e9	Fix kubelet cadvisor stats runtime panic Fixing a kubelet runtime panic when the runtime returns incomplete data: ``` E0729 08:17:47.260393 5218 panic.go:115] "Observed a panic" panic="runtime error: index out of range [0] with length 0" panicGoValue="runtime.boundsError{x:0, y:0, signed:true, code:0x0}" stacktrace=< goroutine 174 [running]: k8s.io/apimachinery/pkg/util/runtime.logPanic({0x33631e8, 0x4ddf5c0}, {0x2c9bfe0, 0xc000a563f0}) k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0xbc k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x33631e8, 0x4ddf5c0}, {0x2c9bfe0, 0xc000a563f0}, {0x4ddf5c0, 0x0, 0x10000000043c9e5?}) k8s.io/apimachinery/pkg/util/runtime/runtime.go:82 +0x5e k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000ae08c0?}) k8s.io/apimachinery/pkg/util/runtime/runtime.go:59 +0x108 panic({0x2c9bfe0?, 0xc000a563f0?}) runtime/panic.go:785 +0x132 k8s.io/kubernetes/pkg/kubelet/stats.(*cadvisorStatsProvider).ImageFsStats(0xc000535d10, {0x3363348, 0xc000afa330}) k8s.io/kubernetes/pkg/kubelet/stats/cadvisor_stats_provider.go:277 +0xaba k8s.io/kubernetes/pkg/kubelet/images.(*realImageGCManager).GarbageCollect(0xc000a3c820, {0x33631e8?, 0x4ddf5c0?}, {0x0?, 0x0?, 0x4dbca20?}) k8s.io/kubernetes/pkg/kubelet/images/image_gc_manager.go:354 +0x1d3 k8s.io/kubernetes/pkg/kubelet.(*Kubelet).StartGarbageCollection.func2() k8s.io/kubernetes/pkg/kubelet/kubelet.go:1472 +0x58 k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?) k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33 k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000add110, {0x3330380, 0xc000afa300}, 0x1, 0xc0000ac150) k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000add110, 0x45d964b800, 0x0, 0x1, 0xc0000ac150) k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f k8s.io/apimachinery/pkg/util/wait.Until(...) k8s.io/apimachinery/pkg/util/wait/backoff.go:161 created by k8s.io/kubernetes/pkg/kubelet.(*Kubelet).StartGarbageCollection in goroutine 1 k8s.io/kubernetes/pkg/kubelet/kubelet.go:1470 +0x247 ``` This commit fixes panics if: - `len(imageStats.ImageFilesystems) == 0` - `len(imageStats.ContainerFilesystems) == 0` - `imageStats.ImageFilesystems[0].FsId == nil` - `imageStats.ContainerFilesystems[0].FsId == nil` - `imageStats.ImageFilesystems[0].UsedBytes == nil` - `imageStats.ContainerFilesystems[0].UsedBytes == nil` It also fixes the wrapped `nil` error for the check: `err != nil || imageStats == nil` in case that `imageStats == nil`. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2024-07-29 14:13:47 +02:00
Paco Xu	78d3830d97	ignore order of containers status allocated resources	2024-07-29 16:48:00 +08:00
Kubernetes Prow Robot	e9d9a82839	Merge pull request #124101 from haircommander/process_stats-with-pid-fix kubelet: fix PID based eviction	2024-07-25 11:59:57 -07:00
Kevin Hannon	3e642aee3f	move container fs check so that we only check if system is split	2024-07-24 11:22:23 -04:00
carlory	c4851c64a0	remove volumeoptions from VolumePlugin and BlockVolumePlugin	2024-07-24 14:07:02 +08:00
Kubernetes Prow Robot	57d197fb89	Merge pull request #124430 from AllenXu93/fix-kubelet-restart-notReady fix node notReady in first sync period after kubelet restart	2024-07-23 21:20:40 -07:00
Kubernetes Prow Robot	5af1710d90	Merge pull request #126243 from SergeyKanzhelev/devicePluginFailures Implement resource health in pod status (KEP 4680)	2024-07-23 20:12:24 -07:00
Kubernetes Prow Robot	d97cf3a1eb	Merge pull request #126303 from bart0sh/PR150-dra-refactor-checkpoint-upstream DRA: refactor checkpointing	2024-07-23 18:01:53 -07:00
Sergey Kanzhelev	62f96d2748	set AllocatedResourcesStatus in the Pod Status	2024-07-24 00:29:35 +00:00
Kubernetes Prow Robot	fa4b8f32ac	Merge pull request #125935 from gjkim42/fix-125880 Terminate restartable init containers ignoring not-started containers	2024-07-23 15:45:11 -07:00
Ed Bartosh	c0d922e786	DRA: Kubelet code cleanup	2024-07-24 00:27:52 +03:00
Ed Bartosh	59555c6a62	DRA: move dra/checkpont/* to dra/state/*	2024-07-24 00:12:10 +03:00
Ed Bartosh	35fbbc5cfd	DRA: use crc32.ChecksumIEEE to calculate checkpoint checksum	2024-07-24 00:10:39 +03:00
Ed Bartosh	59daed75d6	DRA: refactor checkpointing Co-authored-by: Kevin Klues <klueska@gmail.com>	2024-07-24 00:10:30 +03:00
Kubernetes Prow Robot	107f621462	Merge pull request #126108 from gnufied/changes-volume-recovery Reduce state changes when expansion fails and mark certain failures as infeasible	2024-07-23 13:30:56 -07:00
Kubernetes Prow Robot	fbdfb9d8d9	Merge pull request #126031 from harche/kubelet_cgroupv1_arg KEP-4569: Kubelet option to disable cgroup v1 support	2024-07-23 09:21:11 -07:00
Kubernetes Prow Robot	a4f9910c51	Merge pull request #126014 from PannagaRao/kep-ephemeral-storage-quota pkg/volume/*: Enable quotas in user namespace	2024-07-23 09:21:02 -07:00
Kubernetes Prow Robot	d7194eb370	Merge pull request #124884 from carlory/report-event-when-kubelet-attach-failed report an event to pod if kubelet does attach operation failed	2024-07-23 09:20:43 -07:00
Kubernetes Prow Robot	581a073dc4	Merge pull request #125663 from saschagrunert/oci-volumesource-kubelet [KEP-4639] Add `ImageVolumeSource` implementation	2024-07-22 15:48:33 -07:00
Kubernetes Prow Robot	d21b17264e	Merge pull request #125488 from pohly/dra-1.31 DRA for 1.31	2024-07-22 11:45:55 -07:00
Sascha Grunert	979863d15c	Add `ImageVolumeSource` implementation This patch adds the kubelet implementation of the image volume source feature. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2024-07-22 18:46:46 +02:00
Patrick Ohly	d11b58efe6	DRA kubelet: refactor gRPC call timeouts Some of the E2E node tests were flaky. Their timeout apparently was chosen under the assumption that kubelet would retry immediately after a failed gRPC call, with a factor of 2 as safety margin. But according to `0449cef8fd`, kubelet has a different, higher retry period of 90 seconds, which was exactly the test timeout. The test timeout has to be higher than that. As the tests don't use the gRPC call timeout anymore, it can be made private. While at it, the name and documentation gets updated.	2024-07-22 18:09:34 +02:00
Patrick Ohly	877829aeaa	DRA kubelet: adapt to v1alpha3 API This adds the ability to select specific requests inside a claim for a container. NodePrepareResources is always called, even if the claim is not used by any container. This could be useful for drivers where that call has some effect other than injecting CDI device IDs into containers. It also ensures that drivers can validate configs. The pod resource API can no longer report a class for each claim because there is no such 1:1 relationship anymore. Instead, that API reports claim, API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet itself doesn't extract that information from the claim. Instead, it relies on drivers to report this information when the claim gets prepared. This isolates the kubelet from API changes. Because of a faulty E2E test, kubelet was told to contact the wrong driver for a claim. This was not visible in the kubelet log output. Now changes to the claim info cache are getting logged. While at it, naming of variables and some existing log output gets harmonized. Co-authored-by: Oksana Baranova <oksana.baranova@intel.com> Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	91d7882e86	DRA: new API for 1.31 This is a complete revamp of the original API. Some of the key differences: - refocused on structured parameters and allocating devices - support for constraints across devices - support for allocating "all" or a fixed amount of similar devices in a single request - no class for ResourceClaims, instead individual device requests are associated with a mandatory DeviceClass For the sake of simplicity, optional basic types (ints, strings) where the null value is the default are represented as values in the API types. This makes Go code simpler because it doesn't have to check for nil (consumers) and values can be set directly (producers). The effect is that in protobuf, these fields always get encoded because `opt` only has an effect for pointers. The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new "request" field. This is considered acceptable because the entire `claims` field in the pod spec is still alpha. The implementation is complete enough to bring up the apiserver. Adapting other components follows.	2024-07-22 18:09:34 +02:00
Itamar Holder	6c1f14c468	unit tests: exclude critical pods from swapping Signed-off-by: Itamar Holder <iholder@redhat.com>	2024-07-22 17:56:52 +03:00
Itamar Holder	532cd5f84c	Exclude critical pods from having swap access Signed-off-by: Itamar Holder <iholder@redhat.com>	2024-07-22 17:56:52 +03:00
Peter Hunt	5fd7219cf4	kubelet/stats: fix pid stats for cadvisor stats provider the process stats aren't correct coming from only the pod stats. They need to be summed for all of the containers, as cadvisor is only reading per pid (per container process) Signed-off-by: Peter Hunt <pehunt@redhat.com>	2024-07-22 10:54:42 -04:00
Kevin Hannon	7d8ba7849b	priority pid tests should match on processes pids 0 process should not be nonzero	2024-07-22 10:54:42 -04:00
David Porter	6e6b2b76a3	test: Update summary test to check for process count The process count is expected to always be >= 1 for pods in the test. Let's check it's >= 1, so we can catch issues if the process count is not reported. Signed-off-by: David Porter <david@porter.me> Signed-off-by: Paco Xu <paco.xu@daocloud.io>	2024-07-22 10:54:42 -04:00
David Porter	f58b46cb97	fix process stats Signed-off-by: David Porter <david@porter.me>	2024-07-22 10:54:42 -04:00
PannagaRamamanohara	d16fd6a915	pkg/volume: Use QuotaMonitoring in UserNamespace Enable LocalStorageCapacityIsolationFSQuotaMonitoring only when hostUsers in PodSpec is set to false. Modify unit tests and e2e tests to verify Signed-off-by: PannagaRamamanohara <pbhojara@redhat.com>	2024-07-22 09:43:57 -04:00
Patrick Ohly	b51d68bb87	DRA: bump API v1alpha2 -> v1alpha3 This is in preparation for revamping the resource.k8s.io completely. Because there will be no support for transitioning from v1alpha2 to v1alpha3, the roundtrip test data for that API in 1.29 and 1.30 gets removed. Repeating the version in the import name of the API packages is not really required. It was done for a while to support simpler grepping for usage of alpha APIs, but there are better ways for that now. So during this transition, "resourceapi" gets used instead of "resourcev1alpha3" and the version gets dropped from informer and lister imports. The advantage is that the next bump to v1beta1 will affect fewer source code lines. Only source code where the version really matters (like API registration) retains the versioned import.	2024-07-21 17:28:13 +02:00
Kubernetes Prow Robot	558c9536a1	Merge pull request #123678 from kinvolk/userns-use-kubelet-user-mappings kubelet: Add logs for userns custom mappings parsing	2024-07-20 19:59:57 -07:00
Kubernetes Prow Robot	a8d354bf39	Merge pull request #126122 from HirazawaUi/remove-unused-options kubelet: Remove unused run container options	2024-07-19 18:05:16 -07:00
Kubernetes Prow Robot	14b34fc255	Merge pull request #125834 from tallclair/log-cleanup [kubelet] Cleanup incorrect log about static pod status change	2024-07-19 16:58:54 -07:00

11471 Commits