50446 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
b5b21717ca Merge pull request #126427 from pacoxu/fix-TestUpdateAllocatedResourcesStatus
ignore order of containers status allocated resources
2024-07-29 15:54:07 -07:00
Kubernetes Prow Robot
e8588e6493 Merge pull request #126429 from saschagrunert/kubelet-panic
Fix kubelet cadvisor stats runtime panic
2024-07-29 11:06:07 -07:00
Micah Hausler
a7af830209 Rename kubelet CSR admission feature gate
Retitle the feature to the affirmative ("AllowInsecure...=false") instead of a
double-negative ("Disable$NEWTHING...=false") for clarity

Signed-off-by: Micah Hausler <mhausler@amazon.com>
2024-07-29 10:14:19 -05:00
Sascha Grunert
50e430b3e9 Fix kubelet cadvisor stats runtime panic
Fixing a kubelet runtime panic when the runtime returns incomplete data:

```
E0729 08:17:47.260393    5218 panic.go:115] "Observed a panic" panic="runtime error: index out of range [0] with length 0" panicGoValue="runtime.boundsError{x:0, y:0, signed:true, code:0x0}" stacktrace=<
        goroutine 174 [running]:
        k8s.io/apimachinery/pkg/util/runtime.logPanic({0x33631e8, 0x4ddf5c0}, {0x2c9bfe0, 0xc000a563f0})
                k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0xbc
        k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x33631e8, 0x4ddf5c0}, {0x2c9bfe0, 0xc000a563f0}, {0x4ddf5c0, 0x0, 0x10000000043c9e5?})
                k8s.io/apimachinery/pkg/util/runtime/runtime.go:82 +0x5e
        k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000ae08c0?})
                k8s.io/apimachinery/pkg/util/runtime/runtime.go:59 +0x108
        panic({0x2c9bfe0?, 0xc000a563f0?})
                runtime/panic.go:785 +0x132
        k8s.io/kubernetes/pkg/kubelet/stats.(*cadvisorStatsProvider).ImageFsStats(0xc000535d10, {0x3363348, 0xc000afa330})
                k8s.io/kubernetes/pkg/kubelet/stats/cadvisor_stats_provider.go:277 +0xaba
        k8s.io/kubernetes/pkg/kubelet/images.(*realImageGCManager).GarbageCollect(0xc000a3c820, {0x33631e8?, 0x4ddf5c0?}, {0x0?, 0x0?, 0x4dbca20?})
                k8s.io/kubernetes/pkg/kubelet/images/image_gc_manager.go:354 +0x1d3
        k8s.io/kubernetes/pkg/kubelet.(*Kubelet).StartGarbageCollection.func2()
                k8s.io/kubernetes/pkg/kubelet/kubelet.go:1472 +0x58
        k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
                k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33
        k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000add110, {0x3330380, 0xc000afa300}, 0x1, 0xc0000ac150)
                k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf
        k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000add110, 0x45d964b800, 0x0, 0x1, 0xc0000ac150)
                k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f
        k8s.io/apimachinery/pkg/util/wait.Until(...)
                k8s.io/apimachinery/pkg/util/wait/backoff.go:161
        created by k8s.io/kubernetes/pkg/kubelet.(*Kubelet).StartGarbageCollection in goroutine 1
                k8s.io/kubernetes/pkg/kubelet/kubelet.go:1470 +0x247
```

This commit fixes panics if:

- `len(imageStats.ImageFilesystems) == 0`
- `len(imageStats.ContainerFilesystems) == 0`
- `imageStats.ImageFilesystems[0].FsId == nil`
- `imageStats.ContainerFilesystems[0].FsId == nil`
- `imageStats.ImageFilesystems[0].UsedBytes == nil`
- `imageStats.ContainerFilesystems[0].UsedBytes == nil`

It also fixes the wrapped `nil` error produced by the check
`err != nil || imageStats == nil` when only `imageStats == nil` holds.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-07-29 14:13:47 +02:00
Paco Xu
78d3830d97 ignore order of containers status allocated resources
2024-07-29 16:48:00 +08:00
Dr. Stefan Schimanski
3987d850a4 kube-apiserver/leaderelection/test: clean up controller test
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-29 09:56:39 +02:00
Dr. Stefan Schimanski
b13aab9cf1 kube-apiserver/leaderelection: remove klog noise
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-29 09:56:05 +02:00
Jefftree
f173f0c58c kube-apiserver/leaderelection/tests: fix test case PingTime should be ahead of RenewTime
2024-07-27 17:54:09 +00:00
Dr. Stefan Schimanski
b8045f98a4 kube-apiserver/leaderelection/tests: use fake clock
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-27 17:54:09 +00:00
Dr. Stefan Schimanski
8c971c5c15 kube-apiserver/leaderelection/test: fixing waiting for informer
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-27 17:54:08 +00:00
Dr. Stefan Schimanski
c7a1fa432a Call non-blocking informerFactory.Start synchronously to avoid races
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-27 18:13:09 +02:00
Dr. Stefan Schimanski
87f40441d6 kube-apiserver/leaderelection: remove broken printf
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-26 09:27:05 +02:00
Kubernetes Prow Robot
bee5e03707 Merge pull request #126333 from aroradaman/master
kube-proxy: internal config: fuzz cidr values for unit tests
2024-07-25 15:47:17 -07:00
Kubernetes Prow Robot
c853ca49c3 Merge pull request #126355 from haircommander/fs-quotas-false
set LocalStorageCapacityIsolationFSQuotaMonitoring to false by default
2024-07-25 13:06:11 -07:00
Kubernetes Prow Robot
5f5c02da51 Merge pull request #124012 from Jefftree/le-controller
Coordinated Leader Election
2024-07-25 13:05:53 -07:00
Kubernetes Prow Robot
e9d9a82839 Merge pull request #124101 from haircommander/process_stats-with-pid-fix
kubelet: fix PID based eviction
2024-07-25 11:59:57 -07:00
Peter Hunt
eeae981048 set LocalStorageCapacityIsolationFSQuotaMonitoring to false by default
as the feature relies on UserNamespaces support, which is also off by default.
Having it on by default won't do anything negative, except adding some needless
checks as to whether the pod has hostUsers==true (impossible without the feature gate)

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-07-25 10:11:10 -04:00
Daman Arora
5359098c14 kube-proxy: internal config: fuzz cidr values for unit tests
Signed-off-by: Daman Arora <aroradaman@gmail.com>
2024-07-25 19:20:24 +05:30
Kevin Hannon
3e642aee3f move container fs check so that we only check if system is split
2024-07-24 11:22:23 -04:00
Jefftree
56b278d5d2 fix flake in TestLeaseCandidateCleanup
2024-07-24 14:41:13 +00:00
Jefftree
919e7abe0f update codegen and openapi
2024-07-24 14:41:13 +00:00
Jefftree
0c774d0b1f Change PingTime to be persistent
2024-07-24 14:41:13 +00:00
Dr. Stefan Schimanski
a738daa88a Review feedback: fix context handling in LeaseCandidateGCController
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-24 14:38:13 +00:00
Dr. Stefan Schimanski
15affefcab Review feedback: handle non-kube strategy correctly
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-24 14:38:13 +00:00
Dr. Stefan Schimanski
a64418ba0a Review feedback
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-24 14:38:13 +00:00
Jefftree
42678f1553 regen clients
2024-07-24 14:38:12 +00:00
Jefftree
fac7581640 feedback: leasecandidate clients
2024-07-24 14:38:12 +00:00
Dr. Stefan Schimanski
68226b0501 Review feedback
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-24 14:38:12 +00:00
Jefftree
e0c6987ca8 add gc and improve testing
2024-07-24 14:38:11 +00:00
Jefftree
c47ff1e1a9 CLE controller and client changes
2024-07-24 14:38:11 +00:00
Jefftree
9b16b0dc97 CLE feature gate
2024-07-24 14:38:11 +00:00
Jefftree
e3e56eb1e2 CLE storage and type registration changes
2024-07-24 14:38:11 +00:00
Jefftree
3999b98c88 Coordinated Leader Election Alpha API
2024-07-24 14:38:10 +00:00
Kubernetes Prow Robot
ceb58a4dbc Merge pull request #126323 from saschagrunert/image-volume-runtime-panic
Fix runtime panic in imagevolume `CanSupport` method
2024-07-24 04:57:06 -07:00
Sascha Grunert
a43cc08ffb Fix runtime panic in imagevolume CanSupport method
The following tests are failing right now:

- ci-kubernetes-e2e-ec2-alpha-enabled-default
- ci-kubernetes-e2e-gci-gce-alpha-enabled-default

Because of:

```
goroutine 347 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x33092b0, 0x4d6ed00}, {0x296a7e0, 0x4c20c10})
        k8s.io/apimachinery/pkg/util/runtime/runtime.go:107 +0xbc
k8s.io/apimachinery/pkg/util/runtime.handleCrash({0x33092b0, 0x4d6ed00}, {0x296a7e0, 0x4c20c10}, {0x4d6ed00, 0x0, 0x1000000004400a5?})
        k8s.io/apimachinery/pkg/util/runtime/runtime.go:82 +0x5e
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000517be8?})
        k8s.io/apimachinery/pkg/util/runtime/runtime.go:59 +0x108
panic({0x296a7e0?, 0x4c20c10?})
        runtime/panic.go:770 +0x132
k8s.io/kubernetes/pkg/volume/image.(*imagePlugin).CanSupport(0xc00183d140?, 0xc0006a2600?)
        k8s.io/kubernetes/pkg/volume/image/image.go:52 +0x3
k8s.io/kubernetes/pkg/volume.(*VolumePluginMgr).FindPluginBySpec(0xc0008a1388, 0xc000f7ddb8)
        k8s.io/kubernetes/pkg/volume/plugins.go:637 +0x208
k8s.io/kubernetes/pkg/kubelet/volumemanager/cache.(*desiredStateOfWorld).AddPodToVolume(0xc000517bc0, {0xc000e94a50, 0x24}, 0xc00172b208, 0xc000f7ddb8, {0xc0017892a0, 0xe}, {0xc000a4d6ec, 0x3}, {0xc000978af0, ...})
        k8s.io/kubernetes/pkg/kubelet/volumemanager/cache/desired_state_of_world.go:270 +0xf2
k8s.io/kubernetes/pkg/kubelet/volumemanager/populator.(*desiredStateOfWorldPopulator).processPodVolumes(0xc0003e6700, 0xc00172b208, 0xc00183ddd8)
        k8s.io/kubernetes/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:319 +0x685
k8s.io/kubernetes/pkg/kubelet/volumemanager/populator.(*desiredStateOfWorldPopulator).findAndAddNewPods(0xc0003e6700)
        k8s.io/kubernetes/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:204 +0x2dc
k8s.io/kubernetes/pkg/kubelet/volumemanager/populator.(*desiredStateOfWorldPopulator).populatorLoop(0xc0003e6700)
        k8s.io/kubernetes/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:173 +0x18
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000905eb0?)
        k8s.io/apimachinery/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00183df70, {0x32d7340, 0xc000a7be60}, 0x1, 0xc0000b2660)
        k8s.io/apimachinery/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000f8bf70, 0x5f5e100, 0x0, 0x1, 0xc0000b2660)
        k8s.io/apimachinery/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
        k8s.io/apimachinery/pkg/util/wait/backoff.go:161
k8s.io/kubernetes/pkg/kubelet/volumemanager/populator.(*desiredStateOfWorldPopulator).Run(0xc0003e6700, {0x32e3228, 0xc000b3faa0}, 0xc0000b2660)
        k8s.io/kubernetes/pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go:158 +0x1a5
created by k8s.io/kubernetes/pkg/kubelet/volumemanager.(*volumeManager).Run in goroutine 335
        k8s.io/kubernetes/pkg/kubelet/volumemanager/volume_manager.go:286 +0x14f
```

Fixes https://github.com/kubernetes/kubernetes/issues/126317

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-07-24 09:54:03 +02:00
carlory
c4851c64a0 remove volumeoptions from VolumePlugin and BlockVolumePlugin
2024-07-24 14:07:02 +08:00
Kubernetes Prow Robot
57d197fb89 Merge pull request #124430 from AllenXu93/fix-kubelet-restart-notReady
fix node notReady in first sync period after kubelet restart
2024-07-23 21:20:40 -07:00
Kubernetes Prow Robot
5af1710d90 Merge pull request #126243 from SergeyKanzhelev/devicePluginFailures
Implement resource health in pod status (KEP 4680)
2024-07-23 20:12:24 -07:00
Kubernetes Prow Robot
d97cf3a1eb Merge pull request #126303 from bart0sh/PR150-dra-refactor-checkpoint-upstream
DRA: refactor checkpointing
2024-07-23 18:01:53 -07:00
Kubernetes Prow Robot
39a80796b6 Merge pull request #122628 from sanposhiho/pod-smaller-events
add(scheduler/framework): implement smaller Pod update events
2024-07-23 18:01:46 -07:00
Sergey Kanzhelev
62f96d2748 set AllocatedResourcesStatus in the Pod Status
2024-07-24 00:29:35 +00:00
Sergey Kanzhelev
3790ee2fe8 reset fields when the feature gate was not set
2024-07-24 00:29:35 +00:00
Sergey Kanzhelev
2253b53b58 generated files
2024-07-24 00:29:35 +00:00
Sergey Kanzhelev
16e8911fdc add AllocatedResourcesStatus field to ContainerStatus
2024-07-24 00:29:34 +00:00
Kubernetes Prow Robot
fa4b8f32ac Merge pull request #125935 from gjkim42/fix-125880
Terminate restartable init containers ignoring not-started containers
2024-07-23 15:45:11 -07:00
Kubernetes Prow Robot
f93fe412c7 Merge pull request #126281 from saschagrunert/oci-volume-docs
[KEP-4639] Mention that `fsGroupChangePolicy` has no effect
2024-07-23 14:40:14 -07:00
Ed Bartosh
c0d922e786 DRA: Kubelet code cleanup
2024-07-24 00:27:52 +03:00
Ed Bartosh
59555c6a62 DRA: move dra/checkpont/* to dra/state/*
2024-07-24 00:12:10 +03:00
Ed Bartosh
35fbbc5cfd DRA: use crc32.ChecksumIEEE to calculate checkpoint checksum
2024-07-24 00:10:39 +03:00
Ed Bartosh
59daed75d6 DRA: refactor checkpointing
Co-authored-by: Kevin Klues <klueska@gmail.com>
2024-07-24 00:10:30 +03:00