Commit Graph

147 Commits

Author SHA1 Message Date
Sascha Grunert
e663285ccf
Add image_id to CRI Container message
This new field allows fixing the kubelet image garbage collection in
container runtimes. The `image_ref` has been historically used by
container runtimes to reference images by digest.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-02-28 10:05:07 +01:00
Kubernetes Prow Robot
d311ce0435
Merge pull request #123343 from haircommander/image-gc-e2e-2
KEP-4210: add e2e tests and add small fix for ImageGCMaxAge
2024-02-20 10:48:15 -08:00
Peter Hunt
a8ea936364 image gc: don't start until max age has passed since kubelet started
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-02-19 14:44:20 -05:00
Peter Hunt
c8b4d8ebed kubelet: add reason field to image gc metric
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-02-16 16:02:41 -05:00
Kubernetes Prow Robot
14f8f5519d
Merge pull request #121719 from ruiwen-zhao/metric-size
Add image pull duration metric with bucketed image size
2024-02-13 16:23:50 -08:00
ruiwen-zhao
0f5cf6c1cd Add image pull duration metric with bucketed image size
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2024-02-08 00:30:31 +00:00
Kubernetes Prow Robot
c3114b2789
Merge pull request #119652 from lixd/kubelet_image_gc
fix kubelet image gc
2023-11-13 17:36:15 +01:00
Kevin Hannon
26923b91e8 implementation of split disk kep 2023-11-01 14:46:33 -04:00
kiashok
252e1d2dfe Imagepull per runtime class alpha release changes
This commit does the following:
1. Add RuntimeClassInImageCriApi feature gate
2. Extend pkg/kubelet/container Image struct
3. Add runtimeHandler string to the following CRI calls
   i.   ImageStatus
   ii.  PullImageRequest
   iii. RemoveImage

Signed-off-by: kiashok <kiashok@microsoft.com>
2023-10-31 15:52:46 -07:00
Peter Hunt
49c947ba15 metrics: add and use ImageGarbageCollectedTotal
to help find MaxAge thresholds and detect image addition/removal thrashing

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-20 12:23:31 -04:00
Peter Hunt
d8a5cd59c0 kubelet/images: add unit test for MaxAge
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-20 12:23:31 -04:00
Peter Hunt
914aa746c1 kubelet/images: add and use freeOldImages function
to free images older than the configured ImageMaximumGCAge

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-20 12:23:31 -04:00
Peter Hunt
d992ea4b30 kubelet: add and use ImageMaximumGCAge in KubeletConfiguration
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-20 12:23:31 -04:00
Peter Hunt
28f335a339 kubelet/images: refactor image gc unit tests
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-16 16:34:29 -04:00
Peter Hunt
e22ebf13a9 kubelet/images: refactor freeImage and imagesInEvictionOrder
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2023-10-16 16:34:29 -04:00
lixd
9533cfe2ee fix: add unit test
Signed-off-by: lixd <xueduan.li@gmail.com>
2023-09-19 15:49:12 +08:00
lixd
bad0593a68 fix kubelet image gc
Signed-off-by: lixd <xueduan.li@gmail.com>
2023-09-19 15:48:59 +08:00
Sohan Kunkerkar
d5690f12b6 pkg/kubelet: allow sandbox image pinning from CRI
As part of this change, the code responsible for managing the sandbox
image within the kubelet has been removed. Previously, the kubelet
excluded the sandbox image from the garbage collection process. However,
with this update, the responsibility of managing the sandbox containers
has been shifted to the CRI implementation itself. By allowing sandbox
image pinning from CRI, we improve efficiency and simplify the kubelet's
interaction with the container runtime. As a result, the kubelet can now
rely on the container runtime's built-in mechanisms for sandbox container
lifecycle management.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2023-08-29 15:34:51 -04:00
Kubernetes Prow Robot
74c66a8b39
Merge pull request #116231 from kannon92/kubelet-image-cleanup
Using parsers in applyDefaultImageTag and adding error test cases.
2023-05-23 10:24:27 -07:00
kannon92
0819d34204 using parsers in applyDefaultImageTag 2023-05-15 15:53:47 +00:00
Sascha Grunert
aa405c8aac
Allow runtimes to provide additional context on CRI pull errors
Right now container runtimes have no way to provide additional context
to the pull errors. We now loosen the constraints and check for
additional messages after the actual CRI errors, which allows runtimes
to enrich the warning events, for example:

```
Warning  Failed     2s (x3 over 43s)   kubelet            Failed to pull image "localhost:5000/foo": RegistryUnavailable: pinging container registry localhost:5000: Get "http://localhost:5000/v2/": dial tcp [::1]:5000: connect: connection refused
```

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-05-15 09:08:14 +02:00
Sascha Grunert
63b69dd50c
Add support for CRI ErrSignatureValidationFailed
This allows container runtimes to propagate an image signature
verification error through the CRI and display that to the end user
during image pull. There is no other behavioral difference compared to a
regular image pull failure.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-05-04 08:34:11 +02:00
Sascha Grunert
4cdfe600e0
Fix image pull error type ErrRegistryUnavailable
The current error comparison `imagePullResult.err ==
ErrRegistryUnavailable` will never work with any remote runtime, because
we produce gRPC errors which wrap a code and a description, like:

```
rpc error: code = Unknown desc = This is the error description
```

To be able to check custom error types from `pkg/kubelet/images/types.go`,
we now strip the code if the status is unknown on image pull.

Besides that, we use a string comparison to check against
`ErrRegistryUnavailable.Error()`, because validating them via the
`errors` package is not yet supported by grpc-go:
https://github.com/grpc/grpc-go/issues/3616

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-04-26 11:02:47 +02:00
Paco Xu
c042837a76 truncate the precision at a millisecond for image pull event message 2023-04-13 15:56:16 +08:00
Kubernetes Prow Robot
898143a96a
Merge pull request #114904 from TommyStarK/kubelet/pod_startup_latency_tracker
kubelet: fix recording when pulling image did finish
2023-03-14 09:39:02 -07:00
Vadim Rutkovsky
556d774945 kubelet: create top-level traces for pod sync and GC
This starts new top level OpenTelemetry spans every time syncPod or image / container GC is invoked
2023-03-11 10:42:14 +01:00
TommyStarK
951decd1e6 kubelet: fix recording when pulling image did finish
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-03-02 20:21:35 +01:00
ruiwen-zhao
572e6e0ffb Add MaxParallelImagePulls support
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-03-02 03:57:59 +00:00
Claudiu Belu
ec753fcb55 unittests: Fixes unit tests for Windows (part 6)
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- On Windows, consecutive time.Now() calls may return the same timestamp, which would cause
  the TestFreeSpaceRemoveByLeastRecentlyUsed test to flake.
- tests in kuberuntime_container_windows_test.go fail on Nodes that have fewer than 3 CPUs,
  expecting the CPU max set to be more than 100% of available CPUs, which is not possible.
- calls in summary_windows_test.go are missing context.
- filterTerminatedContainerInfoAndAssembleByPodCgroupKey will filter and group container
  information by the Pod cgroup key, if it exists. However, we don't have cgroups on Windows,
  thus we can't make the same assertions.
2023-01-31 11:49:26 +00:00
Paco Xu
41902853fd image pull event include duration with waiting
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-11-06 13:42:44 +08:00
Paco Xu
054ceab58d kubelet: make the image pull time more accurate in event
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-11-06 13:42:44 +08:00
David Ashpole
64af1adace
Second attempt: Plumb context to Kubelet CRI calls (#113591)
* plumb context from CRI calls through kubelet

* clean up extra timeouts

* try fixing incorrectly cancelled context
2022-11-05 06:02:13 -07:00
Artur Żyliński
b0fac15cd6 Make the interface local to each package 2022-10-26 11:28:18 +02:00
Artur Żyliński
9f31669a53 New histogram: Pod start SLI duration 2022-10-26 11:28:17 +02:00
Kubernetes Prow Robot
244c035b87
Merge pull request #110263 from claudiubelu/unittests
unittests: Fixes unit tests for Windows
2022-10-25 14:50:34 -07:00
Claudiu Belu
6f2eeed2e8 unittests: Fixes unit tests for Windows
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
  consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
  the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field contained
  OS-specific fields. The default has been moved: the field is now set
  during Kubelet's initialization, after the config file is read.
2022-10-25 23:46:56 +03:00
Todd Neal
9e83c2d7eb reword image gc failure log
Reword the log so that it sounds less like a failure of kubelet and points
towards the root cause of not enough data being eligible to free.
2022-09-20 21:57:59 -05:00
Davanum Srinivas
50bea1dad8
Move from k8s.gcr.io to registry.k8s.io
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-05-31 10:16:53 -04:00
Wojciech Tyczyński
6088fe4221 Remove no-longer used selflink code from kubelet 2022-01-14 10:38:23 +01:00
Sascha Grunert
de37b9d293
Make CRI v1 the default and allow a fallback to v1alpha2
This patch makes the CRI `v1` API the new project-wide default version.
To allow backwards compatibility, a fallback to `v1alpha2` has been added
as well. This fallback is determined automatically by the kubelet.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-11-17 11:05:05 -08:00
Skyler Clark
e9766c2b81
adds pinned field to imageRecords 2021-11-03 14:47:37 -04:00
Skyler Clark
d3ae0a381a
prevents garbage collection from removing pinned images 2021-11-02 14:43:02 -04:00
Kubernetes Prow Robot
cab54856f1
Merge pull request #104933 from vikramcse/automate_mockery
conversion of tests from mockery to mockgen
2021-09-30 18:33:21 -07:00
vikram Jadhav
0de4397490 mockery to mockgen conversion 2021-09-25 16:15:08 +00:00
wojtekt
53ce79a18a Migrate to k8s.io/utils/clock in pkg/kubelet 2021-09-10 12:20:09 +02:00
Kubernetes Prow Robot
5ead6af84e
Merge pull request #99994 from AfrouzMashayekhi/sl-cmd-kubelet
Migrate cmd/kubelet and pkg/kubelet/cadvisor, pkg/kubelet/cri/remote/util, pkg/kubelet/images to structured logging
2021-03-16 14:49:56 -07:00
Elana Hashman
10976cbb43
Migrate image_gc_manager.go to structured logs 2021-03-15 12:39:42 -07:00
afrouz
8f2e927b4a Migrate cmd/kubelet, image, cadvisor and cri to structured logging 2021-03-09 23:12:10 +03:30
Kubernetes Prow Robot
a4025a8462
Merge pull request #98986 from gjkim42/fix-runtime-assert
kubelet: Make the test fail if (*FakeRuntime).Assert fails
2021-03-04 18:34:33 -08:00
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00