Commit Graph

237 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
009a291573
Merge pull request #124677 from HirazawaUi/add-const-ContainerStatusUnknown
kubelet: Use constant replace same value variables of the ContainerStateTerminated Reason field
2024-06-06 17:05:23 -07:00
Shingo Omura
552fd7e850
KEP-3619: Fine-grained SupplementalGroups control (#117842)
* Add `Linux{Sandbox,Container}SecurityContext.SupplementalGroupsPolicy` and `ContainerStatus.user` in cri-api

* Add `PodSecurityContext.SupplementalGroupsPolicy`, `ContainerStatus.User` and its featuregate

* Implement DropDisabledPodFields for PodSecurityContext.SupplementalGroupsPolicy and ContainerStatus.User fields

* Implement kubelet so to wire between SecurityContext.SupplementalGroupsPolicy/ContainerStatus.User and cri-api in kubelet

* Clarify `SupplementalGroupsPolicy` is an OS depdendent field.

* Make `ContainerStatus.User` is initially attached user identity to the first process in the ContainerStatus

It is because, the process identity can be dynamic if the initially attached identity
has enough privilege calling setuid/setgid/setgroups syscalls in Linux.

* Rewording suggestion applied

* Add TODO comment for updating SupplementalGroupsPolicy default value in v1.34

* Added validations for SupplementalGroupsPolicy and ContainerUser

* No need featuregate check in validation when adding new field with no default value

* fix typo: identitiy -> identity
2024-05-29 15:40:29 -07:00
HirazawaUi
3ec13c5e37 remove HashWithoutResources field 2024-05-22 10:01:31 +08:00
HirazawaUi
7a4531c5ba add ContainerStatusUnknown constant 2024-05-03 00:27:19 +08:00
Akihiro Suda
c7f52b34f3
kubelet: KEP-3857: Recursive Read-only (RRO) mounts
See <https://kep.k8s.io/3857>.

An example manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rro
spec:
  volumes:
    - name: mnt
      hostPath:
        # tmpfs is mounted on /mnt/tmpfs
        path: /mnt
  containers:
    - name: busybox
      image: busybox
      args: ["sleep", "infinity"]
      volumeMounts:
        # /mnt-rro/tmpfs is not writable
        - name: mnt
          mountPath: /mnt-rro
          readOnly: true
          mountPropagation: None
          recursiveReadOnly: IfPossible
        # /mnt-ro/tmpfs is writable
        - name: mnt
          mountPath: /mnt-ro
          readOnly: true
        # /mnt-rw/tmpfs is writable
        - name: mnt
          mountPath: /mnt-rw
```

Requirements:
- Feature gate "RecursiveReadOnlyMounts" to be enabled
- Linux kernel >= 5.12
- runc >= 1.1

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Akihiro Suda
6f12e1d8e5
kubelet: expose containerStatuses.volumeMounts
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Akihiro Suda
76081a10c2
kubelet: RuntimeHandler: add SupportsRecursiveReadOnlyMounts
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:09 +09:00
Akihiro Suda
27f24a62e3
kubelet: change map[string]RuntimeHandler to []RuntimeHandler
The map is changed to an array so as to retain the order of the original array
propagated from the CRI runtime.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:07 +09:00
Sascha Grunert
e38531e9a2
Add image_id to CRI ContainerStatus message
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:

411c29c39f/pkg/kubelet/container/helpers.go (L259-L292)

To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.

We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.

Follow-up on: https://github.com/kubernetes/kubernetes/pull/123508

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-02-29 12:41:55 +01:00
Kubernetes Prow Robot
68a47053d1
Merge pull request #123508 from saschagrunert/image-id-container
Add `image_id` to CRI `Container` message
2024-02-28 11:01:35 -08:00
Sascha Grunert
e663285ccf
Add image_id to CRI Container message
This new field allows fixing the kubelet image garbage collection in
container runtimes. The `image_ref` has been historically used by
container runtimes to reference images by digest.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-02-28 10:05:07 +01:00
Giuseppe Scrivano
024146f705
KEP-127: the kubelet stores runtime helpers
as they are received from the ResponseStatus request to the runtime.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-02-27 11:07:35 +01:00
ruiwen-zhao
0f5cf6c1cd Add image pull duration metric with bucketed image size
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2024-02-08 00:30:31 +00:00
Kevin Hannon
26923b91e8 implementation of split disk kep 2023-11-01 14:46:33 -04:00
kiashok
252e1d2dfe Imagepull per runtime class alpha release changes
This commit does the following:
1. Add RuntimeClassInImageCriApi feature gate
2. Extend pkg/kubelet/container Image struct
3. Adds runtimeHandler string in the following CRI calls
   i.   ImageStatus
   ii.  PullImageRequest
   iii.  RemoveImage

Signed-off-by: kiashok <kiashok@microsoft.com>
2023-10-31 15:52:46 -07:00
Sergey Kanzhelev
eb60dce33b deprecate ExperimentalHostUserNamespaceDefaulting 2023-03-17 22:07:25 +00:00
Kubernetes Prow Robot
b8aaaf380a
Merge pull request #116083 from SataQiu/clean-20230227
kubelet: remove unused DockerID type
2023-03-06 02:22:58 -08:00
Ed Bartosh
5a86895070 DRA: pass CDI devices through CRI CDIDevice field 2023-02-28 19:21:20 +02:00
SataQiu
ed2caf17e0 kubelet: remove unused DockerID type 2023-02-27 16:02:59 +08:00
Chen Wang
7db339dba2 This commit contains the following:
1. Scheduler bug-fix + scheduler-focussed E2E tests
2. Add cgroup v2 support for in-place pod resize
3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes.

Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>
2023-02-24 18:21:21 +00:00
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
Peter Hunt
6298ce68e2 kubelet: wire ListPodSandboxMetrics
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2022-11-08 14:47:08 -05:00
Harshal Patil
86284d42f8
Add support for Evented PLEG
Signed-off-by: Harshal Patil <harpatil@redhat.com>
Co-authored-by: Swarup Ghosh <swghosh@redhat.com>
2022-11-08 20:06:16 +05:30
David Ashpole
64af1adace
Second attempt: Plumb context to Kubelet CRI calls (#113591)
* plumb context from CRI calls through kubelet

* clean up extra timeouts

* try fixing incorrectly cancelled context
2022-11-05 06:02:13 -07:00
Antonio Ojea
9c2b333925 Revert "plumb context from CRI calls through kubelet"
This reverts commit f43b4f1b95.
2022-11-02 13:37:23 +00:00
David Ashpole
f43b4f1b95
plumb context from CRI calls through kubelet 2022-10-28 02:55:28 +00:00
Sascha Grunert
b296f82c69
Sort kubelet pods by their creation time
There is a corner case when blocking Pod termination via a lifecycle
preStop hook, for example by using this StateFulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: ubi
  serviceName: "ubi"
  replicas: 1
  template:
    metadata:
      labels:
        app: ubi
    spec:
      terminationGracePeriodSeconds: 1000
      containers:
      - name: ubi
        image: ubuntu:22.04
        command: ['sh', '-c', 'echo The app is running! && sleep 360000']
        ports:
        - containerPort: 80
          name: web
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - 'echo aaa; trap : TERM INT; sleep infinity & wait'
```

After creation, downscaling, forced deletion and upscaling of the
replica like this:

```
> kubectl apply -f sts.yml
> kubectl scale sts web --replicas=0
> kubectl delete pod web-0 --grace-period=0 --force
> kubectl scale sts web --replicas=1
```

We will end up having two pods running by the container runtime, while
the API only reports one:

```
> kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          92s
```

```
> sudo crictl pods
POD ID              CREATED              STATE     NAME     NAMESPACE     ATTEMPT     RUNTIME
e05bb7dbb7e44       12 minutes ago       Ready     web-0    default       0           (default)
d90088614c73b       12 minutes ago       Ready     web-0    default       0           (default)
```

When now running `kubectl exec -it web-0 -- ps -ef`, there is a random chance that we hit the wrong
container reporting the lifecycle command `/bin/sh -c echo aaa; trap : TERM INT; sleep infinity & wait`.

This is caused by the container lookup via its name (and no podUID) at:
02109414e8/pkg/kubelet/kubelet_pods.go (L1905-L1914)

And more specifiy by the conversion of the pod result map to a slice in `GetPods`:
02109414e8/pkg/kubelet/kuberuntime/kuberuntime_manager.go (L407-L411)

We now solve that unexpected behavior by tracking the creation time of
the pod and sorting the result based on that. This will cause to always
match the most recently created pod.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-10-13 16:32:44 +02:00
Adrian Reber
8c24857ba3
kubelet: add CheckpointContainer() to the runtime
Signed-off-by: Adrian Reber <areber@redhat.com>
2022-07-14 10:27:41 +00:00
Ciprian Hacman
0819451ea6 Clean up logic for deprecated flag --container-runtime in kubelet
Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>
2022-02-10 13:26:59 +02:00
Sascha Grunert
de37b9d293
Make CRI v1 the default and allow a fallback to v1alpha2
This patch makes the CRI `v1` API the new project-wide default version.
To allow backwards compatibility, a fallback to `v1alpha2` has been added
as well. This fallback can either used by automatically determined by
the kubelet.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-11-17 11:05:05 -08:00
Skyler Clark
d3ae0a381a
prevents garbage collection from removing pinned images 2021-11-02 14:43:02 -04:00
vikram Jadhav
0de4397490 mockery to mockgen conversion 2021-09-25 16:15:08 +00:00
Clayton Coleman
3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers now no longer blocks final pod deletion in the
API server and are handled as background cleanup. Node shutdown
no longer marks pods as failed as they can be restarted in the
next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
yuzhiquan
bebca30309 comment should have function name as prefix 2021-04-28 15:26:46 +08:00
JunYang
01a4e4face Structured Logging migration: modify volume and container part logs of kubelet.
Signed-off-by: JunYang <yang.jun22@zte.com.cn>
2021-03-17 08:59:03 +08:00
Sergey Kanzhelev
4c9e96c238 Revert "Merge pull request #92817 from kmala/kubelet"
This reverts commit 88512be213, reversing
changes made to c3b888f647.
2021-01-12 22:27:22 +00:00
Kubernetes Prow Robot
88512be213
Merge pull request #92817 from kmala/kubelet
Check for sandboxes before deleting the pod from apiserver
2020-09-10 07:27:45 -07:00
Sergey Kanzhelev
d20fd40884 remove legacy leftovers of portmapping functionality that was moved to CNI 2020-07-30 23:12:16 +00:00
Keerthan Reddy,Mala
851d778531 address review comments 2020-07-22 11:54:58 -07:00
Keerthan Reddy,Mala
90cc954eed add sandbox deletor to delete sandboxes on pod delete event 2020-07-22 11:54:58 -07:00
Keerthan Reddy,Mala
d4325f42fb Check for sandboxes before deleting the pod from apiserver 2020-07-22 11:54:56 -07:00
Sergey Kanzhelev
ee53488f19 fix golint issues in pkg/kubelet/container 2020-06-19 15:48:08 +00:00
Kubernetes Prow Robot
e50a46459b
Merge pull request #91303 from SergeyKanzhelev/fixLinterInKebeletContainer
Fix golint failures for kubelet/container
2020-06-09 14:48:18 -07:00
Jordan Liggitt
591e0043c8 Revert "Merge pull request 89667 from kmala/kubelet"
This reverts commit fa785a5706, reversing
changes made to cf13f8d994.
2020-05-21 13:30:14 -04:00
Sergey Kanzhelev
a4bfc732f7 minor correction in comments 2020-05-21 04:15:44 +00:00
Sergey Kanzhelev
4e94144d1e Fix golint failures for kubelet/container 2020-05-20 19:01:23 +00:00
Kubernetes Prow Robot
fa785a5706
Merge pull request #89667 from kmala/kubelet
Check for sandboxes before deleting the pod from apiserver
2020-05-19 23:40:18 -07:00
Kubernetes Prow Robot
f4112710f5
Merge pull request #90061 from marosset/runtimehandler-image-spec-annotations
Add annotations to CRI ImageSpec objects
2020-05-18 16:29:36 -07:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
marosset
03479e4d12 kubelet - adding pod annotations to various image calls to get runtime-handler info to CRI 2020-04-17 23:57:09 +00:00