* Add `Linux{Sandbox,Container}SecurityContext.SupplementalGroupsPolicy` and `ContainerStatus.user` in cri-api
* Add `PodSecurityContext.SupplementalGroupsPolicy`, `ContainerStatus.User` and its featuregate
* Implement DropDisabledPodFields for PodSecurityContext.SupplementalGroupsPolicy and ContainerStatus.User fields
* Wire SecurityContext.SupplementalGroupsPolicy/ContainerStatus.User between the kubelet and cri-api
* Clarify `SupplementalGroupsPolicy` is an OS-dependent field.
* Make `ContainerStatus.User` the user identity initially attached to the first process of the container
This is because the process identity can change dynamically if the initially attached identity
has enough privilege to call the setuid/setgid/setgroups syscalls on Linux.
* Rewording suggestion applied
* Add TODO comment for updating SupplementalGroupsPolicy default value in v1.34
* Added validations for SupplementalGroupsPolicy and ContainerUser
* No featuregate check is needed in validation when adding a new field with no default value
* fix typo: identitiy -> identity
* Windows: Consider slash-prefixed paths as absolute
filepath.IsAbs does not consider "/" or "\" as absolute paths, even
though files can be addressed as such. [1][2]
Currently, there are some unit tests that are failing on Windows due to
this reason.
[1] https://learn.microsoft.com/en-us/dotnet/standard/io/file-path-formats#traditional-dos-paths
[2] https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#fully-qualified-vs-relative-paths
* Add test to verify IsAbs for windows
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
* Fix abs path validation on windows
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
* Skip path clean check for podLogDir on windows
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
* Implement IsPathClean to validate path
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
* Add warn comment for IsAbs
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
---------
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
Co-authored-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
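A minimal sketch of the slash-prefix rule described in the Windows bullets above. `isAbsWindowsStyle` and `hasDriveRoot` are hypothetical names, not the helpers added by this change, and the check is written with plain string operations so it behaves the same on every OS (unlike `filepath.IsAbs`, whose result depends on the build target).

```go
package main

import (
	"fmt"
	"strings"
)

// hasDriveRoot reports whether p begins with a drive letter followed by a
// separator, such as `C:\` or `C:/`.
func hasDriveRoot(p string) bool {
	if len(p) < 3 {
		return false
	}
	c := p[0]
	isLetter := (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
	return isLetter && p[1] == ':' && (p[2] == '\\' || p[2] == '/')
}

// isAbsWindowsStyle is a hypothetical helper: it treats drive-rooted paths
// and paths starting with "/" or "\" as absolute. UNC paths are covered by
// the backslash-prefix check.
func isAbsWindowsStyle(p string) bool {
	if hasDriveRoot(p) {
		return true
	}
	return strings.HasPrefix(p, "/") || strings.HasPrefix(p, `\`)
}

func main() {
	for _, p := range []string{`C:\var\log`, `/var/log`, `\var\log`, `var\log`, `C:var`} {
		fmt.Printf("%-14q abs=%v\n", p, isAbsWindowsStyle(p))
	}
}
```

Drive-relative paths such as `C:var` are deliberately reported as not absolute in this sketch, since they resolve against the current directory of that drive.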
allow specifying which IDs the kubelet must use to create user
namespaces.
If no additional UIDs/GIDs are allocated to the "kubelet" user,
then the kubelet assumes it can use any ID on the system.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
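A rough sketch of the lookup described above, assuming the allocation is expressed as `/etc/subuid`-style lines (`name:start:count`). How the kubelet actually obtains the range is not stated in this message; `findIDRange` and `idRange` are hypothetical names used only for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// idRange is a hypothetical type describing a contiguous block of IDs.
type idRange struct {
	Start uint32
	Count uint32
}

// findIDRange scans subuid-style content for the first entry belonging to
// user. The boolean result is false when the user has no allocation, in
// which case the caller may fall back to "any ID on the system".
func findIDRange(content, user string) (idRange, bool, error) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		parts := strings.Split(strings.TrimSpace(sc.Text()), ":")
		if len(parts) != 3 || parts[0] != user {
			continue // not an entry for this user
		}
		start, err := strconv.ParseUint(parts[1], 10, 32)
		if err != nil {
			return idRange{}, false, fmt.Errorf("invalid start %q: %w", parts[1], err)
		}
		count, err := strconv.ParseUint(parts[2], 10, 32)
		if err != nil {
			return idRange{}, false, fmt.Errorf("invalid count %q: %w", parts[2], err)
		}
		return idRange{Start: uint32(start), Count: uint32(count)}, true, nil
	}
	return idRange{}, false, sc.Err()
}

func main() {
	content := "root:100000:65536\nkubelet:65536:7208960\n"
	r, ok, err := findIDRange(content, "kubelet")
	fmt.Println(r, ok, err)
}
```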
There is a conversion function `ConvertPodStatusToRunningPod`, which
can overwrite the `Container.ImageID` with a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:
411c29c39f/pkg/kubelet/container/helpers.go (L259-L292)
To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.
We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.
Follow-up on: https://github.com/kubernetes/kubernetes/pull/123508
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
block the creation of a pod that requires a user namespace, unless the
runtime handler has support for it.
If the pod requests a user namespace and the handler does not
support it, then return an error regardless of the feature gate.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
- Implement `computeInitContainerActions` to set the actions for the
init containers, including those with `RestartPolicyAlways`.
- Allow StartupProbe on the restartable init containers.
- Update PodPhase considering the restartable init containers.
- Update PodInitialized status and status manager considering the
restartable init containers.
Co-authored-by: Matthias Bertschy <matthias.bertschy@gmail.com>
hostIPs order may not be consistent. If the secondary IP comes before the
primary one, the current logic adds the primary IP twice into PodIPs, which
leads to the error: "may specify no more than one IP for each IP family".
In this case, the second IP shouldn't be added.
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
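A minimal sketch of an order-independent way to build the IP list, assuming the primary IP is known up front. `dualStackPodIPs` is a hypothetical function, not the kubelet code touched by this fix; it keeps the primary IP first and adds at most one host IP from the other IP family.

```go
package main

import (
	"fmt"
	"net/netip"
)

// dualStackPodIPs returns the primary IP followed by at most one host IP of
// the other IP family. Host IPs in the primary's family, including the
// primary itself, are skipped, so the order of hostIPs does not matter.
func dualStackPodIPs(primary string, hostIPs []string) ([]string, error) {
	p, err := netip.ParseAddr(primary)
	if err != nil {
		return nil, fmt.Errorf("invalid primary IP %q: %w", primary, err)
	}
	out := []string{primary}
	for _, s := range hostIPs {
		a, err := netip.ParseAddr(s)
		if err != nil {
			continue // ignore unparsable entries in this sketch
		}
		if a.Unmap().Is4() == p.Unmap().Is4() {
			continue // same family as the primary: would be a duplicate
		}
		out = append(out, s)
		break // one IP per family is enough
	}
	return out, nil
}

func main() {
	// Secondary IP listed before the primary one.
	ips, err := dualStackPodIPs("10.0.0.1", []string{"fd00::1", "10.0.0.1"})
	fmt.Println(ips, err)
}
```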
Container runtimes like CRI-O and containerd reuse the code by copying
it from Kubernetes. To have a single source of truth for the streaming
server we now move the already isolated implementation to the
k8s.io/kubelet staging repository. This way runtimes can re-use the code
without copying the parts.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
The two are not coupled except accidentally. Separate them and
update callsites. This will reduce the scope of PodManager interface
to make exposing the pod worker cleaner.
The HandlePod* methods are all structurally similar, but have accrued
subtle differences. In general the only point for Handle is to
process admission and to update the pod worker with the desired
state of the kubelet's config (so that pod worker can make it
the actual state).
Add a new GetPodAndMirrorPod() method that handles when the config
pod is ambiguous (pod or mirror pod) and inline the structure.
Add comments on questionable additions in the config methods for
future improvement.
Move the metric observation of container count closer to where
pods are actually started (in the pod worker). A future change
can likely move it to syncPod.
There is only one caller and both sets of data are part of the
resync operation between kubelet's desired state and the actual
state of the pod workers. Reduces the size of the interface so
that it is easier to create another pod manager.
HandlePodCleanups is responsible for restarting pods that are no
longer running (usually due to delete and recreation with the same
UID in quick succession). We have to filter the list of pods to
restart from podManager to get the list of admitted pods, which
uses filterOutInactivePods on the kubelet. That method excludes
pods the pod worker has already terminated. Since a restarted
pod will be in the terminated state before HandlePodCleanups
calls SyncKnownPods, we have to call filterOutInactivePods after
SyncKnownPods, otherwise the to-be-restarted pod is ignored and
we have to wait for the next housekeeping cycle to restart it.
Since static pods are often critical system components, this
extra 2s wait is undesirable and we should restart as soon as
we can. Add a failing test that passes after we move the filter
call after SyncKnownPods.