This allows user namespace support to progress, either by allowing
snapshotters to deal with ownership, or falling back to containerd doing
a recursive chown.
In the future, when snapshotters implement idmap mounts, they should
report the "remap-ids" capability.
Co-authored-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: David Leadbeater <dgl@dgl.cx>
As explained in the comments, this patch lets the OCI runtime create the
netns when userns are in use. This is needed because the netns needs to
be owned by the userns (otherwise can't modify the IP, etc.).
Before this patch, we are creating the netns and then starting the pod
sandbox asking to join this netns. This can't never work with userns, as
the userns needs to be created first for the netns ownership to be
correct.
One option would be to also create the userns in containerd, then create
the netns. But this is painful (needs tricks with the go runtime,
special care to write the mapping, etc.).
So, we just let the OCI runtime create the userns and netns, that
creates them with the proper ownership.
As requested by Mike Brown, the current code when userns is not used is
left unchanged. We can unify the cases (with and without userns) in a
future release.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Due to when we were updating the pod sandboxes underlying container
object, the pointer to the sandbox would have the right info, but
the on-disk representation of the data was behind. This would cause
the data returned from loading any sandboxes after a restart to have
no CNI result or IP information for the pod.
This change does an additional update to the on-disk container info
right after we invoke the CNI plugin so the metadata for the CNI result
and other networking information is properly flushed to disk.
Signed-off-by: Danny Canter <danny@dcantah.dev>
Skip automatic `if swapLimit == 0 { s.Linux.Resources.Memory.Swap = &limit }` when the swap controller is missing.
(default on Ubuntu 20.04)
Fix issue 7828 (regression in PR 7783 "cri: make swapping disabled with memory limit")
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
While we need to support CRI v1alpha2, the implementation doesn't have
to be tied to gogo/protobuf.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
We do a ton of host networking checks around the CRI plugin, all mainly
doing the same thing of checking the different quirks on various platforms
(for windows are we a HostProcess pod, for linux is namespace mode the
right thing, darwin doesn't have CNI support etc.) which could all be
bundled up into a small helper that can be re-used.
Signed-off-by: Danny Canter <danny@dcantah.dev>
OCI runtime spec defines memory.swap as 'limit of memory+Swap usage'
so setting them to equal should disable the swap. Also, this change
should make containerd behaviour same as other runtimes e.g
'cri-dockerd/dockershim' and won't be impacted when user turn on
'NodeSwap' (https://github.com/kubernetes/enhancements/issues/2400) feature.
Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
In the CRI streaming server, a goroutine (`handleResizeEvents`) is launched
to handle terminal resize events if a TTY is asked for with an exec; this
is the sender of terminal resize events. Another goroutine is launched
shortly after successful process startup to actually do something with
these events, however the issue arises if the exec process fails to start
for any reason that would have `process.Start` return non-nil. The receiver
goroutine never gets launched so the sender is stuck blocked on a channel send
infinitely.
This could be used in a malicious manner by repeatedly launching execs
with a command that doesn't exist in the image, as a single goroutine
will get leaked on every invocation which will slowly grow containerd's
memory usage.
Signed-off-by: Danny Canter <danny@dcantah.dev>
Remove direct invocation of old v0.1.0 NRI plugins. They
can be enabled using the revised NRI API and the v0.1.0
adapter plugin.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Implement the adaptation interface required by the NRI
service plugin to handle CRI sandboxes and containers.
Hook the NRI service plugin into CRI request processing.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
All pause container object references must be removed
from sbserver. This is an implementation detail of
podsandbox package.
Added TODOs for remaining work.
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>