This will help to reduce the amount of runc/libcontainer code that's used in
Moby / Docker Engine (in favor of using the containerd implementation).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This ports the changes of 95a59bf206
to this repository.
From that PR:
(mode&S_IFCHR == S_IFCHR) is the wrong way of checking the type of an
inode because the S_IF* bits are actually not a bitmask and instead must
be checked using S_IFMT. This bug was neatly hidden behind a (major == 0)
sanity-check but that was removed by [1].
In addition, add a test that makes sure that HostDevices() doesn't give
rubbish results -- because we broke this and fixed this before[2].
[1]: 24388be71e ("configs: use different types for .Devices and .Resources.Devices")
[2]: 3ed492ad33 ("Handle non-devices correctly in DeviceFromPath")
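A minimal sketch of why the flag-style check misfires, assuming the Linux values of the file-type constants (redeclared locally so the example builds anywhere; the helper names are hypothetical and not taken from the patch):

```go
package main

import "fmt"

// File-type constants with their Linux values (see inode(7)). They are
// redeclared here only so the sketch does not depend on a platform package.
const (
	sIFMT  = 0o170000 // mask that selects the file-type field
	sIFBLK = 0o060000 // block device
	sIFCHR = 0o020000 // character device
)

// isCharDeviceBuggy reproduces the broken check: it treats S_IFCHR as a flag.
func isCharDeviceBuggy(mode uint32) bool {
	return mode&sIFCHR == sIFCHR
}

// isCharDevice masks out the type field first, then compares for equality.
func isCharDevice(mode uint32) bool {
	return mode&sIFMT == sIFCHR
}

func main() {
	blockDev := uint32(sIFBLK | 0o660)
	// S_IFBLK shares a bit with S_IFCHR, so the flag-style check matches it.
	fmt.Println(isCharDeviceBuggy(blockDev)) // true
	fmt.Println(isCharDevice(blockDev))      // false
}
```

The block-device value contains the character-device bit, which is why only the masked comparison tells the two types apart.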
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
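For illustration, a small sketch of the replacements this kind of change involves. The `roundTrip` helper and the file name are hypothetical; the `os` and `io` calls are the Go 1.16 equivalents of the deprecated `ioutil` functions:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// roundTrip writes data to a temporary file and reads it back twice, once
// with os.ReadFile and once with io.ReadAll over the opened file.
func roundTrip(data string) (viaReadFile, viaReadAll string, err error) {
	dir, err := os.MkdirTemp("", "ioutil-example-") // was ioutil.TempDir
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")
	// was ioutil.WriteFile
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil {
		return "", "", err
	}

	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", "", err
	}

	f, err := os.Open(path)
	if err != nil {
		return "", "", err
	}
	defer f.Close()
	all, err := io.ReadAll(f) // was ioutil.ReadAll
	if err != nil {
		return "", "", err
	}
	return string(b), string(all), nil
}

func main() {
	a, b, err := roundTrip("hello")
	fmt.Println(a, b, err)
}
```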
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The `oci.WithUser` function relies on checking a path on the host's disk to
grab/validate the uid:gid pair for the user string provided. For LCOW it's a
bit harder to confirm that the user actually exists on the host as a rootfs isn't
mounted on the host and shared into the guest, but rather the rootfs is constructed
entirely in the guest itself. To accommodate this, we need a spot to place the
user string exactly as the client provided it.
The `Username` field on the runtime spec is marked by Platform as only for Windows,
and in this case it *is* being set on a Windows host at least, but will be used as a
temporary holding spot until the guest can use the string to perform these same
operations to grab the uid:gid inside.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
CRI container runtimes mount devices (set via Kubernetes device plugins)
into containers and apply the host user/group IDs (uid/gid) to the
corresponding container device.
This triggers a problem when trying to run those containers with a
non-root uid/gid (root is uid/gid 0) set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.
It is possible to work around the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.
This patch proposes using the RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting the device_ownership_from_security_context runtime config value to
true (valid on Linux only).
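The selection rule described above can be sketched as follows. This is a simplified illustration, not the code from the patch: the `owner` type, the `deviceOwner` function and its parameters are hypothetical names, and falling back to the host value when only one of the two IDs is set is an assumption of the sketch.

```go
package main

import "fmt"

// owner is a uid/gid pair (hypothetical helper type for this sketch).
type owner struct {
	UID, GID uint32
}

// deviceOwner picks the ownership for a device node inside the container.
// runAsUser and runAsGroup are nil when the security context leaves them
// unset. fromSecurityContext stands in for the runtime config switch.
func deviceOwner(host owner, runAsUser, runAsGroup *int64, fromSecurityContext bool) owner {
	out := host
	if !fromSecurityContext {
		return out // feature disabled: keep the host uid/gid
	}
	if runAsUser != nil {
		out.UID = uint32(*runAsUser)
	}
	if runAsGroup != nil {
		out.GID = uint32(*runAsGroup)
	}
	return out
}

func main() {
	uid, gid := int64(1000), int64(2000)
	host := owner{UID: 0, GID: 44}
	fmt.Println(deviceOwner(host, &uid, &gid, false)) // {0 44}
	fmt.Println(deviceOwner(host, &uid, &gid, true))  // {1000 2000}
	fmt.Println(deviceOwner(host, &uid, nil, true))   // {1000 44}
}
```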
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Remove build tags that are already implied by the name of the file.
This ensures build tags are used consistently.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Looks like we had our own copy of the "getDevices" code already, so use
that code (which also matches the code that's used to _generate_ the spec,
so a better match).
Moving the code to a separate file, I also noticed that the _unix and _linux
code was _exactly_ the same (barring some `//nolint:` comments), so also
removing the duplicated code.
With this patch applied, we removed the dependency on the libcontainer/devices
package (leaving only libcontainer/user).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Move `pkg/cri/opts.WithoutRunMount` function to `oci.WithoutRunMount`
so that it can be used without dependency on CRI.
Also add `oci.WithoutMounts(dests ...string)` for generality.
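A rough sketch of the filtering idea behind an option like this, using a hypothetical `mount` type reduced to a single field. It does not reproduce the actual `oci` signatures, and comparing cleaned paths is an assumption of the sketch:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// mount is a stand-in for a runtime-spec mount entry (hypothetical, reduced
// to the one field this sketch needs).
type mount struct {
	Destination string
}

// withoutMounts returns the mounts whose cleaned destination is not listed
// in dests, preserving the original order.
func withoutMounts(mounts []mount, dests ...string) []mount {
	skip := make(map[string]struct{}, len(dests))
	for _, d := range dests {
		skip[filepath.Clean(d)] = struct{}{}
	}
	var kept []mount
	for _, m := range mounts {
		if _, ok := skip[filepath.Clean(m.Destination)]; ok {
			continue // destination was requested to be removed
		}
		kept = append(kept, m)
	}
	return kept
}

func main() {
	mounts := []mount{{"/proc"}, {"/run/"}, {"/dev"}}
	fmt.Println(withoutMounts(mounts, "/run")) // [{/proc} {/dev}]
}
```

A "without /run" helper then becomes a special case of the general one, called with a single destination.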
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This enables cases where devices exist in a subdirectory of /dev,
particularly where those device names are not portable across machines,
which makes them problematic to specify from a runtime such as cri.
Added this to `ctr` as well so I could test that the code at least
works.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This change is needed for running the latest containerd inside a Docker
that is not aware of the recently added caps (BPF, PERFMON, CHECKPOINT_RESTORE).
Without this change, containerd inside Docker fails to run containers with an
"apply caps: operation not permitted" error.
See kubernetes-sigs/kind 2058
NOTE: The caller process of this function is now assumed to be as
privileged as possible.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The Err() method should be called after the Scan() loop, not inside it.
Found by: git grep -A3 -F '.Scan()'
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The change in #3542 breaks $PATH handling for images because our
default spec always sets a PATH on the process's .Env.
This removes the default and adds an Opt to add this back.
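A simplified sketch of the opt-in shape this describes, with hypothetical types and names rather than the real `oci` API. Leaving an existing PATH untouched is an assumption of the sketch, as is the default value used:

```go
package main

import (
	"fmt"
	"strings"
)

// defaultPathEnv is an assumed default value, chosen for illustration only.
const defaultPathEnv = "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

// process and specOpt are hypothetical stand-ins for the spec types.
type process struct {
	Env []string
}

type specOpt func(*process)

// withDefaultPathEnv adds the default PATH only if the environment does not
// define one already.
func withDefaultPathEnv() specOpt {
	return func(p *process) {
		for _, e := range p.Env {
			if strings.HasPrefix(e, "PATH=") {
				return
			}
		}
		p.Env = append(p.Env, defaultPathEnv)
	}
}

// newProcess copies env and applies the options. No PATH is set by default.
func newProcess(env []string, opts ...specOpt) *process {
	p := &process{Env: append([]string(nil), env...)}
	for _, o := range opts {
		o(p)
	}
	return p
}

func main() {
	fmt.Println(len(newProcess(nil).Env))                   // 0
	fmt.Println(newProcess(nil, withDefaultPathEnv()).Env)  // default PATH only
	fmt.Println(newProcess([]string{"PATH=/opt/bin"}, withDefaultPathEnv()).Env)
}
```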
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>