Merge pull request #8287 from kinvolk/rata/userns-stateless-idmap
Add support for userns in stateless and stateful pods with idmap mounts (KEP-127, k8s >= 1.27)
commit fe17f65159
@ -461,4 +461,4 @@ more quickly.
 | [NRI in CRI Support](https://github.com/containerd/containerd/pull/6019) | containerd v1.7 | containerd v2.0 |
 | [gRPC Shim](https://github.com/containerd/containerd/pull/8052) | containerd v1.7 | containerd v2.0 |
 | [CRI Runtime Specific Snapshotter](https://github.com/containerd/containerd/pull/6899) | containerd v1.7 | containerd v2.0 |
-| [CRI Support for User Namespaces](https://github.com/containerd/containerd/pull/7679) | containerd v1.7 | containerd v2.0 |
+| [CRI Support for User Namespaces](./docs/user-namespaces/README.md) | containerd v1.7 | containerd v2.0 |
146
docs/user-namespaces/README.md
Normal file
@ -0,0 +1,146 @@
# Support for user namespaces

Kubernetes supports running pods with user namespaces since v1.25. This document explains the
containerd support for this feature.

## What are user namespaces?

A user namespace isolates the user running inside the container from the one in the host.

A process running as root in a container can run as a different (non-root) user in the host; in
other words, the process has full privileges for operations inside the user namespace, but is
unprivileged for operations outside the namespace.

You can use this feature to reduce the damage a compromised container can do to the host or other
pods in the same node. There are several security vulnerabilities rated either HIGH or CRITICAL that
are not exploitable when user namespaces are active. User namespaces are expected to mitigate
some future vulnerabilities too.

See [the Kubernetes documentation][kube-intro] for a high-level introduction to
user namespaces.

[kube-intro]: https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#introduction

## Stack requirements

The Kubernetes implementation was redesigned in 1.27, so the requirements differ between
Kubernetes versions before 1.27 and Kubernetes 1.27 and later.

Please note that if you try to use user namespaces with containerd 1.6 or older, the
`hostUsers: false` setting in your pod.spec will be **silently ignored**.

### Kubernetes 1.25 and 1.26

* Containerd 1.7 or greater
* runc 1.1 or greater

### Kubernetes 1.27 and greater

* Linux 6.3 or greater
* Containerd 2.0 or greater
* You can use runc or crun as the OCI runtime:
  * runc 1.2 or greater
  * crun 1.9 or greater

Furthermore, all the file-systems used by the volumes in the pod need kernel support for idmap
mounts. Some popular file-systems that support idmap mounts in Linux 6.3 are: `btrfs`, `ext4`, `xfs`,
`fat`, `tmpfs`, `overlayfs`.

The kubelet is in charge of populating some files to the containers (like configmap, secrets, etc.).
The file-system used in that path needs to support idmap mounts too. See [the Kubernetes
documentation][kube-req] for more info on that.

[kube-req]: https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#before-you-begin
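
As a rough first check of the kernel requirement listed above, the running kernel release can be compared against 6.3. The following is a minimal sketch, not part of containerd: `kernelAtLeast` is a hypothetical helper, and reading `/proc/sys/kernel/osrelease` is an assumption about how to obtain the release string. A version comparison is only an approximation, since it says nothing about the file-systems actually in use.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// kernelAtLeast reports whether a kernel release string such as
// "6.3.0-generic" is at least wantMajor.wantMinor.
// Hypothetical helper for illustration only.
func kernelAtLeast(release string, wantMajor, wantMinor int) (bool, error) {
	parts := strings.SplitN(strings.TrimSpace(release), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected kernel release %q", release)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("parsing major version of %q: %w", release, err)
	}
	// The minor component may carry a suffix (for example "10-rc1"),
	// so keep only its leading digits.
	minorDigits := parts[1]
	for i, r := range minorDigits {
		if r < '0' || r > '9' {
			minorDigits = minorDigits[:i]
			break
		}
	}
	minor, err := strconv.Atoi(minorDigits)
	if err != nil {
		return false, fmt.Errorf("parsing minor version of %q: %w", release, err)
	}
	return major > wantMajor || (major == wantMajor && minor >= wantMinor), nil
}

func main() {
	// Assumption: the release string is available in procfs on this host.
	raw, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Println("cannot read kernel release:", err)
		return
	}
	ok, err := kernelAtLeast(string(raw), 6, 3)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("kernel %s is 6.3 or greater: %v\n", strings.TrimSpace(string(raw)), ok)
}
```

Whether idmap mounts really work on a given path still has to be verified on that path's file-system.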

## Creating a Kubernetes pod with user namespaces

First check your containerd, Linux and Kubernetes versions. If those are okay, then there is no
special configuration needed on containerd. You can just follow the steps on the [Kubernetes
website][kube-example].

[kube-example]: https://kubernetes.io/docs/tasks/configure-pod-container/user-namespaces/

## Limitations

You can check the limitations Kubernetes has [here][kube-limitations]. Note that different
Kubernetes versions have different limitations, so be sure to check the site for the Kubernetes
version you are using.

Different containerd versions have different limitations too; those are highlighted in this section.

[kube-limitations]: https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#limitations

### containerd 1.7

One limitation present in containerd 1.7 is that it needs to change the ownership of every file and
directory inside the container image during Pod startup. This means it has a storage overhead (the
size of the container image is duplicated each time a pod is created) and can significantly impact
the container startup latency.

You can mitigate this limitation by switching `/sys/module/overlay/parameters/metacopy` to `Y`. This
will significantly reduce the storage and performance overhead, as only the inode for each file of
the container image will be duplicated, but not the content of the file. This means it will use less
storage and it will be faster. However, it is not a panacea.

If you change the metacopy param, make sure to do it in a way that is persistent across reboots. You
should also be aware that this setting will be used for all containers, not just containers with
user namespaces enabled. This will affect all the snapshots that you take manually (if you happen to
do that). In that case, make sure to use the same value of `/sys/module/overlay/parameters/metacopy`
when creating and restoring the snapshot.
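
To see which value is currently in effect, the parameter file can be read directly. This is a small sketch under the assumption that the overlay module is loaded and exposes the path named above; `parseMetacopy` is a hypothetical helper. It only reports the current value and does nothing to make a change persistent across reboots.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// metacopyPath is the overlay module parameter named in the text above.
const metacopyPath = "/sys/module/overlay/parameters/metacopy"

// parseMetacopy interprets the content of the parameter file. The kernel
// prints boolean module parameters as "Y" or "N" followed by a newline.
// Hypothetical helper for illustration only.
func parseMetacopy(raw string) (bool, error) {
	switch strings.TrimSpace(raw) {
	case "Y":
		return true, nil
	case "N":
		return false, nil
	default:
		return false, fmt.Errorf("unexpected metacopy value %q", raw)
	}
}

func main() {
	raw, err := os.ReadFile(metacopyPath)
	if err != nil {
		fmt.Println("cannot read metacopy parameter (is the overlay module loaded?):", err)
		return
	}
	enabled, err := parseMetacopy(string(raw))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("overlay metacopy enabled:", enabled)
}
```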

### containerd 2.0

The storage and latency limitations from containerd 1.7 are not present in containerd 2.0 and above,
if you use the overlay snapshotter (this is used by default). It will not use more storage at all,
and there is no added startup latency.

This is achieved by using the kernel feature idmap mounts with the container rootfs (the container
image). This allows an overlay file-system to expose the image with different UID/GID without copying
the files or the inodes, just using a bind-mount.

You can check if you are using idmap mounts for the container image if you create a pod with user
namespaces, exec into it and run:

```
mount | grep overlay
```

You should see a reference to the idmap mount in the `lowerdir` parameter; in this case we can see
`idmapped` used there:

```
overlay on / type overlay (rw,relatime,lowerdir=/tmp/ovl-idmapped823885363/0,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1018/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1018/work)
```
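
The same check can be scripted. The sketch below reads `mount`-style output on standard input and reports overlay mounts whose `lowerdir` contains the string `idmapped`. Both function names are hypothetical, and matching on `idmapped` is a heuristic taken from the sample output above, not a documented interface; option values containing commas are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// overlayLowerdirs extracts the lowerdir entries from one line of mount
// output such as "overlay on / type overlay (rw,lowerdir=/a:/b,upperdir=/c)".
// It returns nil when the line is not an overlay mount or has no lowerdir.
func overlayLowerdirs(line string) []string {
	if !strings.Contains(line, " type overlay ") {
		return nil
	}
	start := strings.Index(line, "(")
	end := strings.LastIndex(line, ")")
	if start < 0 || end < start {
		return nil
	}
	for _, opt := range strings.Split(line[start+1:end], ",") {
		if strings.HasPrefix(opt, "lowerdir=") {
			return strings.Split(strings.TrimPrefix(opt, "lowerdir="), ":")
		}
	}
	return nil
}

// usesIDMappedLowerdir applies the heuristic from the text: the temporary
// lowerdir path contains "idmapped" when an idmap mount is in use.
func usesIDMappedLowerdir(line string) bool {
	for _, dir := range overlayLowerdirs(line) {
		if strings.Contains(dir, "idmapped") {
			return true
		}
	}
	return false
}

func main() {
	// Intended use: mount | <this program>
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		line := scanner.Text()
		if dirs := overlayLowerdirs(line); dirs != nil {
			fmt.Printf("lowerdir=%v idmapped=%v\n", dirs, usesIDMappedLowerdir(line))
		}
	}
}
```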

## Creating a container with user namespaces with `ctr`

You can also create a container with user namespaces using `ctr`. Be warned that this is more
low-level.

Create an OCI bundle as explained [here][runc-bundle]. Then, change the UID/GID to 65536:

```
sudo chown -R 65536:65536 rootfs/
```

Copy [this config.json](./config.json) and replace `XXX-path-to-rootfs` with the
absolute path to the rootfs you just chowned.

Then create and start the container with:

```
sudo ctr create --config <path>/config.json userns-test
sudo ctr t start userns-test
```

This will open a shell inside the container. You can run the following to verify you are inside a
user namespace:

```
root@runc:/# cat /proc/self/uid_map
0 65536 65536
```

The output should be exactly the same.
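
The three columns of `uid_map` are the first ID inside the namespace, the first ID outside it, and the length of the range. A minimal sketch of reading them programmatically follows; `idMapping`, `parseIDMap` and `isInitialMapping` are hypothetical names introduced for illustration. The comparison against `0 0 4294967295` relies on that being the mapping shown in the initial user namespace.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// idMapping is one line of /proc/<pid>/uid_map or gid_map.
type idMapping struct {
	ContainerID uint32 // first ID inside the user namespace
	HostID      uint32 // first ID outside the user namespace
	Size        uint32 // number of consecutive IDs mapped
}

// parseIDMap parses the whitespace-separated content of a uid_map/gid_map file.
func parseIDMap(content string) ([]idMapping, error) {
	var maps []idMapping
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		fields := strings.Fields(line)
		if len(fields) != 3 {
			return nil, fmt.Errorf("malformed mapping line %q", line)
		}
		var vals [3]uint32
		for i, f := range fields {
			v, err := strconv.ParseUint(f, 10, 32)
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", f, err)
			}
			vals[i] = uint32(v)
		}
		maps = append(maps, idMapping{ContainerID: vals[0], HostID: vals[1], Size: vals[2]})
	}
	return maps, nil
}

// isInitialMapping reports whether the mapping is the identity mapping over
// the full 32-bit range, as seen in the initial user namespace.
func isInitialMapping(maps []idMapping) bool {
	return len(maps) == 1 && maps[0] == idMapping{ContainerID: 0, HostID: 0, Size: 4294967295}
}

func main() {
	raw, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		fmt.Println("cannot read uid_map:", err)
		return
	}
	maps, err := parseIDMap(string(raw))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("mappings: %+v, initial namespace mapping: %v\n", maps, isInitialMapping(maps))
}
```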

[runc-bundle]: https://github.com/opencontainers/runc#creating-an-oci-bundle
199
docs/user-namespaces/config.json
Normal file
@ -0,0 +1,199 @@
{
	"ociVersion": "1.0.2-dev",
	"process": {
		"terminal": true,
		"user": {
			"uid": 0,
			"gid": 0
		},
		"args": [
			"bash"
		],
		"env": [
			"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
			"TERM=xterm"
		],
		"cwd": "/",
		"capabilities": {
			"bounding": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_NET_BIND_SERVICE"
			],
			"effective": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_NET_BIND_SERVICE"
			],
			"inheritable": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_NET_BIND_SERVICE"
			],
			"permitted": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_NET_BIND_SERVICE"
			],
			"ambient": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_NET_BIND_SERVICE"
			]
		},
		"rlimits": [
			{
				"type": "RLIMIT_NOFILE",
				"hard": 1024,
				"soft": 1024
			}
		],
		"noNewPrivileges": true
	},
	"root": {
		"path": "XXX-path-to-rootfs"
	},
	"hostname": "runc",
	"mounts": [
		{
			"destination": "/proc",
			"type": "proc",
			"source": "proc"
		},
		{
			"destination": "/dev",
			"type": "tmpfs",
			"source": "tmpfs",
			"options": [
				"nosuid",
				"strictatime",
				"mode=755",
				"size=65536k"
			]
		},
		{
			"destination": "/dev/pts",
			"type": "devpts",
			"source": "devpts",
			"options": [
				"nosuid",
				"noexec",
				"newinstance",
				"ptmxmode=0666",
				"mode=0620",
				"gid=5"
			]
		},
		{
			"destination": "/dev/shm",
			"type": "tmpfs",
			"source": "shm",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"mode=1777",
				"size=65536k"
			]
		},
		{
			"destination": "/dev/mqueue",
			"type": "mqueue",
			"source": "mqueue",
			"options": [
				"nosuid",
				"noexec",
				"nodev"
			]
		},
		{
			"destination": "/sys",
			"type": "sysfs",
			"source": "sysfs",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"ro"
			]
		},
		{
			"destination": "/sys/fs/cgroup",
			"type": "cgroup",
			"source": "cgroup",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"relatime",
				"ro"
			]
		}
	],
	"linux": {
		"uidMappings": [
			{
				"containerID": 0,
				"hostID": 65536,
				"size": 65536
			}
		],
		"gidMappings": [
			{
				"containerID": 0,
				"hostID": 65536,
				"size": 65536
			}
		],
		"resources": {
			"devices": [
				{
					"allow": false,
					"access": "rwm"
				}
			]
		},
		"namespaces": [
			{
				"type": "pid"
			},
			{
				"type": "network"
			},
			{
				"type": "ipc"
			},
			{
				"type": "uts"
			},
			{
				"type": "mount"
			},
			{
				"type": "cgroup"
			},
			{
				"type": "user"
			}
		],
		"maskedPaths": [
			"/proc/acpi",
			"/proc/asound",
			"/proc/kcore",
			"/proc/keys",
			"/proc/latency_stats",
			"/proc/timer_list",
			"/proc/timer_stats",
			"/proc/sched_debug",
			"/sys/firmware",
			"/proc/scsi"
		],
		"readonlyPaths": [
			"/proc/bus",
			"/proc/fs",
			"/proc/irq",
			"/proc/sys",
			"/proc/sysrq-trigger"
		]
	}
}
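
Before handing an edited copy of this file to `ctr`, it can help to confirm that it still requests a user namespace with UID/GID mappings and that the `XXX-path-to-rootfs` placeholder was replaced. The following is a minimal sketch using only `encoding/json`; the struct types and `checkUserNS` are hypothetical and cover just the fields shown in the config above, not the full OCI runtime spec.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// idMapping mirrors one entry of linux.uidMappings / linux.gidMappings.
type idMapping struct {
	ContainerID uint32 `json:"containerID"`
	HostID      uint32 `json:"hostID"`
	Size        uint32 `json:"size"`
}

// spec covers only the parts of config.json this check needs.
type spec struct {
	Root struct {
		Path string `json:"path"`
	} `json:"root"`
	Linux struct {
		UIDMappings []idMapping `json:"uidMappings"`
		GIDMappings []idMapping `json:"gidMappings"`
		Namespaces  []struct {
			Type string `json:"type"`
		} `json:"namespaces"`
	} `json:"linux"`
}

// checkUserNS returns an error unless the config requests a "user" namespace
// and carries at least one UID and one GID mapping.
func checkUserNS(raw []byte) (*spec, error) {
	var s spec
	if err := json.Unmarshal(raw, &s); err != nil {
		return nil, fmt.Errorf("decoding config: %w", err)
	}
	hasUserNS := false
	for _, ns := range s.Linux.Namespaces {
		if ns.Type == "user" {
			hasUserNS = true
		}
	}
	if !hasUserNS {
		return nil, errors.New("no user namespace requested")
	}
	if len(s.Linux.UIDMappings) == 0 || len(s.Linux.GIDMappings) == 0 {
		return nil, errors.New("user namespace requested without UID/GID mappings")
	}
	return &s, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: checkconfig <path-to-config.json>")
		return
	}
	raw, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println(err)
		return
	}
	s, err := checkUserNS(raw)
	if err != nil {
		fmt.Println("config problem:", err)
		return
	}
	if s.Root.Path == "XXX-path-to-rootfs" {
		fmt.Println("root.path still contains the placeholder")
	}
	fmt.Printf("uid mappings: %+v\n", s.Linux.UIDMappings)
}
```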
2
go.mod
@ -46,7 +46,7 @@ require (
 	github.com/opencontainers/go-digest v1.0.0
 	github.com/opencontainers/image-spec v1.1.0-rc4
 	github.com/opencontainers/runc v1.1.9
-	github.com/opencontainers/runtime-spec v1.1.0
+	github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4
 	github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626
 	github.com/opencontainers/selinux v1.11.0
 	github.com/pelletier/go-toml v1.9.5
4
go.sum
@ -800,8 +800,8 @@ github.com/opencontainers/runtime-spec v1.0.2/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/
|
|||||||
github.com/opencontainers/runtime-spec v1.0.3-0.20200929063507-e6143ca7d51d/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.0.3-0.20200929063507-e6143ca7d51d/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.1.0 h1:HHUyrt9mwHUjtasSbXSMvs4cyFxh+Bll4AjJ9odEGpg=
|
github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4 h1:EctkgBjZ1y4q+sibyuuIgiKpa0QSd2elFtSSdNvBVow=
|
||||||
github.com/opencontainers/runtime-spec v1.1.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-tools v0.0.0-20181011054405-1d69bd0f9c39/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
github.com/opencontainers/runtime-tools v0.0.0-20181011054405-1d69bd0f9c39/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
||||||
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
|
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
|
||||||
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626/go.mod h1:BRHJJd0E+cx42OybVYSgUvZmU0B8P9gZuRXlZUP7TKI=
|
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626/go.mod h1:BRHJJd0E+cx42OybVYSgUvZmU0B8P9gZuRXlZUP7TKI=
|
||||||
|
@ -14,7 +14,7 @@ require (
|
|||||||
github.com/containerd/typeurl/v2 v2.1.1
|
github.com/containerd/typeurl/v2 v2.1.1
|
||||||
github.com/opencontainers/go-digest v1.0.0
|
github.com/opencontainers/go-digest v1.0.0
|
||||||
github.com/opencontainers/image-spec v1.1.0-rc4
|
github.com/opencontainers/image-spec v1.1.0-rc4
|
||||||
github.com/opencontainers/runtime-spec v1.1.0
|
github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4
|
||||||
github.com/stretchr/testify v1.8.4
|
github.com/stretchr/testify v1.8.4
|
||||||
go.opentelemetry.io/otel v1.14.0
|
go.opentelemetry.io/otel v1.14.0
|
||||||
go.opentelemetry.io/otel/sdk v1.14.0
|
go.opentelemetry.io/otel/sdk v1.14.0
|
||||||
|
@ -1470,8 +1470,9 @@ github.com/opencontainers/runtime-spec v1.0.3-0.20200929063507-e6143ca7d51d/go.m
|
|||||||
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.0.3-0.20220825212826-86290f6a00fb/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.1.0-rc.2/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.1.0-rc.2/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-spec v1.1.0 h1:HHUyrt9mwHUjtasSbXSMvs4cyFxh+Bll4AjJ9odEGpg=
|
|
||||||
github.com/opencontainers/runtime-spec v1.1.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
github.com/opencontainers/runtime-spec v1.1.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
|
github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4 h1:EctkgBjZ1y4q+sibyuuIgiKpa0QSd2elFtSSdNvBVow=
|
||||||
|
github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
|
||||||
github.com/opencontainers/runtime-tools v0.0.0-20181011054405-1d69bd0f9c39/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
github.com/opencontainers/runtime-tools v0.0.0-20181011054405-1d69bd0f9c39/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
||||||
github.com/opencontainers/runtime-tools v0.9.0/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
github.com/opencontainers/runtime-tools v0.9.0/go.mod h1:r3f7wjNzSs2extwzU3Y+6pKfobzPh+kKFJ3ofN+3nfs=
|
||||||
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
|
github.com/opencontainers/runtime-tools v0.9.1-0.20221107090550-2e043c6bd626 h1:DmNGcqH3WDbV5k8OJ+esPWbqUOX5rMLR2PMvziDMJi0=
|
||||||
|
@ -286,6 +286,10 @@ func WithWindowsResources(r *runtime.WindowsContainerResources) ContainerOpts {
|
|||||||
}
|
}
|
||||||
|
|
||||||
func WithVolumeMount(hostPath, containerPath string) ContainerOpts {
|
func WithVolumeMount(hostPath, containerPath string) ContainerOpts {
|
||||||
|
return WithIDMapVolumeMount(hostPath, containerPath, nil, nil)
|
||||||
|
}
|
||||||
|
|
||||||
|
func WithIDMapVolumeMount(hostPath, containerPath string, uidMaps, gidMaps []*runtime.IDMapping) ContainerOpts {
|
||||||
return func(c *runtime.ContainerConfig) {
|
return func(c *runtime.ContainerConfig) {
|
||||||
hostPath, _ = filepath.Abs(hostPath)
|
hostPath, _ = filepath.Abs(hostPath)
|
||||||
containerPath, _ = filepath.Abs(containerPath)
|
containerPath, _ = filepath.Abs(containerPath)
|
||||||
@ -293,6 +297,8 @@ func WithVolumeMount(hostPath, containerPath string) ContainerOpts {
|
|||||||
HostPath: hostPath,
|
HostPath: hostPath,
|
||||||
ContainerPath: containerPath,
|
ContainerPath: containerPath,
|
||||||
SelinuxRelabel: selinux.GetEnabled(),
|
SelinuxRelabel: selinux.GetEnabled(),
|
||||||
|
UidMappings: uidMaps,
|
||||||
|
GidMappings: gidMaps,
|
||||||
}
|
}
|
||||||
c.Mounts = append(c.Mounts, mount)
|
c.Mounts = append(c.Mounts, mount)
|
||||||
}
|
}
|
||||||
|
@ -17,28 +17,137 @@
|
|||||||
package integration
|
package integration
|
||||||
|
|
||||||
import (
|
import (
|
||||||
|
"context"
|
||||||
|
"errors"
|
||||||
"fmt"
|
"fmt"
|
||||||
"os"
|
"os"
|
||||||
|
"os/user"
|
||||||
"path/filepath"
|
"path/filepath"
|
||||||
|
"strings"
|
||||||
"syscall"
|
"syscall"
|
||||||
"testing"
|
"testing"
|
||||||
"time"
|
"time"
|
||||||
|
|
||||||
"github.com/containerd/containerd/integration/images"
|
"github.com/containerd/containerd/integration/images"
|
||||||
|
runc "github.com/containerd/go-runc"
|
||||||
"github.com/stretchr/testify/assert"
|
"github.com/stretchr/testify/assert"
|
||||||
"github.com/stretchr/testify/require"
|
"github.com/stretchr/testify/require"
|
||||||
exec "golang.org/x/sys/execabs"
|
exec "golang.org/x/sys/execabs"
|
||||||
|
"golang.org/x/sys/unix"
|
||||||
runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
|
runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
|
||||||
)
|
)
|
||||||

const (
	defaultRoot = "/var/lib/containerd-test"
)

func supportsUserNS() bool {
	if _, err := os.Stat("/proc/self/ns/user"); os.IsNotExist(err) {
		return false
	}
	return true
}

func supportsIDMap(path string) bool {
	treeFD, err := unix.OpenTree(-1, path, uint(unix.OPEN_TREE_CLONE|unix.OPEN_TREE_CLOEXEC))
	if err != nil {
		return false
	}
	defer unix.Close(treeFD)

	// We want to test if idmap mounts are supported.
	// So we use just some random mapping, it doesn't really matter which one.
	// For the helper command, we just need something that is alive while we
	// test this, a sleep 5 will do it.
	cmd := exec.Command("sleep", "5")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags:  syscall.CLONE_NEWUSER,
		UidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: 65536, Size: 65536}},
		GidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: 65536, Size: 65536}},
	}
	if err := cmd.Start(); err != nil {
		return false
	}
	defer func() {
		_ = cmd.Process.Kill()
		_ = cmd.Wait()
	}()

	usernsFD := fmt.Sprintf("/proc/%d/ns/user", cmd.Process.Pid)
	var usernsFile *os.File
	if usernsFile, err = os.Open(usernsFD); err != nil {
		return false
	}
	defer usernsFile.Close()

	attr := unix.MountAttr{
		Attr_set:  unix.MOUNT_ATTR_IDMAP,
		Userns_fd: uint64(usernsFile.Fd()),
	}
	if err := unix.MountSetattr(treeFD, "", unix.AT_EMPTY_PATH, &attr); err != nil {
		return false
	}

	return true
}

// traversePath gives 755 permissions for all elements in tPath below
// os.TempDir() and errors out if elements above it don't have read+exec
// permissions for others. tPath MUST be a descendant of os.TempDir(). The path
// returned by testing.TempDir() usually is.
func traversePath(tPath string) error {
	// Check the assumption that the argument is under os.TempDir().
	tempBase := os.TempDir()
	if !strings.HasPrefix(tPath, tempBase) {
		return fmt.Errorf("traversePath: %q is not a descendant of %q", tPath, tempBase)
	}

	var path string
	for _, p := range strings.SplitAfter(tPath, "/") {
		path = path + p
		stats, err := os.Stat(path)
		if err != nil {
			return err
		}

		perm := stats.Mode().Perm()
		if perm&0o5 == 0o5 {
			continue
		}
		if strings.HasPrefix(tempBase, path) {
			return fmt.Errorf("traversePath: directory %q MUST have read+exec permissions for others", path)
		}

		if err := os.Chmod(path, perm|0o755); err != nil {
			return err
		}
	}

	return nil
}
|
||||||
|
|
||||||
func TestPodUserNS(t *testing.T) {
|
func TestPodUserNS(t *testing.T) {
|
||||||
containerID := uint32(0)
|
containerID := uint32(0)
|
||||||
hostID := uint32(65536)
|
hostID := uint32(65536)
|
||||||
size := uint32(65536)
|
size := uint32(65536)
|
||||||
|
idmap := []*runtime.IDMapping{
|
||||||
|
{
|
||||||
|
ContainerId: containerID,
|
||||||
|
HostId: hostID,
|
||||||
|
Length: size,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
volumeHostPath := t.TempDir()
|
||||||
|
if err := traversePath(volumeHostPath); err != nil {
|
||||||
|
t.Fatalf("failed to setup volume host path: %v", err)
|
||||||
|
}
|
||||||
|
|
||||||
for name, test := range map[string]struct {
|
for name, test := range map[string]struct {
|
||||||
sandboxOpts []PodSandboxOpts
|
sandboxOpts []PodSandboxOpts
|
||||||
containerOpts []ContainerOpts
|
containerOpts []ContainerOpts
|
||||||
checkOutput func(t *testing.T, output string)
|
checkOutput func(t *testing.T, output string)
|
||||||
|
hostVolumes bool // whether the config uses host volumes
|
||||||
expectErr bool
|
expectErr bool
|
||||||
}{
|
}{
|
||||||
"userns uid mapping": {
|
"userns uid mapping": {
|
||||||
@ -85,6 +194,31 @@ func TestPodUserNS(t *testing.T) {
|
|||||||
assert.Contains(t, output, "=0=0=")
|
assert.Contains(t, output, "=0=0=")
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
"volumes permissions": {
|
||||||
|
sandboxOpts: []PodSandboxOpts{
|
||||||
|
WithPodUserNs(containerID, hostID, size),
|
||||||
|
},
|
||||||
|
hostVolumes: true,
|
||||||
|
containerOpts: []ContainerOpts{
|
||||||
|
WithUserNamespace(containerID, hostID, size),
|
||||||
|
WithIDMapVolumeMount(volumeHostPath, "/mnt", idmap, idmap),
|
||||||
|
// Prints numeric UID and GID for path.
|
||||||
|
// For example, if UID and GID is 0 it will print: =0=0=
|
||||||
|
// We add the "=" signs so we can use assert.Contains() and be sure
|
||||||
|
// the UID/GID is 0 and not things like 100 (that contain 0).
|
||||||
|
// We can't use assert.Equal() easily as it contains timestamp, etc.
|
||||||
|
WithCommand("stat", "-c", "'=%u=%g='", "/mnt/"),
|
||||||
|
},
|
||||||
|
checkOutput: func(t *testing.T, output string) {
|
||||||
|
// The UID and GID should be the current user if chown/remap is done correctly.
|
||||||
|
uid := "0"
|
||||||
|
user, err := user.Current()
|
||||||
|
if user != nil && err == nil {
|
||||||
|
uid = user.Uid
|
||||||
|
}
|
||||||
|
assert.Contains(t, output, "="+uid+"="+uid+"=")
|
||||||
|
},
|
||||||
|
},
|
||||||
"fails with several mappings": {
|
"fails with several mappings": {
|
||||||
sandboxOpts: []PodSandboxOpts{
|
sandboxOpts: []PodSandboxOpts{
|
||||||
WithPodUserNs(containerID, hostID, size),
|
WithPodUserNs(containerID, hostID, size),
|
||||||
@ -94,12 +228,17 @@ func TestPodUserNS(t *testing.T) {
|
|||||||
},
|
},
|
||||||
} {
|
} {
|
||||||
t.Run(name, func(t *testing.T) {
|
t.Run(name, func(t *testing.T) {
|
||||||
cmd := exec.Command("true")
|
if !supportsUserNS() {
|
||||||
cmd.SysProcAttr = &syscall.SysProcAttr{
|
t.Skip("User namespaces are not supported")
|
||||||
Cloneflags: syscall.CLONE_NEWUSER,
|
|
||||||
}
|
}
|
||||||
if err := cmd.Run(); err != nil {
|
if !supportsIDMap(defaultRoot) {
|
||||||
t.Skip("skipping test: user namespaces are unavailable")
|
t.Skipf("ID mappings are not supported on: %v", defaultRoot)
|
||||||
|
}
|
||||||
|
if test.hostVolumes && !supportsIDMap(volumeHostPath) {
|
||||||
|
t.Skipf("ID mappings are not supported on host volume filesystem: %v", volumeHostPath)
|
||||||
|
}
|
||||||
|
if err := supportsRuncIDMap(); err != nil {
|
||||||
|
t.Skipf("OCI runtime doesn't support idmap mounts: %v", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
testPodLogDir := t.TempDir()
|
testPodLogDir := t.TempDir()
|
||||||
@ -164,3 +303,22 @@ func TestPodUserNS(t *testing.T) {
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||

func supportsRuncIDMap() error {
	var r runc.Runc
	features, err := r.Features(context.Background())
	if err != nil {
		// If the features command is not implemented, then runc is too old.
		return fmt.Errorf("features command failed: %w", err)
	}

	if features.Linux.MountExtensions == nil || features.Linux.MountExtensions.IDMap == nil {
		return errors.New("missing `mountExtensions.idmap` entry in `features` command")
	}
	if enabled := features.Linux.MountExtensions.IDMap.Enabled; enabled == nil || !*enabled {
		return errors.New("idmap mounts not supported")
	}

	return nil
}
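
The same check can be run outside the test suite by decoding the JSON printed by `runc features`. This is a sketch only: it assumes `runc` is on `PATH`, the `features` struct is a hand-written subset of the output covering the `linux.mountExtensions.idmap.enabled` field, and `checkIDMapFeature` is a hypothetical name. Unlike the test helper, it also guards against a missing `linux` object.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os/exec"
)

// features is a minimal, hand-written subset of the `runc features` output.
type features struct {
	Linux *struct {
		MountExtensions *struct {
			IDMap *struct {
				Enabled *bool `json:"enabled"`
			} `json:"idmap"`
		} `json:"mountExtensions"`
	} `json:"linux"`
}

// checkIDMapFeature returns nil only when the features document explicitly
// reports idmap mounts as enabled.
func checkIDMapFeature(raw []byte) error {
	var f features
	if err := json.Unmarshal(raw, &f); err != nil {
		return fmt.Errorf("decoding features: %w", err)
	}
	if f.Linux == nil || f.Linux.MountExtensions == nil || f.Linux.MountExtensions.IDMap == nil {
		return errors.New("missing `mountExtensions.idmap` entry in `features` output")
	}
	if enabled := f.Linux.MountExtensions.IDMap.Enabled; enabled == nil || !*enabled {
		return errors.New("idmap mounts not supported")
	}
	return nil
}

func main() {
	// Assumption: the OCI runtime binary is called "runc" and is on PATH.
	out, err := exec.Command("runc", "features").Output()
	if err != nil {
		// An older runc without the features command ends up here.
		fmt.Println("features command failed:", err)
		return
	}
	if err := checkIDMapFeature(out); err != nil {
		fmt.Println("OCI runtime doesn't support idmap mounts:", err)
		return
	}
	fmt.Println("OCI runtime reports idmap mount support")
}
```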
|
||||||
|
@ -162,8 +162,26 @@ func WithMounts(osi osinterface.OS, config *runtime.ContainerConfig, extra []*ru
|
|||||||
return fmt.Errorf("relabel %q with %q failed: %w", src, mountLabel, err)
|
return fmt.Errorf("relabel %q with %q failed: %w", src, mountLabel, err)
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if mount.UidMappings != nil || mount.GidMappings != nil {
|
|
||||||
return fmt.Errorf("idmap mounts not yet supported, but they were requested for: %q", src)
|
var uidMapping []runtimespec.LinuxIDMapping
|
||||||
|
if mount.UidMappings != nil {
|
||||||
|
for _, mapping := range mount.UidMappings {
|
||||||
|
uidMapping = append(uidMapping, runtimespec.LinuxIDMapping{
|
||||||
|
HostID: mapping.HostId,
|
||||||
|
ContainerID: mapping.ContainerId,
|
||||||
|
Size: mapping.Length,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
var gidMapping []runtimespec.LinuxIDMapping
|
||||||
|
if mount.GidMappings != nil {
|
||||||
|
for _, mapping := range mount.GidMappings {
|
||||||
|
gidMapping = append(gidMapping, runtimespec.LinuxIDMapping{
|
||||||
|
HostID: mapping.HostId,
|
||||||
|
ContainerID: mapping.ContainerId,
|
||||||
|
Size: mapping.Length,
|
||||||
|
})
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
s.Mounts = append(s.Mounts, runtimespec.Mount{
|
s.Mounts = append(s.Mounts, runtimespec.Mount{
|
||||||
@ -171,6 +189,8 @@ func WithMounts(osi osinterface.OS, config *runtime.ContainerConfig, extra []*ru
|
|||||||
Destination: dst,
|
Destination: dst,
|
||||||
Type: "bind",
|
Type: "bind",
|
||||||
Options: options,
|
Options: options,
|
||||||
|
UIDMappings: uidMapping,
|
||||||
|
GIDMappings: gidMapping,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
return nil
|
return nil
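
The UID and GID loops in this hunk perform the same field-by-field conversion. One way to express that once is a small helper, sketched below with local stand-in types so that it compiles on its own: `criIDMapping` stands in for `runtime.IDMapping`, `specIDMapping` for `runtimespec.LinuxIDMapping`, and `toSpecMappings` is a hypothetical name that is not part of this change. Like the loops in the hunk, it returns a nil slice for nil or empty input.

```go
package main

import "fmt"

// criIDMapping is a stand-in for runtime.IDMapping from k8s.io/cri-api.
type criIDMapping struct {
	HostId      uint32
	ContainerId uint32
	Length      uint32
}

// specIDMapping is a stand-in for runtimespec.LinuxIDMapping.
type specIDMapping struct {
	ContainerID uint32
	HostID      uint32
	Size        uint32
}

// toSpecMappings converts CRI mappings into OCI runtime-spec mappings.
// Ranging over a nil slice is a no-op, so no explicit nil check is needed.
func toSpecMappings(in []*criIDMapping) []specIDMapping {
	var out []specIDMapping
	for _, m := range in {
		out = append(out, specIDMapping{
			HostID:      m.HostId,
			ContainerID: m.ContainerId,
			Size:        m.Length,
		})
	}
	return out
}

func main() {
	idmap := []*criIDMapping{{ContainerId: 0, HostId: 65536, Length: 65536}}
	fmt.Printf("%+v\n", toSpecMappings(idmap))
}
```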
|
||||||
|
@ -157,7 +157,7 @@ func (c *criService) CreateContainer(ctx context.Context, r *runtime.CreateConta
|
|||||||
var volumeMounts []*runtime.Mount
|
var volumeMounts []*runtime.Mount
|
||||||
if !c.config.IgnoreImageDefinedVolumes {
|
if !c.config.IgnoreImageDefinedVolumes {
|
||||||
// Create container image volumes mounts.
|
// Create container image volumes mounts.
|
||||||
volumeMounts = c.volumeMounts(platform, containerRootDir, config.GetMounts(), &image.ImageSpec.Config)
|
volumeMounts = c.volumeMounts(platform, containerRootDir, config, &image.ImageSpec.Config)
|
||||||
} else if len(image.ImageSpec.Config.Volumes) != 0 {
|
} else if len(image.ImageSpec.Config.Volumes) != 0 {
|
||||||
log.G(ctx).Debugf("Ignoring volumes defined in image %v because IgnoreImageDefinedVolumes is set", image.ID)
|
log.G(ctx).Debugf("Ignoring volumes defined in image %v because IgnoreImageDefinedVolumes is set", image.ID)
|
||||||
}
|
}
|
||||||
@ -341,7 +341,17 @@ func (c *criService) CreateContainer(ctx context.Context, r *runtime.CreateConta
|
|||||||
// volumeMounts sets up image volumes for container. Rely on the removal of container
|
// volumeMounts sets up image volumes for container. Rely on the removal of container
|
||||||
// root directory to do cleanup. Note that image volume will be skipped, if there is criMounts
|
// root directory to do cleanup. Note that image volume will be skipped, if there is criMounts
|
||||||
// specified with the same destination.
|
// specified with the same destination.
|
||||||
func (c *criService) volumeMounts(platform platforms.Platform, containerRootDir string, criMounts []*runtime.Mount, config *imagespec.ImageConfig) []*runtime.Mount {
|
func (c *criService) volumeMounts(platform platforms.Platform, containerRootDir string, containerConfig *runtime.ContainerConfig, config *imagespec.ImageConfig) []*runtime.Mount {
|
||||||
|
var uidMappings, gidMappings []*runtime.IDMapping
|
||||||
|
if platform.OS == "linux" {
|
||||||
|
if usernsOpts := containerConfig.GetLinux().GetSecurityContext().GetNamespaceOptions().GetUsernsOptions(); usernsOpts != nil {
|
||||||
|
uidMappings = usernsOpts.GetUids()
|
||||||
|
gidMappings = usernsOpts.GetGids()
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
criMounts := containerConfig.GetMounts()
|
||||||
|
|
||||||
if len(config.Volumes) == 0 {
|
if len(config.Volumes) == 0 {
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
@ -371,6 +381,8 @@ func (c *criService) volumeMounts(platform platforms.Platform, containerRootDir
|
|||||||
ContainerPath: dst,
|
ContainerPath: dst,
|
||||||
HostPath: src,
|
HostPath: src,
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
return mounts
|
return mounts
|
||||||
@ -966,11 +978,17 @@ func (c *criService) buildDarwinSpec(
|
|||||||
return specOpts, nil
|
return specOpts, nil
|
||||||
}
|
}
|
||||||
|
|
||||||
// containerMounts sets up necessary container system file mounts
|
// linuxContainerMounts sets up necessary container system file mounts
|
||||||
// including /dev/shm, /etc/hosts and /etc/resolv.conf.
|
// including /dev/shm, /etc/hosts and /etc/resolv.conf.
|
||||||
func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.ContainerConfig) []*runtime.Mount {
|
func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.ContainerConfig) []*runtime.Mount {
|
||||||
var mounts []*runtime.Mount
|
var mounts []*runtime.Mount
|
||||||
securityContext := config.GetLinux().GetSecurityContext()
|
securityContext := config.GetLinux().GetSecurityContext()
|
||||||
|
var uidMappings, gidMappings []*runtime.IDMapping
|
||||||
|
if usernsOpts := securityContext.GetNamespaceOptions().GetUsernsOptions(); usernsOpts != nil {
|
||||||
|
uidMappings = usernsOpts.GetUids()
|
||||||
|
gidMappings = usernsOpts.GetGids()
|
||||||
|
}
|
||||||
|
|
||||||
if !isInCRIMounts(etcHostname, config.GetMounts()) {
|
if !isInCRIMounts(etcHostname, config.GetMounts()) {
|
||||||
// /etc/hostname is added since 1.1.6, 1.2.4 and 1.3.
|
// /etc/hostname is added since 1.1.6, 1.2.4 and 1.3.
|
||||||
// For in-place upgrade, the old sandbox doesn't have the hostname file,
|
// For in-place upgrade, the old sandbox doesn't have the hostname file,
|
||||||
@ -984,6 +1002,8 @@ func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.Cont
|
|||||||
HostPath: hostpath,
|
HostPath: hostpath,
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -994,6 +1014,8 @@ func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.Cont
|
|||||||
HostPath: c.getSandboxHosts(sandboxID),
|
HostPath: c.getSandboxHosts(sandboxID),
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1005,6 +1027,8 @@ func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.Cont
|
|||||||
HostPath: c.getResolvPath(sandboxID),
|
HostPath: c.getResolvPath(sandboxID),
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1018,6 +1042,17 @@ func (c *criService) linuxContainerMounts(sandboxID string, config *runtime.Cont
|
|||||||
HostPath: sandboxDevShm,
|
HostPath: sandboxDevShm,
|
||||||
Readonly: false,
|
Readonly: false,
|
||||||
SelinuxRelabel: sandboxDevShm != devShm,
|
SelinuxRelabel: sandboxDevShm != devShm,
|
||||||
|
// XXX: tmpfs support for idmap mounts got merged in
|
||||||
|
// Linux 6.3.
|
||||||
|
// Our Ubuntu 22.04 CI runs with 5.15 kernels, so
|
||||||
|
// disabling idmap mounts for this case makes the CI
|
||||||
|
// happy (the other fs used support idmap mounts in 5.15
|
||||||
|
// kernels).
|
||||||
|
// We can enable this at a later stage, but as this
|
||||||
|
// tmpfs mount is exposed empty to the container (no
|
||||||
|
// prepopulated files) and using the hostIPC with userns
|
||||||
|
// is blocked by k8s, we can just avoid using the
|
||||||
|
// mappings and it should work fine.
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
return mounts
|
return mounts
|
||||||
|
@ -231,12 +231,22 @@ func TestContainerSpecCommand(t *testing.T) {
|
|||||||
|
|
||||||
func TestVolumeMounts(t *testing.T) {
|
func TestVolumeMounts(t *testing.T) {
|
||||||
testContainerRootDir := "test-container-root"
|
testContainerRootDir := "test-container-root"
|
||||||
|
idmap := []*runtime.IDMapping{
|
||||||
|
{
|
||||||
|
ContainerId: 0,
|
||||||
|
HostId: 100,
|
||||||
|
Length: 1,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
for _, test := range []struct {
|
for _, test := range []struct {
|
||||||
desc string
|
desc string
|
||||||
platform platforms.Platform
|
platform platforms.Platform
|
||||||
criMounts []*runtime.Mount
|
criMounts []*runtime.Mount
|
||||||
|
usernsEnabled bool
|
||||||
imageVolumes map[string]struct{}
|
imageVolumes map[string]struct{}
|
||||||
expectedMountDest []string
|
expectedMountDest []string
|
||||||
|
expectedMappings []*runtime.IDMapping
|
||||||
}{
|
}{
|
||||||
{
|
{
|
||||||
desc: "should setup rw mount for image volumes",
|
desc: "should setup rw mount for image volumes",
|
||||||
@ -297,25 +307,88 @@ func TestVolumeMounts(t *testing.T) {
|
|||||||
"/abs/test-volume-4",
|
"/abs/test-volume-4",
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
desc: "should include mappings for image volumes on Linux",
|
||||||
|
platform: platforms.Platform{OS: "linux"},
|
||||||
|
usernsEnabled: true,
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"/test-volume-1/": {},
|
||||||
|
"/test-volume-2/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-2/",
|
||||||
|
"/test-volume-2/",
|
||||||
|
},
|
||||||
|
expectedMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "should NOT include mappings for image volumes on Linux if !userns",
|
||||||
|
platform: platforms.Platform{OS: "linux"},
|
||||||
|
usernsEnabled: false,
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"/test-volume-1/": {},
|
||||||
|
"/test-volume-2/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-2/",
|
||||||
|
"/test-volume-2/",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "should convert rel imageVolume paths to abs paths and add userns mappings",
|
||||||
|
platform: platforms.Platform{OS: "linux"},
|
||||||
|
usernsEnabled: true,
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"test-volume-1/": {},
|
||||||
|
"C:/test-volume-2/": {},
|
||||||
|
"../../test-volume-3/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-1",
|
||||||
|
"/C:/test-volume-2",
|
||||||
|
"/test-volume-3",
|
||||||
|
},
|
||||||
|
expectedMappings: idmap,
|
||||||
|
},
|
||||||
} {
|
} {
|
||||||
test := test
|
test := test
|
||||||
t.Run(test.desc, func(t *testing.T) {
|
t.Run(test.desc, func(t *testing.T) {
|
||||||
config := &imagespec.ImageConfig{
|
config := &imagespec.ImageConfig{
|
||||||
Volumes: test.imageVolumes,
|
Volumes: test.imageVolumes,
|
||||||
}
|
}
|
||||||
|
containerConfig := &runtime.ContainerConfig{Mounts: test.criMounts}
|
||||||
|
if test.usernsEnabled {
|
||||||
|
containerConfig.Linux = &runtime.LinuxContainerConfig{
|
||||||
|
SecurityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
|
NamespaceOptions: &runtime.NamespaceOption{
|
||||||
|
UsernsOptions: &runtime.UserNamespace{
|
||||||
|
Mode: runtime.NamespaceMode_POD,
|
||||||
|
Uids: idmap,
|
||||||
|
Gids: idmap,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
c := newTestCRIService()
|
c := newTestCRIService()
|
||||||
got := c.volumeMounts(test.platform, testContainerRootDir, test.criMounts, config)
|
got := c.volumeMounts(test.platform, testContainerRootDir, containerConfig, config)
|
||||||
assert.Len(t, got, len(test.expectedMountDest))
|
assert.Len(t, got, len(test.expectedMountDest))
|
||||||
for _, dest := range test.expectedMountDest {
|
for _, dest := range test.expectedMountDest {
|
||||||
found := false
|
found := false
|
||||||
for _, m := range got {
|
for _, m := range got {
|
||||||
if m.ContainerPath == dest {
|
if m.ContainerPath != dest {
|
||||||
found = true
|
continue
|
||||||
assert.Equal(t,
|
|
||||||
filepath.Dir(m.HostPath),
|
|
||||||
filepath.Join(testContainerRootDir, "volumes"))
|
|
||||||
break
|
|
||||||
}
|
}
|
||||||
|
found = true
|
||||||
|
assert.Equal(t,
|
||||||
|
filepath.Dir(m.HostPath),
|
||||||
|
filepath.Join(testContainerRootDir, "volumes"))
|
||||||
|
if test.expectedMappings != nil {
|
||||||
|
assert.Equal(t, test.expectedMappings, m.UidMappings)
|
||||||
|
assert.Equal(t, test.expectedMappings, m.GidMappings)
|
||||||
|
}
|
||||||
|
break
|
||||||
}
|
}
|
||||||
assert.True(t, found)
|
assert.True(t, found)
|
||||||
}
|
}
|
||||||
@ -481,6 +554,14 @@ func TestBaseRuntimeSpec(t *testing.T) {
|
|||||||
|
|
||||||
func TestLinuxContainerMounts(t *testing.T) {
|
func TestLinuxContainerMounts(t *testing.T) {
|
||||||
const testSandboxID = "test-id"
|
const testSandboxID = "test-id"
|
||||||
|
idmap := []*runtime.IDMapping{
|
||||||
|
{
|
||||||
|
ContainerId: 0,
|
||||||
|
HostId: 100,
|
||||||
|
Length: 1,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
for _, test := range []struct {
|
for _, test := range []struct {
|
||||||
desc string
|
desc string
|
||||||
statFn func(string) (os.FileInfo, error)
|
statFn func(string) (os.FileInfo, error)
|
||||||
@ -550,6 +631,50 @@ func TestLinuxContainerMounts(t *testing.T) {
|
|||||||
},
|
},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
desc: "should setup uidMappings/gidMappings when userns is used",
|
||||||
|
securityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
|
NamespaceOptions: &runtime.NamespaceOption{
|
||||||
|
UsernsOptions: &runtime.UserNamespace{
|
||||||
|
Mode: runtime.NamespaceMode_POD,
|
||||||
|
Uids: idmap,
|
||||||
|
Gids: idmap,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
expectedMounts: []*runtime.Mount{
|
||||||
|
{
|
||||||
|
ContainerPath: "/etc/hostname",
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "hostname"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: "/etc/hosts",
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "hosts"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: resolvConfPath,
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "resolv.conf"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: "/dev/shm",
|
||||||
|
HostPath: filepath.Join(testStateDir, sandboxesDir, testSandboxID, "shm"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
{
|
{
|
||||||
desc: "should use host /dev/shm when host ipc is set",
|
desc: "should use host /dev/shm when host ipc is set",
|
||||||
securityContext: &runtime.LinuxContainerSecurityContext{
|
securityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
|
@ -143,7 +143,7 @@ func (c *criService) CreateContainer(ctx context.Context, r *runtime.CreateConta
|
|||||||
var volumeMounts []*runtime.Mount
|
var volumeMounts []*runtime.Mount
|
||||||
if !c.config.IgnoreImageDefinedVolumes {
|
if !c.config.IgnoreImageDefinedVolumes {
|
||||||
// Create container image volumes mounts.
|
// Create container image volumes mounts.
|
||||||
volumeMounts = c.volumeMounts(containerRootDir, config.GetMounts(), &image.ImageSpec.Config)
|
volumeMounts = c.volumeMounts(containerRootDir, config, &image.ImageSpec.Config)
|
||||||
} else if len(image.ImageSpec.Config.Volumes) != 0 {
|
} else if len(image.ImageSpec.Config.Volumes) != 0 {
|
||||||
log.G(ctx).Debugf("Ignoring volumes defined in image %v because IgnoreImageDefinedVolumes is set", image.ID)
|
log.G(ctx).Debugf("Ignoring volumes defined in image %v because IgnoreImageDefinedVolumes is set", image.ID)
|
||||||
}
|
}
|
||||||
@ -318,10 +318,19 @@ func (c *criService) CreateContainer(ctx context.Context, r *runtime.CreateConta
|
|||||||
// volumeMounts sets up image volumes for container. Rely on the removal of container
|
// volumeMounts sets up image volumes for container. Rely on the removal of container
|
||||||
// root directory to do cleanup. Note that image volume will be skipped, if there is criMounts
|
// root directory to do cleanup. Note that image volume will be skipped, if there is criMounts
|
||||||
// specified with the same destination.
|
// specified with the same destination.
|
||||||
func (c *criService) volumeMounts(containerRootDir string, criMounts []*runtime.Mount, config *imagespec.ImageConfig) []*runtime.Mount {
|
func (c *criService) volumeMounts(containerRootDir string, containerConfig *runtime.ContainerConfig, config *imagespec.ImageConfig) []*runtime.Mount {
|
||||||
if len(config.Volumes) == 0 {
|
if len(config.Volumes) == 0 {
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
var uidMappings, gidMappings []*runtime.IDMapping
|
||||||
|
if goruntime.GOOS != "windows" {
|
||||||
|
if usernsOpts := containerConfig.GetLinux().GetSecurityContext().GetNamespaceOptions().GetUsernsOptions(); usernsOpts != nil {
|
||||||
|
uidMappings = usernsOpts.GetUids()
|
||||||
|
gidMappings = usernsOpts.GetGids()
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
criMounts := containerConfig.GetMounts()
|
||||||
var mounts []*runtime.Mount
|
var mounts []*runtime.Mount
|
||||||
for dst := range config.Volumes {
|
for dst := range config.Volumes {
|
||||||
if isInCRIMounts(dst, criMounts) {
|
if isInCRIMounts(dst, criMounts) {
|
||||||
@ -343,6 +352,8 @@ func (c *criService) volumeMounts(containerRootDir string, criMounts []*runtime.
|
|||||||
ContainerPath: dst,
|
ContainerPath: dst,
|
||||||
HostPath: src,
|
HostPath: src,
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
return mounts
|
return mounts
|
||||||
|
@ -62,6 +62,12 @@ const (
|
|||||||
func (c *criService) containerMounts(sandboxID string, config *runtime.ContainerConfig) []*runtime.Mount {
|
func (c *criService) containerMounts(sandboxID string, config *runtime.ContainerConfig) []*runtime.Mount {
|
||||||
var mounts []*runtime.Mount
|
var mounts []*runtime.Mount
|
||||||
securityContext := config.GetLinux().GetSecurityContext()
|
securityContext := config.GetLinux().GetSecurityContext()
|
||||||
|
var uidMappings, gidMappings []*runtime.IDMapping
|
||||||
|
if usernsOpts := securityContext.GetNamespaceOptions().GetUsernsOptions(); usernsOpts != nil {
|
||||||
|
uidMappings = usernsOpts.GetUids()
|
||||||
|
gidMappings = usernsOpts.GetGids()
|
||||||
|
}
|
||||||
|
|
||||||
if !isInCRIMounts(etcHostname, config.GetMounts()) {
|
if !isInCRIMounts(etcHostname, config.GetMounts()) {
|
||||||
// /etc/hostname is added since 1.1.6, 1.2.4 and 1.3.
|
// /etc/hostname is added since 1.1.6, 1.2.4 and 1.3.
|
||||||
// For in-place upgrade, the old sandbox doesn't have the hostname file,
|
// For in-place upgrade, the old sandbox doesn't have the hostname file,
|
||||||
@ -75,6 +81,8 @@ func (c *criService) containerMounts(sandboxID string, config *runtime.Container
|
|||||||
HostPath: hostpath,
|
HostPath: hostpath,
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -85,6 +93,8 @@ func (c *criService) containerMounts(sandboxID string, config *runtime.Container
|
|||||||
HostPath: c.getSandboxHosts(sandboxID),
|
HostPath: c.getSandboxHosts(sandboxID),
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -96,6 +106,8 @@ func (c *criService) containerMounts(sandboxID string, config *runtime.Container
|
|||||||
HostPath: c.getResolvPath(sandboxID),
|
HostPath: c.getResolvPath(sandboxID),
|
||||||
Readonly: securityContext.GetReadonlyRootfs(),
|
Readonly: securityContext.GetReadonlyRootfs(),
|
||||||
SelinuxRelabel: true,
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: uidMappings,
|
||||||
|
GidMappings: gidMappings,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -109,6 +121,16 @@ func (c *criService) containerMounts(sandboxID string, config *runtime.Container
|
|||||||
HostPath: sandboxDevShm,
|
HostPath: sandboxDevShm,
|
||||||
Readonly: false,
|
Readonly: false,
|
||||||
SelinuxRelabel: sandboxDevShm != devShm,
|
SelinuxRelabel: sandboxDevShm != devShm,
|
||||||
|
// XXX: tmpfs support for idmap mounts got merged in
|
||||||
|
// Linux 6.3.
|
||||||
|
// Our CI runs with 5.15 kernels, so disabling idmap
|
||||||
|
// mounts for this case makes the CI happy (the other fs
|
||||||
|
// used support idmap mounts in 5.15 kernels).
|
||||||
|
// We can enable this at a later stage, but as this
|
||||||
|
// tmpfs mount is exposed empty to the container (no
|
||||||
|
// prepopulated files) and using the hostIPC with userns
|
||||||
|
// is blocked by k8s, we can just avoid using the
|
||||||
|
// mappings and it should work fine.
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
return mounts
|
return mounts
|
||||||
|
@ -459,6 +459,14 @@ func TestContainerAndSandboxPrivileged(t *testing.T) {
|
|||||||
|
|
||||||
func TestContainerMounts(t *testing.T) {
|
func TestContainerMounts(t *testing.T) {
|
||||||
const testSandboxID = "test-id"
|
const testSandboxID = "test-id"
|
||||||
|
idmap := []*runtime.IDMapping{
|
||||||
|
{
|
||||||
|
ContainerId: 0,
|
||||||
|
HostId: 100,
|
||||||
|
Length: 1,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
for _, test := range []struct {
|
for _, test := range []struct {
|
||||||
desc string
|
desc string
|
||||||
statFn func(string) (os.FileInfo, error)
|
statFn func(string) (os.FileInfo, error)
|
||||||
@ -528,6 +536,50 @@ func TestContainerMounts(t *testing.T) {
|
|||||||
},
|
},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
desc: "should setup uidMappings/gidMappings when userns is used",
|
||||||
|
securityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
|
NamespaceOptions: &runtime.NamespaceOption{
|
||||||
|
UsernsOptions: &runtime.UserNamespace{
|
||||||
|
Mode: runtime.NamespaceMode_POD,
|
||||||
|
Uids: idmap,
|
||||||
|
Gids: idmap,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
expectedMounts: []*runtime.Mount{
|
||||||
|
{
|
||||||
|
ContainerPath: "/etc/hostname",
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "hostname"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: "/etc/hosts",
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "hosts"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: resolvConfPath,
|
||||||
|
HostPath: filepath.Join(testRootDir, sandboxesDir, testSandboxID, "resolv.conf"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
UidMappings: idmap,
|
||||||
|
GidMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
ContainerPath: "/dev/shm",
|
||||||
|
HostPath: filepath.Join(testStateDir, sandboxesDir, testSandboxID, "shm"),
|
||||||
|
Readonly: false,
|
||||||
|
SelinuxRelabel: true,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
{
|
{
|
||||||
desc: "should use host /dev/shm when host ipc is set",
|
desc: "should use host /dev/shm when host ipc is set",
|
||||||
securityContext: &runtime.LinuxContainerSecurityContext{
|
securityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
@ -2259,3 +2311,141 @@ containerEdits:
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// TestLinuxVolumeMounts tests the linux-specific parts of VolumeMounts.
|
||||||
|
func TestLinuxVolumeMounts(t *testing.T) {
|
||||||
|
testContainerRootDir := "test-container-root"
|
||||||
|
idmap := []*runtime.IDMapping{
|
||||||
|
{
|
||||||
|
ContainerId: 0,
|
||||||
|
HostId: 100,
|
||||||
|
Length: 1,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, test := range []struct {
|
||||||
|
desc string
|
||||||
|
criMounts []*runtime.Mount
|
||||||
|
imageVolumes map[string]struct{}
|
||||||
|
usernsEnabled bool
|
||||||
|
expectedMountDest []string
|
||||||
|
expectedMappings []*runtime.IDMapping
|
||||||
|
}{
|
||||||
|
{
|
||||||
|
desc: "should skip image volumes if already mounted by CRI",
|
||||||
|
usernsEnabled: true,
|
||||||
|
criMounts: []*runtime.Mount{
|
||||||
|
{
|
||||||
|
ContainerPath: "/test-volume-1",
|
||||||
|
HostPath: "/test-hostpath-1",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"/test-volume-1": {},
|
||||||
|
"/test-volume-2": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-2",
|
||||||
|
},
|
||||||
|
expectedMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "should include mappings for image volumes",
|
||||||
|
usernsEnabled: true,
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"/test-volume-1/": {},
|
||||||
|
"/test-volume-2/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-2/",
|
||||||
|
"/test-volume-2/",
|
||||||
|
},
|
||||||
|
expectedMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "should convert rel imageVolume paths to abs paths",
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"test-volume-1/": {},
|
||||||
|
"./test-volume-2/": {},
|
||||||
|
"../../test-volume-3/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-1",
|
||||||
|
"/test-volume-2",
|
||||||
|
"/test-volume-3",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "should convert rel imageVolume paths to abs paths and add userns mappings",
|
||||||
|
usernsEnabled: true,
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"test-volume-1/": {},
|
||||||
|
"./test-volume-2/": {},
|
||||||
|
"../../test-volume-3/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-1",
|
||||||
|
"/test-volume-2",
|
||||||
|
"/test-volume-3",
|
||||||
|
},
|
||||||
|
expectedMappings: idmap,
|
||||||
|
},
|
||||||
|
{
|
||||||
|
desc: "doesn't include mappings for image volumes if userns is disabled",
|
||||||
|
imageVolumes: map[string]struct{}{
|
||||||
|
"/test-volume-1/": {},
|
||||||
|
"/test-volume-2/": {},
|
||||||
|
},
|
||||||
|
expectedMountDest: []string{
|
||||||
|
"/test-volume-2/",
|
||||||
|
"/test-volume-2/",
|
||||||
|
},
|
||||||
|
},
|
||||||
|
} {
|
||||||
|
test := test
|
||||||
|
t.Run(test.desc, func(t *testing.T) {
|
||||||
|
config := &imagespec.ImageConfig{
|
||||||
|
Volumes: test.imageVolumes,
|
||||||
|
}
|
||||||
|
containerConfig := &runtime.ContainerConfig{
|
||||||
|
Mounts: test.criMounts,
|
||||||
|
}
|
||||||
|
|
||||||
|
if test.usernsEnabled {
|
||||||
|
containerConfig.Linux = &runtime.LinuxContainerConfig{
|
||||||
|
SecurityContext: &runtime.LinuxContainerSecurityContext{
|
||||||
|
NamespaceOptions: &runtime.NamespaceOption{
|
||||||
|
UsernsOptions: &runtime.UserNamespace{
|
||||||
|
Mode: runtime.NamespaceMode_POD,
|
||||||
|
Uids: idmap,
|
||||||
|
Gids: idmap,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
c := newTestCRIService()
|
||||||
|
got := c.volumeMounts(testContainerRootDir, containerConfig, config)
|
||||||
|
assert.Len(t, got, len(test.expectedMountDest))
|
||||||
|
for _, dest := range test.expectedMountDest {
|
||||||
|
found := false
|
||||||
|
for _, m := range got {
|
||||||
|
if m.ContainerPath != dest {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
found = true
|
||||||
|
assert.Equal(t,
|
||||||
|
filepath.Dir(m.HostPath),
|
||||||
|
filepath.Join(testContainerRootDir, "volumes"))
|
||||||
|
|
||||||
|
if test.expectedMappings != nil {
|
||||||
|
assert.Equal(t, test.expectedMappings, m.UidMappings)
|
||||||
|
assert.Equal(t, test.expectedMappings, m.GidMappings)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
assert.True(t, found)
|
||||||
|
}
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
@ -279,8 +279,9 @@ func TestVolumeMounts(t *testing.T) {
|
|||||||
config := &imagespec.ImageConfig{
|
config := &imagespec.ImageConfig{
|
||||||
Volumes: test.imageVolumes,
|
Volumes: test.imageVolumes,
|
||||||
}
|
}
|
||||||
|
containerConfig := &runtime.ContainerConfig{Mounts: test.criMounts}
|
||||||
c := newTestCRIService()
|
c := newTestCRIService()
|
||||||
got := c.volumeMounts(testContainerRootDir, test.criMounts, config)
|
got := c.volumeMounts(testContainerRootDir, containerConfig, config)
|
||||||
assert.Len(t, got, len(test.expectedMountDest))
|
assert.Len(t, got, len(test.expectedMountDest))
|
||||||
for _, dest := range test.expectedMountDest {
|
for _, dest := range test.expectedMountDest {
|
||||||
found := false
|
found := false
|
||||||
|
@ -21,6 +21,7 @@ package process
|
|||||||
import (
|
import (
|
||||||
"context"
|
"context"
|
||||||
"encoding/json"
|
"encoding/json"
|
||||||
|
"errors"
|
||||||
"fmt"
|
"fmt"
|
||||||
"io"
|
"io"
|
||||||
"os"
|
"os"
|
||||||
@ -140,6 +141,13 @@ func (p *Init) Create(ctx context.Context, r *CreateConfig) error {
|
|||||||
if socket != nil {
|
if socket != nil {
|
||||||
opts.ConsoleSocket = socket
|
opts.ConsoleSocket = socket
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// runc silently ignores features it doesn't know about, so where that is
|
||||||
|
// problematic, check that this runc version supports them.
|
||||||
|
if err := p.validateRuncFeatures(ctx, r.Bundle); err != nil {
|
||||||
|
return fmt.Errorf("failed to detect OCI runtime features: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
if err := p.runtime.Create(ctx, r.ID, r.Bundle, opts); err != nil {
|
if err := p.runtime.Create(ctx, r.ID, r.Bundle, opts); err != nil {
|
||||||
return p.runtimeError(err, "OCI runtime create failed")
|
return p.runtimeError(err, "OCI runtime create failed")
|
||||||
}
|
}
|
||||||
@ -173,6 +181,56 @@ func (p *Init) Create(ctx context.Context, r *CreateConfig) error {
|
|||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func (p *Init) validateRuncFeatures(ctx context.Context, bundle string) error {
|
||||||
|
// TODO: We should remove the logic from here and rebase on #8509.
|
||||||
|
// This way we can avoid the call to readConfig() here and the call to p.runtime.Features()
|
||||||
|
// in validateIDMapMounts().
|
||||||
|
// But that PR is not yet merged, nor is it clear whether it will be refactored.
|
||||||
|
// Do this contained hack for now.
|
||||||
|
spec, err := readConfig(bundle)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to read config: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
if err := p.validateIDMapMounts(ctx, spec); err != nil {
|
||||||
|
return fmt.Errorf("OCI runtime doesn't support idmap mounts: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func (p *Init) validateIDMapMounts(ctx context.Context, spec *specs.Spec) error {
|
||||||
|
var used bool
|
||||||
|
for _, m := range spec.Mounts {
|
||||||
|
if m.UIDMappings != nil || m.GIDMappings != nil {
|
||||||
|
used = true
|
||||||
|
break
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if !used {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// From here onwards, we require idmap mounts. So if we fail to check, we return an error.
|
||||||
|
features, err := p.runtime.Features(ctx)
|
||||||
|
if err != nil {
|
||||||
|
// If the features command is not implemented, then runc is too old.
|
||||||
|
return fmt.Errorf("features command failed: %w", err)
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
if features.Linux.MountExtensions == nil || features.Linux.MountExtensions.IDMap == nil {
|
||||||
|
return errors.New("missing `mountExtensions.idmap` entry in `features` command")
|
||||||
|
}
|
||||||
|
|
||||||
|
if enabled := features.Linux.MountExtensions.IDMap.Enabled; enabled == nil || !*enabled {
|
||||||
|
return errors.New("idmap mounts not supported")
|
||||||
|
}
|
||||||
|
|
||||||
|
return nil
|
||||||
|
}
|
||||||
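The check in `validateIDMapMounts` can be read as a two-step decision: do nothing unless some mount carries mappings, then require the runtime to report idmap support explicitly. The sketch below restates that logic with hypothetical stand-in types (`idMap`, `mountExtensions`, `linuxFeatures`, `mount`) instead of the vendored runtime-spec ones. The extra nil check on the Linux section is an addition of this sketch, not something taken from the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the vendored runtime-spec features types.
type idMap struct{ Enabled *bool }
type mountExtensions struct{ IDMap *idMap }
type linuxFeatures struct{ MountExtensions *mountExtensions }

// mount is a hypothetical stand-in for an OCI spec mount entry.
// Only the nil-ness of the mapping slices matters for this check.
type mount struct {
	Destination string
	UIDMappings []uint32
	GIDMappings []uint32
}

// needsIDMap reports whether any mount requests ID mappings.
func needsIDMap(mounts []mount) bool {
	for _, m := range mounts {
		if m.UIDMappings != nil || m.GIDMappings != nil {
			return true
		}
	}
	return false
}

// checkIDMapSupport returns an error only when idmap mounts are requested
// and the runtime does not explicitly report support for them.
func checkIDMapSupport(mounts []mount, linux *linuxFeatures) error {
	if !needsIDMap(mounts) {
		return nil
	}
	if linux == nil || linux.MountExtensions == nil || linux.MountExtensions.IDMap == nil {
		return errors.New("runtime does not report mountExtensions.idmap")
	}
	if enabled := linux.MountExtensions.IDMap.Enabled; enabled == nil || !*enabled {
		return errors.New("idmap mounts not supported")
	}
	return nil
}

func main() {
	on := true
	mounts := []mount{{Destination: "/etc/hosts", UIDMappings: []uint32{0}}}

	// A runtime that reports nothing about idmap is treated as unsupported.
	fmt.Println(checkIDMapSupport(mounts, &linuxFeatures{}))

	// A runtime that explicitly enables idmap passes the check.
	supported := &linuxFeatures{MountExtensions: &mountExtensions{IDMap: &idMap{Enabled: &on}}}
	fmt.Println(checkIDMapSupport(mounts, supported))
}
```

Treating a missing or unset `Enabled` field as "not supported" follows the reasoning in the diff's comment: once mappings are requested, idmap mounts are required, so an unknown answer has to fail rather than be silently ignored.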
|
|
||||||
func (p *Init) openStdin(path string) error {
|
func (p *Init) openStdin(path string) error {
|
||||||
sc, err := fifo.OpenFifo(context.Background(), path, unix.O_WRONLY|unix.O_NONBLOCK, 0)
|
sc, err := fifo.OpenFifo(context.Background(), path, unix.O_WRONLY|unix.O_NONBLOCK, 0)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
|
@ -21,6 +21,7 @@ package process
|
|||||||
import (
|
import (
|
||||||
"context"
|
"context"
|
||||||
"encoding/json"
|
"encoding/json"
|
||||||
|
"errors"
|
||||||
"fmt"
|
"fmt"
|
||||||
"io"
|
"io"
|
||||||
"os"
|
"os"
|
||||||
@ -31,6 +32,7 @@ import (
|
|||||||
|
|
||||||
"github.com/containerd/containerd/errdefs"
|
"github.com/containerd/containerd/errdefs"
|
||||||
runc "github.com/containerd/go-runc"
|
runc "github.com/containerd/go-runc"
|
||||||
|
specs "github.com/opencontainers/runtime-spec/specs-go"
|
||||||
"golang.org/x/sys/unix"
|
"golang.org/x/sys/unix"
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -39,6 +41,8 @@ const (
|
|||||||
RuncRoot = "/run/containerd/runc"
|
RuncRoot = "/run/containerd/runc"
|
||||||
// InitPidFile name of the file that contains the init pid
|
// InitPidFile name of the file that contains the init pid
|
||||||
InitPidFile = "init.pid"
|
InitPidFile = "init.pid"
|
||||||
|
// configFile is the name of the runc config file
|
||||||
|
configFile = "config.json"
|
||||||
)
|
)
|
||||||
|
|
||||||
// safePid is a thread safe wrapper for pid.
|
// safePid is a thread safe wrapper for pid.
|
||||||
@ -184,3 +188,23 @@ func stateName(v interface{}) string {
|
|||||||
}
|
}
|
||||||
panic(fmt.Errorf("invalid state %v", v))
|
panic(fmt.Errorf("invalid state %v", v))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func readConfig(path string) (spec *specs.Spec, err error) {
	cfg := filepath.Join(path, configFile)
	f, err := os.Open(cfg)
	if err != nil {
		if os.IsNotExist(err) {
			return nil, fmt.Errorf("JSON specification file %s not found", cfg)
		}
		return nil, err
	}
	defer f.Close()

	if err = json.NewDecoder(f).Decode(&spec); err != nil {
		return nil, fmt.Errorf("failed to parse config: %w", err)
	}
	if spec == nil {
		return nil, errors.New("config cannot be null")
	}
	return spec, nil
}
|
||||||
|
@ -1 +1 @@
|
|||||||
1.8.7
|
1.9
|
||||||
|
24
vendor/github.com/opencontainers/runtime-spec/specs-go/features/features.go
generated
vendored
@ -36,11 +36,12 @@ type Linux struct {
|
|||||||
// Nil value means "unknown", not "no support for any capability".
|
// Nil value means "unknown", not "no support for any capability".
|
||||||
Capabilities []string `json:"capabilities,omitempty"`
|
Capabilities []string `json:"capabilities,omitempty"`
|
||||||
|
|
||||||
Cgroup *Cgroup `json:"cgroup,omitempty"`
|
Cgroup *Cgroup `json:"cgroup,omitempty"`
|
||||||
Seccomp *Seccomp `json:"seccomp,omitempty"`
|
Seccomp *Seccomp `json:"seccomp,omitempty"`
|
||||||
Apparmor *Apparmor `json:"apparmor,omitempty"`
|
Apparmor *Apparmor `json:"apparmor,omitempty"`
|
||||||
Selinux *Selinux `json:"selinux,omitempty"`
|
Selinux *Selinux `json:"selinux,omitempty"`
|
||||||
IntelRdt *IntelRdt `json:"intelRdt,omitempty"`
|
IntelRdt *IntelRdt `json:"intelRdt,omitempty"`
|
||||||
|
MountExtensions *MountExtensions `json:"mountExtensions,omitempty"`
|
||||||
}
|
}
|
||||||
|
|
||||||
// Cgroup represents the "cgroup" field.
|
// Cgroup represents the "cgroup" field.
|
||||||
@ -123,3 +124,16 @@ type IntelRdt struct {
|
|||||||
// Nil value means "unknown", not "false".
|
// Nil value means "unknown", not "false".
|
||||||
Enabled *bool `json:"enabled,omitempty"`
|
Enabled *bool `json:"enabled,omitempty"`
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// MountExtensions represents the "mountExtensions" field.
type MountExtensions struct {
	// IDMap represents the status of idmap mounts support.
	IDMap *IDMap `json:"idmap,omitempty"`
}

type IDMap struct {
	// Enabled represents whether idmap mounts supports is compiled in.
	// Unrelated to whether the host supports it or not.
	// Nil value means "unknown", not "false".
	Enabled *bool `json:"enabled,omitempty"`
}
|
||||||
|
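The `*bool` field in `IDMap` gives three states once decoded from JSON: absent (unknown), `true`, and `false`. A minimal sketch of that decoding follows; `idMapJSON`, `mountExtensionsJSON` and `idmapState` are hypothetical names that reuse the JSON tags from the hunk above, and the input strings are illustrative fragments rather than captured runtime output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical local copies of the vendored types, with the same JSON tags.
type idMapJSON struct {
	Enabled *bool `json:"enabled,omitempty"`
}

type mountExtensionsJSON struct {
	IDMap *idMapJSON `json:"idmap,omitempty"`
}

// idmapState decodes a "mountExtensions" object and reports the tri-state result.
func idmapState(raw string) (string, error) {
	var me mountExtensionsJSON
	if err := json.Unmarshal([]byte(raw), &me); err != nil {
		return "", err
	}
	switch {
	case me.IDMap == nil || me.IDMap.Enabled == nil:
		// Missing keys leave the pointers nil: the runtime did not say either way.
		return "unknown", nil
	case *me.IDMap.Enabled:
		return "enabled", nil
	default:
		return "disabled", nil
	}
}

func main() {
	inputs := []string{
		`{}`,
		`{"idmap":{}}`,
		`{"idmap":{"enabled":false}}`,
		`{"idmap":{"enabled":true}}`,
	}
	for _, in := range inputs {
		state, err := idmapState(in)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%-30s -> %s\n", in, state)
	}
}
```

A plain `bool` would collapse "unknown" into "false"; the pointer keeps the distinction so that callers can decide how strict to be.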
2
vendor/github.com/opencontainers/runtime-spec/specs-go/version.go
generated
vendored
@ -11,7 +11,7 @@ const (
|
|||||||
VersionPatch = 0
|
VersionPatch = 0
|
||||||
|
|
||||||
// VersionDev indicates development branch. Releases will be empty string.
|
// VersionDev indicates development branch. Releases will be empty string.
|
||||||
VersionDev = ""
|
VersionDev = "+dev"
|
||||||
)
|
)
|
||||||
|
|
||||||
// Version is the specification version that the package types support.
|
// Version is the specification version that the package types support.
|
||||||
|
2
vendor/modules.txt
vendored
@ -343,7 +343,7 @@ github.com/opencontainers/image-spec/specs-go/v1
|
|||||||
# github.com/opencontainers/runc v1.1.9
|
# github.com/opencontainers/runc v1.1.9
|
||||||
## explicit; go 1.17
|
## explicit; go 1.17
|
||||||
github.com/opencontainers/runc/libcontainer/user
|
github.com/opencontainers/runc/libcontainer/user
|
||||||
# github.com/opencontainers/runtime-spec v1.1.0
|
# github.com/opencontainers/runtime-spec v1.1.1-0.20230823135140-4fec88fd00a4
|
||||||
## explicit
|
## explicit
|
||||||
github.com/opencontainers/runtime-spec/specs-go
|
github.com/opencontainers/runtime-spec/specs-go
|
||||||
github.com/opencontainers/runtime-spec/specs-go/features
|
github.com/opencontainers/runtime-spec/specs-go/features
|
||||||
|