The ability to handle KVM based runtimes with SELinux has been added as
part of d715d00906.
However, that commit introduced some logic to check whether the
"container_kvm_t" label would or not be present in the system, and while
the intentions were good, there's two major issues with the approach:
1. Inspecting "/etc/selinux/targeted/contexts/customizable_types" is not
the way to go, as it doesn't list the "container_kvm_t" at all.
2. There's no need to check for the label, as if the label is invalid an
"Invalid Label" error will be returned and that's it.
With those two in mind, let's simplify the logic behind setting the
"container_kvm_t" label, removing all the unnecessary code.
Here's an output of VMM process running, considering:
* The state before this patch:
```
$ containerd --version
containerd github.com/containerd/containerd v1.6.0-beta.3-88-g7fa44fc98 7fa44fc98f
$ kubectl apply -f ~/simple-pod.yaml
pod/nginx created
$ ps -auxZ | grep cloud-hypervisor
system_u:system_r:container_runtime_t:s0 root 609717 4.0 0.5 2987512 83588 ? Sl 08:32 0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/be9d5cbabf440510d58d89fc8a8e77c27e96ddc99709ecaf5ab94c6b6b0d4c89/clh-api.sock
```
* The state after this patch:
```
$ containerd --version
containerd github.com/containerd/containerd v1.6.0-beta.3-89-ga5f2113c9 a5f2113c9fc15b19b2c364caaedb99c22de4eb32
$ kubectl apply -f ~/simple-pod.yaml
pod/nginx created
$ ps -auxZ | grep cloud-hypervisor
system_u:system_r:container_kvm_t:s0:c638,c999 root 614842 14.0 0.5 2987512 83228 ? Sl 08:40 0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/f8ff838afdbe0a546f6995fe9b08e0956d0d0cdfe749705d7ce4618695baa68c/clh-api.sock
```
Note, the tests were performed using the following configuration snippet:
```
[plugins]
[plugins.cri]
enable_selinux = true
[plugins.cri.containerd]
[plugins.cri.containerd.runtimes]
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
```
And using the following pod yaml:
```
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
runtimeClassName: kata
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
```
Fixes: #6371
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The error message was unnecessary cryptic. `snapshot-[name]` notation
was only used here and hard to understand.
Instead it should say `snapshots on "..." snapshotter`.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
CRI API has been updated to include a an optional `resources` field in the
LinuxPodSandboxConfig field, as part of the RunPodSandbox request.
Having sandbox level resource details at sandbox creation time will have
large benefits for sandboxed runtimes. In the case of Kata Containers,
for example, this'll allow for better support of SW/HW architectures
which don't allow for CPU/memory hotplug, and it'll allow for better
queue sizing for virtio devices associated with the sandbox (in the VM
case).
If this sandbox resource information is provided as part of the run
sandbox request, let's introduce a pattern where we will update the
pause container's runtiem spec to include this information in the
annotations field.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
The CRI-plugin will setup watcher for each container after
StartContainer or RunPodSandbox. It will cleanup task(container/sandbox)
if received the exit event from watcher.
The original test design is to `Delete` sandbox container to get
NOT_READY state and expect to receive NotFound error. It depends on that
CRI-plugin cleanups container after `Delete` API. If not, the shim will
be cleanup and test code will receive `ttrpc: closed: unknown` or other
unknown error. It is flaky.
In this patch, the test will only send the kill signal and wait for the
exit event. When sandbox exits, the state will and must be NOT_READY.
```plain
// test fail log
=== RUN TestContainerdRestart
restart_test.go:92: Make sure no sandbox is running before test
restart_test.go:97: Start test sandboxes and containers
common.go:115: Image "k8s.gcr.io/pause:3.6" already exists, not pulling.
common.go:115: Image "k8s.gcr.io/pause:3.6" already exists, not pulling.
restart_test.go:139:
Error Trace: restart_test.go:139
Error: Should be true
Test: TestContainerdRestart
Messages: delete should return not found error but returned failed to delete task: ttrpc: closed: unknown
--- FAIL: TestContainerdRestart (4.25s)
// containerd log
&TaskExit{ContainerID:4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184,ID:4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184,Pid:28430,ExitStatus:137,ExitedAt:2021-12-12 07:56:01.400753012 +0000 UTC,XXX_unrecognized:[],}"
time="2021-12-12T07:56:01.401120516Z" level=debug msg="event forwarded" ns=k8s.io topic=/tasks/exit type=containerd.events.TaskExit
time="2021-12-12T07:56:01.418934208Z" level=debug msg="event forwarded" ns=k8s.io topic=/tasks/delete type=containerd.events.TaskDelete
time="2021-12-12T07:56:01.419192910Z" level=info msg="shim disconnected" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184
time="2021-12-12T07:56:01.419235911Z" level=warning msg="cleaning up after shim disconnected" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184 namespace=k8s.io
time="2021-12-12T07:56:01.419247711Z" level=info msg="cleaning up dead shim"
time="2021-12-12T07:56:01.419235311Z" level=error msg="failed sending message on channel" error="write unix /run/containerd/s/18afde7fcde70236eb31b9f43f3bd92af1dc1186583c501aa1396255f87f95d4->@: write: broken pipe"
time="2021-12-12T07:56:01.419354712Z" level=debug msg="failed to delete task" error="ttrpc: closed" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184
```
CI Link: `https://pipelines.actions.githubusercontent.com/G4SighzWVVZ6vsyiz7FFMFjLjRzveJHseEnVyibkSq87Cl2x4O/_apis/pipelines/1/runs/9501/signedlogcontent/76?urlExpires=2021-12-12T08%3A42%3A08.0765750Z&urlSigningMethod=HMACV1&urlSignature=pH93isMSFdZUo1ndnZynJpZbPGrEyvt12MO03fgUU7I%3D`
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Skip this test until this error can be evaluated and the appropriate
test fix or environment configuration can be determined.
Signed-off-by: Derek McGowan <derek@mcg.dev>
Allow rootless containers with privileged to mount devices that are accessible
(ignore permission errors in rootless mode).
This patch updates oci.getDevices() to ignore access denied errors on sub-
directories and files within the given path if the container is running with
userns enabled.
Note that these errors are _only_ ignored on paths _under_ the specified path,
and not the path itself, so if `HostDevices()` is used, and `/dev` itself is
not accessible, or `WithDevices()` is used to specify a device that is not
accessible, an error is still produced.
Tests were added, which includes a temporary workaround for compatibility
with Go 1.16 (we could decide to skip these tests on Go 1.16 instead).
To verify the patch in a container:
docker run --rm -v $(pwd):/go/src/github.com/containerd/containerd -w /go/src/github.com/containerd/containerd golang:1.17 sh -c 'go test -v -run TestHostDevices ./oci'
=== RUN TestHostDevicesOSReadDirFailure
--- PASS: TestHostDevicesOSReadDirFailure (0.00s)
=== RUN TestHostDevicesOSReadDirFailureInUserNS
--- PASS: TestHostDevicesOSReadDirFailureInUserNS (0.00s)
=== RUN TestHostDevicesDeviceFromPathFailure
--- PASS: TestHostDevicesDeviceFromPathFailure (0.00s)
=== RUN TestHostDevicesDeviceFromPathFailureInUserNS
--- PASS: TestHostDevicesDeviceFromPathFailureInUserNS (0.00s)
=== RUN TestHostDevicesAllValid
--- PASS: TestHostDevicesAllValid (0.00s)
PASS
ok github.com/containerd/containerd/oci 0.006s
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The OCI image spec did a v1.0.2 security release for CVE-2021-41190, however
commit 09c9270fee, depends on MediaTypes that
have not yet been released by the OCI image-spec, so using current "main" instead.
full diff: 5ad6f50d62...693428a734
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Add file system options for config file, so that user can use
non-default file system parameters for the fs type of choice
Using file system options in config file overwrites the default
options already being used.
Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
Considering Windows 2004's EoL on the 14th of December, 2021,
this PR removes all periodic integration testing for 2004.
Signed-off-by: Nashwan Azhari <nazhari@cloudbasesolutions.com>
This change enables the TestVolumeOwnership on Windows. The test
assumes that the volume-ownership image is built on Windows, thus
ensuring that Windows file security info (ACLs and ownership info)
are attached to the C:\volumes\test_dir path.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
As like other integration tests, Windows integration tests should not
fail-fast. So developers can see whether an issue is platform-specific
or not.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>