Commit Graph

366 Commits

Author SHA1 Message Date
Kazuyoshi Kato
c149e6c2ea
Merge pull request #6996 from dcantah/hpc-validations
Add validations for Windows HostProcess CRI configs
2022-06-01 11:37:12 -07:00
Kazuyoshi Kato
49ca87d727 Limit the response size of ExecSync
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-05-31 22:21:35 +00:00
Daniel Canter
b5e1b8f619 Use t.Run for /pkg/cri tests
A majority of the tests in /pkg/cri are testing/validating multiple
things per test (generally spec or options validations). This flow
lends itself well to using *testing.T's Run method to run each thing
as a subtest so `go test` output can actually display which subtest
failed/passed.

Some of the tests in the packages in pkg/cri already did this, but
a bunch simply logged what sub-testcase was currently running without
invoking t.Run.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-29 18:32:09 -07:00
Daniel Canter
978ff393d2 Add validations for Windows HostProcess CRI configs
HostProcess containers require every container in the pod to be a
host process container and have the corresponding field set. The Kubelet
usually enforces this so we'd error before even getting here but we recently
found a bug in this logic so better to be safe than sorry.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-27 21:17:07 -07:00
AllenZMC
eaec6530d7 fix some confusing typos
Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>
2022-05-17 23:53:36 +08:00
Akihiro Suda
42584167b7
Officially deprecate Schema 1
Schema 1 has been substantially deprecated since circa 2017 in favor of Schema 2 introduced in Docker 1.10 (Feb 2016)
and its successor OCI Image Spec v1, but we have not officially deprecated Schema 1.

One of the reasons was that Quay did not support Schema 2 so far, but it is reported that Quay has been
supporting Schema 2 since Feb 2020 (moby/buildkit issue 409).

This PR deprecates pulling Schema 1 images but the feature will not be removed before containerd 2.0.
Pushing Schema 1 images was never implemented in containerd (and its consumers such as BuildKit).

Docker/Moby already disabled pushing Schema 1 images in Docker 20.10 (moby/moby PR 41295),
but Docker/Moby has not yet disabled pulling Schema 1 as containerd has not yet deprecated Schema 1.
(See the comments in moby/moby PR 42300.)
Docker/Moby is expected to disable pulling Schema 1 images in future after this deprecation.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-05-02 19:08:38 +09:00
Mike Brown
6b35307594
Merge pull request #5490 from askervin/5Bu_blockio
Support for cgroups blockio
2022-04-29 10:07:56 -05:00
Antti Kervinen
10576c298e cri: support blockio class in pod and container annotations
This patch adds support for a container annotation and two separate
pod annotations for controlling the blockio class of containers.

The container annotation can be used by a CRI client:
  "io.kubernetes.cri.blockio-class"

Pod annotations specify the blockio class in the K8s pod spec level:
  "blockio.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "blockio.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Correspondingly, this patch adds support for --blockio-class and
--blockio-config-file to ctr, too.

This implementation follows the resource class annotation pattern
introduced in RDT and merged in commit 893701220.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2022-04-29 11:44:09 +03:00
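
As a rough illustration of the annotation lookup described in commit 10576c298e, the sketch below resolves a blockio class for one container. The three annotation keys are quoted from the commit message; the helper name `blockioClass` and the precedence (container annotation, then the container-specific pod annotation, then the pod-wide default) are assumptions, not containerd's confirmed behaviour.

```go
package main

import "fmt"

// Annotation keys quoted in the commit message.
const (
	containerKey       = "io.kubernetes.cri.blockio-class"
	podKey             = "blockio.resources.beta.kubernetes.io/pod"
	podContainerPrefix = "blockio.resources.beta.kubernetes.io/container."
)

// blockioClass is a hypothetical helper, not containerd's API.
// Assumed lookup order: container annotation, then the container-specific
// pod annotation, then the pod-wide default.
func blockioClass(name string, containerAnn, podAnn map[string]string) (string, bool) {
	if c, ok := containerAnn[containerKey]; ok {
		return c, true
	}
	if c, ok := podAnn[podContainerPrefix+name]; ok {
		return c, true
	}
	c, ok := podAnn[podKey]
	return c, ok
}

func main() {
	podAnn := map[string]string{
		podKey:                    "slow",
		podContainerPrefix + "db": "fast",
	}
	for _, name := range []string{"db", "web"} {
		class, _ := blockioClass(name, nil, podAnn)
		fmt.Printf("%s -> %s\n", name, class)
	}
}
```

The class names "slow" and "fast" are made up for the example; real class names would come from the blockio configuration file.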
Derek McGowan
6e0231f992
Merge pull request #6150 from fuweid/support-4984
feature: support image pull progress timeout
2022-04-26 12:15:09 -07:00
Wei Fu
00d102da9f feature: support image pull progress timeout
Kubelet sends the PullImage request without a timeout, because the image size
is unknown and a timeout is hard to define. The pull request might drop
to 0B/s if containerd can't receive any packets on that connection.
In this case, containerd should cancel the PullImage request.

Although containerd provides an ingester manager to track the progress of a pull
request (for example, `ctr image pull` shows a console progress bar), it needs
more CPU resources to open and read the ingested files to get the status.

In order to support the progress timeout feature with lower overhead, this
patch uses an http.RoundTripper wrapper to track active progress. That
wrapper increases the active-request number and returns the
countingReadCloser wrapper for http.Response.Body. Each byte read
is counted, and the active-request number is decreased when the
countingReadCloser wrapper is closed. The progress tracker
checks the active-request number and bytes read at intervals. If
there is no progress, the progress tracker should cancel the
request.

NOTE: For each blob, containerd makes sure that the content
writer is opened before sending the http request to the registry. Therefore, the
progress reporter can rely on the active-request number.

fixed: #4984

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-27 00:02:27 +08:00
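
The wrapper described in commit 00d102da9f can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `progressTransport`, `fakeRegistry`, `fetchOnce`, and the field names are hypothetical, and only `countingReadCloser` is a name taken from the commit message. A stub transport replaces the registry so the example runs without a network.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"sync/atomic"
)

// progressTransport is a hypothetical http.RoundTripper wrapper that counts
// requests with an open body and the total number of bytes read.
type progressTransport struct {
	next      http.RoundTripper
	active    int64
	bytesRead int64
}

func (p *progressTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	atomic.AddInt64(&p.active, 1)
	resp, err := p.next.RoundTrip(req)
	if err != nil {
		atomic.AddInt64(&p.active, -1)
		return nil, err
	}
	resp.Body = &countingReadCloser{rc: resp.Body, p: p}
	return resp, nil
}

// countingReadCloser adds every byte read to the transport's counter and
// decrements the active-request number exactly once on Close.
type countingReadCloser struct {
	rc     io.ReadCloser
	p      *progressTransport
	closed int32
}

func (c *countingReadCloser) Read(b []byte) (int, error) {
	n, err := c.rc.Read(b)
	atomic.AddInt64(&c.p.bytesRead, int64(n))
	return n, err
}

func (c *countingReadCloser) Close() error {
	if atomic.CompareAndSwapInt32(&c.closed, 0, 1) {
		atomic.AddInt64(&c.p.active, -1)
	}
	return c.rc.Close()
}

// fakeRegistry is a stub transport that always returns a 10-byte body.
type fakeRegistry struct{}

func (fakeRegistry) RoundTrip(*http.Request) (*http.Response, error) {
	body := io.NopCloser(strings.NewReader("layer-data"))
	return &http.Response{StatusCode: http.StatusOK, Body: body}, nil
}

// fetchOnce reads one response fully and reports the counters while the body
// is still open and again after it has been closed.
func fetchOnce(pt *progressTransport) (activeOpen, read, activeClosed int64, err error) {
	req, err := http.NewRequest(http.MethodGet, "http://registry.invalid/blob", nil)
	if err != nil {
		return
	}
	resp, err := pt.RoundTrip(req)
	if err != nil {
		return
	}
	if _, err = io.Copy(io.Discard, resp.Body); err != nil {
		return
	}
	activeOpen = atomic.LoadInt64(&pt.active)
	read = atomic.LoadInt64(&pt.bytesRead)
	err = resp.Body.Close()
	activeClosed = atomic.LoadInt64(&pt.active)
	return
}

func main() {
	pt := &progressTransport{next: fakeRegistry{}}
	fmt.Println(fetchOnce(pt))
}
```

A tracker built on top of this would sample `active` and `bytesRead` on a timer and cancel the request context when requests are active but the byte count has not moved for the configured timeout; that part is left out of the sketch.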
Derek McGowan
3dbd6a2498
Merge pull request #6841 from kzys/proto-upgrade-6
Migrate off from github.com/gogo/protobuf
2022-04-25 15:12:51 -07:00
Kazuyoshi Kato
f140400c0e
Merge pull request #5686 from dtnyn/issue-5679
Add flag to allow oci.WithAllDevicesAllowed on PrivilegedWithoutHostDevices
2022-04-25 11:44:01 -07:00
Kazuyoshi Kato
7a4f81d8ba Fix tests
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:41:05 +00:00
Kazuyoshi Kato
e3db7de8f5 Remove gogo/protobuf and adjust types
This commit migrates containerd/protobuf from github.com/gogo/protobuf
to google.golang.org/protobuf and adjusts types. Proto-generated structs
cannot be passed as values.

Fixes #6564.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:31:53 +00:00
Kazuyoshi Kato
88c0c7201e Consolidate gogo/protobuf dependencies under our own protobuf package
This would make gogo/protobuf migration easier.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 15:53:36 +00:00
Kazuyoshi Kato
80b825ca2c Remove gogoproto.stdtime
This commit removes gogoproto.stdtime, since it is not supported by
Google's official toolchain
(see https://github.com/containerd/containerd/issues/6564).

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 13:39:30 +00:00
Eric Lin
a5dfbfcf5a cri: load sandboxes/containers/images in parallel
Parallelizing them decreases loading duration.

Time to complete recover():
* Without competing IOs + without opt: 21s
* Without competing IOs + with opt: 14s
* Competing IOs + without opt: 3m44s
* Competing IOs + with opt: 33s

Signed-off-by: Eric Lin <linxiulei@gmail.com>
2022-04-09 13:01:14 +00:00
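
The general shape of the parallel loading in commit a5dfbfcf5a might look like the sketch below. It is an illustration only: `loadAll` and `errLoad` are hypothetical names, the items are assumed to be independent of each other, and returning the first error is an assumption about error handling rather than what recover() actually does.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errLoad is a demo sentinel error, not a containerd error value.
var errLoad = errors.New("load failed")

// loadAll calls load for every id concurrently, waits for all of them, and
// returns the first error in id order (nil if every load succeeded).
func loadAll(ids []string, load func(id string) error) error {
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = load(id) // each goroutine writes only its own slot
		}(i, id)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	loaded := make(chan string, 3)
	err := loadAll([]string{"sb-1", "sb-2", "sb-3"}, func(id string) error {
		loaded <- id
		return nil
	})
	fmt.Println(len(loaded), err)
}
```

Under competing IO, the benefit reported in the commit comes from overlapping the waits of many small reads rather than from extra CPU work.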
Ed Bartosh
ff5c55847a move CDI calls to the linux-only code
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:59 +03:00
Ed Bartosh
c9b4ccf83e add configuration for CDI
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:54 +03:00
Ed Bartosh
aed0538dac cri: implement CDI device injection
Extract the names of requested CDI devices and update the OCI
Spec according to the corresponding CDI device specifications.

CDI devices are requested using container annotations in the
cdi.k8s.io namespace. Once CRI gains dedicated fields for CDI
injection the snippet for extracting CDI names will need an
update.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:07:54 +03:00
Wei Fu
8113758568 CRI: improve image pulling performance
Background:

With the current design, the content backend uses a key-lock for long-lived
write transactions. If the content reference has been marked for a write
transaction, other requests on the same reference fail fast with an
unavailable error. Since the metadata plugin is based on boltdb, which
only supports a single writer, the content backend can't block or handle
the request for too long. It requires the client to handle retries by itself,
like the OpenWriter backoff retry helper. But the maximum retry interval
can be up to 2 seconds. If there are several concurrent requests for the
same image, the waiters may wake up at the same time, and only
one waiter can continue. Many waiters go back to sleep, so it
takes a long time to finish all the pulling jobs, and it gets worse if the image
has many layers, as mentioned in issue #4937.

After fetching, the containerd.Pull API allows several handlers to commit
the same ChainID snapshot, but only one can succeed. Since
unpacking tar.gz is a time-consuming job, unpacking the same ChainID
snapshot in parallel hurts performance.

For instance, Request 2 doesn't need to prepare and commit; it
should just wait for Request 1 to finish, as mentioned in pull
request #6318.

```text
	Request 1	Request 2

	Prepare
	   |
	   |
	   |
	   |		Prepare
	Commit		   |
			   |
			   |
			   |
			Commit(failed on exist)
```

Both the content backoff retry and the unnecessary unpack impact performance.

Solution:

Introduce duplicate suppression in the fetch and unpack context. The
duplicate suppression uses a key-mutex and single-waiter-notify to support
singleflight. The caller can use the duplicate suppression in different
PullImage handlers so that we can avoid unnecessary unpacks and the spin-lock
in OpenWriter.

Test Result:

Before enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	1m6.172s
user	0m0.268s
sys	0m0.193s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.324s
user	0m0.441s
sys	0m0.316s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	1m47.657s
user	0m0.284s
sys	0m0.224s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.381s
user	0m0.488s
sys	0m0.358s
```

With this enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	0m1.140s
user	0m0.243s
sys	0m0.178s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.239s
user	0m0.463s
sys	0m0.275s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	0m5.546s
user	0m0.217s
sys	0m0.219s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.090s
user	0m0.501s
sys	0m0.331s
```

Test Script:

localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The images are held in a local registry
service started by `docker run -d -p 5000:5000 --name registry registry:2`.

```bash

image_name="${1}"
pull_times="${2:-10}"

cleanup() {
  ctr image rmi "${image_name}"
  ctr -n k8s.io image rmi "${image_name}"
  crictl rmi "${image_name}"
  docker rmi "${image_name}"
  sleep 2
}

crictl_testing() {
  for idx in $(seq 1 ${pull_times}); do
    crictl pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

docker_testing() {
  for idx in $(seq 1 ${pull_times}); do
    docker pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

cleanup > /dev/null 2>&1

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```

Fixes: #4937
Close: #4985
Close: #6318

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-06 07:14:18 +08:00
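
The key-mutex idea from commit 8113758568 can be illustrated with a small sketch. It is a simplified stand-in, not the patch itself: `keyMutex`, `unpacker`, and `pullConcurrently` are hypothetical names, the expensive prepare/untar/commit step is reduced to a counter, and lock entries are never removed from the map. Requests for the same ChainID queue on one lock, and every request after the first finds the work already done.

```go
package main

import (
	"fmt"
	"sync"
)

// keyMutex hands out one mutex per key (hypothetical helper; entries are
// never deleted, which a real implementation would have to handle).
type keyMutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// lock blocks until the caller holds the mutex for key and returns its unlock.
func (k *keyMutex) lock(key string) (unlock func()) {
	k.mu.Lock()
	if k.locks == nil {
		k.locks = make(map[string]*sync.Mutex)
	}
	m, ok := k.locks[key]
	if !ok {
		m = new(sync.Mutex)
		k.locks[key] = m
	}
	k.mu.Unlock()
	m.Lock()
	return m.Unlock
}

// unpacker records which chain IDs are unpacked and how often work was done.
type unpacker struct {
	keys keyMutex
	mu   sync.Mutex // guards done and work
	done map[string]bool
	work int
}

func (u *unpacker) unpack(chainID string) {
	unlock := u.keys.lock(chainID)
	defer unlock()

	u.mu.Lock()
	finished := u.done[chainID]
	u.mu.Unlock()
	if finished {
		return // an earlier request already committed this chain ID
	}
	// The expensive prepare + untar + commit would run here, outside u.mu,
	// so requests for other chain IDs are not blocked.
	u.mu.Lock()
	u.work++
	u.done[chainID] = true
	u.mu.Unlock()
}

// pullConcurrently issues `requests` concurrent pulls over the same chain IDs
// and returns how many times the unpack work actually ran.
func pullConcurrently(requests int, chainIDs ...string) int {
	u := &unpacker{done: make(map[string]bool)}
	var wg sync.WaitGroup
	for r := 0; r < requests; r++ {
		for _, id := range chainIDs {
			wg.Add(1)
			go func(id string) {
				defer wg.Done()
				u.unpack(id)
			}(id)
		}
	}
	wg.Wait()
	return u.work
}

func main() {
	fmt.Println(pullConcurrently(20, "layer-a", "layer-b"))
}
```

Whatever the goroutine scheduling, the work count equals the number of distinct chain IDs, which is the property the benchmark numbers in the commit message depend on.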
Kazuyoshi Kato
96b16b447d Use typeurl.Any instead of github.com/gogo/protobuf/types.Any
This commit upgrades github.com/containerd/typeurl to use typeurl.Any.
The interface hides gogo/protobuf/types.Any from containerd's Go client.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-03-24 20:50:07 +00:00
Derek McGowan
551516a18d
Merge pull request from GHSA-c9cp-9c75-9v8c
Fix the Inheritable capability defaults.
2022-03-23 10:50:56 -07:00
Phil Estes
ee49c4d557
Add nolint:staticcheck to platform-specific calls
The linter on platforms that have a hardcoded response complains about
"if xyz == nil" checks; ignore those.

Signed-off-by: Phil Estes <estesp@amazon.com>
2022-03-17 18:24:00 -04:00
Eng Zer Jun
18ec2761c0
test: use T.TempDir to create temporary test directory
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-03-15 14:03:50 +08:00
Paul "TBBle" Hampson
39d52118f5 Plumb CRI Devices through to OCI WindowsDevices
There are two mappings of hostpath to IDType and ID in the wild:
- dockershim and dockerd-cri (implicitly via docker) use class/ID
-- The only supported IDType in Docker is 'class'.
-- https://github.com/aarnaud/k8s-directx-device-plugin generates this form
- https://github.com/jterry75/cri (windows_port branch) uses IDType://ID
-- hcsshim's CRI test suite generates this form

`://` is much more easily distinguishable, so I've gone with that one as
the generic separator, with `class/` as a special-case.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-12 08:16:43 +11:00
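
The two host path forms named in commit 39d52118f5 could be parsed along these lines. This is a sketch under assumptions: `parseWindowsDevice` is a hypothetical helper, the error handling is invented for the example, and the GUID in `main` is a placeholder rather than a real device class.

```go
package main

import (
	"fmt"
	"strings"
)

// parseWindowsDevice splits a CRI device host path into IDType and ID.
// "IDType://ID" is treated as the generic form and "class/ID" as the special
// case, as the commit message describes. Hypothetical helper, not the
// function used in containerd.
func parseWindowsDevice(hostPath string) (idType, id string, err error) {
	if strings.HasPrefix(hostPath, "class/") {
		// Rewrite the special case into the generic form first.
		hostPath = "class://" + strings.TrimPrefix(hostPath, "class/")
	}
	parts := strings.SplitN(hostPath, "://", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("unrecognised device %q: want IDType://ID or class/ID", hostPath)
	}
	return parts[0], parts[1], nil
}

func main() {
	for _, p := range []string{
		"class/00000000-0000-0000-0000-000000000000",
		"class://00000000-0000-0000-0000-000000000000",
		"no-separator",
	} {
		idType, id, err := parseWindowsDevice(p)
		fmt.Println(idType, id, err)
	}
}
```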
Shengjing Zhu
352a8f49f7 cri: relax test for system without hugetlb
These unit tests don't check hugetlb. However, by setting
TolerateMissingHugetlbController to false, these tests can't
be run on systems without hugetlb (e.g. Debian buildd).

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-28 01:38:58 +08:00
Shengjing Zhu
f4f41296c2 Replace golang.org/x/net/context with std library
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-22 02:27:05 +08:00
Derek McGowan
c0f8188469
Update go-cni to v1.1.2
Fixes panic when exec is nil

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-02-10 12:40:51 -08:00
Markus Lehtonen
9b1fb82584 cri: fix handling of ignore_rdt_not_enabled_errors config option
We were not properly ignoring errors from
goresctrl.rdt.ContainerClassFromAnnotations(), making the config option
ineffective in practice.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-02-04 13:54:03 +02:00
Andrew G. Morgan
6906b57c72
Fix the Inheritable capability defaults.
The Linux kernel never sets the Inheritable capability flag to
anything other than empty. Non-empty values are always exclusively
set by userspace code.

[The kernel stopped defaulting this set of capability values to the
 full set in 2000 after a privilege escalation with Capabilities
 affecting Sendmail and others.]

Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
2022-02-01 13:55:46 -08:00
Derek McGowan
4e9e14c2b6
Fix rdt build tags for go 1.16
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-01-19 11:09:29 -08:00
Takumasa Sakao
18592b2f5a Fix wrong log message
Signed-off-by: Takumasa Sakao <tsakao@zlab.co.jp>
2022-01-09 16:01:23 +09:00
haoyun
bbe46b8c43 feat: replace github.com/pkg/errors to errors
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: zounengren <zouyee1989@gmail.com>
2022-01-07 10:27:03 +08:00
Derek McGowan
644a01e13b
Merge pull request from GHSA-mvff-h3cj-wj9c
only relabel cri managed host mounts
2022-01-05 09:30:58 -08:00
Markus Lehtonen
9c2e3835fa cri: add ignore_rdt_not_enabled_errors config option
Enabling this option effectively causes RDT class of a container to be a
soft requirement. If RDT support has not been enabled the RDT class
setting will not have any effect.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Markus Lehtonen
f4a191917b cri: annotations for controlling RDT class
Use goresctrl for parsing container and pod annotations related to RDT.

In practice, from the users' point of view, this patch adds support for
a container annotation and two separate pod annotations for controlling
the RDT class of containers.

Container annotation can be used by a CRI client:
  "io.kubernetes.cri.rdt-class"

Pod annotations for specifying the RDT class in the K8s pod spec level:
  "rdt.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "rdt.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Annotations are intended as an intermediate step before the CRI API
supports RDT.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Derek McGowan
2c9d80aba5
Merge pull request #6372 from fidencio/wip/seutil-fix-container_kvm_t-type-detection
seutil: Fix setting the "container_kvm_t" label
2021-12-15 10:35:04 -08:00
Phil Estes
330961c2d5
Merge pull request #6358 from jonyhy96/feat-error
refactor: functions for error log and error return
2021-12-14 10:16:54 -05:00
Derek McGowan
ac531108ab
Merge pull request #6155 from egernst/cri-update-for-sandbox-sizing
CRI update for sandbox sizing
2021-12-13 16:21:30 -08:00
Fabiano Fidêncio
f1c7993311 seutil: Fix setting the "container_kvm_t" label
The ability to handle KVM based runtimes with SELinux has been added as
part of d715d00906.

However, that commit introduced some logic to check whether the
"container_kvm_t" label is present in the system, and while
the intentions were good, there are two major issues with the approach:
1. Inspecting "/etc/selinux/targeted/contexts/customizable_types" is not
   the way to go, as it doesn't list the "container_kvm_t" at all.
2. There's no need to check for the label, as if the label is invalid an
   "Invalid Label" error will be returned and that's it.

With those two in mind, let's simplify the logic behind setting the
"container_kvm_t" label, removing all the unnecessary code.

Here's an output of VMM process running, considering:
* The state before this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-88-g7fa44fc98 7fa44fc98f
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_runtime_t:s0 root 609717 4.0  0.5 2987512 83588 ?    Sl   08:32   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/be9d5cbabf440510d58d89fc8a8e77c27e96ddc99709ecaf5ab94c6b6b0d4c89/clh-api.sock
  ```

* The state after this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-89-ga5f2113c9 a5f2113c9fc15b19b2c364caaedb99c22de4eb32
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_kvm_t:s0:c638,c999 root 614842 14.0  0.5 2987512 83228 ? Sl 08:40   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/f8ff838afdbe0a546f6995fe9b08e0956d0d0cdfe749705d7ce4618695baa68c/clh-api.sock
  ```

Note, the tests were performed using the following configuration snippet:
```
[plugins]
  [plugins.cri]
    enable_selinux = true
    [plugins.cri.containerd]
      [plugins.cri.containerd.runtimes]
        [plugins.cri.containerd.runtimes.kata]
           runtime_type = "io.containerd.kata.v2"
           privileged_without_host_devices = true
```

And using the following pod yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  runtimeClassName: kata
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```

Fixes: #6371

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-12-14 00:09:17 +01:00
Alexander Minbaev
c8a009d18c add-list-stat: return container list if filter is nil
Signed-off-by: Alexander Minbaev <alexander.minbaev@ibm.com>
2021-12-13 15:09:18 -06:00
Eric Ernst
20419feaac cri, sandbox: pass sandbox resource details if available, applicable
The CRI API has been updated to include an optional `resources` field in the
LinuxPodSandboxConfig field, as part of the RunPodSandbox request.

Having sandbox level resource details at sandbox creation time will have
large benefits for sandboxed runtimes. In the case of Kata Containers,
for example, this'll allow for better support of SW/HW architectures
which don't allow for CPU/memory hotplug, and it'll allow for better
queue sizing for virtio devices associated with the sandbox (in the VM
case).

If this sandbox resource information is provided as part of the run
sandbox request, let's introduce a pattern where we will update the
pause container's runtime spec to include this information in the
annotations field.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-12-13 08:41:41 -08:00
haoyun
c0d07094be feat: Errorf usage
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-12-13 14:31:53 +08:00
Michael Crosby
9b0303913f
only relabel cri managed host mounts
Co-authored-by: Samuel Karp <skarp@amazon.com>
Signed-off-by: Michael Crosby <michael@thepasture.io>
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-12-09 09:53:47 -08:00
Sebastiaan van Stijn
2d3009038c
cri/server: use consistent alias for pkg/ioutil
Consistently use cioutil to prevent it being confused for Golang's ioutil.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-09 17:47:22 +01:00
Fu Wei
69822aa936
Merge pull request #6258 from wllenyj/fix-registry-panic
2021-11-19 13:35:46 +08:00
wanglei01
5f293d9ac4 [CRI] Fix panic when registry.mirrors use localhost
When containerd uses this config:

```
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
      endpoint = ["http://localhost:5000"]
```

the `newTransport` function does not initialize the `TLSClientConfig` field,
so the later use of `TLSClientConfig` causes a nil pointer dereference.

Signed-off-by: wanglei <wllenyj@linux.alibaba.com>
2021-11-19 10:56:46 +08:00
Michael Crosby
aa2733c202
Merge pull request #6170 from olljanat/default-sysctls
CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
2021-11-18 11:37:23 -05:00
Derek McGowan
9afc778b73
Merge pull request #6111 from crosbymichael/latency-metrics
[cri] add sandbox and container latency metrics
2021-11-16 16:59:33 -08:00
Derek McGowan
d055487b00
Merge pull request #6206 from mxpv/path
Allow absolute path to shim binaries
2021-11-15 18:05:48 -08:00
Olli Janatuinen
2a81c9f677 CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-11-15 18:30:09 +02:00
Maksym Pavlenko
6870f3b1b8 Support custom runtime path when launching tasks
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-11-09 13:31:46 -08:00
Michael Crosby
91bbaf6799 [cri] add sandbox and container latency metrics
These are simple metrics that allow users to view more fine grained metrics on
internal operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-09 21:07:38 +00:00
Michael Crosby
4b7cc560b2
Merge pull request #6222 from jonyhy96/add-more-description
cleanup: add more description on comment
2021-11-09 15:55:32 -05:00
haoyun
5748006337 cleanup: add more description on comment
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-09 19:13:37 +08:00
David Porter
2e6d5709e3 Implement CRI container and pods stats
See https://kep.k8s.io/2371

* Implement new CRI RPCs - `ListPodSandboxStats` and `PodSandboxStats`
  * `ListPodSandboxStats` and `PodSandboxStats` which return stats about
    pod sandbox. To obtain pod sandbox stats, underlying metrics are
    read from the pod sandbox cgroup parent.
  * Process info is obtained by calling into the underlying task
  * Network stats are taken by looking up network metrics based on the
    pod sandbox network namespace path
* Return more detailed stats for cpu and memory for existing container
  stats. These metrics use the underlying task's metrics to obtain
  stats.

Signed-off-by: David Porter <porterdavid@google.com>
2021-11-03 17:52:05 -07:00
Dat Nguyen
afe39bebfe add oci.WithAllDevicesAllowed flag for privileged_without_host_devices
This commit adds a flag that enables whitelisting of all devices when
privileged_without_host_devices is already enabled.

Fixes #5679

Signed-off-by: Dat Nguyen <dnguyen7@atlassian.com>
2021-11-04 10:24:19 +11:00
Mike Brown
ea89788105 adds additional debug out to timebox cni setup
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-11-01 09:34:29 -05:00
zounengren
a217b5ac8f bump CNI to spec v1.0.0
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-10-22 10:58:40 +08:00
Sambhav Kothari
2a8dac12a7 Output a warning for label image labels instead of erroring
This change ignores errors at container runtime caused by large
image labels and outputs a warning instead. This is necessary as certain
image building tools like buildpacks may have large labels in the images
which need not be passed to the container.

Signed-off-by: Sambhav Kothari <sambhavs.email@gmail.com>
2021-10-14 19:25:48 +01:00
Claudiu Belu
2bc77b8a28 Adds Windows resource limits support
This will allow running Windows Containers to have their resource
limits updated through containerd. The CPU resource limits support
has been added for Windows Server 20H2 and newer; on older versions
hcsshim will raise an Unimplemented error.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-25 13:20:55 -07:00
Derek McGowan
26ee1b1ee5
Merge pull request #4695 from crosbymichael/cri-class
[cri] Add CNI conf based on runtime class
2021-10-08 09:27:49 -07:00
Derek McGowan
63b7e5771e
Merge pull request #5973 from Juneezee/deprecate-ioutil
refactor: move from io/ioutil to io and os package
2021-10-01 10:52:06 -07:00
haoyun
5c2426a7b2 cleanup: import from k8s.io/utils/clock/testing instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:34:56 +08:00
zounengren
fcffe0c83a switch usage directly to errdefs.(ErrAlreadyExists and ErrNotFound)
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-09-24 18:26:58 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Michael Crosby
55893b9be7 Add CNI conf based on runtime class
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-17 19:05:06 +00:00
Phil Estes
f40df3d72b
Enable image config labels in ctr and CRI container creation
Signed-off-by: Phil Estes <estesp@amazon.com>
2021-09-15 15:31:19 -04:00
Fu Wei
e1ad779107
Merge pull request #5817 from dmcgowan/shim-plugins
Add support for shim plugins
2021-09-12 18:18:20 +08:00
Phil Estes
6589876d20
Merge pull request #5964 from crosbymichael/cni-pref
add ip_pref CNI options for primary pod ip
2021-09-10 12:06:23 -04:00
Fu Wei
689a863efe
Merge pull request #5939 from scuzhanglei/privileged-device
2021-09-10 22:15:46 +08:00
Michael Crosby
1ddc54c00d
Merge pull request #5954 from claudiubelu/fix-sandbox-remove
sandbox: Allows the sandbox to be deleted in NotReady state
2021-09-10 10:12:34 -04:00
Michael Crosby
1efed43090
add ip_pref CNI options for primary pod ip
This fixes the TODO of this function and also expands on how the primary pod ip
is selected. This change allows the operator to prefer ipv4, ipv6, or retain the
ordering provided by the return results of the CNI plugins.

This makes it much more flexible for ops to configure containerd and how IPs are
set on the pod.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-10 10:04:21 -04:00
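
The selection rule described in commit 1efed43090 might be sketched as follows. The helper names and the preference strings "ipv4", "ipv6", and "cni" are assumptions made for the illustration; the commit message only says that the operator can prefer IPv4, prefer IPv6, or keep the order returned by the CNI plugins.

```go
package main

import (
	"fmt"
	"net"
)

// selectPodIP picks the primary pod IP from the CNI results according to
// pref and returns the remaining addresses in their original order.
// Assumed behaviour: if no address of the preferred family exists, or pref
// is anything else (for example "cni"), the first address is primary.
func selectPodIP(ips []net.IP, pref string) (primary net.IP, additional []net.IP) {
	if len(ips) == 0 {
		return nil, nil
	}
	idx := 0
	for i, ip := range ips {
		isV4 := ip.To4() != nil
		if (pref == "ipv4" && isV4) || (pref == "ipv6" && !isV4) {
			idx = i
			break
		}
	}
	for i, ip := range ips {
		if i != idx {
			additional = append(additional, ip)
		}
	}
	return ips[idx], additional
}

// primaryOf is a convenience wrapper for the demo: it parses the addresses
// and returns the primary one as a string ("" when no address is given).
func primaryOf(pref string, addrs ...string) string {
	var ips []net.IP
	for _, a := range addrs {
		if ip := net.ParseIP(a); ip != nil {
			ips = append(ips, ip)
		}
	}
	primary, _ := selectPodIP(ips, pref)
	if primary == nil {
		return ""
	}
	return primary.String()
}

func main() {
	for _, pref := range []string{"ipv4", "ipv6", "cni"} {
		fmt.Println(pref, primaryOf(pref, "fd00::2", "10.0.0.2"))
	}
}
```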
scuzhanglei
756f4a3147 cri: add devices for privileged container
Signed-off-by: scuzhanglei <greatzhanglei@gmail.com>
2021-09-10 10:16:26 +08:00
Fu Wei
d58542a9d1
Merge pull request #5627 from payall4u/payall4u/cri-support-cgroup-v2
2021-09-09 23:10:33 +08:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI plugin subscribes to image events in the k8s.io namespace. By
default, the image event is created by the CRI API. However, an image can
be downloaded through the containerd API into k8s.io with customized labels.
The CRI plugin should use a patch update for the `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Claudiu Belu
24cec9be56 sandbox: Allows the sandbox to be deleted in NotReady state
The Pod Sandbox can enter a NotReady state if the task associated
with it no longer exists (it died, or it was killed). In this state,
the Pod network namespace could still be open, which means we can't
remove the sandbox, even if --force was used.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-02 03:40:56 -07:00
Mikko Ylinen
e0f8c04dad cri: Devices ownership from SecurityContext
CRI container runtimes mount devices (set via kubernetes device plugins)
to containers by taking the host user/group IDs (uid/gid) to the
corresponding container device.

This triggers a problem when trying to run those containers with
non-zero (root uid/gid = 0) uid/gid set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.

It is possible to work around the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.

This patch proposes taking RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting device_ownership_from_security_context runtime config value to
true (valid on Linux only).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-30 09:30:00 +03:00
Phil Estes
af1a0908d0
Merge pull request #5865 from dcantah/windows-pod-runasusername
Add RunAsUserName functionality for the Windows pod sandbox container
2021-08-25 22:25:14 -04:00
Daniel Canter
25644b4614 Add RunAsUserName functionality for the Windows Pod Sandbox Container
There were recent changes to CRI to bring in a Windows section containing a
security context object to the pod config. Before this there was no way to specify
a user for the pod sandbox container to run as. In addition, the security context
is a field for field mirror of the Windows container version of it, so add the
ability to specify a GMSA credential spec for the pod sandbox container as well.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-08-23 07:35:22 -07:00
payall4u
f8dfbee178 add cri test case
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:59:19 +08:00
Akihiro Suda
d3aa7ee9f0
Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Phil Estes
ff2e58d114
Merge pull request #5131 from perithompson/windows-hostnetwork
Add Windows HostProcess Support
2021-08-20 14:29:37 -04:00
Kazuyoshi Kato
4dd5ca70fb script: update golangci-lint from v1.38.0 and v1.36.0 to v1.42.0
golint has been deprecated and replaced by revive since v1.41.0.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-08-19 16:27:16 -07:00
Derek McGowan
8d135d2842
Add support for shim plugins
Refactor shim v2 to load and register plugins.
Update init shim interface to not require task service implementation on
returned service, but register as plugin if it is.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-17 11:06:09 -07:00
Gunju Kim
1224060f89 Allow expanded DNS configuration
Signed-off-by: Gunju Kim <gjkim042@gmail.com>
2021-08-14 06:13:01 +09:00
Peri Thompson
79b369a0bb
Added windows hostProcess cni skip
Signed-off-by: Peri Thompson <perit@vmware.com>
2021-08-11 22:23:49 +01:00
Derek McGowan
6f027e38a8
Remove redundant build tags
Remove build tags which are already implied by the name of the file.
Ensures build tags are used consistently

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:27:46 -07:00
Kazuyoshi Kato
1d3d08026d Support SIGRTMIN+n signals
systemd uses SIGRTMIN+n signals, but containerd didn't support the signals
since Go's sys/unix doesn't support them.

This change introduces SIGRTMIN+n handling by utilizing moby/sys/signal.

Fixes #5402.

https://www.freedesktop.org/software/systemd/man/systemd.html#Signals

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-07-26 09:36:43 -07:00
Phil Estes
cf600abecc
Merge pull request #5619 from mikebrow/cri-add-v1-proxy-alpha
[CRI] move up to CRI v1 and support v1alpha in parallel
2021-07-09 14:07:24 -04:00
Mike Brown
d1c1051927 use fu wei's suggested interface pick for marshaling
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-07 15:45:45 -05:00
Mike Brown
14962dcbd2 add alpha version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-06 11:40:20 -05:00
Mike Brown
a5c417ac06 move up to CRI v1 and support v1alpha in parallel
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-28 09:34:12 -05:00
Dan Williams
dac2543a07 sandbox: send pod UID to CNI plugins as K8S_POD_UID
CNI plugins that need to wait for network state to converge
may want to cancel waiting when a short lived pod is deleted.
However, there is a race between when kubelet asks the runtime
to create the sandbox for the pod, and when the plugin is able to
request the pod object from the apiserver. It may be the case
that the plugin receives the new pod, rather than the pod
the sandbox request was initiated for.

Passing the pod UID to the plugin allows the plugin to check
whether the pod it gets from the apiserver is actually the
pod its sandbox request was started for.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-06-22 22:53:30 -05:00
Kazuyoshi Kato
1bbee573af github.com/golang/protobuf/proto is deprecated
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-06-17 10:28:48 -04:00
Quan Tian
728743eb28 Fix cleanup context of teardownPodNetwork
Similar to other deferred cleanup operations, teardownPodNetwork should
use a different context, as the original context may have expired.
Otherwise CNI wouldn't be invoked, leaking network resources such as
IP addresses.

Signed-off-by: Quan Tian <qtian@vmware.com>
2021-06-04 19:17:05 +08:00
Phil Estes
e47400cbd2
Merge pull request #5100 from adisky/skip-tls-localHost
Skip TLS verification for localhost
2021-05-12 14:56:53 -04:00
Mike Brown
c1a35232d8
Merge pull request #5446 from Random-Liu/fix-auth-config
Fix different registry hosts referencing the same auth config.
2021-05-04 06:21:02 -05:00
Lantao Liu
81402e4758 Fix different registry hosts referencing the same auth config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-05-03 17:42:57 -07:00
Aditi Sharma
8014d9fee0 Skip TLS verification for localhost
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-05-03 10:21:54 +05:30
Thomas Hartland
efcb187429 Add unit tests for PID NamespaceMode_TARGET validation
Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 19:59:10 +02:00
Thomas Hartland
b48f27df6b Support PID NamespaceMode_TARGET
This commit adds support for the PID namespace mode TARGET
when generating a container spec.

The container that is created will be sharing its PID namespace
with the target container that was specified by ID in the namespace
options.

Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 17:54:17 +02:00
Sebastiaan van Stijn
864a3322b3
go.mod: github.com/containerd/go-cni v1.0.2
full diff: https://github.com/containerd/go-cni/compare/v1.0.1...v1.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-14 09:09:18 +02:00
Sebastiaan van Stijn
9bc8d63c9f
cri/server: use containerd/oci instead of libcontainer/devices
Looks like we had our own copy of the "getDevices" code already, so use
that code (which also matches the code that's used to _generate_ the spec,
so a better match).

Moving the code to a separate file, I also noticed that the _unix and _linux
code was _exactly_ the same (barring some `//nolint:` comments), so also
removing the duplicated code.

With this patch applied, we removed the dependency on the libcontainer/devices
package (leaving only libcontainer/user).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-08 23:25:21 +02:00
Aditi Sharma
4d4117415e Change CRI config runtime options type
Changing Runtime.Options type to map[string]interface{}
to correctly marshal it from go to JSON.
See issue: https://github.com/kubernetes-sigs/cri-tools/issues/728

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-04-08 15:11:33 +05:30
Akihiro Suda
8ba8533bde
pkg/cri/opts.WithoutRunMount -> oci.WithoutRunMount
Move `pkg/cri/opts.WithoutRunMount` function to `oci.WithoutRunMount`
so that it can be used without dependency on CRI.

Also add `oci.WithoutMounts(dests ...string)` for generality.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-07 21:25:36 +09:00
Derek McGowan
261c107ffc
Merge pull request #5278 from mxpv/toml
Migrate TOML to github.com/pelletier/go-toml
2021-04-01 21:24:52 -07:00
Mike Brown
1b05b605c8
Merge pull request #5145 from aojea/happyeyeballs
use (sort of) happy-eyeballs for port-forwarding
2021-03-26 09:51:29 -05:00
Maksym Pavlenko
ddd4298a10 Migrate current TOML code to github.com/pelletier/go-toml
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-25 13:13:33 -07:00
Maksym Pavlenko
4674ad7beb Ignore some tests on darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-24 22:40:22 -07:00
Sebastiaan van Stijn
708299ca40
Move RunningInUserNS() to its own package
This allows using the utility without bringing whole of "sys" with it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-23 11:29:53 +01:00
Antonio Ojea
305b425830 use happy-eyeballs for port-forwarding
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12.
It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses goroutines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try the connections
serially, trying IPv4 first to keep the same behaviour.
xref https://github.com/golang/go/issues/44922

Signed-off-by: Antonio Ojea <aojea@redhat.com>
2021-03-22 20:15:24 +01:00
Brian Goff
b0b6d9aa03 Add support for using a host registry dir in cri
This will be used instead of the cri registry config in the main config
toml.

---

Also pulls in changes from containerd/cri@d0b4eecbb3

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-12 22:42:22 +00:00
Iceber Gu
92ab1a63b0 cri: fix container status
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-03-05 00:00:10 +08:00
f00231050
591caece0c cri: check fsnotify watcher when receiving cni conf dir events
carry: 612f5f9f44

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-03-03 16:46:41 +08:00
Yohei Ueda
07f1df4541
cri: set default masked/readonly paths to empty paths
Fixes #5029.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2021-02-24 23:50:40 +09:00
Phil Estes
757be0a090
Merge pull request #5017 from AkihiroSuda/parse-cap
oci.WithPrivileged: set the current caps, not the known caps
2021-02-23 09:10:57 -05:00
Mike Brown
9173d3e929
Merge pull request #5021 from wzshiming/fix/signal_repeatedly
Fix repeated sending signal
2021-02-22 09:45:56 -06:00
Justin Terry (SF)
06e4e09567 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Justin Terry (SF) <juterry@microsoft.com>
2021-02-18 16:39:28 -08:00
Phil Estes
c32ccdf8be
Merge pull request #5024 from yadzhang/deepcopy-imageconfig
cri: append envs from image config to empty slice to avoid env lost
2021-02-18 12:51:51 -05:00
Akihiro Suda
746cef0bc2
Merge pull request #5044 from wzshiming/fix/empty-error-warpping
Fix empty error warpping
2021-02-18 13:47:13 +09:00
zhangyadong.0808
08318b1ab9 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Yadong Zhang <yadzhang@gmail.com>
2021-02-18 11:37:41 +08:00
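
The class of bug that 08318b1ab9 guards against can be reproduced in a few lines of Go. The sketch below is an assumption about the failure mode (slice aliasing through `append`), not the CRI code; `mergeEnvShared` and `mergeEnvCopy` are hypothetical names.

```go
package main

import "fmt"

// mergeEnvShared appends directly to the image's env slice. If that slice
// has spare capacity, the new entries land in its backing array, which every
// other holder of the slice shares.
func mergeEnvShared(imageEnv, extra []string) []string {
	return append(imageEnv, extra...)
}

// mergeEnvCopy starts from an empty slice, so the result owns its backing
// array and later appends cannot overwrite another caller's entries.
func mergeEnvCopy(imageEnv, extra []string) []string {
	env := append([]string{}, imageEnv...)
	return append(env, extra...)
}

func main() {
	// A cached image config whose env slice has spare capacity.
	imageEnv := make([]string, 1, 4)
	imageEnv[0] = "PATH=/usr/bin"

	a := mergeEnvShared(imageEnv, []string{"POD=a"})
	b := mergeEnvShared(imageEnv, []string{"POD=b"})
	fmt.Println(a[1], b[1]) // the second append overwrote the first

	c := mergeEnvCopy(imageEnv, []string{"POD=a"})
	d := mergeEnvCopy(imageEnv, []string{"POD=b"})
	fmt.Println(c[1], d[1]) // each result keeps its own entry
}
```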
Shiming Zhang
59db8a10e0 Fix empty error warpping
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-18 11:06:59 +08:00
Shiming Zhang
dc6f5ef3b9 Fix repeated sending signal
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-17 21:33:49 +08:00
Lorenz Brun
36d0bc1f2b Allow moving netns directory into StateDir
Signed-off-by: Lorenz Brun <lorenz@nexantic.com>
2021-02-10 18:33:14 +01:00
Akihiro Suda
a2d1a8a865
oci.WithPrivileged: set the current caps, not the known caps
This change is needed for running the latest containerd inside Docker
that is not aware of the recently added caps (BPF, PERFMON, CHECKPOINT_RESTORE).

Without this change, containerd inside Docker fails to run containers with
"apply caps: operation not permitted" error.

See kubernetes-sigs/kind 2058

NOTE: The caller process of this function is now assumed to be as
privileged as possible.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-10 17:14:17 +09:00
Michael Crosby
e874e2597e [cri] add pod annotations to CNI call
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-09 13:24:01 -05:00
Derek McGowan
b3f2402062
Merge pull request #5002 from crosbymichael/anno-image-name
[cri] add image-name annotation
2021-02-05 08:27:41 -08:00
Akihiro Suda
e908be5b58
Merge pull request #5001 from kzys/no-lint-upgrade
2021-02-06 00:40:38 +09:00
Kazuyoshi Kato
07db46ee23 lint: update nolint syntax for golangci-lint
Newer golangci-lint needs explicit `//` separator. Otherwise it treats
the entire line (`staticcheck deprecated ... yet`) as a name.

https://golangci-lint.run/usage/false-positives/#nolint

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-02-04 11:59:55 -08:00
Sebastiaan van Stijn
04d061fa6a
update runc to v1.0.0-rc93
full diff: https://github.com/opencontainers/runc/compare/v1.0.0-rc92...v1.0.0-rc93

also removes dependency on libcontainer/configs

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:13:30 +01:00
Sebastiaan van Stijn
54cc3483ff
pkg/cri/server: don't import libcontainer/configs
Looks like this import was not needed for the test; simplified the test
by just using the device-path (a counter would work, but for debugging,
having the list of paths can be useful).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:08:39 +01:00
Michael Crosby
99cb62f233 [cri] add image-name annotation
For some tools having the actual image name in the annotations is helpful for
debugging and auditing the workload.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-04 07:05:11 -05:00
Michael Crosby
591d7e2fb1 remove exec sync debug contents from logs
This was dumping untrusted output to the debug logs from user containers.
We should not dump this type of information to reduce log sizes and any
information leaks from user containers.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-26 14:57:54 -05:00
Alban Crequy
28e4fb25f4 cri: add annotations for pod name and namespace
cri-o has annotations for pod name, namespace and container name:
https://github.com/containers/podman/blob/master/pkg/annotations/annotations.go

But so far containerd had only the container name.

This patch will be useful for seccomp agents to have a different
behaviour depending on the pod (see runtime-spec PR 1074 and runc PR
2682). This should simplify the code in:
b2d423695d/pkg/kuberesolver/kuberesolver.go (L16-L27)

Signed-off-by: Alban Crequy <alban@kinvolk.io>
2021-01-26 12:10:39 +01:00
Wei Fu
e56de63099 cri: handle sandbox/container exit event separately
The event monitor handles exit events one by one. If deleting a task goes
wrong, it slows down the termination of other Pods. To reduce the impact,
the exit event watcher should handle each exit event separately. If
handling fails, the watcher should put the event into the backoff queue
and retry it.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-01-24 13:43:38 +08:00
Shengjing Zhu
2818fdebaa Move runtimeoptions out of cri package
Since it's a standard set of runtime opts, and used in ctr as well,
it could be moved out of cri.

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2021-01-23 01:24:35 +08:00
Michael Crosby
a731039238 [cri] label etc files for selinux containers
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-19 13:42:09 -05:00
Mike Brown
550b4949cb
Merge pull request #4700 from mikebrow/cri-security-profile-update
CRI security profile update for CRI graduation
2021-01-12 12:21:56 -06:00
Sebastiaan van Stijn
2374178c9b
pkg/cri/server: optimizations in unmountRecursive()
Use a PrefixFilter() to get only the mounts we're interested in,
which removes the need to manually filter mounts from the mountinfo
results.

Additional optimizations can be made, as:

> ... there's a little known fact that `umount(MNT_DETACH)` is actually
> recursive in Linux, IOW this function can be replaced with
> `unix.Umount(target, unix.MNT_DETACH)` (or `mount.UnmountAll(target, unix.MNT_DETACH)`
>  (provided that target itself is a mount point).

e8fb2c392f (r535450446)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:32:01 +01:00
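
The prefix filtering mentioned in 2374178c9b can be sketched with plain strings. This is only an illustration: `mountsUnder` is a hypothetical helper, it does not use the mountinfo library, and the deepest-first ordering is an assumption about what a recursive unmount needs. The target is assumed to be a cleaned, non-root path.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// mountsUnder keeps only the mount points at or below target and orders
// them deepest first, so children can be unmounted before their parents.
func mountsUnder(mounts []string, target string) []string {
	var out []string
	for _, m := range mounts {
		// Require a path-separator boundary so "/a" does not match "/ab".
		if m == target || strings.HasPrefix(m, target+"/") {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool { return len(out[i]) > len(out[j]) })
	return out
}

func main() {
	mounts := []string{"/a", "/a/b", "/ab", "/a/b/c", "/x"}
	fmt.Println(mountsUnder(mounts, "/a"))
}
```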
Sebastiaan van Stijn
7572919201
mount: remove remaining uses of mount.Self()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:31:59 +01:00
Davanum Srinivas
1f5b84f27c
[CRI] Reduce clutter of log entries during process execution
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-01-06 13:09:03 -05:00
Shengjing Zhu
5988bfc1ef docs: Various typo found by codespell
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-12-22 13:22:16 +08:00
Michael Crosby
2e442ea485 [cri] ensure log dir is created
containerd is responsible for creating the log but there is no code to ensure
that the log dir exists.  While kubelet should have created this there can be
times where this is not the case and this can cause stuck tasks.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-17 15:04:39 -05:00
Akihiro Suda
7e6e4c466f
remove "selinux" build tag
The build tag was removed in go-selinux v1.8.0: opencontainers/selinux#132

Related: remove "apparmor" build tag: 0a9147f3aa

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-15 20:05:25 +09:00
Mike Brown
6467c3374d refactor based on comments
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-07 21:39:31 -06:00
Phil Estes
efad13faaf
Merge pull request #4811 from AkihiroSuda/expose-apparmor
expose hostSupportsAppArmor()
2020-12-07 08:22:16 -05:00
Akihiro Suda
55eda46b22
expose hostSupportsAppArmor()
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-07 19:12:59 +09:00
Mike Brown
b4727eafbe adding code to support seccomp apparmor securityprofile
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-04 15:15:32 -06:00
Michael Crosby
3d358c9df3 [cri] don't clear base security settings
When a base runtime spec is being used, admins can configure defaults for the
spec so that default ulimits or other security related settings get applied for
all containers launched.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-02 06:51:37 -05:00
Maksym Pavlenko
2837fb35a7
Merge pull request #4715 from thaJeztah/remove_libcontainer_apparmor
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
2020-11-18 14:34:48 -08:00
Sebastiaan van Stijn
eba94a15c8
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
recent versions of libcontainer/apparmor simplified the AppArmor
check to only check if the host supports AppArmor, but no longer
checks if apparmor_parser is installed, or if we're running
docker-in-docker;

bfb4ea1b1b

> The `apparmor_parser` binary is not really required for a system to run
> AppArmor from a runc perspective. How to apply the profile is more in
> the responsibility of higher level runtimes like Podman and Docker,
> which may do the binary check on their own.

This patch copies the logic from libcontainer/apparmor, and
restores the additional checks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-11-12 15:42:25 +01:00
Jacob Blain Christen
a1e7dd939d cri: selinuxrelabel=false for /dev/shm w/ host ipc
This is a followup to #4699 that addresses an oversight that could cause
the CRI to relabel the host /dev/shm, which should be a no-op in most
cases. Additionally, fixes unit tests to make correct assertions for
/dev/shm relabeling.

Discovered while applying the changes for #4699 to containerd/cri 1.4:
https://github.com/containerd/cri/pull/1605

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-11 15:22:17 -07:00
Jacob Blain Christen
e8d8ae3b97 cri: selinux relabel /dev/shm
Address an issue originally seen in the k3s 1.3 and 1.4 forks of containerd/cri, https://github.com/rancher/k3s/issues/2240

Even with updated container-selinux policy, container-local /dev/shm
will get mounted with container_runtime_tmpfs_t because it is a tmpfs
created by the runtime and not the container (thus, container_runtime_t
transition rules apply). The relabel mitigates this, allowing envoy
proxy to work correctly (and other programs that wish to write to their
/dev/shm) under selinux.

Tested locally with:
- SELINUX=Enforcing vagrant up --provision-with=shell,selinux,test-integration
- SELINUX=Enforcing CRITEST_ARGS=--ginkgo.skip='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri
- SELINUX=Permissive CRITEST_ARGS=--ginkgo.focus='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-06 12:05:17 -07:00
Akihiro Suda
8ff2707a3c
Merge pull request #4610 from shahzzzam/samashah/add-annotations
Add manifest digest annotation for snapshotters
2020-10-28 13:11:49 +09:00
zhuangqh
30c9addd6c fix: always set unknown to false when handling exit event
Signed-off-by: jerryzhuang <zhuangqhc@gmail.com>
2020-10-27 10:50:15 +08:00
Daniel Canter
cdb2f9c66f Filter snapshotter labels passed to WithNewSnapshot
Made a change yesterday that passed through snapshotter labels into the wrapper of
WithNewSnapshot, but it passed the entirety of the annotations into the snapshotter.
This change just filters the set that we care about down to snapshotter specific
labels.

There will probably be future changes to add some more labels for LCOW/WCOW and the corresponding
behavior for these new labels.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-15 04:49:39 -07:00
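
The filtering step described in cdb2f9c66f amounts to keeping only the annotation keys that are meant for the snapshotter. A minimal sketch, assuming a key prefix chosen here for illustration (`snapshotLabelPrefix` and `filterSnapshotterLabels` are hypothetical names, as are the annotation keys in `main`):

```go
package main

import (
	"fmt"
	"strings"
)

// snapshotLabelPrefix is an assumed prefix used for illustration only.
const snapshotLabelPrefix = "containerd.io/snapshot/"

// filterSnapshotterLabels returns a new map holding only the entries whose
// key starts with snapshotLabelPrefix. The input map is left untouched.
func filterSnapshotterLabels(annotations map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range annotations {
		if strings.HasPrefix(k, snapshotLabelPrefix) {
			out[k] = v
		}
	}
	return out
}

func main() {
	annotations := map[string]string{
		"containerd.io/snapshot/example": "value",  // passed through
		"example.com/unrelated":          "abc123", // dropped
	}
	fmt.Println(filterSnapshotterLabels(annotations))
}
```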
Phil Estes
9b70de01d6
Merge pull request #4630 from dcantah/pass-snapshotter-opt
Cri - Pass snapshotter labels into customopts.WithNewSnapshot
2020-10-14 10:54:06 -04:00
Daniel Canter
9a1f6ea4dc Cri - Pass snapshotter labels into customopts.WithNewSnapshot
Previously there wasn't a way to pass any labels to snapshotters as the wrapper
around WithNewSnapshot didn't have a parameter to pass them in.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-14 04:14:03 -07:00
Daniel Canter
d74225b588 Fix comment in RemovePodSandbox
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-12 17:59:08 -07:00
zhangjianming
116902cd21 fix no-pivot not working in io.containerd.runtime.v1.linux
Signed-off-by: zhangjianming <zhang.jianming7@zte.com.cn>
2020-10-12 09:39:59 +08:00
Maksym Pavlenko
3d02441a79 Refactor pkg packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-08 17:30:17 -07:00
Samarth Shah
5fc721370d Add manifest digest annotation for snapshotters
Signed-off-by: Samarth Shah <samarthmshah@gmail.com>
2020-10-07 23:12:01 +00:00
Maksym Pavlenko
3508ddd3dd Refactor CRI packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-07 14:45:57 -07:00
Derek McGowan
b22b627300
Move cri server packages under pkg/cri
Organizes the cri related server packages under pkg/cri

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 13:09:37 -07:00