A majority of the tests in /pkg/cri validate multiple things per test
(generally spec or options validations). This flow lends itself well to
*testing.T's Run method, which runs each case as a subtest so that
`go test` output can actually show which subtest failed or passed.
Some of the tests in the pkg/cri packages already did this, but a bunch
simply logged which sub-testcase was currently running without invoking
t.Run.
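The pattern in question looks roughly like this (a minimal sketch; the table
entries and the validateSpec helper are invented for illustration, not taken
from pkg/cri):
```go
// Illustrative only: a table-driven test that uses t.Run so each case shows
// up individually in `go test` output.
package cri_test

import (
	"errors"
	"testing"
)

func validateSpec(in string) error {
	if in == "" {
		return errors.New("spec must not be empty")
	}
	return nil
}

func TestSpecValidation(t *testing.T) {
	testCases := []struct {
		name    string
		input   string
		wantErr bool
	}{
		{name: "ValidSpec", input: "ok", wantErr: false},
		{name: "MissingField", input: "", wantErr: true},
	}
	for _, tc := range testCases {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			err := validateSpec(tc.input)
			if (err != nil) != tc.wantErr {
				t.Fatalf("validateSpec(%q) error = %v, wantErr %v", tc.input, err, tc.wantErr)
			}
		})
	}
}
```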
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
HostProcess containers require every container in the pod to be a
host process container and to have the corresponding field set. The Kubelet
usually enforces this, so we'd error out before even getting here, but we
recently found a bug in that logic, so better to be safe than sorry.
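A minimal sketch of the added safety check, with simplified stand-in types
rather than the real CRI/hcsshim structures:
```go
// Sketch: all containers in a Windows pod must agree with the sandbox on
// whether they are HostProcess containers. Types are simplified stand-ins.
package hostprocess

import "fmt"

type windowsContainer struct {
	Name        string
	HostProcess bool
}

func validateHostProcess(sandboxHostProcess bool, containers []windowsContainer) error {
	for _, c := range containers {
		if c.HostProcess != sandboxHostProcess {
			return fmt.Errorf("container %q: HostProcess=%t does not match the pod's HostProcess=%t",
				c.Name, c.HostProcess, sandboxHostProcess)
		}
	}
	return nil
}
```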
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
Schema 1 has been substantially deprecated since circa 2017 in favor of Schema 2, introduced in Docker 1.10 (Feb 2016),
and its successor, OCI Image Spec v1, but we have not officially deprecated Schema 1.
One of the reasons was that Quay did not support Schema 2, but Quay has reportedly
supported Schema 2 since Feb 2020 (moby/buildkit issue 409).
This PR deprecates pulling Schema 1 images, but the feature will not be removed before containerd 2.0.
Pushing Schema 1 images was never implemented in containerd (or its consumers such as BuildKit).
Docker/Moby already disabled pushing Schema 1 images in Docker 20.10 (moby/moby PR 41295),
but Docker/Moby has not yet disabled pulling Schema 1 because containerd had not yet deprecated it.
(See the comments in moby/moby PR 42300.)
Docker/Moby is expected to disable pulling Schema 1 images in the future, after this deprecation.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
This patch adds support for a container annotation and two separate
pod annotations for controlling the blockio class of containers.
The container annotation can be used by a CRI client:
"io.kubernetes.cri.blockio-class"
Pod annotations specify the blockio class at the K8s pod spec level:
"blockio.resources.beta.kubernetes.io/pod"
(pod-wide default for all containers within)
"blockio.resources.beta.kubernetes.io/container.<container_name>"
(container-specific overrides)
Correspondingly, this patch adds support for --blockio-class and
--blockio-config-file to ctr, too.
This implementation follows the resource class annotation pattern
introduced for RDT and merged in commit 893701220.
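The lookup order these annotations imply is roughly the following (a sketch;
the helper name is invented here, and the real implementation delegates the
parsing to goresctrl):
```go
// Sketch of blockio class resolution: the CRI container annotation wins,
// then the container-specific pod annotation, then the pod-wide default.
package blockio

func blockioClass(containerName string, containerAnnotations, podAnnotations map[string]string) string {
	if class, ok := containerAnnotations["io.kubernetes.cri.blockio-class"]; ok {
		return class
	}
	if class, ok := podAnnotations["blockio.resources.beta.kubernetes.io/container."+containerName]; ok {
		return class
	}
	return podAnnotations["blockio.resources.beta.kubernetes.io/pod"]
}
```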
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
Kubelet sends the PullImage request without a timeout, because the image size
is unknown and a timeout is hard to define. The pull might drop to 0 B/s
if containerd can't receive any packets on that connection.
In this case, containerd should cancel the PullImage request.
Although containerd provides an ingest manager to track the progress of a pull
request (for example, `ctr image pull` shows a console progress bar), it needs
more CPU resources to open and read the ingested files to get the status.
In order to support the progress timeout feature with lower overhead, this
patch uses an http.RoundTripper wrapper to track active progress. The
wrapper increases the active-request count and returns a
countingReadCloser wrapper for http.Response.Body. Each byte read is
counted, and the active-request count is decreased when the
countingReadCloser wrapper is closed. The progress tracker can then
check the active-request count and bytes read at intervals. If
there is no progress, the progress tracker should cancel the
request.
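A minimal sketch of that wrapper (type and field names here are illustrative,
not the exact ones in the patch):
```go
// Sketch: wrap an http.RoundTripper so every response body is counted.
// The progress tracker polls activeReqs/totalBytes; if neither changes
// between polls while activeReqs > 0, it can cancel the pull context.
package tracking

import (
	"io"
	"net/http"
	"sync/atomic"
)

type countingTransport struct {
	inner      http.RoundTripper
	activeReqs int64
	totalBytes int64
}

func (t *countingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.inner.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	atomic.AddInt64(&t.activeReqs, 1)
	resp.Body = &countingReadCloser{rc: resp.Body, t: t}
	return resp, nil
}

type countingReadCloser struct {
	rc io.ReadCloser
	t  *countingTransport
}

func (c *countingReadCloser) Read(p []byte) (int, error) {
	n, err := c.rc.Read(p)
	atomic.AddInt64(&c.t.totalBytes, int64(n))
	return n, err
}

func (c *countingReadCloser) Close() error {
	atomic.AddInt64(&c.t.activeReqs, -1)
	return c.rc.Close()
}
```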
NOTE: For each blob, containerd makes sure that the content writer is
opened before sending the HTTP request to the registry. Therefore, the
progress reporter can rely on the active-request count.
Fixes: #4984
Signed-off-by: Wei Fu <fuweid89@gmail.com>
This commit migrates containerd/protobuf from github.com/gogo/protobuf
to google.golang.org/protobuf and adjusts types. Proto-generated structs
cannot be passed as values.
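To illustrate the point with a generic example (not containerd code): messages
generated by google.golang.org/protobuf carry internal state that must not be
copied, so `go vet` flags pass-by-value and code has to hand around pointers
instead:
```go
// Generic illustration: pass generated messages by pointer. A signature like
// func(d durationpb.Duration) would copy the message's internal state and is
// reported by `go vet` (copylocks).
package main

import (
	"fmt"
	"time"

	"google.golang.org/protobuf/types/known/durationpb"
)

func describe(d *durationpb.Duration) string {
	return fmt.Sprintf("duration: %v", d.AsDuration())
}

func main() {
	d := durationpb.New(1500 * time.Millisecond)
	fmt.Println(describe(d))
}
```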
Fixes #6564.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
Parallelizing them decreases loading duration.
Time to complete recover():
* Without competing IOs + without opt: 21s
* Without competing IOs + with opt: 14s
* Competing IOs + without opt: 3m44s
* Competing IOs + with opt: 33s
Signed-off-by: Eric Lin <linxiulei@gmail.com>
Extract the names of requested CDI devices and update the OCI
Spec according to the corresponding CDI device specifications.
CDI devices are requested using container annotations in the
cdi.k8s.io namespace. Once CRI gains dedicated fields for CDI
injection, the snippet for extracting CDI names will need an
update.
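The extraction itself boils down to something like this (a simplified sketch;
the function name and error handling are not the actual implementation):
```go
// Sketch: collect fully-qualified CDI device names from container
// annotations in the cdi.k8s.io namespace. A value may list several
// devices separated by commas. Simplified for illustration.
package cdidevices

import "strings"

const cdiAnnotationPrefix = "cdi.k8s.io/"

func devicesFromAnnotations(annotations map[string]string) []string {
	var devices []string
	for key, value := range annotations {
		if !strings.HasPrefix(key, cdiAnnotationPrefix) {
			continue
		}
		for _, dev := range strings.Split(value, ",") {
			if dev = strings.TrimSpace(dev); dev != "" {
				devices = append(devices, dev)
			}
		}
	}
	return devices
}
```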
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
Background:
With the current design, the content backend uses a key-lock for long-lived
write transactions. If a content reference has been marked for a write
transaction, other requests on the same reference fail fast with an
unavailable error. Since the metadata plugin is based on boltdb, which
only supports a single writer, the content backend can't block or hold
the request for too long. It requires the client to handle retries by itself,
like the OpenWriter backoff retry helper. But the maximum retry interval
can be up to 2 seconds. If there are several concurrent requests for the
same image, the waiters may wake up at the same time while only one of
them can continue. A lot of waiters go back to sleep, so it takes a long
time to finish all the pulling jobs, and it gets worse if the image
has many more layers, as mentioned in issue #4937.
After fetching, the containerd.Pull API allows several handlers to commit
the same ChainID snapshot, but only one can succeed. Since unpacking a
tar.gz is a time-consuming job, unpacking the same ChainID snapshot in
parallel hurts performance.
For instance, Request 2 doesn't need to prepare and commit; it
should just wait for Request 1 to finish, as mentioned in pull
request #6318.
```text
Request 1                Request 2
Prepare
   |
   |
   |
   |                     Prepare
Commit                      |
                            |
                            |
                            |
                         Commit (failed on exist)
```
Both the content backoff retry and the unnecessary unpack hurt performance.
Solution:
Introduce duplicate suppression in the fetch and unpack contexts. The
duplicate suppression uses a key-mutex and single-waiter-notify to support
singleflight. Callers can use the duplicate suppression in different
PullImage handlers so that we can avoid the unnecessary unpack and the
spin-lock in OpenWriter.
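Conceptually this is the singleflight pattern. The sketch below shows the idea
using golang.org/x/sync/singleflight; the actual patch implements its own
key-mutex and single-waiter-notify instead of importing that package:
```go
// Sketch: deduplicate concurrent unpacks of the same ChainID. All callers
// asking for the same key share one in-flight unpack and its result.
package unpackdedup

import (
	"context"

	"golang.org/x/sync/singleflight"
)

var unpackGroup singleflight.Group

// unpackOnce runs unpack at most once per chainID at a time; concurrent
// callers with the same chainID wait for the first call's result.
func unpackOnce(ctx context.Context, chainID string, unpack func(context.Context) error) error {
	_, err, _ := unpackGroup.Do(chainID, func() (interface{}, error) {
		return nil, unpack(ctx)
	})
	return err
}
```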
Test Result:
Before enhancement:
```bash
➜ /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...
real 1m6.172s
user 0m0.268s
sys 0m0.193s
docker pull localhost:5000/redis:latest (x20) takes ...
real 0m1.324s
user 0m0.441s
sys 0m0.316s
➜ /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...
real 1m47.657s
user 0m0.284s
sys 0m0.224s
docker pull localhost:5000/golang:latest (x20) takes ...
real 0m6.381s
user 0m0.488s
sys 0m0.358s
```
With this enhancement:
```bash
➜ /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...
real 0m1.140s
user 0m0.243s
sys 0m0.178s
docker pull localhost:5000/redis:latest (x20) takes ...
real 0m1.239s
user 0m0.463s
sys 0m0.275s
➜ /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...
real 0m5.546s
user 0m0.217s
sys 0m0.219s
docker pull localhost:5000/golang:latest (x20) takes ...
real 0m6.090s
user 0m0.501s
sys 0m0.331s
```
Test Script:
localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The image is held in a local registry
started by `docker run -d -p 5000:5000 --name registry registry:2`.
```bash
image_name="${1}"
pull_times="${2:-10}"
cleanup() {
    ctr image rmi "${image_name}"
    ctr -n k8s.io image rmi "${image_name}"
    crictl rmi "${image_name}"
    docker rmi "${image_name}"
    sleep 2
}
crictl_testing() {
    for idx in $(seq 1 ${pull_times}); do
        crictl pull "${image_name}" > /dev/null 2>&1 &
    done
    wait
}
docker_testing() {
    for idx in $(seq 1 ${pull_times}); do
        docker pull "${image_name}" > /dev/null 2>&1 &
    done
    wait
}
cleanup > /dev/null 2>&1
echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo
echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```
Fixes: #4937
Closes: #4985
Closes: #6318
Signed-off-by: Wei Fu <fuweid89@gmail.com>
This commit upgrades github.com/containerd/typeurl to use typeurl.Any.
The interface hides gogo/protobuf/types.Any from containerd's Go client.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
The linter on platforms that have a hardcoded response complains about
"if xyz == nil" checks; ignore those.
Signed-off-by: Phil Estes <estesp@amazon.com>
There are two mappings of hostpath to IDType and ID in the wild:
- dockershim and dockerd-cri (implicitly via docker) use class/ID
-- The only supported IDType in Docker is 'class'.
-- https://github.com/aarnaud/k8s-directx-device-plugin generates this form
- https://github.com/jterry75/cri (windows_port branch) uses IDType://ID
-- hcsshim's CRI test suite generates this form
`://` is much more easily distinguishable, so I've gone with that one as
the generic separator, with `class/` as a special case.
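The resulting parsing looks roughly like this (a sketch; the function name and
exact error wording are illustrative):
```go
// Sketch: split a Windows device hostpath into IDType and ID.
// "IDType://ID" is the generic form; "class/GUID" is accepted as a
// special case for compatibility with dockershim/dockerd-cri.
package devices

import (
	"fmt"
	"strings"
)

func parseHostDevice(hostPath string) (idType, id string, err error) {
	if strings.HasPrefix(hostPath, "class/") {
		return "class", strings.TrimPrefix(hostPath, "class/"), nil
	}
	parts := strings.SplitN(hostPath, "://", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("unrecognized device %q: expected IDType://ID or class/ID", hostPath)
	}
	return parts[0], parts[1], nil
}
```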
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
These unit tests don't check hugetlb. However, with
TolerateMissingHugetlbController set to false, these tests can't
be run on systems without hugetlb (e.g. Debian buildd).
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
We were not properly ignoring errors from
goresctrl's rdt.ContainerClassFromAnnotations(), which caused the config
option to be ineffective in practice.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
The Linux kernel never sets the Inheritable capability flag to
anything other than empty. Non-empty values are always exclusively
set by userspace code.
[The kernel stopped defaulting this set of capability values to the
full set in 2000 after a privilege escalation with Capabilities
affecting Sendmail and others.]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Enabling this option effectively makes the RDT class of a container a
soft requirement. If RDT support has not been enabled, the RDT class
setting will not have any effect.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
Use goresctrl for parsing container and pod annotations related to RDT.
In practice, from the users' point of view, this patch adds support for
a container annotation and two separate pod annotations for controlling
the RDT class of containers.
The container annotation can be used by a CRI client:
"io.kubernetes.cri.rdt-class"
Pod annotations specify the RDT class at the K8s pod spec level:
"rdt.resources.beta.kubernetes.io/pod"
(pod-wide default for all containers within)
"rdt.resources.beta.kubernetes.io/container.<container_name>"
(container-specific overrides)
Annotations are intended as an intermediate step before the CRI API
supports RDT.
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
The ability to handle KVM based runtimes with SELinux has been added as
part of d715d00906.
However, that commit introduced some logic to check whether the
"container_kvm_t" label would or would not be present on the system, and while
the intentions were good, there are two major issues with the approach:
1. Inspecting "/etc/selinux/targeted/contexts/customizable_types" is not
the way to go, as it doesn't list "container_kvm_t" at all.
2. There's no need to check for the label: if the label is invalid, an
"Invalid Label" error will be returned, and that's it.
With those two in mind, let's simplify the logic behind setting the
"container_kvm_t" label, removing all the unnecessary code.
Here's the output of a running VMM process, considering:
* The state before this patch:
```
$ containerd --version
containerd github.com/containerd/containerd v1.6.0-beta.3-88-g7fa44fc98 7fa44fc98f
$ kubectl apply -f ~/simple-pod.yaml
pod/nginx created
$ ps -auxZ | grep cloud-hypervisor
system_u:system_r:container_runtime_t:s0 root 609717 4.0 0.5 2987512 83588 ? Sl 08:32 0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/be9d5cbabf440510d58d89fc8a8e77c27e96ddc99709ecaf5ab94c6b6b0d4c89/clh-api.sock
```
* The state after this patch:
```
$ containerd --version
containerd github.com/containerd/containerd v1.6.0-beta.3-89-ga5f2113c9 a5f2113c9fc15b19b2c364caaedb99c22de4eb32
$ kubectl apply -f ~/simple-pod.yaml
pod/nginx created
$ ps -auxZ | grep cloud-hypervisor
system_u:system_r:container_kvm_t:s0:c638,c999 root 614842 14.0 0.5 2987512 83228 ? Sl 08:40 0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/f8ff838afdbe0a546f6995fe9b08e0956d0d0cdfe749705d7ce4618695baa68c/clh-api.sock
```
Note, the tests were performed using the following configuration snippet:
```
[plugins]
[plugins.cri]
enable_selinux = true
[plugins.cri.containerd]
[plugins.cri.containerd.runtimes]
[plugins.cri.containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
```
And using the following pod yaml:
```
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
runtimeClassName: kata
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80
```
Fixes: #6371
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The CRI API has been updated to include an optional `resources` field in the
LinuxPodSandboxConfig, as part of the RunPodSandbox request.
Having sandbox-level resource details at sandbox creation time will have
large benefits for sandboxed runtimes. In the case of Kata Containers,
for example, this'll allow for better support of SW/HW architectures
which don't allow for CPU/memory hotplug, and it'll allow for better
queue sizing for virtio devices associated with the sandbox (in the VM
case).
If this sandbox resource information is provided as part of the run
sandbox request, let's introduce a pattern where we update the
pause container's runtime spec to include this information in the
annotations field.
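A rough sketch of that pattern (the annotation key and the JSON encoding are
assumptions made for illustration, not necessarily what the patch uses):
```go
// Sketch: serialize the sandbox-level resources into an annotation on the
// pause container's OCI spec so a sandboxed runtime (e.g. Kata) can size
// the VM up front. The annotation key here is a placeholder.
package sandboxres

import (
	"encoding/json"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

type sandboxResources struct {
	CPUPeriod   int64 `json:"cpu_period"`
	CPUQuota    int64 `json:"cpu_quota"`
	MemoryBytes int64 `json:"memory_bytes"`
}

func annotateSandboxResources(spec *specs.Spec, res sandboxResources) error {
	data, err := json.Marshal(res)
	if err != nil {
		return err
	}
	if spec.Annotations == nil {
		spec.Annotations = map[string]string{}
	}
	// Placeholder key chosen for this sketch.
	spec.Annotations["io.kubernetes.cri.sandbox-resources"] = string(data)
	return nil
}
```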
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
When containerd uses this config:
```
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
endpoint = ["http://localhost:5000"]
```
Because the `newTransport` function does not initialize the `TLSClientConfig`
field, any later use of `TLSClientConfig` causes a nil pointer dereference.
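A generic illustration of the failure mode and the fix (`newTransport` here is
a simplified stand-in, not the actual containerd function):
```go
// Sketch: if an http.Transport is created without TLSClientConfig, code that
// later dereferences transport.TLSClientConfig (for example to toggle
// InsecureSkipVerify) panics with a nil pointer dereference. Initializing
// the field up front avoids that.
package registrytransport

import (
	"crypto/tls"
	"net/http"
)

func newTransport() *http.Transport {
	return &http.Transport{
		// Without this line, transport.TLSClientConfig.InsecureSkipVerify = true
		// elsewhere in the code would panic.
		TLSClientConfig: &tls.Config{},
	}
}
```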
Signed-off-by: wanglei <wllenyj@linux.alibaba.com>