containerd

Author	SHA1	Message	Date
Derek McGowan	65b3922df7	Split streaming config from runtime config Signed-off-by: Derek McGowan <derek@mcg.dev>	2024-01-28 23:14:59 -08:00
Derek McGowan	9795677fe9	Move cri base plugin to CRI runtime service Create new plugin type for CRI runtime and image services. Signed-off-by: Derek McGowan <derek@mcg.dev>	2024-01-28 20:57:18 -08:00
Akihiro Suda	a83449cf11	Merge pull request #9621 from bart0sh/PR011-enable-CDI-by-default config: enable CDI by default	2024-01-17 00:48:55 +00:00
James Jenkins	8aa2551ce0	Move DefaultSnapshotter constants Move the DefaultSnapshotter constants to the defaults package. Fixes issue #8226. Signed-off-by: James Jenkins <James.Jenkins@ibm.com>	2024-01-12 13:28:46 -05:00
Ed Bartosh	c8e8a093ce	config: enable CDI by default Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2024-01-12 09:31:39 +02:00
Derek McGowan	3baf5edb8b	Separate the CRI image config from the main plugin config This change simplifies the CRI plugin dependencies by not requiring the CRI image plugin to depend on any other CRI components. Since other CRI plugins depend on the image plugin, this prevents a dependency cycle for CRI configurations on a base plugin. Signed-off-by: Derek McGowan <derek@mcg.dev>	2024-01-11 09:55:09 -08:00
Derek McGowan	02a9a456e1	Split image config from CRI plugin Signed-off-by: Derek McGowan <derek@mcg.dev>	2024-01-11 09:55:09 -08:00
Wei Fu	23278c81fb	*: introduce image_pull_with_sync_fs in CRI It's to ensure the data integrity during unexpected power failure. Background: Since release 1.3, on Linux systems, containerd unpacks and writes files into the overlayfs snapshot directly. It doesn’t involve any mount-umount operations, so the performance of pulling images has been improved. As we know, the umount syscall for overlayfs will force the kernel to flush all the dirty pages to disk. Without the umount syscall, the files’ data relies on the kernel’s writeback threads or the filesystem's commit setting (for instance, ext4 filesystem). The files in a committed snapshot can be lost after unexpected power failure. However, the snapshot has been committed and the metadata also has been fsynced. There is data inconsistency between snapshot metadata and files in that snapshot. We, containerd, received several issues about data loss after unexpected power failure. * https://github.com/containerd/containerd/issues/5854 * https://github.com/containerd/containerd/issues/3369#issuecomment-1787334907 Solution: * Option 1: SyncFs after unpack The Linux platform provides the [syncfs][syncfs] syscall to synchronize just the filesystem containing a given file. * Option 2: Fsync directories recursively and fsync on regular file The fsync doesn't support symlink/block device/char device files. We need to fsync the parent directory to ensure that entry is persisted. However, based on [xfstest-dev][xfstest-dev], there is no case to ensure fsync-on-parent can persist the special file's metadata, for example, uid/gid, access mode. Check out [generic/690][generic/690]: Syncing parent dir can persist symlink. But for f2fs, it needs a special mount option. And it doesn't say that uid/gid can be persisted. All the details are behind the implementation. > NOTE: All the related test cases have `_flakey_drop_and_remount` in [xfstest-dev]. Based on discussion about [Documenting the crash-recovery guarantees of Linux file systems][kernel-crash-recovery-data-integrity], we can't rely on Fsync-on-parent. * Option 1 is the winner This patch is using option 1. Here are the test results based on [test-tool][test-tool]. All the networking traffic created by pull is local. * Image: docker.io/library/golang:1.19.4 (992 MiB) * Current: 5.446738579s * WIOS=21081, WBytes=1329741824, RIOS=79, RBytes=1197056 * Option 1: 6.239686088s * WIOS=34804, WBytes=1454845952, RIOS=79, RBytes=1197056 * Option 2: 1m30.510934813s * WIOS=42143, WBytes=1471397888, RIOS=82, RBytes=1209344 * Image: docker.io/tensorflow/tensorflow:latest (1.78 GiB, ~32590 Inodes) * Current: 8.852718042s * WIOS=39417, WBytes=2412818432, RIOS=2673, RBytes=335987712 * Option 1: 9.683387174s * WIOS=42767, WBytes=2431750144, RIOS=89, RBytes=1238016 * Option 2: 1m54.302103719s * WIOS=54403, WBytes=2460528640, RIOS=1709, RBytes=208237568 Option 1 will increase `wios`. So, `image_pull_with_sync_fs` is optional in the CRI plugin. [syncfs]: <https://man7.org/linux/man-pages/man2/syncfs.2.html> [xfstest-dev]: <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git> [generic/690]: <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/690?h=v2023.11.19> [kernel-crash-recovery-data-integrity]: <https://lore.kernel.org/linux-fsdevel/1552418820-18102-1-git-send-email-jaya@cs.utexas.edu/> [test-tool]: <`a17fb2010d/contrib/syncfs/containerd/main_test.go (L51)`> Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-12-12 10:18:39 +08:00
Wei Fu	80dd779deb	remotes/docker: close connection if no more data Close connection if no more data. It's to fix a false alert filed by image pull progress. ``` dst = OpenWriter (--> Content Store) src = Fetch Open (--> Registry) Mark it as active request Copy(dst, src) (--> Keep updating total received bytes) ^ \| (Active Request > 0, but total received bytes won't be updated) v defer src.Close() content.Commit(dst) ``` Before migrating to transfer service, CRI plugin doesn't limit global concurrent downloads for ImagePulls. Each ImagePull request has 3 concurrent goroutines to download blobs and 1 goroutine to unpack blobs. Like ext4 filesystem [1][1], the fsync from content.Commit may sync unrelated dirty pages to disk. If the host is running under IO pressure, content.Commit will take a long time and block other goroutines. If httpreadseeker doesn't close the connection after io.EOF, this connection will be considered as active. The pull progress reporter then reports that no bytes have been transferred and cancels the ImagePull. The original 1-minute timeout[2][2] is from the kubelet setting. Since CRI-plugin can't limit the total concurrent downloads, this patch updates 1-minute to 5-minutes to prevent unexpected cancellation. [1]: https://lwn.net/Articles/842385/ [2]: https://github.com/kubernetes/kubernetes/blob/release-1.23/pkg/kubelet/config/flags.go#L45-L48 Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-11-18 10:23:05 +08:00
rongfu.leng	7b9fcfd7c6	add default enable unprivileged icmp/ports Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2023-11-08 23:00:35 +08:00
Derek McGowan	261e01c2ac	Move client to subpackage Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-11-01 10:37:00 -07:00
Derek McGowan	5fdf55e493	Update go module to github.com/containerd/containerd/v2 Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-10-29 20:52:21 -07:00
Maksym Pavlenko	f90f80d9b3	Merge pull request #9254 from adisky/cri-streaming-from-k8s Use staging k8s.io/kubelet/cri/streaming package	2023-10-19 12:32:12 -07:00
Aditi Sharma	03d81f595f	Use cri streaming pkg from k8s staging Use staging k8s.io/kubelet/cri/streaming package Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>	2023-10-18 09:14:28 +05:30
Abel Feng	69e501e7cd	sandbox: change SandboxMode to Sandboxer Signed-off-by: Abel Feng <fshb1988@gmail.com>	2023-10-16 20:49:36 +08:00
Derek McGowan	b5615caf11	Update go-toml to v2 Updates host file parsing to use new v2 method rather than the removed toml.Tree. Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-09-22 15:35:12 -07:00
rongfu.leng	9287711b7a	upgrade registry.k8s.io/pause version Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2023-05-28 07:59:10 +08:00
Iceber Gu	23d288a809	Remove the CriuPath field from runc's options Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>	2023-03-16 17:12:51 +08:00
Maksym Pavlenko	07c2ae12e1	Remove v1 runctypes Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2023-03-15 09:18:16 -07:00
Wei Fu	791f137a5b	*: update drainExecSyncIO docs and validate the timeout We should validate the drainExecSyncIO timeout at the beginning and raise the error for any invalid input. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-03-03 11:58:52 +08:00
Wei Fu	3c18decea7	*: add DrainExecSyncIOTimeout config and disable it by default Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-03-03 00:21:55 +08:00
Maksym Pavlenko	3bc8fc4d30	Cleanup build constraints Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2022-12-08 09:36:20 -08:00
Fei Su	f6232793b4	can set up the network serially by CNI plugins Signed-off-by: Fei Su <sofat1989@126.com>	2022-11-18 15:19:00 +08:00
Zhang Tianyang	c953eecb79	Sandbox API: Add a new mode config for sandbox controller impls Add a new config as sandbox controller mode, which can be either "podsandbox" or "shim". If empty, set it to default "podsandbox" when CRI plugin inits. Signed-off-by: Zhang Tianyang <burning9699@gmail.com>	2022-11-09 12:12:39 +08:00
lengrongfu	3c0e6c40ad	feat: upgrade registry.k8s.io/pause version Signed-off-by: rongfu.leng <1275177125@qq.com>	2022-09-06 15:59:20 +08:00
Paco Xu	9525b3148a	migrate from k8s.gcr.io to registry.k8s.io Signed-off-by: Paco Xu <paco.xu@daocloud.io>	2022-08-24 13:46:46 +08:00
Paco Xu	1cf6f20320	promote pause image to 3.7 Signed-off-by: Paco Xu <paco.xu@daocloud.io>	2022-05-30 15:08:28 +08:00
Wei Fu	00d102da9f	feature: support image pull progress timeout Kubelet sends the PullImage request without timeout, because the image size is unknown and a timeout is hard to define. The pulling request might run into 0B/s speed, if containerd can't receive any packet in that connection. For this case, containerd should cancel the PullImage request. Although containerd provides ingester manager to track the progress of pulling request, for example `ctr image pull` shows the console progress bar, it needs more CPU resources to open/read the ingested files to get status. In order to support progress timeout feature with lower overhead, this patch uses http.RoundTripper wrapper to track active progress. That wrapper will increase the active-request number and return the countingReadCloser wrapper for http.Response.Body. Each bytes-read can be counted and the active-request number will be decreased when the countingReadCloser wrapper has been closed. For the progress tracker, it can check the active-request number and bytes-read at intervals. If there is no progress, the progress tracker should cancel the request. NOTE: For each blob data, containerd will make sure that the content writer is opened before sending http request to the registry. Therefore, the progress reporter can rely on the active-request number. fixed: #4984 Signed-off-by: Wei Fu <fuweid89@gmail.com>	2022-04-27 00:02:27 +08:00
Ed Bartosh	c9b4ccf83e	add configuration for CDI Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>	2022-04-06 13:10:54 +03:00
Adelina Tuvenie	6d3d34b85d	Update Pause image in tests & config With the introduction of Windows Server 2022, some images have been updated to support WS2022 in their manifest list. This commit updates the test images accordingly. Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>	2021-08-31 19:42:57 +03:00
Akihiro Suda	d3aa7ee9f0	Run `go fmt` with Go 1.17 The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools). Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2021-08-22 09:31:50 +09:00
Mike Brown	dd16b006e5	merge in the move to the new options type Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2021-04-08 14:09:59 -05:00
Mike Brown	9144ce9677	shows our runc.v2 default options in the containerd default config Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2021-04-08 14:09:59 -05:00
Mike Brown	d4be6aa8fa	rm mirror defaults; doc registry deprecations Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2021-04-07 12:29:43 -05:00
Maksym Pavlenko	ddd4298a10	Migrate current TOML code to github.com/pelletier/go-toml Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2021-03-25 13:13:33 -07:00
Fu, Wei	80fa9fe32a	Merge pull request #5135 from AkihiroSuda/default-config-crypt add imgcrypt stream processors to the default config	2021-03-25 14:31:38 +08:00
pacoxu	ffff688663	upgrade pause image to 3.5 for non-root Signed-off-by: pacoxu <paco.xu@daocloud.io>	2021-03-16 23:20:35 +08:00
Akihiro Suda	ecb881e5e6	add imgcrypt stream processors to the default config Enable the following config by default: ```toml version = 2 [plugins."io.containerd.grpc.v1.cri".image_decryption] key_model = "node" [stream_processors] [stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"] accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"] returns = "application/vnd.oci.image.layer.v1.tar+gzip" path = "ctd-decoder" args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"] env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"] [stream_processors."io.containerd.ocicrypt.decoder.v1.tar"] accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"] returns = "application/vnd.oci.image.layer.v1.tar" path = "ctd-decoder" args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"] env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"] ``` Fix issue 5128 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2021-03-15 13:27:16 +09:00
Iceber Gu	f37ae8fc35	move to v3.4.1 for the pause image Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>	2021-03-07 15:21:20 +08:00
Derek McGowan	b2642458f9	Update make snapshot annotations disabled by default This experimental feature should not be enabled by default as it is not used by any default snapshotters. Signed-off-by: Derek McGowan <derek@mcg.dev>	2020-10-27 21:32:25 -07:00
Maksym Pavlenko	3508ddd3dd	Refactor CRI packages Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2020-10-07 14:45:57 -07:00
Derek McGowan	b22b627300	Move cri server packages under pkg/cri Organizes the cri related server packages under pkg/cri Signed-off-by: Derek McGowan <derek@mcg.dev>	2020-10-07 13:09:37 -07:00

42 Commits
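
The progress-timeout mechanism described in commit 00d102da9f (an http.RoundTripper wrapper that counts active requests and bytes read, plus a tracker that cancels when nothing moves) can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the containerd implementation: only `countingReadCloser` is a name taken from the commit message, while `progressTracker`, `watch`, `pull`, `demo`, the polling interval, and the 300ms timeout are hypothetical choices made for the example.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// progressTracker (hypothetical name) wraps a RoundTripper and records
// how many requests are active and how many body bytes have been read.
type progressTracker struct {
	next      http.RoundTripper
	active    atomic.Int32
	bytesRead atomic.Int64
}

func (p *progressTracker) RoundTrip(req *http.Request) (*http.Response, error) {
	p.active.Add(1)
	resp, err := p.next.RoundTrip(req)
	if err != nil {
		p.active.Add(-1)
		return nil, err
	}
	resp.Body = &countingReadCloser{rc: resp.Body, tracker: p}
	return resp, nil
}

// countingReadCloser counts bytes on Read and marks the request
// inactive exactly once on Close.
type countingReadCloser struct {
	rc      io.ReadCloser
	tracker *progressTracker
	closed  atomic.Bool
}

func (c *countingReadCloser) Read(b []byte) (int, error) {
	n, err := c.rc.Read(b)
	c.tracker.bytesRead.Add(int64(n))
	return n, err
}

func (c *countingReadCloser) Close() error {
	if c.closed.CompareAndSwap(false, true) {
		c.tracker.active.Add(-1)
	}
	return c.rc.Close()
}

// watch polls the counters and cancels once a request has been active
// for `timeout` without any new bytes arriving.
func (p *progressTracker) watch(ctx context.Context, cancel context.CancelFunc, interval, timeout time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	last, lastProgress := p.bytesRead.Load(), time.Now()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			cur := p.bytesRead.Load()
			if cur != last || p.active.Load() == 0 {
				last, lastProgress = cur, time.Now()
			} else if time.Since(lastProgress) >= timeout {
				cancel()
				return
			}
		}
	}
}

// pull downloads url and returns the bytes read; it fails if the
// tracker cancels the request for lack of progress.
func pull(url string, timeout time.Duration) (int64, error) {
	t := &progressTracker{next: http.DefaultTransport}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go t.watch(ctx, cancel, timeout/4, timeout)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := (&http.Client{Transport: t}).Do(req)
	if err != nil {
		return t.bytesRead.Load(), err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return t.bytesRead.Load(), err
}

// demo pulls once from a healthy test server and once from a stalled one.
func demo() (okBytes int64, okErr, stalledErr error) {
	fast := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello world")
	}))
	defer fast.Close()
	stalled := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
		select { // send headers, then never send a body byte
		case <-r.Context().Done():
		case <-time.After(5 * time.Second):
		}
	}))
	defer stalled.Close()
	okBytes, okErr = pull(fast.URL, 300*time.Millisecond)
	_, stalledErr = pull(stalled.URL, 300*time.Millisecond)
	return okBytes, okErr, stalledErr
}

func main() {
	n, okErr, stalledErr := demo()
	fmt.Println("fast pull:", n, "bytes, err:", okErr)
	fmt.Println("stalled pull err:", stalledErr)
}
```

Polling two atomic counters keeps the overhead low compared with reopening ingested files to compute status, which is the trade-off the commit message describes. The sketch also shows why the later fix in 80dd779deb matters: a response body that is never closed keeps the active count above zero, so a tracker like this one would see a request with no byte progress.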