containerd

Author	SHA1	Message	Date
Wei Fu	cb5a48e645	*: enable ARM64 runner There are many Kubernetes clusters running on ARM64. Enable ARM64 runner is to commit to support ARM64 platform officially. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-12-07 23:55:36 +08:00
Sebastiaan van Stijn	2af6db672e	switch back from golang.org/x/sys/execabs to os/exec (go1.19) This is effectively a revert of `2ac9968401`, which switched from os/exec to the golang.org/x/sys/execabs package to mitigate security issues (mainly on Windows) with lookups resolving to binaries in the current directory. from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path > ## PATH lookups > > Command and LookPath no longer allow results from a PATH search to be found > relative to the current directory. This removes a common source of security > problems but may also break existing programs that depend on using, say, > exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in > the current directory. See the os/exec package documentation for information > about how best to update such programs. > > On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath > environment variable, making it possible to disable the default implicit search > of “.” in PATH lookups on Windows systems. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2023-11-02 21:15:40 +01:00
Phil Estes	3d6c5ea487	Merge pull request #9308 from ZhangShuaiyi/fix/TestRwLoop test: remove /dev/loopX in TestRwLoop	2023-11-02 14:44:59 +00:00
Shuaiyi Zhang	b6adf43d4a	test: use 'Autoclear: true' in TestRwLoop and add Autoclear test Signed-off-by: Shuaiyi Zhang <zhang_syi@qq.com>	2023-11-01 11:49:12 +08:00
Derek McGowan	5fdf55e493	Update go module to github.com/containerd/containerd/v2 Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-10-29 20:52:21 -07:00
Derek McGowan	788f7f248a	Merge pull request #9218 from fuweid/followup-idmapped idmapped: use pidfd to avoid pid reuse issue	2023-10-20 17:34:02 +00:00
Samuel Karp	423c7ad4fe	Merge pull request #9211 from UiPath/use-loop-configure	2023-10-16 23:40:58 -07:00
Alexandru Matei	a782fd6da2	Use LOOP_CONFIGURE when creating loop devices LOOP_CONFIGURE is a new ioctl that is a lot faster than the LOOP_SET_FD+LOOP_SET_STATUS64 calls Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>	2023-10-16 13:02:12 +03:00
Wei Fu	3742f7f0db	idmapped: use pidfd to avoid pid reuse issue It's followup for #5890. The containerd-shim process depends on the mount package to init rootfs for container. For the container enable user namespace, the mount package needs to fork child process to get the brand-new user namespace. However, there are two reapers in one process (described by the following list) and there are race-condition cases. 1. mount package 2. sys.Reaper as global one which watch all the SIGCHLD. === [kill(2)][kill] the wrong process === Currently, we use pipe to ensure that child process is alive. However, the pipe file descriptor can be held by other process, which the child process cannot exit by self. We should use [kill(2)][kill] to ensure the child process. But we might kill the wrong process if the child process might be reaped by containerd-shim and the PID might be reused by other process. === [waitid(2)][waitid] on the wrong child process === ``` containerd-shim process: Goroutine 1(GetUsernsFD): Goroutine 2(Reaper) 1. Ready to wait for child process X 2. Received SIGCHLD from X 3. Reaped the zombie child process X (X has been reused by other child process) 4. Wait on process X The goroutine 1 will be stuck until the process X has been terminated. ``` === open `/proc/X/ns/user` on the wrong child process === There is also pid-reused risk between opening `/proc/$pid/ns/user` and writing `/proc/$pid/u[g]id_map`. ``` containerd-shim process: Goroutine 1(GetUsernsFD): Goroutine 2(Reaper) 1. Fork child process X 2. Write /proc/X/uid_map,gid_map 3. Received SIGCHLD from X 4. Reaped the zombie child process X (X has been reused by other process) 5. Open /proc/X/ns/user file as usernsFD The usernsFD links to the wrong X!!! ``` In order to fix the race-condition, we should use [CLONE_PIDFD][clone2] (Since Linux v5.2). When we fork child process `X`, the kernel will return a process file descriptor `X_PIDFD` referencing to child process `X`. With the pidfd, we can use [pidfd_send_signal(2)][pidfd_send_signal] (Since Linux v5.1) to send signal(0) to ensure the child process `X` is alive. If `X` has terminated and its PID has been recycled for another process, pidfd_send_signal fails with the error ESRCH. Therefore, we can open `/proc/X/{ns/user,uid_map,gid_map}` file descriptors as first and then use pidfd_send_signal to check the process is still alive. If so, we can ensure the file descriptors are valid and reference to the child process `X`. Even if the `X` PID has been reused after pidfd_send_signal call, the file descriptors are still valid. ```code X, pidfd = clone2(CLONE_PIDFD) usernsFD = open /proc/X/ns/user uidmapFD = open /proc/X/uid_map gidmapFD = open /proc/X/gid_map pidfd_send_signal pidfd, signal(0) return err if no such process == When we arrive here, we can ensure usernsFD/uidmapFD/gidmapFD are correct == even if X has been reused after pidfd_send_signal call. update uid/gid mapping by uidmapFD/gidmapFD return usernsFD ``` And the [waitid(2)][waitid] also supports pidfd type (Since Linux 5.4). We can use pidfd type waitid to ensure we are waiting for the correct process. All the PID related race-condition issues can be resolved by pidfd. ```bash ➜ mount git:(followup-idmapped) pwd /home/fuwei/go/src/github.com/containerd/containerd/mount ➜ mount git:(followup-idmapped) sudo go test -test.root -run TestGetUsernsFD -count=1000 -failfast -p 100 ./... PASS ok github.com/containerd/containerd/mount 3.446s ``` [kill]: <https://man7.org/linux/man-pages/man2/kill.2.html> [clone2]: <https://man7.org/linux/man-pages/man2/clone.2.html> [pidfd_send_signal]: <https://man7.org/linux/man-pages/man2/pidfd_send_signal.2.html> [waitid]: <https://man7.org/linux/man-pages/man2/waitid.2.html> Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-10-13 00:56:55 +08:00
Akihiro Suda	9ffb34ac49	Merge pull request #9054 from macOScontainers/canonicalize-filter-mount-path Fix usages of `mountinfo.PrefixFilter`	2023-09-27 05:10:27 +09:00
Derek McGowan	508aa3a1ef	Move to use github.com/containerd/log Add github.com/containerd/log to go.mod Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-09-22 07:53:23 -07:00
Marat Radchenko	d94a789d15	Fix usages of `mountinfo.PrefixFilter` It says: The prefix path must be absolute, have all symlinks resolved, and cleaned. But those requirements are violated in lots of places. What happens when it is given a non-canonicalized path is that `mountinfo.GetMounts` will not find mounts. The trivial case is: ``` $ mkdir a && ln -s a b && mkdir b/c b/d && mount --bind b/c b/d && cat /proc/mounts | grep -- '[ab]/d' /dev/sdd3 /home/user/a/d ext4 rw,noatime,discard 0 0 ``` We asked to bind-mount b/c to b/d, but ended up with mount in a/d. So, mount table always contains canonicalized mount points, and it is an error to look for non-canonicalized paths in it. Signed-off-by: Marat Radchenko <marat@slonopotamus.org>	2023-09-10 15:14:26 +03:00
Ilya Hanov	1555a31bf6	mount: support idmapped mount points This patch introduces idmapped mounts support for container rootfs. The idmapped mounts support was merged in Linux kernel 5.12 torvalds/linux@7d6beb7. This functionality allows to address chown overhead for containers that use user namespace. The changes are based on experimental patchset published by Mauricio Vásquez #4734. Current version reimplements support of idmapped mounts using Golang. Performance measurement results (idmapped mount vs. recursive chown): BusyBox 00.135 vs. 04.964; Ubuntu 00.171 vs. 15.713; Fedora 00.143 vs. 38.799. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Artem Kuzin <artem.kuzin@huawei.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com> Signed-off-by: Ilya Hanov <ilya.hanov@huawei-partners.com>	2023-09-05 01:23:30 +03:00
Akihiro Suda	98f27e1d9c	Revert "Add support for mounts on Darwin" This reverts commit `2799b28e61`. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2023-07-19 00:22:20 +09:00
Marat Radchenko	2799b28e61	Add support for mounts on Darwin Signed-off-by: Marat Radchenko <marat@slonopotamus.org>	2023-07-17 23:27:04 +03:00
Danny Canter	7ef133ad47	Fix mount pkg typo retired -> retried Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-07-10 01:45:17 -07:00
Danny Canter	55a8102ec1	mount: Add From/ToProto helpers Helpers to convert from containerd's [Mount] to its protobuf structure for [Mount] and vice-versa appear three times. It seems sane to just expose this facility in /mount. Signed-off-by: Danny Canter <danny@dcantah.dev>	2023-06-28 04:03:18 -07:00
Wei Fu	72b7d16505	mount: support direct-io for loopback device Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-06-15 23:51:46 +08:00
Craig Ingram	d2605de734	add handling of a '.' commondir and bounds checking to mount_linux Signed-off-by: Craig Ingram <Cjingram@google.com>	2023-05-30 21:13:16 +00:00
Cardy.Tang	a6cd5e3f4f	bugfix: resolve symlink when looking up mountpoint Signed-off-by: Cardy.Tang <zuniorone@gmail.com>	2023-05-22 11:03:51 +08:00
Gabriel Adrian Samfira	c9e5c33a18	UnmountAll is a no-op for missing mount points Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-04 12:59:52 -07:00
Gabriel Adrian Samfira	8538e7a2ac	Improve error messages and remove check * Improve error messages * remove a check for the existence of unmount target. We probably should not mask that the target was missing. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-04 12:07:34 -07:00
Gabriel Adrian Samfira	ba74cdf150	Make ReadOnly() available on all platforms Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-04 02:04:56 -07:00
Gabriel Adrian Samfira	1279ad880c	Remove bind code path in mount() Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-03 23:18:44 -07:00
Gabriel Adrian Samfira	6a5b4c9c24	Remove "bind" code path from diff Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-03 08:11:35 -07:00
Gabriel Adrian Samfira	d373ebc4de	Properly mount base layers As opposed to a writable layer derived from a base layer, the volume path of a base layer, once activated and prepared will not be a WCIFS volume, but the actual path on disk to the snapshot. We cannot directly mount this folder, as that would mean a client may gain access and potentially damage important metadata files that would render the layer unusable. For base layers we need to mount the Files folder which must exist in any valid base windows-layer. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-02 08:35:34 -07:00
Gabriel Adrian Samfira	7f82dd91f4	Add ReadOnly() function Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-04-01 08:43:14 -07:00
Gabriel Adrian Samfira	4012c1b853	Remove escalated privileges Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:17:35 -07:00
Gabriel Adrian Samfira	95687a9324	Fix go.mod, simplify boolean logic, add logging Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:16:56 -07:00
Gabriel Adrian Samfira	7a36efd75e	Ignore ERROR_NOT_FOUND error when removing mount Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:16:55 -07:00
Gabriel Adrian Samfira	db32798592	Update continuity, go-winio and hcsshim Update dependencies and remove the local bindfilter files. Those have been moved to go-winio. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:16:52 -07:00
Gabriel Adrian Samfira	00efd3e6d8	Remove unused function Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:15:19 -07:00
Gabriel Adrian Samfira	36dc2782c4	Use bind filter for mounts The bind filter supports bind-like mounts and volume mounts. It also allows us to have read-only mounts. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-03-31 06:15:18 -07:00
Paul "TBBle" Hampson	474a257b16	Implement Windows mounting for bind and windows-layer mounts Using symlinks for bind mounts means we are not protecting an RO-mounted layer against modification. Windows doesn't currently appear to offer a better approach though, as we cannot create arbitrary empty WCOW scratch layers at this time. For windows-layer mounts, Unmount does not have access to the mounts used to create it. So we store the relevant data in an Alternate Data Stream on the mountpoint in order to be able to Unmount later. Based on approach in https://github.com/containerd/containerd/pull/2366, with sign-offs recorded as 'Based-on-work-by' trailers below. This also partially-reverts some changes made in #6034 as they are not needed with this mounting implementation, which no longer needs to be handled specially by the caller compared to non-Windows mounts. Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com> Based-on-work-by: Michael Crosby <crosbymichael@gmail.com> Based-on-work-by: Darren Stahl <darst@microsoft.com>	2023-03-31 06:15:17 -07:00
Laura Brehm	daa3a7665e	Add `WithReadonlyTempMount` to create readonly temporary mounts This is necessary so we can mount snapshots more than once with overlayfs, otherwise mounts enter an unknown state. related: https://github.com/moby/buildkit/pull/1100 Signed-off-by: Laura Brehm <laurabrehm@hey.com> Co-authored-by: Zou Nengren <zouyee1989@gmail.com>	2023-03-17 15:51:18 +00:00
Akihiro Suda	d8b68e3ccc	Stop using math/rand.Read and rand.Seed (deprecated in Go 1.20) From golangci-lint: > SA1019: rand.Read has been deprecated since Go 1.20 because it >shouldn't be used: For almost all use cases, crypto/rand.Read is more >appropriate. (staticcheck) > SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative >has been available since Go 1.0: Programs that call Seed and then expect >a specific sequence of results from the global random source (using >functions such as Int) can be broken when a dependency changes how >much it consumes from the global random source. To avoid such breakages, >programs that need a specific result sequence should use >NewRand(NewSource(seed)) to obtain a random generator that other >packages cannot access. (staticcheck) See also: - https://pkg.go.dev/math/rand@go1.20#Read - https://pkg.go.dev/math/rand@go1.20#Seed Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2023-02-16 03:50:23 +09:00
Kohei Tokunaga	eeab052425	Make `mount.UnmountRecursive` compatible to `mount.UnmountAll` Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>	2023-01-31 22:07:44 +09:00
Edgar Lee	34d5878185	Use mount.Target to specify subdirectory of rootfs mount - Add Target to mount.Mount. - Add UnmountMounts to unmount a list of mounts in reverse order. - Add UnmountRecursive to unmount deepest mount first for a given target, using moby/sys/mountinfo. Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>	2023-01-27 09:51:58 +08:00
Maksym Pavlenko	3bc8fc4d30	Cleanup build constraints Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2022-12-08 09:36:20 -08:00
Brian Goff	a24ef09937	Replace mount fork hack with CLONE_FS This change spins up a new goroutine, locks it to a thread, then unshares CLONE_FS which allows us to `Chdir` from inside the thread without affecting the rest of the program. The thread is no longer usable after unshare so it leaves the thread locked to prevent go from returning the thread to the thread pool. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2022-11-03 22:30:35 +00:00
Phil Estes	455127859b	Merge pull request #7342 from tklauser/losetup-unix Use ioctl helpers from x/sys/unix	2022-08-30 12:32:20 -04:00
Tobias Klauser	3cc3d8a560	mount: use ioctl helpers from x/sys/unix Use the IoctlRetInt, IoctlSetInt and IoctlLoopSetStatus64 helper functions defined in the golang.org/x/sys/unix package instead of manually wrapping these using a locally defined ioctl function. Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2022-08-30 10:38:29 +02:00
Sebastiaan van Stijn	9ae2cc3a8a	mount: remove unused ErrNotImplementOnWindows This error was added in `c5843b7615`, but no longer used since `a5a9f91832`, which implemented Windows support. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-08-29 10:55:04 +02:00
Maksym Pavlenko	871b6b6a9f	Use testify Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>	2022-04-01 18:17:58 -07:00
Eng Zer Jun	18ec2761c0	test: use `T.TempDir` to create temporary test directory The directory created by `T.TempDir` is automatically removed when the test and all its subtests complete. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-03-15 14:03:50 +08:00
Shengjing Zhu	d28981d48e	Fix build with gccgo gccgo changes the mangling scheme `b483d0e0a2` The change is available in gcc-11, which is the least version that implements go1.16. Signed-off-by: Shengjing Zhu <zhsj@debian.org>	2022-02-22 02:31:58 +08:00
Derek McGowan	8816006d1e	Fix followup items from errors replacement Signed-off-by: Derek McGowan <derek@mcg.dev>	2022-01-07 12:16:00 -08:00
haoyun	bbe46b8c43	feat: replace github.com/pkg/errors to errors Signed-off-by: haoyun <yun.hao@daocloud.io> Co-authored-by: zounengren <zouyee1989@gmail.com>	2022-01-07 10:27:03 +08:00
haoyun	c0d07094be	feat: Errorf usage Signed-off-by: haoyun <yun.hao@daocloud.io>	2021-12-13 14:31:53 +08:00
Michael Crosby	7b8a697f28	Merge pull request #6034 from claudiubelu/windows/fixes-image-volume Fixes Windows containers with image volumes	2021-10-07 11:50:01 -04:00

128 Commits