Commit Graph

22 Commits

Author SHA1 Message Date
Rodrigo Campos
30f2893351 core/mount: Only remove dirs if unmount succeeded
The detached mount is less likely to fail in our case, but if we see any
failure to unmount, we should just skip the removal of directories.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2024-09-24 17:45:34 +02:00
Rodrigo Campos
f8d84ecf92 core/mount: Prevent accidental removal of rootfs files
Using os.RemoveAll() is quite risky, as if the unmount failed and we
can delete files from the container rootfs. In fact, we were doing just
that.

Let's use os.Remove() to make sure we only deleted empty dirs.

Big kudos to @mbaynton for reporting this issue with lot of details,
nailing it down to containerd lines of code and showing all the log
lines to understand the big picture.

Fixes: #10704

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2024-09-24 17:45:34 +02:00
Rodrigo Campos
004f3951d5 core/mount: Use MNT_DETACH for umount of tmp layers
Overlayfs needs to do an idmap mount of each layer and the cleanup
function just unmounts and deletes the directories. However, when the
resource is busy, the umount fails.

Let's make the unmount detached so the unmount will eventually be done
when it's not busy anymore. Also, making it detached solves the issues with
the unmount failing because it is busy.

Big kudos to @mbaynton for reporting this issue with lot of details,
nailing it down to containerd lines of code and showing all the log
lines to understand the big picture.

Fixes: #10704

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2024-09-24 17:34:52 +02:00
Akihiro Suda
cae19b14f3
Merge pull request #10658 from darwin-containers/reorganize-mount-unmount
Reorganize mount/unmount code so it is easier to add Darwin-specific implementation
2024-09-03 01:51:24 +00:00
Marat Radchenko
bfc1465a2c Reorganize mount/unmount code so it is easier to add Darwin-specific implementation
After these changes, in order to add Darwin bind-mount implementation, one only needs:
* Adjust HasBindMounts definition in mount.go
* Provide implementation in mount_darwin.go

There was no consensus on adding dependency on bindfs, that seems to be the only working solution for bind-mounts on Darwin as of today, in https://github.com/containerd/containerd/pull/8789, that's why the actual implementation is not added in current PR.

As a bonus, Linux FUSE-related code was moved to a separate file and possibly could be reused on FreeBSD, though this needs testing.

Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2024-08-30 15:25:06 +03:00
Wei Fu
3cd8f9734d core/mount: use ptrace instead of go:linkname
The Go runtime has started to [lock down future uses of linkname][1] since
go1.23. In the go source code, containerd project has been marked in the
comment, [hall of shame][2]. Well, the go:linkname is used to fork no-op
subprocess efficiently. However, since that comment, I would like to use
ptrace and remove go:linkname in the whole repository.

With go1.22 `go:linkname`:

```bash
$ go test -bench=.  -benchmem ./ -exec sudo
goos: linux
goarch: amd64
pkg: github.com/containerd/containerd/v2/core/mount
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkBatchRunGetUsernsFD_Concurrent1-16                 2440            533320 ns/op            1145 B/op         43 allocs/op
BenchmarkBatchRunGetUsernsFD_Concurrent10-16                 342           3661616 ns/op           11562 B/op        421 allocs/op
PASS
ok      github.com/containerd/containerd/v2/core/mount  2.983s
```

With go1.22 `ptrace`:

```bash
$ go test -bench=.  -benchmem ./ -exec sudo
goos: linux
goarch: amd64
pkg: github.com/containerd/containerd/v2/core/mount
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkBatchRunGetUsernsFD_Concurrent1-16                 1785            739557 ns/op            3948 B/op         68 allocs/op
BenchmarkBatchRunGetUsernsFD_Concurrent10-16                 328           4024300 ns/op           39601 B/op        671 allocs/op
PASS
ok      github.com/containerd/containerd/v2/core/mount  3.104s
```

With go1.23 `ptrace`:

```bash
$ go test -bench=.  -benchmem ./ -exec sudo
goos: linux
goarch: amd64
pkg: github.com/containerd/containerd/v2/core/mount
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkBatchRunGetUsernsFD_Concurrent1-16                 1815            723252 ns/op            4220 B/op         69 allocs/op
BenchmarkBatchRunGetUsernsFD_Concurrent10-16                 319           3957157 ns/op           42351 B/op        682 allocs/op
PASS
ok      github.com/containerd/containerd/v2/core/mount  3.051s
```

Diff:

The `ptrace` is slower than `go:linkname` mode. However, it's accepctable.

```
goos: linux
goarch: amd64
pkg: github.com/containerd/containerd/v2/core/mount
cpu: AMD Ryzen 7 5800H with Radeon Graphics
                                    │ go122-golinkname │             go122-ptrace              │             go123-ptrace              │
                                    │      sec/op      │    sec/op     vs base                 │    sec/op     vs base                 │
BatchRunGetUsernsFD_Concurrent1-16        533.3µ ± ∞ ¹   739.6µ ± ∞ ¹        ~ (p=1.000 n=1) ²   723.3µ ± ∞ ¹        ~ (p=1.000 n=1) ²
BatchRunGetUsernsFD_Concurrent10-16       3.662m ± ∞ ¹   4.024m ± ∞ ¹        ~ (p=1.000 n=1) ²   3.957m ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                   1.397m         1.725m        +23.45%                   1.692m        +21.06%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │ go122-golinkname │              go122-ptrace               │              go123-ptrace               │
                                    │       B/op       │     B/op       vs base                  │     B/op       vs base                  │
BatchRunGetUsernsFD_Concurrent1-16       1.118Ki ± ∞ ¹   3.855Ki ± ∞ ¹         ~ (p=1.000 n=1) ²   4.121Ki ± ∞ ¹         ~ (p=1.000 n=1) ²
BatchRunGetUsernsFD_Concurrent10-16      11.29Ki ± ∞ ¹   38.67Ki ± ∞ ¹         ~ (p=1.000 n=1) ²   41.36Ki ± ∞ ¹         ~ (p=1.000 n=1) ²
geomean                                  3.553Ki         12.21Ki        +243.65%                   13.06Ki        +267.43%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │ go122-golinkname │             go122-ptrace             │             go123-ptrace             │
                                    │    allocs/op     │  allocs/op   vs base                 │  allocs/op   vs base                 │
BatchRunGetUsernsFD_Concurrent1-16         43.00 ± ∞ ¹   68.00 ± ∞ ¹        ~ (p=1.000 n=1) ²   69.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
BatchRunGetUsernsFD_Concurrent10-16        421.0 ± ∞ ¹   671.0 ± ∞ ¹        ~ (p=1.000 n=1) ²   682.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                    134.5         213.6        +58.76%                   216.9        +61.23%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
```

[1]: <https://github.com/golang/go/issues/67401>
[2]: <https://github.com/golang/go/blob/release-branch.go1.23/src/runtime/proc.go#L4820>

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-08-26 21:19:50 +08:00
Wei Fu
bcdf507363 core/mount: add benchmark test for GetUsernsFD
```bash
$ go test -bench=.  -benchmem ./ -exec sudo
goos: linux
goarch: amd64
pkg: github.com/containerd/containerd/v2/core/mount
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkBatchRunGetUsernsFD_Concurrent1-16                 2398            532424 ns/op            1145 B/op         43 allocs/op
BenchmarkBatchRunGetUsernsFD_Concurrent10-16                 343           3701695 ns/op           11552 B/op        421 allocs/op
PASS
ok      github.com/containerd/containerd/v2/core/mount  2.978s
```

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-08-19 19:11:25 +08:00
Sebastiaan van Stijn
9776047243
migrate to github.com/moby/sys/userns
Commit 8437c567d8 migrated the use of the
userns package to the github.com/moby/sys/user module.

After further discussion with maintainers, it was decided to move the
userns package to a separate module, as it has no direct relation with
"user" operations (other than having "user" in its name).

This patch migrates our code to use the new module.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-08-08 12:48:54 +02:00
Sebastiaan van Stijn
8437c567d8
pkg/userns: deprecate and migrate to github.com/moby/sys/user/userns
The userns package in libcontainer was integrated into the moby/sys/user
module at commit [3778ae603c706494fd1e2c2faf83b406e38d687d][1].

This patch deprecates the containerd fork of that package, and adds it as
an alias for the moby/sys/user/userns package.

[1]: 3778ae603c

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-07-26 09:47:50 +02:00
Sebastiaan van Stijn
ed64e6503a
core/mount: remove logrus import
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-06-17 12:40:18 +02:00
Wei Fu
4123170a39 *: export RemoveVolatileOption for CRI image volumes
Remove volatile option when CRI prepares image volumes.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-05-30 09:56:37 +08:00
Swagat Bora
0597317759 Preserve CL_UNPRIVILEGED locked flags during remount of bind mounts
Signed-off-by: Swagat Bora <sbora@amazon.com>
2024-05-10 00:31:21 +00:00
Derek McGowan
2ac2b9c909
Make api a Go sub-module
Allow the api to stay at the same v1 go package name and keep using a
1.x version number. This indicates the API is still at 1.x and allows
sharing proto types with containerd 1.6 and 1.7 releases.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-05-02 11:03:00 -07:00
Danny Canter
b50e9eae43 Refactor spots to make use of sys.IgnoringEintr
This makes use of pkg/sys's IgnoringEintr function
to clean up some of the redundant eintr loops we
had laying around.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2024-04-10 11:24:01 -07:00
Danny Canter
ad584ebecb Replace direct waitid syscall with unix.Waitid
This also replaces the PPidFD constant with the definition in
x/sys/unix

Signed-off-by: Danny Canter <danny@dcantah.dev>
2024-04-10 05:52:43 -07:00
Alexandru Matei
ccec1e6e4c Remove internal LoopConfig struct
The struct is now part of golang.org/x/sys.
Follow-up for #9805

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-13 11:49:28 +02:00
Alexandru Matei
c2dfae8d05 go.mod: Bump golang.org/x/sys to v0.17.0
Replace internal LOOP_CONFIGURE ioctl implementation with
IoctlLoopConfigure from sys

Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
2024-02-12 15:55:33 +02:00
Akihiro Suda
eb8981f352
mv contrib/seccomp/kernelversion pkg/kernelversion
The package isn't really relevant to seccomp

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-01-24 19:03:53 +09:00
krglosse
cfe8321b4c strip-volatile-option-tmp-mounts
Signed-off-by: krglosse <krglosse@us.ibm.com>

do not alter original slice

Signed-off-by: krglosse <krglosse@us.ibm.com>

Update core/mount/temp.go

makes sense, thank you!

Co-authored-by: Derek McGowan <derek@mcg.dev>
Signed-off-by: KodieGlosserIBM <39170759+KodieGlosserIBM@users.noreply.github.com>

do not copy mount structure unless conditional is met and adding a test case for it

Signed-off-by: krglosse <krglosse@us.ibm.com>

copy option slice when removing the element instead of giving the element an empty string

remove unneeded block

Signed-off-by: krglosse <krglosse@us.ibm.com>

simplify

Signed-off-by: krglosse <krglosse@us.ibm.com>
2024-01-19 12:51:34 -06:00
Derek McGowan
4ee6419fad
Move pkg/randutil to internal/randutil
Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-01-17 09:57:10 -08:00
Derek McGowan
6be90158cd
Move sys to pkg/sys
Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-01-17 09:56:16 -08:00
Derek McGowan
6e5408dcec
Move mount to core/mount
Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-01-17 09:52:12 -08:00