Commit Graph

1860 Commits

Author SHA1 Message Date
Phil Estes
3f315fcabf
Merge pull request #9095 from thaJeztah/isolate_platform 2023-09-14 08:31:50 -04:00
Rodrigo Campos
2e13d39546 pkg/process: Only use idmap mounts if runc supports it
runc, as mandated by the runtime-spec, ignores unknown fields in the
config.json. This is unfortunate for cases where we _must_ enable that
feature or fail.

For example, if we want to start a container with user namespaces and
volumes, using the uidMappings/gidMappings field is needed so the
UID/GIDs in the volume don't end up with garbage. However, if we don't
fail when runc will ignore these fields (because they are unknown to
runc), we will just start a container without using the mappings, and the
UID/GIDs the container persists to volumes will be the host UID/GIDs, which
can change if the container is re-scheduled by Kubernetes.

This will end up in volumes having "garbage" and unmapped UIDs that the
container can no longer change. So, let's avoid this entirely by just
checking that runc supports idmap mounts if the container we are about
to create needs them.

Please note that the "runc features" subcommand is only run when we are
using idmap mounts. If idmap mounts are not used, the subcommand is not
run and therefore this should not affect containers that don't use idmap
mounts in any way.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:44:54 +02:00
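The check described in 2e13d39546 (fail early unless the runtime advertises idmap mount support) could be sketched roughly as below. This is a minimal illustration, not the containerd implementation: the `features` struct, the `supportsIDMapMounts` helper, and the JSON field layout (`linux.mountExtensions.idmap.enabled`) are assumptions about what `runc features` prints, and the sketch parses a sample document instead of executing runc.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// features mirrors only the part of the "runc features" JSON output this
// sketch needs. The field layout is an assumption, not taken from the commit.
type features struct {
	Linux *struct {
		MountExtensions *struct {
			IDMap *struct {
				Enabled *bool `json:"enabled"`
			} `json:"idmap"`
		} `json:"mountExtensions"`
	} `json:"linux"`
}

// supportsIDMapMounts (hypothetical helper) reports whether the features
// document explicitly advertises idmap mount support. Anything missing is
// treated as "not supported", so an older runtime is rejected, not trusted.
func supportsIDMapMounts(featuresJSON []byte) (bool, error) {
	var f features
	if err := json.Unmarshal(featuresJSON, &f); err != nil {
		return false, fmt.Errorf("parsing runtime features: %w", err)
	}
	if f.Linux == nil || f.Linux.MountExtensions == nil || f.Linux.MountExtensions.IDMap == nil {
		return false, nil
	}
	enabled := f.Linux.MountExtensions.IDMap.Enabled
	return enabled != nil && *enabled, nil
}

func main() {
	newer := []byte(`{"linux":{"mountExtensions":{"idmap":{"enabled":true}}}}`)
	older := []byte(`{"linux":{}}`)
	for _, doc := range [][]byte{newer, older} {
		ok, err := supportsIDMapMounts(doc)
		fmt.Println(ok, err)
	}
}
```

Treating a missing field as unsupported matches the intent of the commit message: a runtime that would silently ignore the mappings must cause a failure instead of a container with unmapped UIDs.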
Rodrigo Campos
a81f80884b Revert "cri: Throw an error if idmap mounts is requested"
This reverts commit 7e6ab84884.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:44:54 +02:00
Rodrigo Campos
ab5b43fe80 cri/sbserver: Pass down UID/GID mappings to OCI runtime
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:44:54 +02:00
Sebastiaan van Stijn
e916d77c81
platforms: move ToProto, FromProto to api/types
These utilities caused the platforms package to have the containerd
API as a dependency. As this package is used in many parts of the code, as
well as by external consumers, we should try to keep it light on dependencies,
with the potential to make it a standalone module.

These utilities were added in f3b7436b61,
which has not yet been included in a release, so skipping deprecation
and aliases for these.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-13 16:44:52 +02:00
Rodrigo Campos
e0b2b17de3 cri/server: Add tests for the linux-specific parts of VolumeMounts()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:42:31 +02:00
Rodrigo Campos
10cb112e4a cri/server: Add tests for ContainerMounts()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:42:31 +02:00
Rodrigo Campos
97dfa7f556 cri/server: Pass down uidMappings to OCI runtime
When the kubelet sends the uid/gid mappings for a mount, just pass them
down to the OCI runtime.

OCI runtimes support this since runc 1.2 and crun 1.8.1.

And whenever we add mounts (container mounts or image spec volumes) and
userns are requested by the kubelet, we use those mappings in the mounts
so the mounts are idmapped correctly. If no userns is used, we don't
send any mappings which just keeps the current behavior.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-09-13 16:42:31 +02:00
Marat Radchenko
d94a789d15 Fix usages of mountinfo.PrefixFilter
It says: The prefix path **must be absolute, have all symlinks resolved, and cleaned**. But those requirements are violated in lots of places.

What happens when it is given a non-canonicalized path is that `mountinfo.GetMounts` will not find mounts.

The trivial case is:
```
$ mkdir a && ln -s a b && mkdir b/c b/d && mount --bind b/c b/d && cat /proc/mounts | grep -- '[ab]/d'
/dev/sdd3 /home/user/a/d ext4 rw,noatime,discard 0 0
```
We asked to bind-mount b/c to b/d, but ended up with mount in a/d.
So, mount table always contains canonicalized mount points, and it is an error to look for non-canonicalized paths in it.

Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-09-10 15:14:26 +03:00
Sam Edwards
f77185f9e8 Fix "even if IPv4 comes first" test to have IPv4 first
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
2023-09-08 21:46:10 -06:00
Sam Edwards
88a849626f Don't use To16() != nil to detect IPv6 addresses
The ip.To16() function returns non-nil if `ip` is any kind
of IP address, including IPv4. To look for IPv6 specifically,
use ip.To4() == nil.

Signed-off-by: Sam Edwards <CFSworks@gmail.com>
2023-09-08 21:44:49 -06:00
Ethan Lowman
ac1d556b92
Add image verifier transfer service plugin system based on a binary directory
Signed-off-by: Ethan Lowman <ethan.lowman@datadoghq.com>
2023-09-07 18:45:02 -04:00
Maksym Pavlenko
c13f47a3ae
Merge pull request #9029 from dmcgowan/push-inherit-distribution-sources
push: inherit distribution sources from parent
2023-09-07 12:46:18 -07:00
Derek McGowan
b11439fc4b
Merge pull request #9034 from thaJeztah/replace_reference
replace reference/docker for github.com/distribution/reference v0.5.0
2023-09-05 06:52:29 -07:00
Akihiro Suda
e30a40eb65
Merge pull request #9016 from djdongjin/remove-most-logrus
Remove most logrus import
2023-09-05 16:09:12 +09:00
Fu Wei
e2bf34feaf
Merge pull request #9033 from dcantah/sberror-include-id
CRI: Include sandbox ID in failed to recover error
2023-09-02 10:48:34 +08:00
Sebastiaan van Stijn
5d31e93787
pkg/systemd: use sync.Once for systemd detection
This brings over the enhancement from a506630e57.

We don't expect the systemd state to change while containerd is running,
so we can use a `sync.Once` for this, to prevent stat'ing each time.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-01 12:14:56 +02:00
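The caching described in 5d31e93787 might look like the following sketch. The variable and function names are hypothetical, and the `/run/systemd/system` directory check is an assumption modeled on the check in github.com/coreos/go-systemd, not copied from the containerd source.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

var (
	runningSystemd    bool
	detectSystemdOnce sync.Once
)

// isRunningSystemd (hypothetical name) stats the systemd runtime directory
// at most once per process and returns the cached answer afterwards.
func isRunningSystemd() bool {
	detectSystemdOnce.Do(func() {
		fi, err := os.Lstat("/run/systemd/system")
		runningSystemd = err == nil && fi.IsDir()
	})
	return runningSystemd
}

func main() {
	// Only the first call touches the filesystem.
	fmt.Println(isRunningSystemd())
	fmt.Println(isRunningSystemd())
}
```

The trade-off is the one the commit message states: the result is fixed for the lifetime of the process, which is acceptable only because the systemd state is not expected to change while the daemon runs.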
Sebastiaan van Stijn
7d0ab4fc2c
remove uses of github.com/runc/libcontainer/cgroups
runc considers libcontainer to be "unstable" (not for external use),
so we try not to use it. Commit ed47d6ba76
brought back the dependency on other parts of libcontainer, but looks to
be only depending on a single utility, which in itself was borrowed from
github.com/coreos/go-systemd to not introduce CGO code in the same package.

This patch copies the version from github.com/coreos/go-systemd (adding
proper attribution, although the function is pretty trivial).

runc is in process of moving the libcontainer/user package to an external
module, which means we can remove the dependency on libcontainer entirely
in the near future. There is one more use of `libcontainer` in our vendor
tree; it looks like CDI is depending on one utility (devices.DeviceFromPath);
a943033a8b/vendor/github.com/container-orchestrated-devices/container-device-interface/pkg/cdi/container-edits_unix.go (L38)

We should remove the dependency on that utility, and add a CI check to
prevent bringing it back.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-09-01 12:10:55 +02:00
Derek McGowan
24aca53fa0
Update use of content.Infoprovider
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-08-31 11:04:33 -07:00
Danny Canter
a2817ca16d CRI: Include sandbox ID in failed to load error
The failed to recover state message didn't include the ID, making it
less useful than it could be.

This additionally moves some of the other logs to include the id for
the sandbox/container as a field instead of part of a format string.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-08-31 10:07:07 -07:00
Sebastiaan van Stijn
4923470902
replace reference/docker for github.com/distribution/reference v0.5.0
The reference/docker package was a fork of github.com/distribution/distribution,
which could not easily be used as a direct dependency, as it brought many other
dependencies with it.

The "reference" package has now moved to a separate repository, which means
we can replace the local fork, and use the upstream implementation again.

The new module was extracted from the distribution repository at commit:
b9b19409cf

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-31 15:54:50 +02:00
Edgar Lee
779875a057 Add missing unpacker.Wait for image import
- For remote snapshotters, the unpack phase serves as an important step for
  preparing the remote snapshot. With the missing unpacker.Wait, the
  snapshotter `Prepare` context is always canceled.
- This patch allows remote snapshotter based archives to be imported via
  the transfer service or `ctr image import`

Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
2023-08-29 15:34:20 -07:00
Jin Dong
fc45365fa1 Remove most logrus
Signed-off-by: Jin Dong <jin.dong@databricks.com>
2023-08-26 14:31:53 -04:00
Akihiro Suda
f48bbef193
Merge pull request #8994 from mxpv/cri
Use sandboxed CRI by default
2023-08-24 13:42:58 +09:00
Phil Estes
8e7a25856b
Merge pull request #8998 from dmcgowan/image-inspect
ctr: images inspect
2023-08-23 14:12:56 -04:00
Maksym Pavlenko
c3f3cad287
Use sandboxed CRI by default
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-08-23 08:50:40 -07:00
Sebastiaan van Stijn
b76cd4d9fd
replace some fmt.Sprintfs with strconv
Teeny-tiny optimizations:

    BenchmarkSprintf-10       37735996    32.31  ns/op  0 B/op  0 allocs/op
    BenchmarkItoa-10         591945836     2.031 ns/op  0 B/op  0 allocs/op
    BenchmarkFormatUint-10   593701444     2.014 ns/op  0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-23 16:43:02 +02:00
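The kind of replacement benchmarked in b76cd4d9fd is sketched below. The function names are hypothetical and the sketch only demonstrates that the outputs are identical; it does not reproduce the benchmark numbers.

```go
package main

import (
	"fmt"
	"strconv"
)

// withSprintf goes through fmt's format parsing and reflection.
func withSprintf(n int) string { return fmt.Sprintf("%d", n) }

// withItoa converts a signed int directly.
func withItoa(n int) string { return strconv.Itoa(n) }

// withFormatUint converts an unsigned value directly, in base 10.
func withFormatUint(n uint64) string { return strconv.FormatUint(n, 10) }

func main() {
	fmt.Println(withSprintf(8080), withItoa(8080), withFormatUint(8080))
}
```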
Derek McGowan
78308b4a44
Add manifest printer library
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-08-23 06:52:35 -07:00
Sebastiaan van Stijn
d7bc8694be
pkg/cri: replace some fmt.Sprintfs with strconv
Teeny-tiny optimizations:

    BenchmarkSprintf-10       37735996    32.31  ns/op  0 B/op  0 allocs/op
    BenchmarkItoa-10         591945836     2.031 ns/op  0 B/op  0 allocs/op
    BenchmarkFormatUint-10   593701444     2.014 ns/op  0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-23 10:10:56 +02:00
Fu Wei
738c153573
Merge pull request #8992 from djdongjin/remove-hashicorp-multierror
Remove hashicorp/go-multierror dependency
2023-08-23 13:13:51 +08:00
Derek McGowan
2bac6ffb79
Merge pull request #8663 from helen-frank/feature/MergeSortedStringSlices
MergeStringSlices use sets
2023-08-22 16:31:28 -07:00
Phil Estes
de066a37dc
Merge pull request #8935 from lengrongfu/feat/add-metrics-for-dropped-events
add metrics for discarding events
2023-08-22 09:09:31 -04:00
Fu Wei
3ffde050a4
Merge pull request #8988 from kinvolk/rata/userns-fix-platform
cri: Fix sandbox_mode "shim"
2023-08-22 16:40:34 +08:00
Derek McGowan
b8f32e863c
Merge pull request #8951 from kiashok/exposeCommitMemWindows
Populate commit memory for windows memory usage stats
2023-08-21 15:42:34 -07:00
Jin Dong
cd8c8ae4bc Remove hashicorp/go-multierror
Signed-off-by: Jin Dong <jin.dong@databricks.com>
2023-08-20 17:59:45 -07:00
Rodrigo Campos
d09f7cbe00 cri: Fix sandbox_mode "shim"
This is a partial revert of "cri/sbserver: Use platform instead of GOOS
for userns detection".

While what that commit did is 100% the right thing to do, when the
sandbox_mode is "shim" all controller.XXX() calls are RPCs and the
controller.Create() call initializes the controller. Therefore, things
like "getSandboxController()" don't work in the case of "shim"
sandbox_mode until after the controller.Create().

Due to this asymmetry and the lack of tests for shim mode, we didn't
catch it before.

This patch just reverts that commit so that the Create() and
getSandboxController() calls remain where they were, and just relies on
the config Linux section as a hack to detect if the pod sandbox will use
user namespaces or not.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-08-18 15:13:10 +02:00
Enrico Weigelt, metux IT consult
0c1ad52eac cri: spec_linux: drop unused retvals
cgroupv1HasHugetlb() and cgroupv2HasHugetlb() may return errors, but nobody
(there's just one call site anyways) ever cares. So drop the unnecessary code.

Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
2023-08-17 18:52:37 +02:00
Fu Wei
ba852faf41
Merge pull request #8954 from fuweid/fix-shim-leak 2023-08-17 08:16:20 +08:00
Kirtana Ashok
e2ce4f58f6 Populate commit memory for windows memory usage stats
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2023-08-15 16:48:22 -07:00
Kirtana Ashok
823e0420eb Fix transfer service dependencies:
- Fill OSVersion field of ocispec.Platform for windows OS in
transfer service plugin init()
- Do not return error from transfer service ReceiveStream if
stream.Recv() returned context.Canceled error

Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2023-08-15 15:32:51 -07:00
Wei Fu
8dcb2a6e6d pkg/cri/sbserver: fix leaked shim issue for podsandbox mode
Fixes: #7496 #8931

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-11 17:43:51 +08:00
Wei Fu
72bc63d83d pkg/cri/server: fix leaked shim issue
Fixes: #7496 #8931

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-11 17:43:51 +08:00
Kirtana Ashok
a645ff2e68 Update dependencies after protobuf update in hcsshim
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
(cherry picked from commit d129b6f890bceb56b050bbb23ad330bb5699f78c)
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2023-08-09 11:56:45 -07:00
rongfu.leng
54baf766e5 add metrics for discarding events
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-08-09 09:56:26 +08:00
Fu Wei
2b2195c36b
Merge pull request #8722 from marquiz/devel/cgroup-driver-autoconfig
cri: implement RuntimeConfig rpc
2023-08-04 16:09:34 +08:00
Rodrigo Campos
c80a3ecafd cri/sbserver: Use platform instead of GOOS for userns detection
In the sbserver we should not use the GOOS, as windows hosts can run
linux containers. On the sbserver we should use the platform param.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-08-02 12:32:05 +02:00
Rodrigo Campos
2d64ab8d79 cri: Don't use rel path for image volumes
Runc 1.1 throws a warning when using rel destination paths, and runc 1.2
is planning to throw an error (i.e. won't start the container).

Let's just make this an abs path in the only place it might not be: the
mounts created due to `VOLUME` directives in the Dockerfile.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-31 12:33:54 +02:00
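One way to force an absolute destination, as 2d64ab8d79 describes for `VOLUME` paths, is sketched below. `absContainerPath` is a hypothetical helper and not the function used in containerd; it relies on the `path` package because container destinations use forward slashes regardless of the host OS.

```go
package main

import (
	"fmt"
	"path"
)

// absContainerPath (hypothetical helper) anchors a relative container
// destination at the root and cleans the result.
func absContainerPath(dst string) string {
	if path.IsAbs(dst) {
		return path.Clean(dst)
	}
	return path.Join("/", dst)
}

func main() {
	for _, dst := range []string{"data", "./cache/tmp", "/var/lib/app"} {
		fmt.Println(dst, "->", absContainerPath(dst))
	}
}
```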
Markus Lehtonen
ed47d6ba76 cri: implement RuntimeConfig rpc
The rpc only reports one field, i.e. the cgroup driver, to kubelet.
Containerd determines the effective cgroup driver by looking at all
runtime handlers, starting from the default runtime handler (the rest in
alphabetical order), and returning the cgroup driver setting of the
first runtime handler that supports one. If no runtime handler supports
cgroup driver (i.e. has a config option for it) containerd falls back to
auto-detection, returning systemd if systemd is running and cgroupfs
otherwise.

This patch implements the CRI server side of Kubernetes KEP-4033:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4033-group-driver-detection-over-cri

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-07-28 13:50:43 +03:00
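The selection order described in ed47d6ba76 (default handler first, the rest alphabetically, then auto-detection) can be sketched as follows. The `runtimeHandler` type, its `systemdCgroup` field, and `effectiveCgroupDriver` are all hypothetical names for illustration; the real configuration types differ.

```go
package main

import (
	"fmt"
	"sort"
)

// runtimeHandler is a hypothetical stand-in for a runtime handler config.
// systemdCgroup is nil when the handler has no cgroup driver option.
type runtimeHandler struct {
	systemdCgroup *bool
}

// effectiveCgroupDriver walks the default handler first, then the others in
// alphabetical order, and returns the setting of the first handler that has
// one. With no match it falls back to auto-detection.
func effectiveCgroupDriver(handlers map[string]runtimeHandler, defaultName string, systemdRunning bool) string {
	names := make([]string, 0, len(handlers))
	for name := range handlers {
		if name != defaultName {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	if _, ok := handlers[defaultName]; ok {
		names = append([]string{defaultName}, names...)
	}
	for _, name := range names {
		if v := handlers[name].systemdCgroup; v != nil {
			if *v {
				return "systemd"
			}
			return "cgroupfs"
		}
	}
	if systemdRunning {
		return "systemd"
	}
	return "cgroupfs"
}

func main() {
	yes, no := true, false
	handlers := map[string]runtimeHandler{
		"runc": {},                    // default handler, no cgroup option
		"kata": {systemdCgroup: &no},  // would say cgroupfs
		"crun": {systemdCgroup: &yes}, // sorts before kata, so it wins
	}
	fmt.Println(effectiveCgroupDriver(handlers, "runc", false))
}
```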
Phil Estes
81895d22c9
Merge pull request #8867 from Iceber/pinned_image_label
cri: fix using the labels to pin image
2023-07-27 09:51:23 -04:00
Iceber Gu
7f7ba31b64 cri: fix using the pinned label to pin image
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-07-26 12:26:00 +08:00
Akihiro Suda
4807571352
pkg/epoch: fix Y2038 on 32-bit hosts
`strconv.Itoa(int(tm.Unix()))` truncates the time to a 32-bit int on 32-bit hosts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-26 13:17:39 +09:00
Qasim Sarfraz
06f18c69d2 cri: memory.memsw.limit_in_bytes: no such file or directory
If the kubelet passes a swap limit (by default, memory limit = swap limit),
it is configured for the container irrespective of whether the node supports swap.

Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2023-07-21 14:43:33 +02:00
Akihiro Suda
98f27e1d9c
Revert "Add support for mounts on Darwin"
This reverts commit 2799b28e61.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-19 00:22:20 +09:00
Kazuyoshi Kato
ef1c9f0a63
Merge pull request #8766 from lengrongfu/fix/ci-Integration-fail
fix ci Linux Integration test fail
2023-07-18 10:18:12 -07:00
Kazuyoshi Kato
e5a49e6ceb
Merge pull request #8789 from slonopotamus/macos-bind-mount
Add support for bind-mounts on Darwin (a.k.a. "make native snapshotter work")
2023-07-18 10:16:10 -07:00
Marat Radchenko
2799b28e61 Add support for mounts on Darwin
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-17 23:27:04 +03:00
Phil Estes
a94918b591
Merge pull request #8803 from kinvolk/rata/userns-sbserver
cri/sbserver: Add support for user namespaces (KEP-127)
2023-07-17 10:57:01 -04:00
Sebastiaan van Stijn
9c673f9673
pkg/cri/server: TestImageGetLabels: use registry.k8s.io
These are not actually being pulled, just removing the deprecated k8s.gcr.io
from the code-base. While at it, also renamed / removed vars that shadowed
package-level definitions.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-14 11:22:39 +02:00
Mike Brown
3ed1bc108f
Merge pull request #8671 from jsturtevant/fix-windows-edge-cases
[cri] Handle pod transition states gracefully while listing pod stats
2023-07-12 15:43:21 -05:00
James Sturtevant
f914edf4f6
[cri] Handle Windows pod transitions gracefully
When the pods are transitioning there are several
cases where containers might not be in valid state.
There were several cases where the stats were
failing hard, but we should just continue on as
they are transient and will be picked up again
when kubelet queries for the stats again.

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2023-07-12 09:57:14 -07:00
Rodrigo Campos
9160386ecc cri/sbserver: Test net.ipv4.ping_group_range works with userns
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:25 +02:00
Rodrigo Campos
1c6e268447 cri/sbserver: Fix net.ipv4.ping_group_range with userns
This commit just updates the sbserver with the same fix we did on main:
	9bf5aeca77 ("cri: Fix net.ipv4.ping_group_range with userns ")

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:25 +02:00
Rodrigo Campos
36a96d7f32 cri/sbserver: Remap snapshots for sbserver too
This is a port of 31a6449734 ("Add capability for snapshotters to
declare support for UID remapping") to sbserver.

This patch remaps the rootfs in the platform-specific code if user namespaces
are in use, so the pod can read/write to the rootfs.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:22 +02:00
Rodrigo Campos
508e6f6e03 cri/sbserver: Add userns tests to TestLinuxSandboxContainerSpec()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
fb9ce5d482 cri/sbserver: Support pods with user namespaces
This patch requests the OCI runtime to create a userns when the CRI
message includes such request.

This is an adaptation of a7adeb6976 ("cri: Support pods with user
namespaces") to sbserver, although the container_create.go parts were
already ported as part of 40be96efa9 ("Have separate spec builder for
each platform"),

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
c99cb95f07 cri/sbserver: Let OCI runtime create netns when userns is used
This commit just ports 36f520dc04 ("Let OCI runtime create netns when
userns is used") to sbserver.

The CNI network setup is done after OCI start, as it didn't seem simple
to get the sandbox PID we need for the netns otherwise.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
73c75e2c73 cri/sbserver: Copy userns helpers to podsandbox
Currently there is a big c&p of the helpers between these two folders
and a TODO in the platform agnostic file to organize them in the future,
when some other things settle.

So, let's just copy them for now.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:12 +02:00
Rodrigo Campos
0b6a0fe773 cri/sbserver: Move runtimeStart to match position with cri/server
Commit c085fac1e5 ("Move sandbox start behind controller") moved the
runtimeStart to only account for time _after_ the netns has been
created.

To match what we currently do in cri/server, let's move it to just after
we get the sandbox runtime.

This came up when porting userns to sbserver, as the CNI network setup
needs to be done at a later stage and runtimeStart was accounting for
the CNI network setup time only when userns is enabled.

To avoid that discrepancy, let's just move it earlier, that also matches
what we do in cri/server.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 13:58:45 +02:00
Rodrigo Campos
9d9903565a cri: Fix comment typos
Beside the "in future the when" typo, we take the chance to reflect that
user namespaces are already merged.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 13:58:45 +02:00
wangxiang
232538b768 bugfix(port-forward): Correctly handle known errors
These two errors can occur in the following scenarios:

ECONNRESET: the target process reset connection between CRI and itself.
see: #111825 for detail

EPIPE: the target process did not read the received data, causing the
buffer in the kernel to be full, resulting in the occurrence of Zero Window,
then closing the connection (FIN, RESET)
see: #74551 for detail

In both cases, we should RESET the httpStream.

Signed-off-by: wangxiang <scottwangsxll@gmail.com>
2023-07-11 11:06:13 +08:00
rongfu.leng
38f9bc3e0a fix ci Linux Integration test fail
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-07-07 14:51:04 +08:00
Rodrigo Campos
c17d3bdb54 pkg/cri/server: Test net.ipv4.ping_group_range works with userns
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-06 14:20:26 +02:00
Rodrigo Campos
9bf5aeca77 pkg/cri/server: Fix net.ipv4.ping_group_range with userns
userns.RunningInUserNS() checks if the code calling that function is
running inside a user namespace. But we need to check if the container
we will create will use a user namespace, in that case we need to
disable the sysctl too (or we would need to take the userns mapping into
account to set the IDs).

This was added in PR:
        https://github.com/containerd/containerd/pull/6170/

And the param documentation says it is not enabled when user namespaces
are in use:
        https://github.com/containerd/containerd/pull/6170/files#diff-91d0a4c61f6d3523b5a19717d1b40b5fffd7e392d8fe22aed7c905fe195b8902R118

I'm not sure if the intention was to disable this if containerd is
running inside a userns (rootless, if that is even supported) or just
when the pod has user namespaces.

Out of an abundance of caution, I'm keeping the userns.RunningInUserNS()
so it is still not used if containerd runs inside a user namespace.

With this patch and "enable_unprivileged_icmp = true" in the config,
running containerd as root on the host, pods with user namespaces start
just fine. Without this patch they fail with:
        ... failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: w
 /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Thanks a lot to Andy on the k8s slack for reporting the issue. He also
mentions he hits this with k3s on a default installation (the param
is off by default on containerd, but k3s turns that on by default it
seems). He also debugged which part of the stack was setting that
sysctl, found the PR that added this code in containerd and a workaround
(to turn the bool off).

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-06 14:20:26 +02:00
Kazuyoshi Kato
099d2e7c76
Merge pull request #8757 from dcantah/proto-api-conversions
Add From/ToProto helpers
2023-07-03 10:59:08 -07:00
Danny Canter
f3b7436b61 Platforms: Add From/ToProto helpers for types
Helpers to convert from a slice of platforms to our protobuf representation
and vice-versa appear a couple times. It seems sane to just expose this facility
in the platforms pkg.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-28 19:54:56 -07:00
Kazuyoshi Kato
81bc6ce6e9
Merge pull request #8740 from djdongjin/platform-parseall
Add a platform.ParseAll helper
2023-06-28 08:01:12 -07:00
Jin Dong
0a92661e69 Add a platform.ParseAll helper
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2023-06-26 20:34:37 +00:00
Kazuyoshi Kato
9b4ed8acc2
Merge pull request #8696 from fuweid/deflaky-blockfile
chore: deflake the blockfile testsuite
2023-06-26 09:54:33 -07:00
Phil Estes
1a5eaa9ad0
Merge pull request #8732 from thaJeztah/epoch_export_parse
pkg/epoch: extract parsing SOURCE_DATE_EPOCH to a function
2023-06-23 17:06:21 -04:00
helen
e89d7204eb MergeStringSlices use sets
Signed-off-by: helen <haitao.zhang@daocloud.io>
2023-06-24 03:04:24 +08:00
Sebastiaan van Stijn
8760b87174
pkg/epoch: extract parsing SOURCE_DATE_EPOCH to a function
This introduces a ParseSourceDateEpoch function, which can be used
to parse "SOURCE_DATE_EPOCH" values for situations where those
values are not passed through an env-var (or the env-var has been
read through other means).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 17:32:02 +02:00
Sebastiaan van Stijn
9924e56f42
pkg/epoch: fix tests on macOS
These tests were failing on my macOS; could be the precision issue (like on
Windows), or just because they're "too fast".

    === RUN   TestSourceDateEpoch/WithoutSourceDateEpoch
        epoch_test.go:51:
                Error Trace:	/Users/thajeztah/go/src/github.com/containerd/containerd/pkg/epoch/epoch_test.go:51
                Error:      	Should be true
                Test:       	TestSourceDateEpoch/WithoutSourceDateEpoch
                Messages:   	now: 2023-06-23 11:47:09.93118 +0000 UTC, v: 2023-06-23 11:47:09.93118 +0000 UTC

This patch:

- updates the rightAfter utility to allow the timestamps to be "equal"
- updates the asserts to provide some details about the timestamps
- uses UTC for the value we're comparing to, to match the timestamps
  that are generated.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 17:29:55 +02:00
Danny Canter
dfd7ad8b37 Reword Windows file related TODO
https://github.com/golang/go/issues/32088 was never accepted or implemented
in 1.14.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-23 05:42:44 -07:00
Sebastiaan van Stijn
44e2b26a87
pkg/epoch: replace some fmt.Sprintfs with strconv
Teeny-tiny optimizations:

    BenchmarkSprintf-10       37735996    32.31  ns/op  0 B/op  0 allocs/op
    BenchmarkItoa-10         591945836     2.031 ns/op  0 B/op  0 allocs/op
    BenchmarkFormatUint-10   593701444     2.014 ns/op  0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 13:10:58 +02:00
Markus Lehtonen
f60a4a2718 cri: drop unused arg from generateRuntimeOptions
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-06-19 16:11:36 +03:00
Wei Fu
6dfb16f99a snapshots|pkg: umount without DETACH and nosync after umount
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-06-15 23:53:47 +08:00
Danny Canter
d278d37caa Sandbox: Add Metrics rpc for controller
As a follow up change to adding a SandboxMetrics rpc to the core
sandbox service, the controller needed a corresponding rpc for CRI
and others to eventually implement.

This leaves the CRI (non-shim mode) controller unimplemented just to
have a change with the API addition to start.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-13 00:24:09 -07:00
Derek McGowan
dd5e9f6538
Merge pull request #7944 from adisky/new-pinned-image
CRI Pinned image support
2023-06-10 22:29:34 -07:00
Derek McGowan
98b7dfb870
Merge pull request #8673 from thaJeztah/no_any
avoid "any" as variable name
2023-06-10 20:44:30 -07:00
Sebastiaan van Stijn
4bb709c018
avoid "any" as variable name
Avoid shadowing / confusion with Go's "any" built-in type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-10 13:49:06 +02:00
Sebastiaan van Stijn
577696f608
replace some basic uses of fmt.Sprintf()
Really tiny gains here, and doesn't significantly impact readability:

    BenchmarkSprintf
    BenchmarkSprintf-10    11528700     91.59 ns/op   32 B/op  1 allocs/op
    BenchmarkConcat
    BenchmarkConcat-10    100000000     11.76 ns/op    0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-10 13:24:43 +02:00
Kazuyoshi Kato
326cd0623e
Merge pull request #8362 from gabriel-samfira/fix-non-c-volume
Fix non C volumes on Windows
2023-06-08 21:07:23 -07:00
Gabriel Adrian Samfira
6dd529e400
Pass in imagespec.Platform to WithVolumes()
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-08 12:31:04 +03:00
Derek McGowan
b03103152a
Merge pull request #8652 from hangscer8/release_ticker_correctly
fix release `ticker` correctly in `HandleProgress`
2023-06-07 10:47:00 -07:00
Phil Estes
0a821b968c
Merge pull request #8633 from jsturtevant/fix-for-init-containers-windows-pod-stats
[CRI] Windows Pod Stats: Add a check to skip stats for containers that are not running.
2023-06-07 13:29:27 -04:00
hang.jiang
d18026592f release ticker correctly
Signed-off-by: hang.jiang <hang.jiang@daocloud.io>
2023-06-07 11:45:38 +08:00
James Sturtevant
28a5199ff6
Add a check to skip stats for containers that are not running
When a container is in the created or exited state, it will not have stats. A common case for this in k8s is the init containers for a pod. They will be present in the listed containers but will not have a running task, and therefore no stats.

Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-06-06 12:59:56 -07:00
Akihiro Suda
1f54e8fb21
Merge pull request #8637 from AkihiroSuda/followup-8606
RELEASES.md: de-deprecation of CNI conf_template will be v1.7.3
2023-06-06 17:19:41 +09:00
Samuel Karp
f92e576f6b
Merge pull request #8609 from samuelkarp/issue-8607 2023-06-05 10:31:45 -07:00
Akihiro Suda
69b451af5a
RELEASES.md: de-deprecation of CNI conf_template will be v1.7.3
Cherry-pick of PR 8606 missed the v1.7.2 milestone

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-06-03 17:04:14 +09:00
Samuel Karp
3c4a1ab1cb
cri: write generated CNI config atomically on Unix
The 10-containerd-net.conflist file generated from the conf_template
should be written atomically so that partial writes are not visible to
CNI plugins. Use the new consistentfile package to ensure this on
Unix-like platforms such as Linux, FreeBSD, and Darwin.

Fixes https://github.com/containerd/containerd/issues/8607

Signed-off-by: Samuel Karp <samuelkarp@google.com>
2023-06-02 16:56:34 -07:00
Samuel Karp
f3ba7c8a35
atomicfile: new package for atomic file writes
Certain files may need to be written atomically so that partial writes
are not visible to other processes. On Unix-like platforms such as
Linux, FreeBSD, and Darwin, this is accomplished by writing a temporary
file, syncing, and renaming over the destination file name. On Windows,
the same operations are performed, but Windows does not guarantee that a
rename operation is atomic.

Partial/inconsistent reads can occur due to:
1. A process attempting to read the file while containerd is writing it
   (both in the case of a new file with a short/incomplete write or in
   the case of an existing, updated file where new bytes may be written
   at the beginning but old bytes may still be present after).
2. Concurrent goroutines in containerd leading to multiple active
   writers of the same file.

The above mechanism explicitly protects against (1) as all writes are to
a file with a temporary name.

There is no explicit protection against multiple, concurrent goroutines
attempting to write the same file. However, atomically writing the file
should mean only one writer will "win" and a consistent file will be
visible.

Signed-off-by: Samuel Karp <samuelkarp@google.com>
2023-06-02 16:56:33 -07:00
hang.jiang
28d8c79de7 Replace atomicBool with the standard library atomic.Bool
Signed-off-by: hang.jiang <hang.jiang@daocloud.io>
2023-06-02 14:02:55 +08:00
Aditi Sharma
fe4f8bd884 Pinned image support
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2023-06-02 09:57:22 +05:30
James Sturtevant
738c4c6fa5
Fix issue for HPC pod metrics
The initial PR had a check for nil metrics, but after some refactoring in the PR, the test case that was supposed to cover HPC was missing a scenario where the metric was not nil but didn't contain any metrics. This fixes that case and adds a test case to cover it.

Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-06-01 15:12:36 -07:00
Kazuyoshi Kato
73645b1dfe
Merge pull request #8588 from lengrongfu/feat/cleanup_config_tls
Cleanup DEPRECATED TLS config
2023-05-31 18:50:54 -07:00
Kazuyoshi Kato
3ad032e9d0
Merge pull request #8606 from adisky/remove-conf-template-deprecation
Remove cni conf_template deprecation
2023-05-31 09:47:21 -07:00
Evan Lezar
d3887b2e62 Support CDI devices in ctr --device flag
This change adds support for CDI devices to the ctr --device flag.
If a fully-qualified CDI device name is specified, this is injected
into the OCI specification before creating the container.

Note that the CDI specifications and the devices that they represent
are local and mirror the behaviour of linux devices in the ctr command.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-05-31 16:14:01 +02:00
Phil Estes
80eb76332e
Merge pull request #8602 from mxpv/sbevents
Publish sandbox events
2023-05-31 09:14:08 -04:00
Akihiro Suda
65bca439a9
Merge pull request #8599 from lengrongfu/doc/update-auths-code-comment
update auths code comment
2023-05-31 22:13:54 +09:00
Aditi Sharma
3ca5b4437e Remove cni conf_template deprecation
As discussed in the issue
https://github.com/containerd/containerd/issues/8596
It is a helpful feature at many places and no replacement
readily available

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2023-05-31 17:34:33 +05:30
rongfu.leng
d2b7a1e293 cleanup DEPRECATED TLS config
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-05-31 09:37:41 +08:00
Maksym Pavlenko
f857626d64 Move PLEG event back to CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-30 16:40:58 -07:00
Maksym Pavlenko
fc50334ca9 Generate sandbox exit events from CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-30 16:40:58 -07:00
Maksym Pavlenko
cf56054594 Move pod sandbox recovery to podsandbox/ package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-30 13:54:35 -07:00
Tianon Gravi
21b3318ebe Fix several conversions of "ocispec.Image" to "ocispec.Platform"
Several bits of code unmarshal image config JSON into an `ocispec.Image`, and then immediately create an `ocispec.Platform` out of it, but then discard the original image *and* miss several potential platform fields (most notably, `variant`).

Because `ocispec.Platform` is a strict subset of `ocispec.Image`, most of these can be updated to simply unmarshal the image config directly to `ocispec.Platform` instead, which allows these additional fields to be picked up appropriately.
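
A minimal sketch of the difference, using hypothetical stand-in structs
rather than the real `ocispec` types (the `platform` and `image` types and
both function names below are assumptions, reduced to a few fields):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform is a hypothetical stand-in for ocispec.Platform.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
	Variant      string `json:"variant,omitempty"`
}

// image is a hypothetical stand-in for ocispec.Image.
type image struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
	Variant      string `json:"variant,omitempty"`
	Author       string `json:"author,omitempty"`
}

// lossyPlatform mimics the problematic pattern: it unmarshals the whole
// image config and then copies only some platform fields by hand.
func lossyPlatform(config []byte) (platform, error) {
	var img image
	if err := json.Unmarshal(config, &img); err != nil {
		return platform{}, err
	}
	return platform{Architecture: img.Architecture, OS: img.OS}, nil
}

// directPlatform unmarshals the image config straight into the platform
// struct, so every platform field present in the JSON is picked up.
func directPlatform(config []byte) (platform, error) {
	var p platform
	err := json.Unmarshal(config, &p)
	return p, err
}

func main() {
	config := []byte(`{"architecture":"arm","os":"linux","variant":"v6","author":"example"}`)
	lossy, _ := lossyPlatform(config)
	direct, _ := directPlatform(config)
	fmt.Printf("lossy variant=%q direct variant=%q\n", lossy.Variant, direct.Variant)
}
```

Fields that the platform struct does not declare (such as `author` here) are
simply ignored by `json.Unmarshal`, which is what makes the direct form safe.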

We can use `tianon/raspbian` as a concrete reproducer to demonstrate.

Before:

```console
$ ctr content fetch docker.io/tianon/raspbian:bullseye-slim
...

$ ctr image ls
REF                                     TYPE                                                 DIGEST                                                                  SIZE     PLATFORMS    LABELS
docker.io/tianon/raspbian:bullseye-slim application/vnd.docker.distribution.manifest.v2+json sha256:66e96f8af40691b335acc54e5f69711584ef7f926597b339e7d12ab90cc394ce 28.6 MiB linux/arm/v7 -
```

(Note that the `PLATFORMS` column lists `linux/arm/v7` -- the image itself is actually `linux/arm/v6`, but one of these bits of code leads to only `linux/arm` being extracted from the image config, which `platforms.Normalize` then updates to an explicit `v7`.)

After:

```console
$ ctr image ls
REF                                     TYPE                                                 DIGEST                                                                  SIZE     PLATFORMS    LABELS
docker.io/tianon/raspbian:bullseye-slim application/vnd.docker.distribution.manifest.v2+json sha256:66e96f8af40691b335acc54e5f69711584ef7f926597b339e7d12ab90cc394ce 28.6 MiB linux/arm/v6 -
```

Signed-off-by: Tianon Gravi <admwiggin@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
2023-05-30 13:13:02 -07:00
Derek McGowan
6d7060099b
Merge pull request #8552 from dcantah/cross-plat-stats
CRI: Make stats respect sandbox's platform
2023-05-30 09:58:50 -07:00
rongfu.leng
314d758fa1 update auths code comment
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-05-30 23:05:48 +08:00
rongfu.leng
9287711b7a upgrade registry.k8s.io/pause version
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-05-28 07:59:10 +08:00
Henry Wang
4bfcac85fa notify readiness when registered plugins are ready
Signed-off-by: Henry Wang <henwang@amazon.com>
2023-05-26 03:07:40 +00:00
Gabriel Adrian Samfira
88a3e25b3d Add targetOS to WithVolumes()
Windows systems are capable of running both Windows Containers and Linux
containers. For Windows containers we need to sanitize the volume path
and skip non-C volumes from the copy existing contents code path. Linux
containers running on Windows and Linux must not have the path sanitized
in any way.

Supplying the targetOS of the container allows us to properly decide
when to activate that code path.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-05-25 09:38:34 +00:00
Gabriel Adrian Samfira
c7ec95caf4 Reword comment and make slight change to code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-05-25 09:38:34 +00:00
Gabriel Adrian Samfira
ec2bec6481 Fix non C volumes on Windows
Images may be created with a VOLUME stanza pointed to drive letters that
are not C:. Currently, an image that has such VOLUMEs defined will
cause containerd to error out when starting a container.

This change skips copying existing contents to volumes that are not C:,
as an image can only hold files that are destined for the C: drive of a
container.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-05-25 09:38:34 +00:00
Danny Canter
7274e33e38 CRI: Make stats respect sandbox's platform
To further some ongoing work in containerd to make as much code as possible
able to be used on any platform (to handle runtimes that can virtualize/emulate
a variety of different OSes), this change makes stats able to be handled on
any of the supported stat types (just linux and windows). To accomplish this,
we use the platform the sandbox returns from its `Platform` rpc to decide
what format the containers in a given sandbox are returning metrics in, then
we can typecast/marshal accordingly.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-05-23 01:32:36 -07:00
Wei Fu
d280cb83b6 chore: update comment for NetworkPluginSetupSerially
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-05-17 22:39:10 +08:00
Samuel Karp
c60ba138b6
Merge pull request #8502 from mstmdev/fix-typos 2023-05-16 08:41:02 -07:00
mstmdev
cdaa4025e9 Fix some typos
Signed-off-by: Pan Yibo <mstmdev@gmail.com>
2023-05-16 10:12:50 +08:00
Danny Canter
66307d0b4e CRI: Support Linux usernames for !linux platforms
The oci.WithUser option was being applied in container_create_linux.go
instead of the cross plat buildLinuxSpec method. There's been recent
work to try and make every spec option that can be applied on any platform
able to do so, and this falls under that. However, WithUser on linux platforms
relies on the containers SnapshotKey being filled out, which means the spec
option needs to be applied during container creation.

To make this a little more generic, I've created a new platformSpecOpts
method that handles any spec opts that rely on runtime state (rootfs mounted
for example) for some platforms, or just platform options that we still don't
have workarounds for to be able to specify them for other platforms
(apparmor, seccomp etc.) by internally calling the already existing
containerSpecOpts method.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-05-11 12:32:24 -07:00
Fu Wei
dc60137467
Merge pull request #8252 from bart0sh/PR008-CDI-use-CRI-field
CDI: Use CRI Config.CDIDevices field for CDI injection
2023-05-10 21:16:49 +08:00
Akihiro Suda
4347fc8bc2
go.mod: github.com/opencontainers/image-spec v1.1.0-rc3
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-05-09 23:35:58 +09:00
Phil Estes
c6d7e45c14
Merge pull request #8496 from ktock/golangci-lint-1.52.2
Bump up golangci-lint to v1.52.2
2023-05-09 13:03:06 -07:00
Fu Wei
465c804d22
Merge pull request #8489 from dcantah/readdirnames-fun
Change to Readdirnames for some cases
2023-05-09 15:43:36 +08:00
Kohei Tokunaga
6e2c915a44
Bump up golangci-lint to v1.52.2
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2023-05-09 15:07:55 +09:00
Danny Canter
f5211ee3fc Change to Readdirnames for some cases
There were a couple of uses of Readdir/ReadDir here where the only thing the return
value was used for was the Name of the entry. This is exactly what Readdirnames
returns, so we can avoid the overhead of making/returning a bunch of interfaces
and calling lstat every time in the case of Readdir(-1).
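
For illustration, a name-only listing with `(*os.File).Readdirnames` might
look like the sketch below. The `listNames` and `demo` helpers are
hypothetical and are not taken from the changed code; the sort is added only
to make the output deterministic.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// listNames returns the entry names in dir without building per-entry
// file info values, since only the names are needed.
func listNames(dir string) ([]string, error) {
	f, err := os.Open(dir)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	// A negative count reads all remaining entries in one call.
	names, err := f.Readdirnames(-1)
	if err != nil {
		return nil, err
	}
	// Directory order is not specified, so sort for stable results.
	sort.Strings(names)
	return names, nil
}

// demo creates a scratch directory with two files and lists it.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "readdirnames-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"b.txt", "a.txt"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o600); err != nil {
			return nil, err
		}
	}
	return listNames(dir)
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```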

https://cs.opensource.google/go/go/+/refs/tags/go1.20.4:src/os/dir_unix.go;l=114-137

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-05-08 00:41:13 -07:00
Samuel Karp
52afa34f52
cri: update WithoutDefaultSecuritySettings comment
This pointer to an issue never got updated after the CRI plugin was
absorbed into the main containerd repo as an in-tree plugin.

Signed-off-by: Samuel Karp <samuelkarp@google.com>
2023-05-07 15:22:35 -07:00
Maksym Pavlenko
6f34da5f80 Cleanup logrus imports
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-05 11:54:14 -07:00
Brad Davidson
27f56e607f
Fix unmarshal metrics for CRI server
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-05-03 20:50:04 +00:00
Derek McGowan
d56466cf39
[transfer] avoid setting limiters when max is 0
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-05-02 18:17:34 -07:00
Derek McGowan
a7ceac8b63
Merge pull request #8337 from keloyang/imagePullThroughput
Register imagePullThroughput and count with MiB
2023-05-02 10:30:19 -07:00
Fu Wei
b27301cd08
Merge pull request #8414 from kiashok/deleteCtrFromCtrStore
Remove entry for container from container store on error
2023-04-26 18:24:27 +08:00
Kirtana Ashok
d9f3e387c6 Remove entry for container from container store on error
If containerd does not see a container but criservice's
container store does, then we should try to recover from
this error state by removing the container from criservice's
container store as well.

Signed-off-by: Kirtana Ashok <Kirtana.Ashok@microsoft.com>
2023-04-25 16:32:22 -07:00
Maksym Pavlenko
4a67fe01b0
Merge pull request #8441 from mxpv/logrus
Move logrus setup code to log package
2023-04-24 22:05:33 +02:00
Maksym Pavlenko
370be0c18f Move logrus setup code to log package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-04-24 10:14:13 -07:00
Samuel Karp
08afb12339
Merge pull request #8430 from fangn2/update-doc-from-master-to-main 2023-04-22 00:03:50 -07:00
Mike Brown
159d3055a5
Merge pull request #8367 from dcantah/sbserver-podsbstatus-enhance
CRI Sbserver: Make PodSandboxStatus friendlier to shim crashes
2023-04-21 17:49:29 -05:00
Tony Fang
8c80ccc7f4 Update external repo links that changed default branch to main
Signed-off-by: Tony Fang <nhfang@amazon.com>
2023-04-21 20:26:48 +00:00
Maksym Pavlenko
290a800e83
Merge pull request #8398 from fuweid/chore-ut
pkg/cri/sbserver: sub-test uses array and capture range var
2023-04-18 12:35:30 +02:00
Wei Fu
4192ca8f8c pkg/cri/server: sub-test uses array and capture range var
Using an array to build sub-tests avoids random iteration order. Shuffling
should be handled by the go-test framework. We should also capture the
range var before running each sub-test.
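
A rough sketch of the pattern, using goroutines in place of parallel
sub-tests so that it runs as a plain program. The `testCase` type and
`runAll` function are hypothetical. Note that Go 1.22 and later give each
iteration its own loop variable, so the explicit copy matters mainly for
older toolchains.

```go
package main

import (
	"fmt"
	"sync"
)

// testCase is a hypothetical table entry. Keeping cases in a slice, not a
// map, gives a deterministic iteration order.
type testCase struct {
	desc string
	in   int
	want int
}

// runAll runs every case concurrently, the way parallel sub-tests would,
// and reports one result string per case in table order.
func runAll(cases []testCase) []string {
	results := make([]string, len(cases))
	var wg sync.WaitGroup
	for i, tc := range cases {
		i, tc := i, tc // capture per-iteration copies before the closure runs
		wg.Add(1)
		go func() {
			defer wg.Done()
			if got := tc.in * 2; got == tc.want {
				results[i] = tc.desc + ": ok"
			} else {
				results[i] = tc.desc + ": fail"
			}
		}()
	}
	wg.Wait()
	return results
}

func main() {
	cases := []testCase{
		{desc: "doubles one", in: 1, want: 2},
		{desc: "doubles two", in: 2, want: 4},
	}
	for _, line := range runAll(cases) {
		fmt.Println(line)
	}
}
```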

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-16 16:47:02 +08:00
Wei Fu
8bcfdda39b pkg/cri/sbserver: sub-test uses array and capture range var
Using an array to build sub-tests avoids random iteration order. Shuffling
should be handled by the go-test framework. We should also capture the
range var before running each sub-test.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-16 15:22:13 +08:00
Ed Bartosh
cd16b31cd2 Get CDI devices from CRI Config.CDIDevices field
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2023-04-14 13:41:08 +03:00
Rodrigo Campos
7e6ab84884 cri: Throw an error if idmap mounts is requested
We need support in containerd and the OCI runtime to use idmap mounts.
Let's just throw an error for now if the kubelet requests some mounts
with mappings.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-04-11 21:31:12 +02:00
Derek McGowan
c5a43b0007
Merge pull request #8366 from mxpv/stats
[sbserver] Backport CRI stats patches to sandboxed CRI
2023-04-10 13:38:30 -07:00
Shingo Omura
dc2fc987ca
capture desc range variable in case the sub-test runs in parallel mode
Signed-off-by: Shingo Omura <everpeace@gmail.com>
2023-04-10 20:59:11 +09:00
Shingo Omura
05bb52b273
Use t.TempDir instead of os.MkdirTemp
Signed-off-by: Shingo Omura <everpeace@gmail.com>
2023-04-10 20:58:36 +09:00
Danny Canter
7a7519a780 CRI Sbserver: Make PodSandboxStatus friendlier to shim crashes
Currently if you're using the shim-mode sandbox server support, if your
shim that's hosting the Sandbox API dies for any reason that wasn't
intentional (segfault, oom etc.) PodSandboxStatus is kind of wedged.
We can use the fact that if we didn't go through the usual k8s flow
of Stop->Remove and we still have an entry in our sandbox store,
us not having a shim mapping anymore means this was likely unintentional.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-04-10 04:39:50 -07:00
Fu Wei
5885db62c8
Merge pull request #8136 from everpeace/fix-additiona-gids-to-read-image-user
[CRI] fix additionalGids: it should fallback to imageConfig.User when securityContext.RunAsUser,RunAsUsername are empty
2023-04-09 14:59:07 +08:00
Maksym Pavlenko
79cb4b0000 [sbserver] handle missing cpu stats
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-04-07 15:59:40 -07:00
Maksym Pavlenko
464a4977a6 [sbserver] Refactor usageNanoCores to be used for all OSes
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-04-07 15:56:23 -07:00
Shukui Yang
db223271e3 Register imagePullThroughput and count with MiB
Signed-off-by: Shukui Yang <yangshukui@bytedance.com>
2023-04-07 10:12:41 +08:00
Paul "TBBle" Hampson
84cc3e496b Unify testutil.Unmount on Windows and Unix
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2023-03-31 06:15:17 -07:00
Paul "TBBle" Hampson
474a257b16 Implement Windows mounting for bind and windows-layer mounts
Using symlinks for bind mounts means we are not protecting an RO-mounted
layer against modification. Windows doesn't currently appear to offer a
better approach though, as we cannot create arbitrary empty WCOW scratch
layers at this time.

For windows-layer mounts, Unmount does not have access to the mounts
used to create it. So we store the relevant data in an Alternate Data
Stream on the mountpoint in order to be able to Unmount later.

Based on approach in https://github.com/containerd/containerd/pull/2366,
with sign-offs recorded as 'Based-on-work-by' trailers below.

This also partially reverts some changes made in #6034 as they are not
needed with this mounting implementation, which no longer needs to be
handled specially by the caller compared to non-Windows mounts.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Based-on-work-by: Michael Crosby <crosbymichael@gmail.com>
Based-on-work-by: Darren Stahl <darst@microsoft.com>
2023-03-31 06:15:17 -07:00
Samuel Karp
8f756bc8c2
Merge pull request #8309 from vinayakankugoyal/fixresolv
Add noexec nodev and nosuid to sandbox /etc/resolv.conf mount bind.
2023-03-30 17:34:08 -07:00
Vinayak Goyal
ac84bf7c89 Update sbserver to add noexec nodev and nosuid to /etc/resolv.conf mount bind.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2023-03-30 21:54:21 +00:00
Maksym Pavlenko
126ab72fea Keep linux mounts for linux sandboxes on Windows/Darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-29 19:00:06 -07:00
Vinayak Goyal
990199a021 Test to ensure nosuid,nodev,noexec are set on /etc/resolv.conf mount.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2023-03-29 20:34:05 +00:00
Maksym Pavlenko
3557ac884b Extract image service from CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-28 20:37:26 -07:00
Maksym Pavlenko
a11e47b48c Use built in atomic.Bool
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-27 12:08:06 -07:00
Vinayak Goyal
ae4dbb60d5 Add noexec nodev and nosuid to sandbox /etc/resolv.conf mount bind.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2023-03-24 21:56:53 +00:00
Fu Wei
584d13d5cb
Merge pull request #8276 from Iceber/remove_cri_v1alpha2
Remove CRI v1alpha2 [deprecated since v1.7]
2023-03-22 13:25:07 +08:00
Phil Estes
3a1047319f
Merge pull request #8279 from Iceber/remove_criu_path
Remove the CriuPath field from runc's options
2023-03-20 14:50:33 -04:00
June Rhodes
f48ae22273
fix: Update error message format based on feedback
Signed-off-by: June Rhodes <504826+hach-que@users.noreply.github.com>
2023-03-17 06:49:12 +11:00
June Rhodes
3193650f13
fix: 'failed to resolve symlink' error messaging
This error message currently does not provide useful information, because the `src` value that is interleaved will have been overridden by the call to `osi.ResolveSymbolicLink`. This stores the original `src` before the `osi.ResolveSymbolicLink` call so the error message can be useful.
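
The fix can be sketched roughly as follows. The `resolveSource` and
`failingResolver` functions are hypothetical, and the resolver is passed in
as a parameter to stand in for `osi.ResolveSymbolicLink`, whose exact
signature is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveSource keeps a copy of the caller's path before the resolver
// overwrites src, so a failure can still report the original value.
func resolveSource(src string, resolve func(string) (string, error)) (string, error) {
	original := src
	src, err := resolve(src)
	if err != nil {
		// src may already be empty here, so report the saved original.
		return "", fmt.Errorf("failed to resolve symlink %q: %w", original, err)
	}
	return src, nil
}

// failingResolver simulates a resolver that returns an empty path on error.
func failingResolver(string) (string, error) {
	return "", errors.New("boom")
}

func main() {
	_, err := resolveSource("/data/in", failingResolver)
	fmt.Println(err)
}
```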

Signed-off-by: June Rhodes <504826+hach-que@users.noreply.github.com>
2023-03-17 05:12:43 +11:00
Iceber Gu
c011502bd1 Remove cri v1alpha2 services
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-03-16 17:48:49 +08:00
Iceber Gu
23d288a809 Remove the CriuPath field from runc's options
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-03-16 17:12:51 +08:00
Danny Canter
62f98a1c11 CRI: Don't always close netConfMonitor channel
In the CRI server initialization a syncgroup is set up that adds to the
counter for every cni config found/registered. This functions on platforms
where CNI is supported/there's an assumption that there will always be
the loopback config. However, on platforms like Darwin, where there's generally
nothing registered, the Wait() on the syncgroup returns immediately and the
channel used to return any Network config sync errors is closed. This channel
is one of three that are used to monitor if we should Close the CRI service in
containerd, so it's not great if this happens.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-03-15 20:01:17 -07:00
Maksym Pavlenko
c5f1086adf Update docs
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-15 09:22:15 -07:00
Maksym Pavlenko
8bd82e355a Remove no_pivot when creating container from CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-15 09:18:16 -07:00
Maksym Pavlenko
07c2ae12e1 Remove v1 runctypes
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-15 09:18:16 -07:00
Shingo Omura
50740a1a0c
use strings.Cut instead of strings.Split for parsing imageConfig.User
Signed-off-by: Shingo Omura <everpeace@gmail.com>
2023-03-14 13:52:03 +09:00
Akihiro Suda
625217d5fb
RELEASES.md: describe the deprecated config properties
These deprecations were mentioned in `pkg/cri/config/config.go`
but not mentioned in `RELEASES.md`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-03-09 15:12:54 +09:00
Maksym Pavlenko
48a1350658
Merge pull request #8149 from Burning1020/sb-netns
sandbox: create sandbox with network namespace path
2023-03-08 14:22:00 -08:00
Zhang Tianyang
5144ba9c49 sandbox: create sandbox with network namespace path
Signed-off-by: Zhang Tianyang <burning9699@gmail.com>
2023-03-08 18:54:14 +08:00
Akihiro Suda
6d95132313
go.mod: github.com/containerd/cgroups/v3 v3.0.1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-03-07 22:06:38 +09:00
Akihiro Suda
c77ddf5381
Merge pull request #8131 from lucacome/bump-k8s.io-deps
Bump k8s.io deps
2023-03-07 21:44:13 +09:00
Fu Wei
5ae3a7f417
Merge pull request #8198 from kiashok/argsEscapedSupportInCri
Add ArgsEscaped support for CRI
2023-03-07 16:12:24 +08:00
Fu Wei
d780583a3c
Merge pull request #8205 from knight42/feat/transfer-tag
[Feature] Transfer tag image
2023-03-07 16:04:18 +08:00
Kevin Parsons
31c9a66385
Merge pull request #7099 from jsturtevant/cri-only-stats-windows
[cri] Implement CRI Pod and Container stats for Windows
2023-03-06 09:31:41 -08:00
Jian Zeng
f706576500
feat: tag image using Transfer api
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2023-03-05 23:22:17 +08:00
Samuel Karp
8ce3e4e159
epoch: fix unit test when SOURCE_DATE_EPOCH is set
Fixes https://github.com/containerd/containerd/issues/8200

Signed-off-by: Samuel Karp <samuelkarp@google.com>
2023-03-03 15:12:22 -08:00
James Sturtevant
32ed559c86
Add Windows Sandbox Stats (sbserver)
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-03-03 14:37:39 -08:00
James Sturtevant
08aa576a95
Add Windows Sandbox Stats
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2023-03-03 14:37:38 -08:00
Derek McGowan
7a77da2c26
Merge pull request #7832 from fuweid/fix-7802
pkg/cri: add timeout to drain exec io
2023-03-03 13:54:53 -08:00
Kirtana Ashok
8137e41c48 Add ArgsEscaped support for CRI
This commit adds support for the ArgsEscaped
value of an image built from a Dockerfile.
It is used to evaluate and process the image
entrypoint/cmd and container entrypoint/cmd
options taken from the podspec.

Signed-off-by: Kirtana Ashok <Kirtana.Ashok@microsoft.com>
2023-03-03 13:38:06 -08:00
Wei Fu
5946c1051e *: fix code style issue
1. using drain_exec_sync_io_timeout in the error makes it easy to spot wrong input
2. avoid using the full error message, as the part of the error generated by
   the Go stdlib may change in the future
3. delete the extra empty line

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 17:51:03 +08:00
Wei Fu
98cb6d7eb8 cri/sbserver: ignore the NOT_FOUND error in exec cleanup
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 12:20:09 +08:00
Wei Fu
01671e9fc5 cri: add config ut for invalid drain io timeout value
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 12:00:19 +08:00
Wei Fu
ffebcb1223 cri: disable drain-exec-IO if it is empty timeout
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 11:59:07 +08:00
Wei Fu
791f137a5b *: update drainExecSyncIO docs and validate the timeout
We should validate the drainExecSyncIO timeout at the beginning and
raise the error for any invalid input.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 11:58:52 +08:00
Derek McGowan
13bf5565eb
[transfer] update export to use image store references
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-03-02 11:14:32 -08:00
Jian Zeng
f6491b0049
feat: export images using Transfer api
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2023-03-02 09:04:25 -08:00
Wei Fu
3c18decea7 *: add DrainExecSyncIOTimeout config and disable it by default
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-03 00:21:55 +08:00
Wei Fu
a9cbddd65d *: fix typo and skip exec-io-drain-testcase in win
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-02 21:57:43 +08:00
Tony Fang
2e96ba95e0 Create config struct to take user input
Signed-off-by: Tony Fang <nhfang@amazon.com>
2023-03-02 05:44:25 +00:00
Luca Comellini
f25ec98d0d
Fix linting error sets.String is deprecated
Signed-off-by: Luca Comellini <luca.com@gmail.com>
2023-03-01 21:37:30 -08:00
Wei Fu
04dfd6275e pkg/cri/sbserver: add timeout to drain exec io
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-02 13:06:45 +08:00
Wei Fu
82c0f4ff86 pkg/cri/server: add timeout to drain exec io
By default, the child processes spawned by exec process will inherit standard
io file descriptors. The shim server creates a pipe as data channel. Both exec
process and its children write data into the write end of the pipe. And the
shim server will read data from the pipe. If the write end is still open, the
shim server will continue to wait for data from pipe.

So, if the exec command is like `bash -c "sleep 365d &"`, the exec process is
bash, which quits after creating `sleep 365d`. But `sleep 365d` will hold the
write end of the pipe for a year! It doesn't make sense for the CRI plugin
to wait for it.

For this case, we should use timeout to drain exec process's io instead of
waiting for it.
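
A minimal sketch of draining with a timeout, assuming an in-process
`io.Pipe` in place of the shim's pipe. The `drainWithTimeout` function,
`errDrainTimeout` value, and `demo` helper are hypothetical names and do not
reflect the actual CRI plugin code.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// errDrainTimeout is a hypothetical sentinel error for this sketch.
var errDrainTimeout = errors.New("timed out draining exec io")

// drainWithTimeout copies r to w until EOF, but stops waiting after timeout.
// On timeout the copy goroutine stays blocked until the caller closes r.
func drainWithTimeout(r io.Reader, w io.Writer, timeout time.Duration) error {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		_, err := io.Copy(w, r)
		done <- err
	}()
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case err := <-done:
		return err
	case <-timer.C:
		return errDrainTimeout
	}
}

// demo drains one pipe whose write end is closed and one whose write end is
// held open, the way a background child process would hold it.
func demo() (closedErr, heldErr error) {
	pr, pw := io.Pipe()
	go func() {
		pw.Write([]byte("done\n"))
		pw.Close() // all writers gone: the reader sees EOF
	}()
	closedErr = drainWithTimeout(pr, io.Discard, time.Second)

	pr2, pw2 := io.Pipe()
	go func() {
		pw2.Write([]byte("partial\n")) // write end is never closed
	}()
	heldErr = drainWithTimeout(pr2, io.Discard, 50*time.Millisecond)
	pr2.Close() // unblock the copy goroutine left behind by the timeout
	pw2.Close()
	return closedErr, heldErr
}

func main() {
	closedErr, heldErr := demo()
	fmt.Println("closed writer:", closedErr)
	fmt.Println("held writer:", heldErr)
}
```

The caller is still responsible for closing the read end after a timeout;
otherwise the copy goroutine would leak for as long as the writer lives.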

Fixes: #7802

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-02 13:06:45 +08:00
Akihiro Suda
e0a05b56e5
Merge pull request #8152 from bart0sh/PR007-upgrade-CDI-to-0.5.4
update CDI version to v0.5.4
2023-02-28 09:22:30 +09:00
Mike Brown
d5425c4c41
Merge pull request #8140 from klihub/devel/update-nri-config
pkg/nri: pull in latest NRI, update NRI configuration.
2023-02-27 10:41:03 -06:00
Krisztian Litkey
310be5ce6e pkg/nri: update NRI configuration.
Update NRI plugin configuration to match that of NRI. Remove
option for the eliminated NRI configuration file. Add option
to disable connections from externally launched plugins. Add
options to override default plugin registration and request
timeouts.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2023-02-26 19:56:31 +02:00
Tony Fang
8a47c6910f Add a leading space after the comment sign
Fix coding standards

Signed-off-by: Tony Fang <nhfang@amazon.com>
2023-02-26 17:49:15 +00:00
Tony Fang
f53417921d Add unit test to getSupportedPlatform
Signed-off-by: Tony Fang <nhfang@amazon.com>
2023-02-26 17:49:02 +00:00
Fu Wei
a18709442b
Merge pull request #8062 from fangn2/config-options
Add configuration options to local transfer service
2023-02-26 00:11:43 +08:00
Tony Fang
47305392c6 Add configuration options to local transfer service
Signed-off-by: Tony Fang <nhfang@amazon.com>
2023-02-25 03:40:06 +00:00
Changwei Ge
bd0a2a9273 CRI: remove duplicated snapshotters code
The snapshotter annotation definitions and related functions are now
public in the new snapshotter package

Also remove a test for container image layer's annotation.

Signed-off-by: Changwei Ge <gechangwei@bytedance.com>
2023-02-23 11:46:14 +08:00
Ed Bartosh
49abbe4f2b fix failing TestCDIInjections
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2023-02-22 20:07:34 +02:00
Fu Wei
8cb00f45c9
Merge pull request #8143 from mxpv/log
Add Fields type alias to log package
2023-02-21 10:22:23 +08:00
Maksym Pavlenko
06e085c8b5 Add Fields type alias to log package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-02-20 17:29:08 -08:00
Shingo Omura
727b254039
fix userstr for additionalGids on Linux
It should fall back to imageConfig.User when no securityContext.RunAsUser/RunAsUsername is set
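
The fallback can be sketched as below. The `effectiveUser` helper, its
parameters, and the precedence between username and UID are assumptions
made for illustration, not the actual CRI code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// effectiveUser is a hypothetical helper. It prefers the values from the
// security context and only falls back to the image config user when both
// are unset.
func effectiveUser(runAsUsername string, runAsUser *int64, imageUser string) string {
	switch {
	case runAsUsername != "":
		return runAsUsername
	case runAsUser != nil:
		return strconv.FormatInt(*runAsUser, 10)
	}
	// The image user may be "user:group"; keep only the user part.
	user, _, _ := strings.Cut(imageUser, ":")
	return user
}

func main() {
	uid := int64(0)
	fmt.Println(effectiveUser("", nil, "1000:1000"))
	fmt.Println(effectiveUser("", &uid, "1000:1000"))
	fmt.Println(effectiveUser("alice", &uid, "1000:1000"))
}
```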

Signed-off-by: Shingo Omura <everpeace@gmail.com>
2023-02-19 22:09:00 +09:00
Daniel Lenar
a48dbefc15 Fix concurrent writes for UpdateContainerStats
Signed-off-by: Daniel Lenar <dlenar@vailsys.com>
2023-02-17 15:13:18 -06:00
Maksym Pavlenko
24cf85f5a3
Merge pull request #8103 from AkihiroSuda/go-1.20
Go 1.20.1
2023-02-15 20:09:28 -08:00
Derek McGowan
12a3162605
Merge pull request #8041 from yankay/fix-mistack-docs
pkg/cri/config: fix Mirrors deprecation comment
2023-02-15 15:25:04 -08:00
Derek McGowan
179f00c883
Merge pull request #8051 from yulng/goroutine
fix: 'go routine' should be 'goroutine'
2023-02-15 15:20:47 -08:00
Derek McGowan
aa6418fadd
Merge pull request from GHSA-hmfx-3pcx-653p
oci: fix additional GIDs
2023-02-15 13:45:14 -08:00
Akihiro Suda
d8b68e3ccc
Stop using math/rand.Read and rand.Seed (deprecated in Go 1.20)
From golangci-lint:

> SA1019: rand.Read has been deprecated since Go 1.20 because it
>shouldn't be used: For almost all use cases, crypto/rand.Read is more
>appropriate. (staticcheck)

> SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative
>has been available since Go 1.0: Programs that call Seed and then expect
>a specific sequence of results from the global random source (using
>functions such as Int) can be broken when a dependency changes how
>much it consumes from the global random source. To avoid such breakages,
>programs that need a specific result sequence should use
>NewRand(NewSource(seed)) to obtain a random generator that other
>packages cannot access. (staticcheck)

See also:

- https://pkg.go.dev/math/rand@go1.20#Read
- https://pkg.go.dev/math/rand@go1.20#Seed

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-16 03:50:23 +09:00
Akihiro Suda
a9ac5f9cb5
lint: remove //nolint:dupword that are no longer needed
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-16 03:50:23 +09:00
Kazuyoshi Kato
fe5d1d3e7c
Merge pull request #7954 from klihub/devel/sbserver-nri-integration
pkg/cri/sbserver: experimental NRI integration for CRI.
2023-02-15 10:42:25 -08:00
Zechun Chen
39bac0dbef error strings should not be capitalized
Signed-off-by: Zechun Chen <zechun.chen@daocloud.io>
2023-02-15 14:30:36 +08:00
Maksym Pavlenko
3548f59fd8
Merge pull request #8060 from dcantah/cri-annots-other
CRI: Pass sandbox annotations to _other platforms
2023-02-14 18:34:46 -08:00
Casey Callendrello
0166783c79 cni: pass in the cgroupPath capability argument
There is a new CNI capability argument, cgroupPath, where runtimes can
pass cgroup paths to CNI plugins.

Implement that.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
2023-02-14 16:49:29 +01:00
Akihiro Suda
4e2eb8ba4e
Merge pull request #7964 from dmcgowan/transfer-image-store-references
[transfer] update imagestore interface to support multiple references
2023-02-14 11:22:27 +09:00
Derek McGowan
081601f521
Update imagestore interface to support multiple references
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-02-13 13:58:33 -08:00
Danny Canter
646bc3a94e CRI: Create DefaultCRIAnnotations helper
The CRI sandbox and container specs all get assigned
almost the exact same default annotations (sandboxID, name, metadata,
container type etc.), so let's make a helper to return the right set for
a sandbox or regular workload container.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-02-13 13:05:01 -08:00
Danny Canter
5aab634e14 CRI: Pass sandbox annotations to _other platforms
!windows and !linux weren't getting passed the sandbox annotations.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-02-13 13:03:51 -08:00
Maksym Pavlenko
2b24af8d13 Use options to pass PodSandboxConfig to shims
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-02-13 12:36:20 -08:00
Krisztian Litkey
ebbcb57a4c pkg/cri/sbserver: experimental NRI integration for CRI.
Hook the NRI service plugin into CRI sbserver request
processing.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2023-02-13 22:08:18 +02:00
Krisztian Litkey
8a1dca0f4a pkg/cri: split out NRI API from pkg/cri/server.
Split out the criService-agnostic bits of nri-api* from
pkg/cri/server to pkg/cri/nri to allow sharing a single
implementation between the server and sbserver versions.
Rework the interfaces to not require access to package
internals.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2023-02-13 22:05:45 +02:00
Derek McGowan
edb8ebaf07
Merge pull request #8047 from ruiwen-zhao/send_nil
Send container events with nil PodSandboxStatus
2023-02-13 11:38:14 -08:00
Derek McGowan
164ac924f8
Merge pull request #7984 from aitumik/aitumik/add-host-network-tests
test: add hostNetwork tests for both windows and linux
2023-02-13 11:37:20 -08:00
Fu Wei
2654ece1d0
Merge pull request #8066 from fuweid/cleanup-blockio-init
*: introduce wrapper pkgs for blockio and rdt
2023-02-13 14:05:32 +08:00
Derek McGowan
c6cf6b2522
Merge pull request #8093 from mxpv/instrument
Extract CRI instrument into separate package
2023-02-12 21:45:13 -08:00
Maksym Pavlenko
750d18aced Extract CRI instrument package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-02-12 20:49:15 -08:00
Fu Wei
040fcf85f0
Merge pull request #8091 from dcantah/mirror-generic-toml-change 2023-02-12 11:23:34 +08:00
Wei Fu
60d04b0b0f pkg: rename {blockio,rdt}_default.go -> nonlinux.go
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-02-12 08:35:17 +08:00
Akihiro Suda
b61988670c
go.mod: github.com/containerd/typeurl/v2 v2.1.0
Changes: https://github.com/containerd/typeurl/compare/7f6e6d160d67...v2.1.0

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-11 23:39:52 +09:00
Danny Canter
74b371b98a CRI: Mirror generic toml runtime config under server
In https://github.com/containerd/containerd/pull/7764 it was made
so that generic runtime options in the containerd toml config file
would get passed to shims regardless of if containerd knew of the
type beforehand and could supply the struct. However, this was only
added for the sandbox server fork here and not the regular ol' CRI
server. This change just mirrors the parts that need to be plopped in
pkg/cri/server

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-02-11 05:18:52 -08:00
ruiwen-zhao
51a8db233d Send container events with nil PodSandboxStatus
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-02-11 01:34:39 +00:00
ruiwen-zhao
27c8f4085c Move PLEG event generation back to sbserver to avoid missing pod sandbox status
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-02-11 01:34:33 +00:00
Fu Wei
cf7b705dcd
Merge pull request #8086 from neersighted/apparmor_parser_regression
Revert `apparmor_parser` regression
2023-02-11 09:27:53 +08:00
Fu Wei
362ba2c743
Merge pull request #7981 from dmcgowan/sandbox-controller-interface-refactor
[sandbox] refactor controller interface
2023-02-11 09:22:36 +08:00
Nathan
7cf5560754 test: add hostNetwork tests for both windows and linux
Signed-off-by: Nathan <aitumik@protonmail.com>
2023-02-11 00:15:48 +03:00
Bjorn Neergaard
d33a43cc23
pkg/apparmor: clarify Godoc
Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
2023-02-10 10:23:59 -07:00
Bjorn Neergaard
a3265102d9
Revert "Don't check for apparmor_parser to be present"
This reverts commit 1acca8bba3.

As stated in the Godoc, this function is intended to check for presence
of `apparmor_parser`. Changing this regressed the public API of
containerd, and directly contradicts the way that this function is
consumed inside of containerd itself:
* fdfdc9bfc0/pkg/apparmor/apparmor.go (L20)
* fdfdc9bfc0/pkg/cri/sbserver/helpers_linux.go (L85)
* fdfdc9bfc0/pkg/cri/server/helpers_linux.go (L144)

This has led to a number of painful regressions and attempted fixes in
Moby:
* https://github.com/moby/moby/issues/44900
* https://github.com/moby/moby/pull/44902
* https://github.com/moby/moby/issues/44970

While reverting this late into the life of 1.6 and at the start of the
life of 1.7 is likely painful, I think this is ultimately the best path
to take, as containerd is subject to the same failure to start
containers with an AppArmor kernel when `apparmor_parser` is missing as
Moby.

Signed-off-by: Bjorn Neergaard <bneergaard@mirantis.com>
2023-02-10 10:05:56 -07:00
Zechun Chen
b944b108df Clean up repeated package import
Signed-off-by: Zechun Chen <zechun.chen@daocloud.io>
2023-02-10 16:21:55 +08:00
Akihiro Suda
3eda46af12
oci: fix additional GIDs
Test suite:
```yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: test-no-option
  annotations:
    description: "Equivalent of `docker run` (no option)"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=0(root) gid=0(root) groups=0(root),10(wheel)" ]']
---
apiVersion: v1
kind: Pod
metadata:
  name: test-group-add-1-group-add-1234
  annotations:
    description: "Equivalent of `docker run --group-add 1 --group-add 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=0(root) gid=0(root) groups=0(root),1(daemon),10(wheel),1234" ]']
  securityContext:
    supplementalGroups: [1, 1234]
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234
  annotations:
    description: "Equivalent of `docker run --user 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=0(root) groups=0(root)" ]']
  securityContext:
    runAsUser: 1234
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234-1234
  annotations:
    description: "Equivalent of `docker run --user 1234:1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=1234 groups=1234" ]']
  securityContext:
    runAsUser: 1234
    runAsGroup: 1234
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234-group-add-1234
  annotations:
    description: "Equivalent of `docker run --user 1234 --group-add 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=0(root) groups=0(root),1234" ]']
  securityContext:
    runAsUser: 1234
    supplementalGroups: [1234]
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-10 15:53:00 +09:00
Wei Fu
62df35df66 *: introduce wrapper pkgs for blockio and rdt
Before this patch, both RdtEnabled and BlockIOEnabled were provided
by the services/tasks pkg. Since services/tasks can be a pkg plugin that
is initialized multiple times or concurrently, this can trigger a data
race, as there is no mutex to protect `enable`.

This patch aims to provide wrapper pkgs to use intel/{blockio,rdt}
safely.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-02-10 08:21:34 +08:00
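The race described in 62df35df66 is the usual one for a shared flag written from an initialization path that may run more than once. A minimal sketch of the wrapper idea follows; the `featureState` type and its `SetEnabled`/`IsEnabled` methods are hypothetical names chosen for illustration and are not the API of the actual wrapper packages.

```go
package main

import (
	"fmt"
	"sync"
)

// featureState is a hypothetical stand-in for a wrapper around an
// optional feature such as blockio or rdt. The mutex guards the flag
// so that repeated or concurrent initialization cannot race.
type featureState struct {
	mu      sync.RWMutex
	enabled bool
}

// SetEnabled records whether the feature was configured successfully.
func (s *featureState) SetEnabled(v bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.enabled = v
}

// IsEnabled reports the current state under a read lock.
func (s *featureState) IsEnabled() bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.enabled
}

func main() {
	var state featureState
	var wg sync.WaitGroup
	// Simulate several plugin instances initializing concurrently.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			state.SetEnabled(true)
			_ = state.IsEnabled()
		}()
	}
	wg.Wait()
	fmt.Println(state.IsEnabled()) // true
}
```

Running this sketch with `go run -race` should report no data race, whereas the same program with the lock calls removed would be expected to.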
yulng
6cdc221f59 'go routine' should be 'goroutine'
Signed-off-by: yulng <wei.yang@daocloud.io>
2023-02-08 14:10:34 +08:00
Derek McGowan
b0e97c0f9b
Use multierror for cleanup error
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-02-07 11:06:14 -08:00
Derek McGowan
a788f6c799
Move local sandbox controller under plugins package
Add options to sandbox controller interface.
Update sandbox controller interface to fully utilize sandbox controller
interface.
Move grpc error conversion to service.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-02-06 22:04:45 -08:00
Derek McGowan
2717685dad
Refactor sandbox controller interface
Update the sandbox controller interface to use local types rather than
using the API types.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-02-06 21:39:30 -08:00
Kay Yan
0b33a45fad cri: fix Mirrors deprecation comment
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2023-02-07 09:53:57 +08:00
Maksym Pavlenko
1f35b03369 Fix sandbox exit monitor
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-02-02 14:02:52 -08:00
Phil Estes
6116820aeb
Merge pull request #8036 from ktock/remotesnlabel
Export remote snapshotter label handler
2023-02-02 11:53:43 -05:00
Kohei Tokunaga
dbf384a5a8 Export remote snapshotter label handler
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2023-02-01 23:03:23 +09:00
Phil Estes
0181b103ea
Merge pull request #8037 from AkihiroSuda/epoch-drop-timezone
pkg/epoch: drop timezone
2023-01-31 17:04:50 -05:00
Akihiro Suda
e551d734fb
pkg/epoch: drop timezone
For determinism of human-readable string representation.
e.g., "2023-01-10T12:34:56Z" vs "2023-01-10T21:34:56+09:00"

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-01-31 23:26:02 +09:00
Derek McGowan
287320d4de
Merge pull request #7840 from hinshun/feature/mount-subdirectory
Use mount.Target to specify subdirectory of rootfs mount
2023-01-30 21:35:34 -08:00
Derek McGowan
ee0e22f01c
Merge pull request #8020 from AkihiroSuda/mkdir-etc-cni-0755
cri: mkdir /etc/cni with 0755, not 0700
2023-01-30 10:21:30 -08:00
Akihiro Suda
b36b415526
cri: mkdir /etc/cni with 0755, not 0700
/etc/cni has to be readable for non-root users (0755), because /etc/cni/tuning/allowlist.conf is used for rootless mode too.
This file was introduced in CNI plugins 1.2.0 (containernetworking/plugins PR 693), and its path is hard-coded.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-01-29 07:49:36 +09:00
Edgar Lee
34d5878185 Use mount.Target to specify subdirectory of rootfs mount
- Add Target to mount.Mount.
- Add UnmountMounts to unmount a list of mounts in reverse order.
- Add UnmountRecursive to unmount deepest mount first for a given target, using
moby/sys/mountinfo.

Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
2023-01-27 09:51:58 +08:00
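Unmounting in reverse order, as 34d5878185 describes for UnmountMounts, matters because later mounts may sit inside earlier ones. A minimal sketch under stated assumptions: `mountPoint` and `unmountReverse` are hypothetical simplifications, and the unmount operation is injected as a function so the example runs without privileges. This is not the containerd mount package API.

```go
package main

import "fmt"

// mountPoint is a hypothetical, simplified stand-in for a mount entry.
type mountPoint struct {
	Target string
}

// unmountReverse walks the list from last to first so that nested
// mounts are released before the mounts that contain them. It stops
// at the first failure and reports which target failed.
func unmountReverse(mounts []mountPoint, unmount func(target string) error) error {
	for i := len(mounts) - 1; i >= 0; i-- {
		if err := unmount(mounts[i].Target); err != nil {
			return fmt.Errorf("unmount %s: %w", mounts[i].Target, err)
		}
	}
	return nil
}

func main() {
	mounts := []mountPoint{
		{Target: "/rootfs"},
		{Target: "/rootfs/usr"},
		{Target: "/rootfs/usr/lib"},
	}
	var order []string
	err := unmountReverse(mounts, func(target string) error {
		order = append(order, target) // record instead of calling unmount(2)
		return nil
	})
	fmt.Println(order, err) // [/rootfs/usr/lib /rootfs/usr /rootfs] <nil>
}
```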
Maksym Pavlenko
21fe0ceaad Move PLEG events for pause container to podsandbox
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-25 19:28:48 -08:00
Sebastiaan van Stijn
4f39b164f3
pkg/cri: optimize slice initialization
Some of this code was originally added in b7b1200dd3,
which likely meant to initialize the slice with a length to reduce allocations.
However, instead of initializing with a zero length and a capacity, it
initialized the slice with a fixed length, which was corrected in commit
0c63c42f81.

This patch initializes the slice with a zero length and the expected capacity.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-01-24 20:46:20 +01:00
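The difference between a fixed length and a zero length with capacity, as discussed in 4f39b164f3, can be seen in a small example. This illustrates the general Go pitfall only; the function names are hypothetical and the code is not taken from the referenced commits.

```go
package main

import "fmt"

// withFixedLength allocates len(src) zero values up front, so every
// append lands after them and the result is twice as long as intended.
func withFixedLength(src []string) []string {
	out := make([]string, len(src))
	for _, s := range src {
		out = append(out, s)
	}
	return out
}

// withCapacity starts empty but reserves room for len(src) elements,
// so the appends fill the slice without further allocations.
func withCapacity(src []string) []string {
	out := make([]string, 0, len(src))
	for _, s := range src {
		out = append(out, s)
	}
	return out
}

func main() {
	src := []string{"a", "b"}
	fmt.Printf("%q\n", withFixedLength(src)) // ["" "" "a" "b"]
	fmt.Printf("%q\n", withCapacity(src))    // ["a" "b"]
}
```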
Maksym Pavlenko
f9f8455332 Backport #7393 to sbserver
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-17 14:36:21 -08:00
Maksym Pavlenko
0cbfb3375f Backport #7661 to sbserver
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-17 14:31:47 -08:00
Maksym Pavlenko
41eabf134a Backport #7685 to sbserver
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-17 14:26:16 -08:00
Maksym Pavlenko
b0d7a96976 Backport unit test from #7882 to sbserver
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-17 14:26:16 -08:00
Maksym Pavlenko
1ade777c24 Add basic spec and mounts for Darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-12 17:00:40 -08:00
Maksym Pavlenko
3c8469a782 Use Platform instead of generated API
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-12 10:30:42 -08:00
Maksym Pavlenko
40be96efa9 Have separate spec builder for each platform
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:12:25 -08:00
Maksym Pavlenko
fdfa3519a3 Remove unused params from platformSpec
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
1c1d8fb057 Update OCI spec tests for generic platform
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
f43d8924e4 Move most of OCI spec options to common builder
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
21338d2777 Add stub to build common OCI spec
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
f318e5630b Update sandbox API to return target platform
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
dd22a3a806 Move WithMounts to specs
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Maksym Pavlenko
0ae0399b16 Make OCI spec opts available on all platforms
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:58 -08:00
Qasim Sarfraz
9c8c4508ec cri: Fix TestUpdateOCILinuxResource for host w/o swap controller
Tested on Ubuntu 20.04 w/o swap controller:
```
$ stat -fc %T /sys/fs/cgroup/
tmpfs
$ ls -la /sys/fs/cgroup/memory/memory.memsw.limit_in_bytes
ls: cannot access '/sys/fs/cgroup/memory/memory.memsw.limit_in_bytes': No such file or directory
$  go test -v ./pkg/cri/sbserver/ -run TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_update_each_resource
=== RUN   TestUpdateOCILinuxResource/should_skip_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields
--- PASS: TestUpdateOCILinuxResource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_update_each_resource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_skip_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields (0.00s)
PASS
ok      github.com/containerd/containerd/pkg/cri/sbserver       (cached)
$ go test -v ./pkg/cri/server/ -run TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_update_each_resource
=== RUN   TestUpdateOCILinuxResource/should_skip_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map
--- PASS: TestUpdateOCILinuxResource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_update_each_resource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_skip_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map (0.00s)
PASS
ok      github.com/containerd/containerd/pkg/cri/server (cached)
```

Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2023-01-10 15:41:04 +01:00
Fu Wei
5fc727224e
Merge pull request #7861 from dmcgowan/cleanup-context
Add cleanup package for context management during cleanup
2023-01-05 13:18:31 +08:00
Derek McGowan
b550526ccd
Use cleanup.Background instead of context.Background for cleanup
Use the cleanup context to re-use values from the original context

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-01-04 11:22:24 -08:00
Maksym Pavlenko
06bfcd658c Enable dupword linter
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-03 12:47:16 -08:00
Derek McGowan
f606c4eba7
Add cleanup package for context management during cleanup
Provides a couple helper functions that provide a background context for
running cleanup jobs while preserving the original context values.
The new contexts will not inherit the errors or cancellations.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-01-03 12:30:26 -08:00
Akihiro Suda
4adf3fb3af
Merge pull request #7906 from Iceber/use_label_uncompressed
Use the const labels.LabelUncompressed
2023-01-04 01:04:20 +09:00
Iceber Gu
778e8f2af4 Use the const labels.LabelUncompressed
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-01-03 18:29:21 +08:00
Danny Canter
3f0edb249b CRI: Comment cleanup/misc fixes
Comments in initPlatform for Windows stated that the options were
Linux specific. Additionally, properly wrap an error after trying
to set up CDI on Linux.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-02 18:55:31 -08:00
xin.li
1753e5af7a Reused errdefs for error
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-01-02 21:39:20 +08:00
Rodrigo Campos
72ef986222 cri: Simplify parseUsernsIDs()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 16:49:28 -03:00
Rodrigo Campos
4eed20fc31 cri: Verify userns container config is consistent with sandbox
The sandbox and container both have the userns config. Let's make sure
they are the same, therefore consistent.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 15:07:54 -03:00
Rodrigo Campos
a44b356274 cri: Fix assert vs require in tests
Currently we require that c.containerSpec() does not return an error
if test.err is not set.

However, if the require fails (i.e. it indeed returned an error) the
rest of the code is executed anyways. The rest of the code assumes it
did not return an error (so code assumes spec is not nil). This fails
miserably if it indeed returned an error, as spec is nil and go crashes
while running the unit tests.

Let's require it is not an error, so code does not continue to execute
if that fails and go doesn't crash.

In the test.err case the bug of using assert is not harmful, but let's
switch it to require too, as that is what we really want.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 14:02:10 -03:00
Samuel Karp
b0b28f1d8e
Merge pull request #7879 from fuweid/clean-build-tags 2022-12-30 00:22:03 -08:00
Rodrigo Campos
3b48fb5b59 cri: Shadow variables to avoid t.Parallel() issues
This is a follow-up suggested by Fu Wei.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-29 18:16:20 -03:00
Mike Brown
66f186d42d
Merge pull request #7679 from kinvolk/rata/userns-stateless-pods
Add support for user namespaces in stateless pods (KEP-127)
2022-12-29 14:08:24 -06:00