Commit Graph

317 Commits

Author SHA1 Message Date
Zechun Chen
b944b108df Clean up repeated package import
Signed-off-by: Zechun Chen <zechun.chen@daocloud.io>
2023-02-10 16:21:55 +08:00
Akihiro Suda
3eda46af12 oci: fix additional GIDs
Test suite:
```yaml

---
apiVersion: v1
kind: Pod
metadata:
  name: test-no-option
  annotations:
    description: "Equivalent of `docker run` (no option)"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=0(root) gid=0(root) groups=0(root),10(wheel)" ]']
---
apiVersion: v1
kind: Pod
metadata:
  name: test-group-add-1-group-add-1234
  annotations:
    description: "Equivalent of `docker run --group-add 1 --group-add 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=0(root) gid=0(root) groups=0(root),1(daemon),10(wheel),1234" ]']
  securityContext:
    supplementalGroups: [1, 1234]
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234
  annotations:
    description: "Equivalent of `docker run --user 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=0(root) groups=0(root)" ]']
  securityContext:
    runAsUser: 1234
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234-1234
  annotations:
    description: "Equivalent of `docker run --user 1234:1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=1234 groups=1234" ]']
  securityContext:
    runAsUser: 1234
    runAsGroup: 1234
---
apiVersion: v1
kind: Pod
metadata:
  name: test-user-1234-group-add-1234
  annotations:
    description: "Equivalent of `docker run --user 1234 --group-add 1234`"
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ghcr.io/containerd/busybox:1.28
      args: ['sh', '-euxc',
             '[ "$(id)" = "uid=1234 gid=0(root) groups=0(root),1234" ]']
  securityContext:
    runAsUser: 1234
    supplementalGroups: [1234]
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-10 15:53:00 +09:00
Wei Fu
62df35df66 *: introduce wrapper pkgs for blockio and rdt
Before this patch, both the RdtEnabled and BlockIOEnabled are provided
by services/tasks pkg. Since the services/tasks can be pkg plugin which
can be initialized multiple times or concurrently. It will fire data-race
issue as there is no mutex to protect `enable`.

This patch is aimed to provide wrapper pkgs to use intel/{blockio,rdt}
safely.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-02-10 08:21:34 +08:00
yulng
6cdc221f59 'go routine' should be 'goroutine'
Signed-off-by: yulng <wei.yang@daocloud.io>
2023-02-08 14:10:34 +08:00
Phil Estes
6116820aeb Merge pull request #8036 from ktock/remotesnlabel
Export remote snapshotter label handler
2023-02-02 11:53:43 -05:00
Kohei Tokunaga
dbf384a5a8 Export remote snapshotter label handler
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2023-02-01 23:03:23 +09:00
Akihiro Suda
b36b415526 cri: mkdir /etc/cni with 0755, not 0700
/etc/cni has to be readable for non-root users (0755), because /etc/cni/tuning/allowlist.conf is used for rootless mode too.
This file was introduced in CNI plugins 1.2.0 (containernetworking/plugins PR 693), and its path is hard-coded.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-01-29 07:49:36 +09:00
Sebastiaan van Stijn
4f39b164f3 pkg/cri: optimize slice initialization
Some of this code was originally added in b7b1200dd3,
which likely meant to initialize the slice with a length to reduce allocations,
however, instead of initializing with a zero-length and a capacity, it
initialized the slice with a fixed length, which was corrected in commit
0c63c42f81.

This patch initializes the slice with a zero-length and expected capacity.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-01-24 20:46:20 +01:00
Maksym Pavlenko
40be96efa9 Have separate spec builder for each platform
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:12:25 -08:00
Maksym Pavlenko
dd22a3a806 Move WithMounts to specs
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-11 13:03:59 -08:00
Qasim Sarfraz
9c8c4508ec cri: Fix TestUpdateOCILinuxResource for host w/o swap controller
Tested on Ubuntu 20.04 w/o swap controller:
```
$ stat -fc %T /sys/fs/cgroup/
tmpfs
$ la -la /sys/fs/cgroup/memory/memory.memsw.limit_in_bytes
ls: cannot access '/sys/fs/cgroup/memory/memory.memsw.limit_in_bytes': No such file or directory
$  go test -v ./pkg/cri/sbserver/ -run TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_update_each_resource
=== RUN   TestUpdateOCILinuxResource/should_skip_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields
--- PASS: TestUpdateOCILinuxResource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_update_each_resource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_skip_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields (0.00s)
PASS
ok      github.com/containerd/containerd/pkg/cri/sbserver       (cached)
$ go test -v ./pkg/cri/server/ -run TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_update_each_resource
=== RUN   TestUpdateOCILinuxResource/should_skip_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields
=== RUN   TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map
--- PASS: TestUpdateOCILinuxResource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_update_each_resource (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_skip_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_fill_empty_fields (0.00s)
    --- PASS: TestUpdateOCILinuxResource/should_be_able_to_patch_the_unified_map (0.00s)
PASS
ok      github.com/containerd/containerd/pkg/cri/server (cached)
```

Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2023-01-10 15:41:04 +01:00
Maksym Pavlenko
06bfcd658c Enable dupword linter
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-01-03 12:47:16 -08:00
Danny Canter
3f0edb249b CRI: Comment cleanup/misc fixes
Comments in initPlatform for Windows states that the options were
Linux specific. Additionally properly wrap an error after trying
to setup CDI on Linux.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-01-02 18:55:31 -08:00
xin.li
1753e5af7a Reused errdefs for error
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-01-02 21:39:20 +08:00
Rodrigo Campos
72ef986222 cri: Simplify parseUsernsIDs()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 16:49:28 -03:00
Rodrigo Campos
4eed20fc31 cri: Verify userns container config is consisten with sandbox
The sandbox and container both have the userns config. Lets make sure
they are the same, therefore consistent.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 15:07:54 -03:00
Rodrigo Campos
a44b356274 cri: Fix assert vs require in tests
Currently we require that c.containerSpec() does not return an error
if test.err is not set.

However, if the require fails (i.e. it indeed returned an error) the
rest of the code is executed anyways. The rest of the code assumes it
did not return an error (so code assumes spec is not nil). This fails
miserably if it indeed returned an error, as spec is nil and go crashes
while running the unit tests.

Let's require it is not an error, so code does not continue to execute
if that fails and go doesn't crash.

In the test.err case is not harmful the bug of using assert, but let's
switch it to require too as that is what we really want.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-30 14:02:10 -03:00
Samuel Karp
b0b28f1d8e Merge pull request #7879 from fuweid/clean-build-tags 2022-12-30 00:22:03 -08:00
Rodrigo Campos
3b48fb5b59 cri: Shadow variables to avoid t.Parallel() issues
This is a follow-up suggested by Fu Wei.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-29 18:16:20 -03:00
Mike Brown
66f186d42d Merge pull request #7679 from kinvolk/rata/userns-stateless-pods
Add support for user namespaces in stateless pods (KEP-127)
2022-12-29 14:08:24 -06:00
Wei Fu
6b7e237fc7 chore: use go fix to cleanup old +build buildtag
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-12-29 14:25:14 +08:00
Danny Canter
229779a4e5 oci: Add WithDomainname
A domainname field was recently added to the OCI spec. Prior to this
folks would need to set this with a sysctl, but now runtimes should be
able to setdomainname(2). There's an open change to runc at the moment
to add support for this so I've just left testing as a couple spec
validations in CRI until that's in and usable.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2022-12-26 04:03:45 -05:00
Antonio Ojea
ba0a7185f0 add network plugin metrics
Add network plugin metrics.

The metrics are the same that were used in dockershim/kubelet until
it was deprecated in kubernetes 1.23

https://github.com/kubernetes/kubernetes/blob/release-1.23/pkg/kubelet/dockershim/network/metrics/metrics.go

Signed-off-by: Antonio Ojea <aojea@google.com>
2022-12-23 09:23:56 +00:00
Derek McGowan
b3b79813f3 Merge pull request #7165 from zouyee/nit
prevent Server reuse after a Shutdown
2022-12-22 14:09:29 -08:00
Rodrigo Campos
a7adeb6976 cri: Support pods with user namespaces
This patch requests the OCI runtime to create a userns when the CRI
message includes such request.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-21 17:56:56 -03:00
David Leadbeater
31a6449734 Add capability for snapshotters to declare support for UID remapping
This allows user namespace support to progress, either by allowing
snapshotters to deal with ownership, or falling back to containerd doing
a recursive chown.

In the future, when snapshotters implement idmap mounts, they should
report the "remap-ids" capability.

Co-authored-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: David Leadbeater <dgl@dgl.cx>
2022-12-21 15:08:28 -03:00
Rodrigo Campos
36f520dc04 Let OCI runtime create netns when userns is used
As explained in the comments, this patch lets the OCI runtime create the
netns when userns are in use. This is needed because the netns needs to
be owned by the userns (otherwise can't modify the IP, etc.).

Before this patch, we are creating the netns and then starting the pod
sandbox asking to join this netns. This can't never work with userns, as
the userns needs to be created first for the netns ownership to be
correct.

One option would be to also create the userns in containerd, then create
the netns. But this is painful (needs tricks with the go runtime,
special care to write the mapping, etc.).

So, we just let the OCI runtime create the userns and netns, that
creates them with the proper ownership.

As requested by Mike Brown, the current code when userns is not used is
left unchanged. We can unify the cases (with and without userns) in a
future release.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2022-12-21 10:40:30 -03:00
Danny Canter
3ee6dd5c1b CRI: Fix no CNI info for pod sandbox on restart
Due to when we were updating the pod sandboxes underlying container
object, the pointer to the sandbox would have the right info, but
the on-disk representation of the data was behind. This would cause
the data returned from loading any sandboxes after a restart to have
no CNI result or IP information for the pod.

This change does an additional update to the on-disk container info
right after we invoke the CNI plugin so the metadata for the CNI result
and other networking information is properly flushed to disk.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2022-12-20 13:20:27 -08:00
Kazuyoshi Kato
52a7480399 Remove github.com/gogo/protobuf again
While we need to support CRI v1alpha2, the implementation doesn't have
to be tied to gogo/protobuf.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-12-15 22:54:15 +00:00
Danny Canter
84529072d2 CRI: Add host networking helper
We do a ton of host networking checks around the CRI plugin, all mainly
doing the same thing of checking the different quirks on various platforms
(for windows are we a HostProcess pod, for linux is namespace mode the
right thing, darwin doesn't have CNI support etc.) which could all be
bundled up into a small helper that can be re-used.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2022-12-14 01:47:22 -08:00
Phil Estes
ecf00ffe84 Merge pull request #7783 from inspektor-gadget/qasim/cri-disable-swap
cri: make swapping disabled with memory limit
2022-12-13 15:21:51 -05:00
Fu Wei
d2f68bfb36 Merge pull request #7313 from pacoxu/image-pull-metrics
add metrics for image pulling: error; in progress count; thoughput
2022-12-13 19:49:22 +08:00
Fu Wei
f2cf411b79 Merge pull request #7073 from ruiwen-zhao/event
Add container event support to containerd
2022-12-09 15:24:23 +08:00
ruiwen-zhao
a6929f9f6b Add Evented PLEG support to sandbox server
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-12-08 19:31:36 +00:00
ruiwen-zhao
a338abc902 Add container event support to containerd
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-12-08 19:30:39 +00:00
Maksym Pavlenko
3bc8fc4d30 Cleanup build constraints
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-12-08 09:36:20 -08:00
Qasim Sarfraz
69975b92bb cri: make swapping disabled with memory limit
OCI runtime spec defines memory.swap as 'limit of memory+Swap usage'
so setting them to equal should disable the swap. Also, this change
should make containerd behaviour same as other runtimes e.g
'cri-dockerd/dockershim' and won't be impacted when user turn on
'NodeSwap' (https://github.com/kubernetes/enhancements/issues/2400) feature.

Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2022-12-08 13:54:55 +01:00
Paco Xu
c59f1635f0 add metrics for image pulling: success/failure count; in progress count; thoughput
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-12-07 15:11:00 +08:00
Derek McGowan
c469f67a2b Merge pull request #6019 from klihub/pr/proto/nri
NRI: add support for NRI with extended scope.
2022-11-30 10:42:17 -08:00
Kirtana Ashok
08d5879f32 Added nullptr checks to pkg/cri/server and sbserver
Signed-off-by: Kirtana Ashok <Kirtana.Ashok@microsoft.com>
2022-11-29 13:25:49 -08:00
Krisztian Litkey
02f0a8b50e pkg/cri/server: nuke old v0.1.0 NRI hooks.
Remove direct invocation of old v0.1.0 NRI plugins. They
can be enabled using the revised NRI API and the v0.1.0
adapter plugin.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2022-11-28 21:51:42 +02:00
Krisztian Litkey
b27ef6f169 pkg/cri/server: experimental NRI integration for CRI.
Implement the adaptation interface required by the NRI
service plugin to handle CRI sandboxes and containers.
Hook the NRI service plugin into CRI request processing.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2022-11-28 21:51:08 +02:00
Phil Estes
99acefaad9 Merge pull request #7697 from inspektor-gadget/qasim/add-sandbox-uid-annotation
cri: add pod uid annotation
2022-11-21 10:54:20 -05:00
yanggang
579c7f43de Change fsnotify event status condition.
Signed-off-by: yanggang <gang.yang@daocloud.io>
2022-11-20 09:43:54 +08:00
Fu Wei
8e787543de Merge pull request #7685 from sofat1989/mainrunserially
can set up the network serially by CNI plugins
2022-11-19 12:33:40 +08:00
Qasim Sarfraz
0c4d32c131 cri: add pod uid annotation
Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2022-11-19 01:12:02 +01:00
ruiwen-zhao
792294ce06 Update to cri-api v0.26.0-beta.0
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-11-18 21:13:34 +00:00
ruiwen-zhao
234bf990dc Copy cri-api v1alpha2 from v0.25.4 to containerd internal directory
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-11-18 21:09:43 +00:00
Fei Su
f6232793b4 can set up the network serially by CNI plugins
Signed-off-by: Fei Su <sofat1989@126.com>
2022-11-18 15:19:00 +08:00
Kazuyoshi Kato
6596a70861 Use github.com/containerd/cgroups/v3 to remove gogo
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-11-14 21:07:48 +00:00