Commit Graph

1570 Commits

Author SHA1 Message Date
Fu Wei
ba852faf41 Merge pull request #8954 from fuweid/fix-shim-leak 2023-08-17 08:16:20 +08:00
Kirtana Ashok
823e0420eb Fix transfer service dependencies:
- Fill OSVersion field of ocispec.Platform for windows OS in
transfer service plugin init()
- Do not return error from transfer service ReceiveStream if
stream.Recv() returned context.Canceled error

Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2023-08-15 15:32:51 -07:00
Wei Fu
8dcb2a6e6d pkg/cri/sbserver: fix leaked shim issue for podsandbox mode
Fixes: #7496 #8931

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-11 17:43:51 +08:00
Wei Fu
72bc63d83d pkg/cri/server: fix leaked shim issue
Fixes: #7496 #8931

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-11 17:43:51 +08:00
Kirtana Ashok
a645ff2e68 Update dependencies after protobuf update in hcsshim
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
(cherry picked from commit d129b6f890bceb56b050bbb23ad330bb5699f78c)
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2023-08-09 11:56:45 -07:00
Fu Wei
2b2195c36b Merge pull request #8722 from marquiz/devel/cgroup-driver-autoconfig
cri: implement RuntimeConfig rpc
2023-08-04 16:09:34 +08:00
Rodrigo Campos
c80a3ecafd cri/sbserver: Use platform instead of GOOS for userns detection
In the sbserver we should not use the GOOS, as windows hosts can run
linux containers. On the sbserver we should use the platform param.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-08-02 12:32:05 +02:00
Rodrigo Campos
2d64ab8d79 cri: Don't use rel path for image volumes
Runc 1.1 throws a warning when using rel destination paths, and runc 1.2
is planning to thow an error (i.e. won't start the container).

Let's just make this an abs path in the only place it might not be: the
mounts created due to `VOLUME` directives in the Dockerfile.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-31 12:33:54 +02:00
Markus Lehtonen
ed47d6ba76 cri: implement RuntimeConfig rpc
The rpc only reports one field, i.e. the cgroup driver, to kubelet.
Containerd determines the effective cgroup driver by looking at all
runtime handlers, starting from the default runtime handler (the rest in
alphabetical order), and returning the cgroup driver setting of the
first runtime handler that supports one. If no runtime handler supports
cgroup driver (i.e. has a config option for it) containerd falls back to
auto-detection, returning systemd if systemd is running and cgroupfs
otherwise.

This patch implements the CRI server side of Kubernetes KEP-4033:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4033-group-driver-detection-over-cri

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-07-28 13:50:43 +03:00
Phil Estes
81895d22c9 Merge pull request #8867 from Iceber/pinned_image_label
cri: fix using the labels to pin image
2023-07-27 09:51:23 -04:00
Iceber Gu
7f7ba31b64 cri: fix using the pinned label to pin image
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-07-26 12:26:00 +08:00
Akihiro Suda
4807571352 pkg/epoch: fix Y2038 on 32-bit hosts
`strconv.Itoa(int(tm.Unix()))` rounds the time to 32-bit int on 32-bit hosts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-26 13:17:39 +09:00
Qasim Sarfraz
06f18c69d2 cri: memory.memsw.limit_in_bytes: no such file or directory
If kubelet passes the swap limit (default memory limit = swap limit ),
it is configured for container irrespective if the node supports swap.

Signed-off-by: Qasim Sarfraz <qasimsarfraz@microsoft.com>
2023-07-21 14:43:33 +02:00
Akihiro Suda
98f27e1d9c Revert "Add support for mounts on Darwin"
This reverts commit 2799b28e61.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-19 00:22:20 +09:00
Kazuyoshi Kato
ef1c9f0a63 Merge pull request #8766 from lengrongfu/fix/ci-Integration-fail
fix ci Linux Integration test fail
2023-07-18 10:18:12 -07:00
Kazuyoshi Kato
e5a49e6ceb Merge pull request #8789 from slonopotamus/macos-bind-mount
Add support for bind-mounts on Darwin (a.k.a. "make native snapshotter work")
2023-07-18 10:16:10 -07:00
Marat Radchenko
2799b28e61 Add support for mounts on Darwin
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-17 23:27:04 +03:00
Phil Estes
a94918b591 Merge pull request #8803 from kinvolk/rata/userns-sbserver
cri/sbserver: Add support for user namespaces (KEP-127)
2023-07-17 10:57:01 -04:00
Sebastiaan van Stijn
9c673f9673 pkg/cri/server: TestImageGetLabels: use registry.k8s.io
These are not actually being pulled, just removing the deprecated k8s.gcr.io
from the code-base. While at it, also renamed / removed vars that shadowed
with package-level definitions

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-07-14 11:22:39 +02:00
Mike Brown
3ed1bc108f Merge pull request #8671 from jsturtevant/fix-windows-edge-cases
[cri] Handle pod transition states gracefully while listing pod stats
2023-07-12 15:43:21 -05:00
James Sturtevant
f914edf4f6 [cri] Handle Windows pod transitions gracefully
When the pods are transitioning there are several
cases where containers might not be in valid state.
There were several cases where the stats where
failing hard but we should just continue on as
they are transient and will be picked up again
when kubelet queries for the stats again.

Signed-off-by: James Sturtevant <jstur@microsoft.com>

Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2023-07-12 09:57:14 -07:00
Rodrigo Campos
9160386ecc cri/sbserver: Test net.ipv4.ping_group_range works with userns
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:25 +02:00
Rodrigo Campos
1c6e268447 cri/sbserver: Fix net.ipv4.ping_group_range with userns
This commit just updates the sbserver with the same fix we did on main:
	9bf5aeca77 ("cri: Fix net.ipv4.ping_group_range with userns ")

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:25 +02:00
Rodrigo Campos
36a96d7f32 cri/sbserver: Remap snapshots for sbserver too
This is a port of 31a6449734 ("Add capability for snapshotters to
declare support for UID remapping") to sbserver.

This patch remaps the rootfs in the platform-specific if user namespaces
are in use, so the pod can read/write to the rootfs.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:15:22 +02:00
Rodrigo Campos
508e6f6e03 cri/sbserver: Add userns tests to TestLinuxSandboxContainerSpec()
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
fb9ce5d482 cri/sbserver: Support pods with user namespaces
This patch requests the OCI runtime to create a userns when the CRI
message includes such request.

This is an adaptation of a7adeb6976 ("cri: Support pods with user
namespaces") to sbserver, although the container_create.go parts were
already ported as part of 40be96efa9 ("Have separate spec builder for
each platform"),

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
c99cb95f07 cri/sbserver: Let OCI runtime create netns when userns is used
This commit just ports 36f520dc04 ("Let OCI runtime create netns when
userns is used") to sbserver.

The CNI network setup is done after OCI start, as it didn't seem simple
to get the sandbox PID we need for the netns otherwise.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:42 +02:00
Rodrigo Campos
73c75e2c73 cri/sbserver: Copy userns helpers to podsandbox
Currently there is a big c&p of the helpers between these two folders
and a TODO in the platform agnostic file to organize them in the future,
when some other things settle.

So, let's just copy them for now.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 15:14:12 +02:00
Rodrigo Campos
0b6a0fe773 cri/sbserver: Move runtimeStart to match position with cri/server
Commit c085fac1e5 ("Move sandbox start behind controller") moved the
runtimeStart to only account for time _after_ the netns has been
created.

To match what we currently do in cri/server, let's move it to just after
the get the sandbox runtime.

This come up when porting userns to sbserver, as the CNI network setup
needs to be done at a later stage and runtimeStart was accounting for
the CNI network setup time only when userns is enabled.

To avoid that discrepancy, let's just move it earlier, that also matches
what we do in cri/server.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 13:58:45 +02:00
Rodrigo Campos
9d9903565a cri: Fix comment typos
Beside the "in future the when" typo, we take the chance to reflect that
user namespaces are already merged.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-11 13:58:45 +02:00
wangxiang
232538b768 bugfix(port-forward): Correctly handle known errors
These two errors can occur in the following scenarios:

ECONNRESET: the target process reset connection between CRI and itself.
see: #111825 for detail

EPIPE: the target process did not read the received data, causing the
buffer in the kernel to be full, resulting in the occurrence of Zero Window,
then closing the connection (FIN, RESET)
see: #74551 for detail

In both cases, we should RESET the httpStream.

Signed-off-by: wangxiang <scottwangsxll@gmail.com>
2023-07-11 11:06:13 +08:00
rongfu.leng
38f9bc3e0a fix ci Linux Integration test fail
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2023-07-07 14:51:04 +08:00
Rodrigo Campos
c17d3bdb54 pkg/cri/server: Test net.ipv4.ping_group_range works with userns
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-06 14:20:26 +02:00
Rodrigo Campos
9bf5aeca77 pkg/cri/server: Fix net.ipv4.ping_group_range with userns
userns.RunningInUserNS() checks if the code calling that function is
running inside a user namespace. But we need to check if the container
we will create will use a user namespace, in that case we need to
disable the sysctl too (or we would need to take the userns mapping into
account to set the IDs).

This was added in PR:
        https://github.com/containerd/containerd/pull/6170/

And the param documentation says it is not enabled when user namespaces
are in use:
        https://github.com/containerd/containerd/pull/6170/files#diff-91d0a4c61f6d3523b5a19717d1b40b5fffd7e392d8fe22aed7c905fe195b8902R118

I'm not sure if the intention was to disable this if containerd is
running inside a userns (rootless, if that is even supported) or just
when the pod has user namespaces.

Out of an abundance of caution, I'm keeping the userns.RunningInUserNS()
so it is still not used if containerd runs inside a user namespace.

With this patch and "enable_unprivileged_icmp = true" in the config,
running containerd as root on the host, pods with user namespaces start
just fine. Without this patch they fail with:
        ... failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: w
 /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Thanks a lot to Andy on the k8s slack for reporting the issue. He also
mentions he hits this with k3s on a default installation (the param
is off by default on containerd, but k3s turns that on by default it
seems). He also debugged which part of the stack was setting that
sysctl, found the PR that added this code in containerd and a workaround
(to turn the bool off).

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
2023-07-06 14:20:26 +02:00
Kazuyoshi Kato
099d2e7c76 Merge pull request #8757 from dcantah/proto-api-conversions
Add From/ToProto helpers
2023-07-03 10:59:08 -07:00
Danny Canter
f3b7436b61 Platforms: Add From/ToProto helpers for types
Helpers to convert from a slice of platforms to our protobuf representation
and vice-versa appear a couple times. It seems sane to just expose this facility
in the platforms pkg.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-28 19:54:56 -07:00
Kazuyoshi Kato
81bc6ce6e9 Merge pull request #8740 from djdongjin/platform-parseall
Add a platform.ParseAll helper
2023-06-28 08:01:12 -07:00
Jin Dong
0a92661e69 Add a platform.ParseAll helper
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2023-06-26 20:34:37 +00:00
Kazuyoshi Kato
9b4ed8acc2 Merge pull request #8696 from fuweid/deflaky-blockfile
chore: deflake the blockfile testsuite
2023-06-26 09:54:33 -07:00
Phil Estes
1a5eaa9ad0 Merge pull request #8732 from thaJeztah/epoch_export_parse
pkg/epoch: extract parsing SOURCE_DATE_EPOCH to a function
2023-06-23 17:06:21 -04:00
Sebastiaan van Stijn
8760b87174 pkg/epoch: extract parsing SOURCE_DATE_EPOCH to a function
This introduces a ParseSourceDateEpoch function, which can be used
to parse "SOURCE_DATE_EPOCH" values for situations where those
values are not passed through an env-var (or the env-var has been
read through other means).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 17:32:02 +02:00
Sebastiaan van Stijn
9924e56f42 pkg/epoch: fix tests on macOS
These tests were failing on my macOS; could be the precision issue (like on
Windows), or just because they're "too fast".

    === RUN   TestSourceDateEpoch/WithoutSourceDateEpoch
        epoch_test.go:51:
                Error Trace:	/Users/thajeztah/go/src/github.com/containerd/containerd/pkg/epoch/epoch_test.go:51
                Error:      	Should be true
                Test:       	TestSourceDateEpoch/WithoutSourceDateEpoch
                Messages:   	now: 2023-06-23 11:47:09.93118 +0000 UTC, v: 2023-06-23 11:47:09.93118 +0000 UTC

This patch:

- updates the rightAfter utility to allow the timestamps to be "equal"
- updates the asserts to provide some details about the timestamps
- uses UTC for the value we're comparing to, to match the timestamps
  that are generated.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 17:29:55 +02:00
Danny Canter
dfd7ad8b37 Reword Windows file related TODO
https://github.com/golang/go/issues/32088 was never accepted or implemented
in 1.14.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-23 05:42:44 -07:00
Sebastiaan van Stijn
44e2b26a87 pkg/epoch: replace some fmt.Sprintfs with strconv
Teeny-tiny optimizations:

    BenchmarkSprintf-10       37735996    32.31  ns/op  0 B/op  0 allocs/op
    BenchmarkItoa-10         591945836     2.031 ns/op  0 B/op  0 allocs/op
    BenchmarkFormatUint-10   593701444     2.014 ns/op  0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-23 13:10:58 +02:00
Markus Lehtonen
f60a4a2718 cri: drop unused arg from generateRuntimeOptions
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2023-06-19 16:11:36 +03:00
Wei Fu
6dfb16f99a snapshots|pkg: umount without DETACH and nosync after umount
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-06-15 23:53:47 +08:00
Danny Canter
d278d37caa Sandbox: Add Metrics rpc for controller
As a follow up change to adding a SandboxMetrics rpc to the core
sandbox service, the controller needed a corresponding rpc for CRI
and others to eventually implement.

This leaves the CRI (non-shim mode) controller unimplemented just to
have a change with the API addition to start.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-13 00:24:09 -07:00
Derek McGowan
dd5e9f6538 Merge pull request #7944 from adisky/new-pinned-image
CRI Pinned image support
2023-06-10 22:29:34 -07:00
Derek McGowan
98b7dfb870 Merge pull request #8673 from thaJeztah/no_any
avoid "any" as variable name
2023-06-10 20:44:30 -07:00
Sebastiaan van Stijn
4bb709c018 avoid "any" as variable name
Avoid shadowing / confusion with Go's "any" built-in type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-10 13:49:06 +02:00