1. it's easy to check wrong input if using drain_exec_sync_io_timeout in error
2. avoid to use full error message, as part of error generated by go
stdlib would be changed in the future
3. delete the extra empty line
Signed-off-by: Wei Fu <fuweid89@gmail.com>
By default, the child processes spawned by exec process will inherit standard
io file descriptors. The shim server creates a pipe as data channel. Both exec
process and its children write data into the write end of the pipe. And the
shim server will read data from the pipe. If the write end is still open, the
shim server will continue to wait for data from pipe.
So, if the exec command is like `bash -c "sleep 365d &"`, the exec process is
bash and quit after create `sleep 365d`. But the `sleep 365d` will hold the
write end of the pipe for a year! It doesn't make senses that CRI plugin
should wait for it.
For this case, we should use timeout to drain exec process's io instead of
waiting for it.
Fixes: #7802
Signed-off-by: Wei Fu <fuweid89@gmail.com>
The snapshotter annotation definitions and related functions have been
public in the new packge snapshotter
Also remove a test for container image layer's annotation.
Signed-off-by: Changwei Ge <gechangwei@bytedance.com>
From golangci-lint:
> SA1019: rand.Read has been deprecated since Go 1.20 because it
>shouldn't be used: For almost all use cases, crypto/rand.Read is more
>appropriate. (staticcheck)
> SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative
>has been available since Go 1.0: Programs that call Seed and then expect
>a specific sequence of results from the global random source (using
>functions such as Int) can be broken when a dependency changes how
>much it consumes from the global random source. To avoid such breakages,
>programs that need a specific result sequence should use
>NewRand(NewSource(seed)) to obtain a random generator that other
>packages cannot access. (staticcheck)
See also:
- https://pkg.go.dev/math/rand@go1.20#Read
- https://pkg.go.dev/math/rand@go1.20#Seed
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
There is a new CNI capability argument, cgroupPath, where runtimes can
pass cgroup paths to CNI plugins.
Implement that.
Signed-off-by: Casey Callendrello <cdc@isovalent.com>
All of the CRI sandbox and container specs all get assigned
almost the exact same default annotations (sandboxID, name, metadata,
container type etc.) so lets make a helper to return the right set for
a sandbox or regular workload container.
Signed-off-by: Danny Canter <danny@dcantah.dev>
Split out the criService-agnostic bits of nri-api* from
pkg/cri/server to pkg/cri/nri to allow sharing a single
implementation betwen the server and sbserver versions.
Rework the interfaces to not require access to package
internals.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
In https://github.com/containerd/containerd/pull/7764 it was made
so that generic runtime options in the containerd toml config file
would get passed to shims regardless of if containerd knew of the
type beforehand and could supply the struct. However, this was only
added for the sandbox server fork here and not the regular ol' CRI
server. This change just mirrors the parts that need to be plopped in
pkg/cri/server
Signed-off-by: Danny Canter <danny@dcantah.dev>
Before this patch, both the RdtEnabled and BlockIOEnabled are provided
by services/tasks pkg. Since the services/tasks can be pkg plugin which
can be initialized multiple times or concurrently. It will fire data-race
issue as there is no mutex to protect `enable`.
This patch is aimed to provide wrapper pkgs to use intel/{blockio,rdt}
safely.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
/etc/cni has to be readable for non-root users (0755), because /etc/cni/tuning/allowlist.conf is used for rootless mode too.
This file was introduced in CNI plugins 1.2.0 (containernetworking/plugins PR 693), and its path is hard-coded.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Some of this code was originally added in b7b1200dd3,
which likely meant to initialize the slice with a length to reduce allocations,
however, instead of initializing with a zero-length and a capacity, it
initialized the slice with a fixed length, which was corrected in commit
0c63c42f81.
This patch initializes the slice with a zero-length and expected capacity.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Comments in initPlatform for Windows states that the options were
Linux specific. Additionally properly wrap an error after trying
to setup CDI on Linux.
Signed-off-by: Danny Canter <danny@dcantah.dev>
The sandbox and container both have the userns config. Lets make sure
they are the same, therefore consistent.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Currently we require that c.containerSpec() does not return an error
if test.err is not set.
However, if the require fails (i.e. it indeed returned an error) the
rest of the code is executed anyways. The rest of the code assumes it
did not return an error (so code assumes spec is not nil). This fails
miserably if it indeed returned an error, as spec is nil and go crashes
while running the unit tests.
Let's require it is not an error, so code does not continue to execute
if that fails and go doesn't crash.
In the test.err case is not harmful the bug of using assert, but let's
switch it to require too as that is what we really want.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
A domainname field was recently added to the OCI spec. Prior to this
folks would need to set this with a sysctl, but now runtimes should be
able to setdomainname(2). There's an open change to runc at the moment
to add support for this so I've just left testing as a couple spec
validations in CRI until that's in and usable.
Signed-off-by: Danny Canter <danny@dcantah.dev>
This patch requests the OCI runtime to create a userns when the CRI
message includes such request.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
This allows user namespace support to progress, either by allowing
snapshotters to deal with ownership, or falling back to containerd doing
a recursive chown.
In the future, when snapshotters implement idmap mounts, they should
report the "remap-ids" capability.
Co-authored-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Signed-off-by: David Leadbeater <dgl@dgl.cx>
As explained in the comments, this patch lets the OCI runtime create the
netns when userns are in use. This is needed because the netns needs to
be owned by the userns (otherwise can't modify the IP, etc.).
Before this patch, we are creating the netns and then starting the pod
sandbox asking to join this netns. This can't never work with userns, as
the userns needs to be created first for the netns ownership to be
correct.
One option would be to also create the userns in containerd, then create
the netns. But this is painful (needs tricks with the go runtime,
special care to write the mapping, etc.).
So, we just let the OCI runtime create the userns and netns, that
creates them with the proper ownership.
As requested by Mike Brown, the current code when userns is not used is
left unchanged. We can unify the cases (with and without userns) in a
future release.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Due to when we were updating the pod sandboxes underlying container
object, the pointer to the sandbox would have the right info, but
the on-disk representation of the data was behind. This would cause
the data returned from loading any sandboxes after a restart to have
no CNI result or IP information for the pod.
This change does an additional update to the on-disk container info
right after we invoke the CNI plugin so the metadata for the CNI result
and other networking information is properly flushed to disk.
Signed-off-by: Danny Canter <danny@dcantah.dev>
Skip automatic `if swapLimit == 0 { s.Linux.Resources.Memory.Swap = &limit }` when the swap controller is missing.
(default on Ubuntu 20.04)
Fix issue 7828 (regression in PR 7783 "cri: make swapping disabled with memory limit")
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
While we need to support CRI v1alpha2, the implementation doesn't have
to be tied to gogo/protobuf.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
We do a ton of host networking checks around the CRI plugin, all mainly
doing the same thing of checking the different quirks on various platforms
(for windows are we a HostProcess pod, for linux is namespace mode the
right thing, darwin doesn't have CNI support etc.) which could all be
bundled up into a small helper that can be re-used.
Signed-off-by: Danny Canter <danny@dcantah.dev>