go1.16.7 (released 2021-08-05) includes a security fix to the net/http/httputil
package, as well as bug fixes to the compiler, the linker, the runtime, the go
command, and the net/http package. See the Go 1.16.7 milestone on the issue
tracker for details:
https://github.com/golang/go/issues?q=milestone%3AGo1.16.7+label%3ACherryPickApproved
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Looking at how this image is used, I think we don't even need the
source in the final image, so we can build containerd in a separate
stage, and copy the binaries.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Not critical for intermediate stages, but a minor optimization to
reduce the image cache. Ideally, this would use cache-mounts for this,
but those may not be supported by podman, so taking the traditional
approach.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Building critools only requires the install script and the critools-version
file (to determin the version to build). Moving it to a separate stage
prevents rebuilding it if unrelated changes are made in the code.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Building cni only requires the install script, and the go.mod (to determin
the version to install). Moving it to a separate stage prevents it from
being rebuilt if unrelated changes were made in the codebase.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This allows for easier copying artifacts from stages, by just copying
the directory content to the stage where it's used. These stages are
not used to be run individually so do not have to be "runnable".
Each stage is "responsible" for colllecting all aftifacts in the directory,
so that "consumer" stages do not have to be aware of what needs to be copied.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This makes the following changes:
- The containerd/config.toml, and docker-entrypoint.sh only occasionally change,
so copy them before copying the source code to allow them to be cached.
- The cri-in-userns stage does not need files from proto3, so do not copy them
- The dev environment does need the file from the proto3 stage, so copy them there.
- Change the order of stages. Our CI uses `podman build` which (I think) does not
skips stages that are not used for the specified target (like BuildKit does).
So I moved stages that are not used for the `cri-in-userns` after that stage.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The `cri-in-userns` stage is for testing "CRI-in-UserNS", which should be used in conjunction with "Kubelet-in-UserNS":
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2033-kubelet-in-userns-aka-rootless
This feature is mostly expected to be used for `kind` and `minikube`.
Requires Rootless Docker/Podman/nerdctl with cgroup v2 delegation: https://rootlesscontaine.rs/getting-started/common/cgroup2/
(Rootless Docker/Podman/nerdctl prepares the UserNS, so we do not need to create UserNS by ourselves)
Usage:
```
podman build --target cri-in-userns -t cri-in-userns -f contrib/Dockerfile.test .
podman run -it --rm --privileged cri-in-userns
```
The stage is tested on CI with Rootless Podman on Fedora 34 on Vagrant.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Note that this is the code in containerd that uses runc (as almost
a library). Please see the other commit for the update to runc binary
itself.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
This moves the runc version to build to scripts/setup/runc-version,
which makes it easier for packagers to find the default version
to use.
The RUNC_VERSION environment variable can still be used to override
the version, which can be used (e.g.) to test against different versions
in our CI.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Now that the dependency on runc (libcontaienr) code has been reduced
considerably, it is probbaly ok to cut the version dependency between
libcontainer and the runc binary that is supported.
This patch separates the runc binary version from the version of
libcontainer that is defined in go.mod, and updates the documentation
accordingly.
The RUNC_COMMIT variable in the install-runc script is renamed to
RUNC_VERSION to encourage using tagged versions, and the Dockerfile
in contrib is updated to allow building with a custom version.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This function will be used by nerdctl for printing the default AppArmor
profile: `nerdctl system inspect apparmor-profile`
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
On newer kernels and systems, AppArmor will block sending signals in
many scenarios by default resulting in strange behaviours (container
programs cannot signal each other, or host processes like containerd
cannot signal containers).
The reason this happens only on some distributions (and is not a kernel
regression) is that the kernel doesn't enforce signal mediation unless
the profile contains signal rules. However because our profies #include
the distribution-managed <abstractions/base>, some distributions added
signal rules -- which results in AppArmor enforcing signal mediation and
thus a regression. On these systems, containers cannot send and receive
signals at all -- meaning they cannot signal each other and the
container runtime cannot kill them either.
This issue was fixed in Docker in 2018[1] but this code was copied
before then and thus the patches weren't carried. It also contains a new
fix for a more esoteric case[2]. Ideally this code should live in a
project like "containerd/apparmor" so that Docker, libpod, and
containerd can share it, but that's probably something to do separately.
In addition, the copyright header is updated to reference that the code
is copied from Docker (and thus was not written entirely by the
containerd authors).
[1]: https://github.com/docker/docker/pull/37831
[2]: https://github.com/docker/docker/pull/41337
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
These syscalls (some of which have been in Linux for a while but were
missing from the profile) fall into a few buckets:
* close_range(2), epoll_wait2(2) are just extensions of existing "safe
for everyone" syscalls.
* The mountv2 API syscalls (fs*(2), move_mount(2), open_tree(2)) are
all equivalent to aspects of mount(2) and thus go into the
CAP_SYS_ADMIN category.
* process_madvise(2) is similar to the other process_*(2) syscalls and
thus goes in the CAP_SYS_PTRACE category.
Co-authored-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>