Commit Graph

1223 Commits

Author SHA1 Message Date
Derek McGowan
c1bcabb454
Merge pull request from GHSA-5ffw-gxpp-mxpf
Limit the response size of ExecSync
2022-06-06 10:19:23 -07:00
Kazuyoshi Kato
40aa4f3f1b
Implicitly discard the input to drain the reader
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-06-06 09:57:13 -07:00
Phil Estes
2b661b890f
Merge pull request #6899 from shuaichang/ISSUE6657-support-runtime-snapshotter
Support runtime level snapshotter for issue 6657
2022-06-03 10:04:53 +02:00
shuaichang
7b9f1d4058 Added support for runtime level snapshotter, issue 6657
Signed-off-by: shuaichang <shuai.chang@databricks.com>

Updated annotation name
2022-06-02 16:29:59 -07:00
Fu Wei
aa0aaa4947
Merge pull request #7009 from mikebrow/update-gocni 2022-06-02 11:09:46 +08:00
Mike Brown
e3b4d750db update go-cni/for cni update fixing plugins that don't respond with version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2022-06-01 17:20:18 -05:00
Kazuyoshi Kato
c149e6c2ea
Merge pull request #6996 from dcantah/hpc-validations
Add validations for Windows HostProcess CRI configs
2022-06-01 11:37:12 -07:00
Kazuyoshi Kato
fcd0c86c70
Merge pull request #7007 from dmcgowan/move-docker-sort
Move docker reference logic to reference/docker package
2022-06-01 11:33:52 -07:00
Phil Estes
5bc2d2e429
Merge pull request #7003 from pacoxu/pause-3.7
promote pause image to 3.7 (sync with kube v1.24)
2022-06-01 05:59:14 -04:00
Derek McGowan
8ed54849a6
Move docker reference logic to reference/docker package
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-05-31 22:40:49 -07:00
Mike Brown
8c27ce4193
Merge pull request #6993 from mxpv/images
CRI: cleanup cri/store package
2022-05-31 20:38:43 -05:00
Kazuyoshi Kato
49ca87d727 Limit the response size of ExecSync
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-05-31 22:21:35 +00:00
Paco Xu
1cf6f20320 promote pause image to 3.7
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-05-30 15:08:28 +08:00
Daniel Canter
b5e1b8f619 Use t.Run for /pkg/cri tests
A majority of the tests in /pkg/cri are testing/validating multiple
things per test (generally spec or options validations). This flow
lends itself well to using *testing.T's Run method to run each thing
as a subtest so `go test` output can actually display which subtest
failed/passed.

Some of the tests in the packages in pkg/cri already did this, but
a bunch simply logged what sub-testcase was currently running without
invoking t.Run.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-29 18:32:09 -07:00
Maksym Pavlenko
b572a82ad8 CRI: Remove deprecated error types and update error msg
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-28 13:53:28 -07:00
Daniel Canter
978ff393d2 Add validations for Windows HostProcess CRI configs
HostProcess containers require every container in the pod to be a
host process container and have the corresponding field set. The Kubelet
usually enforces this so we'd error before even getting here but we recently
found a bug in this logic so better to be safe than sorry.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-27 21:17:07 -07:00
Maksym Pavlenko
688b30cf52 CRI: Move truncindex to pkg
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 13:02:45 -07:00
Maksym Pavlenko
e44335800e CRI: Move reference sorting to reference package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 12:52:36 -07:00
Maksym Pavlenko
b5366f8d7e CRI: Retrieve image spec on client
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 12:38:55 -07:00
AllenZMC
eaec6530d7 fix some confusing typos
Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>
2022-05-17 23:53:36 +08:00
Akihiro Suda
42584167b7
Officially deprecate Schema 1
Schema 1 has been substantially deprecated since circa. 2017 in favor of Schema 2 introduced in Docker 1.10 (Feb 2016)
and its successor OCI Image Spec v1, but we have not officially deprecated Schema 1.

One of the reasons was that Quay did not support Schema 2 so far, but it is reported that Quay has been
supporting Schema 2 since Feb 2020 (moby/buildkit issue 409).

This PR deprecates pulling Schema 1 images but the feature will not be removed before containerd 2.0.
Pushing Schema 1 images was never implemented in containerd (and its consumers such as BuildKit).

Docker/Moby already disabled pushing Schema 1 images in Docker 20.10 (moby/moby PR 41295),
but Docker/Moby has not yet disabled pulling Schema 1 as containerd has not yet deprecated Schema 1.
(See the comments in moby/moby PR 42300.)
Docker/Moby is expected to disable pulling Schema 1 images in future after this deprecation.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-05-02 19:08:38 +09:00
Mike Brown
6b35307594
Merge pull request #5490 from askervin/5Bu_blockio
Support for cgroups blockio
2022-04-29 10:07:56 -05:00
Antti Kervinen
10576c298e cri: support blockio class in pod and container annotations
This patch adds support for a container annotation and two separate
pod annotations for controlling the blockio class of containers.

The container annotation can be used by a CRI client:
  "io.kubernetes.cri.blockio-class"

Pod annotations specify the blockio class in the K8s pod spec level:
  "blockio.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "blockio.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Correspondingly, this patch adds support for --blockio-class and
--blockio-config-file to ctr, too.

This implementation follows the resource class annotation pattern
introduced in RDT and merged in commit 893701220.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2022-04-29 11:44:09 +03:00
Kazuyoshi Kato
29b9379560 make protos
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-27 21:31:16 +00:00
Kazuyoshi Kato
fcba486366 Remove gogo from .proto files
While gogo isn't actually used, it is still referenced from .proto files
and its corresponding Go package is imported from the auto-generated
files.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-27 20:27:55 +00:00
Kazuyoshi Kato
7bd42d226a
Merge pull request #6856 from kangclzjc/container-remove-dup-20220426
remove duplicate
2022-04-27 09:32:08 -07:00
Derek McGowan
6e0231f992
Merge pull request #6150 from fuweid/support-4984
feature: support image pull progress timeout
2022-04-26 12:15:09 -07:00
Wei Fu
00d102da9f feature: support image pull progress timeout
Kubelet sends the PullImage request without timeout, because the image size
is unknown and timeout is hard to defined. The pulling request might run
into 0B/s speed, if containerd can't receive any packet in that connection.
For this case, the containerd should cancel the PullImage request.

Although containerd provides ingester manager to track the progress of pulling
request, for example `ctr image pull` shows the console progress bar, it needs
more CPU resources to open/read the ingested files to get status.

In order to support progress timeout feature with lower overhead, this
patch uses http.RoundTripper wrapper to track active progress. That
wrapper will increase active-request number and return the
countingReadCloser wrapper for http.Response.Body. Each bytes-read
can be count and the active-request number will be descreased when the
countingReadCloser wrapper has been closed. For the progress tracker,
it can check the active-request number and bytes-read at intervals. If
there is no any progress, the progress tracker should cancel the
request.

NOTE: For each blob data, the containerd will make sure that the content
writer is opened before sending http request to the registry. Therefore, the
progress reporter can rely on the active-request number.

fixed: #4984

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-27 00:02:27 +08:00
Fu Wei
0d696d2f4b
Merge pull request #6749 from dmcgowan/unpacker-interface 2022-04-26 20:54:51 +08:00
Kang.Zhang
fceab7f4c4 remove duplicate
Signed-off-by: Kang.Zhang <Kang.zhang@intel.com>
2022-04-26 10:44:45 +08:00
Derek McGowan
0e6c7bf931
Fix undefined error in use of errors package
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-25 15:21:21 -07:00
Derek McGowan
3dbd6a2498
Merge pull request #6841 from kzys/proto-upgrade-6
Migrate off from github.com/gogo/protobuf
2022-04-25 15:12:51 -07:00
Kazuyoshi Kato
f140400c0e
Merge pull request #5686 from dtnyn/issue-5679
Add flag to allow oci.WithAllDevicesAllowed on PrivilegedWithoutHostDevices
2022-04-25 11:44:01 -07:00
Kazuyoshi Kato
7a4f81d8ba Fix tests
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:41:05 +00:00
Kazuyoshi Kato
9dbe000a38 make protos
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:31:53 +00:00
Kazuyoshi Kato
e3db7de8f5 Remove gogo/protobuf and adjust types
This commit migrates containerd/protobuf from github.com/gogo/protobuf
to google.golang.org/protobuf and adjust types. Proto-generated structs
cannot be passed as values.

Fixes #6564.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:31:53 +00:00
Henry Wang
8710d4d014 cri: close fifos when container is deleted
Signed-off-by: Henry Wang <henwang@amazon.com>
2022-04-21 21:46:50 +00:00
Kazuyoshi Kato
237ef0de9b Remove all gogoproto extensions
This commit removes the following gogoproto extensions;

- gogoproto.nullable
- gogoproto.customename
- gogoproto.unmarshaller_all
- gogoproto.stringer_all
- gogoproto.sizer_all
- gogoproto.marshaler_all
- gogoproto.goproto_unregonized_all
- gogoproto.goproto_stringer_all
- gogoproto.goproto_getters_all

None of them are supported by Google's toolchain (see #6564).

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-20 07:23:28 +00:00
Derek McGowan
39692e7672
unpack: return error when no platforms defined
Require platforms to be non-empty to avoid no-op unpack

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-19 18:21:57 -07:00
Derek McGowan
8017daa12d
Add unpack interface to be used by client
Move client unpacker to pkg/unpack

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-19 18:21:57 -07:00
Kazuyoshi Kato
88c0c7201e Consolidate gogo/protobuf dependencies under our own protobuf package
This would make gogo/protobuf migration easier.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 15:53:36 +00:00
Kazuyoshi Kato
80b825ca2c Remove gogoproto.stdtime
This commit removes gogoproto.stdtime, since it is not supported by
Google's official toolchain
(see https://github.com/containerd/containerd/issues/6564).

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 13:39:30 +00:00
Derek McGowan
fe8da6dcaf
Move lease manager plugin to separate package
Create lease plugin type to separate lease manager from services plugin.
This allows other service plugins to depend on the lease manager.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-15 11:08:47 -07:00
Derek McGowan
98260e1b18
Merge pull request #6806 from mikebrow/netns-hardening
check for duplicate nspath possibilities
2022-04-14 15:02:44 -07:00
Mike Brown
147f0a7e02 check for duplicate nspath possibilities
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2022-04-14 18:33:19 +00:00
Andrey Klimentyev
5f3ce9512b Do not append []string{""} to command to preserve Docker compatibility
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
2022-04-13 13:29:49 +03:00
Eric Lin
a5dfbfcf5a cri: load sandboxes/containers/images in parallel
Parallelizing them decreases loading duration.

Time to complete recover():
* Without competing IOs + without opt: 21s
* Without competing IOs + with opt: 14s
* Competing IOs + without opt: 3m44s
* Competing IOs + with opt: 33s

Signed-off-by: Eric Lin <linxiulei@gmail.com>
2022-04-09 13:01:14 +00:00
Ed Bartosh
ff5c55847a move CDI calls to the linux-only code
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:59 +03:00
Ed Bartosh
c9b4ccf83e add configuration for CDI
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:54 +03:00
Ed Bartosh
aed0538dac cri: implement CDI device injection
Extract the names of requested CDI devices and update the OCI
Spec according to the corresponding CDI device specifications.

CDI devices are requested using container annotations in the
cdi.k8s.io namespace. Once CRI gains dedicated fields for CDI
injection the snippet for extracting CDI names will need an
update.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:07:54 +03:00
Wei Fu
8113758568 CRI: improve image pulling performance
Background:

With current design, the content backend uses key-lock for long-lived
write transaction. If the content reference has been marked for write
transaction, the other requestes on the same reference will fail fast with
unavailable error. Since the metadata plugin is based on boltbd which
only supports single-writer, the content backend can't block or handle
the request too long. It requires the client to handle retry by itself,
like OpenWriter - backoff retry helper. But the maximum retry interval
can be up to 2 seconds. If there are several concurrent requestes fo the
same image, the waiters maybe wakeup at the same time and there is only
one waiter can continue. A lot of waiters will get into sleep and we will
take long time to finish all the pulling jobs and be worse if the image
has many more layers, which mentioned in issue #4937.

After fetching, containerd.Pull API allows several hanlers to commit
same ChainID snapshotter but only one can be done successfully. Since
unpack tar.gz is time-consuming job, it can impact the performance on
unpacking for same ChainID snapshotter in parallel.

For instance, the Request 2 doesn't need to prepare and commit, it
should just wait for Request 1 finish, which mentioned in pull
request #6318.

```text
	Request 1	Request 2

	Prepare
	   |
	   |
	   |
	   |		Prepare
	Commit		   |
			   |
			   |
			   |
			Commit(failed on exist)
```

Both content backoff retry and unnecessary unpack impacts the performance.

Solution:

Introduced the duplicate suppression in fetch and unpack context. The
deplicate suppression uses key-mutex and single-waiter-notify to support
singleflight. The caller can use the duplicate suppression in different
PullImage handlers so that we can avoid unnecessary unpack and spin-lock
in OpenWriter.

Test Result:

Before enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	1m6.172s
user	0m0.268s
sys	0m0.193s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.324s
user	0m0.441s
sys	0m0.316s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	1m47.657s
user	0m0.284s
sys	0m0.224s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.381s
user	0m0.488s
sys	0m0.358s
```

With this enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	0m1.140s
user	0m0.243s
sys	0m0.178s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.239s
user	0m0.463s
sys	0m0.275s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	0m5.546s
user	0m0.217s
sys	0m0.219s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.090s
user	0m0.501s
sys	0m0.331s
```

Test Script:

localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The image is hold in local registry
service by `docker run -d -p 5000:5000 --name registry registry:2`.

```bash

image_name="${1}"
pull_times="${2:-10}"

cleanup() {
  ctr image rmi "${image_name}"
  ctr -n k8s.io image rmi "${image_name}"
  crictl rmi "${image_name}"
  docker rmi "${image_name}"
  sleep 2
}

crictl_testing() {
  for idx in $(seq 1 ${pull_times}); do
    crictl pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

docker_testing() {
  for idx in $(seq 1 ${pull_times}); do
    docker pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

cleanup > /dev/null 2>&1

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```

Fixes: #4937
Close: #4985
Close: #6318

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-06 07:14:18 +08:00
Maksym Pavlenko
871b6b6a9f Use testify
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 18:17:58 -07:00
Fu Wei
d394e00c7e
Merge pull request #6738 from zhsj/fix-test-msg
Fix error message in TestNewBinaryIO
2022-03-25 23:40:06 +08:00
Phil Estes
3633cae64b
Merge pull request #6706 from kzys/typeurl-upgrade
Use typeurl.Any instead of github.com/gogo/protobuf/types.Any
2022-03-25 10:38:46 -04:00
Shengjing Zhu
2689432bfa Fix error message in TestNewBinaryIO
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-03-25 11:56:21 +08:00
Kazuyoshi Kato
96b16b447d Use typeurl.Any instead of github.com/gogo/protobuf/types.Any
This commit upgrades github.com/containerd/typeurl to use typeurl.Any.
The interface hides gogo/protobuf/types.Any from containerd's Go client.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-03-24 20:50:07 +00:00
Fu Wei
e7cba85e1c
Merge pull request #6323 from jepio/jepio/fix-cgroupv2-oom-event 2022-03-24 15:51:28 +08:00
Derek McGowan
551516a18d
Merge pull request from GHSA-c9cp-9c75-9v8c
Fix the Inheritable capability defaults.
2022-03-23 10:50:56 -07:00
Phil Estes
36dcc76fa9
Merge pull request #6496 from thaJeztah/deprecate_criu_opt
runtime: deprecate runc --criu / -criu-path option
2022-03-23 11:17:58 -04:00
Sebastiaan van Stijn
d2013d2c99
runtime: deprecate runc --criu / -criu-path option
runc option --criu is now ignored (with a warning), and the option will be
removed entirely in a future release. Users who need a non- standard criu
binary should rely on the standard way of looking up binaries in $PATH.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-23 14:42:43 +01:00
Amit Barve
bfde58e3cd Bug fix for mount path handling
Currently when handling 'container_path' elements in container mounts we simply call
filepath.Clean on those paths. However, filepath.Clean adds an extra '.' if the path is a
simple drive letter ('E:' or 'Z:' etc.). These type of paths cause failures (with incorrect
parameter error) when creating containers via hcsshim. This commit checks for such paths
and doesn't call filepath.Clean on them.
It also adds a new check to error out if the destination path is a C drive and moves the
dst path checks out of the named pipe condition.

Signed-off-by: Amit Barve <ambarve@microsoft.com>
2022-03-21 09:40:19 -07:00
Phil Estes
ee49c4d557
Add nolint:staticcheck to platform-specific calls
The linter on platforms that have a hardcoded response complains about
"if xyz == nil" checks; ignore those.

Signed-off-by: Phil Estes <estesp@amazon.com>
2022-03-17 18:24:00 -04:00
Fu Wei
d9797673b0
Merge pull request #6593 from qiutongs/improve-container-mount
Make the temp mount as ready only in container WithVolumes
2022-03-18 00:03:28 +08:00
Eng Zer Jun
18ec2761c0
test: use T.TempDir to create temporary test directory
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-03-15 14:03:50 +08:00
Paul "TBBle" Hampson
39d52118f5 Plumb CRI Devices through to OCI WindowsDevices
There's two mappings of hostpath to IDType and ID in the wild:
- dockershim and dockerd-cri (implicitly via docker) use class/ID
-- The only supported IDType in Docker is 'class'.
-- https://github.com/aarnaud/k8s-directx-device-plugin generates this form
- https://github.com/jterry75/cri (windows_port branch) uses IDType://ID
-- hcsshim's CRI test suite generates this form

`://` is much more easily distinguishable, so I've gone with that one as
the generic separator, with `class/` as a special-case.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-12 08:16:43 +11:00
Derek McGowan
8acbb27647
Merge pull request from GHSA-crp2-qrr5-8pq7
Clean image volume path
2022-03-02 10:03:17 -08:00
Shengjing Zhu
352a8f49f7 cri: relax test for system without hugetlb
These unit tests don't check hugetlb. However by setting
TolerateMissingHugetlbController to false, these tests can't
be run on system without hugetlb (e.g. Debian buildd).

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-28 01:38:58 +08:00
Qiutong Song
ec90efbe99 Make the temp mount as ready only in container WithVolumes
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2022-02-25 17:53:30 -08:00
Shengjing Zhu
ea3d2e6433 go.mod: update to github.com/tchap/go-patricia/v2 v2.3.1
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-26 05:04:55 +08:00
Phil Estes
2b2372d43e
Merge pull request #6337 from thaJeztah/bump_go_restful
go.mod: update to github.com/emicklei/go-restful/v3 v3.7.3
2022-02-22 17:33:37 -05:00
Shengjing Zhu
f4f41296c2 Replace golang.org/x/net/context with std library
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-22 02:27:05 +08:00
Sebastiaan van Stijn
481fb923c5
go.mod: update to github.com/emicklei/go-restful/v3 v3.7.3
full diff: https://github.com/emicklei/go-restful/compare/v2.9.5...v3.7.3

- Switch to using go modules
- Add check for wildcard to fix CORS filter
- Add check on writer to prevent compression of response twice
- Add OPTIONS shortcut WebService receiver
- Add Route metadata to request attributes or allow adding attributes to routes
- Add wroteHeader set
- Enable content encoding on Handle and ServeHTTP
- Feat: support google custom verb
- Feature: override list of method allowed without content-type
- Fix Allow header not set on '405: Method Not Allowed' responses
- Fix Go 1.15: conversion from int to string yields a string of one rune
- Fix WriteError return value
- Fix: use request/response resulting from filter chain
- handle path params with prefixes and suffixes
- HTTP response body was broken, if struct to be converted to JSON has boolean value
- List available representations in 406 body
- Support describing response headers
- Unwrap function in filter chain + remove unused dispatchWithFilters

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 21:54:27 +01:00
ruiwen-zhao
fb0b8d6177 Use fs.RootPath when mounting volumes
Signed-off-by: Ruiwen Zhao <ruiwen@google.com>
2022-02-17 19:20:00 +00:00
Derek McGowan
c0f8188469
Update go-cni to v1.1.2
Fixes panic when exec is nil

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-02-10 12:40:51 -08:00
Fu Wei
6a628b64ac
Merge pull request #6514 from marquiz/fixes/rdt 2022-02-08 09:31:49 +08:00
Markus Lehtonen
9b1fb82584 cri: fix handling of ignore_rdt_not_enabled_errors config option
We were not properly ignoring errors from
gorestrl.rdt.ContainerClassFromAnnotations() causing the config option
to be ineffective, in practice.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-02-04 13:54:03 +02:00
Jeremi Piotrowski
7275411ec8 cgroup2: monitor OOMKill instead of OOM to prevent missing container OOM events
With the cgroupv2 configuration employed by Kubernetes, the pod cgroup (slice)
and container cgroup (scope) will both have the same memory limit applied. In
that situation, the kernel will consider an OOM event to be triggered by the
parent cgroup (slice), and increment 'oom' there. The child cgroup (scope) only
sees an oom_kill increment. Since we monitor child cgroups for oom events,
check the OOMKill field so that we don't miss events.

This is not visible when running containers through docker or ctr, because they
set the limits differently (only container level). An alternative would be to
not configure limits at the pod level - that way the container limit will be
hit and the OOM will be correctly generated. An interesting consequence is that
when spawning a pod with multiple containers, the oom events also work
correctly, because:

a) if one of the containers has no limit, the pod has no limit so OOM events in
   another container report correctly.
b) if all of the containers have limits then the pod limit will be a sum of
   container events, so a container will be able to hit its limit first.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-02-03 13:39:16 +01:00
Jeremi Piotrowski
821c961c86 pkg/oom/v2: handle EventChan routine shutdown quietly
When the cgroup is removed, EventChan is closed (this was pulled in by
8d69c041c5). This results in a nil error
being received. Don't log an error in that case but instead return.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-02-03 13:20:46 +01:00
Andrew G. Morgan
6906b57c72
Fix the Inheritable capability defaults.
The Linux kernel never sets the Inheritable capability flag to
anything other than empty. Non-empty values are always exclusively
set by userspace code.

[The kernel stopped defaulting this set of capability values to the
 full set in 2000 after a privilege escalation with Capabilities
 affecting Sendmail and others.]

Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
2022-02-01 13:55:46 -08:00
Derek McGowan
4e9e14c2b6
Fix rdt build tags for go 1.16
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-01-19 11:09:29 -08:00
Takumasa Sakao
18592b2f5a Fix wrong log message
Signed-off-by: Takumasa Sakao <tsakao@zlab.co.jp>
2022-01-09 16:01:23 +09:00
haoyun
bbe46b8c43 feat: replace github.com/pkg/errors to errors
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: zounengren <zouyee1989@gmail.com>
2022-01-07 10:27:03 +08:00
Derek McGowan
644a01e13b
Merge pull request from GHSA-mvff-h3cj-wj9c
only relabel cri managed host mounts
2022-01-05 09:30:58 -08:00
Markus Lehtonen
9c2e3835fa cri: add ignore_rdt_not_enabled_errors config option
Enabling this option effectively causes RDT class of a container to be a
soft requirement. If RDT support has not been enabled the RDT class
setting will not have any effect.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Markus Lehtonen
f4a191917b cri: annotations for controlling RDT class
Use goresctrl for parsing container and pod annotations related to RDT.

In practice, from the users' point of view, this patchs adds support for
a container annotation and two separate pod annotations for controlling
the RDT class of containers.

Container annotation can be used by a CRI client:
  "io.kubernetes.cri.rdt-class"

Pod annotations for specifying the RDT class in the K8s pod spec level:
  "rdt.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "rdt.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Annotations are intended as an intermediate step before the CRI API
supports RDT.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Derek McGowan
2c9d80aba5
Merge pull request #6372 from fidencio/wip/seutil-fix-container_kvm_t-type-detection
seutil: Fix setting the "container_kvm_t" label
2021-12-15 10:35:04 -08:00
Phil Estes
949db57213
Merge pull request #6320 from endocrimes/dani/cri-swap
cri: add support for configuring swap
2021-12-14 15:02:28 -05:00
Phil Estes
330961c2d5
Merge pull request #6358 from jonyhy96/feat-error
refactor: functions for error log and error return
2021-12-14 10:16:54 -05:00
Fu Wei
d47fa40d1b
Merge pull request #6021 from dmcgowan/runc-shim-plugin 2021-12-14 10:19:23 +08:00
Derek McGowan
ac531108ab
Merge pull request #6155 from egernst/cri-update-for-sandbox-sizing
CRI update for sandbox sizing
2021-12-13 16:21:30 -08:00
Fabiano Fidêncio
f1c7993311 seutil: Fix setting the "container_kvm_t" label
The ability to handle KVM based runtimes with SELinux has been added as
part of d715d00906.

However, that commit introduced some logic to check whether the
"container_kvm_t" label would or not be present in the system, and while
the intentions were good, there's two major issues with the approach:
1. Inspecting "/etc/selinux/targeted/contexts/customizable_types" is not
   the way to go, as it doesn't list the "container_kvm_t" at all.
2. There's no need to check for the label, as if the label is invalid an
   "Invalid Label" error will be returned and that's it.

With those two in mind, let's simplify the logic behind setting the
"container_kvm_t" label, removing all the unnecessary code.

Here's an output of VMM process running, considering:
* The state before this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-88-g7fa44fc98 7fa44fc98f
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_runtime_t:s0 root 609717 4.0  0.5 2987512 83588 ?    Sl   08:32   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/be9d5cbabf440510d58d89fc8a8e77c27e96ddc99709ecaf5ab94c6b6b0d4c89/clh-api.sock
  ```

* The state after this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-89-ga5f2113c9 a5f2113c9fc15b19b2c364caaedb99c22de4eb32
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_kvm_t:s0:c638,c999 root 614842 14.0  0.5 2987512 83228 ? Sl 08:40   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/f8ff838afdbe0a546f6995fe9b08e0956d0d0cdfe749705d7ce4618695baa68c/clh-api.sock
  ```

Note, the tests were performed using the following configuration snippet:
```
[plugins]
  [plugins.cri]
    enable_selinux = true
    [plugins.cri.containerd]
      [plugins.cri.containerd.runtimes]
        [plugins.cri.containerd.runtimes.kata]
           runtime_type = "io.containerd.kata.v2"
           privileged_without_host_devices = true
```

And using the following pod yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  runtimeClassName: kata
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```

Fixes: #6371

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-12-14 00:09:17 +01:00
Alexander Minbaev
c8a009d18c add-list-stat: return container list if filter is nil
Signed-off-by: Alexander Minbaev <alexander.minbaev@ibm.com>
2021-12-13 15:09:18 -06:00
Eric Ernst
20419feaac cri, sandbox: pass sandbox resource details if available, applicable
CRI API has been updated to include a an optional `resources` field in the
LinuxPodSandboxConfig field, as part of the RunPodSandbox request.

Having sandbox level resource details at sandbox creation time will have
large benefits for sandboxed runtimes. In the case of Kata Containers,
for example, this'll allow for better support of SW/HW architectures
which don't allow for CPU/memory hotplug, and it'll allow for better
queue sizing for virtio devices associated with the sandbox (in the VM
case).

If this sandbox resource information is provided as part of the run
sandbox request, let's introduce a pattern where we will update the
pause container's runtiem spec to include this information in the
annotations field.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-12-13 08:41:41 -08:00
haoyun
c0d07094be feat: Errorf usage
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-12-13 14:31:53 +08:00
Michael Crosby
9b0303913f
only relabel cri managed host mounts
Co-authored-by: Samuel Karp <skarp@amazon.com>
Signed-off-by: Michael Crosby <michael@thepasture.io>
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-12-09 09:53:47 -08:00
Sebastiaan van Stijn
2d3009038c
cri/server: use consistent alias for pkg/ioutil
Consistently use cioutil to prevent it being confused for Golang's ioutil.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-09 17:47:22 +01:00
Danielle Lancashire
2fa4e9c0e2 cri: add support for configuring swap
Signed-off-by: Danielle Lancashire <dani@builds.terrible.systems>
2021-12-02 21:25:33 +01:00
Fu Wei
69822aa936
Merge pull request #6258 from wllenyj/fix-registry-panic 2021-11-19 13:35:46 +08:00
wanglei01
5f293d9ac4 [CRI] Fix panic when registry.mirrors use localhost
When containerd use this config:

```
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
      endpoint = ["http://localhost:5000"]
```

Due to the `newTransport` function does not initialize the `TLSClientConfig` field.
Then use `TLSClientConfig` to cause nil pointer dereference

Signed-off-by: wanglei <wllenyj@linux.alibaba.com>
2021-11-19 10:56:46 +08:00
Michael Crosby
aa2733c202
Merge pull request #6170 from olljanat/default-sysctls
CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
2021-11-18 11:37:23 -05:00
Derek McGowan
9afc778b73
Merge pull request #6111 from crosbymichael/latency-metrics
[cri] add sandbox and container latency metrics
2021-11-16 16:59:33 -08:00
Phil Estes
7758cdc09a
Merge pull request #6253 from jonyhy96/feat-rwmutex
feat: use rwmutex instead
2021-11-16 11:39:29 -05:00
Derek McGowan
6835a94707
Split runc shim into plugin components
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
Derek McGowan
6eea8f3f62
Add shutdown package
Allows shutdown to handle callbacks with similar behavior as context
cancel

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
haoyun
bef792b962 feat: use rwmutex instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-16 11:06:40 +08:00
Derek McGowan
d055487b00
Merge pull request #6206 from mxpv/path
Allow absolute path to shim binaries
2021-11-15 18:05:48 -08:00
Olli Janatuinen
2a81c9f677 CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-11-15 18:30:09 +02:00
Michael Crosby
6765524b73 use write lock when updating container stats
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-11 15:17:48 +00:00
Maksym Pavlenko
6870f3b1b8 Support custom runtime path when launching tasks
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-11-09 13:31:46 -08:00
Michael Crosby
91bbaf6799 [cri] add sandbox and container latency metrics
These are simple metrics that allow users to view more fine grained metrics on
internal operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-09 21:07:38 +00:00
Michael Crosby
4b7cc560b2
Merge pull request #6222 from jonyhy96/add-more-description
cleanup: add more description on comment
2021-11-09 15:55:32 -05:00
haoyun
5748006337 cleanup: add more description on comment
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-09 19:13:37 +08:00
David Porter
2e6d5709e3 Implement CRI container and pods stats
See https://kep.k8s.io/2371

* Implement new CRI RPCs - `ListPodSandboxStats` and `PodSandboxStats`
  * `ListPodSandboxStats` and `PodSandboxStats` which return stats about
    pod sandbox. To obtain pod sandbox stats, underlying metrics are
    read from the pod sandbox cgroup parent.
  * Process info is obtained by calling into the underlying task
  * Network stats are taken by looking up network metrics based on the
    pod sandbox network namespace path
* Return more detailed stats for cpu and memory for existing container
  stats. These metrics use the underlying task's metrics to obtain
  stats.

Signed-off-by: David Porter <porterdavid@google.com>
2021-11-03 17:52:05 -07:00
Dat Nguyen
afe39bebfe add oci.WithAllDevicesAllowed flag for privileged_without_host_devices
This commit adds a flag that enable all devices whitelisting when
privileged_without_host_devices is already enabled.

Fixes #5679

Signed-off-by: Dat Nguyen <dnguyen7@atlassian.com>
2021-11-04 10:24:19 +11:00
Mike Brown
ea89788105 adds additional debug out to timebox cni setup
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-11-01 09:34:29 -05:00
zounengren
a217b5ac8f bump CNI to spec v1.0.0
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-10-22 10:58:40 +08:00
Sambhav Kothari
2a8dac12a7 Output a warning for label image labels instead of erroring
This change ignore errors during container runtime due to large
image labels and instead outputs warning. This is necessary as certain
image building tools like buildpacks may have large labels in the images
which need not be passed to the container.

Signed-off-by: Sambhav Kothari <sambhavs.email@gmail.com>
2021-10-14 19:25:48 +01:00
Claudiu Belu
2bc77b8a28 Adds Windows resource limits support
This will allow running Windows Containers to have their resource
limits updated through containerd. The CPU resource limits support
has been added for Windows Server 20H2 and newer, on older versions
hcsshim will raise an Unimplemented error.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-25 13:20:55 -07:00
Derek McGowan
cb6fb93af5
Merge pull request #6011 from crosbymichael/schedcore
add runc shim support for sched core
2021-10-08 10:42:16 -07:00
Derek McGowan
26ee1b1ee5
Merge pull request #4695 from crosbymichael/cri-class
[cri] Add CNI conf based on runtime class
2021-10-08 09:27:49 -07:00
Michael Crosby
e48bbe8394 add runc shim support for sched core
In linux 5.14 and hopefully some backports, core scheduling allows processes to
be co scheduled within the same domain on SMT enabled systems.

The containerd impl sets the core sched domain when launching a shim. This
allows a clean way for each shim(container/pod) to be in its own domain and any
additional containers, (v2 pods) be be launched with the same domain as well as
any exec'd process added to the container.

kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-10-08 16:18:09 +00:00
Michael Crosby
7b8a697f28
Merge pull request #6034 from claudiubelu/windows/fixes-image-volume
Fixes Windows containers with image volumes
2021-10-07 11:50:01 -04:00
Akihiro Suda
703b86533b
pkg/cap: remove an outdated comment
pkg/cap no longer depends on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-10-06 13:24:30 +09:00
Derek McGowan
63b7e5771e
Merge pull request #5973 from Juneezee/deprecate-ioutil
refactor: move from io/ioutil to io and os package
2021-10-01 10:52:06 -07:00
Claudiu Belu
791e175c79 Windows: Fixes Windows containers with image volumes
Currently, there are few issues that preventing containers
with image volumes to properly start on Windows.

- Unlike the Linux implementation, the Container volume mount paths
  were not created if they didn't exist. Those paths are now created.

- while copying the image volume contents to the container volume,
  the layers were not properly deactivated, which means that the
  container can't start since those layers are still open. The layers
  are now properly deactivated, allowing the container to start.

- even if the above issue didn't exist, the Windows implementation of
  mount/Mount.Mount deactivates the layers, which wouldn't allow us
  to copy files from them. The layers are now deactivated after we've
  copied the necessary files from them.

- the target argument of the Windows implementation of mount/Mount.Mount
  was unused, which means that folder was always empty. We're now
  symlinking the Layer Mount Path into the target folder.

- hcsshim needs its Container Mount Paths to be properly formated, to be
  prefixed by C:. This was an issue for Volumes defined with Linux-like
  paths (e.g.: /test_dir). filepath.Abs solves this issue.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-10-01 09:02:18 +00:00
haoyun
5c2426a7b2 cleanup: import from k8s.io/utils/clock/testing instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:34:56 +08:00
haoyun
6484fab1e0 cleanup: import from k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:27:20 +08:00
zounengren
fcffe0c83a switch usage directly to errdefs.(ErrAlreadyExists and ErrNotFound)
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-09-24 18:26:58 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Michael Crosby
55893b9be7 Add CNI conf based on runtime class
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-17 19:05:06 +00:00
Phil Estes
f40df3d72b
Enable image config labels in ctr and CRI container creation
Signed-off-by: Phil Estes <estesp@amazon.com>
2021-09-15 15:31:19 -04:00
Phil Estes
d081457ba4
Merge pull request #5974 from claudiubelu/hanging-task-delete-fix
task delete: Closes task IO before waiting
2021-09-15 11:30:23 -04:00
Fu Wei
e1ad779107
Merge pull request #5817 from dmcgowan/shim-plugins
Add support for shim plugins
2021-09-12 18:18:20 +08:00
Fu Wei
d9f921e4f0
Merge pull request #5906 from thaJeztah/replace_os_exec 2021-09-11 10:38:53 +08:00
Phil Estes
6589876d20
Merge pull request #5964 from crosbymichael/cni-pref
add ip_pref CNI options for primary pod ip
2021-09-10 12:06:23 -04:00
Fu Wei
689a863efe
Merge pull request #5939 from scuzhanglei/privileged-device 2021-09-10 22:15:46 +08:00
Michael Crosby
1ddc54c00d
Merge pull request #5954 from claudiubelu/fix-sandbox-remove
sandbox: Allows the sandbox to be deleted in NotReady state
2021-09-10 10:12:34 -04:00
Michael Crosby
1efed43090
add ip_pref CNI options for primary pod ip
This fixes the TODO of this function and also expands on how the primary pod ip
is selected. This change allows the operator to prefer ipv4, ipv6, or retain the
ordering provided by the return results of the CNI plugins.

This makes it much more flexible for ops to configure containerd and how IPs are
set on the pod.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-10 10:04:21 -04:00
scuzhanglei
756f4a3147 cri: add devices for privileged container
Signed-off-by: scuzhanglei <greatzhanglei@gmail.com>
2021-09-10 10:16:26 +08:00
Fu Wei
d58542a9d1
Merge pull request #5627 from payall4u/payall4u/cri-support-cgroup-v2 2021-09-09 23:10:33 +08:00
Claudiu Belu
55faa5e93d task delete: Closes task IO before waiting
After containerd restarts, it will try to recover its sandboxes,
containers, and images. If it detects a task in the Created or
Stopped state, it will be removed. This will cause the containerd
process it hang on Windows on the t.io.Wait() call.

Calling t.io.Close() beforehand will solve this issue.

Additionally, the same issue occurs when trying to stopp a sandbox
after containerd restarts. This will solve that case as well.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-07 02:17:01 -07:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI-plugin subscribes the image event on k8s.io namespace. By
default, the image event is created by CRI-API. However, the image can
be downloaded by containerd API on k8s.io with the customized labels.
The CRI-plugin should use patch update for `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Claudiu Belu
24cec9be56 sandbox: Allows the sandbox to be deleted in NotReady state
The Pod Sandbox can enter in a NotReady state if the task associated
with it no longer exists (it died, or it was killed). In this state,
the Pod network namespace could still be open, which means we can't
remove the sandbox, even if --force was used.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-02 03:40:56 -07:00
Mike Brown
e00f87f1dc
Merge pull request #5927 from adelina-t/ws_2022_image_update
Update Pause image in tests & config
2021-08-31 16:11:57 -05:00
Adelina Tuvenie
6d3d34b85d Update Pause image in tests & config
With the introduction of Windows Server 2022, some images have been updated
to support WS2022 in their manifest list. This commit updates the test images
accordingly.

Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>
2021-08-31 19:42:57 +03:00
Mikko Ylinen
e0f8c04dad cri: Devices ownership from SecurityContext
CRI container runtimes mount devices (set via kubernetes device plugins)
to containers by taking the host user/group IDs (uid/gid) to the
corresponding container device.

This triggers a problem when trying to run those containers with
non-zero (root uid/gid = 0) uid/gid set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.

It is possible to workaround the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.

This patch suggests to take RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting device_ownership_from_security_context runtime config value to
true (valid on Linux only).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-30 09:30:00 +03:00
Phil Estes
af1a0908d0
Merge pull request #5865 from dcantah/windows-pod-runasusername
Add RunAsUserName functionality for the Windows pod sandbox container
2021-08-25 22:25:14 -04:00
Sebastiaan van Stijn
2ac9968401
replace uses of os/exec with golang.org/x/sys/execabs
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.

This issue is not limited to the go command itself, and can also affect binaries
that use `os.Command`, `os.LookPath`, etc.

From the related blogpost (ttps://blog.golang.org/path-security):

> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing

This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-25 18:11:09 +02:00
Fu Wei
6fa9588531
Merge pull request #5903 from AkihiroSuda/gofmt117
Run `go fmt` with Go 1.17
2021-08-24 23:01:41 +08:00
Daniel Canter
25644b4614 Add RunAsUserName functionality for the Windows Pod Sandbox Container
There was recent changes to cri to bring in a Windows section containing a
security context object to the pod config. Before this there was no way to specify
a user for the pod sandbox container to run as. In addition, the security context
is a field for field mirror of the Windows container version of it, so add the
ability to specify a GMSA credential spec for the pod sandbox container as well.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-08-23 07:35:22 -07:00