Commit Graph

1646 Commits

Author SHA1 Message Date
Sebastiaan van Stijn
f9c80be1bb
remove unneeded nolint-comments (nolintlint), disable deprecated linters
Remove nolint-comments that weren't hit by linters, and remove the "structcheck"
and "varcheck" linters, as they have been deprecated:

    WARN [runner] The linter 'structcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter.  Replaced by unused.
    WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter.  Replaced by unused.
    WARN [linters context] structcheck is disabled because of generics. You can track the evolution of the generics support by following the https://github.com/golangci/golangci-lint/issues/2649.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-12 14:41:01 +02:00
Sebastiaan van Stijn
29c7fc9520
clean-up "nolint" comments, remove unused ones
- fix "nolint" comments to be in the correct format (`//nolint:<linters>[,<linter>`
  no leading space, required colon (`:`) and linters.
- remove "nolint" comments for errcheck, which is disabled in our config.
- remove "nolint" comments that were no longer needed (nolintlint).
- where known, add a comment describing why a "nolint" was applied.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-12 14:40:59 +02:00
Sebastiaan van Stijn
d215725136
pkg/cri/(server|sbserver): criService.getTLSConfig() add TODO to verify nolint
This `//nolint`  was added in f5c7ac9272
to suppress warnings about the `NameToCertificate` function being deprecated:

    // Deprecated: NameToCertificate only allows associating a single certificate
    // with a given name. Leave that field nil to let the library select the first
    // compatible chain from Certificates.

Looking at that, it was deprecated in Go 1.14 through
eb93c684d4
(https://go-review.googlesource.com/c/go/+/205059), which describes:

    crypto/tls: select only compatible chains from Certificates

    Now that we have a full implementation of the logic to check certificate
    compatibility, we can let applications just list multiple chains in
    Certificates (for example, an RSA and an ECDSA one) and choose the most
    appropriate automatically.

    NameToCertificate only maps each name to one chain, so simply deprecate
    it, and while at it simplify its implementation by not stripping
    trailing dots from the SNI (which is specified not to have any, see RFC
    6066, Section 3) and by not supporting multi-level wildcards, which are
    not a thing in the WebPKI (and in crypto/x509).

We should at least have a comment describing why we are ignoring this, but preferably
review whether we should still use it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-12 14:40:11 +02:00
Ed Bartosh
643dc16565 improve CDI logging
Added logging of found CDI devices.
Fixed test failures caused by the change.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-10-12 13:45:20 +03:00
Ed Bartosh
8ed910c46a CDI: configure registry on start
Currently CDI registry is reconfigured on every
WithCDI call, which is a relatively heavy operation.

This happens because cdi.GetRegistry(cdi.WithSpecDirs(cdiSpecDirs...))
unconditionally reconfigures the registry (clears fs notify watch,
sets up new watch, rescans directories).

Moving configuration to the criService.initPlatform should result
in performing registry configuration only once on the service start.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-10-12 13:45:20 +03:00
Ed Bartosh
eec7a76ecd move WithCDI to pkg/cri/opts
As WithCDI is CRI-only API it makes sense to move it
out of oci module.

This move can also fix possible issues with this API when
CRI plugin is disabled.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-10-12 13:45:20 +03:00
Phil Estes
9b0ab84449
Merge pull request #7481 from qiutongs/netns-fix
Update container with sandbox metadata after NetNS is created
2022-10-10 15:48:54 -07:00
Qiutong Song
b41d6f40bb Update container with sandbox metadata after NetNS is created
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2022-10-09 01:14:08 +00:00
Akihiro Suda
70fbedc217
archive: add WithSourceDateEpoch() for whiteouts
This makes diff archives to be reproducible.

The value is expected to be passed from CLI applications via the $SOUCE_DATE_EPOCH env var.

See https://reproducible-builds.org/docs/source-date-epoch/
for the $SOURCE_DATE_EPOCH specification.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-10-08 08:45:03 +09:00
wanglei01
a59ecc50e3 CRI: implement Controller.Delete for SandboxAPI
Signed-off-by: WangLei <wllenyj@linux.alibaba.com>
2022-09-30 16:55:54 +08:00
Derek McGowan
1cc38f8df7
Merge pull request #5904 from qiutongs/ip-leakage-fix 2022-09-29 18:14:35 -07:00
Qiutong Song
4f4aad057d Persist container and sandbox if resource cleanup fails, like teardownPodNetwork
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2022-09-27 14:38:41 +00:00
Maksym Pavlenko
39f7cd73e7
Merge pull request #7405 from kzys/cri-fuzz
Refactor CRI fuzzers
2022-09-22 16:55:27 -07:00
Maksym Pavlenko
23b545232c
Merge pull request #7417 from ruiwen-zhao/grpc_code
Set grpc code for unimplemented cri-api methods
2022-09-22 12:12:34 -07:00
Phil Estes
8f95bac049
Merge pull request #7401 from wllenyj/sandbox_stop
Sandbox API: implement Controller.Wait and Controller.Stop
2022-09-22 14:33:52 -04:00
ruiwen-zhao
c6f571fc7d Set grpc code for unimplemented cri-api methods
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-09-22 07:24:48 +00:00
wanglei01
82890dd290 CRI: implement Controller.Stop for SandboxAPI
Signed-off-by: WangLei <wllenyj@linux.alibaba.com>
2022-09-22 14:38:52 +08:00
wanglei01
927906992f CRI: implement Controller.Wait for SandboxAPI
Rework sandbox monitoring, we should rely on Controller.Wait instead of
CRIService.StartSandboxExitMonitor

Signed-off-by: WangLei <wllenyj@linux.alibaba.com>
2022-09-22 14:38:45 +08:00
Ed Bartosh
e22a7a3833 reference CDI configuration details
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-09-21 11:25:28 +03:00
Samuel Karp
c8010b9cbe
sbserver: return resources in ContainerStatus
Port of b7b1200dd3 to sbserver

Signed-off-by: Samuel Karp <samuelkarp@google.com>
2022-09-20 18:38:09 -07:00
Kazuyoshi Kato
a37c64b20c Refactor CRI fuzzers
pkg/cri/sbserver/cri_fuzzer.go and pkg/cri/server/cri_fuzzer.go were
mostly the same.

This commit merges them together and move the unified fuzzer to
contrib/fuzz again to sort out dependencies. pkg/cri/ shouldn't consume
cmd/.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-09-19 22:14:11 +00:00
Phil Estes
a1e4a94694
Merge pull request #7393 from Iceber/skip_verify
remotes/docker/config: Skipping TLS verification for localhost
2022-09-19 10:53:56 -04:00
Iceber Gu
3cfde732e1 remotes/docker/config: Skipping TLS verification for localhost
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2022-09-13 17:40:23 +08:00
Kevin Parsons
de509c0682
Merge pull request #6901 from dcantah/add-wcowhyp-runtime
windows: Add runhcs-wcow-hypervisor runtimeclass to the default config
2022-09-08 10:53:12 -07:00
lengrongfu
3c0e6c40ad feat: upgrade registry.k8s.io/pause version
Signed-off-by: rongfu.leng <1275177125@qq.com>
2022-09-06 15:59:20 +08:00
Abirdcfly
dcfaa30ba2 chore: remove duplicate word in comments
Signed-off-by: Abirdcfly Fu <fp544037857@gmail.com>
2022-08-29 13:05:32 +08:00
Samuel Karp
36d0cfd0fd
Merge pull request #6517 from ruiwen-zhao/return-resource 2022-08-24 14:01:30 -07:00
ruiwen-zhao
b7b1200dd3 ContainerStatus to return container resources
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-08-24 19:08:06 +00:00
Paco Xu
9525b3148a migrate from k8s.gcr.io to registry.k8s.io
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-08-24 13:46:46 +08:00
Daniel Canter
f0036cb9dc windows: Add runhcs-wcow-hypervisor runtimeclass to the default config
As part of the effort of getting hypervisor isolated windows container
support working for the CRI entrypoint here, add the runhcs-wcow-hypervisor
handler for the default config. This sets the correct SandboxIsolation
value that the Windows shim uses to differentiate process vs. hypervisor
isolation. This change additionally sets the wcow-process runtime to
passthrough io.microsoft.container* annotations and the hypervisor runtime
to accept io.microsoft.virtualmachine* annotations.

Note that for K8s users this runtime handler will need to be configured by
creating the corresponding RuntimeClass resources on the cluster as it's
not the default runtime.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-08-19 07:56:43 -07:00
Wei Fu
460b0533b2 pkg/cri/streaming: increase ReadHeaderTimeout
It is follow-up of #7254. This commit will increase ReadHeaderTimeout
from 3s to 30m, which prevent from unexpected timeout when the node is
running with high-load. 30 Minutes is longer enough to get close to
before what #7254 changes.

And ideally, we should allow user to configure the streaming server if
the users want this feature.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-08-18 07:42:12 +08:00
ruiwen-zhao
6e4b6830f1 Update CRI-API
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2022-08-10 03:55:51 +00:00
Maksym Pavlenko
ca3b9b50fe Run gofmt 1.19
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-08-04 18:18:33 -07:00
Maksym Pavlenko
5cf77fc43d Add TODOs for the remaining work
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-08-04 10:29:15 -07:00
Maksym Pavlenko
aa3303b697 Update sandbox protobuf to match CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-29 16:08:07 -07:00
Maksym Pavlenko
8823224174 Update controller's start response to incldue pid and labels
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-29 16:08:07 -07:00
Maksym Pavlenko
3d028308ef Cleanup CRI files
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-29 16:08:07 -07:00
Maksym Pavlenko
c085fac1e5 Move sandbox start behind controller
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-29 16:08:07 -07:00
Brian Goff
f5fb2c32d2 Regenerate protos with updated protoc-gen-go
This fixes CI issues

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2022-07-28 16:59:30 +00:00
Fu Wei
116af9d1cc
Merge pull request #7207 from zouyee/relabel
replace with selinux label
2022-07-28 00:05:31 +08:00
Derek McGowan
6acde90772
Merge pull request #7069 from fuweid/failpoint-in-runc-shimv2
test: introduce failpoint control to runc-shimv2 and cni
2022-07-26 23:12:20 -07:00
zounengren
d121efc6d8 replace with selinux label
Signed-off-by: zounengren <zouyee1989@gmail.com>
2022-07-24 20:11:16 +08:00
zounengren
20e7b399f9 prevent Server reuse after a Shutdown
Signed-off-by: zounengren <zouyee1989@gmail.com>
2022-07-24 15:55:16 +08:00
Jeff Widman
050cd58ce6 Drop deprecated ioutil
`ioutil` has been deprecated by golang. All the code in `ioutil` just
forwards functionality to code in either the `io` or `os` packages.

See https://github.com/golang/go/pull/51961 for more info.

Signed-off-by: Jeff Widman <jeff@jeffwidman.com>
2022-07-23 08:36:20 -07:00
Maksym Pavlenko
500ff95f02 Make getServicesOpts a helper
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-22 19:38:45 -07:00
Wei Fu
cbebeb9440 pkg/failpoint: add FreeBSD link and update pkg doc
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-07-22 23:25:40 +08:00
Wei Fu
1ae6e8b076 pkg/failpoint: add DelegatedEval API
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-07-22 23:25:40 +08:00
Wei Fu
ffd59ba600 pkg/failpoint: init failpoint package
Failpoint is used to control the fail during API call when testing, especially
the API is complicated like CRI-RunPodSandbox. It can help us to test
the unexpected behavior without mock. The control design is based on freebsd
fail(9), but simpler.

REF: https://www.freebsd.org/cgi/man.cgi?query=fail&sektion=9&apropos=0&manpath=FreeBSD%2B10.0-RELEASE

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-07-22 23:25:40 +08:00
Danielle Lancashire
3125f7e1a0 cri_stats: handle missing cpu stats
Signed-off-by: Danielle Lancashire <dani@builds.terrible.systems>
2022-07-22 12:10:24 +00:00
Fu Wei
badb66113c
Merge pull request #7189 from zouyee/ctx 2022-07-22 11:09:02 +08:00
Derek McGowan
24aad6dd46
Merge pull request #7182 from HeavenTonight/main
code cleanup
2022-07-20 13:09:10 -07:00
zounengren
7dc66eee64 using ContextDialer instead
Signed-off-by: zounengren <zouyee1989@gmail.com>
2022-07-20 22:53:42 +08:00
James Sturtevant
0d6881898e Refactor usageNanoCores be to used for all OSes
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2022-07-19 16:49:08 -07:00
guiyong.ou
628f6ac681 code cleanup
Signed-off-by: guiyong.ou <guiyong.ou@daocloud.io>
2022-07-19 22:46:32 +08:00
Maksym Pavlenko
e69a83f356
Merge pull request #7168 from mxpv/linter
Update and align golangci-lint version
2022-07-18 12:23:06 -07:00
Mike Brown
88bcbb0361 adds a comment explaining how to disable experimental sbserver
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2022-07-15 17:00:56 -05:00
Maksym Pavlenko
3a3f43f72f Fix linter warnings
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-15 13:29:04 -07:00
Maksym Pavlenko
98a1b7ff1b Add log messages when choosing CRI server
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-14 09:12:35 -07:00
Maksym Pavlenko
2ba6353316 Change metrics namespace for sandboxed CRI to prevent panic
panic: duplicate metrics collector registration attempted

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-13 12:47:13 -07:00
Maksym Pavlenko
b8e93774c1 Enable integration tests against sandboxed CRI
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-13 12:02:06 -07:00
Maksym Pavlenko
cf5df7e4ac Fork CRI server package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-07-13 10:54:59 -07:00
Daniel Canter
bcdc8468f8 Fix out of date comments for CRI store packages
All of the CRI store related packages all use the standard errdefs
errors now for if a key doesn't or already exists (ErrAlreadyExists,
ErrNotFound), but the comments for the methods still referenced
some unused package specific error definitions. This change just
updates the comments to reflect what errors are actually returned
and adds comments for some previously undocumented exported functions.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-07-11 13:57:39 -07:00
Derek McGowan
aee50aeac2
Merge pull request #7108 from fuweid/refactor-cri-api
pkg/cri: use marshal wrapper for version convertor
2022-06-29 13:58:15 -07:00
Wei Fu
c2703c08c9 pkg/cri: use marshal wrapper for version convertor
Use wrapper for ReopenContainerLog v1alpha proto.

Ref: #5619

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-06-29 22:20:47 +08:00
Kazuyoshi Kato
66cc0fc879 Copy FuzzCRI from cncf/cncf-fuzzing
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-06-27 22:54:25 +00:00
Kazuyoshi Kato
57200edf25 Use testing.F on FuzzParseProcPIDStatus
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-06-15 14:56:20 +00:00
wllenyj
42a386c816 CRI: change the /dev/shm mount options in Sandbox.
All containers except the pause container, mount `/dev/shm" with flags
`nosuid,nodev,noexec`. So change mount options for pause container to
keep consistence.
This also helps to solve issues of failing to mount `/dev/shm` when
pod/container level user namespace is enabled.

Fixes: #6911

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Lei Wang <wllenyj@linux.alibaba.com>
2022-06-14 10:45:06 +08:00
wllenyj
a62a95789c CRI: remove default /dev/shm mount in Sandbox.
This's an optimization to get rid of redundant `/dev/shm" mounts for pause container.
In `oci.defaultMounts`, there is a default `/dev/shm` mount which is redundant for
pause container.

Fixes: #6911

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Signed-off-by: Lei Wang  <wllenyj@linux.alibaba.com>
2022-06-14 10:45:06 +08:00
Maksym Pavlenko
e71ffddb6b
Merge pull request #7042 from samuelkarp/freebsd-unit-tests
Port (some) unit tests to FreeBSD
2022-06-10 15:05:52 -07:00
Kazuyoshi Kato
4ec6a379c0
Merge pull request #6918 from dcantah/windows-snapshotter-cleanup
Windows snapshotter touch ups and new functionality
2022-06-10 11:08:18 -07:00
Samuel Karp
42e019e634
cri/server: Disable tests on FreeBSD
The TestPodAnnotationPassthroughContainerSpec test and the
TestContainerAnnotationPassthroughContainerSpec test both depend on a
platform-specific implementation of criService.containerSpec, which is
unimplemented on FreeBSD.

The TestSandboxContainerSpec depends on a platform-specific
implementation oc criService.sandboxContainerSpec, which is
unimplemented on FreeBSD.

Signed-off-by: Samuel Karp <me@samuelkarp.com>
2022-06-09 18:54:10 -07:00
Shane Jennings
6190b0f04b
Correct spelling mistake ("sanbdox" to "sandbox")
Signed-off-by: Shane Jennings <superzinbo@gmail.com>
2022-06-07 10:55:15 +01:00
Daniel Canter
44e12dc5d8 Windows snapshotter touch ups and new functionality
This change does a couple things to remove some cruft/unused functionality
in the Windows snapshotter, as well as add a way to specify the rootfs
size in bytes for a Windows container via a new field added in the CRI api in
k8s 1.24. Setting the rootfs/scratch volume size was assumed to be working
prior to this but turns out not to be the case.

Previously I'd added a change to pass any annotations in the containerd
snapshot form (containerd.io/snapshot/*) as labels for the containers
rootfs snapshot. This was added as a means for a client to be able to provide
containerd.io/snapshot/io.microsoft.container.storage.rootfs.size-gb as an
annotation and have that be translated to a label and ultimately set the
size for the scratch volume in Windows. However, this actually only worked if
interfacing with the CRI api directly (crictl) as Kubernetes itself will
fail to validate annotations that if split by "/" end up with > 2 parts,
which the snapshot labels will (containerd.io / snapshot / foobarbaz).

With this in mind, passing the annotations and filtering to
containerd.io/snapshot/* is moot, so I've removed this code in favor of
a new `snapshotterOpts()` function that will return platform specific
snapshotter options if ones exist. Now on Windows we can just check if
RootfsSizeInBytes is set on the WindowsContainerResources struct and
then return a snapshotter option that sets the right label.

So all in all this change:
- Gets rid of code to pass CRI annotations as labels down to snapshotters.

- Gets rid of the functionality to create a 1GB sized scratch disk if
the client provided a size < 20GB. This code is not used currently and
has a few logical shortcomings as it won't be able to create the disk
if a container is already running and using the same base layer. WCIFS
(driver that handles the unioning of windows container layers together)
holds open handles to some files that we need to delete to create the
1GB scratch disk is the underlying problem.

- Deprecates the containerd.io/snapshot/io.microsoft.container.storage.rootfs.size-gb
label in favor of a new containerd.io/snapshot/windows/rootfs.sizebytes label.
The previous label/annotation wasn't being used by us, and from a cursory
github search wasn't being used by anyone else either. Now that there is a CRI
field to specify the size, this should just be a field that users can set
on their pod specs and don't need to concern themselves with what it eventually
gets translated to, but non-CRI clients can still use the new label/deprecated
label as usual.

- Add test to cri integration suite to validate expanding the rootfs size.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-06-06 14:57:07 -07:00
Derek McGowan
c1bcabb454
Merge pull request from GHSA-5ffw-gxpp-mxpf
Limit the response size of ExecSync
2022-06-06 10:19:23 -07:00
Kazuyoshi Kato
40aa4f3f1b
Implicitly discard the input to drain the reader
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-06-06 09:57:13 -07:00
Phil Estes
2b661b890f
Merge pull request #6899 from shuaichang/ISSUE6657-support-runtime-snapshotter
Support runtime level snapshotter for issue 6657
2022-06-03 10:04:53 +02:00
shuaichang
7b9f1d4058 Added support for runtime level snapshotter, issue 6657
Signed-off-by: shuaichang <shuai.chang@databricks.com>

Updated annotation name
2022-06-02 16:29:59 -07:00
Fu Wei
aa0aaa4947
Merge pull request #7009 from mikebrow/update-gocni 2022-06-02 11:09:46 +08:00
Mike Brown
e3b4d750db update go-cni/for cni update fixing plugins that don't respond with version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2022-06-01 17:20:18 -05:00
Kazuyoshi Kato
c149e6c2ea
Merge pull request #6996 from dcantah/hpc-validations
Add validations for Windows HostProcess CRI configs
2022-06-01 11:37:12 -07:00
Kazuyoshi Kato
fcd0c86c70
Merge pull request #7007 from dmcgowan/move-docker-sort
Move docker reference logic to reference/docker package
2022-06-01 11:33:52 -07:00
Phil Estes
5bc2d2e429
Merge pull request #7003 from pacoxu/pause-3.7
promote pause image to 3.7 (sync with kube v1.24)
2022-06-01 05:59:14 -04:00
Derek McGowan
8ed54849a6
Move docker reference logic to reference/docker package
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-05-31 22:40:49 -07:00
Mike Brown
8c27ce4193
Merge pull request #6993 from mxpv/images
CRI: cleanup cri/store package
2022-05-31 20:38:43 -05:00
Kazuyoshi Kato
49ca87d727 Limit the response size of ExecSync
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-05-31 22:21:35 +00:00
Paco Xu
1cf6f20320 promote pause image to 3.7
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-05-30 15:08:28 +08:00
Daniel Canter
b5e1b8f619 Use t.Run for /pkg/cri tests
A majority of the tests in /pkg/cri are testing/validating multiple
things per test (generally spec or options validations). This flow
lends itself well to using *testing.T's Run method to run each thing
as a subtest so `go test` output can actually display which subtest
failed/passed.

Some of the tests in the packages in pkg/cri already did this, but
a bunch simply logged what sub-testcase was currently running without
invoking t.Run.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-29 18:32:09 -07:00
Maksym Pavlenko
b572a82ad8 CRI: Remove deprecated error types and update error msg
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-28 13:53:28 -07:00
Daniel Canter
978ff393d2 Add validations for Windows HostProcess CRI configs
HostProcess containers require every container in the pod to be a
host process container and have the corresponding field set. The Kubelet
usually enforces this so we'd error before even getting here but we recently
found a bug in this logic so better to be safe than sorry.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-05-27 21:17:07 -07:00
Maksym Pavlenko
688b30cf52 CRI: Move truncindex to pkg
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 13:02:45 -07:00
Maksym Pavlenko
e44335800e CRI: Move reference sorting to reference package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 12:52:36 -07:00
Maksym Pavlenko
b5366f8d7e CRI: Retrieve image spec on client
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-05-26 12:38:55 -07:00
AllenZMC
eaec6530d7 fix some confusing typos
Signed-off-by: AllenZMC <zhongming.chang@daocloud.io>
2022-05-17 23:53:36 +08:00
Akihiro Suda
42584167b7
Officially deprecate Schema 1
Schema 1 has been substantially deprecated since circa. 2017 in favor of Schema 2 introduced in Docker 1.10 (Feb 2016)
and its successor OCI Image Spec v1, but we have not officially deprecated Schema 1.

One of the reasons was that Quay did not support Schema 2 so far, but it is reported that Quay has been
supporting Schema 2 since Feb 2020 (moby/buildkit issue 409).

This PR deprecates pulling Schema 1 images but the feature will not be removed before containerd 2.0.
Pushing Schema 1 images was never implemented in containerd (and its consumers such as BuildKit).

Docker/Moby already disabled pushing Schema 1 images in Docker 20.10 (moby/moby PR 41295),
but Docker/Moby has not yet disabled pulling Schema 1 as containerd has not yet deprecated Schema 1.
(See the comments in moby/moby PR 42300.)
Docker/Moby is expected to disable pulling Schema 1 images in future after this deprecation.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-05-02 19:08:38 +09:00
Mike Brown
6b35307594
Merge pull request #5490 from askervin/5Bu_blockio
Support for cgroups blockio
2022-04-29 10:07:56 -05:00
Antti Kervinen
10576c298e cri: support blockio class in pod and container annotations
This patch adds support for a container annotation and two separate
pod annotations for controlling the blockio class of containers.

The container annotation can be used by a CRI client:
  "io.kubernetes.cri.blockio-class"

Pod annotations specify the blockio class in the K8s pod spec level:
  "blockio.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "blockio.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Correspondingly, this patch adds support for --blockio-class and
--blockio-config-file to ctr, too.

This implementation follows the resource class annotation pattern
introduced in RDT and merged in commit 893701220.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2022-04-29 11:44:09 +03:00
Kazuyoshi Kato
29b9379560 make protos
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-27 21:31:16 +00:00
Kazuyoshi Kato
fcba486366 Remove gogo from .proto files
While gogo isn't actually used, it is still referenced from .proto files
and its corresponding Go package is imported from the auto-generated
files.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-27 20:27:55 +00:00
Kazuyoshi Kato
7bd42d226a
Merge pull request #6856 from kangclzjc/container-remove-dup-20220426
remove duplicate
2022-04-27 09:32:08 -07:00
Derek McGowan
6e0231f992
Merge pull request #6150 from fuweid/support-4984
feature: support image pull progress timeout
2022-04-26 12:15:09 -07:00
Wei Fu
00d102da9f feature: support image pull progress timeout
Kubelet sends the PullImage request without timeout, because the image size
is unknown and timeout is hard to defined. The pulling request might run
into 0B/s speed, if containerd can't receive any packet in that connection.
For this case, the containerd should cancel the PullImage request.

Although containerd provides ingester manager to track the progress of pulling
request, for example `ctr image pull` shows the console progress bar, it needs
more CPU resources to open/read the ingested files to get status.

In order to support progress timeout feature with lower overhead, this
patch uses http.RoundTripper wrapper to track active progress. That
wrapper will increase active-request number and return the
countingReadCloser wrapper for http.Response.Body. Each bytes-read
can be count and the active-request number will be descreased when the
countingReadCloser wrapper has been closed. For the progress tracker,
it can check the active-request number and bytes-read at intervals. If
there is no any progress, the progress tracker should cancel the
request.

NOTE: For each blob data, the containerd will make sure that the content
writer is opened before sending http request to the registry. Therefore, the
progress reporter can rely on the active-request number.

fixed: #4984

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-27 00:02:27 +08:00
Fu Wei
0d696d2f4b
Merge pull request #6749 from dmcgowan/unpacker-interface 2022-04-26 20:54:51 +08:00
Kang.Zhang
fceab7f4c4 remove duplicate
Signed-off-by: Kang.Zhang <Kang.zhang@intel.com>
2022-04-26 10:44:45 +08:00
Derek McGowan
0e6c7bf931
Fix undefined error in use of errors package
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-25 15:21:21 -07:00
Derek McGowan
3dbd6a2498
Merge pull request #6841 from kzys/proto-upgrade-6
Migrate off from github.com/gogo/protobuf
2022-04-25 15:12:51 -07:00
Kazuyoshi Kato
f140400c0e
Merge pull request #5686 from dtnyn/issue-5679
Add flag to allow oci.WithAllDevicesAllowed on PrivilegedWithoutHostDevices
2022-04-25 11:44:01 -07:00
Kazuyoshi Kato
7a4f81d8ba Fix tests
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:41:05 +00:00
Kazuyoshi Kato
9dbe000a38 make protos
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:31:53 +00:00
Kazuyoshi Kato
e3db7de8f5 Remove gogo/protobuf and adjust types
This commit migrates containerd/protobuf from github.com/gogo/protobuf
to google.golang.org/protobuf and adjust types. Proto-generated structs
cannot be passed as values.

Fixes #6564.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-22 15:31:53 +00:00
Henry Wang
8710d4d014 cri: close fifos when container is deleted
Signed-off-by: Henry Wang <henwang@amazon.com>
2022-04-21 21:46:50 +00:00
Kazuyoshi Kato
237ef0de9b Remove all gogoproto extensions
This commit removes the following gogoproto extensions;

- gogoproto.nullable
- gogoproto.customename
- gogoproto.unmarshaller_all
- gogoproto.stringer_all
- gogoproto.sizer_all
- gogoproto.marshaler_all
- gogoproto.goproto_unregonized_all
- gogoproto.goproto_stringer_all
- gogoproto.goproto_getters_all

None of them are supported by Google's toolchain (see #6564).

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-20 07:23:28 +00:00
Derek McGowan
39692e7672
unpack: return error when no platforms defined
Require platforms to be non-empty to avoid no-op unpack

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-19 18:21:57 -07:00
Derek McGowan
8017daa12d
Add unpack interface to be used by client
Move client unpacker to pkg/unpack

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-19 18:21:57 -07:00
Kazuyoshi Kato
88c0c7201e Consolidate gogo/protobuf dependencies under our own protobuf package
This would make gogo/protobuf migration easier.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 15:53:36 +00:00
Kazuyoshi Kato
80b825ca2c Remove gogoproto.stdtime
This commit removes gogoproto.stdtime, since it is not supported by
Google's official toolchain
(see https://github.com/containerd/containerd/issues/6564).

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 13:39:30 +00:00
Derek McGowan
fe8da6dcaf
Move lease manager plugin to separate package
Create lease plugin type to separate lease manager from services plugin.
This allows other service plugins to depend on the lease manager.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-04-15 11:08:47 -07:00
Derek McGowan
98260e1b18
Merge pull request #6806 from mikebrow/netns-hardening
check for duplicate nspath possibilities
2022-04-14 15:02:44 -07:00
Mike Brown
147f0a7e02 check for duplicate nspath possibilities
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2022-04-14 18:33:19 +00:00
Andrey Klimentyev
5f3ce9512b Do not append []string{""} to command to preserve Docker compatibility
Signed-off-by: Andrey Klimentyev <andrey.klimentyev@flant.com>
2022-04-13 13:29:49 +03:00
Eric Lin
a5dfbfcf5a cri: load sandboxes/containers/images in parallel
Parallelizing them decreases loading duration.

Time to complete recover():
* Without competing IOs + without opt: 21s
* Without competing IOs + with opt: 14s
* Competing IOs + without opt: 3m44s
* Competing IOs + with opt: 33s

Signed-off-by: Eric Lin <linxiulei@gmail.com>
2022-04-09 13:01:14 +00:00
Ed Bartosh
ff5c55847a move CDI calls to the linux-only code
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:59 +03:00
Ed Bartosh
c9b4ccf83e add configuration for CDI
Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:10:54 +03:00
Ed Bartosh
aed0538dac cri: implement CDI device injection
Extract the names of requested CDI devices and update the OCI
Spec according to the corresponding CDI device specifications.

CDI devices are requested using container annotations in the
cdi.k8s.io namespace. Once CRI gains dedicated fields for CDI
injection the snippet for extracting CDI names will need an
update.

Signed-off-by: Ed Bartosh <eduard.bartosh@intel.com>
2022-04-06 13:07:54 +03:00
Wei Fu
8113758568 CRI: improve image pulling performance
Background:

With current design, the content backend uses key-lock for long-lived
write transaction. If the content reference has been marked for write
transaction, the other requestes on the same reference will fail fast with
unavailable error. Since the metadata plugin is based on boltbd which
only supports single-writer, the content backend can't block or handle
the request too long. It requires the client to handle retry by itself,
like OpenWriter - backoff retry helper. But the maximum retry interval
can be up to 2 seconds. If there are several concurrent requestes fo the
same image, the waiters maybe wakeup at the same time and there is only
one waiter can continue. A lot of waiters will get into sleep and we will
take long time to finish all the pulling jobs and be worse if the image
has many more layers, which mentioned in issue #4937.

After fetching, containerd.Pull API allows several hanlers to commit
same ChainID snapshotter but only one can be done successfully. Since
unpack tar.gz is time-consuming job, it can impact the performance on
unpacking for same ChainID snapshotter in parallel.

For instance, the Request 2 doesn't need to prepare and commit, it
should just wait for Request 1 finish, which mentioned in pull
request #6318.

```text
	Request 1	Request 2

	Prepare
	   |
	   |
	   |
	   |		Prepare
	Commit		   |
			   |
			   |
			   |
			Commit(failed on exist)
```

Both content backoff retry and unnecessary unpack impacts the performance.

Solution:

Introduced the duplicate suppression in fetch and unpack context. The
deplicate suppression uses key-mutex and single-waiter-notify to support
singleflight. The caller can use the duplicate suppression in different
PullImage handlers so that we can avoid unnecessary unpack and spin-lock
in OpenWriter.

Test Result:

Before enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	1m6.172s
user	0m0.268s
sys	0m0.193s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.324s
user	0m0.441s
sys	0m0.316s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	1m47.657s
user	0m0.284s
sys	0m0.224s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.381s
user	0m0.488s
sys	0m0.358s
```

With this enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	0m1.140s
user	0m0.243s
sys	0m0.178s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.239s
user	0m0.463s
sys	0m0.275s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	0m5.546s
user	0m0.217s
sys	0m0.219s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.090s
user	0m0.501s
sys	0m0.331s
```

Test Script:

localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The image is hold in local registry
service by `docker run -d -p 5000:5000 --name registry registry:2`.

```bash

image_name="${1}"
pull_times="${2:-10}"

cleanup() {
  ctr image rmi "${image_name}"
  ctr -n k8s.io image rmi "${image_name}"
  crictl rmi "${image_name}"
  docker rmi "${image_name}"
  sleep 2
}

crictl_testing() {
  for idx in $(seq 1 ${pull_times}); do
    crictl pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

docker_testing() {
  for idx in $(seq 1 ${pull_times}); do
    docker pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

cleanup > /dev/null 2>&1

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```

Fixes: #4937
Close: #4985
Close: #6318

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-06 07:14:18 +08:00
Maksym Pavlenko
871b6b6a9f Use testify
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 18:17:58 -07:00
Fu Wei
d394e00c7e
Merge pull request #6738 from zhsj/fix-test-msg
Fix error message in TestNewBinaryIO
2022-03-25 23:40:06 +08:00
Phil Estes
3633cae64b
Merge pull request #6706 from kzys/typeurl-upgrade
Use typeurl.Any instead of github.com/gogo/protobuf/types.Any
2022-03-25 10:38:46 -04:00
Shengjing Zhu
2689432bfa Fix error message in TestNewBinaryIO
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-03-25 11:56:21 +08:00
Kazuyoshi Kato
96b16b447d Use typeurl.Any instead of github.com/gogo/protobuf/types.Any
This commit upgrades github.com/containerd/typeurl to use typeurl.Any.
The interface hides gogo/protobuf/types.Any from containerd's Go client.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-03-24 20:50:07 +00:00
Fu Wei
e7cba85e1c
Merge pull request #6323 from jepio/jepio/fix-cgroupv2-oom-event 2022-03-24 15:51:28 +08:00
Derek McGowan
551516a18d
Merge pull request from GHSA-c9cp-9c75-9v8c
Fix the Inheritable capability defaults.
2022-03-23 10:50:56 -07:00
Phil Estes
36dcc76fa9
Merge pull request #6496 from thaJeztah/deprecate_criu_opt
runtime: deprecate runc --criu / -criu-path option
2022-03-23 11:17:58 -04:00
Sebastiaan van Stijn
d2013d2c99
runtime: deprecate runc --criu / -criu-path option
runc option --criu is now ignored (with a warning), and the option will be
removed entirely in a future release. Users who need a non- standard criu
binary should rely on the standard way of looking up binaries in $PATH.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-03-23 14:42:43 +01:00
Amit Barve
bfde58e3cd Bug fix for mount path handling
Currently when handling 'container_path' elements in container mounts we simply call
filepath.Clean on those paths. However, filepath.Clean adds an extra '.' if the path is a
simple drive letter ('E:' or 'Z:' etc.). These type of paths cause failures (with incorrect
parameter error) when creating containers via hcsshim. This commit checks for such paths
and doesn't call filepath.Clean on them.
It also adds a new check to error out if the destination path is a C drive and moves the
dst path checks out of the named pipe condition.

Signed-off-by: Amit Barve <ambarve@microsoft.com>
2022-03-21 09:40:19 -07:00
Phil Estes
ee49c4d557
Add nolint:staticcheck to platform-specific calls
The linter on platforms that have a hardcoded response complains about
"if xyz == nil" checks; ignore those.

Signed-off-by: Phil Estes <estesp@amazon.com>
2022-03-17 18:24:00 -04:00
Fu Wei
d9797673b0
Merge pull request #6593 from qiutongs/improve-container-mount
Make the temp mount as ready only in container WithVolumes
2022-03-18 00:03:28 +08:00
Eng Zer Jun
18ec2761c0
test: use T.TempDir to create temporary test directory
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-03-15 14:03:50 +08:00
Paul "TBBle" Hampson
39d52118f5 Plumb CRI Devices through to OCI WindowsDevices
There's two mappings of hostpath to IDType and ID in the wild:
- dockershim and dockerd-cri (implicitly via docker) use class/ID
-- The only supported IDType in Docker is 'class'.
-- https://github.com/aarnaud/k8s-directx-device-plugin generates this form
- https://github.com/jterry75/cri (windows_port branch) uses IDType://ID
-- hcsshim's CRI test suite generates this form

`://` is much more easily distinguishable, so I've gone with that one as
the generic separator, with `class/` as a special-case.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2022-03-12 08:16:43 +11:00
Derek McGowan
8acbb27647
Merge pull request from GHSA-crp2-qrr5-8pq7
Clean image volume path
2022-03-02 10:03:17 -08:00
Shengjing Zhu
352a8f49f7 cri: relax test for system without hugetlb
These unit tests don't check hugetlb. However by setting
TolerateMissingHugetlbController to false, these tests can't
be run on system without hugetlb (e.g. Debian buildd).

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-28 01:38:58 +08:00
Qiutong Song
ec90efbe99 Make the temp mount as ready only in container WithVolumes
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2022-02-25 17:53:30 -08:00
Shengjing Zhu
ea3d2e6433 go.mod: update to github.com/tchap/go-patricia/v2 v2.3.1
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-26 05:04:55 +08:00
Phil Estes
2b2372d43e
Merge pull request #6337 from thaJeztah/bump_go_restful
go.mod: update to github.com/emicklei/go-restful/v3 v3.7.3
2022-02-22 17:33:37 -05:00
Shengjing Zhu
f4f41296c2 Replace golang.org/x/net/context with std library
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-22 02:27:05 +08:00
Sebastiaan van Stijn
481fb923c5
go.mod: update to github.com/emicklei/go-restful/v3 v3.7.3
full diff: https://github.com/emicklei/go-restful/compare/v2.9.5...v3.7.3

- Switch to using go modules
- Add check for wildcard to fix CORS filter
- Add check on writer to prevent compression of response twice
- Add OPTIONS shortcut WebService receiver
- Add Route metadata to request attributes or allow adding attributes to routes
- Add wroteHeader set
- Enable content encoding on Handle and ServeHTTP
- Feat: support google custom verb
- Feature: override list of method allowed without content-type
- Fix Allow header not set on '405: Method Not Allowed' responses
- Fix Go 1.15: conversion from int to string yields a string of one rune
- Fix WriteError return value
- Fix: use request/response resulting from filter chain
- handle path params with prefixes and suffixes
- HTTP response body was broken, if struct to be converted to JSON has boolean value
- List available representations in 406 body
- Support describing response headers
- Unwrap function in filter chain + remove unused dispatchWithFilters

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-02-18 21:54:27 +01:00
ruiwen-zhao
fb0b8d6177 Use fs.RootPath when mounting volumes
Signed-off-by: Ruiwen Zhao <ruiwen@google.com>
2022-02-17 19:20:00 +00:00
Derek McGowan
c0f8188469
Update go-cni to v1.1.2
Fixes panic when exec is nil

Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-02-10 12:40:51 -08:00
Fu Wei
6a628b64ac
Merge pull request #6514 from marquiz/fixes/rdt 2022-02-08 09:31:49 +08:00
Markus Lehtonen
9b1fb82584 cri: fix handling of ignore_rdt_not_enabled_errors config option
We were not properly ignoring errors from
gorestrl.rdt.ContainerClassFromAnnotations() causing the config option
to be ineffective, in practice.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-02-04 13:54:03 +02:00
Jeremi Piotrowski
7275411ec8 cgroup2: monitor OOMKill instead of OOM to prevent missing container OOM events
With the cgroupv2 configuration employed by Kubernetes, the pod cgroup (slice)
and container cgroup (scope) will both have the same memory limit applied. In
that situation, the kernel will consider an OOM event to be triggered by the
parent cgroup (slice), and increment 'oom' there. The child cgroup (scope) only
sees an oom_kill increment. Since we monitor child cgroups for oom events,
check the OOMKill field so that we don't miss events.

This is not visible when running containers through docker or ctr, because they
set the limits differently (only container level). An alternative would be to
not configure limits at the pod level - that way the container limit will be
hit and the OOM will be correctly generated. An interesting consequence is that
when spawning a pod with multiple containers, the oom events also work
correctly, because:

a) if one of the containers has no limit, the pod has no limit so OOM events in
   another container report correctly.
b) if all of the containers have limits then the pod limit will be a sum of
   container events, so a container will be able to hit its limit first.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-02-03 13:39:16 +01:00
Jeremi Piotrowski
821c961c86 pkg/oom/v2: handle EventChan routine shutdown quietly
When the cgroup is removed, EventChan is closed (this was pulled in by
8d69c041c5). This results in a nil error
being received. Don't log an error in that case but instead return.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2022-02-03 13:20:46 +01:00
Andrew G. Morgan
6906b57c72
Fix the Inheritable capability defaults.
The Linux kernel never sets the Inheritable capability flag to
anything other than empty. Non-empty values are always exclusively
set by userspace code.

[The kernel stopped defaulting this set of capability values to the
 full set in 2000 after a privilege escalation with Capabilities
 affecting Sendmail and others.]

Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
2022-02-01 13:55:46 -08:00
Derek McGowan
4e9e14c2b6
Fix rdt build tags for go 1.16
Signed-off-by: Derek McGowan <derek@mcg.dev>
2022-01-19 11:09:29 -08:00
Takumasa Sakao
18592b2f5a Fix wrong log message
Signed-off-by: Takumasa Sakao <tsakao@zlab.co.jp>
2022-01-09 16:01:23 +09:00
haoyun
bbe46b8c43 feat: replace github.com/pkg/errors to errors
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: zounengren <zouyee1989@gmail.com>
2022-01-07 10:27:03 +08:00
Derek McGowan
644a01e13b
Merge pull request from GHSA-mvff-h3cj-wj9c
only relabel cri managed host mounts
2022-01-05 09:30:58 -08:00
Markus Lehtonen
9c2e3835fa cri: add ignore_rdt_not_enabled_errors config option
Enabling this option effectively causes RDT class of a container to be a
soft requirement. If RDT support has not been enabled the RDT class
setting will not have any effect.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Markus Lehtonen
f4a191917b cri: annotations for controlling RDT class
Use goresctrl for parsing container and pod annotations related to RDT.

In practice, from the users' point of view, this patchs adds support for
a container annotation and two separate pod annotations for controlling
the RDT class of containers.

Container annotation can be used by a CRI client:
  "io.kubernetes.cri.rdt-class"

Pod annotations for specifying the RDT class in the K8s pod spec level:
  "rdt.resources.beta.kubernetes.io/pod"
  (pod-wide default for all containers within)

  "rdt.resources.beta.kubernetes.io/container.<container_name>"
  (container-specific overrides)

Annotations are intended as an intermediate step before the CRI API
supports RDT.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2022-01-04 09:27:54 +02:00
Derek McGowan
2c9d80aba5
Merge pull request #6372 from fidencio/wip/seutil-fix-container_kvm_t-type-detection
seutil: Fix setting the "container_kvm_t" label
2021-12-15 10:35:04 -08:00
Phil Estes
949db57213
Merge pull request #6320 from endocrimes/dani/cri-swap
cri: add support for configuring swap
2021-12-14 15:02:28 -05:00
Phil Estes
330961c2d5
Merge pull request #6358 from jonyhy96/feat-error
refactor: functions for error log and error return
2021-12-14 10:16:54 -05:00
Fu Wei
d47fa40d1b
Merge pull request #6021 from dmcgowan/runc-shim-plugin 2021-12-14 10:19:23 +08:00
Derek McGowan
ac531108ab
Merge pull request #6155 from egernst/cri-update-for-sandbox-sizing
CRI update for sandbox sizing
2021-12-13 16:21:30 -08:00
Fabiano Fidêncio
f1c7993311 seutil: Fix setting the "container_kvm_t" label
The ability to handle KVM based runtimes with SELinux has been added as
part of d715d00906.

However, that commit introduced some logic to check whether the
"container_kvm_t" label would or not be present in the system, and while
the intentions were good, there's two major issues with the approach:
1. Inspecting "/etc/selinux/targeted/contexts/customizable_types" is not
   the way to go, as it doesn't list the "container_kvm_t" at all.
2. There's no need to check for the label, as if the label is invalid an
   "Invalid Label" error will be returned and that's it.

With those two in mind, let's simplify the logic behind setting the
"container_kvm_t" label, removing all the unnecessary code.

Here's an output of VMM process running, considering:
* The state before this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-88-g7fa44fc98 7fa44fc98f
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_runtime_t:s0 root 609717 4.0  0.5 2987512 83588 ?    Sl   08:32   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/be9d5cbabf440510d58d89fc8a8e77c27e96ddc99709ecaf5ab94c6b6b0d4c89/clh-api.sock
  ```

* The state after this patch:
  ```
  $ containerd --version
  containerd github.com/containerd/containerd v1.6.0-beta.3-89-ga5f2113c9 a5f2113c9fc15b19b2c364caaedb99c22de4eb32
  $ kubectl apply -f ~/simple-pod.yaml
  pod/nginx created
  $ ps -auxZ | grep cloud-hypervisor
  system_u:system_r:container_kvm_t:s0:c638,c999 root 614842 14.0  0.5 2987512 83228 ? Sl 08:40   0:00 /usr/bin/cloud-hypervisor --api-socket /run/vc/vm/f8ff838afdbe0a546f6995fe9b08e0956d0d0cdfe749705d7ce4618695baa68c/clh-api.sock
  ```

Note, the tests were performed using the following configuration snippet:
```
[plugins]
  [plugins.cri]
    enable_selinux = true
    [plugins.cri.containerd]
      [plugins.cri.containerd.runtimes]
        [plugins.cri.containerd.runtimes.kata]
           runtime_type = "io.containerd.kata.v2"
           privileged_without_host_devices = true
```

And using the following pod yaml:
```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  runtimeClassName: kata
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
```

Fixes: #6371

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-12-14 00:09:17 +01:00
Alexander Minbaev
c8a009d18c add-list-stat: return container list if filter is nil
Signed-off-by: Alexander Minbaev <alexander.minbaev@ibm.com>
2021-12-13 15:09:18 -06:00
Eric Ernst
20419feaac cri, sandbox: pass sandbox resource details if available, applicable
CRI API has been updated to include a an optional `resources` field in the
LinuxPodSandboxConfig field, as part of the RunPodSandbox request.

Having sandbox level resource details at sandbox creation time will have
large benefits for sandboxed runtimes. In the case of Kata Containers,
for example, this'll allow for better support of SW/HW architectures
which don't allow for CPU/memory hotplug, and it'll allow for better
queue sizing for virtio devices associated with the sandbox (in the VM
case).

If this sandbox resource information is provided as part of the run
sandbox request, let's introduce a pattern where we will update the
pause container's runtiem spec to include this information in the
annotations field.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-12-13 08:41:41 -08:00
haoyun
c0d07094be feat: Errorf usage
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-12-13 14:31:53 +08:00
Michael Crosby
9b0303913f
only relabel cri managed host mounts
Co-authored-by: Samuel Karp <skarp@amazon.com>
Signed-off-by: Michael Crosby <michael@thepasture.io>
Signed-off-by: Samuel Karp <skarp@amazon.com>
2021-12-09 09:53:47 -08:00
Sebastiaan van Stijn
2d3009038c
cri/server: use consistent alias for pkg/ioutil
Consistently use cioutil to prevent it being confused for Golang's ioutil.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-12-09 17:47:22 +01:00
Danielle Lancashire
2fa4e9c0e2 cri: add support for configuring swap
Signed-off-by: Danielle Lancashire <dani@builds.terrible.systems>
2021-12-02 21:25:33 +01:00
Fu Wei
69822aa936
Merge pull request #6258 from wllenyj/fix-registry-panic 2021-11-19 13:35:46 +08:00
wanglei01
5f293d9ac4 [CRI] Fix panic when registry.mirrors use localhost
When containerd use this config:

```
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
      endpoint = ["http://localhost:5000"]
```

Due to the `newTransport` function does not initialize the `TLSClientConfig` field.
Then use `TLSClientConfig` to cause nil pointer dereference

Signed-off-by: wanglei <wllenyj@linux.alibaba.com>
2021-11-19 10:56:46 +08:00
Michael Crosby
aa2733c202
Merge pull request #6170 from olljanat/default-sysctls
CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
2021-11-18 11:37:23 -05:00
Derek McGowan
9afc778b73
Merge pull request #6111 from crosbymichael/latency-metrics
[cri] add sandbox and container latency metrics
2021-11-16 16:59:33 -08:00
Phil Estes
7758cdc09a
Merge pull request #6253 from jonyhy96/feat-rwmutex
feat: use rwmutex instead
2021-11-16 11:39:29 -05:00
Derek McGowan
6835a94707
Split runc shim into plugin components
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
Derek McGowan
6eea8f3f62
Add shutdown package
Allows shutdown to handle callbacks with similar behavior as context
cancel

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
haoyun
bef792b962 feat: use rwmutex instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-16 11:06:40 +08:00
Derek McGowan
d055487b00
Merge pull request #6206 from mxpv/path
Allow absolute path to shim binaries
2021-11-15 18:05:48 -08:00
Olli Janatuinen
2a81c9f677 CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-11-15 18:30:09 +02:00
Michael Crosby
6765524b73 use write lock when updating container stats
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-11 15:17:48 +00:00
Maksym Pavlenko
6870f3b1b8 Support custom runtime path when launching tasks
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-11-09 13:31:46 -08:00
Michael Crosby
91bbaf6799 [cri] add sandbox and container latency metrics
These are simple metrics that allow users to view more fine grained metrics on
internal operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-09 21:07:38 +00:00
Michael Crosby
4b7cc560b2
Merge pull request #6222 from jonyhy96/add-more-description
cleanup: add more description on comment
2021-11-09 15:55:32 -05:00
haoyun
5748006337 cleanup: add more description on comment
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-09 19:13:37 +08:00
David Porter
2e6d5709e3 Implement CRI container and pods stats
See https://kep.k8s.io/2371

* Implement new CRI RPCs - `ListPodSandboxStats` and `PodSandboxStats`
  * `ListPodSandboxStats` and `PodSandboxStats` which return stats about
    pod sandbox. To obtain pod sandbox stats, underlying metrics are
    read from the pod sandbox cgroup parent.
  * Process info is obtained by calling into the underlying task
  * Network stats are taken by looking up network metrics based on the
    pod sandbox network namespace path
* Return more detailed stats for cpu and memory for existing container
  stats. These metrics use the underlying task's metrics to obtain
  stats.

Signed-off-by: David Porter <porterdavid@google.com>
2021-11-03 17:52:05 -07:00
Dat Nguyen
afe39bebfe add oci.WithAllDevicesAllowed flag for privileged_without_host_devices
This commit adds a flag that enable all devices whitelisting when
privileged_without_host_devices is already enabled.

Fixes #5679

Signed-off-by: Dat Nguyen <dnguyen7@atlassian.com>
2021-11-04 10:24:19 +11:00
Mike Brown
ea89788105 adds additional debug out to timebox cni setup
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-11-01 09:34:29 -05:00
zounengren
a217b5ac8f bump CNI to spec v1.0.0
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-10-22 10:58:40 +08:00
Sambhav Kothari
2a8dac12a7 Output a warning for label image labels instead of erroring
This change ignore errors during container runtime due to large
image labels and instead outputs warning. This is necessary as certain
image building tools like buildpacks may have large labels in the images
which need not be passed to the container.

Signed-off-by: Sambhav Kothari <sambhavs.email@gmail.com>
2021-10-14 19:25:48 +01:00
Claudiu Belu
2bc77b8a28 Adds Windows resource limits support
This will allow running Windows Containers to have their resource
limits updated through containerd. The CPU resource limits support
has been added for Windows Server 20H2 and newer, on older versions
hcsshim will raise an Unimplemented error.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-25 13:20:55 -07:00
Derek McGowan
cb6fb93af5
Merge pull request #6011 from crosbymichael/schedcore
add runc shim support for sched core
2021-10-08 10:42:16 -07:00
Derek McGowan
26ee1b1ee5
Merge pull request #4695 from crosbymichael/cri-class
[cri] Add CNI conf based on runtime class
2021-10-08 09:27:49 -07:00
Michael Crosby
e48bbe8394 add runc shim support for sched core
In linux 5.14 and hopefully some backports, core scheduling allows processes to
be co scheduled within the same domain on SMT enabled systems.

The containerd impl sets the core sched domain when launching a shim. This
allows a clean way for each shim(container/pod) to be in its own domain and any
additional containers, (v2 pods) be be launched with the same domain as well as
any exec'd process added to the container.

kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-10-08 16:18:09 +00:00
Michael Crosby
7b8a697f28
Merge pull request #6034 from claudiubelu/windows/fixes-image-volume
Fixes Windows containers with image volumes
2021-10-07 11:50:01 -04:00
Akihiro Suda
703b86533b
pkg/cap: remove an outdated comment
pkg/cap no longer depends on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-10-06 13:24:30 +09:00
Derek McGowan
63b7e5771e
Merge pull request #5973 from Juneezee/deprecate-ioutil
refactor: move from io/ioutil to io and os package
2021-10-01 10:52:06 -07:00
Claudiu Belu
791e175c79 Windows: Fixes Windows containers with image volumes
Currently, there are few issues that preventing containers
with image volumes to properly start on Windows.

- Unlike the Linux implementation, the Container volume mount paths
  were not created if they didn't exist. Those paths are now created.

- while copying the image volume contents to the container volume,
  the layers were not properly deactivated, which means that the
  container can't start since those layers are still open. The layers
  are now properly deactivated, allowing the container to start.

- even if the above issue didn't exist, the Windows implementation of
  mount/Mount.Mount deactivates the layers, which wouldn't allow us
  to copy files from them. The layers are now deactivated after we've
  copied the necessary files from them.

- the target argument of the Windows implementation of mount/Mount.Mount
  was unused, which means that folder was always empty. We're now
  symlinking the Layer Mount Path into the target folder.

- hcsshim needs its Container Mount Paths to be properly formated, to be
  prefixed by C:. This was an issue for Volumes defined with Linux-like
  paths (e.g.: /test_dir). filepath.Abs solves this issue.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-10-01 09:02:18 +00:00
haoyun
5c2426a7b2 cleanup: import from k8s.io/utils/clock/testing instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:34:56 +08:00
haoyun
6484fab1e0 cleanup: import from k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:27:20 +08:00
zounengren
fcffe0c83a switch usage directly to errdefs.(ErrAlreadyExists and ErrNotFound)
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-09-24 18:26:58 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Michael Crosby
55893b9be7 Add CNI conf based on runtime class
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-17 19:05:06 +00:00
Phil Estes
f40df3d72b
Enable image config labels in ctr and CRI container creation
Signed-off-by: Phil Estes <estesp@amazon.com>
2021-09-15 15:31:19 -04:00
Phil Estes
d081457ba4
Merge pull request #5974 from claudiubelu/hanging-task-delete-fix
task delete: Closes task IO before waiting
2021-09-15 11:30:23 -04:00
Fu Wei
e1ad779107
Merge pull request #5817 from dmcgowan/shim-plugins
Add support for shim plugins
2021-09-12 18:18:20 +08:00
Fu Wei
d9f921e4f0
Merge pull request #5906 from thaJeztah/replace_os_exec 2021-09-11 10:38:53 +08:00
Phil Estes
6589876d20
Merge pull request #5964 from crosbymichael/cni-pref
add ip_pref CNI options for primary pod ip
2021-09-10 12:06:23 -04:00
Fu Wei
689a863efe
Merge pull request #5939 from scuzhanglei/privileged-device 2021-09-10 22:15:46 +08:00
Michael Crosby
1ddc54c00d
Merge pull request #5954 from claudiubelu/fix-sandbox-remove
sandbox: Allows the sandbox to be deleted in NotReady state
2021-09-10 10:12:34 -04:00
Michael Crosby
1efed43090
add ip_pref CNI options for primary pod ip
This fixes the TODO of this function and also expands on how the primary pod ip
is selected. This change allows the operator to prefer ipv4, ipv6, or retain the
ordering provided by the return results of the CNI plugins.

This makes it much more flexible for ops to configure containerd and how IPs are
set on the pod.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-10 10:04:21 -04:00
scuzhanglei
756f4a3147 cri: add devices for privileged container
Signed-off-by: scuzhanglei <greatzhanglei@gmail.com>
2021-09-10 10:16:26 +08:00
Fu Wei
d58542a9d1
Merge pull request #5627 from payall4u/payall4u/cri-support-cgroup-v2 2021-09-09 23:10:33 +08:00
Claudiu Belu
55faa5e93d task delete: Closes task IO before waiting
After containerd restarts, it will try to recover its sandboxes,
containers, and images. If it detects a task in the Created or
Stopped state, it will be removed. This will cause the containerd
process it hang on Windows on the t.io.Wait() call.

Calling t.io.Close() beforehand will solve this issue.

Additionally, the same issue occurs when trying to stopp a sandbox
after containerd restarts. This will solve that case as well.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-07 02:17:01 -07:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI-plugin subscribes the image event on k8s.io namespace. By
default, the image event is created by CRI-API. However, the image can
be downloaded by containerd API on k8s.io with the customized labels.
The CRI-plugin should use patch update for `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Claudiu Belu
24cec9be56 sandbox: Allows the sandbox to be deleted in NotReady state
The Pod Sandbox can enter in a NotReady state if the task associated
with it no longer exists (it died, or it was killed). In this state,
the Pod network namespace could still be open, which means we can't
remove the sandbox, even if --force was used.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-02 03:40:56 -07:00
Mike Brown
e00f87f1dc
Merge pull request #5927 from adelina-t/ws_2022_image_update
Update Pause image in tests & config
2021-08-31 16:11:57 -05:00
Adelina Tuvenie
6d3d34b85d Update Pause image in tests & config
With the introduction of Windows Server 2022, some images have been updated
to support WS2022 in their manifest list. This commit updates the test images
accordingly.

Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>
2021-08-31 19:42:57 +03:00
Mikko Ylinen
e0f8c04dad cri: Devices ownership from SecurityContext
CRI container runtimes mount devices (set via kubernetes device plugins)
to containers by taking the host user/group IDs (uid/gid) to the
corresponding container device.

This triggers a problem when trying to run those containers with
non-zero (root uid/gid = 0) uid/gid set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.

It is possible to workaround the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.

This patch suggests to take RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting device_ownership_from_security_context runtime config value to
true (valid on Linux only).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-30 09:30:00 +03:00
Phil Estes
af1a0908d0
Merge pull request #5865 from dcantah/windows-pod-runasusername
Add RunAsUserName functionality for the Windows pod sandbox container
2021-08-25 22:25:14 -04:00
Sebastiaan van Stijn
2ac9968401
replace uses of os/exec with golang.org/x/sys/execabs
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.

This issue is not limited to the go command itself, and can also affect binaries
that use `os.Command`, `os.LookPath`, etc.

From the related blogpost (ttps://blog.golang.org/path-security):

> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing

This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-25 18:11:09 +02:00
Fu Wei
6fa9588531
Merge pull request #5903 from AkihiroSuda/gofmt117
Run `go fmt` with Go 1.17
2021-08-24 23:01:41 +08:00
Daniel Canter
25644b4614 Add RunAsUserName functionality for the Windows Pod Sandbox Container
There was recent changes to cri to bring in a Windows section containing a
security context object to the pod config. Before this there was no way to specify
a user for the pod sandbox container to run as. In addition, the security context
is a field for field mirror of the Windows container version of it, so add the
ability to specify a GMSA credential spec for the pod sandbox container as well.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-08-23 07:35:22 -07:00
payall4u
f8dfbee178 add cri test case
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:59:19 +08:00
payall4u
9a8bf13158 feature: add field LinuxContainerResources.Unified on cri
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:49:31 +08:00
Akihiro Suda
d3aa7ee9f0
Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Jacob Blain Christen
c3609ff4ca cri: filter selinux xattr for image volumes
Exclude the `security.selinux` xattr when copying content from layer
storage for image volumes. This allows for the already correct label
at the target location to be applied to the copied content, thus
enabling containers to write to volumes that they implicitly expect to be
able to write to.

- Fixes containerd/containerd#5090
- See rancher/rke2#690

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2021-08-20 23:47:24 -07:00
Phil Estes
ff2e58d114
Merge pull request #5131 from perithompson/windows-hostnetwork
Add Windows HostProcess Support
2021-08-20 14:29:37 -04:00
Kazuyoshi Kato
4dd5ca70fb script: update golangci-lint from v1.38.0 and v1.36.0 to v1.42.0
golint has been deprecated and replaced by revive since v1.41.0.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-08-19 16:27:16 -07:00
Derek McGowan
8d135d2842
Add support for shim plugins
Refactor shim v2 to load and register plugins.
Update init shim interface to not require task service implementation on
returned service, but register as plugin if it is.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-17 11:06:09 -07:00
Gunju Kim
1224060f89 Allow expanded DNS configuration
Signed-off-by: Gunju Kim <gjkim042@gmail.com>
2021-08-14 06:13:01 +09:00
Peri Thompson
79b369a0bb
Added windows hostProcess cni skip
Signed-off-by: Peri Thompson <perit@vmware.com>
2021-08-11 22:23:49 +01:00
Michael Crosby
218db0f9af
Merge pull request #5835 from dmcgowan/plugin-events-cleanup
Move plugin context events into separate plugin
2021-08-07 21:47:11 -04:00
Derek McGowan
0a0621bb47
Move plugin context events into separate plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:59:20 -07:00
Derek McGowan
6f027e38a8
Remove redundant build tags
Remove build tags which are already implied by the name of the file.
Ensures build tags are used consistently

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:27:46 -07:00
Derek McGowan
caf9e256b7
Merge pull request #5693 from kzys/sigrtmin
Support SIGRTMIN+n signals
2021-07-27 11:58:57 -07:00
Davanum Srinivas
494b940f14
Introduce a new go module - containerd/api for use in standalone clients
In containerd 1.5.x, we introduced support for go modules by adding a
go.mod file in the root directory. This go.mod lists all the things
needed across the whole code base (with the exception of
integration/client which has its own go.mod). So when projects that
need to make calls to containerd API will pull in some code from
containerd/containerd, the `go mod` commands will add all the things
listed in the root go.mod to the projects go.mod file. This causes
some problems as the list of things needed to make a simple API call
is enormous. in effect, making a API call will pull everything that a
typical server needs as well as the root go.mod is all encompassing.
In general if we had smaller things folks could use, that will make it
easier by reducing the number of things that will end up in a consumers
go.mod file.

Now coming to a specific problem, the root containerd go.mod has various
k8s.io/* modules listed. Also kubernetes depends on containerd indirectly
via both moby/moby (working with docker maintainers seperately) and via
google/cadvisor. So when the kubernetes maintainers try to use latest
1.5.x containerd, they will see the kubernetes go.mod ending up depending
on the older version of kubernetes!

So if we can expose just the minimum things needed to make a client API
call then projects like cadvisor can adopt that instead of pulling in
the entire go.mod from containerd. Looking at the existing code in
cadvisor the minimum things needed would be the api/ directory from
containerd. Please see proof of concept here:
github.com/google/cadvisor/pull/2908

To enable that, in this PR, we add a go.mod file in api/ directory. we
split the Protobuild.yaml into two, one for just the things in api/
directory and the rest in the root directory. We adjust various targets
to build things correctly using `protobuild` and also ensure that we
end up with the same generated code as before as well. To ensure we
better take care of the various go.mod/go.sum files, we update the
existing `make vendor` and also add a new `make verify-vendor` that one
can run locally as well in the CI.

Ideally, we would have a `containerd/client` either as a standalone repo
or within `containerd/containerd` as a separate go module. but we will
start here to experiment with a standalone api go module first.

Also there are various follow ups we can do, for example @thaJeztah has
identified two tasks we could do after this PR lands:

github.com/containerd/containerd/pull/5716#discussion_r668821396

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-07-27 07:34:59 -04:00
Kazuyoshi Kato
1d3d08026d Support SIGRTMIN+n signals
systemd uses SIGRTMIN+n signals, but containerd didn't support the signals
since Go's sys/unix doesn't support them.

This change introduces SIGRTMIN+n handling by utilizing moby/sys/signal.

Fixes #5402.

https://www.freedesktop.org/software/systemd/man/systemd.html#Signals

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-07-26 09:36:43 -07:00
Wei Fu
ac75071b49 remove pkg/cri/platforms package
The package is a duplicate of platforms. No need to maintain
pkg/cri/platforms.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-07-10 10:14:27 +08:00
Brian Goff
0a8802df67 Allow WithServices to use custom implementations
Before this change, for several of the services that `WithServices`
handles, only the grpc client is supported.
Now, for instance, one can use an `images.Store` directly instead of
only an `imagesapi.StoreSlient`.

Some of the methods have been renamed to satisfy the difference between
using a grpc `<Foo>Client` vs the main interface.

I did not see a good candidate for TaskService so have left that mostly
unchanged.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-07-09 23:30:40 +00:00
Phil Estes
cf600abecc
Merge pull request #5619 from mikebrow/cri-add-v1-proxy-alpha
[CRI] move up to CRI v1 and support v1alpha in parallel
2021-07-09 14:07:24 -04:00
Mike Brown
d1c1051927 use fu wei's suggeted interface pick for marshaling
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-07 15:45:45 -05:00
Mike Brown
14962dcbd2 add alpha version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-06 11:40:20 -05:00
Mike Brown
a5c417ac06 move up to CRI v1 and support v1alpha in parallel
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-28 09:34:12 -05:00
Dan Williams
dac2543a07 sandbox: send pod UID to CNI plugins as K8S_POD_UID
CNI plugins that need to wait for network state to converge
may want to cancel waiting when a short lived pod is deleted.
However, there is a race between when kubelet asks the runtime
to create the sandbox for the pod, and when the plugin is able
request the pod object from the apiserver. It may be the case
that the plugin receives the new pod, rather than the pod
the sandbox request was initiated for.

Passing the pod UID to the plugin allows the plugin to check
whether the pod it gets from the apiserver is actually the
pod its sandbox request was started for.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-06-22 22:53:30 -05:00
Mike Brown
560e7d4799 fixing some doc links
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-21 18:24:47 -05:00
Kazuyoshi Kato
1bbee573af github.com/golang/protobuf/proto is deprecated
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-06-17 10:28:48 -04:00
Quan Tian
728743eb28 Fix cleanup context of teardownPodNetwork
Similar to other deferred cleanup operations, teardownPodNetwork should
use a different context as the original context may have expired,
otherwise CNI wouldn't been invoked, leading to leak of network
resources, e.g. IP addresses.

Signed-off-by: Quan Tian <qtian@vmware.com>
2021-06-04 19:17:05 +08:00
zounengren
498bb36f67 scrub the stale TODO
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-06-01 11:22:15 +08:00
Shiming Zhang
79e3452213 update the link
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Shiming Zhang
1acca8bba3 Don't check for apparmor_parser to be present
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Phil Estes
e47400cbd2
Merge pull request #5100 from adisky/skip-tls-localHost
Skip TLS verification for localhost
2021-05-12 14:56:53 -04:00
Kevin Parsons
b0d3b35b28 windows: Use GetFinalPathNameByHandle for ResolveSymbolicLink
This change splits the definition of pkg/cri/os.ResolveSymbolicLink by
platform (windows/!windows), and switches to an alternate implementation
for Windows. This aims to fix the issue described in containerd/containerd#5405.

The previous implementation which just called filepath.EvalSymlinks has
historically had issues on Windows. One of these issues we were able to
fix in Go, but EvalSymlinks's behavior is not well specified on
Windows, and there could easily be more issues in the future, so it
seems prudent to move to a separate implementation for Windows.

The new implementation uses the Windows GetFinalPathNameByHandle API,
which takes a handle to an open file or directory and some flags, and
returns the "real" name for the object. See comments in the code for
details on the implementation.

I have tested this change with a variety of mounts and everything seems
to work as expected. Functions that make incorrect assumptions on what a
Windows path can look like may have some trouble with the \\?\ path
syntax. For instance EvalSymlinks fails when given a \\?\UNC\ path. For
this reason, the resolvePath implementation modifies the returned path
to translate to the more common form (\\?\UNC\server\share ->
\\server\share).

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
2021-05-04 11:55:11 -07:00
Mike Brown
c1a35232d8
Merge pull request #5446 from Random-Liu/fix-auth-config
Fix different registry hosts referencing the same auth config.
2021-05-04 06:21:02 -05:00
Lantao Liu
81402e4758 Fix different registry hosts referencing the same auth config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-05-03 17:42:57 -07:00
Aditi Sharma
8014d9fee0 Skip TLS verification for localhost
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-05-03 10:21:54 +05:30
Mike Brown
3dad67eedb
Merge pull request #5203 from tghartland/5008-target-namespace
Support PID NamespaceMode_TARGET
2021-04-21 19:41:38 -05:00
Maksym Pavlenko
cb64dc8250
Merge pull request #5401 from Iceber/use-unbuffered-channel
process: use the unbuffered channel as the done signal
2021-04-21 16:50:32 -07:00
Thomas Hartland
efcb187429 Add unit tests for PID NamespaceMode_TARGET validation
Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 19:59:10 +02:00
Thomas Hartland
b48f27df6b Support PID NamespaceMode_TARGET
This commit adds support for the PID namespace mode TARGET
when generating a container spec.

The container that is created will be sharing its PID namespace
with the target container that was specified by ID in the namespace
options.

Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 17:54:17 +02:00
Iceber Gu
909660ea92 process: use the unbuffered channel as the done signal
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-04-21 18:24:18 +08:00
Michael Crosby
a3fe5c84c0
Merge pull request #5383 from wzshiming/clean/process-io
move common code to pkg/process from runtime
2021-04-20 14:40:12 -04:00
Phil Estes
81c4ac202f
Merge pull request #5256 from kolyshkin/seccomp-enabled
pkg/seccomp: simplify and speed up isEnabled
2021-04-19 15:59:28 -04:00
Shiming Zhang
7966a6652a Cleanup code
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-04-19 16:59:45 +08:00
Sebastiaan van Stijn
1c03c377e5
go.mod: github.com/containerd/fifo v1.0.0
full diff: https://github.com/containerd/fifo/compare/115abcc95a1d...v1.0.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-19 09:27:45 +02:00
Kir Kolyshkin
3292ea5862 pkg/seccomp: use sync.Once to speed up IsEnabled
It does not make sense to check if seccomp is supported by the kernel
more than once per runtime, so let's use sync.Once to speed it up.

A quick benchmark (old implementation, before this commit, after):

BenchmarkIsEnabledOld-4           37183            27971 ns/op
BenchmarkIsEnabled-4            1252161              947 ns/op
BenchmarkIsEnabledOnce-4      666274008             2.14 ns/op

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Kir Kolyshkin
00b5c99b1a pkg/seccomp: simplify IsEnabled, update doc
Current implementation of seccomp.IsEnabled (rooted in runc) is not
too good.

First, it parses the whole /proc/self/status, adding each key: value
pair into the map (lots of allocations and future work for garbage
collector), when using a single key from that map.

Second, the presence of "Seccomp" key in /proc/self/status merely means
that kernel option CONFIG_SECCOMP is set, but there is a need to _also_
check for CONFIG_SECCOMP_FILTER (the code for which exists but never
executed in case /proc/self/status has Seccomp key).

Replace all this with a single call to prctl; see the long comment in
the code for details.

While at it, improve the IsEnabled documentation.

NOTE historically, parsing /proc/self/status was added after a concern
was raised in https://github.com/opencontainers/runc/pull/471 that
prctl(PR_GET_SECCOMP, ...) can result in the calling process being
killed with SIGKILL. This is a valid concern, so the new code here
does not use PR_GET_SECCOMP at all.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Phil Estes
4f18131239
Merge pull request #5286 from payall4u/optimize-cri-redirect-logs
cri: Reduce the cpu usage of  the function redirectLogs in cri
2021-04-14 21:33:05 -04:00
Sebastiaan van Stijn
864a3322b3
go.mod: github.com/containerd/go-cni v1.0.2
full diff: https://github.com/containerd/go-cni/compare/v1.0.1...v1.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-14 09:09:18 +02:00
Mike Brown
8a04bd0521 address recent runtimes config confusion
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 15:33:38 -05:00
Mike Brown
e96d2a5d90 Revert "remove two very old no longer used runtime options"
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 10:16:01 -05:00
Fu, Wei
7e3fd8da24
Merge pull request #5298 from jsturtevant/issue-5297
Support multi-arch images for Windows via ctr
2021-04-12 13:52:14 +08:00
payall4u
4bc8f692fc optimize cri redirect logs
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-04-09 11:45:53 +08:00
Fu, Wei
d064140369
Merge pull request #5302 from mikebrow/toml-cri-defaults
shows our runc.v2 default options
2021-04-09 11:11:25 +08:00
Sebastiaan van Stijn
9bc8d63c9f
cri/server: use containerd/oci instead of libcontainer/devices
Looks like we had our own copy of the "getDevices" code already, so use
that code (which also matches the code that's used to _generate_ the spec,
so a better match).

Moving the code to a separate file, I also noticed that the _unix and _linux
code was _exactly_ the same (baring some `//nolint:` comments), so also
removing the duplicated code.

With this patch applied, we removed the dependency on the libcontainer/devices
package (leaving only libcontainer/user).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-08 23:25:21 +02:00
Mike Brown
dd16b006e5 merge in the move to the new options type
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-08 14:09:59 -05:00
Mike Brown
9144ce9677 shows our runc.v2 default options in the containerd default config
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-08 14:09:59 -05:00
Aditi Sharma
4d4117415e Change CRI config runtime options type
Changing Runtime.Options type to map[string]interface{}
to correctly marshal it from go to JSON.
See issue: https://github.com/kubernetes-sigs/cri-tools/issues/728

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-04-08 15:11:33 +05:30
Mike Brown
88880f0f2c
Merge pull request #5304 from mikebrow/cri-registry-doc-updates
remove mirrors from default; document the deprecation of registry.configs and registry.mirrors
merging based on LGTMs from https://github.com/containerd/containerd/pull/5304#pullrequestreview-628234110 and https://github.com/containerd/containerd/pull/5304#pullrequestreview-630478887 thanks!
2021-04-07 14:49:36 -05:00
Mike Brown
d4be6aa8fa rm mirror defaults; doc registry deprecations
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-07 12:29:43 -05:00
Akihiro Suda
8ba8533bde
pkg/cri/opts.WithoutRunMount -> oci.WithoutRunMount
Move `pkg/cri/opts.WithoutRunMount` function to `oci.WithoutRunMount`
so that it can be used without dependency on CRI.

Also add `oci.WithoutMounts(dests ...string)` for generality.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-07 21:25:36 +09:00
Mike Brown
0186a329e9 remove two very old no longer used runtime options
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-06 20:41:09 -05:00
Derek McGowan
261c107ffc
Merge pull request #5278 from mxpv/toml
Migrate TOML to github.com/pelletier/go-toml
2021-04-01 21:24:52 -07:00
Maksym Pavlenko
5ada2f74a7 Keep host order as defined in TOML file
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-04-01 09:29:16 -07:00
James Sturtevant
d9ff8ebef5 support multi-arch images for windows via ctr
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2021-03-31 15:50:01 -07:00
Mike Brown
1b05b605c8
Merge pull request #5145 from aojea/happyeyeballs
use (sort of) happy-eyeballs for port-forwarding
2021-03-26 09:51:29 -05:00
Maksym Pavlenko
ddd4298a10 Migrate current TOML code to github.com/pelletier/go-toml
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-25 13:13:33 -07:00
Derek McGowan
75a0c2b7d3
Merge pull request #5264 from mxpv/tests
Run unit tests on CI for MacOS
2021-03-25 09:46:25 -07:00
Fu, Wei
80fa9fe32a
Merge pull request #5135 from AkihiroSuda/default-config-crypt
add imgcrypt stream processors to the default config
2021-03-25 14:31:38 +08:00
Maksym Pavlenko
4674ad7beb Ignore some tests on darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-24 22:40:22 -07:00
Maksym Pavlenko
181e2d4216
Merge pull request #5250 from dmcgowan/cri-fix-reference-ordering
Fix reference ordering in CRI image store
2021-03-23 14:45:16 -07:00
Sebastiaan van Stijn
708299ca40
Move RunningInUserNS() to its own package
This allows using the utility without bringing whole of "sys" with it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-23 11:29:53 +01:00
Derek McGowan
0886ceaea2
Fix reference ordering in CRI image store
Currently image references end up being stored in a
random order due to the way maps are iterated through
in Go. This leads to inconsistent identifiers being
resolved when a single reference is needed to identify
an image and the ordering of the references is used for
the selection.

Sort references in a consistent and ranked manner,
from higher information formats to lower.

Note: A `name + tag` reference is considered higher
information than a `name + digest` reference since a
registry may be used to resolve the digest from a
`name + tag` reference.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-22 22:29:57 -07:00
Antonio Ojea
305b425830 use happy-eyeballs for port-forwarding
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12.
It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially, trying IPv4 first for keeping the same behaviour.
xref https://github.com/golang/go/issues/44922

Signed-off-by: Antonio Ojea <aojea@redhat.com>
2021-03-22 20:15:24 +01:00
Michael Crosby
e0c94bb269
Merge pull request #4708 from kzys/enable-criu
Re-enable CRIU tests by not using overlayfs snapshotter
2021-03-19 14:23:05 -04:00
Shiming Zhang
1410220d8f Fix error log when copy file
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-03-20 00:13:00 +08:00
Michael Crosby
3f98a6d2d3
Merge pull request #5211 from pacoxu/pause/3.5
upgrade pause image to 3.5 for non-root
2021-03-18 11:43:59 -04:00
Phil Estes
32a08f1a6a
Merge pull request #4847 from cpuguy83/devices_by_dir
Support adding devices by dir
2021-03-17 09:41:02 -04:00
Kazuyoshi Kato
b520428b5a Fix CRIU
- process.Init#io could be nil
- Make sure CreateTaskRequest#Options is not empty before unmarshaling

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-03-16 16:46:45 -07:00
pacoxu
ffff688663 upgrade pause image to 3.5 for non-root
Signed-off-by: pacoxu <paco.xu@daocloud.io>
2021-03-16 23:20:35 +08:00
Derek McGowan
2755ead927
Merge pull request #4978 from cpuguy83/certs_dir
Add support for using a host registry dir in cri
2021-03-15 13:47:03 -07:00
Brian Goff
7776e5ef2a Support adding devices by dir
This enables cases where devices exist in a subdirectory of /dev,
particularly where those device names are not portable across machines,
which makes it problematic to specify from a runtime such as cri.

Added this to `ctr` as well so I could test that the code at least
works.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-15 16:42:23 +00:00
Akihiro Suda
ecb881e5e6
add imgcrypt stream processors to the default config
Enable the following config by default:

```toml
version = 2

[plugins."io.containerd.grpc.v1.cri".image_decryption]
  key_model = "node"

[stream_processors]
  [stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
    accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
    returns = "application/vnd.oci.image.layer.v1.tar+gzip"
    path = "ctd-decoder"
    args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
    env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
  [stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
    accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
    returns = "application/vnd.oci.image.layer.v1.tar"
    path = "ctd-decoder"
    args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
    env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
```

Fix issue 5128

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-15 13:27:16 +09:00
Brian Goff
b0b6d9aa03 Add support for using a host registry dir in cri
This will be used instead of the cri registry config in the main config
toml.

---

Also pulls in changes from containerd/cri@d0b4eecbb3

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-12 22:42:22 +00:00
Derek McGowan
8cf669ce34
Fix unsupported files exporting functions for apparmor and seccomp
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-12 08:47:05 -08:00
Derek McGowan
35eeb24a17
Fix exported comments enforcer in CI
Add comments where missing and fix incorrect comments

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-12 08:47:05 -08:00
Iceber Gu
f37ae8fc35
move to v3.4.1 for the pause image
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-03-07 15:21:20 +08:00
Iceber Gu
92ab1a63b0 cri: fix container status
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-03-05 00:00:10 +08:00
f00231050
591caece0c cri: check fsnotify watcher when receiving cni conf dir events
carry: 612f5f9f44

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-03-03 16:46:41 +08:00
Phil Estes
8dbe53a2a9
Merge pull request #5070 from yoheiueda/empty-masked
cri: set default masked/readonly paths to empty paths
2021-02-25 15:38:45 -05:00
Akihiro Suda
7ee610edb5
drop dependency on github.com/syndtr/gocapability
pkg/cap has the full list of the caps (for UT, originally),
so we can drop dependency on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-25 15:17:28 +09:00
Akihiro Suda
9822173354
cap: rename FromUint64 to FromBitmap
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-25 15:02:10 +09:00
Yohei Ueda
07f1df4541
cri: set default masked/readonly paths to empty paths
Fixes #5029.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2021-02-24 23:50:40 +09:00
Phil Estes
757be0a090
Merge pull request #5017 from AkihiroSuda/parse-cap
oci.WithPrivileged: set the current caps, not the known caps
2021-02-23 09:10:57 -05:00
Mike Brown
9173d3e929
Merge pull request #5021 from wzshiming/fix/signal_repeatedly
Fix repeated sending signal
2021-02-22 09:45:56 -06:00
Justin Terry (SF)
06e4e09567 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Justin Terry (SF) <juterry@microsoft.com>
2021-02-18 16:39:28 -08:00
Phil Estes
c32ccdf8be
Merge pull request #5024 from yadzhang/deepcopy-imageconfig
cri: append envs from image config to empty slice to avoid env lost
2021-02-18 12:51:51 -05:00
Akihiro Suda
746cef0bc2
Merge pull request #5044 from wzshiming/fix/empty-error-warpping
Fix empty error warpping
2021-02-18 13:47:13 +09:00
zhangyadong.0808
08318b1ab9 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Yadong Zhang <yadzhang@gmail.com>
2021-02-18 11:37:41 +08:00
Shiming Zhang
59db8a10e0 Fix empty error warpping
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-18 11:06:59 +08:00
Shiming Zhang
dc6f5ef3b9 Fix repeated sending signal
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-17 21:33:49 +08:00
Michael Crosby
41e3057cc6
Merge pull request #5025 from jeremyje/win20h2
Add references to Windows 20H2 test images.
2021-02-12 11:58:49 -05:00
Lorenz Brun
36d0bc1f2b Allow moving netns directory into StateDir
Signed-off-by: Lorenz Brun <lorenz@nexantic.com>
2021-02-10 18:33:14 +01:00
Akihiro Suda
a2d1a8a865
oci.WithPrivileged: set the current caps, not the known caps
This change is needed for running the latest containerd inside Docker
that is not aware of the recently added caps (BPF, PERFMON, CHECKPOINT_RESTORE).

Without this change, containerd inside Docker fails to run containers with
"apply caps: operation not permitted" error.

See kubernetes-sigs/kind 2058

NOTE: The caller process of this function is now assumed to be as
privileged as possible.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-10 17:14:17 +09:00
Michael Crosby
e874e2597e [cri] add pod annotations to CNI call
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-09 13:24:01 -05:00
Jeremy Edwards
1c81071d39 Add references to Windows 20H2 test images.
Signed-off-by: Jeremy Edwards <1312331+jeremyje@users.noreply.github.com>
2021-02-09 16:25:36 +00:00
Derek McGowan
b3f2402062
Merge pull request #5002 from crosbymichael/anno-image-name
[cri] add image-name annotation
2021-02-05 08:27:41 -08:00
Akihiro Suda
e908be5b58
Merge pull request #5001 from kzys/no-lint-upgrade 2021-02-06 00:40:38 +09:00
Kazuyoshi Kato
07db46ee23 lint: update nolint syntax for golangci-lint
Newer golangci-lint needs explicit `//` separator. Otherwise it treats
the entire line (`staticcheck deprecated ... yet`) as a name.

https://golangci-lint.run/usage/false-positives/#nolint

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-02-04 11:59:55 -08:00
Sebastiaan van Stijn
04d061fa6a
update runc to v1.0.0-rc93
full diff: https://github.com/opencontainers/runc/compare/v1.0.0-rc92...v1.0.0-rc93

also removes dependency on libcontainer/configs

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:13:30 +01:00
Sebastiaan van Stijn
54cc3483ff
pkg/cri/server: don't import libcontainer/configs
Looks like this import was not needed for the test; simplified the test
by just using the device-path (a counter would work, but for debugging,
having the list of paths can be useful).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:08:39 +01:00
Michael Crosby
99cb62f233 [cri] add image-name annotation
For some tools having the actual image name in the annotations is helpful for
debugging and auditing the workload.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-04 07:05:11 -05:00
Lantao Liu
b5bf1fd5d8 Fix deprecated registry auth conversion.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-02-03 19:22:26 -08:00
Aditi Sharma
1423e9199d Update gogo/protobuf to v1.3.2
bump version 1.3.2 for gogo/protobuf due to CVE-2021-3121 discovered
in gogo/protobuf version 1.3.1, CVE has been fixed in 1.3.2

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-01-28 12:57:50 +00:00
Michael Crosby
591d7e2fb1 remove exec sync debug contents from logs
This was dumping untrusted output to the debug logs from user containers.
We should not dump this type of information to reduce log sizes and any
information leaks from user containers.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-26 14:57:54 -05:00
Alban Crequy
28e4fb25f4 cri: add annotations for pod name and namespace
cri-o has annotations for pod name, namespace and container name:
https://github.com/containers/podman/blob/master/pkg/annotations/annotations.go

But so far containerd had only the container name.

This patch will be useful for seccomp agents to have a different
behaviour depending on the pod (see runtime-spec PR 1074 and runc PR
2682). This should simplify the code in:
b2d423695d/pkg/kuberesolver/kuberesolver.go (L16-L27)

Signed-off-by: Alban Crequy <alban@kinvolk.io>
2021-01-26 12:10:39 +01:00
Wei Fu
e56de63099 cri: handle sandbox/container exit event separately
The event monitor handles exit events one by one. If there is something
wrong about deleting task, it will slow down the terminating Pods. In
order to reduce the impact, the exit event watcher should handle exit
event separately. If it failed, the watcher should put it into backoff
queue and retry it.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-01-24 13:43:38 +08:00
Shengjing Zhu
2818fdebaa Move runtimeoptions out of cri package
Since it's a standard set of runtime opts, and used in ctr as well,
it could be moved out of cri.

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2021-01-23 01:24:35 +08:00
Michael Crosby
a731039238 [cri] label etc files for selinux containers
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-19 13:42:09 -05:00
Mike Brown
550b4949cb
Merge pull request #4700 from mikebrow/cri-security-profile-update
CRI security profile update for CRI graduation
2021-01-12 12:21:56 -06:00
Sebastiaan van Stijn
2374178c9b
pkg/cri/server: optimizations in unmountRecursive()
Use a PrefixFilter() to get only the mounts we're interested in,
which removes the need to manually filter mounts from the mountinfo
results.

Additional optimizations can be made, as:

> ... there's a little known fact that `umount(MNT_DETACH)` is actually
> recursive in Linux, IOW this function can be replaced with
> `unix.Umount(target, unix.MNT_DETACH)` (or `mount.UnmountAll(target, unix.MNT_DETACH)`
>  (provided that target itself is a mount point).

e8fb2c392f (r535450446)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:32:01 +01:00
Sebastiaan van Stijn
7572919201
mount: remove remaining uses of mount.Self()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:31:59 +01:00
Davanum Srinivas
1f5b84f27c
[CRI] Reduce clutter of log entries during process execution
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-01-06 13:09:03 -05:00
Shengjing Zhu
5988bfc1ef docs: Various typo found by codespell
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-12-22 13:22:16 +08:00
Michael Crosby
2e442ea485 [cri] ensure log dir is created
containerd is responsible for creating the log but there is no code to ensure
that the log dir exists.  While kubelet should have created this there can be
times where this is not the case and this can cause stuck tasks.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-17 15:04:39 -05:00
Akihiro Suda
7e6e4c466f
remove "selinux" build tag
The build tag was removed in go-selinux v1.8.0: opencontainers/selinux#132

Related: remove "apparmor" build tag: 0a9147f3aa

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-15 20:05:25 +09:00
Akihiro Suda
0a9147f3aa
remove "apparmor" build tag
The "apparmor" build tag does not have any cgo dependency and can be removed safely.

Related: https://github.com/opencontainers/runc/issues/2704

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-08 19:22:39 +09:00
Mike Brown
6467c3374d refactor based on comments
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-07 21:39:31 -06:00
Phil Estes
73a301c7a1
Merge pull request #4772 from gaurav1086/ValidatePluginConfig_fix_range_iterator_issue
[cri/config] : fix range iterator issue in ValidatePluginConfig
2020-12-07 12:42:07 -05:00
Phil Estes
efad13faaf
Merge pull request #4811 from AkihiroSuda/expose-apparmor
expose hostSupportsAppArmor()
2020-12-07 08:22:16 -05:00
Akihiro Suda
55eda46b22
expose hostSupportsAppArmor()
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-07 19:12:59 +09:00
Gaurav Singh
071a185506 cri/config: fix range iterator issue in ValidatePluginConfig
Go uses the same address variable while iterating in a range,
so use a copy when using its address.

Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-12-04 17:37:09 -05:00
Mike Brown
b4727eafbe adding code to support seccomp apparmor securityprofile
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-04 15:15:32 -06:00
Michael Crosby
3d358c9df3 [cri] don't clear base security settings
When a base runtime spec is being used, admins can configure defaults for the
spec so that default ulimits or other security related settings get applied for
all containers launched.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-02 06:51:37 -05:00
Shengjing Zhu
fe767f95c7 Fix package name in cri runtimeoptions protobuf
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-11-22 16:15:34 +08:00
Maksym Pavlenko
2837fb35a7
Merge pull request #4715 from thaJeztah/remove_libcontainer_apparmor
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
2020-11-18 14:34:48 -08:00
Sebastiaan van Stijn
eba94a15c8
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
recent versions of libcontainer/apparmor simplified the AppArmor
check to only check if the host supports AppArmor, but no longer
checks if apparmor_parser is installed, or if we're running
docker-in-docker;

bfb4ea1b1b

> The `apparmor_parser` binary is not really required for a system to run
> AppArmor from a runc perspective. How to apply the profile is more in
> the responsibility of higher level runtimes like Podman and Docker,
> which may do the binary check on their own.

This patch copies the logic from libcontainer/apparmor, and
restores the additional checks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-11-12 15:42:25 +01:00
Jacob Blain Christen
a1e7dd939d cri: selinuxrelabel=false for /dev/shm w/ host ipc
This is a followup to #4699 that addresses an oversight that could cause
the CRI to relabel the host /dev/shm, which should be a no-op in most
cases. Additionally, fixes unit tests to make correct assertions for
/dev/shm relabeling.

Discovered while applying the changes for #4699 to containerd/cri 1.4:
https://github.com/containerd/cri/pull/1605

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-11 15:22:17 -07:00
Jacob Blain Christen
e8d8ae3b97 cri: selinux relabel /dev/shm
Address an issue originally seen in the k3s 1.3 and 1.4 forks of containerd/cri, https://github.com/rancher/k3s/issues/2240

Even with updated container-selinux policy, container-local /dev/shm
will get mounted with container_runtime_tmpfs_t because it is a tmpfs
created by the runtime and not the container (thus, container_runtime_t
transition rules apply). The relabel mitigates such, allowing envoy
proxy to work correctly (and other programs that wish to write to their
/dev/shm) under selinux.

Tested locally with:
- SELINUX=Enforcing vagrant up --provision-with=shell,selinux,test-integration
- SELINUX=Enforcing CRITEST_ARGS=--ginkgo.skip='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri
- SELINUX=Permissive CRITEST_ARGS=--ginkgo.focus='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-06 12:05:17 -07:00
Sebastiaan van Stijn
1146098421
replace pkg/symlink with moby/sys/symlink
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-10-30 00:05:15 +01:00
Derek McGowan
5184bccea3
Merge pull request #4631 from dims/copy-a-few-packages-from-moby/moby
Copy pkg/symlink and pkg/truncindex from moby/moby
2020-10-29 09:13:30 -07:00
Wei Fu
f2e8fda82b
Merge pull request #4665 from dmcgowan/update-default-snapshot-annotations
Update make snapshot annotations disabled by default
2020-10-28 21:12:02 +08:00
Derek McGowan
b2642458f9
Update make snapshot annotations disabled by default
This experimental feature should not be enabled by default as
it is not used by any default snapshotters.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-27 21:32:25 -07:00
Akihiro Suda
8ff2707a3c
Merge pull request #4610 from shahzzzam/samashah/add-annotations
Add manifest digest annotation for snapshotters
2020-10-28 13:11:49 +09:00
zhuangqh
30c9addd6c fix: always set unknown to false when handling exit event
Signed-off-by: jerryzhuang <zhuangqhc@gmail.com>
2020-10-27 10:50:15 +08:00
Davanum Srinivas
a9cb22309a
Copy pkg/symlink and pkg/truncindex from moby/moby
moby/moby SHA : 9c15e82f19b0ad3c5fe8617a8ec2dddc6639f40a

github.com/docker/docker/pkg/truncindex/truncindex.go -> pkg/cri/store/truncindex/truncindex.go
github.com/docker/docker/pkg/symlink/LICENSE.APACHE -> pkg/symlink/LICENSE.APACHE
github.com/docker/docker/pkg/symlink/LICENSE.BSD -> pkg/symlink/LICENSE.BSD
github.com/docker/docker/pkg/symlink/README.md -> pkg/symlink/README.md
github.com/docker/docker/pkg/symlink/fs.go -> pkg/symlink/fs.go
github.com/docker/docker/pkg/symlink/fs_unix.go -> pkg/symlink/fs_unix.go
github.com/docker/docker/pkg/symlink/fs_windows.go -> pkg/symlink/fs_windows.go

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-10-15 08:36:35 -04:00
Daniel Canter
cdb2f9c66f Filter snapshotter labels passed to WithNewSnapshot
Made a change yesterday that passed through snapshotter labels into the wrapper of
WithNewSnapshot, but it passed the entirety of the annotations into the snapshotter.
This change just filters the set that we care about down to snapshotter specific
labels.

Will probably be future changes to add some more labels for LCOW/WCOW and the corresponding
behavior for these new labels.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-15 04:49:39 -07:00
Phil Estes
9b70de01d6
Merge pull request #4630 from dcantah/pass-snapshotter-opt
Cri - Pass snapshotter labels into customopts.WithNewSnapshot
2020-10-14 10:54:06 -04:00
Daniel Canter
9a1f6ea4dc Cri - Pass snapshotter labels into customopts.WithNewSnapshot
Previously there wwasn't a way to pass any labels to snapshotters as the wrapper
around WithNewSnapshot didn't have a parm to pass them in.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-14 04:14:03 -07:00
Daniel Canter
d74225b588 Fix comment in RemovePodSandbox
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-12 17:59:08 -07:00
zhangjianming
116902cd21 fix no-pivot not working in io.containerd.runtime.v1.linux
Signed-off-by: zhangjianming <zhang.jianming7@zte.com.cn>
2020-10-12 09:39:59 +08:00
Maksym Pavlenko
3d02441a79 Refactor pkg packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-08 17:30:17 -07:00
Akihiro Suda
915263f269
Merge pull request #4502 from akshat-kmr/master
Add logging binary support when terminal is true
2020-10-08 12:14:39 +09:00
Samarth Shah
5fc721370d Add manifest digest annotation for snapshotters
Signed-off-by: Samarth Shah <samarthmshah@gmail.com>
2020-10-07 23:12:01 +00:00
Maksym Pavlenko
3508ddd3dd Refactor CRI packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-07 14:45:57 -07:00
Derek McGowan
b22b627300
Move cri server packages under pkg/cri
Organizes the cri related server packages under pkg/cri

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 13:09:37 -07:00
Derek McGowan
1c60ae7f87
Use local version of cri packages
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:59:40 -07:00
Derek McGowan
e7a350176a
Merge containerd/cri into containerd/containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:58:39 -07:00
Derek McGowan
0820015314 Prepare cri for merge to containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:58:39 -07:00
Michael Crosby
a0b3b4e4da
Merge pull request #1593 from moolen/fix/add-nri-labels
Add missing sandbox labels when invoking nri
2020-10-07 13:17:06 -04:00
Derek McGowan
07c98d0bf1 Fix lint in Unix environments
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-05 17:46:01 -07:00
Moritz Johner
f87302ab20 Add missing sandbox labels when invoking nri plugins
Signed-off-by: Moritz Johner <beller.moritz@googlemail.com>
2020-10-03 23:30:09 +02:00
Derek McGowan
a3c0e8859c Align lint checks with containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-09-30 23:17:46 -07:00
Derek McGowan
83e6efc6fc Use tabs in protofile indentation
This is enforced as part of containerd's fmt checks

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-09-30 21:50:33 -07:00
Mike Brown
2e3bebb297
Merge pull request #1583 from thaJeztah/simplify_ensure_removeall_windows
pkg/server: make ensureRemoveAll() an alias for os.RemoveAll() on Windows
2020-09-28 14:26:18 -05:00
Mike Brown
b1ee4c0d7b
Merge pull request #1570 from yoheiueda/masked
Set masked and readonly paths based on default Unix spec
2020-09-24 15:45:58 -05:00
Sebastiaan van Stijn
e2928124d1
pkg/server: make ensureRemoveAll() an alias for os.RemoveAll() on Windows
The tricks performed by ensureRemoveAll only make sense for Linux and
other Unices, so separate it out, and make ensureRemoveAll for Windows
just an alias of os.RemoveAll.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-22 10:11:46 +02:00
ktock
e571fd864f Limit value size of additional annotation for avoiding unpack failure
In containerd, there is a size limit for label size (4096 chars).
Currently if an image has many layers (> (4096-39)/72 > 56),
`containerd.io/snapshot/cri.image-layers` will hit the limit of label size and
the unpack will fail.
This commit fixes this by limiting the size of the annotation.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-09-15 22:47:28 +09:00
Yohei Ueda
b582da4438
Set masked and readonly paths based on default Unix spec
The default values of masked and readonly paths are defined
in populateDefaultUnixSpec, and are used when a sandbox is
created.  It is not, however, used for new containers.  If
a container definition does not contain a security context
specifying masked/readonly paths, a container created from
it does not have masked and readonly paths.

This patch applies the default values to masked and
readonly paths of a new container, when any specific values
are not specified.

Fixes #1569

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2020-09-09 23:13:05 +09:00
Michael Crosby
d715d00906 Handle KVM based runtimes with selinux
Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-08-26 21:38:03 -04:00
Akshat Kumar
7a9fbec5fb Add logging binary support when terminal is true
Currently the shims only support starting the logging binary process if the
io.Creator Config does not specify Terminal: true. This means that the program
using containerd will only be able to specify FIFO io when Terminal: true,
rather than allowing the shim to fork the logging binary process. Hence,
containerd consumers face an inconsistent behavior regarding logging binary
management depending on the Terminal option.

Allowing the shim to fork the logging binary process will introduce consistency
between the running container and the logging process. Otherwise, the logging
process may die if its parent process dies whereas the container will keep
running, resulting in the loss of container logs.

Signed-off-by: Akshat Kumar <kshtku@amazon.com>
2020-08-25 17:28:29 -07:00
Derek McGowan
56a89cda34
Merge pull request #1552 from crosbymichael/nri
Add experimental NRI injection points
2020-08-24 13:58:11 -07:00
Antonio Ojea
1403a391c3 bump cni dependencies
Signed-off-by: Antonio Ojea <aojea@redhat.com>
2020-08-21 18:00:20 +02:00
Michael Crosby
63f89eb954 Update server with nri injection points
This allows development with container to be done for NRI without the need for
custom builds.

This is an experimental feature and is not enabled unless a user has a global
`/etc/nri/conf.json` config setup with plugins on the system.  No NRI code will
be executed if this config file does not exist.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-08-20 08:10:09 -04:00
Akihiro Suda
7332e2ad2e
remove libseccomp cgo dependency
The CRI plugin was depending on libseccomp cgo dependency via
libseccomp-golang via libcontainer.

https://github.com/seccomp/libseccomp-golang/blob/v0.9.1/seccomp_internal.go#L17

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-30 18:51:23 +09:00
Mike Brown
8a2d1cc802 adds support for pod id lookup for filter
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-07-29 15:23:22 -05:00
ktock
c80660b82b Allow GC to discard content after successful pull and unpack
This commit adds a config flag for allowing GC to clean layer contents up after
unpacking these contents completed, which leads to deduplication of layer
contents between the snapshotter and the contnet store.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-07-28 09:05:47 +09:00
Michael Crosby
5f5d954b6a add selinux category range to config
This allows an admin to set the upper bounds on the category range for selinux
labels.  This can be useful when handling allocation of PVs or other volume
types that need to be shared with selinux enabled on the hosts and volumes.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-07-20 16:02:07 -04:00
Akihiro Suda
707d2c49d1
allow disabling hugepages
This helps with running rootless mode + cgroup v2 + systemd without hugetlb delegation.
Systemd does not (and will not, perhaps) support hugetlb delegation as of systemd v245. https://github.com/systemd/systemd/
issues/14662

From 502bc5427e/src/patches/containerd/0001-DIRTY-VENDOR-cri-allow-disabling-hugepages.patch

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-16 11:46:25 +09:00
James Sturtevant
2bb0b19c4b Update to latest pause image for windows
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2020-07-15 11:45:21 -07:00
Mike Brown
4b3974c4e9 show runc options tag
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-07-10 16:33:36 -05:00
Abhishek Kulkarni
287c52d1c6 Forcibly stop running containers before removal
Signed-off-by: Abhishek Kulkarni <abd.kulkarni@gmail.com>
2020-07-04 15:49:00 -05:00
Akihiro Suda
fb208d015a
vendor runc v1.0.0-rc91
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-03 14:03:21 +09:00
Akihiro Suda
fe6833a9a4
config: TolerateMissingHugePagesCgroupController -> TolerateMissingHugetlbController
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-02 13:49:42 +09:00
Akihiro Suda
b69d7bdc5f
config: fix TOML tag for TolerateMissingHugePagesCgroupController
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-02 13:38:19 +09:00
Mike Brown
c2191fddd7
Merge pull request #1513 from brianpursley/state-name
Change "failed to stop sandbox" error message to use state name instead of numeric value
2020-06-27 16:08:27 -05:00
Brian Pursley
aa04fc9d53 Change "failed to stop sandbox" error message to use state name instead of numeric value
Signed-off-by: Brian Pursley <bpursley@cinlogic.com>
2020-06-27 16:45:08 -04:00
Kevin Parsons
210561a8e3 Support named pipe mounts for Windows containers
Adds support to mount named pipes into Windows containers. This support
already exists in hcsshim, so this change just passes them through
correctly in cri. Named pipe mounts must start with "\\.\pipe\".

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
2020-06-25 12:01:08 -07:00
Mike Brown
f5c7ac9272 fix for image pull linter change
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-06-24 18:10:31 -05:00
Davanum Srinivas
3ee62de2bf
remove unused method
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 15:03:47 -04:00
Davanum Srinivas
cbb7c28f19
Add copyright headers
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 14:49:13 -04:00
Davanum Srinivas
e2072b71cc
Copy kubernetes/pkg/util/bandwidth
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 14:48:25 -04:00
Davanum Srinivas
2909022a6e
Make local copy of kubelet/cri/streaming
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 13:54:34 -04:00
Davanum Srinivas
41f184f15b
Update vendor.conf to kubernetes 1.19.0-beta.2
update streaming import path
switch remote package path

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 08:44:49 -04:00
Michael Crosby
6164822714
Merge pull request #1508 from janosi/sctp-hostport
Remove the protocol filter from the HostPort management
2020-06-15 14:48:37 -04:00
Mike Brown
b661ad711e
Merge pull request #1504 from lorenz/ignore-image-defined-volumes
Add option for ignoring volumes defined in images
2020-06-14 11:52:48 -05:00
Mike Brown
26dc5b9772
Merge pull request #1505 from dcantah/windows-cred-spec
Add GMSA credential spec passing
2020-06-14 11:52:33 -05:00
Laszlo Janosi
479dfbac45
Remove the protocol filter from the portMappings constructor.
Reason: originally it was introduced to prevent the loading of the SCTP kernel module on the nodes. But iptables chain creation alone does not load the kernel module. The module would be loaded if an SCTP socket was created, but neither cri nor the portmap CNI plugin starts managing SCTP sockets if hostPort / portmappings are defined.
Signed-off-by: Laszlo Janosi <laszlo.janosi@ibm.com>
2020-06-14 15:48:00 +00:00
Kenta Tada
730b7a932e Change the type of PdeathSignal
Use x/sys as same as runtime/v1/linux/runtime.go

Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-06-11 11:35:51 +09:00
Daniel Canter
9620b2e1da Add GMSA Credential Spec passing
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-06-10 11:15:07 -07:00
Lorenz Brun
5a1d49b063 Add option for ignoring volumes defined in images
Signed-off-by: Lorenz Brun <lorenz@brun.one>
2020-06-09 21:02:47 +02:00
Brian Goff
c694c63176 Add config for registry http headers
This adds a configuration knob for adding request headers to all
registry requests. It is not namespaced to a registry.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-06-08 18:56:15 -07:00
Gaurav Singh
7213cd89d6 Process I/O: Fix goroutine leak
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-06-07 17:38:36 -04:00
Davanum Srinivas
d7ce093d63
Tolerate missing HugeTLB cgroups controller
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-01 12:07:32 -04:00
Akihiro Suda
2f601013e6 cgroup2: implement containerd.events.TaskOOM event
How to test (from https://github.com/opencontainers/runc/pull/2352#issuecomment-620834524):
  (host)$ sudo swapoff -a
  (host)$ sudo ctr run -t --rm --memory-limit $((1024*1024*32)) docker.io/library/alpine:latest foo
  (container)$ sh -c 'VAR=$(seq 1 100000000)'

An event `/tasks/oom {"container_id":"foo"}` will be displayed in `ctr events`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-06-01 14:00:13 +09:00
Maksym Pavlenko
17c61e36cb Fix cgroups path for base OCI spec
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-05-29 11:40:12 -07:00
Maksym Pavlenko
8d54f39753 Allow specify base OCI runtime spec
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-05-28 13:39:31 -07:00
Michael Crosby
72edf3016d Use new SELinux APIs
This moves most of the API calls off of the `labels` package onto the root
selinux package.  This is the newer API for most selinux operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-05-26 15:18:46 -04:00
Darren Shepherd
24209b91bf Add MCS label support
Carry of #1246

Signed-off-by: Darren Shepherd <darren@rancher.com>
Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-05-20 13:59:51 -05:00
Sascha Grunert
e2cedb9469
Increase port-forward timeout to 1s to fix e2e test
We encountered two failing end-to-end tests after the adoption of
https://github.com/containerd/cri/pull/1470 in
https://github.com/cri-o/cri-o/pull/3749:

```
Summarizing 2 Failures:
[Fail] [sig-cli] Kubectl Port forwarding With a server listening on 0.0.0.0 that expects a client request [It] should support a client that connects,
sends DATA, and disconnects
test/e2e/kubectl/portforward.go:343

[Fail] [sig-cli] Kubectl Port forwarding With a server listening on localhost that expects a client request [It] should support a client that connects
, sends DATA, and disconnects
test/e2e/kubectl/portforward.go:343
```

Increasing the timeout to 1s fixes the issue.

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2020-05-12 12:43:14 +02:00
Derek McGowan
21ad9c4e21 Use digestset from go-digest
Removes docker/distribution dependency

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-05-11 14:17:34 -07:00
payall4u
b437938d2f
Transfer error to ErrNotFound when kill a not exist container, also add
test case.

Signed-off-by: payall4u <404977848@qq.com>

Add integration test case

Signed-off-by: payall4u <404977848@qq.com>
2020-05-11 21:53:43 +08:00
Wei Fu
8252e54f93
Merge pull request #1472 from mxpv/profile
Add config flag to default empty seccomp profile
2020-05-11 10:16:00 +08:00
Mike Brown
bd0a76565a
Merge pull request #1469 from thaJeztah/remove_libcontainer_system
Remove dependency on libcontainer/system
2020-05-10 19:33:17 -05:00
Derek McGowan
dbedcf8706
Merge pull request #1449 from mikebrow/make-http-with-tlsconfig-a-warning
removes the error when tls is configured for https but http is tried first
2020-05-10 16:09:41 -07:00
Sebastiaan van Stijn
0e1b7bdb59
Remove dependency on libcontainer/system
This swaps the RunningInUserNS() function that we're using
from libcontainer/system with the one in containerd/sys.

This removes the dependency on libcontainer/system, given
these were the only functions we're using from that package.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-10 21:58:16 +02:00
Maksym Pavlenko
674fe72aa8 Update docs for unset seccomp profile
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-05-10 10:46:58 -07:00
Sebastiaan van Stijn
c96373f6d5
newTransport(): remove deprecated DualStack option
The `DualStack` option was deprecated in Go 1.12, and is now enabled by default
(through commit github.com/golang/go@efc185029bf770894defe63cec2c72a4c84b2ee9).

> The Dialer.DualStack field is now meaningless and documented as deprecated.
>
> To disable fallback, set FallbackDelay to a negative value.

The default `FallbackDelay` is 300ms; to make this more explicit, this patch
sets `FallbackDelay` to the default value.

Note that Docker Hub currently does not support IPv6 (DNS for registry-1.docker.io
has no AAAA records, so we should not hit the 300ms delay).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-10 12:39:10 +02:00
Antonio Ojea
11a78d9d0f
don't use socat for port forwarding
use goroutines to copy the data from the stream to the TCP
connection, and viceversa, removing the socat dependency.

Quoting Lantao Liu, the logic is as follow:

When one side (either pod side or user side) of portforward
is closed, we should stop port forwarding.

When one side is closed, the io.Copy use that side as source will close,
but the io.Copy use that side as dest won't.

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
2020-05-09 00:54:30 +02:00
Maksym Pavlenko
38f19f991e Add config flag to default empty seccomp profile
This changes adds `default_seccomp_profile` config switch to apply default seccomp profile when not provided by k8s.a

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-05-08 13:24:38 -07:00
Wei Fu
48e797c77f RunPodSandbox: destroy network if fails or invalid
Should destroy the pod network if fails to setup or return invalid
net interface, especially multiple CNI configurations.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2020-05-01 12:07:33 +08:00
ktock
ca661c8dc9 Pass chained layer digests to snapshotter for parallel snapshot preparation
Currently, CRI plugin passes each layer digest to remote snapshotters
sequentially, which leads to sequential snapshots preparation. But it costs
extra time especially for remote snapshotters which need to connect to the
remote backend store (e.g. registries) for checking the snapshot existence on
each preparation.

This commit solves this problem by introducing new label
`containerd.io/snapshot/cri.chain` for passing all layer digests in an image to
snapshotters and by allowing them to prepare these snapshots in parallel, which
leads to speed up the preparation.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-04-28 15:03:08 +09:00
Mike Brown
4ea4ca99c7
Merge pull request #1455 from 6WIND/master
fix incomplete host device for PrivilegedWithoutHostDevices
2020-04-26 22:28:20 -05:00
Mike Brown
776c125e4f move up to latest critools; add apparmor profile check
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-04-26 16:16:48 -05:00
Mike Brown
1b60224e2e use containerd/project header test
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-04-22 19:35:37 -05:00
Thibaut Collet
98f8ec4995 fix incomplete host device for PrivilegedWithoutHostDevices
For a privilege pods with PrivilegedWithoutHostDevices set to true
host device specified in the config are not provided (whereas it is done for
non privilege pods or privilege pods with PrivilegedWithoutHostDevices set
to false as all devices are included).

Add them in this case.

Fixes: 3353ab76d9 ("Add flag to overload default privileged host device behaviour")
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
2020-04-22 18:20:36 +02:00
Mike Brown
9d37687a95
Merge pull request #1436 from chethanah/add-container-name-annot
Support for additional OCI annotations: 'container-name'
2020-04-19 13:19:47 -05:00
Maksym Pavlenko
917e7646ae Add binary IO tests
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2020-04-17 16:50:43 -07:00
Maksym Pavlenko
9175401b28 Cleanup binary IO resources on error
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2020-04-17 15:56:21 -07:00
Maksym Pavlenko
0dc7c85956 Don't use timeout package when stopping shim logger
containerd loads timeout values from config.toml and populated those
values to `timeout` package at launch. So when using `timeout` package
from shim, there are default values and config file is ignored.
So use a hardcoded value for binary IO.

Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2020-04-17 15:06:18 -07:00
yang yang
d07f7f167a add default scheme if endpoint no scheme
Signed-off-by: yang yang <yang8518296@163.com>
2020-04-17 23:33:28 +08:00
Mike Brown
27f911d663 removes the error when tls is configured for https but http is tried first
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-04-16 13:23:56 -05:00
ktock
c1b7bcf395 Enable to pass additional handler on pull for stargz-based remote snapshots
Throughout container lifecycle, pulling image is one of the time-consuming
steps. Recently, containerd community started to tackle this issue with
stargz-based remote snapshots, as a non-core
subproject(https://github.com/containerd/stargz-snapshotter).

This snapshotter is implemented as a standard proxy plugin but it requires the
client to pass some additional information (image ref and layer digest) for each
pull operation to query layer contents on the registry. Stargz snapshotter
project provides an image handler to do this and stargz snapshot users need to
pass this handler to containerd client.

This commit enables to use stargz-based remote snapshots through CRI by passing
the handler to containerd client on pull operation.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-04-16 20:53:52 +09:00
Chethan Suresh
7fc8652e32 Add OCI annotations for container name
Along with type(Sandbox or Container) and Sandbox name annotations
provide support for additional annotation:
  - Container name

This will help us perform per container operation by comparing it
with pass through annotations (eg. pod metadata annotations from K8s)

Signed-off-by: Chethan Suresh <Chethan.Suresh@sony.com>
2020-04-16 07:14:58 +05:30
Shengjing Zhu
4263229a7b Replace docker/distribution/reference with containerd/reference/docker
Since https://github.com/containerd/containerd/pull/3728
The docker/distribution/reference package is copied into containerd core

Signed-off-by: Shengjing Zhu <i@zhsj.me>
2020-04-16 03:29:58 +08:00
Mike Brown
d531dc492a
Merge pull request #1405 from fuweid/me-async-load-cnicnf
reload cni network config if has fs change events
2020-04-15 13:57:32 -05:00
Mike Brown
aa9b1885b5 fixes bad unit tests when selinux is enabled
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-04-15 12:28:11 -05:00
Maksym Pavlenko
0caa233158 Rework shim logger shutdown process
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2020-04-07 12:42:04 -07:00
Wei Fu
4ce334aa49 reload cni network config if has fs change events
With go RWMutex design, no goroutine should expect to be able to
acquire a read lock until the read lock has been released, if one
goroutine call lock.

The original design is to reload cni network config on every single
Status CRI gRPC call. If one RunPodSandbox request holds read lock
to allocate IP for too long, all other RunPodSandbox/StopPodSandbox
requests will wait for the RunPodSandbox request to release read lock.
And the Status CRI call will fail and kubelet becomes NOTReady.

Reload cni network config at every single Status CRI call is not
necessary and also brings NOTReady situation. To lower the possibility
of NOTReady, CRI will reload cni network config if there is any valid fs
change events from the cni network config dir.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2020-04-03 12:28:58 +08:00
Phil Estes
0c78dacbc5
Move isFifo from process/io to sys/ and make public
Make "IsFifo" a public function for use by other parts of containerd
codebase.

Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2020-03-25 10:44:17 -04:00
Li Yuxuan
cb0140063e Fix goroutine leak when exec/attach
The resize chan is never closed when doing exec/attach now. What's more,
`resize` is a recieved only chan so it can not be closed. Use ctx to
exit the goroutine in `handleResizing` properly.

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2020-03-24 10:42:54 +08:00
Sebastiaan van Stijn
e093a0ee08
Use local "ensureRemoveAll" instead of docker/pkg/system
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-03-12 20:21:14 +01:00
lifubang
488d6194f2 fix dial error when clean up a dead shim
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-03-12 10:57:55 +08:00
Akihiro Suda
fa72e2f693 cgroup2: do not unshare cgroup namespace for privileged
Conforms to the latest KEP:
0e409b4749/keps/sig-node/20191118-cgroups-v2.md (cgroup-namespace)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-09 01:49:04 +09:00
Sebastiaan van Stijn
f2edc6f164
vendor: update gotest.tools v3.0.2
full diff: https://github.com/gotestyourself/gotest.tools/compare/v2.3.0...v3.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-02-28 17:47:20 +01:00
Brandon Lum
8d5a8355d0 Updated docs and code for default nil behavior
Signed-off-by: Brandon Lum <lumjjb@gmail.com>
2020-02-27 23:42:03 +00:00
Kiril Vladimiroff
4dd75be2b9
Unify dialer implementations
Instead of having several dialer implementations, leave only one in
`pkg/dialer` and call it from `pkg/ttrpcutil`, `runtime/v(1|2)/shim`
which had their own

Closes #3471.

Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
2020-02-26 23:29:04 +02:00
Brandon Lum
ffcef9dc32 Addressed nits
Signed-off-by: Brandon Lum <lumjjb@gmail.com>
2020-02-24 20:45:57 +00:00
Brandon Lum
8df431fc31 Defer multitenant key model to image auth discussion
Signed-off-by: Brandon Lum <lumjjb@gmail.com>
2020-02-24 20:45:57 +00:00
Brandon Lum
c43a7588f6 Refactor encrypted opts and added unit test
Signed-off-by: Brandon Lum <lumjjb@gmail.com>
2020-02-24 20:45:57 +00:00
Brandon Lum
f0579c7b4d Implmented node key model for image encryption
Signed-off-by: Brandon Lum <lumjjb@gmail.com>
2020-02-24 20:45:57 +00:00
Mike Brown
f4b3cdb892
Merge pull request #1399 from mikebrow/pause-image-update
move to v3.2 for the pause image
2020-02-20 10:45:16 -06:00
Mike Brown
c9ed98462d move to v3.2 for the pause image
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-02-14 12:55:52 -06:00
Mike Brown
cf0e0a1e2c
Merge pull request #1332 from bg-chun/update_cri_for_hugepages
update cri-plugin to parse hugepages limit
2020-02-12 10:05:01 -06:00
Byonggon Chun
c02c24847f update cri-plugin to parse hugepages limit from CRI message
Signed-off-by: Byonggon Chun <bg.chun@samsung.com>
2020-02-06 15:28:24 +09:00
Justin Terry (VM)
a8cc66b37a Fix store error serialization to gRPC status codes
The pkg/store errors are duplicated errors of NotFound and AlreadyExist from
containerd's errdefs package and thus do not properly serialize when running
errdefs.ToGRPC on them. CRI runs this function on every return from a CRI
method so the conversion fails if there is a cache miss from the store caches
for containers or sandboxes. This change verifies that the errors are properly
converted to their gRPC values.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
2020-02-05 18:32:45 -08:00
Akihiro Suda
2d28b60046 vendor kubernetes 1.17.1
Corresponds to https://github.com/kubernetes/kubernetes/blob/v1.17.1/go.mod

note: `k8snet.ChooseBindAddress()` was renamed to `k8snet.ResolveBindAddress()` in afa0b808f8

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-22 02:06:50 +09:00
Akihiro Suda
5e5960f2bc
Merge pull request #1376 from Zyqsempai/add-cgroups-v2-metrics
Cgroupv2: Added CPU, Memory metrics
2020-01-21 23:21:09 +09:00
Boris Popovschi
6b8846cdf8 vendor updated + added cgroupv2 metrics
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-01-17 11:55:06 +02:00
Akihiro Suda
71740399e0 cgroup2: unshare cgroup namespace for containers
In cgroup v1 container implementations, cgroupns is not used by default because
it was not available in the kernel until kernel 4.6 (May 2016), and the default
behavior will not change on cgroup v1 environments, because changing the
default will break compatibility and surprise users.

For cgroup v2, implementations are going to unshare cgroupns by default
so as to hide /sys/fs/cgroup from containers.

* Discussion: https://github.com/containers/libpod/issues/4363
* Podman PR (merged): https://github.com/containers/libpod/pull/4374
* Moby PR: https://github.com/moby/moby/pull/40174

This PR enables cgroupns for containers, but pod sandboxes are untouched
because probably there is no need to do.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-01-09 14:58:30 +09:00
Akihiro Suda
aaddaa2732 bump up the default runtime to "io.containerd.runc.v2"
The former default runtime "io.containerd.runc.v1" won't support new features
like support for cgroup v2: containerd/containerd#3726

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-16 11:53:58 +09:00
Lantao Liu
0c2d3b718d Fix privileged devices.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-12-09 17:43:06 -08:00
Lantao Liu
78708b20c7
Merge pull request #1351 from Random-Liu/better-unknown-state-handling
Better handle unknown state.
2019-12-09 10:34:57 -08:00
Lantao Liu
facbaa0e79 Better handle unknown state.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-12-06 10:56:27 -08:00
bpopovschi
5d7bd738e4 Use containerD WithHostDevices
Signed-off-by: bpopovschi <zyqsempai@mail.ru>
2019-12-04 11:34:46 +02:00
Lantao Liu
a6b6097c90 Fix container pid.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-12-02 01:02:22 -08:00
Lantao Liu
444f02a89e
Merge pull request #1344 from darfux/add-resolvconf-to-sandbox-container
Provide resolvConf to sandbox container's mounts
2019-12-01 21:25:19 -08:00
Li Yuxuan
dbc1fb37d0 Provide resolvConf to sandbox container's mounts
As https://github.com/kata-containers/runtime/issues/1603 discussed,
kata relies on such mount spec to setup resolv.conf for pod VM properly.

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2019-11-28 12:05:05 +08:00
Lantao Liu
ab6701bd11 Add insecure_skip_verify option.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-11-26 13:25:52 -08:00
Lantao Liu
5c2f33bd0d Cleanup path for windows mount
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-11-15 18:52:11 +00:00
Erik Wilson
7cc3938717 Set default scheme in registryEndpoints for host
Signed-off-by: Erik Wilson <Erik.E.Wilson@gmail.com>
2019-10-31 10:30:17 -07:00
Lantao Liu
65b9c31805 Use http for localhost, 127.0.0.1 and ::1 by default.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-28 19:07:43 -07:00
Lantao Liu
d95e21c89b Add container compute stats support.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-25 14:32:02 -07:00
Michael Crosby
f8cca26f3c Handle large output in v2 shim with TTY
Reized the I/O buffers to align with the size of the kernel buffers with fifos
and move the close aspect of the console to key off of the stdin closing.

Fixes #3738

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-10-11 15:42:05 -04:00
Lantao Liu
2ce0bb0926 Update code for latest containerd.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-09 18:05:20 -07:00
Lantao Liu
18be6e3714 Use cached state instead of runc state.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-03 10:53:13 -07:00
Lantao Liu
358d672160 Add hostname CRI validation and unit test.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-09-25 16:11:27 -07:00
Lantao Liu
7fba77f238
Merge pull request #1298 from Random-Liu/set-sandbox-cpu-shares
Set default sandbox container cpu shares on windows.
2019-09-25 11:05:43 -07:00
Lantao Liu
2eba67a7ee
Merge pull request #1287 from crosbymichael/cgroups
Use type alias from containerd for cgroup metric types
2019-09-24 17:34:49 -07:00
Lantao Liu
f3ef10e9a2 Set default sandbox container cpu shares on windows.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-09-24 17:03:11 -07:00