Commit Graph

1223 Commits

Author SHA1 Message Date
Derek McGowan
9afc778b73
Merge pull request #6111 from crosbymichael/latency-metrics
[cri] add sandbox and container latency metrics
2021-11-16 16:59:33 -08:00
Phil Estes
7758cdc09a
Merge pull request #6253 from jonyhy96/feat-rwmutex
feat: use rwmutex instead
2021-11-16 11:39:29 -05:00
Derek McGowan
6835a94707
Split runc shim into plugin components
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
Derek McGowan
6eea8f3f62
Add shutdown package
Allows shutdown to handle callbacks with similar behavior as context
cancel

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
haoyun
bef792b962 feat: use rwmutex instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-16 11:06:40 +08:00
Derek McGowan
d055487b00
Merge pull request #6206 from mxpv/path
Allow absolute path to shim binaries
2021-11-15 18:05:48 -08:00
Olli Janatuinen
2a81c9f677 CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-11-15 18:30:09 +02:00
Michael Crosby
6765524b73 use write lock when updating container stats
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-11 15:17:48 +00:00
Maksym Pavlenko
6870f3b1b8 Support custom runtime path when launching tasks
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-11-09 13:31:46 -08:00
Michael Crosby
91bbaf6799 [cri] add sandbox and container latency metrics
These are simple metrics that allow users to view more fine grained metrics on
internal operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-09 21:07:38 +00:00
Michael Crosby
4b7cc560b2
Merge pull request #6222 from jonyhy96/add-more-description
cleanup: add more description on comment
2021-11-09 15:55:32 -05:00
haoyun
5748006337 cleanup: add more description on comment
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-09 19:13:37 +08:00
David Porter
2e6d5709e3 Implement CRI container and pods stats
See https://kep.k8s.io/2371

* Implement new CRI RPCs - `ListPodSandboxStats` and `PodSandboxStats`
  * `ListPodSandboxStats` and `PodSandboxStats` which return stats about
    pod sandbox. To obtain pod sandbox stats, underlying metrics are
    read from the pod sandbox cgroup parent.
  * Process info is obtained by calling into the underlying task
  * Network stats are taken by looking up network metrics based on the
    pod sandbox network namespace path
* Return more detailed stats for cpu and memory for existing container
  stats. These metrics use the underlying task's metrics to obtain
  stats.

Signed-off-by: David Porter <porterdavid@google.com>
2021-11-03 17:52:05 -07:00
Dat Nguyen
afe39bebfe add oci.WithAllDevicesAllowed flag for privileged_without_host_devices
This commit adds a flag that enable all devices whitelisting when
privileged_without_host_devices is already enabled.

Fixes #5679

Signed-off-by: Dat Nguyen <dnguyen7@atlassian.com>
2021-11-04 10:24:19 +11:00
Mike Brown
ea89788105 adds additional debug out to timebox cni setup
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-11-01 09:34:29 -05:00
zounengren
a217b5ac8f bump CNI to spec v1.0.0
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-10-22 10:58:40 +08:00
Sambhav Kothari
2a8dac12a7 Output a warning for label image labels instead of erroring
This change ignore errors during container runtime due to large
image labels and instead outputs warning. This is necessary as certain
image building tools like buildpacks may have large labels in the images
which need not be passed to the container.

Signed-off-by: Sambhav Kothari <sambhavs.email@gmail.com>
2021-10-14 19:25:48 +01:00
Claudiu Belu
2bc77b8a28 Adds Windows resource limits support
This will allow running Windows Containers to have their resource
limits updated through containerd. The CPU resource limits support
has been added for Windows Server 20H2 and newer, on older versions
hcsshim will raise an Unimplemented error.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-25 13:20:55 -07:00
Derek McGowan
cb6fb93af5
Merge pull request #6011 from crosbymichael/schedcore
add runc shim support for sched core
2021-10-08 10:42:16 -07:00
Derek McGowan
26ee1b1ee5
Merge pull request #4695 from crosbymichael/cri-class
[cri] Add CNI conf based on runtime class
2021-10-08 09:27:49 -07:00
Michael Crosby
e48bbe8394 add runc shim support for sched core
In linux 5.14 and hopefully some backports, core scheduling allows processes to
be co scheduled within the same domain on SMT enabled systems.

The containerd impl sets the core sched domain when launching a shim. This
allows a clean way for each shim(container/pod) to be in its own domain and any
additional containers, (v2 pods) be be launched with the same domain as well as
any exec'd process added to the container.

kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-10-08 16:18:09 +00:00
Michael Crosby
7b8a697f28
Merge pull request #6034 from claudiubelu/windows/fixes-image-volume
Fixes Windows containers with image volumes
2021-10-07 11:50:01 -04:00
Akihiro Suda
703b86533b
pkg/cap: remove an outdated comment
pkg/cap no longer depends on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-10-06 13:24:30 +09:00
Derek McGowan
63b7e5771e
Merge pull request #5973 from Juneezee/deprecate-ioutil
refactor: move from io/ioutil to io and os package
2021-10-01 10:52:06 -07:00
Claudiu Belu
791e175c79 Windows: Fixes Windows containers with image volumes
Currently, there are few issues that preventing containers
with image volumes to properly start on Windows.

- Unlike the Linux implementation, the Container volume mount paths
  were not created if they didn't exist. Those paths are now created.

- while copying the image volume contents to the container volume,
  the layers were not properly deactivated, which means that the
  container can't start since those layers are still open. The layers
  are now properly deactivated, allowing the container to start.

- even if the above issue didn't exist, the Windows implementation of
  mount/Mount.Mount deactivates the layers, which wouldn't allow us
  to copy files from them. The layers are now deactivated after we've
  copied the necessary files from them.

- the target argument of the Windows implementation of mount/Mount.Mount
  was unused, which means that folder was always empty. We're now
  symlinking the Layer Mount Path into the target folder.

- hcsshim needs its Container Mount Paths to be properly formated, to be
  prefixed by C:. This was an issue for Volumes defined with Linux-like
  paths (e.g.: /test_dir). filepath.Abs solves this issue.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-10-01 09:02:18 +00:00
haoyun
5c2426a7b2 cleanup: import from k8s.io/utils/clock/testing instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:34:56 +08:00
haoyun
6484fab1e0 cleanup: import from k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:27:20 +08:00
zounengren
fcffe0c83a switch usage directly to errdefs.(ErrAlreadyExists and ErrNotFound)
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-09-24 18:26:58 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Michael Crosby
55893b9be7 Add CNI conf based on runtime class
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-17 19:05:06 +00:00
Phil Estes
f40df3d72b
Enable image config labels in ctr and CRI container creation
Signed-off-by: Phil Estes <estesp@amazon.com>
2021-09-15 15:31:19 -04:00
Phil Estes
d081457ba4
Merge pull request #5974 from claudiubelu/hanging-task-delete-fix
task delete: Closes task IO before waiting
2021-09-15 11:30:23 -04:00
Fu Wei
e1ad779107
Merge pull request #5817 from dmcgowan/shim-plugins
Add support for shim plugins
2021-09-12 18:18:20 +08:00
Fu Wei
d9f921e4f0
Merge pull request #5906 from thaJeztah/replace_os_exec 2021-09-11 10:38:53 +08:00
Phil Estes
6589876d20
Merge pull request #5964 from crosbymichael/cni-pref
add ip_pref CNI options for primary pod ip
2021-09-10 12:06:23 -04:00
Fu Wei
689a863efe
Merge pull request #5939 from scuzhanglei/privileged-device 2021-09-10 22:15:46 +08:00
Michael Crosby
1ddc54c00d
Merge pull request #5954 from claudiubelu/fix-sandbox-remove
sandbox: Allows the sandbox to be deleted in NotReady state
2021-09-10 10:12:34 -04:00
Michael Crosby
1efed43090
add ip_pref CNI options for primary pod ip
This fixes the TODO of this function and also expands on how the primary pod ip
is selected. This change allows the operator to prefer ipv4, ipv6, or retain the
ordering provided by the return results of the CNI plugins.

This makes it much more flexible for ops to configure containerd and how IPs are
set on the pod.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-10 10:04:21 -04:00
scuzhanglei
756f4a3147 cri: add devices for privileged container
Signed-off-by: scuzhanglei <greatzhanglei@gmail.com>
2021-09-10 10:16:26 +08:00
Fu Wei
d58542a9d1
Merge pull request #5627 from payall4u/payall4u/cri-support-cgroup-v2 2021-09-09 23:10:33 +08:00
Claudiu Belu
55faa5e93d task delete: Closes task IO before waiting
After containerd restarts, it will try to recover its sandboxes,
containers, and images. If it detects a task in the Created or
Stopped state, it will be removed. This will cause the containerd
process it hang on Windows on the t.io.Wait() call.

Calling t.io.Close() beforehand will solve this issue.

Additionally, the same issue occurs when trying to stopp a sandbox
after containerd restarts. This will solve that case as well.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-07 02:17:01 -07:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI-plugin subscribes the image event on k8s.io namespace. By
default, the image event is created by CRI-API. However, the image can
be downloaded by containerd API on k8s.io with the customized labels.
The CRI-plugin should use patch update for `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Claudiu Belu
24cec9be56 sandbox: Allows the sandbox to be deleted in NotReady state
The Pod Sandbox can enter in a NotReady state if the task associated
with it no longer exists (it died, or it was killed). In this state,
the Pod network namespace could still be open, which means we can't
remove the sandbox, even if --force was used.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-02 03:40:56 -07:00
Mike Brown
e00f87f1dc
Merge pull request #5927 from adelina-t/ws_2022_image_update
Update Pause image in tests & config
2021-08-31 16:11:57 -05:00
Adelina Tuvenie
6d3d34b85d Update Pause image in tests & config
With the introduction of Windows Server 2022, some images have been updated
to support WS2022 in their manifest list. This commit updates the test images
accordingly.

Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>
2021-08-31 19:42:57 +03:00
Mikko Ylinen
e0f8c04dad cri: Devices ownership from SecurityContext
CRI container runtimes mount devices (set via kubernetes device plugins)
to containers by taking the host user/group IDs (uid/gid) to the
corresponding container device.

This triggers a problem when trying to run those containers with
non-zero (root uid/gid = 0) uid/gid set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.

It is possible to workaround the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.

This patch suggests to take RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting device_ownership_from_security_context runtime config value to
true (valid on Linux only).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-30 09:30:00 +03:00
Phil Estes
af1a0908d0
Merge pull request #5865 from dcantah/windows-pod-runasusername
Add RunAsUserName functionality for the Windows pod sandbox container
2021-08-25 22:25:14 -04:00
Sebastiaan van Stijn
2ac9968401
replace uses of os/exec with golang.org/x/sys/execabs
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.

This issue is not limited to the go command itself, and can also affect binaries
that use `os.Command`, `os.LookPath`, etc.

From the related blogpost (ttps://blog.golang.org/path-security):

> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing

This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-25 18:11:09 +02:00
Fu Wei
6fa9588531
Merge pull request #5903 from AkihiroSuda/gofmt117
Run `go fmt` with Go 1.17
2021-08-24 23:01:41 +08:00
Daniel Canter
25644b4614 Add RunAsUserName functionality for the Windows Pod Sandbox Container
There was recent changes to cri to bring in a Windows section containing a
security context object to the pod config. Before this there was no way to specify
a user for the pod sandbox container to run as. In addition, the security context
is a field for field mirror of the Windows container version of it, so add the
ability to specify a GMSA credential spec for the pod sandbox container as well.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-08-23 07:35:22 -07:00
payall4u
f8dfbee178 add cri test case
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:59:19 +08:00
payall4u
9a8bf13158 feature: add field LinuxContainerResources.Unified on cri
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:49:31 +08:00
Akihiro Suda
d3aa7ee9f0
Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Jacob Blain Christen
c3609ff4ca cri: filter selinux xattr for image volumes
Exclude the `security.selinux` xattr when copying content from layer
storage for image volumes. This allows for the already correct label
at the target location to be applied to the copied content, thus
enabling containers to write to volumes that they implicitly expect to be
able to write to.

- Fixes containerd/containerd#5090
- See rancher/rke2#690

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2021-08-20 23:47:24 -07:00
Phil Estes
ff2e58d114
Merge pull request #5131 from perithompson/windows-hostnetwork
Add Windows HostProcess Support
2021-08-20 14:29:37 -04:00
Kazuyoshi Kato
4dd5ca70fb script: update golangci-lint from v1.38.0 and v1.36.0 to v1.42.0
golint has been deprecated and replaced by revive since v1.41.0.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-08-19 16:27:16 -07:00
Derek McGowan
8d135d2842
Add support for shim plugins
Refactor shim v2 to load and register plugins.
Update init shim interface to not require task service implementation on
returned service, but register as plugin if it is.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-17 11:06:09 -07:00
Gunju Kim
1224060f89 Allow expanded DNS configuration
Signed-off-by: Gunju Kim <gjkim042@gmail.com>
2021-08-14 06:13:01 +09:00
Peri Thompson
79b369a0bb
Added windows hostProcess cni skip
Signed-off-by: Peri Thompson <perit@vmware.com>
2021-08-11 22:23:49 +01:00
Michael Crosby
218db0f9af
Merge pull request #5835 from dmcgowan/plugin-events-cleanup
Move plugin context events into separate plugin
2021-08-07 21:47:11 -04:00
Derek McGowan
0a0621bb47
Move plugin context events into separate plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:59:20 -07:00
Derek McGowan
6f027e38a8
Remove redundant build tags
Remove build tags which are already implied by the name of the file.
Ensures build tags are used consistently

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:27:46 -07:00
Derek McGowan
caf9e256b7
Merge pull request #5693 from kzys/sigrtmin
Support SIGRTMIN+n signals
2021-07-27 11:58:57 -07:00
Davanum Srinivas
494b940f14
Introduce a new go module - containerd/api for use in standalone clients
In containerd 1.5.x, we introduced support for go modules by adding a
go.mod file in the root directory. This go.mod lists all the things
needed across the whole code base (with the exception of
integration/client which has its own go.mod). So when projects that
need to make calls to containerd API will pull in some code from
containerd/containerd, the `go mod` commands will add all the things
listed in the root go.mod to the projects go.mod file. This causes
some problems as the list of things needed to make a simple API call
is enormous. in effect, making a API call will pull everything that a
typical server needs as well as the root go.mod is all encompassing.
In general if we had smaller things folks could use, that will make it
easier by reducing the number of things that will end up in a consumers
go.mod file.

Now coming to a specific problem, the root containerd go.mod has various
k8s.io/* modules listed. Also kubernetes depends on containerd indirectly
via both moby/moby (working with docker maintainers seperately) and via
google/cadvisor. So when the kubernetes maintainers try to use latest
1.5.x containerd, they will see the kubernetes go.mod ending up depending
on the older version of kubernetes!

So if we can expose just the minimum things needed to make a client API
call then projects like cadvisor can adopt that instead of pulling in
the entire go.mod from containerd. Looking at the existing code in
cadvisor the minimum things needed would be the api/ directory from
containerd. Please see proof of concept here:
github.com/google/cadvisor/pull/2908

To enable that, in this PR, we add a go.mod file in api/ directory. we
split the Protobuild.yaml into two, one for just the things in api/
directory and the rest in the root directory. We adjust various targets
to build things correctly using `protobuild` and also ensure that we
end up with the same generated code as before as well. To ensure we
better take care of the various go.mod/go.sum files, we update the
existing `make vendor` and also add a new `make verify-vendor` that one
can run locally as well in the CI.

Ideally, we would have a `containerd/client` either as a standalone repo
or within `containerd/containerd` as a separate go module. but we will
start here to experiment with a standalone api go module first.

Also there are various follow ups we can do, for example @thaJeztah has
identified two tasks we could do after this PR lands:

github.com/containerd/containerd/pull/5716#discussion_r668821396

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-07-27 07:34:59 -04:00
Kazuyoshi Kato
1d3d08026d Support SIGRTMIN+n signals
systemd uses SIGRTMIN+n signals, but containerd didn't support the signals
since Go's sys/unix doesn't support them.

This change introduces SIGRTMIN+n handling by utilizing moby/sys/signal.

Fixes #5402.

https://www.freedesktop.org/software/systemd/man/systemd.html#Signals

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-07-26 09:36:43 -07:00
Wei Fu
ac75071b49 remove pkg/cri/platforms package
The package is a duplicate of platforms. No need to maintain
pkg/cri/platforms.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-07-10 10:14:27 +08:00
Brian Goff
0a8802df67 Allow WithServices to use custom implementations
Before this change, for several of the services that `WithServices`
handles, only the grpc client is supported.
Now, for instance, one can use an `images.Store` directly instead of
only an `imagesapi.StoreSlient`.

Some of the methods have been renamed to satisfy the difference between
using a grpc `<Foo>Client` vs the main interface.

I did not see a good candidate for TaskService so have left that mostly
unchanged.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-07-09 23:30:40 +00:00
Phil Estes
cf600abecc
Merge pull request #5619 from mikebrow/cri-add-v1-proxy-alpha
[CRI] move up to CRI v1 and support v1alpha in parallel
2021-07-09 14:07:24 -04:00
Mike Brown
d1c1051927 use fu wei's suggeted interface pick for marshaling
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-07 15:45:45 -05:00
Mike Brown
14962dcbd2 add alpha version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-06 11:40:20 -05:00
Mike Brown
a5c417ac06 move up to CRI v1 and support v1alpha in parallel
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-28 09:34:12 -05:00
Dan Williams
dac2543a07 sandbox: send pod UID to CNI plugins as K8S_POD_UID
CNI plugins that need to wait for network state to converge
may want to cancel waiting when a short lived pod is deleted.
However, there is a race between when kubelet asks the runtime
to create the sandbox for the pod, and when the plugin is able
request the pod object from the apiserver. It may be the case
that the plugin receives the new pod, rather than the pod
the sandbox request was initiated for.

Passing the pod UID to the plugin allows the plugin to check
whether the pod it gets from the apiserver is actually the
pod its sandbox request was started for.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-06-22 22:53:30 -05:00
Mike Brown
560e7d4799 fixing some doc links
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-21 18:24:47 -05:00
Kazuyoshi Kato
1bbee573af github.com/golang/protobuf/proto is deprecated
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-06-17 10:28:48 -04:00
Quan Tian
728743eb28 Fix cleanup context of teardownPodNetwork
Similar to other deferred cleanup operations, teardownPodNetwork should
use a different context as the original context may have expired,
otherwise CNI wouldn't been invoked, leading to leak of network
resources, e.g. IP addresses.

Signed-off-by: Quan Tian <qtian@vmware.com>
2021-06-04 19:17:05 +08:00
zounengren
498bb36f67 scrub the stale TODO
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-06-01 11:22:15 +08:00
Shiming Zhang
79e3452213 update the link
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Shiming Zhang
1acca8bba3 Don't check for apparmor_parser to be present
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Phil Estes
e47400cbd2
Merge pull request #5100 from adisky/skip-tls-localHost
Skip TLS verification for localhost
2021-05-12 14:56:53 -04:00
Kevin Parsons
b0d3b35b28 windows: Use GetFinalPathNameByHandle for ResolveSymbolicLink
This change splits the definition of pkg/cri/os.ResolveSymbolicLink by
platform (windows/!windows), and switches to an alternate implementation
for Windows. This aims to fix the issue described in containerd/containerd#5405.

The previous implementation which just called filepath.EvalSymlinks has
historically had issues on Windows. One of these issues we were able to
fix in Go, but EvalSymlinks's behavior is not well specified on
Windows, and there could easily be more issues in the future, so it
seems prudent to move to a separate implementation for Windows.

The new implementation uses the Windows GetFinalPathNameByHandle API,
which takes a handle to an open file or directory and some flags, and
returns the "real" name for the object. See comments in the code for
details on the implementation.

I have tested this change with a variety of mounts and everything seems
to work as expected. Functions that make incorrect assumptions on what a
Windows path can look like may have some trouble with the \\?\ path
syntax. For instance EvalSymlinks fails when given a \\?\UNC\ path. For
this reason, the resolvePath implementation modifies the returned path
to translate to the more common form (\\?\UNC\server\share ->
\\server\share).

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
2021-05-04 11:55:11 -07:00
Mike Brown
c1a35232d8
Merge pull request #5446 from Random-Liu/fix-auth-config
Fix different registry hosts referencing the same auth config.
2021-05-04 06:21:02 -05:00
Lantao Liu
81402e4758 Fix different registry hosts referencing the same auth config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-05-03 17:42:57 -07:00
Aditi Sharma
8014d9fee0 Skip TLS verification for localhost
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-05-03 10:21:54 +05:30
Mike Brown
3dad67eedb
Merge pull request #5203 from tghartland/5008-target-namespace
Support PID NamespaceMode_TARGET
2021-04-21 19:41:38 -05:00
Maksym Pavlenko
cb64dc8250
Merge pull request #5401 from Iceber/use-unbuffered-channel
process: use the unbuffered channel as the done signal
2021-04-21 16:50:32 -07:00
Thomas Hartland
efcb187429 Add unit tests for PID NamespaceMode_TARGET validation
Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 19:59:10 +02:00
Thomas Hartland
b48f27df6b Support PID NamespaceMode_TARGET
This commit adds support for the PID namespace mode TARGET
when generating a container spec.

The container that is created will be sharing its PID namespace
with the target container that was specified by ID in the namespace
options.

Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 17:54:17 +02:00
Iceber Gu
909660ea92 process: use the unbuffered channel as the done signal
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-04-21 18:24:18 +08:00
Michael Crosby
a3fe5c84c0
Merge pull request #5383 from wzshiming/clean/process-io
move common code to pkg/process from runtime
2021-04-20 14:40:12 -04:00
Phil Estes
81c4ac202f
Merge pull request #5256 from kolyshkin/seccomp-enabled
pkg/seccomp: simplify and speed up isEnabled
2021-04-19 15:59:28 -04:00
Shiming Zhang
7966a6652a Cleanup code
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-04-19 16:59:45 +08:00
Sebastiaan van Stijn
1c03c377e5
go.mod: github.com/containerd/fifo v1.0.0
full diff: https://github.com/containerd/fifo/compare/115abcc95a1d...v1.0.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-19 09:27:45 +02:00
Kir Kolyshkin
3292ea5862 pkg/seccomp: use sync.Once to speed up IsEnabled
It does not make sense to check if seccomp is supported by the kernel
more than once per runtime, so let's use sync.Once to speed it up.

A quick benchmark (old implementation, before this commit, after):

BenchmarkIsEnabledOld-4           37183            27971 ns/op
BenchmarkIsEnabled-4            1252161              947 ns/op
BenchmarkIsEnabledOnce-4      666274008             2.14 ns/op

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Kir Kolyshkin
00b5c99b1a pkg/seccomp: simplify IsEnabled, update doc
Current implementation of seccomp.IsEnabled (rooted in runc) is not
too good.

First, it parses the whole /proc/self/status, adding each key: value
pair into the map (lots of allocations and future work for garbage
collector), when using a single key from that map.

Second, the presence of "Seccomp" key in /proc/self/status merely means
that kernel option CONFIG_SECCOMP is set, but there is a need to _also_
check for CONFIG_SECCOMP_FILTER (the code for which exists but never
executed in case /proc/self/status has Seccomp key).

Replace all this with a single call to prctl; see the long comment in
the code for details.

While at it, improve the IsEnabled documentation.

NOTE historically, parsing /proc/self/status was added after a concern
was raised in https://github.com/opencontainers/runc/pull/471 that
prctl(PR_GET_SECCOMP, ...) can result in the calling process being
killed with SIGKILL. This is a valid concern, so the new code here
does not use PR_GET_SECCOMP at all.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Phil Estes
4f18131239
Merge pull request #5286 from payall4u/optimize-cri-redirect-logs
cri: Reduce the cpu usage of  the function redirectLogs in cri
2021-04-14 21:33:05 -04:00
Sebastiaan van Stijn
864a3322b3
go.mod: github.com/containerd/go-cni v1.0.2
full diff: https://github.com/containerd/go-cni/compare/v1.0.1...v1.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-14 09:09:18 +02:00
Mike Brown
8a04bd0521 address recent runtimes config confusion
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 15:33:38 -05:00
Mike Brown
e96d2a5d90 Revert "remove two very old no longer used runtime options"
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 10:16:01 -05:00
Fu, Wei
7e3fd8da24
Merge pull request #5298 from jsturtevant/issue-5297
Support multi-arch images for Windows via ctr
2021-04-12 13:52:14 +08:00
payall4u
4bc8f692fc optimize cri redirect logs
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-04-09 11:45:53 +08:00
Fu, Wei
d064140369
Merge pull request #5302 from mikebrow/toml-cri-defaults
shows our runc.v2 default options
2021-04-09 11:11:25 +08:00
Sebastiaan van Stijn
9bc8d63c9f
cri/server: use containerd/oci instead of libcontainer/devices
Looks like we had our own copy of the "getDevices" code already, so use
that code (which also matches the code that's used to _generate_ the spec,
so a better match).

Moving the code to a separate file, I also noticed that the _unix and _linux
code was _exactly_ the same (baring some `//nolint:` comments), so also
removing the duplicated code.

With this patch applied, we removed the dependency on the libcontainer/devices
package (leaving only libcontainer/user).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-08 23:25:21 +02:00
Mike Brown
dd16b006e5 merge in the move to the new options type
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-08 14:09:59 -05:00
Mike Brown
9144ce9677 shows our runc.v2 default options in the containerd default config
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-08 14:09:59 -05:00
Aditi Sharma
4d4117415e Change CRI config runtime options type
Changing Runtime.Options type to map[string]interface{}
to correctly marshal it from go to JSON.
See issue: https://github.com/kubernetes-sigs/cri-tools/issues/728

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-04-08 15:11:33 +05:30
Mike Brown
88880f0f2c
Merge pull request #5304 from mikebrow/cri-registry-doc-updates
remove mirrors from default; document the deprecation of registry.configs and registry.mirrors
merging based on LGTMs from https://github.com/containerd/containerd/pull/5304#pullrequestreview-628234110 and https://github.com/containerd/containerd/pull/5304#pullrequestreview-630478887 thanks!
2021-04-07 14:49:36 -05:00
Mike Brown
d4be6aa8fa rm mirror defaults; doc registry deprecations
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-07 12:29:43 -05:00
Akihiro Suda
8ba8533bde
pkg/cri/opts.WithoutRunMount -> oci.WithoutRunMount
Move `pkg/cri/opts.WithoutRunMount` function to `oci.WithoutRunMount`
so that it can be used without dependency on CRI.

Also add `oci.WithoutMounts(dests ...string)` for generality.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-04-07 21:25:36 +09:00
Mike Brown
0186a329e9 remove two very old no longer used runtime options
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-06 20:41:09 -05:00
Derek McGowan
261c107ffc
Merge pull request #5278 from mxpv/toml
Migrate TOML to github.com/pelletier/go-toml
2021-04-01 21:24:52 -07:00
Maksym Pavlenko
5ada2f74a7 Keep host order as defined in TOML file
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-04-01 09:29:16 -07:00
James Sturtevant
d9ff8ebef5 support multi-arch images for windows via ctr
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2021-03-31 15:50:01 -07:00
Mike Brown
1b05b605c8
Merge pull request #5145 from aojea/happyeyeballs
use (sort of) happy-eyeballs for port-forwarding
2021-03-26 09:51:29 -05:00
Maksym Pavlenko
ddd4298a10 Migrate current TOML code to github.com/pelletier/go-toml
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-25 13:13:33 -07:00
Derek McGowan
75a0c2b7d3
Merge pull request #5264 from mxpv/tests
Run unit tests on CI for MacOS
2021-03-25 09:46:25 -07:00
Fu, Wei
80fa9fe32a
Merge pull request #5135 from AkihiroSuda/default-config-crypt
add imgcrypt stream processors to the default config
2021-03-25 14:31:38 +08:00
Maksym Pavlenko
4674ad7beb Ignore some tests on darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-24 22:40:22 -07:00
Maksym Pavlenko
181e2d4216
Merge pull request #5250 from dmcgowan/cri-fix-reference-ordering
Fix reference ordering in CRI image store
2021-03-23 14:45:16 -07:00
Sebastiaan van Stijn
708299ca40
Move RunningInUserNS() to its own package
This allows using the utility without bringing whole of "sys" with it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-23 11:29:53 +01:00
Derek McGowan
0886ceaea2
Fix reference ordering in CRI image store
Currently image references end up being stored in a
random order due to the way maps are iterated through
in Go. This leads to inconsistent identifiers being
resolved when a single reference is needed to identify
an image and the ordering of the references is used for
the selection.

Sort references in a consistent and ranked manner,
from higher information formats to lower.

Note: A `name + tag` reference is considered higher
information than a `name + digest` reference since a
registry may be used to resolve the digest from a
`name + tag` reference.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-22 22:29:57 -07:00
Antonio Ojea
305b425830 use happy-eyeballs for port-forwarding
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12.
It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially, trying IPv4 first for keeping the same behaviour.
xref https://github.com/golang/go/issues/44922

Signed-off-by: Antonio Ojea <aojea@redhat.com>
2021-03-22 20:15:24 +01:00
Michael Crosby
e0c94bb269
Merge pull request #4708 from kzys/enable-criu
Re-enable CRIU tests by not using overlayfs snapshotter
2021-03-19 14:23:05 -04:00
Shiming Zhang
1410220d8f Fix error log when copy file
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-03-20 00:13:00 +08:00
Michael Crosby
3f98a6d2d3
Merge pull request #5211 from pacoxu/pause/3.5
upgrade pause image to 3.5 for non-root
2021-03-18 11:43:59 -04:00
Phil Estes
32a08f1a6a
Merge pull request #4847 from cpuguy83/devices_by_dir
Support adding devices by dir
2021-03-17 09:41:02 -04:00
Kazuyoshi Kato
b520428b5a Fix CRIU
- process.Init#io could be nil
- Make sure CreateTaskRequest#Options is not empty before unmarshaling

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-03-16 16:46:45 -07:00
pacoxu
ffff688663 upgrade pause image to 3.5 for non-root
Signed-off-by: pacoxu <paco.xu@daocloud.io>
2021-03-16 23:20:35 +08:00
Derek McGowan
2755ead927
Merge pull request #4978 from cpuguy83/certs_dir
Add support for using a host registry dir in cri
2021-03-15 13:47:03 -07:00
Brian Goff
7776e5ef2a Support adding devices by dir
This enables cases where devices exist in a subdirectory of /dev,
particularly where those device names are not portable across machines,
which makes it problematic to specify from a runtime such as cri.

Added this to `ctr` as well so I could test that the code at least
works.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-15 16:42:23 +00:00
Akihiro Suda
ecb881e5e6
add imgcrypt stream processors to the default config
Enable the following config by default:

```toml
version = 2

[plugins."io.containerd.grpc.v1.cri".image_decryption]
  key_model = "node"

[stream_processors]
  [stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
    accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
    returns = "application/vnd.oci.image.layer.v1.tar+gzip"
    path = "ctd-decoder"
    args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
    env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
  [stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
    accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
    returns = "application/vnd.oci.image.layer.v1.tar"
    path = "ctd-decoder"
    args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
    env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
```

Fix issue 5128

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-15 13:27:16 +09:00
Brian Goff
b0b6d9aa03 Add support for using a host registry dir in cri
This will be used instead of the cri registry config in the main config
toml.

---

Also pulls in changes from containerd/cri@d0b4eecbb3

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-12 22:42:22 +00:00
Derek McGowan
8cf669ce34
Fix unsupported files exporting functions for apparmor and seccomp
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-12 08:47:05 -08:00
Derek McGowan
35eeb24a17
Fix exported comments enforcer in CI
Add comments where missing and fix incorrect comments

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-12 08:47:05 -08:00
Iceber Gu
f37ae8fc35
move to v3.4.1 for the pause image
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-03-07 15:21:20 +08:00
Iceber Gu
92ab1a63b0 cri: fix container status
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-03-05 00:00:10 +08:00
f00231050
591caece0c cri: check fsnotify watcher when receiving cni conf dir events
carry: 612f5f9f44

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-03-03 16:46:41 +08:00
Phil Estes
8dbe53a2a9
Merge pull request #5070 from yoheiueda/empty-masked
cri: set default masked/readonly paths to empty paths
2021-02-25 15:38:45 -05:00
Akihiro Suda
7ee610edb5
drop dependency on github.com/syndtr/gocapability
pkg/cap has the full list of the caps (for UT, originally),
so we can drop dependency on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-25 15:17:28 +09:00
Akihiro Suda
9822173354
cap: rename FromUint64 to FromBitmap
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-25 15:02:10 +09:00
Yohei Ueda
07f1df4541
cri: set default masked/readonly paths to empty paths
Fixes #5029.

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2021-02-24 23:50:40 +09:00
Phil Estes
757be0a090
Merge pull request #5017 from AkihiroSuda/parse-cap
oci.WithPrivileged: set the current caps, not the known caps
2021-02-23 09:10:57 -05:00
Mike Brown
9173d3e929
Merge pull request #5021 from wzshiming/fix/signal_repeatedly
Fix repeated sending signal
2021-02-22 09:45:56 -06:00
Justin Terry (SF)
06e4e09567 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Justin Terry (SF) <juterry@microsoft.com>
2021-02-18 16:39:28 -08:00
Phil Estes
c32ccdf8be
Merge pull request #5024 from yadzhang/deepcopy-imageconfig
cri: append envs from image config to empty slice to avoid env lost
2021-02-18 12:51:51 -05:00
Akihiro Suda
746cef0bc2
Merge pull request #5044 from wzshiming/fix/empty-error-warpping
Fix empty error warpping
2021-02-18 13:47:13 +09:00
zhangyadong.0808
08318b1ab9 cri: append envs from image config to empty slice to avoid env lost
Signed-off-by: Yadong Zhang <yadzhang@gmail.com>
2021-02-18 11:37:41 +08:00
Shiming Zhang
59db8a10e0 Fix empty error warpping
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-18 11:06:59 +08:00
Shiming Zhang
dc6f5ef3b9 Fix repeated sending signal
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-02-17 21:33:49 +08:00
Michael Crosby
41e3057cc6
Merge pull request #5025 from jeremyje/win20h2
Add references to Windows 20H2 test images.
2021-02-12 11:58:49 -05:00
Lorenz Brun
36d0bc1f2b Allow moving netns directory into StateDir
Signed-off-by: Lorenz Brun <lorenz@nexantic.com>
2021-02-10 18:33:14 +01:00
Akihiro Suda
a2d1a8a865
oci.WithPrivileged: set the current caps, not the known caps
This change is needed for running the latest containerd inside Docker
that is not aware of the recently added caps (BPF, PERFMON, CHECKPOINT_RESTORE).

Without this change, containerd inside Docker fails to run containers with
"apply caps: operation not permitted" error.

See kubernetes-sigs/kind 2058

NOTE: The caller process of this function is now assumed to be as
privileged as possible.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-02-10 17:14:17 +09:00
Michael Crosby
e874e2597e [cri] add pod annotations to CNI call
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-09 13:24:01 -05:00
Jeremy Edwards
1c81071d39 Add references to Windows 20H2 test images.
Signed-off-by: Jeremy Edwards <1312331+jeremyje@users.noreply.github.com>
2021-02-09 16:25:36 +00:00
Derek McGowan
b3f2402062
Merge pull request #5002 from crosbymichael/anno-image-name
[cri] add image-name annotation
2021-02-05 08:27:41 -08:00
Akihiro Suda
e908be5b58
Merge pull request #5001 from kzys/no-lint-upgrade 2021-02-06 00:40:38 +09:00
Kazuyoshi Kato
07db46ee23 lint: update nolint syntax for golangci-lint
Newer golangci-lint needs explicit `//` separator. Otherwise it treats
the entire line (`staticcheck deprecated ... yet`) as a name.

https://golangci-lint.run/usage/false-positives/#nolint

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-02-04 11:59:55 -08:00
Sebastiaan van Stijn
04d061fa6a
update runc to v1.0.0-rc93
full diff: https://github.com/opencontainers/runc/compare/v1.0.0-rc92...v1.0.0-rc93

also removes dependency on libcontainer/configs

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:13:30 +01:00
Sebastiaan van Stijn
54cc3483ff
pkg/cri/server: don't import libcontainer/configs
Looks like this import was not needed for the test; simplified the test
by just using the device-path (a counter would work, but for debugging,
having the list of paths can be useful).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-02-04 16:08:39 +01:00
Michael Crosby
99cb62f233 [cri] add image-name annotation
For some tools having the actual image name in the annotations is helpful for
debugging and auditing the workload.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-02-04 07:05:11 -05:00
Lantao Liu
b5bf1fd5d8 Fix deprecated registry auth conversion.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-02-03 19:22:26 -08:00
Aditi Sharma
1423e9199d Update gogo/protobuf to v1.3.2
bump version 1.3.2 for gogo/protobuf due to CVE-2021-3121 discovered
in gogo/protobuf version 1.3.1, CVE has been fixed in 1.3.2

Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-01-28 12:57:50 +00:00
Michael Crosby
591d7e2fb1 remove exec sync debug contents from logs
This was dumping untrusted output to the debug logs from user containers.
We should not dump this type of information to reduce log sizes and any
information leaks from user containers.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-26 14:57:54 -05:00
Alban Crequy
28e4fb25f4 cri: add annotations for pod name and namespace
cri-o has annotations for pod name, namespace and container name:
https://github.com/containers/podman/blob/master/pkg/annotations/annotations.go

But so far containerd had only the container name.

This patch will be useful for seccomp agents to have a different
behaviour depending on the pod (see runtime-spec PR 1074 and runc PR
2682). This should simplify the code in:
b2d423695d/pkg/kuberesolver/kuberesolver.go (L16-L27)

Signed-off-by: Alban Crequy <alban@kinvolk.io>
2021-01-26 12:10:39 +01:00
Wei Fu
e56de63099 cri: handle sandbox/container exit event separately
The event monitor handles exit events one by one. If there is something
wrong about deleting task, it will slow down the terminating Pods. In
order to reduce the impact, the exit event watcher should handle exit
event separately. If it failed, the watcher should put it into backoff
queue and retry it.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-01-24 13:43:38 +08:00
Shengjing Zhu
2818fdebaa Move runtimeoptions out of cri package
Since it's a standard set of runtime opts, and used in ctr as well,
it could be moved out of cri.

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2021-01-23 01:24:35 +08:00
Michael Crosby
a731039238 [cri] label etc files for selinux containers
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-01-19 13:42:09 -05:00
Mike Brown
550b4949cb
Merge pull request #4700 from mikebrow/cri-security-profile-update
CRI security profile update for CRI graduation
2021-01-12 12:21:56 -06:00
Sebastiaan van Stijn
2374178c9b
pkg/cri/server: optimizations in unmountRecursive()
Use a PrefixFilter() to get only the mounts we're interested in,
which removes the need to manually filter mounts from the mountinfo
results.

Additional optimizations can be made, as:

> ... there's a little known fact that `umount(MNT_DETACH)` is actually
> recursive in Linux, IOW this function can be replaced with
> `unix.Umount(target, unix.MNT_DETACH)` (or `mount.UnmountAll(target, unix.MNT_DETACH)`
>  (provided that target itself is a mount point).

e8fb2c392f (r535450446)

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:32:01 +01:00
Sebastiaan van Stijn
7572919201
mount: remove remaining uses of mount.Self()
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-01-08 17:31:59 +01:00
Davanum Srinivas
1f5b84f27c
[CRI] Reduce clutter of log entries during process execution
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-01-06 13:09:03 -05:00
Shengjing Zhu
5988bfc1ef docs: Various typo found by codespell
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-12-22 13:22:16 +08:00
Michael Crosby
2e442ea485 [cri] ensure log dir is created
containerd is responsible for creating the log but there is no code to ensure
that the log dir exists.  While kubelet should have created this there can be
times where this is not the case and this can cause stuck tasks.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-17 15:04:39 -05:00
Akihiro Suda
7e6e4c466f
remove "selinux" build tag
The build tag was removed in go-selinux v1.8.0: opencontainers/selinux#132

Related: remove "apparmor" build tag: 0a9147f3aa

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-15 20:05:25 +09:00
Akihiro Suda
0a9147f3aa
remove "apparmor" build tag
The "apparmor" build tag does not have any cgo dependency and can be removed safely.

Related: https://github.com/opencontainers/runc/issues/2704

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-08 19:22:39 +09:00
Mike Brown
6467c3374d refactor based on comments
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-07 21:39:31 -06:00
Phil Estes
73a301c7a1
Merge pull request #4772 from gaurav1086/ValidatePluginConfig_fix_range_iterator_issue
[cri/config] : fix range iterator issue in ValidatePluginConfig
2020-12-07 12:42:07 -05:00
Phil Estes
efad13faaf
Merge pull request #4811 from AkihiroSuda/expose-apparmor
expose hostSupportsAppArmor()
2020-12-07 08:22:16 -05:00
Akihiro Suda
55eda46b22
expose hostSupportsAppArmor()
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-12-07 19:12:59 +09:00
Gaurav Singh
071a185506 cri/config: fix range iterator issue in ValidatePluginConfig
Go uses the same address variable while iterating in a range,
so use a copy when using its address.

Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-12-04 17:37:09 -05:00
Mike Brown
b4727eafbe adding code to support seccomp apparmor securityprofile
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-12-04 15:15:32 -06:00
Michael Crosby
3d358c9df3 [cri] don't clear base security settings
When a base runtime spec is being used, admins can configure defaults for the
spec so that default ulimits or other security related settings get applied for
all containers launched.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-12-02 06:51:37 -05:00
Shengjing Zhu
fe767f95c7 Fix package name in cri runtimeoptions protobuf
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-11-22 16:15:34 +08:00
Maksym Pavlenko
2837fb35a7
Merge pull request #4715 from thaJeztah/remove_libcontainer_apparmor
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
2020-11-18 14:34:48 -08:00
Sebastiaan van Stijn
eba94a15c8
pkg/cri/server: remove dependency on libcontainer/apparmor, libcontainer/utils
recent versions of libcontainer/apparmor simplified the AppArmor
check to only check if the host supports AppArmor, but no longer
checks if apparmor_parser is installed, or if we're running
docker-in-docker;

bfb4ea1b1b

> The `apparmor_parser` binary is not really required for a system to run
> AppArmor from a runc perspective. How to apply the profile is more in
> the responsibility of higher level runtimes like Podman and Docker,
> which may do the binary check on their own.

This patch copies the logic from libcontainer/apparmor, and
restores the additional checks.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-11-12 15:42:25 +01:00
Jacob Blain Christen
a1e7dd939d cri: selinuxrelabel=false for /dev/shm w/ host ipc
This is a followup to #4699 that addresses an oversight that could cause
the CRI to relabel the host /dev/shm, which should be a no-op in most
cases. Additionally, fixes unit tests to make correct assertions for
/dev/shm relabeling.

Discovered while applying the changes for #4699 to containerd/cri 1.4:
https://github.com/containerd/cri/pull/1605

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-11 15:22:17 -07:00
Jacob Blain Christen
e8d8ae3b97 cri: selinux relabel /dev/shm
Address an issue originally seen in the k3s 1.3 and 1.4 forks of containerd/cri, https://github.com/rancher/k3s/issues/2240

Even with updated container-selinux policy, container-local /dev/shm
will get mounted with container_runtime_tmpfs_t because it is a tmpfs
created by the runtime and not the container (thus, container_runtime_t
transition rules apply). The relabel mitigates such, allowing envoy
proxy to work correctly (and other programs that wish to write to their
/dev/shm) under selinux.

Tested locally with:
- SELINUX=Enforcing vagrant up --provision-with=shell,selinux,test-integration
- SELINUX=Enforcing CRITEST_ARGS=--ginkgo.skip='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri
- SELINUX=Permissive CRITEST_ARGS=--ginkgo.focus='HostIpc is true' vagrant up --provision-with=shell,selinux,test-cri

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2020-11-06 12:05:17 -07:00
Sebastiaan van Stijn
1146098421
replace pkg/symlink with moby/sys/symlink
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-10-30 00:05:15 +01:00
Derek McGowan
5184bccea3
Merge pull request #4631 from dims/copy-a-few-packages-from-moby/moby
Copy pkg/symlink and pkg/truncindex from moby/moby
2020-10-29 09:13:30 -07:00
Wei Fu
f2e8fda82b
Merge pull request #4665 from dmcgowan/update-default-snapshot-annotations
Update make snapshot annotations disabled by default
2020-10-28 21:12:02 +08:00
Derek McGowan
b2642458f9
Update make snapshot annotations disabled by default
This experimental feature should not be enabled by default as
it is not used by any default snapshotters.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-27 21:32:25 -07:00
Akihiro Suda
8ff2707a3c
Merge pull request #4610 from shahzzzam/samashah/add-annotations
Add manifest digest annotation for snapshotters
2020-10-28 13:11:49 +09:00
zhuangqh
30c9addd6c fix: always set unknown to false when handling exit event
Signed-off-by: jerryzhuang <zhuangqhc@gmail.com>
2020-10-27 10:50:15 +08:00
Davanum Srinivas
a9cb22309a
Copy pkg/symlink and pkg/truncindex from moby/moby
moby/moby SHA : 9c15e82f19b0ad3c5fe8617a8ec2dddc6639f40a

github.com/docker/docker/pkg/truncindex/truncindex.go -> pkg/cri/store/truncindex/truncindex.go
github.com/docker/docker/pkg/symlink/LICENSE.APACHE -> pkg/symlink/LICENSE.APACHE
github.com/docker/docker/pkg/symlink/LICENSE.BSD -> pkg/symlink/LICENSE.BSD
github.com/docker/docker/pkg/symlink/README.md -> pkg/symlink/README.md
github.com/docker/docker/pkg/symlink/fs.go -> pkg/symlink/fs.go
github.com/docker/docker/pkg/symlink/fs_unix.go -> pkg/symlink/fs_unix.go
github.com/docker/docker/pkg/symlink/fs_windows.go -> pkg/symlink/fs_windows.go

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-10-15 08:36:35 -04:00
Daniel Canter
cdb2f9c66f Filter snapshotter labels passed to WithNewSnapshot
Made a change yesterday that passed through snapshotter labels into the wrapper of
WithNewSnapshot, but it passed the entirety of the annotations into the snapshotter.
This change just filters the set that we care about down to snapshotter specific
labels.

Will probably be future changes to add some more labels for LCOW/WCOW and the corresponding
behavior for these new labels.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-15 04:49:39 -07:00
Phil Estes
9b70de01d6
Merge pull request #4630 from dcantah/pass-snapshotter-opt
Cri - Pass snapshotter labels into customopts.WithNewSnapshot
2020-10-14 10:54:06 -04:00
Daniel Canter
9a1f6ea4dc Cri - Pass snapshotter labels into customopts.WithNewSnapshot
Previously there wwasn't a way to pass any labels to snapshotters as the wrapper
around WithNewSnapshot didn't have a parm to pass them in.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-14 04:14:03 -07:00
Daniel Canter
d74225b588 Fix comment in RemovePodSandbox
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-12 17:59:08 -07:00
zhangjianming
116902cd21 fix no-pivot not working in io.containerd.runtime.v1.linux
Signed-off-by: zhangjianming <zhang.jianming7@zte.com.cn>
2020-10-12 09:39:59 +08:00
Maksym Pavlenko
3d02441a79 Refactor pkg packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-08 17:30:17 -07:00
Akihiro Suda
915263f269
Merge pull request #4502 from akshat-kmr/master
Add logging binary support when terminal is true
2020-10-08 12:14:39 +09:00
Samarth Shah
5fc721370d Add manifest digest annotation for snapshotters
Signed-off-by: Samarth Shah <samarthmshah@gmail.com>
2020-10-07 23:12:01 +00:00
Maksym Pavlenko
3508ddd3dd Refactor CRI packages
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-10-07 14:45:57 -07:00
Derek McGowan
b22b627300
Move cri server packages under pkg/cri
Organizes the cri related server packages under pkg/cri

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 13:09:37 -07:00
Derek McGowan
1c60ae7f87
Use local version of cri packages
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:59:40 -07:00
Derek McGowan
e7a350176a
Merge containerd/cri into containerd/containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:58:39 -07:00
Derek McGowan
0820015314 Prepare cri for merge to containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 10:58:39 -07:00
Michael Crosby
a0b3b4e4da
Merge pull request #1593 from moolen/fix/add-nri-labels
Add missing sandbox labels when invoking nri
2020-10-07 13:17:06 -04:00
Derek McGowan
07c98d0bf1 Fix lint in Unix environments
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-05 17:46:01 -07:00
Moritz Johner
f87302ab20 Add missing sandbox labels when invoking nri plugins
Signed-off-by: Moritz Johner <beller.moritz@googlemail.com>
2020-10-03 23:30:09 +02:00
Derek McGowan
a3c0e8859c Align lint checks with containerd
Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-09-30 23:17:46 -07:00
Derek McGowan
83e6efc6fc Use tabs in protofile indentation
This is enforced as part of containerd's fmt checks

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-09-30 21:50:33 -07:00
Mike Brown
2e3bebb297
Merge pull request #1583 from thaJeztah/simplify_ensure_removeall_windows
pkg/server: make ensureRemoveAll() an alias for os.RemoveAll() on Windows
2020-09-28 14:26:18 -05:00
Mike Brown
b1ee4c0d7b
Merge pull request #1570 from yoheiueda/masked
Set masked and readonly paths based on default Unix spec
2020-09-24 15:45:58 -05:00
Sebastiaan van Stijn
e2928124d1
pkg/server: make ensureRemoveAll() an alias for os.RemoveAll() on Windows
The tricks performed by ensureRemoveAll only make sense for Linux and
other Unices, so separate it out, and make ensureRemoveAll for Windows
just an alias of os.RemoveAll.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-09-22 10:11:46 +02:00
ktock
e571fd864f Limit value size of additional annotation for avoiding unpack failure
In containerd, there is a size limit for label size (4096 chars).
Currently if an image has many layers (> (4096-39)/72 > 56),
`containerd.io/snapshot/cri.image-layers` will hit the limit of label size and
the unpack will fail.
This commit fixes this by limiting the size of the annotation.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-09-15 22:47:28 +09:00
Yohei Ueda
b582da4438
Set masked and readonly paths based on default Unix spec
The default values of masked and readonly paths are defined
in populateDefaultUnixSpec, and are used when a sandbox is
created.  It is not, however, used for new containers.  If
a container definition does not contain a security context
specifying masked/readonly paths, a container created from
it does not have masked and readonly paths.

This patch applies the default values to masked and
readonly paths of a new container, when any specific values
are not specified.

Fixes #1569

Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
2020-09-09 23:13:05 +09:00
Michael Crosby
d715d00906 Handle KVM based runtimes with selinux
Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-08-26 21:38:03 -04:00
Akshat Kumar
7a9fbec5fb Add logging binary support when terminal is true
Currently the shims only support starting the logging binary process if the
io.Creator Config does not specify Terminal: true. This means that the program
using containerd will only be able to specify FIFO io when Terminal: true,
rather than allowing the shim to fork the logging binary process. Hence,
containerd consumers face an inconsistent behavior regarding logging binary
management depending on the Terminal option.

Allowing the shim to fork the logging binary process will introduce consistency
between the running container and the logging process. Otherwise, the logging
process may die if its parent process dies whereas the container will keep
running, resulting in the loss of container logs.

Signed-off-by: Akshat Kumar <kshtku@amazon.com>
2020-08-25 17:28:29 -07:00
Derek McGowan
56a89cda34
Merge pull request #1552 from crosbymichael/nri
Add experimental NRI injection points
2020-08-24 13:58:11 -07:00
Antonio Ojea
1403a391c3 bump cni dependencies
Signed-off-by: Antonio Ojea <aojea@redhat.com>
2020-08-21 18:00:20 +02:00
Michael Crosby
63f89eb954 Update server with nri injection points
This allows development with container to be done for NRI without the need for
custom builds.

This is an experimental feature and is not enabled unless a user has a global
`/etc/nri/conf.json` config setup with plugins on the system.  No NRI code will
be executed if this config file does not exist.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-08-20 08:10:09 -04:00
Akihiro Suda
7332e2ad2e
remove libseccomp cgo dependency
The CRI plugin was depending on libseccomp cgo dependency via
libseccomp-golang via libcontainer.

https://github.com/seccomp/libseccomp-golang/blob/v0.9.1/seccomp_internal.go#L17

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-30 18:51:23 +09:00
Mike Brown
8a2d1cc802 adds support for pod id lookup for filter
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-07-29 15:23:22 -05:00
ktock
c80660b82b Allow GC to discard content after successful pull and unpack
This commit adds a config flag for allowing GC to clean layer contents up after
unpacking these contents completed, which leads to deduplication of layer
contents between the snapshotter and the contnet store.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2020-07-28 09:05:47 +09:00
Michael Crosby
5f5d954b6a add selinux category range to config
This allows an admin to set the upper bounds on the category range for selinux
labels.  This can be useful when handling allocation of PVs or other volume
types that need to be shared with selinux enabled on the hosts and volumes.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2020-07-20 16:02:07 -04:00
Akihiro Suda
707d2c49d1
allow disabling hugepages
This helps with running rootless mode + cgroup v2 + systemd without hugetlb delegation.
Systemd does not (and will not, perhaps) support hugetlb delegation as of systemd v245. https://github.com/systemd/systemd/
issues/14662

From 502bc5427e/src/patches/containerd/0001-DIRTY-VENDOR-cri-allow-disabling-hugepages.patch

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-16 11:46:25 +09:00
James Sturtevant
2bb0b19c4b Update to latest pause image for windows
Signed-off-by: James Sturtevant <jstur@microsoft.com>
2020-07-15 11:45:21 -07:00
Mike Brown
4b3974c4e9 show runc options tag
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-07-10 16:33:36 -05:00
Abhishek Kulkarni
287c52d1c6 Forcibly stop running containers before removal
Signed-off-by: Abhishek Kulkarni <abd.kulkarni@gmail.com>
2020-07-04 15:49:00 -05:00
Akihiro Suda
fb208d015a
vendor runc v1.0.0-rc91
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-03 14:03:21 +09:00
Akihiro Suda
fe6833a9a4
config: TolerateMissingHugePagesCgroupController -> TolerateMissingHugetlbController
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-02 13:49:42 +09:00
Akihiro Suda
b69d7bdc5f
config: fix TOML tag for TolerateMissingHugePagesCgroupController
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-07-02 13:38:19 +09:00
Mike Brown
c2191fddd7
Merge pull request #1513 from brianpursley/state-name
Change "failed to stop sandbox" error message to use state name instead of numeric value
2020-06-27 16:08:27 -05:00
Brian Pursley
aa04fc9d53 Change "failed to stop sandbox" error message to use state name instead of numeric value
Signed-off-by: Brian Pursley <bpursley@cinlogic.com>
2020-06-27 16:45:08 -04:00
Kevin Parsons
210561a8e3 Support named pipe mounts for Windows containers
Adds support to mount named pipes into Windows containers. This support
already exists in hcsshim, so this change just passes them through
correctly in cri. Named pipe mounts must start with "\\.\pipe\".

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
2020-06-25 12:01:08 -07:00
Mike Brown
f5c7ac9272 fix for image pull linter change
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2020-06-24 18:10:31 -05:00
Davanum Srinivas
3ee62de2bf
remove unused method
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 15:03:47 -04:00
Davanum Srinivas
cbb7c28f19
Add copyright headers
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 14:49:13 -04:00
Davanum Srinivas
e2072b71cc
Copy kubernetes/pkg/util/bandwidth
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 14:48:25 -04:00
Davanum Srinivas
2909022a6e
Make local copy of kubelet/cri/streaming
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 13:54:34 -04:00
Davanum Srinivas
41f184f15b
Update vendor.conf to kubernetes 1.19.0-beta.2
update streaming import path
switch remote package path

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-22 08:44:49 -04:00
Michael Crosby
6164822714
Merge pull request #1508 from janosi/sctp-hostport
Remove the protocol filter from the HostPort management
2020-06-15 14:48:37 -04:00
Mike Brown
b661ad711e
Merge pull request #1504 from lorenz/ignore-image-defined-volumes
Add option for ignoring volumes defined in images
2020-06-14 11:52:48 -05:00
Mike Brown
26dc5b9772
Merge pull request #1505 from dcantah/windows-cred-spec
Add GMSA credential spec passing
2020-06-14 11:52:33 -05:00
Laszlo Janosi
479dfbac45
Remove the protocol filter from the portMappings constructor.
Reason: originally it was introduced to prevent the loading of the SCTP kernel module on the nodes. But iptables chain creation alone does not load the kernel module. The module would be loaded if an SCTP socket was created, but neither cri nor the portmap CNI plugin starts managing SCTP sockets if hostPort / portmappings are defined.
Signed-off-by: Laszlo Janosi <laszlo.janosi@ibm.com>
2020-06-14 15:48:00 +00:00
Kenta Tada
730b7a932e Change the type of PdeathSignal
Use x/sys as same as runtime/v1/linux/runtime.go

Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2020-06-11 11:35:51 +09:00
Daniel Canter
9620b2e1da Add GMSA Credential Spec passing
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-06-10 11:15:07 -07:00
Lorenz Brun
5a1d49b063 Add option for ignoring volumes defined in images
Signed-off-by: Lorenz Brun <lorenz@brun.one>
2020-06-09 21:02:47 +02:00
Brian Goff
c694c63176 Add config for registry http headers
This adds a configuration knob for adding request headers to all
registry requests. It is not namespaced to a registry.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-06-08 18:56:15 -07:00
Gaurav Singh
7213cd89d6 Process I/O: Fix goroutine leak
Signed-off-by: Gaurav Singh <gaurav1086@gmail.com>
2020-06-07 17:38:36 -04:00