Commit Graph

1072 Commits

Author SHA1 Message Date
Derek McGowan
9afc778b73
Merge pull request #6111 from crosbymichael/latency-metrics
[cri] add sandbox and container latency metrics
2021-11-16 16:59:33 -08:00
Phil Estes
7758cdc09a
Merge pull request #6253 from jonyhy96/feat-rwmutex
feat: use rwmutex instead
2021-11-16 11:39:29 -05:00
Derek McGowan
6835a94707
Split runc shim into plugin components
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
Derek McGowan
6eea8f3f62
Add shutdown package
Allows shutdown to handle callbacks with similar behavior as context
cancel

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-11-15 20:16:45 -08:00
haoyun
bef792b962 feat: use rwmutex instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-16 11:06:40 +08:00
Derek McGowan
d055487b00
Merge pull request #6206 from mxpv/path
Allow absolute path to shim binaries
2021-11-15 18:05:48 -08:00
Olli Janatuinen
2a81c9f677 CRI: Support enable_unprivileged_icmp and enable_unprivileged_ports options
Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
2021-11-15 18:30:09 +02:00
Michael Crosby
6765524b73 use write lock when updating container stats
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-11 15:17:48 +00:00
Maksym Pavlenko
6870f3b1b8 Support custom runtime path when launching tasks
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-11-09 13:31:46 -08:00
Michael Crosby
91bbaf6799 [cri] add sandbox and container latency metrics
These are simple metrics that allow users to view more fine grained metrics on
internal operations.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-11-09 21:07:38 +00:00
Michael Crosby
4b7cc560b2
Merge pull request #6222 from jonyhy96/add-more-description
cleanup: add more description on comment
2021-11-09 15:55:32 -05:00
haoyun
5748006337 cleanup: add more description on comment
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-11-09 19:13:37 +08:00
David Porter
2e6d5709e3 Implement CRI container and pods stats
See https://kep.k8s.io/2371

* Implement new CRI RPCs - `ListPodSandboxStats` and `PodSandboxStats`
  * `ListPodSandboxStats` and `PodSandboxStats` which return stats about
    pod sandbox. To obtain pod sandbox stats, underlying metrics are
    read from the pod sandbox cgroup parent.
  * Process info is obtained by calling into the underlying task
  * Network stats are taken by looking up network metrics based on the
    pod sandbox network namespace path
* Return more detailed stats for cpu and memory for existing container
  stats. These metrics use the underlying task's metrics to obtain
  stats.

Signed-off-by: David Porter <porterdavid@google.com>
2021-11-03 17:52:05 -07:00
Mike Brown
ea89788105 adds additional debug out to timebox cni setup
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-11-01 09:34:29 -05:00
zounengren
a217b5ac8f bump CNI to spec v1.0.0
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-10-22 10:58:40 +08:00
Sambhav Kothari
2a8dac12a7 Output a warning for label image labels instead of erroring
This change ignore errors during container runtime due to large
image labels and instead outputs warning. This is necessary as certain
image building tools like buildpacks may have large labels in the images
which need not be passed to the container.

Signed-off-by: Sambhav Kothari <sambhavs.email@gmail.com>
2021-10-14 19:25:48 +01:00
Claudiu Belu
2bc77b8a28 Adds Windows resource limits support
This will allow running Windows Containers to have their resource
limits updated through containerd. The CPU resource limits support
has been added for Windows Server 20H2 and newer, on older versions
hcsshim will raise an Unimplemented error.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-25 13:20:55 -07:00
Derek McGowan
cb6fb93af5
Merge pull request #6011 from crosbymichael/schedcore
add runc shim support for sched core
2021-10-08 10:42:16 -07:00
Derek McGowan
26ee1b1ee5
Merge pull request #4695 from crosbymichael/cri-class
[cri] Add CNI conf based on runtime class
2021-10-08 09:27:49 -07:00
Michael Crosby
e48bbe8394 add runc shim support for sched core
In linux 5.14 and hopefully some backports, core scheduling allows processes to
be co scheduled within the same domain on SMT enabled systems.

The containerd impl sets the core sched domain when launching a shim. This
allows a clean way for each shim(container/pod) to be in its own domain and any
additional containers, (v2 pods) be be launched with the same domain as well as
any exec'd process added to the container.

kernel docs: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/core-scheduling.html

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-10-08 16:18:09 +00:00
Michael Crosby
7b8a697f28
Merge pull request #6034 from claudiubelu/windows/fixes-image-volume
Fixes Windows containers with image volumes
2021-10-07 11:50:01 -04:00
Akihiro Suda
703b86533b
pkg/cap: remove an outdated comment
pkg/cap no longer depends on github.com/syndtr/gocapability

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-10-06 13:24:30 +09:00
Derek McGowan
63b7e5771e
Merge pull request #5973 from Juneezee/deprecate-ioutil
refactor: move from io/ioutil to io and os package
2021-10-01 10:52:06 -07:00
Claudiu Belu
791e175c79 Windows: Fixes Windows containers with image volumes
Currently, there are few issues that preventing containers
with image volumes to properly start on Windows.

- Unlike the Linux implementation, the Container volume mount paths
  were not created if they didn't exist. Those paths are now created.

- while copying the image volume contents to the container volume,
  the layers were not properly deactivated, which means that the
  container can't start since those layers are still open. The layers
  are now properly deactivated, allowing the container to start.

- even if the above issue didn't exist, the Windows implementation of
  mount/Mount.Mount deactivates the layers, which wouldn't allow us
  to copy files from them. The layers are now deactivated after we've
  copied the necessary files from them.

- the target argument of the Windows implementation of mount/Mount.Mount
  was unused, which means that folder was always empty. We're now
  symlinking the Layer Mount Path into the target folder.

- hcsshim needs its Container Mount Paths to be properly formated, to be
  prefixed by C:. This was an issue for Volumes defined with Linux-like
  paths (e.g.: /test_dir). filepath.Abs solves this issue.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-10-01 09:02:18 +00:00
haoyun
5c2426a7b2 cleanup: import from k8s.io/utils/clock/testing instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:34:56 +08:00
haoyun
6484fab1e0 cleanup: import from k8s.io/utils/clock instead
Signed-off-by: haoyun <yun.hao@daocloud.io>
2021-09-30 23:27:20 +08:00
zounengren
fcffe0c83a switch usage directly to errdefs.(ErrAlreadyExists and ErrNotFound)
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-09-24 18:26:58 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Michael Crosby
55893b9be7 Add CNI conf based on runtime class
Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-17 19:05:06 +00:00
Phil Estes
f40df3d72b
Enable image config labels in ctr and CRI container creation
Signed-off-by: Phil Estes <estesp@amazon.com>
2021-09-15 15:31:19 -04:00
Phil Estes
d081457ba4
Merge pull request #5974 from claudiubelu/hanging-task-delete-fix
task delete: Closes task IO before waiting
2021-09-15 11:30:23 -04:00
Fu Wei
e1ad779107
Merge pull request #5817 from dmcgowan/shim-plugins
Add support for shim plugins
2021-09-12 18:18:20 +08:00
Fu Wei
d9f921e4f0
Merge pull request #5906 from thaJeztah/replace_os_exec 2021-09-11 10:38:53 +08:00
Phil Estes
6589876d20
Merge pull request #5964 from crosbymichael/cni-pref
add ip_pref CNI options for primary pod ip
2021-09-10 12:06:23 -04:00
Fu Wei
689a863efe
Merge pull request #5939 from scuzhanglei/privileged-device 2021-09-10 22:15:46 +08:00
Michael Crosby
1ddc54c00d
Merge pull request #5954 from claudiubelu/fix-sandbox-remove
sandbox: Allows the sandbox to be deleted in NotReady state
2021-09-10 10:12:34 -04:00
Michael Crosby
1efed43090
add ip_pref CNI options for primary pod ip
This fixes the TODO of this function and also expands on how the primary pod ip
is selected. This change allows the operator to prefer ipv4, ipv6, or retain the
ordering provided by the return results of the CNI plugins.

This makes it much more flexible for ops to configure containerd and how IPs are
set on the pod.

Signed-off-by: Michael Crosby <michael@thepasture.io>
2021-09-10 10:04:21 -04:00
scuzhanglei
756f4a3147 cri: add devices for privileged container
Signed-off-by: scuzhanglei <greatzhanglei@gmail.com>
2021-09-10 10:16:26 +08:00
Fu Wei
d58542a9d1
Merge pull request #5627 from payall4u/payall4u/cri-support-cgroup-v2 2021-09-09 23:10:33 +08:00
Claudiu Belu
55faa5e93d task delete: Closes task IO before waiting
After containerd restarts, it will try to recover its sandboxes,
containers, and images. If it detects a task in the Created or
Stopped state, it will be removed. This will cause the containerd
process it hang on Windows on the t.io.Wait() call.

Calling t.io.Close() beforehand will solve this issue.

Additionally, the same issue occurs when trying to stopp a sandbox
after containerd restarts. This will solve that case as well.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-07 02:17:01 -07:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI-plugin subscribes the image event on k8s.io namespace. By
default, the image event is created by CRI-API. However, the image can
be downloaded by containerd API on k8s.io with the customized labels.
The CRI-plugin should use patch update for `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Claudiu Belu
24cec9be56 sandbox: Allows the sandbox to be deleted in NotReady state
The Pod Sandbox can enter in a NotReady state if the task associated
with it no longer exists (it died, or it was killed). In this state,
the Pod network namespace could still be open, which means we can't
remove the sandbox, even if --force was used.

Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
2021-09-02 03:40:56 -07:00
Mike Brown
e00f87f1dc
Merge pull request #5927 from adelina-t/ws_2022_image_update
Update Pause image in tests & config
2021-08-31 16:11:57 -05:00
Adelina Tuvenie
6d3d34b85d Update Pause image in tests & config
With the introduction of Windows Server 2022, some images have been updated
to support WS2022 in their manifest list. This commit updates the test images
accordingly.

Signed-off-by: Adelina Tuvenie <atuvenie@cloudbasesolutions.com>
2021-08-31 19:42:57 +03:00
Mikko Ylinen
e0f8c04dad cri: Devices ownership from SecurityContext
CRI container runtimes mount devices (set via kubernetes device plugins)
to containers by taking the host user/group IDs (uid/gid) to the
corresponding container device.

This triggers a problem when trying to run those containers with
non-zero (root uid/gid = 0) uid/gid set via runAsUser/runAsGroup:
the container process has no permission to use the device even when
its gid is permissive to non-root users because the container user
does not belong to that group.

It is possible to workaround the problem by manually adding the device
gid(s) to supplementalGroups. However, this is also problematic because
the device gid(s) may have different values depending on the workers'
distro/version in the cluster.

This patch suggests to take RunAsUser/RunAsGroup set via SecurityContext
as the device UID/GID, respectively. The feature must be enabled by
setting device_ownership_from_security_context runtime config value to
true (valid on Linux only).

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2021-08-30 09:30:00 +03:00
Phil Estes
af1a0908d0
Merge pull request #5865 from dcantah/windows-pod-runasusername
Add RunAsUserName functionality for the Windows pod sandbox container
2021-08-25 22:25:14 -04:00
Sebastiaan van Stijn
2ac9968401
replace uses of os/exec with golang.org/x/sys/execabs
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.

This issue is not limited to the go command itself, and can also affect binaries
that use `os.Command`, `os.LookPath`, etc.

From the related blogpost (ttps://blog.golang.org/path-security):

> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing

This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-25 18:11:09 +02:00
Fu Wei
6fa9588531
Merge pull request #5903 from AkihiroSuda/gofmt117
Run `go fmt` with Go 1.17
2021-08-24 23:01:41 +08:00
Daniel Canter
25644b4614 Add RunAsUserName functionality for the Windows Pod Sandbox Container
There was recent changes to cri to bring in a Windows section containing a
security context object to the pod config. Before this there was no way to specify
a user for the pod sandbox container to run as. In addition, the security context
is a field for field mirror of the Windows container version of it, so add the
ability to specify a GMSA credential spec for the pod sandbox container as well.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-08-23 07:35:22 -07:00
payall4u
f8dfbee178 add cri test case
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:59:19 +08:00
payall4u
9a8bf13158 feature: add field LinuxContainerResources.Unified on cri
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-08-23 10:49:31 +08:00
Akihiro Suda
d3aa7ee9f0
Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Jacob Blain Christen
c3609ff4ca cri: filter selinux xattr for image volumes
Exclude the `security.selinux` xattr when copying content from layer
storage for image volumes. This allows for the already correct label
at the target location to be applied to the copied content, thus
enabling containers to write to volumes that they implicitly expect to be
able to write to.

- Fixes containerd/containerd#5090
- See rancher/rke2#690

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2021-08-20 23:47:24 -07:00
Phil Estes
ff2e58d114
Merge pull request #5131 from perithompson/windows-hostnetwork
Add Windows HostProcess Support
2021-08-20 14:29:37 -04:00
Kazuyoshi Kato
4dd5ca70fb script: update golangci-lint from v1.38.0 and v1.36.0 to v1.42.0
golint has been deprecated and replaced by revive since v1.41.0.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-08-19 16:27:16 -07:00
Derek McGowan
8d135d2842
Add support for shim plugins
Refactor shim v2 to load and register plugins.
Update init shim interface to not require task service implementation on
returned service, but register as plugin if it is.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-17 11:06:09 -07:00
Gunju Kim
1224060f89 Allow expanded DNS configuration
Signed-off-by: Gunju Kim <gjkim042@gmail.com>
2021-08-14 06:13:01 +09:00
Peri Thompson
79b369a0bb
Added windows hostProcess cni skip
Signed-off-by: Peri Thompson <perit@vmware.com>
2021-08-11 22:23:49 +01:00
Michael Crosby
218db0f9af
Merge pull request #5835 from dmcgowan/plugin-events-cleanup
Move plugin context events into separate plugin
2021-08-07 21:47:11 -04:00
Derek McGowan
0a0621bb47
Move plugin context events into separate plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:59:20 -07:00
Derek McGowan
6f027e38a8
Remove redundant build tags
Remove build tags which are already implied by the name of the file.
Ensures build tags are used consistently

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:27:46 -07:00
Derek McGowan
caf9e256b7
Merge pull request #5693 from kzys/sigrtmin
Support SIGRTMIN+n signals
2021-07-27 11:58:57 -07:00
Davanum Srinivas
494b940f14
Introduce a new go module - containerd/api for use in standalone clients
In containerd 1.5.x, we introduced support for go modules by adding a
go.mod file in the root directory. This go.mod lists all the things
needed across the whole code base (with the exception of
integration/client which has its own go.mod). So when projects that
need to make calls to containerd API will pull in some code from
containerd/containerd, the `go mod` commands will add all the things
listed in the root go.mod to the projects go.mod file. This causes
some problems as the list of things needed to make a simple API call
is enormous. in effect, making a API call will pull everything that a
typical server needs as well as the root go.mod is all encompassing.
In general if we had smaller things folks could use, that will make it
easier by reducing the number of things that will end up in a consumers
go.mod file.

Now coming to a specific problem, the root containerd go.mod has various
k8s.io/* modules listed. Also kubernetes depends on containerd indirectly
via both moby/moby (working with docker maintainers seperately) and via
google/cadvisor. So when the kubernetes maintainers try to use latest
1.5.x containerd, they will see the kubernetes go.mod ending up depending
on the older version of kubernetes!

So if we can expose just the minimum things needed to make a client API
call then projects like cadvisor can adopt that instead of pulling in
the entire go.mod from containerd. Looking at the existing code in
cadvisor the minimum things needed would be the api/ directory from
containerd. Please see proof of concept here:
github.com/google/cadvisor/pull/2908

To enable that, in this PR, we add a go.mod file in api/ directory. we
split the Protobuild.yaml into two, one for just the things in api/
directory and the rest in the root directory. We adjust various targets
to build things correctly using `protobuild` and also ensure that we
end up with the same generated code as before as well. To ensure we
better take care of the various go.mod/go.sum files, we update the
existing `make vendor` and also add a new `make verify-vendor` that one
can run locally as well in the CI.

Ideally, we would have a `containerd/client` either as a standalone repo
or within `containerd/containerd` as a separate go module. but we will
start here to experiment with a standalone api go module first.

Also there are various follow ups we can do, for example @thaJeztah has
identified two tasks we could do after this PR lands:

github.com/containerd/containerd/pull/5716#discussion_r668821396

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-07-27 07:34:59 -04:00
Kazuyoshi Kato
1d3d08026d Support SIGRTMIN+n signals
systemd uses SIGRTMIN+n signals, but containerd didn't support the signals
since Go's sys/unix doesn't support them.

This change introduces SIGRTMIN+n handling by utilizing moby/sys/signal.

Fixes #5402.

https://www.freedesktop.org/software/systemd/man/systemd.html#Signals

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-07-26 09:36:43 -07:00
Wei Fu
ac75071b49 remove pkg/cri/platforms package
The package is a duplicate of platforms. No need to maintain
pkg/cri/platforms.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-07-10 10:14:27 +08:00
Brian Goff
0a8802df67 Allow WithServices to use custom implementations
Before this change, for several of the services that `WithServices`
handles, only the grpc client is supported.
Now, for instance, one can use an `images.Store` directly instead of
only an `imagesapi.StoreSlient`.

Some of the methods have been renamed to satisfy the difference between
using a grpc `<Foo>Client` vs the main interface.

I did not see a good candidate for TaskService so have left that mostly
unchanged.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-07-09 23:30:40 +00:00
Phil Estes
cf600abecc
Merge pull request #5619 from mikebrow/cri-add-v1-proxy-alpha
[CRI] move up to CRI v1 and support v1alpha in parallel
2021-07-09 14:07:24 -04:00
Mike Brown
d1c1051927 use fu wei's suggeted interface pick for marshaling
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-07 15:45:45 -05:00
Mike Brown
14962dcbd2 add alpha version
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-07-06 11:40:20 -05:00
Mike Brown
a5c417ac06 move up to CRI v1 and support v1alpha in parallel
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-28 09:34:12 -05:00
Dan Williams
dac2543a07 sandbox: send pod UID to CNI plugins as K8S_POD_UID
CNI plugins that need to wait for network state to converge
may want to cancel waiting when a short lived pod is deleted.
However, there is a race between when kubelet asks the runtime
to create the sandbox for the pod, and when the plugin is able
request the pod object from the apiserver. It may be the case
that the plugin receives the new pod, rather than the pod
the sandbox request was initiated for.

Passing the pod UID to the plugin allows the plugin to check
whether the pod it gets from the apiserver is actually the
pod its sandbox request was started for.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-06-22 22:53:30 -05:00
Mike Brown
560e7d4799 fixing some doc links
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-21 18:24:47 -05:00
Kazuyoshi Kato
1bbee573af github.com/golang/protobuf/proto is deprecated
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-06-17 10:28:48 -04:00
Quan Tian
728743eb28 Fix cleanup context of teardownPodNetwork
Similar to other deferred cleanup operations, teardownPodNetwork should
use a different context as the original context may have expired,
otherwise CNI wouldn't been invoked, leading to leak of network
resources, e.g. IP addresses.

Signed-off-by: Quan Tian <qtian@vmware.com>
2021-06-04 19:17:05 +08:00
zounengren
498bb36f67 scrub the stale TODO
Signed-off-by: zounengren <zouyee1989@gmail.com>
2021-06-01 11:22:15 +08:00
Shiming Zhang
79e3452213 update the link
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Shiming Zhang
1acca8bba3 Don't check for apparmor_parser to be present
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-05-20 11:56:29 +08:00
Phil Estes
e47400cbd2
Merge pull request #5100 from adisky/skip-tls-localHost
Skip TLS verification for localhost
2021-05-12 14:56:53 -04:00
Kevin Parsons
b0d3b35b28 windows: Use GetFinalPathNameByHandle for ResolveSymbolicLink
This change splits the definition of pkg/cri/os.ResolveSymbolicLink by
platform (windows/!windows), and switches to an alternate implementation
for Windows. This aims to fix the issue described in containerd/containerd#5405.

The previous implementation which just called filepath.EvalSymlinks has
historically had issues on Windows. One of these issues we were able to
fix in Go, but EvalSymlinks's behavior is not well specified on
Windows, and there could easily be more issues in the future, so it
seems prudent to move to a separate implementation for Windows.

The new implementation uses the Windows GetFinalPathNameByHandle API,
which takes a handle to an open file or directory and some flags, and
returns the "real" name for the object. See comments in the code for
details on the implementation.

I have tested this change with a variety of mounts and everything seems
to work as expected. Functions that make incorrect assumptions on what a
Windows path can look like may have some trouble with the \\?\ path
syntax. For instance EvalSymlinks fails when given a \\?\UNC\ path. For
this reason, the resolvePath implementation modifies the returned path
to translate to the more common form (\\?\UNC\server\share ->
\\server\share).

Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
2021-05-04 11:55:11 -07:00
Mike Brown
c1a35232d8
Merge pull request #5446 from Random-Liu/fix-auth-config
Fix different registry hosts referencing the same auth config.
2021-05-04 06:21:02 -05:00
Lantao Liu
81402e4758 Fix different registry hosts referencing the same auth config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-05-03 17:42:57 -07:00
Aditi Sharma
8014d9fee0 Skip TLS verification for localhost
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-05-03 10:21:54 +05:30
Mike Brown
3dad67eedb
Merge pull request #5203 from tghartland/5008-target-namespace
Support PID NamespaceMode_TARGET
2021-04-21 19:41:38 -05:00
Maksym Pavlenko
cb64dc8250
Merge pull request #5401 from Iceber/use-unbuffered-channel
process: use the unbuffered channel as the done signal
2021-04-21 16:50:32 -07:00
Thomas Hartland
efcb187429 Add unit tests for PID NamespaceMode_TARGET validation
Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 19:59:10 +02:00
Thomas Hartland
b48f27df6b Support PID NamespaceMode_TARGET
This commit adds support for the PID namespace mode TARGET
when generating a container spec.

The container that is created will be sharing its PID namespace
with the target container that was specified by ID in the namespace
options.

Signed-off-by: Thomas Hartland <thomas.george.hartland@cern.ch>
2021-04-21 17:54:17 +02:00
Iceber Gu
909660ea92 process: use the unbuffered channel as the done signal
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-04-21 18:24:18 +08:00
Michael Crosby
a3fe5c84c0
Merge pull request #5383 from wzshiming/clean/process-io
move common code to pkg/process from runtime
2021-04-20 14:40:12 -04:00
Phil Estes
81c4ac202f
Merge pull request #5256 from kolyshkin/seccomp-enabled
pkg/seccomp: simplify and speed up isEnabled
2021-04-19 15:59:28 -04:00
Shiming Zhang
7966a6652a Cleanup code
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-04-19 16:59:45 +08:00
Sebastiaan van Stijn
1c03c377e5
go.mod: github.com/containerd/fifo v1.0.0
full diff: https://github.com/containerd/fifo/compare/115abcc95a1d...v1.0.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-19 09:27:45 +02:00
Kir Kolyshkin
3292ea5862 pkg/seccomp: use sync.Once to speed up IsEnabled
It does not make sense to check if seccomp is supported by the kernel
more than once per runtime, so let's use sync.Once to speed it up.

A quick benchmark (old implementation, before this commit, after):

BenchmarkIsEnabledOld-4           37183            27971 ns/op
BenchmarkIsEnabled-4            1252161              947 ns/op
BenchmarkIsEnabledOnce-4      666274008             2.14 ns/op

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Kir Kolyshkin
00b5c99b1a pkg/seccomp: simplify IsEnabled, update doc
Current implementation of seccomp.IsEnabled (rooted in runc) is not
too good.

First, it parses the whole /proc/self/status, adding each key: value
pair into the map (lots of allocations and future work for garbage
collector), when using a single key from that map.

Second, the presence of "Seccomp" key in /proc/self/status merely means
that kernel option CONFIG_SECCOMP is set, but there is a need to _also_
check for CONFIG_SECCOMP_FILTER (the code for which exists but never
executed in case /proc/self/status has Seccomp key).

Replace all this with a single call to prctl; see the long comment in
the code for details.

While at it, improve the IsEnabled documentation.

NOTE historically, parsing /proc/self/status was added after a concern
was raised in https://github.com/opencontainers/runc/pull/471 that
prctl(PR_GET_SECCOMP, ...) can result in the calling process being
killed with SIGKILL. This is a valid concern, so the new code here
does not use PR_GET_SECCOMP at all.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2021-04-16 15:52:35 -07:00
Phil Estes
4f18131239
Merge pull request #5286 from payall4u/optimize-cri-redirect-logs
cri: Reduce the cpu usage of  the function redirectLogs in cri
2021-04-14 21:33:05 -04:00
Sebastiaan van Stijn
864a3322b3
go.mod: github.com/containerd/go-cni v1.0.2
full diff: https://github.com/containerd/go-cni/compare/v1.0.1...v1.0.2

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-04-14 09:09:18 +02:00
Mike Brown
8a04bd0521 address recent runtimes config confusion
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 15:33:38 -05:00
Mike Brown
e96d2a5d90 Revert "remove two very old no longer used runtime options"
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-04-12 10:16:01 -05:00
Fu, Wei
7e3fd8da24
Merge pull request #5298 from jsturtevant/issue-5297
Support multi-arch images for Windows via ctr
2021-04-12 13:52:14 +08:00
payall4u
4bc8f692fc optimize cri redirect logs
Signed-off-by: Zhiyu Li <payall4u@qq.com>
2021-04-09 11:45:53 +08:00
Fu, Wei
d064140369
Merge pull request #5302 from mikebrow/toml-cri-defaults
shows our runc.v2 default options
2021-04-09 11:11:25 +08:00