Commit Graph

693 Commits

Author SHA1 Message Date
Danny Canter
0bc9633414 runtime/v2: net.Dial gRPC shim sockets before trying grpc
This is mostly to workaround an issue with gRPC based shims after containerd
restart. If a shim dies while containerd is also down/restarting, on reboot
grpc.DialContext with our current set of DialOptions will make us wait for
100 seconds per shim even if the socket no longer exists or has no listener.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-11-30 19:37:43 -08:00
Maksym Pavlenko
5664c9a61a
Merge pull request #9368 from thaJeztah/shim_logs
services/server, runtime/v2/shim: use structured log for plugin ID
2023-11-15 23:21:26 +00:00
Sebastiaan van Stijn
1a1bd6d0a7
runtime/v2/shim: use structured log for plugin ID
These logs were already using structured logs, so include "id" as a field,
which also prevents the id being quoted (and escaped when printing);

    time="2023-11-15T11:30:23.745574884Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
    time="2023-11-15T11:30:23.745612425Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.pause\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
    time="2023-11-15T11:30:23.745620884Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
    time="2023-11-15T11:30:23.745625925Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1

Also updated some changed `WithError().WithField()` calls, to prevent some
overhead.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-11-15 13:23:53 +01:00
Sebastiaan van Stijn
71fd85f5ed
runtime/v2/shim: run(): remove unused "name" argument
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-11-15 13:23:53 +01:00
Sebastiaan van Stijn
0a59c33be5
runtime/v2/shim: rename var that shadowed package var
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-11-15 13:23:53 +01:00
Maksym Pavlenko
7d65a45639
Move runc shim implementation to cmd
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-11-14 10:13:32 -08:00
Avi Deitcher
76049170b8 document runtime and shim configuration and selection
Signed-off-by: Avi Deitcher <avi@deitcher.net>
2023-11-06 08:59:36 +02:00
Sebastiaan van Stijn
2af6db672e
switch back from golang.org/x/sys/execabs to os/exec (go1.19)
This is effectively a revert of 2ac9968401, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.

from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path

> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-11-02 21:15:40 +01:00
Derek McGowan
9db21401c4
Switch to github.com/containerd/plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-11-01 23:01:42 -07:00
Derek McGowan
261e01c2ac
Move client to subpackage
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-11-01 10:37:00 -07:00
Derek McGowan
5fdf55e493
Update go module to github.com/containerd/containerd/v2
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-29 20:52:21 -07:00
Maksym Pavlenko
f515cd5c55
Reorder fields when writing bootstrap params
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:06 -07:00
Maksym Pavlenko
3d53fbe858
Fix CRI integration tests
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:05 -07:00
Maksym Pavlenko
f76eaf5a6b
Fix 'not a directory' error when restoring bootstrap.json
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:05 -07:00
Maksym Pavlenko
cf75cfa32c
Add more logs around shim restore
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:04 -07:00
Maksym Pavlenko
8061cb0237
Save bootstrap.json instead of address file
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:03 -07:00
Maksym Pavlenko
e03bf32b86
Switch runc to v3
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:03 -07:00
Maksym Pavlenko
7a2d801d62
Expose shim instance version
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:02 -07:00
Maksym Pavlenko
f66c46806a
Bridge task service v2
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:29:01 -07:00
Maksym Pavlenko
daaf67662f
Switch runc shim to task v3
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-19 12:28:59 -07:00
Abel Feng
7bca70c0c3 sandbox: do not call Connect when loadShim
The ShimManager.Start() will call loadShim() to get the existing shim if SandboxID
is specified for a container, but shimTask.PID() is called in loadShim,
which will call Connect() of Task API with the ID of a task that is not
created yet(containerd is getting the shim and Task API address to call
Create, so the task is not created yet).
In this commit we change the logic of loadShim() to get the shim without calling
Connect() of the not created container ID.

Signed-off-by: Abel Feng <fshb1988@gmail.com>
2023-10-16 21:17:50 +08:00
Akihiro Suda
9fcc3beb59
Merge pull request #9230 from mxpv/exit
Exit shim when shutdown manager is done
2023-10-13 14:51:20 +09:00
Derek McGowan
7b2a918213
Generalize the plugin package
Remove containerd specific parts of the plugin package to prepare its
move out of the main repository. Separate the plugin registration
singleton into a separate package.

Separating out the plugin package and registration makes it easier to
implement external plugins without creating a dependency loop.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-12 21:22:32 -07:00
Derek McGowan
a80606bc2d
Move plugin type definitions to containerd plugins package
The plugins packages defines the plugins used by containerd.
Move all the types and properties to this package.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-12 20:52:56 -07:00
Maksym Pavlenko
2486c12987 Exit shim when shutdown manager is done
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-10-12 19:53:05 -07:00
Chen Yiyang
68dd47ef70 containerd-shim-runc-v2: avoid potential deadlock in create handler
After pr #8617, create handler of containerd-shim-runc-v2 will
call handleStarted() to record the init process and handle its exit.
Init process wouldn't quit so early in normal circumstances. But if
this screnario occurs, handleStarted() will call
handleProcessExit(), which will cause deadlock because create() had
acquired s.mu, and handleProcessExit() will try to lock it again.

So, I added a parameter muLocked to handleStarted to indicate whether
or not s.mu is currently locked, and thus deciding whether or not to
lock it when calling handleProcessExit.

Fix: #9103
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2023-10-02 17:44:41 +00:00
Chen Yiyang
6604ff6c55 containerd-shim-runc-v2: remove unnecessary s.getContainer()
Previous code has already called `getContainer()`, just pass it into
`s.getContainerPids` to reduce unnecessary lock and map lookup.

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
2023-10-02 17:44:41 +00:00
Derek McGowan
508aa3a1ef
Move to use github.com/containerd/log
Add github.com/containerd/log to go.mod

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-09-22 07:53:23 -07:00
Jin Dong
fc45365fa1 Remove most logrus
Signed-off-by: Jin Dong <jin.dong@databricks.com>
2023-08-26 14:31:53 -04:00
Jin Dong
cd8c8ae4bc Remove hashicorp/go-multierror
Signed-off-by: Jin Dong <jin.dong@databricks.com>
2023-08-20 17:59:45 -07:00
Wei Fu
601699a184 integration: add ShouldRetryShutdown case based on #7496
Since the moby/moby can't handle duplicate exit event well, it's hard
for containerd to retry shutdown if there is error, like context
canceled.

In order to prevent from regression like #4769, I add skipped
integration case as TODO item and we should rethink about how to handle
the task/shim lifecycle.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-11 17:43:51 +08:00
Maksym Pavlenko
8dcc06d14a
Merge pull request #8747 from Iceber/shim_ttrpc_service
shim: change ttrpcService and ttrpcServerOptioner to exported interfaces
2023-07-18 17:12:22 -07:00
Akihiro Suda
98f27e1d9c
Revert "Add support for mounts on Darwin"
This reverts commit 2799b28e61.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-19 00:22:20 +09:00
Marat Radchenko
2799b28e61 Add support for mounts on Darwin
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-17 23:27:04 +03:00
Phil Estes
34b1653e95
Merge pull request #8780 from slonopotamus/uncopypaste-read-spec
Uncopypaste parsing of OCI Bundle spec file
2023-07-11 09:53:00 -04:00
Marat Radchenko
9e34b8b441 Uncopypaste parsing of OCI Bundle spec file
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-11 14:41:15 +03:00
Phil Estes
466d884518
Merge pull request #8777 from yankay/fix-restart-with-tty
Fix the automatically restart issue when using LogURI and Terminal together
2023-07-06 10:51:11 -04:00
Kay Yan
073de93086 Fix the auto restart fail when using LogURI and TTY together
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2023-07-06 04:58:56 +00:00
Iceber Gu
00e5ae2118 shim: change ttrpcService and ttrpcServerOptioner to exported interfaces
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-07-06 00:36:43 +08:00
Marat Radchenko
51a1e7f0b2 Fix example shim to actually use its task service
In commit 4b35c3829d, example shim erroneously started to depend on runc, fix that back.

Also, build example shim on all supported platforms to prevent such situations in the future.

Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-03 20:40:20 +03:00
Marat Radchenko
0607e73263 Move GetTopic function out of runc shim
Every shim implementation needs to select a correct publisher topic when posting events, so move it out of Linux-only runc code to the place where other shims can also use it

Otherwise, shims have to copy-paste this code. For example, see runj: 8158e558a3/containerd/shim.go (L144-L172)

Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-06-30 10:29:21 +03:00
Jin Dong
0a92661e69 Add a platform.ParseAll helper
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2023-06-26 20:34:37 +00:00
Kazuyoshi Kato
ded713010c
Merge pull request #8617 from corhere/reduce-exec-lock-contention
runtime/v2/runc: handle early exits w/o big locks
2023-06-14 15:55:07 -07:00
Derek McGowan
0ae64ebd4e
Merge pull request #8680 from dcantah/sb-metrics
Sandbox: Add SandboxMetrics rpc
2023-06-14 18:11:18 +00:00
Jin Dong
9e09bfb590 Use RWMutex in NSMap and reduce lock area
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2023-06-14 17:50:54 +00:00
Danny Canter
d56722ef2a Sandbox: Add SandboxMetrics rpc
To gather metrics/stats about a specific sandbox instance, it'd be nice to
have a dedicated rpc for this. Due to the same "what kind of stats are going
to be returned" dilemma exists for sandboxes as well, I've re-used the metrics
type we have as the data field is just an `any`, leaving the metrics returned
entirely up to the shim author. For CRI usecases this will just be cgroup and
windows stats as that's all that's supported right now.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-12 03:30:48 -07:00
Derek McGowan
98b7dfb870
Merge pull request #8673 from thaJeztah/no_any
avoid "any" as variable name
2023-06-10 20:44:30 -07:00
Sebastiaan van Stijn
4bb709c018
avoid "any" as variable name
Avoid shadowing / confusion with Go's "any" built-in type.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-10 13:49:06 +02:00
Sebastiaan van Stijn
577696f608
replace some basic uses of fmt.Sprintf()
Really tiny gains here, and doesn't significantly impact readability:

    BenchmarkSprintf
    BenchmarkSprintf-10    11528700     91.59 ns/op   32 B/op  1 allocs/op
    BenchmarkConcat
    BenchmarkConcat-10    100000000     11.76 ns/op    0 B/op  0 allocs/op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-06-10 13:24:43 +02:00
Cory Snider
5cd6210ad0 runtime/v2/runc: handle early exits w/o big locks
eventSendMu is causing severe lock contention when multiple processes
start and exit concurrently. Replace it with a different scheme for
maintaining causality w.r.t. start and exit events for a process which
does not rely on big locks for synchronization.

Keep track of all processes for which a Task(Exec)Start event has been
published and have not yet exited in a map, keyed by their PID.
Processing exits then is as simple as looking up which process
corresponds to the PID. If there are no started processes known with
that PID, the PID must either belong to a process which was started by
s.Start() and before the s.Start() call has added the process to the map
of running processes, or a reparented process which we don't care about.
Handle the former case by having each s.Start() call subscribe to exit
events before starting the process. It checks if the PID has exited in
the time between it starting the process and publishing the TaskStart
event, handling the exit if it has. Exit events for reparented processes
received when no s.Start() calls are in flight are immediately
discarded, and events received during an s.Start() call are discarded
when the s.Start() call returns.

Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-06-09 16:53:43 -04:00