Move runtime to core/runtime

Signed-off-by: Derek McGowan <derek@mcg.dev>
This commit is contained in:
Derek McGowan
2024-01-17 09:58:04 -08:00
parent df9b0a0675
commit dbc74db6a1
93 changed files with 63 additions and 63 deletions

516
core/runtime/v2/README.md Normal file
View File

@@ -0,0 +1,516 @@
# Runtime v2
Runtime v2 introduces a first class shim API for runtime authors to integrate with containerd.
containerd, the daemon, does not directly launch containers. Instead, it acts as a higher-level manager
or hub, coordinating the activities of containers and content, while lower-level
programs, called "runtimes", actually start, stop and manage the containers,
either individually or in groups, e.g. Kubernetes pods.
For example, containerd will retrieve container image config and its content as layers, use the snapshotter to lay it out on disk, set up
the container's rootfs and config, and then launch a runtime that will create/start/stop the container.
This document describes the major components of the v2 runtime integration model, how the components interact
with containerd and the v2 runtime, and how to use and integrate different v2 runtimes.
To simplify the interaction, runtime v2 introduced a first class v2 API for runtime authors to integrate with containerd,
replacing the v1 API.
The v2 API is minimal and scoped to the execution lifecycle of a container.
This document is split into the following sections:
* [architecture](#architecture) - the major components, their purposes and relationships
* [usage](#usage) - how to invoke specific runtimes, and how to configure them
* [authoring](#shim-authoring) - how to author a v2 runtime
## Architecture
### containerd-runtime communication
containerd expects a runtime to implement several container control features, such as create, start and stop.
The high-level flow is as follows:
1. client requests from containerd to create a container
1. containerd lays out the container's filesystem, and creates the necessary config information
1. containerd invokes the runtime over an API to create/start/stop the container
However, containerd itself does not actually directly invoke the runtime to start the container.
Instead it expects to invoke the runtime, which will expose a socket - Unix-domain on Unix-like systems, named pipe on Windows -
and listen for container commands via [ttRPC](https://github.com/containerd/ttrpc) over that
socket.
The runtime is expected to process those operations. How it does so is entirely within the scope of the runtime implementation.
Two common patterns are:
* a single runtime binary that both listens on the socket and creates/starts/stops the container
* a separate shim binary that listens on the socket, and invokes a separate runtime engine that creates/starts/stops the container
The separate "shim+engine" pattern is used because it makes it easier to integrate distinct runtimes implementing a specific runtime
engine spec, such as the [OCI runtime spec](https://github.com/opencontainers/runtime-spec).
The ttRPC protocol can be handled via one runtime shim, while distinct runtime engine implementations can
be used, as long as they implement the OCI runtime spec.
The most commonly used runtime _engine_ is [runc](https://github.com/opencontainers/runc), which implements the
[OCI runtime spec](https://github.com/opencontainers/runtime-spec). As this is a runtime _engine_, it is not
invoked directly by containerd; instead, it is invoked by a shim, which listens on the socket and invokes the runtime engine.
#### shim+engine Architecture
##### runtime shim
The runtime shim is what containerd actually invokes. It has minimal options on start beyond
being provided the communications port for containerd and some configuration information.
The runtime shim listens on the socket for ttRPC commands from containerd, and then invokes a separate program,
the runtime engine, via `fork`/`exec` to run the container. For example, the `io.containerd.runc.v2` shim invokes
an OCI compliant runtime engine such as `runc`.
containerd passes options to the shim over the ttRPC connection, which may include the runtime engine binary
to invoke. These are the `options` for the [`CreateTaskRequest`](https://github.com/containerd/containerd/blob/main/runtime/v2/README.md#container-level-shim-configuration).
For example, the `io.containerd.runc.v2` shim supports including the path to the runtime engine binary.
##### runtime engine
The runtime engine itself is what actually starts and stops the container.
For example, in the case of [runc](https://github.com/opencontainers/runc), the containerd project provides the shim
as the executable `containerd-shim-runc-v2`. This is invoked by containerd and starts the ttRPC listener.
The shim then invokes the actual `runc` binary, passing it the container configuration, and the `runc` binary
creates/starts/stops the container typically via `libcontainer`->system apis.
#### shim+engine Relationship
Since each shim instance runs as a daemon that communicates with containerd while parenting the containers it launches,
one shim can serve multiple containers and invocations. For example,
a single `containerd-shim-runc-v2` communicating with one containerd can
invoke ten distinct containers.
It is even possible for one shim to serve multiple containers, each with its own runtime engine,
since, as described above, the runtime engine binary is passed as one of the options in `CreateTaskRequest`.
containerd does not know or care about whether the shim to container relationship is one-to-one,
or one-to-many. It is entirely up to the shim to decide. For example, the `io.containerd.runc.v2` shim
automatically groups based on the presence of
[labels](https://github.com/containerd/containerd/blob/b30e0163ac36c1a193604e5eca031053d62019c5/runtime/v2/runc/manager/manager_linux.go#L54-L60). In practice, this means that containers launched by Kubernetes, that are part of the same Kubernetes pod, are handled by a single
shim, grouping on the `io.kubernetes.cri.sandbox-id` label set by the CRI plugin.
The flow, then, is as follows:
1. containerd receives a request to create a container
1. containerd lays out the container's filesystem, and creates the necessary [container config](https://github.com/opencontainers/image-spec/blob/main/config.md) information
1. containerd invokes the shim, including container configuration, which uses that information to decide whether to launch a new socket listener (1:1 shim to container) or use an existing one (1:many)
* if existing, return the address of the existing socket and exit
* if new, the shim:
1. creates a new process to listen on a socket for ttRPC commands from containerd
1. returns the address to that socket to containerd
1. exits
1. containerd sends the shim a command to start the container
1. The shim invokes `runc` to create/start/stop the container
An excellent flow diagram is available later in this document under [Flow](#Flow).
## Usage
### Invoking Runtimes
A runtime - single instance or shim+engine - and its options, can be selected when creating a container via one of the exposed
containerd services (containerd client, CRI API,...), or via a client that calls into the containerd provided services.
Examples of containerd clients include `ctr`, `nerdctl`, kubernetes, docker/moby, rancher and others.
The runtime can also be changed via a container update.
The runtime name that is passed is a string that is used to identify the runtime to containerd. In the case of separate shim+engine,
this will be the runtime _shim_. Either way, this is the binary that containerd executes and expects to start the ttRPC listener.
The runtime name can be either a URI-like string, or, beginning with containerd 1.6.0, the actual path to the executable.
1. If the runtime name is a path, use that as the actual path to the runtime to invoke.
1. If the runtime name is URI-like, convert it to a binary name using the logic below.
If the runtime name is URI-like, containerd will convert it to a binary name using the following logic:
1. Takes the last 2 dot-separated components, e.g. `runc.v2`
1. Replaces all `.` with `-`, e.g. `runc-v2`
1. Prepends `containerd-shim-`
For example, if the runtime name is `io.containerd.runc.v2`, containerd will invoke the shim as `containerd-shim-runc-v2`. It expects to
find the binary in its normal `PATH`.
containerd keeps the `containerd-shim-*` prefix so that users can `ps aux | grep containerd-shim` to see running shims on their system.
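As a rough illustration of that conversion (a standalone sketch, not containerd's actual implementation), the mapping from a URI-like runtime name to the shim binary name looks like this:
```go
package main

import (
	"fmt"
	"strings"
)

// shimBinaryName converts a URI-like runtime name such as
// "io.containerd.runc.v2" into the shim binary name looked up on PATH,
// e.g. "containerd-shim-runc-v2".
func shimBinaryName(runtimeName string) string {
	parts := strings.Split(runtimeName, ".")
	// keep only the last two dot-separated components, e.g. "runc", "v2"
	if len(parts) > 2 {
		parts = parts[len(parts)-2:]
	}
	return "containerd-shim-" + strings.Join(parts, "-")
}

func main() {
	fmt.Println(shimBinaryName("io.containerd.runc.v2"))   // containerd-shim-runc-v2
	fmt.Println(shimBinaryName("io.foo.bar.runc2.v2.baz")) // containerd-shim-v2-baz
}
```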
For example:
```bash
$ ctr --runtime io.containerd.runc.v2 run --rm docker.io/library/alpine:latest alpine
```
Will invoke `containerd-shim-runc-v2`.
You can test this by trying another name:
```bash
$ ctr run --runtime=io.foo.bar.runc2.v2.baz --rm docker.io/library/hello-world:latest hello-world /hello
ctr: failed to start shim: failed to resolve runtime path: runtime "io.foo.bar.runc2.v2.baz" binary not installed "containerd-shim-v2-baz": file does not exist: unknown
```
It received `io.foo.bar.runc2.v2.baz` and looked for `containerd-shim-v2-baz`.
You can also override the default runtime engine configured for the shim by passing it the `--runc-binary`
option. For example:
```bash
ctr --runtime io.containerd.runc.v2 --runc-binary /usr/local/bin/runc-custom run --rm docker.io/library/alpine:latest alpine
```
### Configuring Runtimes
You can configure one or more runtimes in containerd's `config.toml` configuration file, by modifying the
section:
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
```
See [config.toml man page](../../docs/man/containerd-config.toml.5.md) for more details and an example.
These "named runtimes" in the configuration file are used solely when invoked via CRI, which has a
[`runtime_handler` field](https://github.com/kubernetes/cri-api/blob/de5f1318aede866435308f39cb432618a15f104e/pkg/apis/runtime/v1/api.proto#L476).
## Shim Authoring
This section is dedicated to runtime authors wishing to build a shim.
It details how the API works and the different considerations to keep in mind when building a shim.
### Commands
Container information is provided to a shim in two ways:
via the OCI Runtime Bundle on disk, and via the `Create` rpc request.
#### `start`
Each shim MUST implement a `start` subcommand.
This command will launch new shims.
The start command MUST accept the following flags:
* `-namespace` the namespace for the container
* `-address` the address of containerd's main grpc socket
* `-publish-binary` the binary path to publish events back to containerd
* `-id` the id of the container
The start command, as well as all binary calls to the shim, has the bundle for the container set as the `cwd`.
The start command may have the following containerd specific environment variables set:
* `TTRPC_ADDRESS` the address of containerd's ttrpc API socket
* `GRPC_ADDRESS` the address of containerd's grpc API socket (1.7+)
* `MAX_SHIM_VERSION` the maximum shim version supported by the client, always `2` for shim v2 (1.7+)
* `SCHED_CORE` enable core scheduling if available (1.6+)
* `NAMESPACE` an optional namespace the shim is operating in or inheriting (1.7+)
The start command MUST write to stdout either the ttrpc address that the shim is serving its API on, or _(experimental)_
a JSON structure in the following format (where protocol can be either "ttrpc" or "grpc"):
```json
{
"version": 2,
"address": "/address/of/task/service",
"protocol": "grpc"
}
```
The address will be used by containerd to issue API requests for container operations.
The start command can either start a new shim or return an address to an existing shim based on the shim's logic.
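This handshake is normally handled for shim authors by the `shim.Run` helper in the shim package. Purely to illustrate the contract above, a hypothetical standalone shim could emit the bootstrap structure itself roughly as follows (the socket address is an assumption for the example):
```go
package main

import (
	"encoding/json"
	"os"
)

// bootstrapParams mirrors the JSON structure a `start` subcommand may write
// to stdout. This is a sketch; the containerd shim package normally produces
// this output itself.
type bootstrapParams struct {
	Version  int    `json:"version"`
	Address  string `json:"address"`
	Protocol string `json:"protocol"` // "ttrpc" or "grpc"
}

func main() {
	params := bootstrapParams{
		Version:  2,
		Address:  "unix:///run/example-shim/task.sock", // hypothetical socket address
		Protocol: "ttrpc",
	}
	// containerd reads this structure from the start command's stdout.
	if err := json.NewEncoder(os.Stdout).Encode(params); err != nil {
		os.Exit(1)
	}
}
```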
#### `delete`
Each shim MUST implement a `delete` subcommand.
This command allows containerd to delete any container resources created, mounted, and/or run by a shim when containerd can no longer communicate over rpc.
This happens if a shim is SIGKILL'd with a running container.
These resources will need to be cleaned up when containerd loses the connection to a shim.
This is also used when containerd boots and reconnects to shims.
If a bundle is still on disk but containerd cannot connect to a shim, the delete command is invoked.
The delete command MUST accept the following flags:
* `-namespace` the namespace for the container
* `-address` the address of containerd's main socket
* `-publish-binary` the binary path to publish events back to containerd
* `-id` the id of the container
* `-bundle` the path to the bundle to delete. On non-Windows and non-FreeBSD platforms this will match `cwd`
The delete command will be executed in the container's bundle as its `cwd`, except on Windows and FreeBSD platforms.
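containerd parses the `delete` command's stdout as a serialized task `DeleteResponse` (see `binary.go` later in this commit). As a sketch only, assuming the v2 task API types from this repository, a shim's `delete` handler could report exit information back to containerd like this after cleaning up its resources:
```go
package main

import (
	"os"
	"time"

	task "github.com/containerd/containerd/v2/api/runtime/task/v2"
	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/timestamppb"
)

// reportDelete writes a DeleteResponse to stdout so containerd can record
// the exit status of the cleaned-up task. Resource cleanup itself is omitted.
func reportDelete(pid, exitStatus uint32) error {
	resp := &task.DeleteResponse{
		Pid:        pid,
		ExitStatus: exitStatus,
		ExitedAt:   timestamppb.New(time.Now()),
	}
	data, err := proto.Marshal(resp)
	if err != nil {
		return err
	}
	_, err = os.Stdout.Write(data)
	return err
}

func main() {
	// values are placeholders for illustration only
	if err := reportDelete(0, 137); err != nil {
		os.Exit(1)
	}
}
```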
### Host Level Shim Configuration
containerd does not provide any host level configuration for shims via the API.
If a shim needs host-level configuration from the user that applies across all of its instances, a shim-specific configuration file can be set up.
### Container Level Shim Configuration
On the create request, there is a generic `*protobuf.Any` that allows a user to specify container level configuration for the shim.
```proto
message CreateTaskRequest {
string id = 1;
...
google.protobuf.Any options = 10;
}
```
A shim author can create their own protobuf message for configuration, and clients can import it and provide this information as needed.
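As one possible approach (a sketch, not how the `io.containerd.runc.v2` shim does it; that shim defines a dedicated protobuf message), a shim could register a small Go type with `typeurl` and decode the `options` field from the create request. The `exampleOpts` type and its fields below are hypothetical:
```go
package exampleshim

import (
	"fmt"

	taskAPI "github.com/containerd/containerd/v2/api/runtime/task/v2"
	"github.com/containerd/typeurl/v2"
)

// exampleOpts is a hypothetical per-container configuration for a shim.
type exampleOpts struct {
	LogLevel  string `json:"log_level"`
	EnableFoo bool   `json:"enable_foo"`
}

func init() {
	// Register the type so typeurl can resolve the Any's type URL.
	typeurl.Register(&exampleOpts{}, "example.dev", "ExampleOpts")
}

// optsFromCreate decodes client-provided options from a Create request,
// falling back to defaults when no options were sent.
func optsFromCreate(r *taskAPI.CreateTaskRequest) (*exampleOpts, error) {
	opts := &exampleOpts{LogLevel: "info"}
	if r.Options == nil || len(r.Options.GetValue()) == 0 {
		return opts, nil
	}
	v, err := typeurl.UnmarshalAny(r.Options)
	if err != nil {
		return nil, err
	}
	decoded, ok := v.(*exampleOpts)
	if !ok {
		return nil, fmt.Errorf("unexpected options type %T", v)
	}
	return decoded, nil
}
```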
### I/O
I/O for a container is provided by the client to the shim via fifo on Linux, named pipes on Windows, or log files on disk.
The paths to these files are provided on the `Create` rpc for the initial creation and on the `Exec` rpc for additional processes.
```proto
message CreateTaskRequest {
string id = 1;
bool terminal = 4;
string stdin = 5;
string stdout = 6;
string stderr = 7;
}
```
```proto
message ExecProcessRequest {
string id = 1;
string exec_id = 2;
bool terminal = 3;
string stdin = 4;
string stdout = 5;
string stderr = 6;
}
```
Containers that are to be launched with an interactive terminal will have the `terminal` field set to `true`; data is still copied over the files (fifos, pipes) in the same way as for non-interactive containers.
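On Linux, shims commonly open these paths with the [fifo](https://github.com/containerd/fifo) package and copy data between them and the container's process streams. A minimal sketch of forwarding a container's stdout to the client-provided fifo (the reader is hypothetical; real shims do considerably more I/O plumbing):
```go
package exampleshim

import (
	"context"
	"io"
	"sync"
	"syscall"

	"github.com/containerd/fifo"
)

// forwardStdout copies a container's stdout stream into the fifo path from
// CreateTaskRequest.Stdout. stdoutFromContainer is whatever reader the shim
// wired to the container process's stdout (hypothetical here).
func forwardStdout(ctx context.Context, wg *sync.WaitGroup, stdoutPath string, stdoutFromContainer io.Reader) error {
	if stdoutPath == "" {
		// the client did not ask for stdout
		return nil
	}
	f, err := fifo.OpenFifo(ctx, stdoutPath, syscall.O_WRONLY, 0)
	if err != nil {
		return err
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer f.Close()
		// copy until the container's stdout is closed
		io.Copy(f, stdoutFromContainer)
	}()
	return nil
}
```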
### Root Filesystems
The root filesystem for a container is provided on the `Create` rpc.
Shims are responsible for managing the lifecycle of the filesystem mount during the lifecycle of a container.
```proto
message CreateTaskRequest {
string id = 1;
string bundle = 2;
repeated containerd.types.Mount rootfs = 3;
...
}
```
The mount protobuf message is:
```proto
message Mount {
// Type defines the nature of the mount.
string type = 1;
// Source specifies the name of the mount. Depending on mount type, this
// may be a volume name or a host path, or even ignored.
string source = 2;
// Target path in container
string target = 3;
// Options specifies zero or more fstab style mount options.
repeated string options = 4;
}
```
Shims are responsible for mounting the filesystem into the `rootfs/` directory of the bundle.
Shims are also responsible for unmounting the filesystem.
During a `delete` binary call, the shim MUST ensure that the filesystem is also unmounted.
Filesystems are provided by the containerd snapshotters.
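A sketch of how a shim might apply those mounts on Linux using this repository's `core/mount` package (error handling and rollback of partially applied mounts are omitted; `mountRootfs` is a hypothetical helper):
```go
package exampleshim

import (
	"path/filepath"

	"github.com/containerd/containerd/v2/api/types"
	"github.com/containerd/containerd/v2/core/mount"
)

// mountRootfs applies the rootfs mounts from CreateTaskRequest.Rootfs onto
// the bundle's rootfs/ directory. The shim must undo this during delete.
func mountRootfs(bundlePath string, rootfs []*types.Mount) error {
	target := filepath.Join(bundlePath, "rootfs")
	mounts := make([]mount.Mount, 0, len(rootfs))
	for _, m := range rootfs {
		mounts = append(mounts, mount.Mount{
			Type:    m.Type,
			Source:  m.Source,
			Options: m.Options,
		})
	}
	// mount.All applies the mounts in order onto target.
	return mount.All(mounts, target)
}
```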
### Events
Runtime v2 supports an async event model.
In order for an upstream caller (such as Docker) to receive these events in the correct order, a Runtime v2 shim MUST implement the events below whose compliance column is `MUST`.
This avoids race conditions between the shim and the shim client where, for example, a call to `Start` could signal a `TaskExitEventTopic` before the results of the `Start` call are even returned.
With these guarantees, a Runtime v2 shim is required to have published the async event `TaskStartEventTopic` before it may publish the `TaskExitEventTopic`; a sketch of publishing these events in the required order follows the tables below.
#### Tasks
| Topic | Compliance | Description |
| ----- | ---------- | ----------- |
| `runtime.TaskCreateEventTopic` | MUST | When a task is successfully created |
| `runtime.TaskStartEventTopic` | MUST (follow `TaskCreateEventTopic`) | When a task is successfully started |
| `runtime.TaskExitEventTopic` | MUST (follow `TaskStartEventTopic`) | When a task exits, expectedly or unexpectedly |
| `runtime.TaskDeleteEventTopic` | MUST (follow `TaskExitEventTopic` or `TaskCreateEventTopic` if never started) | When a task is removed from a shim |
| `runtime.TaskPausedEventTopic` | SHOULD | When a task is successfully paused |
| `runtime.TaskResumedEventTopic` | SHOULD (follow `TaskPausedEventTopic`) | When a task is successfully resumed |
| `runtime.TaskCheckpointedEventTopic` | SHOULD | When a task is checkpointed |
| `runtime.TaskOOMEventTopic` | SHOULD | If the shim collects Out of Memory events |
#### Execs
| Topic | Compliance | Description |
| ----- | ---------- | ----------- |
| `runtime.TaskExecAddedEventTopic` | MUST (follow `TaskCreateEventTopic` ) | When an exec is successfully added |
| `runtime.TaskExecStartedEventTopic` | MUST (follow `TaskExecAddedEventTopic`) | When an exec is successfully started |
| `runtime.TaskExitEventTopic` | MUST (follow `TaskExecStartedEventTopic`) | When an exec (other than the init exec) exits, expectedly or unexpectedly |
| `runtime.TaskDeleteEventTopic` | SHOULD (follow `TaskExitEventTopic` or `TaskExecAddedEventTopic` if never started) | When an exec is removed from a shim |
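
The sketch below shows that ordering, assuming access to the `shim.Publisher` that the shim framework hands to the TTRPC plugin (see the example shim later in this commit); the helper and its parameters are hypothetical, while the topic constants come from this repository's `core/runtime` package:
```go
package exampleshim

import (
	"context"
	"time"

	"github.com/containerd/containerd/v2/api/events"
	"github.com/containerd/containerd/v2/core/runtime"
	"github.com/containerd/containerd/v2/core/runtime/v2/shim"
	"google.golang.org/protobuf/types/known/timestamppb"
)

// publishStartThenExit illustrates the ordering guarantee: TaskStartEventTopic
// must be published before TaskExitEventTopic for the same task.
func publishStartThenExit(ctx context.Context, publisher shim.Publisher, id string, pid, exitStatus uint32) error {
	if err := publisher.Publish(ctx, runtime.TaskStartEventTopic, &events.TaskStart{
		ContainerID: id,
		Pid:         pid,
	}); err != nil {
		return err
	}
	// ... later, once the task has actually exited ...
	return publisher.Publish(ctx, runtime.TaskExitEventTopic, &events.TaskExit{
		ContainerID: id,
		ID:          id,
		Pid:         pid,
		ExitStatus:  exitStatus,
		ExitedAt:    timestamppb.New(time.Now()),
	})
}
```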
### Flow
The following sequence diagram shows the flow of actions when the `ctr run` command is executed.
```mermaid
sequenceDiagram
participant ctr
participant containerd
participant shim
autonumber
ctr->>containerd: Create container
Note right of containerd: Save container metadata
containerd-->>ctr: Container ID
ctr->>containerd: Create task
%% Start shim
containerd-->shim: Prepare bundle
containerd->>shim: Execute binary: containerd-shim-runc-v2 start
shim->shim: Start TTRPC server
shim-->>containerd: Respond with address: unix://containerd/container.sock
containerd-->>shim: Create TTRPC client
%% Schedule task
Note right of containerd: Schedule new task
containerd->>shim: TaskService.CreateTaskRequest
shim-->>containerd: Task PID
containerd-->>ctr: Task ID
%% Start task
ctr->>containerd: Start task
containerd->>shim: TaskService.StartRequest
shim-->>containerd: OK
%% Wait task
ctr->>containerd: Wait task
containerd->>shim: TaskService.WaitRequest
Note right of shim: Block until task exits
shim-->>containerd: Exit status
containerd-->>ctr: OK
Note over ctr,shim: Other task requests (Kill, Pause, Resume, CloseIO, Exec, etc)
%% Kill signal
opt Kill task
ctr->>containerd: Kill task
containerd->>shim: TaskService.KillRequest
shim-->>containerd: OK
containerd-->>ctr: OK
end
%% Delete task
ctr->>containerd: Task Delete
containerd->>shim: TaskService.DeleteRequest
shim-->>containerd: Exit information
containerd->>shim: TaskService.ShutdownRequest
shim-->>containerd: OK
containerd-->shim: Close client
containerd->>shim: Execute binary: containerd-shim-runc-v2 delete
containerd-->shim: Delete bundle
containerd-->>ctr: Exit code
```
#### Logging
Shims may support pluggable logging via STDIO URIs.
Currently supported schemes for logging are:
* fifo - Linux
* binary - Linux & Windows
* file - Linux & Windows
* npipe - Windows
Binary logging has the ability to forward a container's STDIO to an external binary for consumption.
A sample logging driver that forwards the container's STDOUT and STDERR to `journald` is:
```go
package main
import (
"bufio"
"context"
"fmt"
"io"
"sync"
"github.com/containerd/containerd/runtime/v2/logging"
"github.com/coreos/go-systemd/journal"
)
func main() {
logging.Run(log)
}
func log(ctx context.Context, config *logging.Config, ready func() error) error {
// construct any log metadata for the container
vars := map[string]string{
"SYSLOG_IDENTIFIER": fmt.Sprintf("%s:%s", config.Namespace, config.ID),
}
var wg sync.WaitGroup
wg.Add(2)
// forward both stdout and stderr to the journal
go copy(&wg, config.Stdout, journal.PriInfo, vars)
go copy(&wg, config.Stderr, journal.PriErr, vars)
// signal that we are ready and setup for the container to be started
if err := ready(); err != nil {
return err
}
wg.Wait()
return nil
}
func copy(wg *sync.WaitGroup, r io.Reader, pri journal.Priority, vars map[string]string) {
defer wg.Done()
s := bufio.NewScanner(r)
for s.Scan() {
journal.Send(s.Text(), pri, vars)
}
}
```
### Other
#### Unsupported rpcs
If a shim does not or cannot implement an rpc call, it MUST return a `github.com/containerd/containerd/v2/pkg/errdefs.ErrNotImplemented` error.
#### Debugging and Shim Logs
A fifo on unix or a named pipe on Windows will be provided to the shim.
It is located inside the `cwd` of the shim and is named "log".
Shims can use the existing `github.com/containerd/log` package to log debug messages.
Messages will automatically be output in containerd's daemon logs with the correct fields and runtime set.
#### ttrpc
[ttrpc](https://github.com/containerd/ttrpc) is one of the supported protocols for shims.
It works with standard protobuf and GRPC service definitions, and it also generates clients.
The only difference between grpc and ttrpc is the wire protocol.
ttrpc removes the http stack in order to save memory and binary size to keep shims small.
It is recommended to use ttrpc in your shim but grpc support is currently an experimental feature.

215
core/runtime/v2/binary.go Normal file
View File

@@ -0,0 +1,215 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"bytes"
"context"
"fmt"
"io"
"os"
"path/filepath"
gruntime "runtime"
"github.com/containerd/containerd/v2/api/runtime/task/v2"
"github.com/containerd/containerd/v2/core/runtime"
client "github.com/containerd/containerd/v2/core/runtime/v2/shim"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/protobuf"
"github.com/containerd/containerd/v2/protobuf/proto"
"github.com/containerd/containerd/v2/protobuf/types"
"github.com/containerd/log"
)
type shimBinaryConfig struct {
runtime string
address string
ttrpcAddress string
schedCore bool
}
func shimBinary(bundle *Bundle, config shimBinaryConfig) *binary {
return &binary{
bundle: bundle,
runtime: config.runtime,
containerdAddress: config.address,
containerdTTRPCAddress: config.ttrpcAddress,
schedCore: config.schedCore,
}
}
type binary struct {
runtime string
containerdAddress string
containerdTTRPCAddress string
schedCore bool
bundle *Bundle
}
func (b *binary) Start(ctx context.Context, opts *types.Any, onClose func()) (_ *shim, err error) {
args := []string{"-id", b.bundle.ID}
switch log.GetLevel() {
case log.DebugLevel, log.TraceLevel:
args = append(args, "-debug")
}
args = append(args, "start")
cmd, err := client.Command(
ctx,
&client.CommandConfig{
Runtime: b.runtime,
Address: b.containerdAddress,
TTRPCAddress: b.containerdTTRPCAddress,
Path: b.bundle.Path,
Opts: opts,
Args: args,
SchedCore: b.schedCore,
})
if err != nil {
return nil, err
}
// Windows needs a namespace when calling openShimLog
ns, _ := namespaces.Namespace(ctx)
shimCtx, cancelShimLog := context.WithCancel(namespaces.WithNamespace(context.Background(), ns))
defer func() {
if err != nil {
cancelShimLog()
}
}()
f, err := openShimLog(shimCtx, b.bundle, client.AnonDialer)
if err != nil {
return nil, fmt.Errorf("open shim log pipe: %w", err)
}
defer func() {
if err != nil {
f.Close()
}
}()
// open the log pipe and block until the writer is ready
// this helps with synchronization of the shim
// copy the shim's logs to containerd's output
go func() {
defer f.Close()
_, err := io.Copy(os.Stderr, f)
// To prevent a flood of error messages, expected errors (such as
// os.ErrClosed or os.ErrNotExist, depending on the platform) are
// filtered out by checkCopyShimLogError.
err = checkCopyShimLogError(ctx, err)
if err != nil {
log.G(ctx).WithError(err).Error("copy shim log")
}
}()
out, err := cmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("%s: %w", out, err)
}
response := bytes.TrimSpace(out)
onCloseWithShimLog := func() {
onClose()
cancelShimLog()
f.Close()
}
// Save runtime binary path for restore.
if err := os.WriteFile(filepath.Join(b.bundle.Path, "shim-binary-path"), []byte(b.runtime), 0600); err != nil {
return nil, err
}
params, err := parseStartResponse(response)
if err != nil {
return nil, err
}
conn, err := makeConnection(ctx, b.bundle.ID, params, onCloseWithShimLog)
if err != nil {
return nil, err
}
// Save bootstrap configuration (so containerd can restore shims after restart).
if err := writeBootstrapParams(filepath.Join(b.bundle.Path, "bootstrap.json"), params); err != nil {
return nil, fmt.Errorf("failed to write bootstrap.json: %w", err)
}
return &shim{
bundle: b.bundle,
client: conn,
version: params.Version,
}, nil
}
func (b *binary) Delete(ctx context.Context) (*runtime.Exit, error) {
log.G(ctx).Info("cleaning up dead shim")
// On Windows and FreeBSD, the current working directory of the shim should
// not be the bundle path during the delete operation. Instead, we invoke
// with the default work dir and forward the bundle path on the cmdline.
// Windows cannot delete the current working directory while an executable
// is in use with it. On FreeBSD, fork/exec can fail.
var bundlePath string
if gruntime.GOOS != "windows" && gruntime.GOOS != "freebsd" {
bundlePath = b.bundle.Path
}
args := []string{
"-id", b.bundle.ID,
"-bundle", b.bundle.Path,
}
switch log.GetLevel() {
case log.DebugLevel, log.TraceLevel:
args = append(args, "-debug")
}
args = append(args, "delete")
cmd, err := client.Command(ctx,
&client.CommandConfig{
Runtime: b.runtime,
Address: b.containerdAddress,
TTRPCAddress: b.containerdTTRPCAddress,
Path: bundlePath,
Opts: nil,
Args: args,
})
if err != nil {
return nil, err
}
var (
out = bytes.NewBuffer(nil)
errb = bytes.NewBuffer(nil)
)
cmd.Stdout = out
cmd.Stderr = errb
if err := cmd.Run(); err != nil {
log.G(ctx).WithField("cmd", cmd).WithError(err).Error("failed to delete")
return nil, fmt.Errorf("%s: %w", errb.String(), err)
}
s := errb.String()
if s != "" {
log.G(ctx).Warnf("cleanup warnings %s", s)
}
var response task.DeleteResponse
if err := proto.Unmarshal(out.Bytes(), &response); err != nil {
return nil, err
}
if err := b.bundle.Delete(); err != nil {
return nil, err
}
return &runtime.Exit{
Status: response.ExitStatus,
Timestamp: protobuf.FromTimestamp(response.ExitedAt),
Pid: response.Pid,
}, nil
}

331
core/runtime/v2/bridge.go Normal file
View File

@@ -0,0 +1,331 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"fmt"
"github.com/containerd/ttrpc"
"google.golang.org/grpc"
"google.golang.org/protobuf/types/known/emptypb"
v2 "github.com/containerd/containerd/v2/api/runtime/task/v2"
v3 "github.com/containerd/containerd/v2/api/runtime/task/v3"
api "github.com/containerd/containerd/v2/api/runtime/task/v3" // Current version used by TaskServiceClient
)
// TaskServiceClient exposes a client interface to shims, which aims to hide
// the underlying complexity and backward-compatibility concerns (v2 task service vs v3, TTRPC vs GRPC, etc).
type TaskServiceClient interface {
State(context.Context, *api.StateRequest) (*api.StateResponse, error)
Create(context.Context, *api.CreateTaskRequest) (*api.CreateTaskResponse, error)
Start(context.Context, *api.StartRequest) (*api.StartResponse, error)
Delete(context.Context, *api.DeleteRequest) (*api.DeleteResponse, error)
Pids(context.Context, *api.PidsRequest) (*api.PidsResponse, error)
Pause(context.Context, *api.PauseRequest) (*emptypb.Empty, error)
Resume(context.Context, *api.ResumeRequest) (*emptypb.Empty, error)
Checkpoint(context.Context, *api.CheckpointTaskRequest) (*emptypb.Empty, error)
Kill(context.Context, *api.KillRequest) (*emptypb.Empty, error)
Exec(context.Context, *api.ExecProcessRequest) (*emptypb.Empty, error)
ResizePty(context.Context, *api.ResizePtyRequest) (*emptypb.Empty, error)
CloseIO(context.Context, *api.CloseIORequest) (*emptypb.Empty, error)
Update(context.Context, *api.UpdateTaskRequest) (*emptypb.Empty, error)
Wait(context.Context, *api.WaitRequest) (*api.WaitResponse, error)
Stats(context.Context, *api.StatsRequest) (*api.StatsResponse, error)
Connect(context.Context, *api.ConnectRequest) (*api.ConnectResponse, error)
Shutdown(context.Context, *api.ShutdownRequest) (*emptypb.Empty, error)
}
// NewTaskClient returns a new task client interface which handles both GRPC and TTRPC servers depending on the
// client object type passed in.
//
// Supported client types are:
// - *ttrpc.Client
// - grpc.ClientConnInterface
//
// Currently supported servers:
// - TTRPC v2 (compatibility with shims before 2.0)
// - TTRPC v3
// - GRPC v3
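//
// For example (sketch), wrapping an existing TTRPC connection to a v3 shim:
//
//	tc := ttrpc.NewClient(conn) // conn is a net.Conn to the shim's socket
//	taskClient, err := NewTaskClient(tc, 3)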
func NewTaskClient(client interface{}, version int) (TaskServiceClient, error) {
switch c := client.(type) {
case *ttrpc.Client:
switch version {
case 2:
return &ttrpcV2Bridge{client: v2.NewTaskClient(c)}, nil
case 3:
return v3.NewTTRPCTaskClient(c), nil
default:
return nil, fmt.Errorf("containerd client supports only v2 and v3 TTRPC task client (got %d)", version)
}
case grpc.ClientConnInterface:
if version != 3 {
return nil, fmt.Errorf("containerd client supports only v3 GRPC task service (got %d)", version)
}
return &grpcV3Bridge{v3.NewTaskClient(c)}, nil
default:
return nil, fmt.Errorf("unsupported shim client type %T", c)
}
}
// ttrpcV2Bridge is a bridge from TTRPC v2 task service.
type ttrpcV2Bridge struct {
client v2.TaskService
}
var _ TaskServiceClient = (*ttrpcV2Bridge)(nil)
func (b *ttrpcV2Bridge) State(ctx context.Context, request *api.StateRequest) (*api.StateResponse, error) {
resp, err := b.client.State(ctx, &v2.StateRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
})
return &v3.StateResponse{
ID: resp.GetID(),
Bundle: resp.GetBundle(),
Pid: resp.GetPid(),
Status: resp.GetStatus(),
Stdin: resp.GetStdin(),
Stdout: resp.GetStdout(),
Stderr: resp.GetStderr(),
Terminal: resp.GetTerminal(),
ExitStatus: resp.GetExitStatus(),
ExitedAt: resp.GetExitedAt(),
ExecID: resp.GetExecID(),
}, err
}
func (b *ttrpcV2Bridge) Create(ctx context.Context, request *api.CreateTaskRequest) (*api.CreateTaskResponse, error) {
resp, err := b.client.Create(ctx, &v2.CreateTaskRequest{
ID: request.GetID(),
Bundle: request.GetBundle(),
Rootfs: request.GetRootfs(),
Terminal: request.GetTerminal(),
Stdin: request.GetStdin(),
Stdout: request.GetStdout(),
Stderr: request.GetStderr(),
Checkpoint: request.GetCheckpoint(),
ParentCheckpoint: request.GetParentCheckpoint(),
Options: request.GetOptions(),
})
return &api.CreateTaskResponse{Pid: resp.GetPid()}, err
}
func (b *ttrpcV2Bridge) Start(ctx context.Context, request *api.StartRequest) (*api.StartResponse, error) {
resp, err := b.client.Start(ctx, &v2.StartRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
})
return &api.StartResponse{Pid: resp.GetPid()}, err
}
func (b *ttrpcV2Bridge) Delete(ctx context.Context, request *api.DeleteRequest) (*api.DeleteResponse, error) {
resp, err := b.client.Delete(ctx, &v2.DeleteRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
})
return &api.DeleteResponse{
Pid: resp.GetPid(),
ExitStatus: resp.GetExitStatus(),
ExitedAt: resp.GetExitedAt(),
}, err
}
func (b *ttrpcV2Bridge) Pids(ctx context.Context, request *api.PidsRequest) (*api.PidsResponse, error) {
resp, err := b.client.Pids(ctx, &v2.PidsRequest{ID: request.GetID()})
return &api.PidsResponse{Processes: resp.GetProcesses()}, err
}
func (b *ttrpcV2Bridge) Pause(ctx context.Context, request *api.PauseRequest) (*emptypb.Empty, error) {
return b.client.Pause(ctx, &v2.PauseRequest{ID: request.GetID()})
}
func (b *ttrpcV2Bridge) Resume(ctx context.Context, request *api.ResumeRequest) (*emptypb.Empty, error) {
return b.client.Resume(ctx, &v2.ResumeRequest{ID: request.GetID()})
}
func (b *ttrpcV2Bridge) Checkpoint(ctx context.Context, request *api.CheckpointTaskRequest) (*emptypb.Empty, error) {
return b.client.Checkpoint(ctx, &v2.CheckpointTaskRequest{
ID: request.GetID(),
Path: request.GetPath(),
Options: request.GetOptions(),
})
}
func (b *ttrpcV2Bridge) Kill(ctx context.Context, request *api.KillRequest) (*emptypb.Empty, error) {
return b.client.Kill(ctx, &v2.KillRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
Signal: request.GetSignal(),
All: request.GetAll(),
})
}
func (b *ttrpcV2Bridge) Exec(ctx context.Context, request *api.ExecProcessRequest) (*emptypb.Empty, error) {
return b.client.Exec(ctx, &v2.ExecProcessRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
Terminal: request.GetTerminal(),
Stdin: request.GetStdin(),
Stdout: request.GetStdout(),
Stderr: request.GetStderr(),
Spec: request.GetSpec(),
})
}
func (b *ttrpcV2Bridge) ResizePty(ctx context.Context, request *api.ResizePtyRequest) (*emptypb.Empty, error) {
return b.client.ResizePty(ctx, &v2.ResizePtyRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
Width: request.GetWidth(),
Height: request.GetHeight(),
})
}
func (b *ttrpcV2Bridge) CloseIO(ctx context.Context, request *api.CloseIORequest) (*emptypb.Empty, error) {
return b.client.CloseIO(ctx, &v2.CloseIORequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
Stdin: request.GetStdin(),
})
}
func (b *ttrpcV2Bridge) Update(ctx context.Context, request *api.UpdateTaskRequest) (*emptypb.Empty, error) {
return b.client.Update(ctx, &v2.UpdateTaskRequest{
ID: request.GetID(),
Resources: request.GetResources(),
Annotations: request.GetAnnotations(),
})
}
func (b *ttrpcV2Bridge) Wait(ctx context.Context, request *api.WaitRequest) (*api.WaitResponse, error) {
resp, err := b.client.Wait(ctx, &v2.WaitRequest{
ID: request.GetID(),
ExecID: request.GetExecID(),
})
return &api.WaitResponse{
ExitStatus: resp.GetExitStatus(),
ExitedAt: resp.GetExitedAt(),
}, err
}
func (b *ttrpcV2Bridge) Stats(ctx context.Context, request *api.StatsRequest) (*api.StatsResponse, error) {
resp, err := b.client.Stats(ctx, &v2.StatsRequest{ID: request.GetID()})
return &api.StatsResponse{Stats: resp.GetStats()}, err
}
func (b *ttrpcV2Bridge) Connect(ctx context.Context, request *api.ConnectRequest) (*api.ConnectResponse, error) {
resp, err := b.client.Connect(ctx, &v2.ConnectRequest{ID: request.GetID()})
return &api.ConnectResponse{
ShimPid: resp.GetShimPid(),
TaskPid: resp.GetTaskPid(),
Version: resp.GetVersion(),
}, err
}
func (b *ttrpcV2Bridge) Shutdown(ctx context.Context, request *api.ShutdownRequest) (*emptypb.Empty, error) {
return b.client.Shutdown(ctx, &v2.ShutdownRequest{
ID: request.GetID(),
Now: request.GetNow(),
})
}
// grpcV3Bridge implements task service client for v3 GRPC server.
// GRPC uses the same request/response structures as TTRPC, so it just wraps the GRPC calls.
type grpcV3Bridge struct {
client v3.TaskClient
}
var _ TaskServiceClient = (*grpcV3Bridge)(nil)
func (g *grpcV3Bridge) State(ctx context.Context, request *api.StateRequest) (*api.StateResponse, error) {
return g.client.State(ctx, request)
}
func (g *grpcV3Bridge) Create(ctx context.Context, request *api.CreateTaskRequest) (*api.CreateTaskResponse, error) {
return g.client.Create(ctx, request)
}
func (g *grpcV3Bridge) Start(ctx context.Context, request *api.StartRequest) (*api.StartResponse, error) {
return g.client.Start(ctx, request)
}
func (g *grpcV3Bridge) Delete(ctx context.Context, request *api.DeleteRequest) (*api.DeleteResponse, error) {
return g.client.Delete(ctx, request)
}
func (g *grpcV3Bridge) Pids(ctx context.Context, request *api.PidsRequest) (*api.PidsResponse, error) {
return g.client.Pids(ctx, request)
}
func (g *grpcV3Bridge) Pause(ctx context.Context, request *api.PauseRequest) (*emptypb.Empty, error) {
return g.client.Pause(ctx, request)
}
func (g *grpcV3Bridge) Resume(ctx context.Context, request *api.ResumeRequest) (*emptypb.Empty, error) {
return g.client.Resume(ctx, request)
}
func (g *grpcV3Bridge) Checkpoint(ctx context.Context, request *api.CheckpointTaskRequest) (*emptypb.Empty, error) {
return g.client.Checkpoint(ctx, request)
}
func (g *grpcV3Bridge) Kill(ctx context.Context, request *api.KillRequest) (*emptypb.Empty, error) {
return g.client.Kill(ctx, request)
}
func (g *grpcV3Bridge) Exec(ctx context.Context, request *api.ExecProcessRequest) (*emptypb.Empty, error) {
return g.client.Exec(ctx, request)
}
func (g *grpcV3Bridge) ResizePty(ctx context.Context, request *api.ResizePtyRequest) (*emptypb.Empty, error) {
return g.client.ResizePty(ctx, request)
}
func (g *grpcV3Bridge) CloseIO(ctx context.Context, request *api.CloseIORequest) (*emptypb.Empty, error) {
return g.client.CloseIO(ctx, request)
}
func (g *grpcV3Bridge) Update(ctx context.Context, request *api.UpdateTaskRequest) (*emptypb.Empty, error) {
return g.client.Update(ctx, request)
}
func (g *grpcV3Bridge) Wait(ctx context.Context, request *api.WaitRequest) (*api.WaitResponse, error) {
return g.client.Wait(ctx, request)
}
func (g *grpcV3Bridge) Stats(ctx context.Context, request *api.StatsRequest) (*api.StatsResponse, error) {
return g.client.Stats(ctx, request)
}
func (g *grpcV3Bridge) Connect(ctx context.Context, request *api.ConnectRequest) (*api.ConnectResponse, error) {
return g.client.Connect(ctx, request)
}
func (g *grpcV3Bridge) Shutdown(ctx context.Context, request *api.ShutdownRequest) (*emptypb.Empty, error) {
return g.client.Shutdown(ctx, request)
}

169
core/runtime/v2/bundle.go Normal file
View File

@@ -0,0 +1,169 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"fmt"
"os"
"path/filepath"
"runtime"
"github.com/containerd/containerd/v2/core/mount"
"github.com/containerd/containerd/v2/pkg/identifiers"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/oci"
"github.com/containerd/typeurl/v2"
"github.com/opencontainers/runtime-spec/specs-go"
)
// LoadBundle loads an existing bundle from disk
func LoadBundle(ctx context.Context, root, id string) (*Bundle, error) {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return nil, err
}
return &Bundle{
ID: id,
Path: filepath.Join(root, ns, id),
Namespace: ns,
}, nil
}
// NewBundle returns a new bundle on disk
func NewBundle(ctx context.Context, root, state, id string, spec typeurl.Any) (b *Bundle, err error) {
if err := identifiers.Validate(id); err != nil {
return nil, fmt.Errorf("invalid task id %s: %w", id, err)
}
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return nil, err
}
work := filepath.Join(root, ns, id)
b = &Bundle{
ID: id,
Path: filepath.Join(state, ns, id),
Namespace: ns,
}
var paths []string
defer func() {
if err != nil {
for _, d := range paths {
os.RemoveAll(d)
}
}
}()
// create state directory for the bundle
if err := os.MkdirAll(filepath.Dir(b.Path), 0711); err != nil {
return nil, err
}
if err := os.Mkdir(b.Path, 0700); err != nil {
return nil, err
}
if typeurl.Is(spec, &specs.Spec{}) {
if err := prepareBundleDirectoryPermissions(b.Path, spec.GetValue()); err != nil {
return nil, err
}
}
paths = append(paths, b.Path)
// create working directory for the bundle
if err := os.MkdirAll(filepath.Dir(work), 0711); err != nil {
return nil, err
}
rootfs := filepath.Join(b.Path, "rootfs")
if err := os.MkdirAll(rootfs, 0711); err != nil {
return nil, err
}
paths = append(paths, rootfs)
if err := os.Mkdir(work, 0711); err != nil {
if !os.IsExist(err) {
return nil, err
}
os.RemoveAll(work)
if err := os.Mkdir(work, 0711); err != nil {
return nil, err
}
}
paths = append(paths, work)
// symlink workdir
if err := os.Symlink(work, filepath.Join(b.Path, "work")); err != nil {
return nil, err
}
if spec := spec.GetValue(); spec != nil {
// write the spec to the bundle
specPath := filepath.Join(b.Path, oci.ConfigFilename)
err = os.WriteFile(specPath, spec, 0666)
if err != nil {
return nil, fmt.Errorf("failed to write bundle spec: %w", err)
}
}
return b, nil
}
// Bundle represents an OCI bundle
type Bundle struct {
// ID of the bundle
ID string
// Path to the bundle
Path string
// Namespace of the bundle
Namespace string
}
// Delete a bundle atomically
func (b *Bundle) Delete() error {
work, werr := os.Readlink(filepath.Join(b.Path, "work"))
rootfs := filepath.Join(b.Path, "rootfs")
if runtime.GOOS != "darwin" {
if err := mount.UnmountRecursive(rootfs, 0); err != nil {
return fmt.Errorf("unmount rootfs %s: %w", rootfs, err)
}
}
if err := os.Remove(rootfs); err != nil && !os.IsNotExist(err) {
return fmt.Errorf("failed to remove bundle rootfs: %w", err)
}
err := atomicDelete(b.Path)
if err == nil {
if werr == nil {
return atomicDelete(work)
}
return nil
}
// error removing the bundle path; still attempt removing work dir
var err2 error
if werr == nil {
err2 = atomicDelete(work)
if err2 == nil {
return err
}
}
return fmt.Errorf("failed to remove both bundle and workdir locations: %v: %w", err2, err)
}
// atomicDelete renames the path to a hidden file before removal
func atomicDelete(path string) error {
// create a hidden dir for an atomic removal
atomicPath := filepath.Join(filepath.Dir(path), fmt.Sprintf(".%s", filepath.Base(path)))
if err := os.Rename(path, atomicPath); err != nil {
if os.IsNotExist(err) {
return nil
}
return err
}
return os.RemoveAll(atomicPath)
}

View File

@@ -0,0 +1,23 @@
//go:build !linux
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
// prepareBundleDirectoryPermissions prepares the permissions of the bundle
// directory according to the needs of the current platform.
func prepareBundleDirectoryPermissions(path string, spec []byte) error { return nil }

View File

@@ -0,0 +1,74 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"encoding/json"
"os"
"github.com/opencontainers/runtime-spec/specs-go"
)
// prepareBundleDirectoryPermissions prepares the permissions of the bundle
// directory according to the needs of the current platform.
// On Linux when user namespaces are enabled, the permissions are modified to
// allow the remapped root GID to access the bundle.
func prepareBundleDirectoryPermissions(path string, spec []byte) error {
gid, err := remappedGID(spec)
if err != nil {
return err
}
if gid == 0 {
return nil
}
if err := os.Chown(path, -1, int(gid)); err != nil {
return err
}
return os.Chmod(path, 0710)
}
// ociSpecUserNS is a subset of specs.Spec used to reduce garbage during
// unmarshal.
type ociSpecUserNS struct {
Linux *linuxSpecUserNS
}
// linuxSpecUserNS is a subset of specs.Linux used to reduce garbage during
// unmarshal.
type linuxSpecUserNS struct {
GIDMappings []specs.LinuxIDMapping
}
// remappedGID reads the remapped GID 0 from the OCI spec, if it exists. If
// there is no remapping, remappedGID returns 0. If the spec cannot be parsed,
// remappedGID returns an error.
func remappedGID(spec []byte) (uint32, error) {
var ociSpec ociSpecUserNS
err := json.Unmarshal(spec, &ociSpec)
if err != nil {
return 0, err
}
if ociSpec.Linux == nil || len(ociSpec.Linux.GIDMappings) == 0 {
return 0, nil
}
for _, mapping := range ociSpec.Linux.GIDMappings {
if mapping.ContainerID == 0 {
return mapping.HostID, nil
}
}
return 0, nil
}

View File

@@ -0,0 +1,143 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"encoding/json"
"fmt"
"os"
"path/filepath"
"strconv"
"syscall"
"testing"
"github.com/containerd/containerd/v2/internal/testutil"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/oci"
"github.com/containerd/typeurl/v2"
"github.com/opencontainers/runtime-spec/specs-go"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNewBundle(t *testing.T) {
testutil.RequiresRoot(t)
tests := []struct {
userns bool
}{{
userns: false,
}, {
userns: true,
}}
const usernsGID = 4200
for i, tc := range tests {
t.Run(strconv.Itoa(i), func(t *testing.T) {
dir := t.TempDir()
work := filepath.Join(dir, "work")
state := filepath.Join(dir, "state")
id := fmt.Sprintf("new-bundle-%d", i)
spec := oci.Spec{}
if tc.userns {
spec.Linux = &specs.Linux{
GIDMappings: []specs.LinuxIDMapping{{ContainerID: 0, HostID: usernsGID}},
}
}
specAny, err := typeurl.MarshalAny(&spec)
require.NoError(t, err, "failed to marshal spec")
ctx := namespaces.WithNamespace(context.TODO(), namespaces.Default)
b, err := NewBundle(ctx, work, state, id, specAny)
require.NoError(t, err, "NewBundle should succeed")
require.NotNil(t, b, "bundle should not be nil")
fi, err := os.Stat(b.Path)
assert.NoError(t, err, "should be able to stat bundle path")
if tc.userns {
assert.Equal(t, os.ModeDir|0710, fi.Mode(), "bundle path should be a directory with perm 0710")
} else {
assert.Equal(t, os.ModeDir|0700, fi.Mode(), "bundle path should be a directory with perm 0700")
}
stat, ok := fi.Sys().(*syscall.Stat_t)
require.True(t, ok, "should assert to *syscall.Stat_t")
expectedGID := uint32(0)
if tc.userns {
expectedGID = usernsGID
}
assert.Equal(t, expectedGID, stat.Gid, "gid should match")
})
}
}
func TestRemappedGID(t *testing.T) {
tests := []struct {
spec oci.Spec
gid uint32
}{{
// empty spec
spec: oci.Spec{},
gid: 0,
}, {
// empty Linux section
spec: oci.Spec{
Linux: &specs.Linux{},
},
gid: 0,
}, {
// empty ID mappings
spec: oci.Spec{
Linux: &specs.Linux{
GIDMappings: make([]specs.LinuxIDMapping, 0),
},
},
gid: 0,
}, {
// valid ID mapping
spec: oci.Spec{
Linux: &specs.Linux{
GIDMappings: []specs.LinuxIDMapping{{
ContainerID: 0,
HostID: 1000,
}},
},
},
gid: 1000,
}, {
// missing ID mapping
spec: oci.Spec{
Linux: &specs.Linux{
GIDMappings: []specs.LinuxIDMapping{{
ContainerID: 100,
HostID: 1000,
}},
},
},
gid: 0,
}}
for i, tc := range tests {
t.Run(strconv.Itoa(i), func(t *testing.T) {
s, err := json.Marshal(tc.spec)
require.NoError(t, err, "failed to marshal spec")
gid, err := remappedGID(s)
assert.NoError(t, err, "should unmarshal successfully")
assert.Equal(t, tc.gid, gid, "expected GID to match")
})
}
}

View File

@@ -0,0 +1,23 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
// When testutil is imported for one platform (bundle_linux_test.go) it
// should be imported for all platforms.
_ "github.com/containerd/containerd/v2/internal/testutil"
)

View File

@@ -0,0 +1,6 @@
# Example Shim
This directory provides skeleton code as the starting point for creating a Runtime v2 shim.
This allows runtime authors to quickly bootstrap new shim implementations.
For full documentation on building a shim for containerd, see the [Shim Documentation](../README.md) file.

View File

@@ -0,0 +1,29 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package main
import (
"context"
"github.com/containerd/containerd/v2/core/runtime/v2/example"
"github.com/containerd/containerd/v2/core/runtime/v2/shim"
)
func main() {
// init and execute the shim
shim.Run(context.Background(), example.NewManager("io.containerd.example.v1"))
}

View File

@@ -0,0 +1,179 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package example
import (
"context"
"os"
taskAPI "github.com/containerd/containerd/v2/api/runtime/task/v2"
"github.com/containerd/containerd/v2/core/runtime/v2/shim"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/pkg/shutdown"
"github.com/containerd/containerd/v2/plugins"
ptypes "github.com/containerd/containerd/v2/protobuf/types"
"github.com/containerd/plugin"
"github.com/containerd/plugin/registry"
"github.com/containerd/ttrpc"
)
func init() {
registry.Register(&plugin.Registration{
Type: plugins.TTRPCPlugin,
ID: "task",
Requires: []plugin.Type{
plugins.EventPlugin,
plugins.InternalPlugin,
},
InitFn: func(ic *plugin.InitContext) (interface{}, error) {
pp, err := ic.GetByID(plugins.EventPlugin, "publisher")
if err != nil {
return nil, err
}
ss, err := ic.GetByID(plugins.InternalPlugin, "shutdown")
if err != nil {
return nil, err
}
return newTaskService(ic.Context, pp.(shim.Publisher), ss.(shutdown.Service))
},
})
}
func NewManager(name string) shim.Manager {
return manager{name: name}
}
type manager struct {
name string
}
func (m manager) Name() string {
return m.name
}
func (m manager) Start(ctx context.Context, id string, opts shim.StartOpts) (shim.BootstrapParams, error) {
return shim.BootstrapParams{}, errdefs.ErrNotImplemented
}
func (m manager) Stop(ctx context.Context, id string) (shim.StopStatus, error) {
return shim.StopStatus{}, errdefs.ErrNotImplemented
}
func newTaskService(ctx context.Context, publisher shim.Publisher, sd shutdown.Service) (taskAPI.TaskService, error) {
// The shim.Publisher and shutdown.Service are usually useful for your task service,
// but we don't need them in the exampleTaskService.
return &exampleTaskService{}, nil
}
var (
_ = shim.TTRPCService(&exampleTaskService{})
)
type exampleTaskService struct {
}
// RegisterTTRPC allows TTRPC services to be registered with the underlying server
func (s *exampleTaskService) RegisterTTRPC(server *ttrpc.Server) error {
taskAPI.RegisterTaskService(server, s)
return nil
}
// Create a new container
func (s *exampleTaskService) Create(ctx context.Context, r *taskAPI.CreateTaskRequest) (_ *taskAPI.CreateTaskResponse, err error) {
return nil, errdefs.ErrNotImplemented
}
// Start the primary user process inside the container
func (s *exampleTaskService) Start(ctx context.Context, r *taskAPI.StartRequest) (*taskAPI.StartResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// Delete a process or container
func (s *exampleTaskService) Delete(ctx context.Context, r *taskAPI.DeleteRequest) (*taskAPI.DeleteResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// Exec an additional process inside the container
func (s *exampleTaskService) Exec(ctx context.Context, r *taskAPI.ExecProcessRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// ResizePty of a process
func (s *exampleTaskService) ResizePty(ctx context.Context, r *taskAPI.ResizePtyRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// State returns runtime state of a process
func (s *exampleTaskService) State(ctx context.Context, r *taskAPI.StateRequest) (*taskAPI.StateResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// Pause the container
func (s *exampleTaskService) Pause(ctx context.Context, r *taskAPI.PauseRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Resume the container
func (s *exampleTaskService) Resume(ctx context.Context, r *taskAPI.ResumeRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Kill a process
func (s *exampleTaskService) Kill(ctx context.Context, r *taskAPI.KillRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Pids returns all pids inside the container
func (s *exampleTaskService) Pids(ctx context.Context, r *taskAPI.PidsRequest) (*taskAPI.PidsResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// CloseIO of a process
func (s *exampleTaskService) CloseIO(ctx context.Context, r *taskAPI.CloseIORequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Checkpoint the container
func (s *exampleTaskService) Checkpoint(ctx context.Context, r *taskAPI.CheckpointTaskRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Connect returns shim information of the underlying service
func (s *exampleTaskService) Connect(ctx context.Context, r *taskAPI.ConnectRequest) (*taskAPI.ConnectResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// Shutdown is called after the underlying resources of the shim are cleaned up and the service can be stopped
func (s *exampleTaskService) Shutdown(ctx context.Context, r *taskAPI.ShutdownRequest) (*ptypes.Empty, error) {
os.Exit(0)
return &ptypes.Empty{}, nil
}
// Stats returns container level system stats for a container and its processes
func (s *exampleTaskService) Stats(ctx context.Context, r *taskAPI.StatsRequest) (*taskAPI.StatsResponse, error) {
return nil, errdefs.ErrNotImplemented
}
// Update the live container
func (s *exampleTaskService) Update(ctx context.Context, r *taskAPI.UpdateTaskRequest) (*ptypes.Empty, error) {
return nil, errdefs.ErrNotImplemented
}
// Wait for a process to exit
func (s *exampleTaskService) Wait(ctx context.Context, r *taskAPI.WaitRequest) (*taskAPI.WaitResponse, error) {
return nil, errdefs.ErrNotImplemented
}

View File

@@ -0,0 +1,37 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package logging
import (
"context"
"io"
)
// Config of the container logs
type Config struct {
ID string
Namespace string
Stdout io.Reader
Stderr io.Reader
}
// LoggerFunc is implemented by custom v2 logging binaries.
//
// ready should be called when the logging binary finishes its setup and the container can be started.
//
// An example implementation of LoggerFunc: https://github.com/containerd/containerd/tree/main/runtime/v2#logging
type LoggerFunc func(ctx context.Context, cfg *Config, ready func() error) error
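
For reference, a minimal custom logging binary built on this package might look like the following sketch. It is not part of this commit; the import path under core/runtime/v2 and the line-based forwarding are assumptions for illustration.

package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"sync"

	"github.com/containerd/containerd/v2/core/runtime/v2/logging"
)

func main() {
	logging.Run(log)
}

// log drains the container's stdout and stderr and signals readiness so the
// shim can start the container once the logger is wired up.
func log(ctx context.Context, config *logging.Config, ready func() error) error {
	var wg sync.WaitGroup
	wg.Add(2)
	go copyLines(&wg, config.Stdout, config.ID, "stdout")
	go copyLines(&wg, config.Stderr, config.ID, "stderr")
	// Signal the shim that setup is complete and the container can be started.
	if err := ready(); err != nil {
		return err
	}
	wg.Wait()
	return nil
}

// copyLines forwards each line of a container stream to this process's stdout.
func copyLines(wg *sync.WaitGroup, r io.Reader, id, stream string) {
	defer wg.Done()
	s := bufio.NewScanner(r)
	for s.Scan() {
		fmt.Printf("%s %s: %s\n", id, stream, s.Text())
	}
}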


@@ -0,0 +1,64 @@
//go:build !windows
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package logging
import (
"context"
"fmt"
"os"
"os/signal"
"golang.org/x/sys/unix"
)
// Run the logging driver
func Run(fn LoggerFunc) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
config := &Config{
ID: os.Getenv("CONTAINER_ID"),
Namespace: os.Getenv("CONTAINER_NAMESPACE"),
Stdout: os.NewFile(3, "CONTAINER_STDOUT"),
Stderr: os.NewFile(4, "CONTAINER_STDERR"),
}
var (
sigCh = make(chan os.Signal, 32)
errCh = make(chan error, 1)
wait = os.NewFile(5, "CONTAINER_WAIT")
)
signal.Notify(sigCh, unix.SIGTERM)
go func() {
errCh <- fn(ctx, config, wait.Close)
}()
for {
select {
case <-sigCh:
cancel()
case err := <-errCh:
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
os.Exit(0)
}
}
}


@@ -0,0 +1,97 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package logging
import (
"context"
"errors"
"fmt"
"net"
"os"
"os/signal"
"syscall"
"github.com/Microsoft/go-winio"
)
// Run the logging driver
func Run(fn LoggerFunc) {
err := runInternal(fn)
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
os.Exit(0)
}
func runInternal(fn LoggerFunc) error {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
var (
soutPipe, serrPipe, waitPipe string
sout, serr, wait net.Conn
ok bool
err error
)
if soutPipe, ok = os.LookupEnv("CONTAINER_STDOUT"); !ok {
return errors.New("'CONTAINER_STDOUT' environment variable missing")
}
if sout, err = winio.DialPipeContext(ctx, soutPipe); err != nil {
return fmt.Errorf("unable to dial stdout pipe: %w", err)
}
if serrPipe, ok = os.LookupEnv("CONTAINER_STDERR"); !ok {
return errors.New("'CONTAINER_STDERR' environment variable missing")
}
if serr, err = winio.DialPipeContext(ctx, serrPipe); err != nil {
return fmt.Errorf("unable to dial stderr pipe: %w", err)
}
waitPipe = os.Getenv("CONTAINER_WAIT")
if wait, err = winio.DialPipeContext(ctx, waitPipe); err != nil {
return fmt.Errorf("unable to dial wait pipe: %w", err)
}
config := &Config{
ID: os.Getenv("CONTAINER_ID"),
Namespace: os.Getenv("CONTAINER_NAMESPACE"),
Stdout: sout,
Stderr: serr,
}
var (
sigCh = make(chan os.Signal, 2)
errCh = make(chan error, 1)
)
signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)
go func() {
errCh <- fn(ctx, config, wait.Close)
}()
for {
select {
case <-sigCh:
cancel()
case err = <-errCh:
return err
}
}
}

535
core/runtime/v2/manager.go Normal file

@@ -0,0 +1,535 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
"sync"
"github.com/containerd/containerd/v2/core/containers"
"github.com/containerd/containerd/v2/core/metadata"
"github.com/containerd/containerd/v2/core/runtime"
shimbinary "github.com/containerd/containerd/v2/core/runtime/v2/shim"
"github.com/containerd/containerd/v2/core/sandbox"
"github.com/containerd/containerd/v2/internal/cleanup"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/pkg/events/exchange"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/timeout"
"github.com/containerd/containerd/v2/platforms"
"github.com/containerd/containerd/v2/plugins"
"github.com/containerd/containerd/v2/protobuf"
"github.com/containerd/log"
"github.com/containerd/plugin"
"github.com/containerd/plugin/registry"
)
// Config for the v2 runtime
type Config struct {
// Supported platforms
Platforms []string `toml:"platforms"`
// SchedCore enables Linux core scheduling
SchedCore bool `toml:"sched_core"`
}
func init() {
registry.Register(&plugin.Registration{
Type: plugins.RuntimePluginV2,
ID: "task",
Requires: []plugin.Type{
plugins.EventPlugin,
plugins.MetadataPlugin,
},
Config: &Config{
Platforms: defaultPlatforms(),
},
InitFn: func(ic *plugin.InitContext) (interface{}, error) {
config := ic.Config.(*Config)
supportedPlatforms, err := platforms.ParseAll(config.Platforms)
if err != nil {
return nil, err
}
ic.Meta.Platforms = supportedPlatforms
m, err := ic.GetSingle(plugins.MetadataPlugin)
if err != nil {
return nil, err
}
ep, err := ic.GetByID(plugins.EventPlugin, "exchange")
if err != nil {
return nil, err
}
cs := metadata.NewContainerStore(m.(*metadata.DB))
ss := metadata.NewSandboxStore(m.(*metadata.DB))
events := ep.(*exchange.Exchange)
shimManager, err := NewShimManager(ic.Context, &ManagerConfig{
Root: ic.Properties[plugins.PropertyRootDir],
State: ic.Properties[plugins.PropertyStateDir],
Address: ic.Properties[plugins.PropertyGRPCAddress],
TTRPCAddress: ic.Properties[plugins.PropertyTTRPCAddress],
Events: events,
Store: cs,
SchedCore: config.SchedCore,
SandboxStore: ss,
})
if err != nil {
return nil, err
}
return NewTaskManager(shimManager), nil
},
})
// Task manager uses shim manager as a dependency to manage shim instances.
// However, due to time limits and to avoid migration steps in the 1.6 release,
// use the following workaround.
// This is expected to be removed in 1.7.
registry.Register(&plugin.Registration{
Type: plugins.RuntimePluginV2,
ID: "shim",
InitFn: func(ic *plugin.InitContext) (interface{}, error) {
taskManagerI, err := ic.GetByID(plugins.RuntimePluginV2, "task")
if err != nil {
return nil, err
}
taskManager := taskManagerI.(*TaskManager)
return taskManager.manager, nil
},
})
}
type ManagerConfig struct {
Root string
State string
Store containers.Store
Events *exchange.Exchange
Address string
TTRPCAddress string
SchedCore bool
SandboxStore sandbox.Store
}
// NewShimManager creates a manager for v2 shims
func NewShimManager(ctx context.Context, config *ManagerConfig) (*ShimManager, error) {
for _, d := range []string{config.Root, config.State} {
if err := os.MkdirAll(d, 0711); err != nil {
return nil, err
}
}
m := &ShimManager{
root: config.Root,
state: config.State,
containerdAddress: config.Address,
containerdTTRPCAddress: config.TTRPCAddress,
shims: runtime.NewNSMap[ShimInstance](),
events: config.Events,
containers: config.Store,
schedCore: config.SchedCore,
sandboxStore: config.SandboxStore,
}
if err := m.loadExistingTasks(ctx); err != nil {
return nil, err
}
return m, nil
}
// ShimManager manages currently running shim processes.
// It is mainly responsible for launching new shims and for proper shutdown and cleanup of existing instances.
// The manager is unaware of the underlying services the shim provides and lets higher level services consume them
// without having to care about lifecycle management.
type ShimManager struct {
root string
state string
containerdAddress string
containerdTTRPCAddress string
schedCore bool
shims *runtime.NSMap[ShimInstance]
events *exchange.Exchange
containers containers.Store
// runtimePaths is a cache of `runtime names` -> `resolved fs path`
runtimePaths sync.Map
sandboxStore sandbox.Store
}
// ID of the shim manager
func (m *ShimManager) ID() string {
return plugins.RuntimePluginV2.String() + ".shim"
}
// Start launches a new shim instance
func (m *ShimManager) Start(ctx context.Context, id string, opts runtime.CreateOpts) (_ ShimInstance, retErr error) {
bundle, err := NewBundle(ctx, m.root, m.state, id, opts.Spec)
if err != nil {
return nil, err
}
defer func() {
if retErr != nil {
bundle.Delete()
}
}()
// This container belongs to a sandbox which is supposed to have already been started via the sandbox API.
if opts.SandboxID != "" {
process, err := m.Get(ctx, opts.SandboxID)
if err != nil {
return nil, fmt.Errorf("can't find sandbox %s", opts.SandboxID)
}
// Write sandbox ID this task belongs to.
if err := os.WriteFile(filepath.Join(bundle.Path, "sandbox"), []byte(opts.SandboxID), 0600); err != nil {
return nil, err
}
params, err := restoreBootstrapParams(filepath.Join(m.state, process.Namespace(), opts.SandboxID))
if err != nil {
return nil, err
}
if err := writeBootstrapParams(filepath.Join(bundle.Path, "bootstrap.json"), params); err != nil {
return nil, fmt.Errorf("failed to write bootstrap.json for bundle %s: %w", bundle.Path, err)
}
shim, err := loadShim(ctx, bundle, func() {})
if err != nil {
return nil, fmt.Errorf("failed to load sandbox task %q: %w", opts.SandboxID, err)
}
if err := m.shims.Add(ctx, shim); err != nil {
return nil, err
}
return shim, nil
}
shim, err := m.startShim(ctx, bundle, id, opts)
if err != nil {
return nil, err
}
defer func() {
if retErr != nil {
m.cleanupShim(ctx, shim)
}
}()
if err := m.shims.Add(ctx, shim); err != nil {
return nil, fmt.Errorf("failed to add task: %w", err)
}
return shim, nil
}
func (m *ShimManager) startShim(ctx context.Context, bundle *Bundle, id string, opts runtime.CreateOpts) (*shim, error) {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return nil, err
}
ctx = log.WithLogger(ctx, log.G(ctx).WithField("namespace", ns))
topts := opts.TaskOptions
if topts == nil || topts.GetValue() == nil {
topts = opts.RuntimeOptions
}
runtimePath, err := m.resolveRuntimePath(opts.Runtime)
if err != nil {
return nil, fmt.Errorf("failed to resolve runtime path: %w", err)
}
b := shimBinary(bundle, shimBinaryConfig{
runtime: runtimePath,
address: m.containerdAddress,
ttrpcAddress: m.containerdTTRPCAddress,
schedCore: m.schedCore,
})
shim, err := b.Start(ctx, protobuf.FromAny(topts), func() {
log.G(ctx).WithField("id", id).Info("shim disconnected")
cleanupAfterDeadShim(cleanup.Background(ctx), id, m.shims, m.events, b)
// Remove self from the runtime task list. Even though cleanupAfterDeadShim()
// publishes a TaskExit event, shim.Delete() would always fail with a ttrpc
// disconnect error and there would be no chance to remove this dead task from the runtime task list.
// Thus it's better to delete it here.
m.shims.Delete(ctx, id)
})
if err != nil {
return nil, fmt.Errorf("start failed: %w", err)
}
return shim, nil
}
// restoreBootstrapParams reads bootstrap.json to restore shim configuration.
// If it's an old shim, this will perform a migration - read the address file and write the default bootstrap
// configuration (version = 2, protocol = ttrpc, and address).
func restoreBootstrapParams(bundlePath string) (shimbinary.BootstrapParams, error) {
filePath := filepath.Join(bundlePath, "bootstrap.json")
// Read bootstrap.json if it exists
if _, err := os.Stat(filePath); err == nil {
return readBootstrapParams(filePath)
} else if !errors.Is(err, os.ErrNotExist) {
return shimbinary.BootstrapParams{}, fmt.Errorf("failed to stat %s: %w", filePath, err)
}
// File not found; it's likely an older shim. Try to migrate.
address, err := shimbinary.ReadAddress(filepath.Join(bundlePath, "address"))
if err != nil {
return shimbinary.BootstrapParams{}, fmt.Errorf("unable to migrate shim: failed to get socket address for bundle %s: %w", bundlePath, err)
}
params := shimbinary.BootstrapParams{
Version: 2,
Address: address,
Protocol: "ttrpc",
}
if err := writeBootstrapParams(filePath, params); err != nil {
return shimbinary.BootstrapParams{}, fmt.Errorf("unable to migrate: failed to write bootstrap.json file: %w", err)
}
return params, nil
}
func (m *ShimManager) resolveRuntimePath(runtime string) (string, error) {
if runtime == "" {
return "", fmt.Errorf("no runtime name")
}
// Custom path to runtime binary
if filepath.IsAbs(runtime) {
// Make sure it exists before returning ok
if _, err := os.Stat(runtime); err != nil {
return "", fmt.Errorf("invalid custom binary path: %w", err)
}
return runtime, nil
}
// Check if relative path to runtime binary provided
if strings.Contains(runtime, "/") {
return "", fmt.Errorf("invalid runtime name %s, correct runtime name should be either format like `io.containerd.runc.v2` or a full path to the binary", runtime)
}
// Preserve existing logic and resolve runtime path from runtime name.
name := shimbinary.BinaryName(runtime)
if name == "" {
return "", fmt.Errorf("invalid runtime name %s, correct runtime name should be either format like `io.containerd.runc.v2` or a full path to the binary", runtime)
}
if path, ok := m.runtimePaths.Load(name); ok {
return path.(string), nil
}
var (
cmdPath string
lerr error
)
binaryPath := shimbinary.BinaryPath(runtime)
if _, serr := os.Stat(binaryPath); serr == nil {
cmdPath = binaryPath
}
if cmdPath == "" {
if cmdPath, lerr = exec.LookPath(name); lerr != nil {
if eerr, ok := lerr.(*exec.Error); ok {
if eerr.Err == exec.ErrNotFound {
self, err := os.Executable()
if err != nil {
return "", err
}
// Match the calling binary's (containerd) path and see
// if they are side by side. If so, execute the shim
// found there.
testPath := filepath.Join(filepath.Dir(self), name)
if _, serr := os.Stat(testPath); serr == nil {
cmdPath = testPath
}
if cmdPath == "" {
return "", fmt.Errorf("runtime %q binary not installed %q: %w", runtime, name, os.ErrNotExist)
}
}
}
}
}
cmdPath, err := filepath.Abs(cmdPath)
if err != nil {
return "", err
}
if path, ok := m.runtimePaths.LoadOrStore(name, cmdPath); ok {
// We didn't store cmdPath; we loaded an already cached value. Use it.
cmdPath = path.(string)
}
return cmdPath, nil
}
// cleanupShim attempts to properly delete and cleanup shim after error
func (m *ShimManager) cleanupShim(ctx context.Context, shim *shim) {
dctx, cancel := timeout.WithContext(cleanup.Background(ctx), cleanupTimeout)
defer cancel()
_ = shim.Delete(dctx)
m.shims.Delete(dctx, shim.ID())
}
func (m *ShimManager) Get(ctx context.Context, id string) (ShimInstance, error) {
return m.shims.Get(ctx, id)
}
// Delete a runtime task
func (m *ShimManager) Delete(ctx context.Context, id string) error {
shim, err := m.shims.Get(ctx, id)
if err != nil {
return err
}
err = shim.Delete(ctx)
m.shims.Delete(ctx, id)
return err
}
// TaskManager wraps task service client on top of shim manager.
type TaskManager struct {
manager *ShimManager
}
// NewTaskManager creates a new task manager instance.
func NewTaskManager(shims *ShimManager) *TaskManager {
return &TaskManager{
manager: shims,
}
}
// ID of the task manager
func (m *TaskManager) ID() string {
return plugins.RuntimePluginV2.String() + ".task"
}
// Create launches new shim instance and creates new task
func (m *TaskManager) Create(ctx context.Context, taskID string, opts runtime.CreateOpts) (runtime.Task, error) {
shim, err := m.manager.Start(ctx, taskID, opts)
if err != nil {
return nil, fmt.Errorf("failed to start shim: %w", err)
}
// Cast to shim task and call task service to create a new container task instance.
// This will not be required once the shim service / client is implemented.
shimTask, err := newShimTask(shim)
if err != nil {
return nil, err
}
t, err := shimTask.Create(ctx, opts)
if err != nil {
// NOTE: ctx contains required namespace information.
m.manager.shims.Delete(ctx, taskID)
dctx, cancel := timeout.WithContext(cleanup.Background(ctx), cleanupTimeout)
defer cancel()
sandboxed := opts.SandboxID != ""
_, errShim := shimTask.delete(dctx, sandboxed, func(context.Context, string) {})
if errShim != nil {
if errdefs.IsDeadlineExceeded(errShim) {
dctx, cancel = timeout.WithContext(cleanup.Background(ctx), cleanupTimeout)
defer cancel()
}
shimTask.Shutdown(dctx)
shimTask.Close()
}
return nil, fmt.Errorf("failed to create shim task: %w", err)
}
return t, nil
}
// Get a specific task
func (m *TaskManager) Get(ctx context.Context, id string) (runtime.Task, error) {
shim, err := m.manager.shims.Get(ctx, id)
if err != nil {
return nil, err
}
return newShimTask(shim)
}
// Tasks lists all tasks
func (m *TaskManager) Tasks(ctx context.Context, all bool) ([]runtime.Task, error) {
shims, err := m.manager.shims.GetAll(ctx, all)
if err != nil {
return nil, err
}
out := make([]runtime.Task, len(shims))
for i := range shims {
newClient, err := newShimTask(shims[i])
if err != nil {
return nil, err
}
out[i] = newClient
}
return out, nil
}
// Delete deletes the task and shim instance
func (m *TaskManager) Delete(ctx context.Context, taskID string) (*runtime.Exit, error) {
shim, err := m.manager.shims.Get(ctx, taskID)
if err != nil {
return nil, err
}
container, err := m.manager.containers.Get(ctx, taskID)
if err != nil {
return nil, err
}
shimTask, err := newShimTask(shim)
if err != nil {
return nil, err
}
sandboxed := container.SandboxID != ""
exit, err := shimTask.delete(ctx, sandboxed, func(ctx context.Context, id string) {
m.manager.shims.Delete(ctx, id)
})
if err != nil {
return nil, fmt.Errorf("failed to delete task: %w", err)
}
return exit, nil
}


@@ -0,0 +1,98 @@
//go:build !windows && !darwin
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"os"
"testing"
)
// setupAbsoluteShimPath creates a temporary directory in $PATH with an empty
// shim executable file in it to test the exec.LookPath branch of resolveRuntimePath
func setupAbsoluteShimPath(t *testing.T) (string, error) {
tempShimDir := t.TempDir()
_, err := os.Create(tempShimDir + "/containerd-shim-runc-v2")
if err != nil {
return "", err
}
t.Setenv("PATH", tempShimDir+":"+os.Getenv("PATH"))
absoluteShimPath := tempShimDir + "/containerd-shim-runc-v2"
err = os.Chmod(absoluteShimPath, 0777)
if err != nil {
return "", err
}
return absoluteShimPath, nil
}
func TestResolveRuntimePath(t *testing.T) {
sm := &ShimManager{}
absoluteShimPath, err := setupAbsoluteShimPath(t)
if err != nil {
t.Errorf("Failed to create temporary shim path: %q", err)
}
tests := []struct {
runtime string
want string
}{
{ // Absolute path
runtime: absoluteShimPath,
want: absoluteShimPath,
},
{ // Binary name
runtime: "io.containerd.runc.v2",
want: absoluteShimPath,
},
{ // Invalid absolute path
runtime: "/fake/abs/path",
want: "",
},
{ // No name
runtime: "",
want: "",
},
{ // Relative Path
runtime: "./containerd-shim-runc-v2",
want: "",
},
{
runtime: "fake/containerd-shim-runc-v2",
want: "",
},
{
runtime: "./fake/containerd-shim-runc-v2",
want: "",
},
{ // Relative Path or Bad Binary Name
runtime: ".io.containerd.runc.v2",
want: "",
},
}
for _, c := range tests {
have, _ := sm.resolveRuntimePath(c.runtime)
if have != c.want {
t.Errorf("Expected %q, got %q", c.want, have)
}
}
}


@@ -0,0 +1,27 @@
//go:build !windows
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"github.com/containerd/containerd/v2/platforms"
)
func defaultPlatforms() []string {
return []string{platforms.DefaultString()}
}


@@ -0,0 +1,28 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"github.com/containerd/containerd/v2/platforms"
)
func defaultPlatforms() []string {
return []string{
platforms.DefaultString(),
"linux/amd64",
}
}

159
core/runtime/v2/process.go Normal file

@@ -0,0 +1,159 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
task "github.com/containerd/containerd/v2/api/runtime/task/v3"
tasktypes "github.com/containerd/containerd/v2/api/types/task"
"github.com/containerd/containerd/v2/core/runtime"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/protobuf"
"github.com/containerd/ttrpc"
)
type process struct {
id string
shim *shimTask
}
func (p *process) ID() string {
return p.id
}
func (p *process) Kill(ctx context.Context, signal uint32, _ bool) error {
_, err := p.shim.task.Kill(ctx, &task.KillRequest{
Signal: signal,
ID: p.shim.ID(),
ExecID: p.id,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func statusFromProto(from tasktypes.Status) runtime.Status {
var status runtime.Status
switch from {
case tasktypes.Status_CREATED:
status = runtime.CreatedStatus
case tasktypes.Status_RUNNING:
status = runtime.RunningStatus
case tasktypes.Status_STOPPED:
status = runtime.StoppedStatus
case tasktypes.Status_PAUSED:
status = runtime.PausedStatus
case tasktypes.Status_PAUSING:
status = runtime.PausingStatus
}
return status
}
func (p *process) State(ctx context.Context) (runtime.State, error) {
response, err := p.shim.task.State(ctx, &task.StateRequest{
ID: p.shim.ID(),
ExecID: p.id,
})
if err != nil {
if !errors.Is(err, ttrpc.ErrClosed) {
return runtime.State{}, errdefs.FromGRPC(err)
}
return runtime.State{}, errdefs.ErrNotFound
}
return runtime.State{
Pid: response.Pid,
Status: statusFromProto(response.Status),
Stdin: response.Stdin,
Stdout: response.Stdout,
Stderr: response.Stderr,
Terminal: response.Terminal,
ExitStatus: response.ExitStatus,
ExitedAt: protobuf.FromTimestamp(response.ExitedAt),
}, nil
}
// ResizePty changes the size of the process's PTY to the provided width and height
func (p *process) ResizePty(ctx context.Context, size runtime.ConsoleSize) error {
_, err := p.shim.task.ResizePty(ctx, &task.ResizePtyRequest{
ID: p.shim.ID(),
ExecID: p.id,
Width: size.Width,
Height: size.Height,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
// CloseIO closes the provided IO pipe for the process
func (p *process) CloseIO(ctx context.Context) error {
_, err := p.shim.task.CloseIO(ctx, &task.CloseIORequest{
ID: p.shim.ID(),
ExecID: p.id,
Stdin: true,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
// Start the process
func (p *process) Start(ctx context.Context) error {
_, err := p.shim.task.Start(ctx, &task.StartRequest{
ID: p.shim.ID(),
ExecID: p.id,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
// Wait on the process to exit and return the exit status and timestamp
func (p *process) Wait(ctx context.Context) (*runtime.Exit, error) {
response, err := p.shim.task.Wait(ctx, &task.WaitRequest{
ID: p.shim.ID(),
ExecID: p.id,
})
if err != nil {
return nil, errdefs.FromGRPC(err)
}
return &runtime.Exit{
Timestamp: protobuf.FromTimestamp(response.ExitedAt),
Status: response.ExitStatus,
}, nil
}
func (p *process) Delete(ctx context.Context) (*runtime.Exit, error) {
response, err := p.shim.task.Delete(ctx, &task.DeleteRequest{
ID: p.shim.ID(),
ExecID: p.id,
})
if err != nil {
return nil, errdefs.FromGRPC(err)
}
return &runtime.Exit{
Status: response.ExitStatus,
Timestamp: protobuf.FromTimestamp(response.ExitedAt),
Pid: response.Pid,
}, nil
}


@@ -0,0 +1,17 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package options


@@ -0,0 +1,161 @@
file {
name: "github.com/containerd/containerd/runtime/v2/runc/options/oci.proto"
package: "containerd.runc.v1"
message_type {
name: "Options"
field {
name: "no_pivot_root"
number: 1
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "noPivotRoot"
}
field {
name: "no_new_keyring"
number: 2
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "noNewKeyring"
}
field {
name: "shim_cgroup"
number: 3
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "shimCgroup"
}
field {
name: "io_uid"
number: 4
label: LABEL_OPTIONAL
type: TYPE_UINT32
json_name: "ioUid"
}
field {
name: "io_gid"
number: 5
label: LABEL_OPTIONAL
type: TYPE_UINT32
json_name: "ioGid"
}
field {
name: "binary_name"
number: 6
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "binaryName"
}
field {
name: "root"
number: 7
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "root"
}
field {
name: "systemd_cgroup"
number: 9
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "systemdCgroup"
}
field {
name: "criu_image_path"
number: 10
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "criuImagePath"
}
field {
name: "criu_work_path"
number: 11
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "criuWorkPath"
}
reserved_range {
start: 8
end: 9
}
}
message_type {
name: "CheckpointOptions"
field {
name: "exit"
number: 1
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "exit"
}
field {
name: "open_tcp"
number: 2
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "openTcp"
}
field {
name: "external_unix_sockets"
number: 3
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "externalUnixSockets"
}
field {
name: "terminal"
number: 4
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "terminal"
}
field {
name: "file_locks"
number: 5
label: LABEL_OPTIONAL
type: TYPE_BOOL
json_name: "fileLocks"
}
field {
name: "empty_namespaces"
number: 6
label: LABEL_REPEATED
type: TYPE_STRING
json_name: "emptyNamespaces"
}
field {
name: "cgroups_mode"
number: 7
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "cgroupsMode"
}
field {
name: "image_path"
number: 8
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "imagePath"
}
field {
name: "work_path"
number: 9
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "workPath"
}
}
message_type {
name: "ProcessDetails"
field {
name: "exec_id"
number: 1
label: LABEL_OPTIONAL
type: TYPE_STRING
json_name: "execId"
}
}
options {
go_package: "github.com/containerd/containerd/v2/runtime/v2/runc/options;options"
}
syntax: "proto3"
}


@@ -0,0 +1,467 @@
// Code generated by protoc-gen-go. DO NOT EDIT.
// versions:
// protoc-gen-go v1.28.1
// protoc v3.20.1
// source: github.com/containerd/containerd/runtime/v2/runc/options/oci.proto
package options
import (
protoreflect "google.golang.org/protobuf/reflect/protoreflect"
protoimpl "google.golang.org/protobuf/runtime/protoimpl"
reflect "reflect"
sync "sync"
)
const (
// Verify that this generated code is sufficiently up-to-date.
_ = protoimpl.EnforceVersion(20 - protoimpl.MinVersion)
// Verify that runtime/protoimpl is sufficiently up-to-date.
_ = protoimpl.EnforceVersion(protoimpl.MaxVersion - 20)
)
type Options struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
// disable pivot root when creating a container
NoPivotRoot bool `protobuf:"varint,1,opt,name=no_pivot_root,json=noPivotRoot,proto3" json:"no_pivot_root,omitempty"`
// create a new keyring for the container
NoNewKeyring bool `protobuf:"varint,2,opt,name=no_new_keyring,json=noNewKeyring,proto3" json:"no_new_keyring,omitempty"`
// place the shim in a cgroup
ShimCgroup string `protobuf:"bytes,3,opt,name=shim_cgroup,json=shimCgroup,proto3" json:"shim_cgroup,omitempty"`
// set the I/O's pipes uid
IoUid uint32 `protobuf:"varint,4,opt,name=io_uid,json=ioUid,proto3" json:"io_uid,omitempty"`
// set the I/O's pipes gid
IoGid uint32 `protobuf:"varint,5,opt,name=io_gid,json=ioGid,proto3" json:"io_gid,omitempty"`
// binary name of the runc binary
BinaryName string `protobuf:"bytes,6,opt,name=binary_name,json=binaryName,proto3" json:"binary_name,omitempty"`
// runc root directory
Root string `protobuf:"bytes,7,opt,name=root,proto3" json:"root,omitempty"`
// enable systemd cgroups
SystemdCgroup bool `protobuf:"varint,9,opt,name=systemd_cgroup,json=systemdCgroup,proto3" json:"systemd_cgroup,omitempty"`
// criu image path
CriuImagePath string `protobuf:"bytes,10,opt,name=criu_image_path,json=criuImagePath,proto3" json:"criu_image_path,omitempty"`
// criu work path
CriuWorkPath string `protobuf:"bytes,11,opt,name=criu_work_path,json=criuWorkPath,proto3" json:"criu_work_path,omitempty"`
}
func (x *Options) Reset() {
*x = Options{}
if protoimpl.UnsafeEnabled {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[0]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
}
func (x *Options) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*Options) ProtoMessage() {}
func (x *Options) ProtoReflect() protoreflect.Message {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[0]
if protoimpl.UnsafeEnabled && x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use Options.ProtoReflect.Descriptor instead.
func (*Options) Descriptor() ([]byte, []int) {
return file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescGZIP(), []int{0}
}
func (x *Options) GetNoPivotRoot() bool {
if x != nil {
return x.NoPivotRoot
}
return false
}
func (x *Options) GetNoNewKeyring() bool {
if x != nil {
return x.NoNewKeyring
}
return false
}
func (x *Options) GetShimCgroup() string {
if x != nil {
return x.ShimCgroup
}
return ""
}
func (x *Options) GetIoUid() uint32 {
if x != nil {
return x.IoUid
}
return 0
}
func (x *Options) GetIoGid() uint32 {
if x != nil {
return x.IoGid
}
return 0
}
func (x *Options) GetBinaryName() string {
if x != nil {
return x.BinaryName
}
return ""
}
func (x *Options) GetRoot() string {
if x != nil {
return x.Root
}
return ""
}
func (x *Options) GetSystemdCgroup() bool {
if x != nil {
return x.SystemdCgroup
}
return false
}
func (x *Options) GetCriuImagePath() string {
if x != nil {
return x.CriuImagePath
}
return ""
}
func (x *Options) GetCriuWorkPath() string {
if x != nil {
return x.CriuWorkPath
}
return ""
}
type CheckpointOptions struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
// exit the container after a checkpoint
Exit bool `protobuf:"varint,1,opt,name=exit,proto3" json:"exit,omitempty"`
// checkpoint open tcp connections
OpenTcp bool `protobuf:"varint,2,opt,name=open_tcp,json=openTcp,proto3" json:"open_tcp,omitempty"`
// checkpoint external unix sockets
ExternalUnixSockets bool `protobuf:"varint,3,opt,name=external_unix_sockets,json=externalUnixSockets,proto3" json:"external_unix_sockets,omitempty"`
// checkpoint terminals (ptys)
Terminal bool `protobuf:"varint,4,opt,name=terminal,proto3" json:"terminal,omitempty"`
// allow checkpointing of file locks
FileLocks bool `protobuf:"varint,5,opt,name=file_locks,json=fileLocks,proto3" json:"file_locks,omitempty"`
// restore provided namespaces as empty namespaces
EmptyNamespaces []string `protobuf:"bytes,6,rep,name=empty_namespaces,json=emptyNamespaces,proto3" json:"empty_namespaces,omitempty"`
// set the cgroups mode, soft, full, strict
CgroupsMode string `protobuf:"bytes,7,opt,name=cgroups_mode,json=cgroupsMode,proto3" json:"cgroups_mode,omitempty"`
// checkpoint image path
ImagePath string `protobuf:"bytes,8,opt,name=image_path,json=imagePath,proto3" json:"image_path,omitempty"`
// checkpoint work path
WorkPath string `protobuf:"bytes,9,opt,name=work_path,json=workPath,proto3" json:"work_path,omitempty"`
}
func (x *CheckpointOptions) Reset() {
*x = CheckpointOptions{}
if protoimpl.UnsafeEnabled {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[1]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
}
func (x *CheckpointOptions) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*CheckpointOptions) ProtoMessage() {}
func (x *CheckpointOptions) ProtoReflect() protoreflect.Message {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[1]
if protoimpl.UnsafeEnabled && x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use CheckpointOptions.ProtoReflect.Descriptor instead.
func (*CheckpointOptions) Descriptor() ([]byte, []int) {
return file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescGZIP(), []int{1}
}
func (x *CheckpointOptions) GetExit() bool {
if x != nil {
return x.Exit
}
return false
}
func (x *CheckpointOptions) GetOpenTcp() bool {
if x != nil {
return x.OpenTcp
}
return false
}
func (x *CheckpointOptions) GetExternalUnixSockets() bool {
if x != nil {
return x.ExternalUnixSockets
}
return false
}
func (x *CheckpointOptions) GetTerminal() bool {
if x != nil {
return x.Terminal
}
return false
}
func (x *CheckpointOptions) GetFileLocks() bool {
if x != nil {
return x.FileLocks
}
return false
}
func (x *CheckpointOptions) GetEmptyNamespaces() []string {
if x != nil {
return x.EmptyNamespaces
}
return nil
}
func (x *CheckpointOptions) GetCgroupsMode() string {
if x != nil {
return x.CgroupsMode
}
return ""
}
func (x *CheckpointOptions) GetImagePath() string {
if x != nil {
return x.ImagePath
}
return ""
}
func (x *CheckpointOptions) GetWorkPath() string {
if x != nil {
return x.WorkPath
}
return ""
}
type ProcessDetails struct {
state protoimpl.MessageState
sizeCache protoimpl.SizeCache
unknownFields protoimpl.UnknownFields
// exec process id if the process is managed by a shim
ExecID string `protobuf:"bytes,1,opt,name=exec_id,json=execId,proto3" json:"exec_id,omitempty"`
}
func (x *ProcessDetails) Reset() {
*x = ProcessDetails{}
if protoimpl.UnsafeEnabled {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[2]
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
ms.StoreMessageInfo(mi)
}
}
func (x *ProcessDetails) String() string {
return protoimpl.X.MessageStringOf(x)
}
func (*ProcessDetails) ProtoMessage() {}
func (x *ProcessDetails) ProtoReflect() protoreflect.Message {
mi := &file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[2]
if protoimpl.UnsafeEnabled && x != nil {
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
if ms.LoadMessageInfo() == nil {
ms.StoreMessageInfo(mi)
}
return ms
}
return mi.MessageOf(x)
}
// Deprecated: Use ProcessDetails.ProtoReflect.Descriptor instead.
func (*ProcessDetails) Descriptor() ([]byte, []int) {
return file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescGZIP(), []int{2}
}
func (x *ProcessDetails) GetExecID() string {
if x != nil {
return x.ExecID
}
return ""
}
var File_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto protoreflect.FileDescriptor
var file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDesc = []byte{
0x0a, 0x42, 0x67, 0x69, 0x74, 0x68, 0x75, 0x62, 0x2e, 0x63, 0x6f, 0x6d, 0x2f, 0x63, 0x6f, 0x6e,
0x74, 0x61, 0x69, 0x6e, 0x65, 0x72, 0x64, 0x2f, 0x63, 0x6f, 0x6e, 0x74, 0x61, 0x69, 0x6e, 0x65,
0x72, 0x64, 0x2f, 0x72, 0x75, 0x6e, 0x74, 0x69, 0x6d, 0x65, 0x2f, 0x76, 0x32, 0x2f, 0x72, 0x75,
0x6e, 0x63, 0x2f, 0x6f, 0x70, 0x74, 0x69, 0x6f, 0x6e, 0x73, 0x2f, 0x6f, 0x63, 0x69, 0x2e, 0x70,
0x72, 0x6f, 0x74, 0x6f, 0x12, 0x12, 0x63, 0x6f, 0x6e, 0x74, 0x61, 0x69, 0x6e, 0x65, 0x72, 0x64,
0x2e, 0x72, 0x75, 0x6e, 0x63, 0x2e, 0x76, 0x31, 0x22, 0xd2, 0x02, 0x0a, 0x07, 0x4f, 0x70, 0x74,
0x69, 0x6f, 0x6e, 0x73, 0x12, 0x22, 0x0a, 0x0d, 0x6e, 0x6f, 0x5f, 0x70, 0x69, 0x76, 0x6f, 0x74,
0x5f, 0x72, 0x6f, 0x6f, 0x74, 0x18, 0x01, 0x20, 0x01, 0x28, 0x08, 0x52, 0x0b, 0x6e, 0x6f, 0x50,
0x69, 0x76, 0x6f, 0x74, 0x52, 0x6f, 0x6f, 0x74, 0x12, 0x24, 0x0a, 0x0e, 0x6e, 0x6f, 0x5f, 0x6e,
0x65, 0x77, 0x5f, 0x6b, 0x65, 0x79, 0x72, 0x69, 0x6e, 0x67, 0x18, 0x02, 0x20, 0x01, 0x28, 0x08,
0x52, 0x0c, 0x6e, 0x6f, 0x4e, 0x65, 0x77, 0x4b, 0x65, 0x79, 0x72, 0x69, 0x6e, 0x67, 0x12, 0x1f,
0x0a, 0x0b, 0x73, 0x68, 0x69, 0x6d, 0x5f, 0x63, 0x67, 0x72, 0x6f, 0x75, 0x70, 0x18, 0x03, 0x20,
0x01, 0x28, 0x09, 0x52, 0x0a, 0x73, 0x68, 0x69, 0x6d, 0x43, 0x67, 0x72, 0x6f, 0x75, 0x70, 0x12,
0x15, 0x0a, 0x06, 0x69, 0x6f, 0x5f, 0x75, 0x69, 0x64, 0x18, 0x04, 0x20, 0x01, 0x28, 0x0d, 0x52,
0x05, 0x69, 0x6f, 0x55, 0x69, 0x64, 0x12, 0x15, 0x0a, 0x06, 0x69, 0x6f, 0x5f, 0x67, 0x69, 0x64,
0x18, 0x05, 0x20, 0x01, 0x28, 0x0d, 0x52, 0x05, 0x69, 0x6f, 0x47, 0x69, 0x64, 0x12, 0x1f, 0x0a,
0x0b, 0x62, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x5f, 0x6e, 0x61, 0x6d, 0x65, 0x18, 0x06, 0x20, 0x01,
0x28, 0x09, 0x52, 0x0a, 0x62, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x4e, 0x61, 0x6d, 0x65, 0x12, 0x12,
0x0a, 0x04, 0x72, 0x6f, 0x6f, 0x74, 0x18, 0x07, 0x20, 0x01, 0x28, 0x09, 0x52, 0x04, 0x72, 0x6f,
0x6f, 0x74, 0x12, 0x25, 0x0a, 0x0e, 0x73, 0x79, 0x73, 0x74, 0x65, 0x6d, 0x64, 0x5f, 0x63, 0x67,
0x72, 0x6f, 0x75, 0x70, 0x18, 0x09, 0x20, 0x01, 0x28, 0x08, 0x52, 0x0d, 0x73, 0x79, 0x73, 0x74,
0x65, 0x6d, 0x64, 0x43, 0x67, 0x72, 0x6f, 0x75, 0x70, 0x12, 0x26, 0x0a, 0x0f, 0x63, 0x72, 0x69,
0x75, 0x5f, 0x69, 0x6d, 0x61, 0x67, 0x65, 0x5f, 0x70, 0x61, 0x74, 0x68, 0x18, 0x0a, 0x20, 0x01,
0x28, 0x09, 0x52, 0x0d, 0x63, 0x72, 0x69, 0x75, 0x49, 0x6d, 0x61, 0x67, 0x65, 0x50, 0x61, 0x74,
0x68, 0x12, 0x24, 0x0a, 0x0e, 0x63, 0x72, 0x69, 0x75, 0x5f, 0x77, 0x6f, 0x72, 0x6b, 0x5f, 0x70,
0x61, 0x74, 0x68, 0x18, 0x0b, 0x20, 0x01, 0x28, 0x09, 0x52, 0x0c, 0x63, 0x72, 0x69, 0x75, 0x57,
0x6f, 0x72, 0x6b, 0x50, 0x61, 0x74, 0x68, 0x4a, 0x04, 0x08, 0x08, 0x10, 0x09, 0x22, 0xbb, 0x02,
0x0a, 0x11, 0x43, 0x68, 0x65, 0x63, 0x6b, 0x70, 0x6f, 0x69, 0x6e, 0x74, 0x4f, 0x70, 0x74, 0x69,
0x6f, 0x6e, 0x73, 0x12, 0x12, 0x0a, 0x04, 0x65, 0x78, 0x69, 0x74, 0x18, 0x01, 0x20, 0x01, 0x28,
0x08, 0x52, 0x04, 0x65, 0x78, 0x69, 0x74, 0x12, 0x19, 0x0a, 0x08, 0x6f, 0x70, 0x65, 0x6e, 0x5f,
0x74, 0x63, 0x70, 0x18, 0x02, 0x20, 0x01, 0x28, 0x08, 0x52, 0x07, 0x6f, 0x70, 0x65, 0x6e, 0x54,
0x63, 0x70, 0x12, 0x32, 0x0a, 0x15, 0x65, 0x78, 0x74, 0x65, 0x72, 0x6e, 0x61, 0x6c, 0x5f, 0x75,
0x6e, 0x69, 0x78, 0x5f, 0x73, 0x6f, 0x63, 0x6b, 0x65, 0x74, 0x73, 0x18, 0x03, 0x20, 0x01, 0x28,
0x08, 0x52, 0x13, 0x65, 0x78, 0x74, 0x65, 0x72, 0x6e, 0x61, 0x6c, 0x55, 0x6e, 0x69, 0x78, 0x53,
0x6f, 0x63, 0x6b, 0x65, 0x74, 0x73, 0x12, 0x1a, 0x0a, 0x08, 0x74, 0x65, 0x72, 0x6d, 0x69, 0x6e,
0x61, 0x6c, 0x18, 0x04, 0x20, 0x01, 0x28, 0x08, 0x52, 0x08, 0x74, 0x65, 0x72, 0x6d, 0x69, 0x6e,
0x61, 0x6c, 0x12, 0x1d, 0x0a, 0x0a, 0x66, 0x69, 0x6c, 0x65, 0x5f, 0x6c, 0x6f, 0x63, 0x6b, 0x73,
0x18, 0x05, 0x20, 0x01, 0x28, 0x08, 0x52, 0x09, 0x66, 0x69, 0x6c, 0x65, 0x4c, 0x6f, 0x63, 0x6b,
0x73, 0x12, 0x29, 0x0a, 0x10, 0x65, 0x6d, 0x70, 0x74, 0x79, 0x5f, 0x6e, 0x61, 0x6d, 0x65, 0x73,
0x70, 0x61, 0x63, 0x65, 0x73, 0x18, 0x06, 0x20, 0x03, 0x28, 0x09, 0x52, 0x0f, 0x65, 0x6d, 0x70,
0x74, 0x79, 0x4e, 0x61, 0x6d, 0x65, 0x73, 0x70, 0x61, 0x63, 0x65, 0x73, 0x12, 0x21, 0x0a, 0x0c,
0x63, 0x67, 0x72, 0x6f, 0x75, 0x70, 0x73, 0x5f, 0x6d, 0x6f, 0x64, 0x65, 0x18, 0x07, 0x20, 0x01,
0x28, 0x09, 0x52, 0x0b, 0x63, 0x67, 0x72, 0x6f, 0x75, 0x70, 0x73, 0x4d, 0x6f, 0x64, 0x65, 0x12,
0x1d, 0x0a, 0x0a, 0x69, 0x6d, 0x61, 0x67, 0x65, 0x5f, 0x70, 0x61, 0x74, 0x68, 0x18, 0x08, 0x20,
0x01, 0x28, 0x09, 0x52, 0x09, 0x69, 0x6d, 0x61, 0x67, 0x65, 0x50, 0x61, 0x74, 0x68, 0x12, 0x1b,
0x0a, 0x09, 0x77, 0x6f, 0x72, 0x6b, 0x5f, 0x70, 0x61, 0x74, 0x68, 0x18, 0x09, 0x20, 0x01, 0x28,
0x09, 0x52, 0x08, 0x77, 0x6f, 0x72, 0x6b, 0x50, 0x61, 0x74, 0x68, 0x22, 0x29, 0x0a, 0x0e, 0x50,
0x72, 0x6f, 0x63, 0x65, 0x73, 0x73, 0x44, 0x65, 0x74, 0x61, 0x69, 0x6c, 0x73, 0x12, 0x17, 0x0a,
0x07, 0x65, 0x78, 0x65, 0x63, 0x5f, 0x69, 0x64, 0x18, 0x01, 0x20, 0x01, 0x28, 0x09, 0x52, 0x06,
0x65, 0x78, 0x65, 0x63, 0x49, 0x64, 0x42, 0x45, 0x5a, 0x43, 0x67, 0x69, 0x74, 0x68, 0x75, 0x62,
0x2e, 0x63, 0x6f, 0x6d, 0x2f, 0x63, 0x6f, 0x6e, 0x74, 0x61, 0x69, 0x6e, 0x65, 0x72, 0x64, 0x2f,
0x63, 0x6f, 0x6e, 0x74, 0x61, 0x69, 0x6e, 0x65, 0x72, 0x64, 0x2f, 0x76, 0x32, 0x2f, 0x72, 0x75,
0x6e, 0x74, 0x69, 0x6d, 0x65, 0x2f, 0x76, 0x32, 0x2f, 0x72, 0x75, 0x6e, 0x63, 0x2f, 0x6f, 0x70,
0x74, 0x69, 0x6f, 0x6e, 0x73, 0x3b, 0x6f, 0x70, 0x74, 0x69, 0x6f, 0x6e, 0x73, 0x62, 0x06, 0x70,
0x72, 0x6f, 0x74, 0x6f, 0x33,
}
var (
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescOnce sync.Once
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescData = file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDesc
)
func file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescGZIP() []byte {
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescOnce.Do(func() {
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescData = protoimpl.X.CompressGZIP(file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescData)
})
return file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDescData
}
var file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes = make([]protoimpl.MessageInfo, 3)
var file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_goTypes = []interface{}{
(*Options)(nil), // 0: containerd.runc.v1.Options
(*CheckpointOptions)(nil), // 1: containerd.runc.v1.CheckpointOptions
(*ProcessDetails)(nil), // 2: containerd.runc.v1.ProcessDetails
}
var file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_depIdxs = []int32{
0, // [0:0] is the sub-list for method output_type
0, // [0:0] is the sub-list for method input_type
0, // [0:0] is the sub-list for extension type_name
0, // [0:0] is the sub-list for extension extendee
0, // [0:0] is the sub-list for field type_name
}
func init() { file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_init() }
func file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_init() {
if File_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto != nil {
return
}
if !protoimpl.UnsafeEnabled {
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[0].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*Options); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[1].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*CheckpointOptions); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes[2].Exporter = func(v interface{}, i int) interface{} {
switch v := v.(*ProcessDetails); i {
case 0:
return &v.state
case 1:
return &v.sizeCache
case 2:
return &v.unknownFields
default:
return nil
}
}
}
type x struct{}
out := protoimpl.TypeBuilder{
File: protoimpl.DescBuilder{
GoPackagePath: reflect.TypeOf(x{}).PkgPath(),
RawDescriptor: file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDesc,
NumEnums: 0,
NumMessages: 3,
NumExtensions: 0,
NumServices: 0,
},
GoTypes: file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_goTypes,
DependencyIndexes: file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_depIdxs,
MessageInfos: file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_msgTypes,
}.Build()
File_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto = out.File
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_rawDesc = nil
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_goTypes = nil
file_github_com_containerd_containerd_runtime_v2_runc_options_oci_proto_depIdxs = nil
}


@@ -0,0 +1,58 @@
syntax = "proto3";
package containerd.runc.v1;
option go_package = "github.com/containerd/containerd/v2/runtime/v2/runc/options;options";
message Options {
// disable pivot root when creating a container
bool no_pivot_root = 1;
// create a new keyring for the container
bool no_new_keyring = 2;
// place the shim in a cgroup
string shim_cgroup = 3;
// set the I/O's pipes uid
uint32 io_uid = 4;
// set the I/O's pipes gid
uint32 io_gid = 5;
// binary name of the runc binary
string binary_name = 6;
// runc root directory
string root = 7;
// criu binary path.
//
// Removed in containerd v2.0: string criu_path = 8;
reserved 8;
// enable systemd cgroups
bool systemd_cgroup = 9;
// criu image path
string criu_image_path = 10;
// criu work path
string criu_work_path = 11;
}
message CheckpointOptions {
// exit the container after a checkpoint
bool exit = 1;
// checkpoint open tcp connections
bool open_tcp = 2;
// checkpoint external unix sockets
bool external_unix_sockets = 3;
// checkpoint terminals (ptys)
bool terminal = 4;
// allow checkpointing of file locks
bool file_locks = 5;
// restore provided namespaces as empty namespaces
repeated string empty_namespaces = 6;
// set the cgroups mode, soft, full, strict
string cgroups_mode = 7;
// checkpoint image path
string image_path = 8;
// checkpoint work path
string work_path = 9;
}
message ProcessDetails {
// exec process id if the process is managed by a shim
string exec_id = 1;
}
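
As a hedged sketch of how these options reach the runtime (not part of this commit), a client can marshal Options into the Any payload that ends up as the task options forwarded to the shim; the typeurl module usage and the printed type URL are assumptions:

package main

import (
	"fmt"

	"github.com/containerd/containerd/v2/runtime/v2/runc/options"
	typeurl "github.com/containerd/typeurl/v2"
)

func main() {
	// Options for a runc-based shim; the binary path is hypothetical.
	opts := &options.Options{
		SystemdCgroup: true,
		BinaryName:    "/usr/local/sbin/runc",
	}
	// Marshal into an Any, the form in which task options travel to the shim.
	anyOpts, err := typeurl.MarshalAny(opts)
	if err != nil {
		panic(err)
	}
	fmt.Println(anyOpts.GetTypeUrl())
}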

762
core/runtime/v2/shim.go Normal file

@@ -0,0 +1,762 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"encoding/json"
"errors"
"fmt"
"io"
"net"
"os"
"path/filepath"
"strings"
"time"
"github.com/containerd/containerd/v2/pkg/atomicfile"
"github.com/containerd/containerd/v2/pkg/dialer"
"github.com/containerd/ttrpc"
"google.golang.org/grpc"
"google.golang.org/grpc/connectivity"
"google.golang.org/grpc/credentials/insecure"
eventstypes "github.com/containerd/containerd/v2/api/events"
task "github.com/containerd/containerd/v2/api/runtime/task/v3"
"github.com/containerd/containerd/v2/api/types"
"github.com/containerd/containerd/v2/core/runtime"
client "github.com/containerd/containerd/v2/core/runtime/v2/shim"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/pkg/events/exchange"
"github.com/containerd/containerd/v2/pkg/identifiers"
"github.com/containerd/containerd/v2/pkg/timeout"
"github.com/containerd/containerd/v2/protobuf"
ptypes "github.com/containerd/containerd/v2/protobuf/types"
"github.com/containerd/log"
)
const (
loadTimeout = "io.containerd.timeout.shim.load"
cleanupTimeout = "io.containerd.timeout.shim.cleanup"
shutdownTimeout = "io.containerd.timeout.shim.shutdown"
)
func init() {
timeout.Set(loadTimeout, 5*time.Second)
timeout.Set(cleanupTimeout, 5*time.Second)
timeout.Set(shutdownTimeout, 3*time.Second)
}
func loadShim(ctx context.Context, bundle *Bundle, onClose func()) (_ ShimInstance, retErr error) {
shimCtx, cancelShimLog := context.WithCancel(ctx)
defer func() {
if retErr != nil {
cancelShimLog()
}
}()
f, err := openShimLog(shimCtx, bundle, client.AnonReconnectDialer)
if err != nil {
return nil, fmt.Errorf("open shim log pipe when reload: %w", err)
}
defer func() {
if retErr != nil {
f.Close()
}
}()
// open the log pipe and block until the writer is ready
// this helps with synchronization of the shim
// copy the shim's logs to containerd's output
go func() {
defer f.Close()
_, err := io.Copy(os.Stderr, f)
// To prevent a flood of error messages, the expected error
// should be reset, like os.ErrClosed or os.ErrNotExist, which
// depends on the platform.
err = checkCopyShimLogError(ctx, err)
if err != nil {
log.G(ctx).WithError(err).Error("copy shim log after reload")
}
}()
onCloseWithShimLog := func() {
onClose()
cancelShimLog()
f.Close()
}
params, err := restoreBootstrapParams(bundle.Path)
if err != nil {
return nil, fmt.Errorf("failed to read boostrap.json when restoring bundle %q: %w", bundle.ID, err)
}
conn, err := makeConnection(ctx, bundle.ID, params, onCloseWithShimLog)
if err != nil {
return nil, fmt.Errorf("unable to make connection: %w", err)
}
defer func() {
if retErr != nil {
conn.Close()
}
}()
shim := &shim{
bundle: bundle,
client: conn,
version: params.Version,
}
return shim, nil
}
func cleanupAfterDeadShim(ctx context.Context, id string, rt *runtime.NSMap[ShimInstance], events *exchange.Exchange, binaryCall *binary) {
ctx, cancel := timeout.WithContext(ctx, cleanupTimeout)
defer cancel()
log.G(ctx).WithField("id", id).Warn("cleaning up after shim disconnected")
response, err := binaryCall.Delete(ctx)
if err != nil {
log.G(ctx).WithError(err).WithField("id", id).Warn("failed to clean up after shim disconnected")
}
if _, err := rt.Get(ctx, id); err != nil {
// Task was never started or was already successfully deleted
// No need to publish events
return
}
var (
pid uint32
exitStatus uint32
exitedAt time.Time
)
if response != nil {
pid = response.Pid
exitStatus = response.Status
exitedAt = response.Timestamp
} else {
exitStatus = 255
exitedAt = time.Now()
}
events.Publish(ctx, runtime.TaskExitEventTopic, &eventstypes.TaskExit{
ContainerID: id,
ID: id,
Pid: pid,
ExitStatus: exitStatus,
ExitedAt: protobuf.ToTimestamp(exitedAt),
})
events.Publish(ctx, runtime.TaskDeleteEventTopic, &eventstypes.TaskDelete{
ContainerID: id,
Pid: pid,
ExitStatus: exitStatus,
ExitedAt: protobuf.ToTimestamp(exitedAt),
})
}
// CurrentShimVersion is the latest shim version supported by containerd (e.g. TaskService v3).
const CurrentShimVersion = 3
// ShimInstance represents a running shim process managed by ShimManager.
type ShimInstance interface {
io.Closer
// ID of the shim.
ID() string
// Namespace of this shim.
Namespace() string
// Bundle is a file system path to shim's bundle.
Bundle() string
// Client returns the underlying TTRPC or GRPC client object for this shim.
// The underlying object can be either *ttrpc.Client or grpc.ClientConnInterface.
Client() any
// Delete will close the client and remove bundle from disk.
Delete(ctx context.Context) error
// Version returns shim's features compatibility version.
Version() int
}
func parseStartResponse(response []byte) (client.BootstrapParams, error) {
var params client.BootstrapParams
if err := json.Unmarshal(response, &params); err != nil || params.Version < 2 {
// Use TTRPC for legacy shims
params.Address = string(response)
params.Protocol = "ttrpc"
params.Version = 2
}
if params.Version > CurrentShimVersion {
return client.BootstrapParams{}, fmt.Errorf("unsupported shim version (%d): %w", params.Version, errdefs.ErrNotImplemented)
}
return params, nil
}
// writeBootstrapParams writes shim's bootstrap configuration (e.g. how to connect, version, etc).
func writeBootstrapParams(path string, params client.BootstrapParams) error {
path, err := filepath.Abs(path)
if err != nil {
return err
}
data, err := json.Marshal(&params)
if err != nil {
return err
}
f, err := atomicfile.New(path, 0o666)
if err != nil {
return err
}
_, err = f.Write(data)
if err != nil {
f.Cancel()
return err
}
return f.Close()
}
func readBootstrapParams(path string) (client.BootstrapParams, error) {
path, err := filepath.Abs(path)
if err != nil {
return client.BootstrapParams{}, err
}
data, err := os.ReadFile(path)
if err != nil {
return client.BootstrapParams{}, err
}
return parseStartResponse(data)
}
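// As a hypothetical illustration (not part of this diff): the bootstrap
// parameters a v3 shim reports on start are persisted as <bundle>/bootstrap.json
// by writeBootstrapParams and read back above; the socket path here is made up.
//
//	params := client.BootstrapParams{
//		Version:  CurrentShimVersion, // 3
//		Protocol: "ttrpc",            // or "grpc"
//		Address:  "unix:///run/containerd/s/example.sock",
//	}
//	_ = writeBootstrapParams(filepath.Join(bundlePath, "bootstrap.json"), params)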
// makeConnection creates a new TTRPC or GRPC connection object from the shim's bootstrap params,
// which carry the socket address, the protocol (ttrpc or grpc), and the shim version.
func makeConnection(ctx context.Context, id string, params client.BootstrapParams, onClose func()) (_ io.Closer, retErr error) {
log.G(ctx).WithFields(log.Fields{
"address": params.Address,
"protocol": params.Protocol,
"version": params.Version,
}).Infof("connecting to shim %s", id)
switch strings.ToLower(params.Protocol) {
case "ttrpc":
conn, err := client.Connect(params.Address, client.AnonReconnectDialer)
if err != nil {
return nil, fmt.Errorf("failed to create TTRPC connection: %w", err)
}
defer func() {
if retErr != nil {
conn.Close()
}
}()
return ttrpc.NewClient(conn, ttrpc.WithOnClose(onClose)), nil
case "grpc":
ctx, cancel := context.WithTimeout(ctx, time.Second*100)
defer cancel()
gopts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithBlock(),
}
return grpcDialContext(ctx, params.Address, onClose, gopts...)
default:
return nil, fmt.Errorf("unexpected protocol: %q", params.Protocol)
}
}
// grpcDialContext and the underlying grpcConn type exist solely
// so we can have something similar to ttrpc.WithOnClose to have
// a callback run when the connection is severed or explicitly closed.
func grpcDialContext(
ctx context.Context,
address string,
onClose func(),
gopts ...grpc.DialOption,
) (*grpcConn, error) {
// If grpc.WithBlock is specified in gopts this causes the connection to block waiting for
// a connection regardless of if the socket exists or has a listener when Dial begins. This
// specific behavior of WithBlock is mostly undesirable for shims, as if the socket isn't
// there when we go to load/connect there's likely an issue. However, getting rid of WithBlock is
// also undesirable as we don't want the background connection behavior, we want to ensure
// a connection before moving on. To bring this in line with the ttrpc connection behavior
// let's do an initial dial to ensure the shim's socket is actually available. stat wouldn't suffice
// here as if the shim exited unexpectedly its socket may still be on the filesystem, but it'd return
// ECONNREFUSED which grpc.DialContext will happily trudge along through for the full timeout.
//
// This is especially helpful on restart of containerd as if the shim died while containerd
// was down, we end up waiting the full timeout.
conn, err := net.DialTimeout("unix", address, time.Second*10)
if err != nil {
return nil, err
}
conn.Close()
target := dialer.DialAddress(address)
client, err := grpc.DialContext(ctx, target, gopts...)
if err != nil {
return nil, fmt.Errorf("failed to create GRPC connection: %w", err)
}
done := make(chan struct{})
go func() {
gctx := context.Background()
sourceState := connectivity.Ready
for {
if client.WaitForStateChange(gctx, sourceState) {
state := client.GetState()
if state == connectivity.Idle || state == connectivity.Shutdown {
break
}
// Could be transient failure. Lets see if we can get back to a working
// state.
log.G(gctx).WithFields(log.Fields{
"state": state,
"addr": target,
}).Warn("shim grpc connection unexpected state")
sourceState = state
}
}
onClose()
close(done)
}()
return &grpcConn{
ClientConn: client,
onCloseDone: done,
}, nil
}
type grpcConn struct {
*grpc.ClientConn
onCloseDone chan struct{}
}
func (gc *grpcConn) UserOnCloseWait(ctx context.Context) error {
select {
case <-gc.onCloseDone:
return nil
case <-ctx.Done():
return ctx.Err()
}
}
type shim struct {
bundle *Bundle
client any
version int
}
var _ ShimInstance = (*shim)(nil)
// ID of the shim/task
func (s *shim) ID() string {
return s.bundle.ID
}
func (s *shim) Version() int {
return s.version
}
func (s *shim) Namespace() string {
return s.bundle.Namespace
}
func (s *shim) Bundle() string {
return s.bundle.Path
}
func (s *shim) Client() any {
return s.client
}
// Close closes the underlying client connection.
func (s *shim) Close() error {
if ttrpcClient, ok := s.client.(*ttrpc.Client); ok {
return ttrpcClient.Close()
}
if grpcClient, ok := s.client.(*grpcConn); ok {
return grpcClient.Close()
}
return nil
}
func (s *shim) Delete(ctx context.Context) error {
var result []error
if ttrpcClient, ok := s.client.(*ttrpc.Client); ok {
if err := ttrpcClient.Close(); err != nil {
result = append(result, fmt.Errorf("failed to close ttrpc client: %w", err))
}
if err := ttrpcClient.UserOnCloseWait(ctx); err != nil {
result = append(result, fmt.Errorf("close wait error: %w", err))
}
}
if grpcClient, ok := s.client.(*grpcConn); ok {
if err := grpcClient.Close(); err != nil {
result = append(result, fmt.Errorf("failed to close grpc client: %w", err))
}
if err := grpcClient.UserOnCloseWait(ctx); err != nil {
result = append(result, fmt.Errorf("close wait error: %w", err))
}
}
if err := s.bundle.Delete(); err != nil {
log.G(ctx).WithField("id", s.ID()).WithError(err).Error("failed to delete bundle")
result = append(result, fmt.Errorf("failed to delete bundle: %w", err))
}
return errors.Join(result...)
}
var _ runtime.Task = &shimTask{}
// shimTask wraps a shim process and adds a task service client for compatibility with the existing shim manager.
type shimTask struct {
ShimInstance
task TaskServiceClient
}
func newShimTask(shim ShimInstance) (*shimTask, error) {
taskClient, err := NewTaskClient(shim.Client(), shim.Version())
if err != nil {
return nil, err
}
return &shimTask{
ShimInstance: shim,
task: taskClient,
}, nil
}
func (s *shimTask) Shutdown(ctx context.Context) error {
_, err := s.task.Shutdown(ctx, &task.ShutdownRequest{
ID: s.ID(),
})
if err != nil && !errors.Is(err, ttrpc.ErrClosed) {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) waitShutdown(ctx context.Context) error {
ctx, cancel := timeout.WithContext(ctx, shutdownTimeout)
defer cancel()
return s.Shutdown(ctx)
}
// PID of the task
func (s *shimTask) PID(ctx context.Context) (uint32, error) {
response, err := s.task.Connect(ctx, &task.ConnectRequest{
ID: s.ID(),
})
if err != nil {
return 0, errdefs.FromGRPC(err)
}
return response.TaskPid, nil
}
func (s *shimTask) delete(ctx context.Context, sandboxed bool, removeTask func(ctx context.Context, id string)) (*runtime.Exit, error) {
response, shimErr := s.task.Delete(ctx, &task.DeleteRequest{
ID: s.ID(),
})
if shimErr != nil {
log.G(ctx).WithField("id", s.ID()).WithError(shimErr).Debug("failed to delete task")
if !errors.Is(shimErr, ttrpc.ErrClosed) {
shimErr = errdefs.FromGRPC(shimErr)
if !errdefs.IsNotFound(shimErr) {
return nil, shimErr
}
}
}
// NOTE: If the shim has been killed and the ttrpc connection has been
// closed, shimErr will not be nil. In that case the event subscriber,
// such as moby/moby, might already have received the exit or delete
// events, but just in case we allow the ttrpc-callback-on-close to
// send them again. The exit status will then depend on the result of
// shimV2.Delete.
//
// Otherwise the exit and delete events have already been delivered, so
// we remove the record here to prevent the ttrpc-callback-on-close from
// emitting duplicates.
//
// TODO: It's hard to guarantee that an event is unique and sent only
// once. moby/moby should not rely on the assumption that there is only
// one exit event, and should handle duplicate events.
//
// REF: https://github.com/containerd/containerd/issues/4769
if shimErr == nil {
removeTask(ctx, s.ID())
}
// Don't shut down the sandbox, as there may be other containers running.
// Let the controller decide when to shut down.
if !sandboxed {
if err := s.waitShutdown(ctx); err != nil {
// FIXME(fuweid):
//
// If the error is context canceled, should we use context.TODO()
// to wait for it?
log.G(ctx).WithField("id", s.ID()).WithError(err).Error("failed to shutdown shim task and the shim might be leaked")
}
}
if err := s.ShimInstance.Delete(ctx); err != nil {
log.G(ctx).WithField("id", s.ID()).WithError(err).Error("failed to delete shim")
}
// remove self from the runtime task list
// this seems dirty but it cleans up the API across runtimes, tasks, and the service
removeTask(ctx, s.ID())
if shimErr != nil {
return nil, shimErr
}
return &runtime.Exit{
Status: response.ExitStatus,
Timestamp: protobuf.FromTimestamp(response.ExitedAt),
Pid: response.Pid,
}, nil
}
func (s *shimTask) Create(ctx context.Context, opts runtime.CreateOpts) (runtime.Task, error) {
topts := opts.TaskOptions
if topts == nil || topts.GetValue() == nil {
topts = opts.RuntimeOptions
}
request := &task.CreateTaskRequest{
ID: s.ID(),
Bundle: s.Bundle(),
Stdin: opts.IO.Stdin,
Stdout: opts.IO.Stdout,
Stderr: opts.IO.Stderr,
Terminal: opts.IO.Terminal,
Checkpoint: opts.Checkpoint,
Options: protobuf.FromAny(topts),
}
for _, m := range opts.Rootfs {
request.Rootfs = append(request.Rootfs, &types.Mount{
Type: m.Type,
Source: m.Source,
Target: m.Target,
Options: m.Options,
})
}
_, err := s.task.Create(ctx, request)
if err != nil {
return nil, errdefs.FromGRPC(err)
}
return s, nil
}
func (s *shimTask) Pause(ctx context.Context) error {
if _, err := s.task.Pause(ctx, &task.PauseRequest{
ID: s.ID(),
}); err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Resume(ctx context.Context) error {
if _, err := s.task.Resume(ctx, &task.ResumeRequest{
ID: s.ID(),
}); err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Start(ctx context.Context) error {
_, err := s.task.Start(ctx, &task.StartRequest{
ID: s.ID(),
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Kill(ctx context.Context, signal uint32, all bool) error {
if _, err := s.task.Kill(ctx, &task.KillRequest{
ID: s.ID(),
Signal: signal,
All: all,
}); err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Exec(ctx context.Context, id string, opts runtime.ExecOpts) (runtime.ExecProcess, error) {
if err := identifiers.Validate(id); err != nil {
return nil, fmt.Errorf("invalid exec id %s: %w", id, err)
}
request := &task.ExecProcessRequest{
ID: s.ID(),
ExecID: id,
Stdin: opts.IO.Stdin,
Stdout: opts.IO.Stdout,
Stderr: opts.IO.Stderr,
Terminal: opts.IO.Terminal,
Spec: opts.Spec,
}
if _, err := s.task.Exec(ctx, request); err != nil {
return nil, errdefs.FromGRPC(err)
}
return &process{
id: id,
shim: s,
}, nil
}
func (s *shimTask) Pids(ctx context.Context) ([]runtime.ProcessInfo, error) {
resp, err := s.task.Pids(ctx, &task.PidsRequest{
ID: s.ID(),
})
if err != nil {
return nil, errdefs.FromGRPC(err)
}
var processList []runtime.ProcessInfo
for _, p := range resp.Processes {
processList = append(processList, runtime.ProcessInfo{
Pid: p.Pid,
Info: p.Info,
})
}
return processList, nil
}
func (s *shimTask) ResizePty(ctx context.Context, size runtime.ConsoleSize) error {
_, err := s.task.ResizePty(ctx, &task.ResizePtyRequest{
ID: s.ID(),
Width: size.Width,
Height: size.Height,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) CloseIO(ctx context.Context) error {
_, err := s.task.CloseIO(ctx, &task.CloseIORequest{
ID: s.ID(),
Stdin: true,
})
if err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Wait(ctx context.Context) (*runtime.Exit, error) {
taskPid, err := s.PID(ctx)
if err != nil {
return nil, err
}
response, err := s.task.Wait(ctx, &task.WaitRequest{
ID: s.ID(),
})
if err != nil {
return nil, errdefs.FromGRPC(err)
}
return &runtime.Exit{
Pid: taskPid,
Timestamp: protobuf.FromTimestamp(response.ExitedAt),
Status: response.ExitStatus,
}, nil
}
func (s *shimTask) Checkpoint(ctx context.Context, path string, options *ptypes.Any) error {
request := &task.CheckpointTaskRequest{
ID: s.ID(),
Path: path,
Options: options,
}
if _, err := s.task.Checkpoint(ctx, request); err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Update(ctx context.Context, resources *ptypes.Any, annotations map[string]string) error {
if _, err := s.task.Update(ctx, &task.UpdateTaskRequest{
ID: s.ID(),
Resources: resources,
Annotations: annotations,
}); err != nil {
return errdefs.FromGRPC(err)
}
return nil
}
func (s *shimTask) Stats(ctx context.Context) (*ptypes.Any, error) {
response, err := s.task.Stats(ctx, &task.StatsRequest{
ID: s.ID(),
})
if err != nil {
return nil, errdefs.FromGRPC(err)
}
return response.Stats, nil
}
func (s *shimTask) Process(ctx context.Context, id string) (runtime.ExecProcess, error) {
p := &process{
id: id,
shim: s,
}
if _, err := p.State(ctx); err != nil {
return nil, err
}
return p, nil
}
func (s *shimTask) State(ctx context.Context) (runtime.State, error) {
response, err := s.task.State(ctx, &task.StateRequest{
ID: s.ID(),
})
if err != nil {
if !errors.Is(err, ttrpc.ErrClosed) {
return runtime.State{}, errdefs.FromGRPC(err)
}
return runtime.State{}, errdefs.ErrNotFound
}
return runtime.State{
Pid: response.Pid,
Status: statusFromProto(response.Status),
Stdin: response.Stdin,
Stdout: response.Stdout,
Stderr: response.Stderr,
Terminal: response.Terminal,
ExitStatus: response.ExitStatus,
ExitedAt: protobuf.FromTimestamp(response.ExitedAt),
}, nil
}

View File

@@ -0,0 +1,169 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"sync"
"time"
v1 "github.com/containerd/containerd/v2/api/services/ttrpc/events/v1"
"github.com/containerd/containerd/v2/pkg/events"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/ttrpcutil"
"github.com/containerd/containerd/v2/protobuf"
"github.com/containerd/log"
"github.com/containerd/ttrpc"
)
const (
queueSize = 2048
maxRequeue = 5
)
type item struct {
ev *v1.Envelope
ctx context.Context
count int
}
// NewPublisher creates a new remote events publisher
func NewPublisher(address string) (*RemoteEventsPublisher, error) {
client, err := ttrpcutil.NewClient(address)
if err != nil {
return nil, err
}
l := &RemoteEventsPublisher{
client: client,
closed: make(chan struct{}),
requeue: make(chan *item, queueSize),
}
go l.processQueue()
return l, nil
}
// RemoteEventsPublisher forwards events to a ttrpc server
type RemoteEventsPublisher struct {
client *ttrpcutil.Client
closed chan struct{}
closer sync.Once
requeue chan *item
}
// Done returns a channel which closes when done
func (l *RemoteEventsPublisher) Done() <-chan struct{} {
return l.closed
}
// Close closes the remote connection and closes the done channel
func (l *RemoteEventsPublisher) Close() (err error) {
err = l.client.Close()
l.closer.Do(func() {
close(l.closed)
})
return err
}
func (l *RemoteEventsPublisher) processQueue() {
for i := range l.requeue {
if i.count > maxRequeue {
log.L.Errorf("evicting %s from queue because of retry count", i.ev.Topic)
// drop the event
continue
}
if err := l.forwardRequest(i.ctx, &v1.ForwardRequest{Envelope: i.ev}); err != nil {
log.L.WithError(err).Error("forward event")
l.queue(i)
}
}
}
func (l *RemoteEventsPublisher) queue(i *item) {
go func() {
i.count++
// re-queue after a short delay
time.Sleep(time.Duration(1*i.count) * time.Second)
l.requeue <- i
}()
}
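// Taken together, processQueue and queue retry a failed event with linearly
// increasing delays (1s, 2s, ... up to 5s) and drop it once its retry count
// exceeds maxRequeue.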
// Publish publishes the event by forwarding it to the configured ttrpc server
func (l *RemoteEventsPublisher) Publish(ctx context.Context, topic string, event events.Event) error {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return err
}
evt, err := protobuf.MarshalAnyToProto(event)
if err != nil {
return err
}
i := &item{
ev: &v1.Envelope{
Timestamp: protobuf.ToTimestamp(time.Now()),
Namespace: ns,
Topic: topic,
Event: evt,
},
ctx: ctx,
}
if err := l.forwardRequest(i.ctx, &v1.ForwardRequest{Envelope: i.ev}); err != nil {
l.queue(i)
return err
}
return nil
}
func (l *RemoteEventsPublisher) forwardRequest(ctx context.Context, req *v1.ForwardRequest) error {
service, err := l.client.EventsService()
if err == nil {
fCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
_, err = service.Forward(fCtx, req)
cancel()
if err == nil {
return nil
}
}
if err != ttrpc.ErrClosed {
return err
}
// Reconnect and retry request
if err = l.client.Reconnect(); err != nil {
return err
}
service, err = l.client.EventsService()
if err != nil {
return err
}
// try again with a fresh context, otherwise we may get a context timeout unexpectedly.
fCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
_, err = service.Forward(fCtx, req)
cancel()
if err != nil {
return err
}
return nil
}

View File

@@ -0,0 +1,458 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"encoding/json"
"errors"
"flag"
"fmt"
"io"
"net"
"os"
"path/filepath"
"runtime"
"runtime/debug"
"time"
shimapi "github.com/containerd/containerd/v2/api/runtime/task/v3"
"github.com/containerd/containerd/v2/pkg/events"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/shutdown"
"github.com/containerd/containerd/v2/plugins"
"github.com/containerd/containerd/v2/protobuf"
"github.com/containerd/containerd/v2/protobuf/proto"
"github.com/containerd/containerd/v2/version"
"github.com/containerd/log"
"github.com/containerd/plugin"
"github.com/containerd/plugin/registry"
"github.com/containerd/ttrpc"
"github.com/sirupsen/logrus"
)
// Publisher for events
type Publisher interface {
events.Publisher
io.Closer
}
// StartOpts describes shim start configuration received from containerd
type StartOpts struct {
Address string
TTRPCAddress string
Debug bool
}
// BootstrapParams is a JSON payload returned in stdout from shim.Start call.
type BootstrapParams struct {
// Version is the version of shim parameters (expected 2 for shim v2)
Version int `json:"version"`
// Address is the address containerd should use to connect to the shim.
Address string `json:"address"`
// Protocol is either TTRPC or GRPC.
Protocol string `json:"protocol"`
}
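// For example, a v2 shim serving ttrpc would print JSON along these lines to
// stdout from its "start" command (address value is illustrative):
//
//	{"version":2,"address":"unix:///run/containerd/s/<hash>","protocol":"ttrpc"}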
type StopStatus struct {
Pid int
ExitStatus int
ExitedAt time.Time
}
// Manager is the interface which manages the shim process
type Manager interface {
Name() string
Start(ctx context.Context, id string, opts StartOpts) (BootstrapParams, error)
Stop(ctx context.Context, id string) (StopStatus, error)
}
// OptsKey is the context key for the Opts value.
type OptsKey struct{}
// Opts are context options associated with the shim invocation.
type Opts struct {
BundlePath string
Debug bool
}
// BinaryOpts allows configuration of a shim's binary setup
type BinaryOpts func(*Config)
// Config of shim binary options provided by shim implementations
type Config struct {
// NoSubreaper disables setting the shim as a child subreaper
NoSubreaper bool
// NoReaper disables the shim binary from reaping any child process implicitly
NoReaper bool
// NoSetupLogger disables automatic configuration of logrus to use the shim FIFO
NoSetupLogger bool
}
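// As a usage sketch, a shim binary that handles its own child reaping could
// opt out of the defaults when calling Run from its main package
// (hypothetical example):
//
//	shim.Run(ctx, manager, func(c *shim.Config) {
//		c.NoSubreaper = true
//		c.NoReaper = true
//	})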
type TTRPCService interface {
RegisterTTRPC(*ttrpc.Server) error
}
type TTRPCServerOptioner interface {
TTRPCService
UnaryInterceptor() ttrpc.UnaryServerInterceptor
}
var (
debugFlag bool
versionFlag bool
id string
namespaceFlag string
socketFlag string
bundlePath string
addressFlag string
containerdBinaryFlag string
action string
)
const (
ttrpcAddressEnv = "TTRPC_ADDRESS"
grpcAddressEnv = "GRPC_ADDRESS"
namespaceEnv = "NAMESPACE"
maxVersionEnv = "MAX_SHIM_VERSION"
)
func parseFlags() {
flag.BoolVar(&debugFlag, "debug", false, "enable debug output in logs")
flag.BoolVar(&versionFlag, "v", false, "show the shim version and exit")
flag.StringVar(&namespaceFlag, "namespace", "", "namespace that owns the shim")
flag.StringVar(&id, "id", "", "id of the task")
flag.StringVar(&socketFlag, "socket", "", "socket path to serve")
flag.StringVar(&bundlePath, "bundle", "", "path to the bundle if not workdir")
flag.StringVar(&addressFlag, "address", "", "grpc address back to main containerd")
flag.StringVar(&containerdBinaryFlag, "publish-binary", "",
fmt.Sprintf("path to publish binary (used for publishing events), but %s will ignore this flag, please use the %s env", os.Args[0], ttrpcAddressEnv),
)
flag.Parse()
action = flag.Arg(0)
}
func setRuntime() {
debug.SetGCPercent(40)
go func() {
for range time.Tick(30 * time.Second) {
debug.FreeOSMemory()
}
}()
if os.Getenv("GOMAXPROCS") == "" {
// If GOMAXPROCS hasn't been set, we default to a value of 2 to reduce
// the number of Go stacks present in the shim.
runtime.GOMAXPROCS(2)
}
}
func setLogger(ctx context.Context, id string) (context.Context, error) {
l := log.G(ctx)
l.Logger.SetFormatter(&logrus.TextFormatter{
TimestampFormat: log.RFC3339NanoFixed,
FullTimestamp: true,
})
if debugFlag {
l.Logger.SetLevel(log.DebugLevel)
}
f, err := openLog(ctx, id)
if err != nil {
return ctx, err
}
l.Logger.SetOutput(f)
return log.WithLogger(ctx, l), nil
}
// Run initializes and runs a shim server.
func Run(ctx context.Context, manager Manager, opts ...BinaryOpts) {
var config Config
for _, o := range opts {
o(&config)
}
ctx = log.WithLogger(ctx, log.G(ctx).WithField("runtime", manager.Name()))
if err := run(ctx, manager, config); err != nil {
fmt.Fprintf(os.Stderr, "%s: %s", manager.Name(), err)
os.Exit(1)
}
}
func run(ctx context.Context, manager Manager, config Config) error {
parseFlags()
if versionFlag {
fmt.Printf("%s:\n", filepath.Base(os.Args[0]))
fmt.Println(" Version: ", version.Version)
fmt.Println(" Revision:", version.Revision)
fmt.Println(" Go version:", version.GoVersion)
fmt.Println("")
return nil
}
if namespaceFlag == "" {
return fmt.Errorf("shim namespace cannot be empty")
}
setRuntime()
signals, err := setupSignals(config)
if err != nil {
return err
}
if !config.NoSubreaper {
if err := subreaper(); err != nil {
return err
}
}
ttrpcAddress := os.Getenv(ttrpcAddressEnv)
publisher, err := NewPublisher(ttrpcAddress)
if err != nil {
return err
}
defer publisher.Close()
ctx = namespaces.WithNamespace(ctx, namespaceFlag)
ctx = context.WithValue(ctx, OptsKey{}, Opts{BundlePath: bundlePath, Debug: debugFlag})
ctx, sd := shutdown.WithShutdown(ctx)
defer sd.Shutdown()
// Handle explicit actions
switch action {
case "delete":
logger := log.G(ctx).WithFields(log.Fields{
"pid": os.Getpid(),
"namespace": namespaceFlag,
})
if debugFlag {
logger.Logger.SetLevel(log.DebugLevel)
}
go reap(ctx, logger, signals)
ss, err := manager.Stop(ctx, id)
if err != nil {
return err
}
data, err := proto.Marshal(&shimapi.DeleteResponse{
Pid: uint32(ss.Pid),
ExitStatus: uint32(ss.ExitStatus),
ExitedAt: protobuf.ToTimestamp(ss.ExitedAt),
})
if err != nil {
return err
}
if _, err := os.Stdout.Write(data); err != nil {
return err
}
return nil
case "start":
opts := StartOpts{
Address: addressFlag,
TTRPCAddress: ttrpcAddress,
Debug: debugFlag,
}
params, err := manager.Start(ctx, id, opts)
if err != nil {
return err
}
data, err := json.Marshal(&params)
if err != nil {
return fmt.Errorf("failed to marshal bootstrap params to json: %w", err)
}
if _, err := os.Stdout.Write(data); err != nil {
return err
}
return nil
}
if !config.NoSetupLogger {
ctx, err = setLogger(ctx, id)
if err != nil {
return err
}
}
registry.Register(&plugin.Registration{
Type: plugins.InternalPlugin,
ID: "shutdown",
InitFn: func(ic *plugin.InitContext) (interface{}, error) {
return sd, nil
},
})
// Register event plugin
registry.Register(&plugin.Registration{
Type: plugins.EventPlugin,
ID: "publisher",
InitFn: func(ic *plugin.InitContext) (interface{}, error) {
return publisher, nil
},
})
var (
initialized = plugin.NewPluginSet()
ttrpcServices = []TTRPCService{}
ttrpcUnaryInterceptors = []ttrpc.UnaryServerInterceptor{}
)
for _, p := range registry.Graph(func(*plugin.Registration) bool { return false }) {
pID := p.URI()
log.G(ctx).WithFields(log.Fields{"id": pID, "type": p.Type}).Debug("loading plugin")
initContext := plugin.NewContext(
ctx,
initialized,
map[string]string{
// NOTE: Root is empty since the shim does not support persistent storage;
// shim plugins should use the state directory for writing files to disk.
// The state directory will be destroyed when the shim is cleaned up or
// on reboot.
plugins.PropertyStateDir: filepath.Join(bundlePath, p.URI()),
plugins.PropertyGRPCAddress: addressFlag,
plugins.PropertyTTRPCAddress: ttrpcAddress,
},
)
// load the plugin specific configuration if it is provided
// TODO: Read configuration passed into shim, or from state directory?
// if p.Config != nil {
// pc, err := config.Decode(p)
// if err != nil {
// return nil, err
// }
// initContext.Config = pc
// }
result := p.Init(initContext)
if err := initialized.Add(result); err != nil {
return fmt.Errorf("could not add plugin result to plugin set: %w", err)
}
instance, err := result.Instance()
if err != nil {
if plugin.IsSkipPlugin(err) {
log.G(ctx).WithFields(log.Fields{"id": pID, "type": p.Type, "error": err}).Info("skip loading plugin")
continue
}
return fmt.Errorf("failed to load plugin %s: %w", pID, err)
}
if src, ok := instance.(TTRPCService); ok {
log.G(ctx).WithField("id", pID).Debug("registering ttrpc service")
ttrpcServices = append(ttrpcServices, src)
}
if src, ok := instance.(TTRPCServerOptioner); ok {
ttrpcUnaryInterceptors = append(ttrpcUnaryInterceptors, src.UnaryInterceptor())
}
}
if len(ttrpcServices) == 0 {
return fmt.Errorf("required that ttrpc service")
}
unaryInterceptor := chainUnaryServerInterceptors(ttrpcUnaryInterceptors...)
server, err := newServer(ttrpc.WithUnaryServerInterceptor(unaryInterceptor))
if err != nil {
return fmt.Errorf("failed creating server: %w", err)
}
for _, srv := range ttrpcServices {
if err := srv.RegisterTTRPC(server); err != nil {
return fmt.Errorf("failed to register service: %w", err)
}
}
if err := serve(ctx, server, signals, sd.Shutdown); err != nil {
if !errors.Is(err, shutdown.ErrShutdown) {
return err
}
}
// NOTE: If the shim server went down (e.g. killed by the OOM killer),
// the address socket might be leaked.
if address, err := ReadAddress("address"); err == nil {
_ = RemoveSocket(address)
}
select {
case <-sd.Done():
return nil
case <-time.After(5 * time.Second):
return errors.New("shim shutdown timeout")
}
}
// serve serves the ttrpc API over a unix socket in the current working directory
// and blocks until the context is canceled
func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal, shutdown func()) error {
dump := make(chan os.Signal, 32)
setupDumpStacks(dump)
path, err := os.Getwd()
if err != nil {
return err
}
l, err := serveListener(socketFlag)
if err != nil {
return err
}
go func() {
defer l.Close()
if err := server.Serve(ctx, l); err != nil && !errors.Is(err, net.ErrClosed) {
log.G(ctx).WithError(err).Fatal("containerd-shim: ttrpc server failure")
}
}()
logger := log.G(ctx).WithFields(log.Fields{
"pid": os.Getpid(),
"path": path,
"namespace": namespaceFlag,
})
go func() {
for range dump {
dumpStacks(logger)
}
}()
go handleExitSignals(ctx, logger, shutdown)
return reap(ctx, logger, signals)
}
func dumpStacks(logger *log.Entry) {
var (
buf []byte
stackSize int
)
bufferLen := 16384
for stackSize == len(buf) {
buf = make([]byte, bufferLen)
stackSize = runtime.Stack(buf, true)
bufferLen *= 2
}
buf = buf[:stackSize]
logger.Infof("=== BEGIN goroutine stack dump ===\n%s\n=== END goroutine stack dump ===", buf)
}

View File

@@ -0,0 +1,27 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import "github.com/containerd/ttrpc"
func newServer(opts ...ttrpc.ServerOpt) (*ttrpc.Server, error) {
return ttrpc.NewServer(opts...)
}
func subreaper() error {
return nil
}

View File

@@ -0,0 +1,27 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import "github.com/containerd/ttrpc"
func newServer(opts ...ttrpc.ServerOpt) (*ttrpc.Server, error) {
return ttrpc.NewServer(opts...)
}
func subreaper() error {
return nil
}

View File

@@ -0,0 +1,31 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"github.com/containerd/containerd/v2/pkg/sys/reaper"
"github.com/containerd/ttrpc"
)
func newServer(opts ...ttrpc.ServerOpt) (*ttrpc.Server, error) {
opts = append(opts, ttrpc.WithServerHandshaker(ttrpc.UnixSocketRequireSameUser()))
return ttrpc.NewServer(opts...)
}
func subreaper() error {
return reaper.SetSubreaper(1)
}

View File

@@ -0,0 +1,62 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"runtime"
"testing"
)
func TestRuntimeWithEmptyMaxEnvProcs(t *testing.T) {
var oldGoMaxProcs = runtime.GOMAXPROCS(0)
defer runtime.GOMAXPROCS(oldGoMaxProcs)
t.Setenv("GOMAXPROCS", "")
setRuntime()
var currentGoMaxProcs = runtime.GOMAXPROCS(0)
if currentGoMaxProcs != 2 {
t.Fatal("the max number of procs should be 2")
}
}
func TestRuntimeWithNonEmptyMaxEnvProcs(t *testing.T) {
t.Setenv("GOMAXPROCS", "not_empty")
setRuntime()
var oldGoMaxProcs2 = runtime.GOMAXPROCS(0)
if oldGoMaxProcs2 != runtime.NumCPU() {
t.Fatal("the max number CPU should be equal to available CPUs")
}
}
func TestShimOptWithValue(t *testing.T) {
ctx := context.TODO()
ctx = context.WithValue(ctx, OptsKey{}, Opts{Debug: true})
o := ctx.Value(OptsKey{})
if o == nil {
t.Fatal("opts nil")
}
op, ok := o.(Opts)
if !ok {
t.Fatal("opts not of type Opts")
}
if !op.Debug {
t.Fatal("opts.Debug should be true")
}
}

View File

@@ -0,0 +1,113 @@
//go:build !windows
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"fmt"
"io"
"net"
"os"
"os/signal"
"syscall"
"github.com/containerd/containerd/v2/pkg/sys/reaper"
"github.com/containerd/fifo"
"github.com/containerd/log"
"github.com/sirupsen/logrus"
"golang.org/x/sys/unix"
)
// setupSignals creates a new signal handler for all signals and sets the shim as a
// sub-reaper so that the container processes are reparented
func setupSignals(config Config) (chan os.Signal, error) {
signals := make(chan os.Signal, 32)
smp := []os.Signal{unix.SIGTERM, unix.SIGINT, unix.SIGPIPE}
if !config.NoReaper {
smp = append(smp, unix.SIGCHLD)
}
signal.Notify(signals, smp...)
return signals, nil
}
func setupDumpStacks(dump chan<- os.Signal) {
signal.Notify(dump, syscall.SIGUSR1)
}
func serveListener(path string) (net.Listener, error) {
var (
l net.Listener
err error
)
if path == "" {
l, err = net.FileListener(os.NewFile(3, "socket"))
path = "[inherited from parent]"
} else {
if len(path) > socketPathLimit {
return nil, fmt.Errorf("%q: unix socket path too long (> %d)", path, socketPathLimit)
}
l, err = net.Listen("unix", path)
}
if err != nil {
return nil, err
}
log.L.WithField("socket", path).Debug("serving api on socket")
return l, nil
}
func reap(ctx context.Context, logger *logrus.Entry, signals chan os.Signal) error {
logger.Debug("starting signal loop")
for {
select {
case <-ctx.Done():
return ctx.Err()
case s := <-signals:
// Exit signals are handled separately from this loop.
// They are registered with this channel so that we can ignore such signals for short-running actions (e.g. `delete`).
switch s {
case unix.SIGCHLD:
if err := reaper.Reap(); err != nil {
logger.WithError(err).Error("reap exit status")
}
case unix.SIGPIPE:
}
}
}
}
func handleExitSignals(ctx context.Context, logger *logrus.Entry, cancel context.CancelFunc) {
ch := make(chan os.Signal, 32)
signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM)
for {
select {
case s := <-ch:
logger.WithField("signal", s).Debugf("Caught exit signal")
cancel()
return
case <-ctx.Done():
return
}
}
}
func openLog(ctx context.Context, _ string) (io.Writer, error) {
return fifo.OpenFifoDup2(ctx, "log", unix.O_WRONLY, 0700, int(os.Stderr.Fd()))
}

View File

@@ -0,0 +1,58 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"io"
"net"
"os"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/ttrpc"
"github.com/sirupsen/logrus"
)
func setupSignals(config Config) (chan os.Signal, error) {
return nil, errdefs.ErrNotImplemented
}
func newServer(opts ...ttrpc.ServerOpt) (*ttrpc.Server, error) {
return nil, errdefs.ErrNotImplemented
}
func subreaper() error {
return errdefs.ErrNotImplemented
}
func setupDumpStacks(dump chan<- os.Signal) {
}
func serveListener(path string) (net.Listener, error) {
return nil, errdefs.ErrNotImplemented
}
func reap(ctx context.Context, logger *logrus.Entry, signals chan os.Signal) error {
return errdefs.ErrNotImplemented
}
func handleExitSignals(ctx context.Context, logger *logrus.Entry, cancel context.CancelFunc) {
}
func openLog(ctx context.Context, _ string) (io.Writer, error) {
return nil, errdefs.ErrNotImplemented
}

View File

@@ -0,0 +1,218 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"net"
"os"
"os/exec"
"path/filepath"
"strings"
"time"
"github.com/containerd/ttrpc"
"github.com/containerd/typeurl/v2"
"github.com/containerd/containerd/v2/pkg/atomicfile"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/protobuf/proto"
"github.com/containerd/containerd/v2/protobuf/types"
)
type CommandConfig struct {
Runtime string
Address string
TTRPCAddress string
Path string
SchedCore bool
Args []string
Opts *types.Any
}
// Command returns the shim command with the provided args and configuration
func Command(ctx context.Context, config *CommandConfig) (*exec.Cmd, error) {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return nil, err
}
self, err := os.Executable()
if err != nil {
return nil, err
}
args := []string{
"-namespace", ns,
"-address", config.Address,
"-publish-binary", self,
}
args = append(args, config.Args...)
cmd := exec.CommandContext(ctx, config.Runtime, args...)
cmd.Dir = config.Path
cmd.Env = append(
os.Environ(),
"GOMAXPROCS=2",
fmt.Sprintf("%s=2", maxVersionEnv),
fmt.Sprintf("%s=%s", ttrpcAddressEnv, config.TTRPCAddress),
fmt.Sprintf("%s=%s", grpcAddressEnv, config.Address),
fmt.Sprintf("%s=%s", namespaceEnv, ns),
)
if config.SchedCore {
cmd.Env = append(cmd.Env, "SCHED_CORE=1")
}
cmd.SysProcAttr = getSysProcAttr()
if config.Opts != nil {
d, err := proto.Marshal(config.Opts)
if err != nil {
return nil, err
}
cmd.Stdin = bytes.NewReader(d)
}
return cmd, nil
}
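// As an illustration, for a runc-style shim the resulting invocation would look
// roughly like the following (paths and id are hypothetical; any flags beyond
// -namespace, -address and -publish-binary come from config.Args):
//
//	containerd-shim-runc-v2 -namespace default \
//		-address /run/containerd/containerd.sock \
//		-publish-binary /usr/local/bin/containerd \
//		-id <task-id> start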
// BinaryName returns the shim binary name derived from the runtime name.
// An empty return value means the runtime name is invalid.
func BinaryName(runtime string) string {
// runtime name should format like $prefix.name.version
parts := strings.Split(runtime, ".")
if len(parts) < 2 || parts[0] == "" {
return ""
}
return fmt.Sprintf(shimBinaryFormat, parts[len(parts)-2], parts[len(parts)-1])
}
// BinaryPath returns the full path of the shim binary derived from the runtime name.
// An empty return value means the runtime name is invalid.
func BinaryPath(runtime string) string {
dir := filepath.Dir(runtime)
binary := BinaryName(runtime)
path, err := filepath.Abs(filepath.Join(dir, binary))
if err != nil {
return ""
}
return path
}
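// For example, given the runtime name "io.containerd.runc.v2", the last two
// dot-separated parts are substituted into shimBinaryFormat, and BinaryPath
// joins the result with the runtime's directory (Unix-like naming shown,
// values illustrative):
//
//	BinaryName("io.containerd.runc.v2")                // "containerd-shim-runc-v2"
//	BinaryPath("/usr/local/bin/io.containerd.runc.v2") // "/usr/local/bin/containerd-shim-runc-v2"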
// Connect to the provided address
func Connect(address string, d func(string, time.Duration) (net.Conn, error)) (net.Conn, error) {
return d(address, 100*time.Second)
}
// WritePidFile writes a pid file atomically
func WritePidFile(path string, pid int) error {
path, err := filepath.Abs(path)
if err != nil {
return err
}
f, err := atomicfile.New(path, 0o644)
if err != nil {
return err
}
_, err = fmt.Fprintf(f, "%d", pid)
if err != nil {
f.Cancel()
return err
}
return f.Close()
}
// ErrNoAddress is returned when the address file has no content
var ErrNoAddress = errors.New("no shim address")
// ReadAddress returns the shim's socket address from the path
func ReadAddress(path string) (string, error) {
path, err := filepath.Abs(path)
if err != nil {
return "", err
}
data, err := os.ReadFile(path)
if err != nil {
return "", err
}
if len(data) == 0 {
return "", ErrNoAddress
}
return string(data), nil
}
// ReadRuntimeOptions reads config bytes from io.Reader and unmarshals it into the provided type.
// The type must be registered with typeurl.
//
// The function returns ErrNotFound if no config is provided, and
// ErrInvalidArgument if the config cannot be cast to the provided type T.
func ReadRuntimeOptions[T any](reader io.Reader) (T, error) {
var config T
data, err := io.ReadAll(reader)
if err != nil {
return config, fmt.Errorf("failed to read config bytes from stdin: %w", err)
}
if len(data) == 0 {
return config, errdefs.ErrNotFound
}
var any types.Any
if err := proto.Unmarshal(data, &any); err != nil {
return config, err
}
v, err := typeurl.UnmarshalAny(&any)
if err != nil {
return config, err
}
config, ok := v.(T)
if !ok {
return config, fmt.Errorf("invalid type %T: %w", v, errdefs.ErrInvalidArgument)
}
return config, nil
}
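// A typical caller unmarshals the runtime options passed on stdin, e.g.
// (assuming MyOptions is a hypothetical type registered with typeurl):
//
//	opts, err := ReadRuntimeOptions[*MyOptions](os.Stdin)
//	if errdefs.IsNotFound(err) {
//		// no options were provided; fall back to defaults
//	}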
// chainUnaryServerInterceptors creates a single ttrpc server interceptor from
// a chain of many interceptors executed from first to last.
func chainUnaryServerInterceptors(interceptors ...ttrpc.UnaryServerInterceptor) ttrpc.UnaryServerInterceptor {
n := len(interceptors)
// returning nil makes ttrpc fall back to its default interceptor
if n == 0 {
return nil
}
return func(ctx context.Context, unmarshal ttrpc.Unmarshaler, info *ttrpc.UnaryServerInfo, method ttrpc.Method) (interface{}, error) {
currentMethod := method
for i := n - 1; i > 0; i-- {
interceptor := interceptors[i]
innerMethod := currentMethod
currentMethod = func(currentCtx context.Context, currentUnmarshal func(interface{}) error) (interface{}, error) {
return interceptor(currentCtx, currentUnmarshal, info, innerMethod)
}
}
return interceptors[0](ctx, unmarshal, info, currentMethod)
}
}

View File

@@ -0,0 +1,118 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"path/filepath"
"reflect"
"testing"
"github.com/containerd/ttrpc"
)
func TestChainUnaryServerInterceptors(t *testing.T) {
methodInfo := &ttrpc.UnaryServerInfo{
FullMethod: filepath.Join("/", t.Name(), "foo"),
}
type callKey struct{}
callValue := "init"
callCtx := context.WithValue(context.Background(), callKey{}, callValue)
verifyCallCtxFn := func(ctx context.Context, key interface{}, expected interface{}) {
got := ctx.Value(key)
if !reflect.DeepEqual(expected, got) {
t.Fatalf("[context(key:%s) expected %v, but got %v", key, expected, got)
}
}
verifyInfoFn := func(info *ttrpc.UnaryServerInfo) {
if !reflect.DeepEqual(methodInfo, info) {
t.Fatalf("[info] expected %+v, but got %+v", methodInfo, info)
}
}
origUnmarshaler := func(obj interface{}) error {
v := obj.(*int64)
*v *= 2
return nil
}
type firstKey struct{}
firstValue := "from first"
var firstUnmarshaler ttrpc.Unmarshaler
first := func(ctx context.Context, unmarshal ttrpc.Unmarshaler, info *ttrpc.UnaryServerInfo, method ttrpc.Method) (interface{}, error) {
verifyCallCtxFn(ctx, callKey{}, callValue)
verifyInfoFn(info)
ctx = context.WithValue(ctx, firstKey{}, firstValue)
firstUnmarshaler = func(obj interface{}) error {
if err := unmarshal(obj); err != nil {
return err
}
v := obj.(*int64)
*v *= 2
return nil
}
return method(ctx, firstUnmarshaler)
}
type secondKey struct{}
secondValue := "from second"
second := func(ctx context.Context, unmarshal ttrpc.Unmarshaler, info *ttrpc.UnaryServerInfo, method ttrpc.Method) (interface{}, error) {
verifyCallCtxFn(ctx, callKey{}, callValue)
verifyCallCtxFn(ctx, firstKey{}, firstValue)
verifyInfoFn(info)
v := int64(3) // should return 12
if err := unmarshal(&v); err != nil {
t.Fatalf("unexpected error %v", err)
}
if expected := int64(12); v != expected {
t.Fatalf("expected int64(%v), but got %v", expected, v)
}
ctx = context.WithValue(ctx, secondKey{}, secondValue)
return method(ctx, unmarshal)
}
methodFn := func(ctx context.Context, unmarshal func(interface{}) error) (interface{}, error) {
verifyCallCtxFn(ctx, callKey{}, callValue)
verifyCallCtxFn(ctx, firstKey{}, firstValue)
verifyCallCtxFn(ctx, secondKey{}, secondValue)
v := int64(2)
if err := unmarshal(&v); err != nil {
return nil, err
}
return v, nil
}
interceptor := chainUnaryServerInterceptors(first, second)
v, err := interceptor(callCtx, origUnmarshaler, methodInfo, methodFn)
if err != nil {
t.Fatalf("expected nil, but got %v", err)
}
if expected := int64(8); v != expected {
t.Fatalf("expected result is int64(%v), but got %v", expected, v)
}
}

View File

@@ -0,0 +1,179 @@
//go:build !windows
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"crypto/sha256"
"fmt"
"net"
"os"
"path/filepath"
"runtime"
"strings"
"syscall"
"time"
"github.com/containerd/containerd/v2/defaults"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/sys"
)
const (
shimBinaryFormat = "containerd-shim-%s-%s"
socketPathLimit = 106
)
func getSysProcAttr() *syscall.SysProcAttr {
return &syscall.SysProcAttr{
Setpgid: true,
}
}
// AdjustOOMScore sets the OOM score for the process to the parent's OOM score +1
// to ensure that the parent has a lower score than the shim,
// unless it is already at the maximum OOM score.
func AdjustOOMScore(pid int) error {
parent := os.Getppid()
score, err := sys.GetOOMScoreAdj(parent)
if err != nil {
return fmt.Errorf("get parent OOM score: %w", err)
}
shimScore := score + 1
if err := sys.AdjustOOMScore(pid, shimScore); err != nil {
return fmt.Errorf("set shim OOM score: %w", err)
}
return nil
}
const socketRoot = defaults.DefaultStateDir
// SocketAddress returns a socket address
func SocketAddress(ctx context.Context, socketPath, id string) (string, error) {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return "", err
}
d := sha256.Sum256([]byte(filepath.Join(socketPath, ns, id)))
return fmt.Sprintf("unix://%s/%x", filepath.Join(socketRoot, "s"), d), nil
}
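// The returned address has the form "unix://<socketRoot>/s/<sha256 hex>", where
// the hash is computed over socketPath/namespace/id. With the default state
// directory this typically looks like (hash value illustrative):
//
//	unix:///run/containerd/s/2f5e9a...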
// AnonDialer returns a dialer for a socket
func AnonDialer(address string, timeout time.Duration) (net.Conn, error) {
return net.DialTimeout("unix", socket(address).path(), timeout)
}
// AnonReconnectDialer returns a dialer for an existing socket on reconnection
func AnonReconnectDialer(address string, timeout time.Duration) (net.Conn, error) {
return AnonDialer(address, timeout)
}
// NewSocket returns a new socket
func NewSocket(address string) (*net.UnixListener, error) {
var (
sock = socket(address)
path = sock.path()
isAbstract = sock.isAbstract()
perm = os.FileMode(0600)
)
// Darwin needs +x to access socket, otherwise it'll fail with "bind: permission denied" when running as non-root.
if runtime.GOOS == "darwin" {
perm = 0700
}
if !isAbstract {
if err := os.MkdirAll(filepath.Dir(path), perm); err != nil {
return nil, fmt.Errorf("mkdir failed for %s: %w", path, err)
}
}
l, err := net.Listen("unix", path)
if err != nil {
return nil, err
}
if !isAbstract {
if err := os.Chmod(path, perm); err != nil {
os.Remove(sock.path())
l.Close()
return nil, fmt.Errorf("chmod failed for %s: %w", path, err)
}
}
return l.(*net.UnixListener), nil
}
const abstractSocketPrefix = "\x00"
type socket string
func (s socket) isAbstract() bool {
return !strings.HasPrefix(string(s), "unix://")
}
func (s socket) path() string {
path := strings.TrimPrefix(string(s), "unix://")
// if there was no trim performed, we assume an abstract socket
if len(path) == len(s) {
path = abstractSocketPrefix + path
}
return path
}
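// To illustrate the two address forms handled above (values hypothetical):
//
//	socket("unix:///run/containerd/s/abc").path() // "/run/containerd/s/abc" (filesystem socket)
//	socket("shim-socket").path()                  // "\x00shim-socket" (abstract socket)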
// RemoveSocket removes the socket at the specified address if
// it exists on the filesystem
func RemoveSocket(address string) error {
sock := socket(address)
if !sock.isAbstract() {
return os.Remove(sock.path())
}
return nil
}
// SocketEaddrinuse returns true if the provided error is caused by the
// EADDRINUSE error number
func SocketEaddrinuse(err error) bool {
netErr, ok := err.(*net.OpError)
if !ok {
return false
}
if netErr.Op != "listen" {
return false
}
syscallErr, ok := netErr.Err.(*os.SyscallError)
if !ok {
return false
}
errno, ok := syscallErr.Err.(syscall.Errno)
if !ok {
return false
}
return errno == syscall.EADDRINUSE
}
// CanConnect returns true if the socket provided at the address
// is accepting new connections
func CanConnect(address string) bool {
conn, err := AnonDialer(address, 100*time.Millisecond)
if err != nil {
return false
}
conn.Close()
return true
}

View File

@@ -0,0 +1,87 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package shim
import (
"context"
"fmt"
"net"
"os"
"syscall"
"time"
winio "github.com/Microsoft/go-winio"
)
const shimBinaryFormat = "containerd-shim-%s-%s.exe"
func getSysProcAttr() *syscall.SysProcAttr {
return nil
}
// AnonReconnectDialer returns a dialer for an existing npipe on containerd reconnection
func AnonReconnectDialer(address string, timeout time.Duration) (net.Conn, error) {
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
c, err := winio.DialPipeContext(ctx, address)
if os.IsNotExist(err) {
return nil, fmt.Errorf("npipe not found on reconnect: %w", os.ErrNotExist)
} else if err == context.DeadlineExceeded {
return nil, fmt.Errorf("timed out waiting for npipe %s: %w", address, err)
} else if err != nil {
return nil, err
}
return c, nil
}
// AnonDialer returns a dialer for a npipe
func AnonDialer(address string, timeout time.Duration) (net.Conn, error) {
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
// If there is nobody serving the pipe we limit the timeout for this case to
// 5 seconds because any shim that would serve this endpoint should serve it
// within 5 seconds.
serveTimer := time.NewTimer(5 * time.Second)
defer serveTimer.Stop()
for {
c, err := winio.DialPipeContext(ctx, address)
if err != nil {
if os.IsNotExist(err) {
select {
case <-serveTimer.C:
return nil, fmt.Errorf("pipe not found before timeout: %w", os.ErrNotExist)
default:
// Wait 10ms for the shim to serve and try again.
time.Sleep(10 * time.Millisecond)
continue
}
} else if err == context.DeadlineExceeded {
return nil, fmt.Errorf("timed out waiting for npipe %s: %w", address, err)
}
return nil, err
}
return c, nil
}
}
// RemoveSocket removes the socket at the specified address if
// it exists on the filesystem
func RemoveSocket(address string) error {
return nil
}

View File

@@ -0,0 +1,230 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
"os"
"path/filepath"
"github.com/containerd/containerd/v2/core/mount"
"github.com/containerd/containerd/v2/internal/cleanup"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/containerd/containerd/v2/pkg/namespaces"
"github.com/containerd/containerd/v2/pkg/timeout"
"github.com/containerd/log"
)
func (m *ShimManager) loadExistingTasks(ctx context.Context) error {
nsDirs, err := os.ReadDir(m.state)
if err != nil {
return err
}
for _, nsd := range nsDirs {
if !nsd.IsDir() {
continue
}
ns := nsd.Name()
// skip hidden directories
if len(ns) > 0 && ns[0] == '.' {
continue
}
log.G(ctx).WithField("namespace", ns).Debug("loading tasks in namespace")
if err := m.loadShims(namespaces.WithNamespace(ctx, ns)); err != nil {
log.G(ctx).WithField("namespace", ns).WithError(err).Error("loading tasks in namespace")
continue
}
if err := m.cleanupWorkDirs(namespaces.WithNamespace(ctx, ns)); err != nil {
log.G(ctx).WithField("namespace", ns).WithError(err).Error("cleanup working directory in namespace")
continue
}
}
return nil
}
func (m *ShimManager) loadShims(ctx context.Context) error {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return err
}
ctx = log.WithLogger(ctx, log.G(ctx).WithField("namespace", ns))
shimDirs, err := os.ReadDir(filepath.Join(m.state, ns))
if err != nil {
return err
}
for _, sd := range shimDirs {
if !sd.IsDir() {
continue
}
id := sd.Name()
// skip hidden directories
if len(id) > 0 && id[0] == '.' {
continue
}
bundle, err := LoadBundle(ctx, m.state, id)
if err != nil {
// fine to return error here, it is a programmer error if the context
// does not have a namespace
return err
}
// fast path
f, err := os.Open(bundle.Path)
if err != nil {
bundle.Delete()
log.G(ctx).WithError(err).Errorf("fast path read bundle path for %s", bundle.Path)
continue
}
bf, err := f.Readdirnames(-1)
f.Close()
if err != nil {
bundle.Delete()
log.G(ctx).WithError(err).Errorf("fast path read bundle path for %s", bundle.Path)
continue
}
if len(bf) == 0 {
bundle.Delete()
continue
}
var (
runtime string
)
// If we're on 1.6+ and specified custom path to the runtime binary, path will be saved in 'shim-binary-path' file.
if data, err := os.ReadFile(filepath.Join(bundle.Path, "shim-binary-path")); err == nil {
runtime = string(data)
} else if err != nil && !os.IsNotExist(err) {
log.G(ctx).WithError(err).Error("failed to read `runtime` path from bundle")
}
// Query runtime name from metadata store
if runtime == "" {
container, err := m.containers.Get(ctx, id)
if err != nil {
log.G(ctx).WithError(err).Errorf("loading container %s", id)
if err := mount.UnmountRecursive(filepath.Join(bundle.Path, "rootfs"), 0); err != nil {
log.G(ctx).WithError(err).Errorf("failed to unmount of rootfs %s", id)
}
bundle.Delete()
continue
}
runtime = container.Runtime.Name
}
runtime, err = m.resolveRuntimePath(runtime)
if err != nil {
bundle.Delete()
log.G(ctx).WithError(err).Error("failed to resolve runtime path")
continue
}
binaryCall := shimBinary(bundle,
shimBinaryConfig{
runtime: runtime,
address: m.containerdAddress,
ttrpcAddress: m.containerdTTRPCAddress,
schedCore: m.schedCore,
})
shim, err := loadShimTask(ctx, bundle, func() {
log.G(ctx).WithField("id", id).Info("shim disconnected")
cleanupAfterDeadShim(cleanup.Background(ctx), id, m.shims, m.events, binaryCall)
// Remove self from the runtime task list.
m.shims.Delete(ctx, id)
})
if err != nil {
log.G(ctx).WithError(err).Errorf("unable to load shim %q", id)
cleanupAfterDeadShim(ctx, id, m.shims, m.events, binaryCall)
continue
}
// There are 3 possibilities for the loaded shim here:
// 1. It could be a shim that is running a task.
// 2. It could be a sandbox shim.
// 3. Or it could be a shim that was created for running a task but
// something happened (probably a containerd crash) and the task was never
// created. This shim process should be cleaned up here. Look at
// containerd/containerd#6860 for further details.
_, sgetErr := m.sandboxStore.Get(ctx, id)
pInfo, pidErr := shim.Pids(ctx)
if sgetErr != nil && errors.Is(sgetErr, errdefs.ErrNotFound) && (len(pInfo) == 0 || errors.Is(pidErr, errdefs.ErrNotFound)) {
log.G(ctx).WithField("id", id).Info("cleaning leaked shim process")
// We are unable to get Pids from the shim and it's not a sandbox
// shim. We should clean it up here.
// No need to do anything for removeTask since we never added this shim.
shim.delete(ctx, false, func(ctx context.Context, id string) {})
} else {
m.shims.Add(ctx, shim.ShimInstance)
}
}
return nil
}
func loadShimTask(ctx context.Context, bundle *Bundle, onClose func()) (_ *shimTask, retErr error) {
shim, err := loadShim(ctx, bundle, onClose)
if err != nil {
return nil, err
}
// Check connectivity: TaskService is the only required service, so create a temporary one to check the connection.
s, err := newShimTask(shim)
if err != nil {
return nil, err
}
ctx, cancel := timeout.WithContext(ctx, loadTimeout)
defer cancel()
if _, err := s.PID(ctx); err != nil {
return nil, err
}
return s, nil
}
func (m *ShimManager) cleanupWorkDirs(ctx context.Context) error {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return err
}
f, err := os.Open(filepath.Join(m.root, ns))
if err != nil {
return err
}
defer f.Close()
dirs, err := f.Readdirnames(-1)
if err != nil {
return err
}
for _, dir := range dirs {
// If the task was not loaded, clean up its leftover working directory.
// This can happen on a reboot where /run (holding the bundle state) is cleaned up
// but the persistent working dir is left behind.
if _, err := m.shims.Get(ctx, dir); err != nil {
path := filepath.Join(m.root, ns, dir)
if err := os.RemoveAll(path); err != nil {
log.G(ctx).WithError(err).Errorf("cleanup working dir %s", path)
}
}
}
return nil
}

View File

@@ -0,0 +1,124 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"errors"
"os"
"path/filepath"
"testing"
client "github.com/containerd/containerd/v2/core/runtime/v2/shim"
"github.com/containerd/containerd/v2/pkg/errdefs"
"github.com/stretchr/testify/require"
)
func TestParseStartResponse(t *testing.T) {
for _, tc := range []struct {
Name string
Response string
Expected client.BootstrapParams
Err error
}{
{
Name: "v2 shim",
Response: "/somedirectory/somesocket",
Expected: client.BootstrapParams{
Version: 2,
Address: "/somedirectory/somesocket",
Protocol: "ttrpc",
},
},
{
Name: "v2 shim using grpc",
Response: `{"version":2,"address":"/somedirectory/somesocket","protocol":"grpc"}`,
Expected: client.BootstrapParams{
Version: 2,
Address: "/somedirectory/somesocket",
Protocol: "grpc",
},
},
{
Name: "v2 shim using ttrpc",
Response: `{"version":2,"address":"/somedirectory/somesocket","protocol":"ttrpc"}`,
Expected: client.BootstrapParams{
Version: 2,
Address: "/somedirectory/somesocket",
Protocol: "ttrpc",
},
},
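// A payload without a version field is treated as a legacy plain socket
// address: the raw bytes become the address and ttrpc defaults apply.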
{
Name: "invalid shim v2 response",
Response: `{"address":"/somedirectory/somesocket","protocol":"ttrpc"}`,
Expected: client.BootstrapParams{
Version: 2,
Address: `{"address":"/somedirectory/somesocket","protocol":"ttrpc"}`,
Protocol: "ttrpc",
},
},
{
Name: "later unsupported shim",
Response: `{"Version": 4,"Address":"/somedirectory/somesocket","Protocol":"ttrpc"}`,
Expected: client.BootstrapParams{},
Err: errdefs.ErrNotImplemented,
},
} {
t.Run(tc.Name, func(t *testing.T) {
params, err := parseStartResponse([]byte(tc.Response))
if err != nil {
if !errors.Is(err, tc.Err) {
t.Errorf("unexpected error: %v", err)
}
return
} else if tc.Err != nil {
t.Fatal("expected error")
}
if params.Version != tc.Expected.Version {
t.Errorf("unexpected version %d, expected %d", params.Version, tc.Expected.Version)
}
if params.Protocol != tc.Expected.Protocol {
t.Errorf("unexpected protocol %q, expected %q", params.Protocol, tc.Expected.Protocol)
}
if params.Address != tc.Expected.Address {
t.Errorf("unexpected address %q, expected %q", params.Address, tc.Expected.Address)
}
})
}
}
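// The sketch below is not part of this package; it is a hypothetical
// illustration (parseStartResponseSketch is an invented name) of the
// behaviour the cases above exercise: a JSON payload decodes into
// BootstrapParams, anything that fails to decode or lacks a version is
// treated as a legacy plain socket address, and versions newer than the
// daemon supports are rejected with ErrNotImplemented.
//
//	func parseStartResponseSketch(response []byte) (client.BootstrapParams, error) {
//		var params client.BootstrapParams
//		if err := json.Unmarshal(response, &params); err != nil || params.Version < 2 {
//			// Legacy shims print the socket address directly on stdout.
//			return client.BootstrapParams{Version: 2, Address: string(response), Protocol: "ttrpc"}, nil
//		}
//		if params.Version > 2 {
//			return client.BootstrapParams{}, fmt.Errorf("unsupported shim version %d: %w", params.Version, errdefs.ErrNotImplemented)
//		}
//		return params, nil
//	}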
func TestRestoreBootstrapParams(t *testing.T) {
bundlePath := t.TempDir()
err := os.WriteFile(filepath.Join(bundlePath, "address"), []byte("unix://123"), 0o666)
require.NoError(t, err)
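// Only the legacy address file exists in the bundle, so restore is
// expected to synthesize version-2 ttrpc params and persist them to
// bootstrap.json, as verified below.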
restored, err := restoreBootstrapParams(bundlePath)
require.NoError(t, err)
expected := client.BootstrapParams{
Version: 2,
Address: "unix://123",
Protocol: "ttrpc",
}
require.EqualValues(t, expected, restored)
loaded, err := readBootstrapParams(filepath.Join(bundlePath, "bootstrap.json"))
require.NoError(t, err)
require.EqualValues(t, expected, loaded)
}

View File

@@ -0,0 +1,47 @@
//go:build !windows
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
"io"
"net"
"os"
"path/filepath"
"time"
"github.com/containerd/fifo"
"golang.org/x/sys/unix"
)
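// openShimLog opens (creating it if necessary) the shim's log FIFO at
// <bundle>/log in non-blocking read/write mode; the dialer argument is
// unused on Unix-like platforms.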
func openShimLog(ctx context.Context, bundle *Bundle, _ func(string, time.Duration) (net.Conn, error)) (io.ReadCloser, error) {
return fifo.OpenFifo(ctx, filepath.Join(bundle.Path, "log"), unix.O_RDWR|unix.O_CREAT|unix.O_NONBLOCK, 0700)
}
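// checkCopyShimLogError suppresses the read errors that are expected once
// the context has been cancelled (the shim exited and its log FIFO was
// closed); any other error is returned unchanged.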
func checkCopyShimLogError(ctx context.Context, err error) error {
select {
case <-ctx.Done():
if err == fifo.ErrReadClosed || errors.Is(err, os.ErrClosed) {
return nil
}
default:
}
return err
}

View File

@@ -0,0 +1,53 @@
//go:build linux
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"os"
"testing"
"github.com/containerd/fifo"
)
func TestCheckCopyShimLogError(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
if err := checkCopyShimLogError(ctx, fifo.ErrReadClosed); err != fifo.ErrReadClosed {
t.Fatalf("should return the actual error before the context is done, but got %v", err)
}
if err := checkCopyShimLogError(ctx, nil); err != nil {
t.Fatalf("should return nil for a nil error before the context is done, but got %v", err)
}
cancel()
if err := checkCopyShimLogError(ctx, fifo.ErrReadClosed); err != nil {
t.Fatalf("should return nil for ErrReadClosed after the context is done, but got %v", err)
}
if err := checkCopyShimLogError(ctx, nil); err != nil {
t.Fatalf("should return nil for a nil error after the context is done, but got %v", err)
}
if err := checkCopyShimLogError(ctx, os.ErrClosed); err != nil {
t.Fatalf("should return nil for os.ErrClosed after the context is done, but got %v", err)
}
if err := checkCopyShimLogError(ctx, fifo.ErrRdFrmWRONLY); err != fifo.ErrRdFrmWRONLY {
t.Fatalf("should return the actual error after the context is done, but got %v", err)
}
}

View File

@@ -0,0 +1,97 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
"fmt"
"io"
"net"
"os"
"sync"
"time"
"github.com/containerd/containerd/v2/pkg/namespaces"
)
type deferredPipeConnection struct {
ctx context.Context
wg sync.WaitGroup
once sync.Once
c net.Conn
conerr error
}
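// Read blocks until the asynchronous dial started in openShimLog has
// finished, then delegates to the underlying pipe connection or returns
// the dial error.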
func (dpc *deferredPipeConnection) Read(p []byte) (n int, err error) {
if dpc.c == nil {
dpc.wg.Wait()
if dpc.c == nil {
return 0, dpc.conerr
}
}
return dpc.c.Read(p)
}
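// Close waits for any pending dial to finish and closes the pipe at most
// once; later calls return nil.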
func (dpc *deferredPipeConnection) Close() error {
var err error
dpc.once.Do(func() {
dpc.wg.Wait()
if dpc.c != nil {
err = dpc.c.Close()
} else if dpc.conerr != nil {
err = dpc.conerr
}
})
return err
}
// openShimLog on Windows acts as the client of the log pipe, so that the
// containerd daemon can reconnect to the shim log stream if the daemon is
// restarted.
func openShimLog(ctx context.Context, bundle *Bundle, dialer func(string, time.Duration) (net.Conn, error)) (io.ReadCloser, error) {
ns, err := namespaces.NamespaceRequired(ctx)
if err != nil {
return nil, err
}
dpc := &deferredPipeConnection{
ctx: ctx,
}
dpc.wg.Add(1)
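// Dial the named pipe asynchronously so that opening the log never blocks
// shim startup; Read and Close wait on wg until the dial attempt finishes.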
go func() {
c, conerr := dialer(
fmt.Sprintf("\\\\.\\pipe\\containerd-shim-%s-%s-log", ns, bundle.ID),
time.Second*10,
)
if conerr != nil {
dpc.conerr = fmt.Errorf("failed to connect to shim log: %w", conerr)
}
dpc.c = c
dpc.wg.Done()
}()
return dpc, nil
}
func checkCopyShimLogError(ctx context.Context, err error) error {
// When using a multi-container shim, the 2nd to Nth containers in the
// shim do not have a separate log pipe. Ignore the failure here so that
// no spurious log message is emitted when the shim log connection times out.
if errors.Is(err, os.ErrNotExist) {
return nil
}
return err
}

View File

@@ -0,0 +1,40 @@
/*
Copyright The containerd Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v2
import (
"context"
"errors"
"os"
"testing"
)
func TestCheckCopyShimLogError(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
testError := errors.New("test error")
if err := checkCopyShimLogError(ctx, nil); err != nil {
t.Fatalf("should return nil for a nil error, but got %v", err)
}
if err := checkCopyShimLogError(ctx, testError); err != testError {
t.Fatalf("should return the actual error for anything other than ErrNotExist, but got %v", err)
}
if err := checkCopyShimLogError(ctx, os.ErrNotExist); err != nil {
t.Fatalf("should return nil for ErrNotExist, but got %v", err)
}
}