containerd/cmd/ctr/utils.go
Stephen J Day af2718b01f
namespaces: support within containerd
To support multi-tenancy, containerd allows the collection of metadata
and runtime objects within a heirarchical storage primitive known as
namespaces. Data cannot be shared across these namespaces, unless
allowed by the service. This allows multiple sets of containers to
managed without interaction between the clients that management. This
means that different users, such as SwarmKit, K8s, Docker and others can
use containerd without coordination. Through labels, one may use
namespaces as a tool for cleanly organizing the use of containerd
containers, including the metadata storage for higher level features,
such as ACLs.

Namespaces

Namespaces cross-cut all containerd operations and are communicated via
context, either within the Go context or via GRPC headers. As a general
rule, no features are tied to namespace, other than organization. This
will be maintained into the future. They are created as a side-effect of
operating on them or may be created manually. Namespaces can be labeled
for organization. They cannot be deleted unless the namespace is empty,
although we may want to make it so one can clean up the entirety of
containerd by deleting a namespace.

Most users will interface with namespaces by setting in the
context or via the `CONTAINERD_NAMESPACE` environment variable, but the
experience is mostly left to the client. For `ctr` and `dist`, we have
defined a "default" namespace that will be created up on use, but there
is nothing special about it. As part of this PR we have plumbed this
behavior through all commands, cleaning up context management along the
way.

Namespaces in Action

Namespaces can be managed with the `ctr namespaces` subcommand. They
can be created, labeled and destroyed.

A few commands can demonstrate the power of namespaces for use with
images. First, lets create a namespace:

```
$ ctr namespaces create foo mylabel=bar
$ ctr namespaces ls
NAME LABELS
foo  mylabel=bar
```

We can see that we have a namespace `foo` and it has a label. Let's pull
an image:

```
$ dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.9 s total:   0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```

Now, let's list the image:

```
$ dist images ls
REF                            TYPE  DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```

That looks normal. Let's list the images for the `foo` namespace and see
this in action:

```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```

Look at that! Nothing was pulled in the namespace `foo`. Let's do the
same pull:

```
$ CONTAINERD_NAMESPACE=foo dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.8 s total:   0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```

Wow, that was very snappy! Looks like we pulled that image into out
namespace but didn't have to download any new data because we are
sharing storage. Let's take a peak at the images we have in `foo`:

```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF                            TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```

Now, let's remove that image from `foo`:

```
$ CONTAINERD_NAMESPACE=foo dist images rm
docker.io/library/redis:latest
```

Looks like it is gone:

```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```

But, as we can see, it is present in the `default` namespace:

```
$ dist images ls
REF                            TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```

What happened here? We can tell by listing the namespaces to get a
better understanding:

```
$ ctr namespaces ls
NAME    LABELS
default
foo     mylabel=bar
```

From the above, we can see that the `default` namespace was created with
the standard commands without the environment variable set. Isolating
the set of shared images while sharing the data that matters.

Since we removed the images for namespace `foo`, we can remove it now:

```
$ ctr namespaces rm foo
foo
```

However, when we try to remove the `default` namespace, we get an error:

```
$ ctr namespaces rm default
ctr: unable to delete default: rpc error: code = FailedPrecondition desc = namespace default must be empty
```

This is because we require that namespaces be empty when removed.

Caveats

- While most metadata objects are namespaced, containers and tasks may
exhibit some issues. We still need to move runtimes to namespaces and
the container metadata storage may not be fully worked out.
- Still need to migrate content store to metadata storage and namespace
the content store such that some data storage (ie images).
- Specifics of snapshot driver's relation to namespace needs to be
worked out in detail.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-06-06 13:50:33 -07:00

274 lines
7.2 KiB
Go

package main
import (
gocontext "context"
"encoding/csv"
"fmt"
"io/ioutil"
"os"
"os/signal"
"path/filepath"
"strconv"
"strings"
"syscall"
"github.com/Sirupsen/logrus"
containersapi "github.com/containerd/containerd/api/services/containers"
contentapi "github.com/containerd/containerd/api/services/content"
diffapi "github.com/containerd/containerd/api/services/diff"
"github.com/containerd/containerd/api/services/execution"
imagesapi "github.com/containerd/containerd/api/services/images"
namespacesapi "github.com/containerd/containerd/api/services/namespaces"
snapshotapi "github.com/containerd/containerd/api/services/snapshot"
versionservice "github.com/containerd/containerd/api/services/version"
"github.com/containerd/containerd/api/types/task"
"github.com/containerd/containerd/content"
"github.com/containerd/containerd/images"
"github.com/containerd/containerd/namespaces"
contentservice "github.com/containerd/containerd/services/content"
"github.com/containerd/containerd/services/diff"
imagesservice "github.com/containerd/containerd/services/images"
namespacesservice "github.com/containerd/containerd/services/namespaces"
snapshotservice "github.com/containerd/containerd/services/snapshot"
"github.com/containerd/containerd/snapshot"
specs "github.com/opencontainers/runtime-spec/specs-go"
"github.com/urfave/cli"
"google.golang.org/grpc"
)
var grpcConn *grpc.ClientConn
// appContext returns the context for a command. Should only be called once per
// command, near the start.
//
// This will ensure the namespace is picked up and set the timeout, if one is
// defined.
func appContext(clicontext *cli.Context) (gocontext.Context, gocontext.CancelFunc) {
var (
ctx = gocontext.Background()
timeout = clicontext.GlobalDuration("timeout")
namespace = clicontext.GlobalString("namespace")
cancel = func() {}
)
ctx = namespaces.WithNamespace(ctx, namespace)
if timeout > 0 {
ctx, cancel = gocontext.WithTimeout(ctx, timeout)
} else {
ctx, cancel = gocontext.WithCancel(ctx)
}
return ctx, cancel
}
func getNamespacesService(clicontext *cli.Context) (namespaces.Store, error) {
conn, err := getGRPCConnection(clicontext)
if err != nil {
return nil, err
}
return namespacesservice.NewStoreFromClient(namespacesapi.NewNamespacesClient(conn)), nil
}
func getContainersService(context *cli.Context) (containersapi.ContainersClient, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return containersapi.NewContainersClient(conn), nil
}
func getTasksService(context *cli.Context) (execution.TasksClient, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return execution.NewTasksClient(conn), nil
}
func getContentStore(context *cli.Context) (content.Store, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return contentservice.NewStoreFromClient(contentapi.NewContentClient(conn)), nil
}
func getSnapshotter(context *cli.Context) (snapshot.Snapshotter, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return snapshotservice.NewSnapshotterFromClient(snapshotapi.NewSnapshotClient(conn)), nil
}
func getImageStore(clicontext *cli.Context) (images.Store, error) {
conn, err := getGRPCConnection(clicontext)
if err != nil {
return nil, err
}
return imagesservice.NewStoreFromClient(imagesapi.NewImagesClient(conn)), nil
}
func getDiffService(context *cli.Context) (diff.DiffService, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return diff.NewDiffServiceFromClient(diffapi.NewDiffClient(conn)), nil
}
func getVersionService(context *cli.Context) (versionservice.VersionClient, error) {
conn, err := getGRPCConnection(context)
if err != nil {
return nil, err
}
return versionservice.NewVersionClient(conn), nil
}
func getTempDir(id string) (string, error) {
err := os.MkdirAll(filepath.Join(os.TempDir(), "ctr"), 0700)
if err != nil {
return "", err
}
tmpDir, err := ioutil.TempDir(filepath.Join(os.TempDir(), "ctr"), fmt.Sprintf("%s-", id))
if err != nil {
return "", err
}
return tmpDir, nil
}
func waitContainer(events execution.Tasks_EventsClient, id string, pid uint32) (uint32, error) {
for {
e, err := events.Recv()
if err != nil {
return 255, err
}
if e.Type != task.Event_EXIT {
continue
}
if e.ID == id && e.Pid == pid {
return e.ExitStatus, nil
}
}
}
func forwardAllSignals(containers execution.TasksClient, id string) chan os.Signal {
sigc := make(chan os.Signal, 128)
signal.Notify(sigc)
go func() {
for s := range sigc {
logrus.Debug("Forwarding signal ", s)
killRequest := &execution.KillRequest{
ContainerID: id,
Signal: uint32(s.(syscall.Signal)),
PidOrAll: &execution.KillRequest_All{
All: false,
},
}
_, err := containers.Kill(gocontext.Background(), killRequest)
if err != nil {
logrus.Fatalln(err)
}
}
}()
return sigc
}
func parseSignal(rawSignal string) (syscall.Signal, error) {
s, err := strconv.Atoi(rawSignal)
if err == nil {
sig := syscall.Signal(s)
for _, msig := range signalMap {
if sig == msig {
return sig, nil
}
}
return -1, fmt.Errorf("unknown signal %q", rawSignal)
}
signal, ok := signalMap[strings.TrimPrefix(strings.ToUpper(rawSignal), "SIG")]
if !ok {
return -1, fmt.Errorf("unknown signal %q", rawSignal)
}
return signal, nil
}
func stopCatch(sigc chan os.Signal) {
signal.Stop(sigc)
close(sigc)
}
// parseMountFlag parses a mount string in the form "type=foo,source=/path,destination=/target,options=rbind:rw"
func parseMountFlag(m string) (specs.Mount, error) {
mount := specs.Mount{}
r := csv.NewReader(strings.NewReader(m))
fields, err := r.Read()
if err != nil {
return mount, err
}
for _, field := range fields {
v := strings.Split(field, "=")
if len(v) != 2 {
return mount, fmt.Errorf("invalid mount specification: expected key=val")
}
key := v[0]
val := v[1]
switch key {
case "type":
mount.Type = val
case "source", "src":
mount.Source = val
case "destination", "dst":
mount.Destination = val
case "options":
mount.Options = strings.Split(val, ":")
default:
return mount, fmt.Errorf("mount option %q not supported", key)
}
}
return mount, nil
}
// replaceOrAppendEnvValues returns the defaults with the overrides either
// replaced by env key or appended to the list
func replaceOrAppendEnvValues(defaults, overrides []string) []string {
cache := make(map[string]int, len(defaults))
for i, e := range defaults {
parts := strings.SplitN(e, "=", 2)
cache[parts[0]] = i
}
for _, value := range overrides {
// Values w/o = means they want this env to be removed/unset.
if !strings.Contains(value, "=") {
if i, exists := cache[value]; exists {
defaults[i] = "" // Used to indicate it should be removed
}
continue
}
// Just do a normal set/update
parts := strings.SplitN(value, "=", 2)
if i, exists := cache[parts[0]]; exists {
defaults[i] = value
} else {
defaults = append(defaults, value)
}
}
// Now remove all entries that we want to "unset"
for i := 0; i < len(defaults); i++ {
if defaults[i] == "" {
defaults = append(defaults[:i], defaults[i+1:]...)
i--
}
}
return defaults
}