These opts either inherit the parent cgroup device.list or append the
default unix devices like /dev/null /dev/random so that the container
has access.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This makes it easier for callers to call this function and populate the
config without relying on specific flags across commands.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Some images like `criu` will have extra libs that it requires. This
adds lib support via LD_LIBRARY_PATH and InstallOpts
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds a way for users to programatically install containerd binary
dependencies.
With runtime v2 and new shim's being built, it will be a challenge to
get those onto machines. Users would have to find the link, download,
place it in their path, yada yada yada.
With this functionality of a managed `/opt` directory, containerd can
use existing image and distribution infra. to get binarys, shims, etc
onto the system.
Configuration:
*default:* `/opt/containerd`
*containerd config:*
```toml
[plugins.opt]
path = "/opt/mypath"
```
Usage:
*code:*
```go
image, err := client.Pull(ctx, "docker.io/crosbymichael/runc:latest")
client.Install(ctx, image)
```
*ctr:*
```bash
ctr content fetch docker.io/crosbymichael/runc:latest
ctr install docker.io/crosbymichael/runc:latest
```
You can manage versions and see what is running via standard image
commands.
Images:
These images MUST be small and only contain binaries.
```Dockerfile
FROM scratch
Add runc /bin/runc
```
Containerd will only extract files in `/bin` of the image.
Later on, we can add support for `/lib`.
The code adds a service to manage an `/opt/containerd` directory and
provide that path to callers via the introspection service.
How to Test:
Delete runc from your system.
```bash
> sudo ctr run --rm docker.io/library/redis:alpine redis
ctr: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/default/redis/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown
> sudo ctr content fetch docker.io/crosbymichael/runc:latest
> sudo ctr install docker.io/crosbymichael/runc:latest
> sudo ctr run --rm docker.io/library/redis:alpine redis
1:C 01 Aug 15:59:52.864 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 01 Aug 15:59:52.864 # Redis version=4.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 01 Aug 15:59:52.864 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
1:M 01 Aug 15:59:52.866 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1:M 01 Aug 15:59:52.866 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
1:M 01 Aug 15:59:52.866 # Current maximum open files is 1024. maxclients has been reduced to 992 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1:M 01 Aug 15:59:52.870 * Running mode=standalone, port=6379.
1:M 01 Aug 15:59:52.870 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 01 Aug 15:59:52.870 # Server initialized
1:M 01 Aug 15:59:52.870 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 01 Aug 15:59:52.870 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 01 Aug 15:59:52.870 * Ready to accept connections
^C1:signal-handler (1533139193) Received SIGINT scheduling shutdown...
1:M 01 Aug 15:59:53.472 # User requested shutdown...
1:M 01 Aug 15:59:53.472 * Saving the final RDB snapshot before exiting.
1:M 01 Aug 15:59:53.484 * DB saved on disk
1:M 01 Aug 15:59:53.484 # Redis is now ready to exit, bye bye...
```
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Reorders the code so that it doesnt overwrite the previous allocation
when creating a NewTask via ctr.exe
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
We introduce a WithSpecFromFile option combinator to allow creation
simpler creation of OCI specs from a file name. Often used as the first
option in a `SpecOpts` slice, it simplifies choosing between a local
file and the built-in default.
The code in `ctr run` has been updated to use the new option, with out
changing the order of operations or functionality present there.
Signed-off-by: Stephen Day <stephen.day@getcruise.com>
Separate Fetch and Pull commands in client to distinguish
between platform specific and non-platform specific operations.
`ctr images pull` with all platforms will now unpack all platforms.
`ctr content fetch` now supports platform flags.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
`Ctr` interface follows the pattern `ctr <command> <subcommand>` except
for the `plugins` command which does not have subcommands. This feels
unnatural to certain users and they would expect that they can list
containerd plugins via `ctr plugins list`.
This commit implements their expectation so that `plugins` becomes a
command "group" and its `list` subcommand actually lists the plugins.
Signed-off-by: Danail Branekov <danailster@gmail.com>
This adds a `Load` Opt for cio to load a tasks io/fifos without
attaching or starting the copy routines.
It adds the load method in `ctr` by default so that fifos or other IO
are removed from disk on delete methods inbetween command runs. It is
not the default for all task loads for backwards compat. and a user may
want to keep io around to reuse or if log files are used.
Fixes#2421
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Commit 05513284e7 exposed the "rootfs"
and "no-pivot" flags for the "containers" command, but it accidentally
removed them for "run" since package-level variables are initialized
before package-level init functions in golang. Hoisting these flags to
a package imported by both commands solves the problem.
Signed-off-by: Felix Abecassis <fabecassis@nvidia.com>
This change allows implementations to resolve the location of the actual data
using OCI descriptor fields such as MediaType.
No OCI descriptor field is written to the store.
No change on gRPC API.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
This allows non-privileged users to use containerd. This is part of a
larger track of work integrating containerd into Cloudfoundry's garden
with support for rootless.
[#156343575]
Signed-off-by: Claudia Beresford <cberesford@pivotal.io>
This adds gc.root label to snapshots created with prepare and commit via
the CLI. WIthout this, created snapshots get immediately garbage
collected. There may be a better solution but this seems to be a solid
stop gap.
We may also need to add more functionality around snapshot labeling for
the CLI but current use cases are unclear.
Signed-off-by: Stephen J Day <stevvooe@gmail.com>
Since Go 1.7, context is a standard package, superceding the
"x/net/context". Since Go 1.9, the latter only provides a few type
aliases from the former. Therefore, it makes sense to switch to the
standard package.
This commit was generated by the following script (with a couple of
minor fixups to remove extra changes done by goimports):
#!/bin/bash
if [ $# -ge 1 ]; then
FILES=$*
else
FILES=$(git ls-files \*.go | grep -vF ".pb.go" | grep -v
^vendor/)
fi
for f in $FILES; do
printf .
sed -i -e 's|"golang.org/x/net/context"$|"context"|' $f
goimports -w $f
awk ' /^$/ {e=1; next;}
/[[:space:]]"context"$/ {e=0;}
{if (e) {print ""; e=0}; print;}' < $f > $f.new && \
mv $f.new $f
goimports -w $f
done
echo
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Allows the client to choose the context to finish the lease.
This allows the client to switch contexts when the main context
used to the create the lease may have been cancelled.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Without this `ctr run` can fail with:
ctr: parent snapshot sha256:70798fd80095f40b41baa5d107fb61532bfe494d96313fea01e8fcbf4e8743ee does not exist: not found
My image was produced by buildkit, which doesn't unpack (I think this makes
sense since buildkit doesn't know if I am going to run the image or export/push
it etc).
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Adds a useful flag to `ctr` to enable joining any existing Linux
namespaces for any namespace types (network, pid, ipc, etc.) using the
existing With helper in the oci package.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
For missing required parameters adds error return before attempting any
actions to `ctr images` commands.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
This changes the Windows runtime to use the snapshotter and differ
created layers, and updates the ctr commands to use the snapshotter and differ.
Signed-off-by: Darren Stahl <darst@microsoft.com>
To avoid buffer bloat in long running processes, we try to use buffer
pools where possible. This is meant to address shim memory usage issues,
but may not be the root cause.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This linter checks for unnecessary type convertions.
Some convertions are whitelisted because their type is different
on 32bit platforms
Signed-off-by: Daniel Nephin <dnephin@gmail.com>
Preserves the order of the tree output between each execution. Slightly
refactored the behavior to be more "object oriented".
Signed-off-by: Stephen J Day <stephen.day@docker.com>
- Use lease API (previoisly, GC was not supported)
- Refactored interfaces for ease of future Docker v1 importer support
For usage, please refer to `ctr images import --help`.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
By replacing grpc with ttrpc, we can reduce total memory runtime
requirements and binary size. With minimal code changes, the shim can
now be controlled by the much lightweight protocol, reducing the total
memory required per container.
When reviewing this change, take particular notice of the generated shim
code.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Synchronous image delete provides an option image delete to wait
until the next garbage collection deletes after an image is removed
before returning success to the caller.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Currently the output for a non-existent image reference and a valid
image reference is exactly the same on `ctr images remove`. Instead of
outputting the target ref input, if it is "not found" we should alert
the user in case of a mispelling, but continue not to make it a failure
for the command (given it supports multiple ref entries)
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
This patch changes the output of `ctr version` to align version and revision.
It also changes `.Printf()` to `.Println()`, to make the code slightly easier
to read.
Before this change:
$ ctr version
Client:
Version: v1.0.0-beta.2-132-g564600e.m
Revision: 564600ee79aefb0f24cbcecc90d4388bd0ea59de.m
Server:
Version: v1.0.0-beta.2-132-g564600e.m
Revision: 564600ee79aefb0f24cbcecc90d4388bd0ea59de.m
With this patch applied:
$ ctr version
Client:
Version: v1.0.0-beta.2-132-g564600e.m
Revision: 564600ee79aefb0f24cbcecc90d4388bd0ea59de.m
Server:
Version: v1.0.0-beta.2-132-g564600e.m
Revision: 564600ee79aefb0f24cbcecc90d4388bd0ea59de.m
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
To reduce the binary size of containerd, we no longer import the
`server` package for only a few defaults. This reduces the size of `ctr`
by 2MB. There are probably other gains elsewhere.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This keeps the semantics the same as the other commands to only list
containers, tasks, images by calling the list subcommand.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Allow a user provided name for the checkpoint as well as a default
generated name for the checkpoint image.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This allows one to manage the checkpoints by using the `ctr image`
command.
The image is created with label "containerd.io/checkpoint". By
default, it is not included in the output of `ctr images ls`.
We can list the images by using the following command:
$ ctr images ls labels.containerd.\"io/checkpoint\"==true
Fixes#1026
Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
Add differ options and package with interface.
Update optional values on diff interface to use options.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
With this change, we integrate all the plugin changes into the
introspection service.
All plugins can be listed with the following command:
```console
$ ctr plugins
TYPE ID PLATFORM STATUS
io.containerd.content.v1 content - ok
io.containerd.metadata.v1 bolt - ok
io.containerd.differ.v1 walking linux/amd64 ok
io.containerd.grpc.v1 containers - ok
io.containerd.grpc.v1 content - ok
io.containerd.grpc.v1 diff - ok
io.containerd.grpc.v1 events - ok
io.containerd.grpc.v1 healthcheck - ok
io.containerd.grpc.v1 images - ok
io.containerd.grpc.v1 namespaces - ok
io.containerd.snapshotter.v1 btrfs linux/amd64 error
io.containerd.snapshotter.v1 overlayfs linux/amd64 ok
io.containerd.grpc.v1 snapshots - ok
io.containerd.monitor.v1 cgroups linux/amd64 ok
io.containerd.runtime.v1 linux linux/amd64 ok
io.containerd.grpc.v1 tasks - ok
io.containerd.grpc.v1 version - ok
```
There are few things to note about this output. The first is that it is
printed in the order in which plugins are initialized. This useful for
debugging plugin initialization problems. Also note that even though the
introspection GPRC api is a itself a plugin, it is not listed. This is
because the plugin takes a snapshot of the initialization state at the
end of the plugin init process. This allows us to see errors from each
plugin, as they happen. If it is required to introspect the existence of
the introspection service, we can make modifications to include it in
the future.
The last thing to note is that the btrfs plugin is in an error state.
This is a common state for containerd because even though we load the
plugin, most installations aren't on top of btrfs and the plugin cannot
be used. We can actually view this error using the detailed view with a
filter:
```console
$ ctr plugins --detailed id==btrfs
Type: io.containerd.snapshotter.v1
ID: btrfs
Platforms: linux/amd64
Exports:
root /var/lib/containerd/io.containerd.snapshotter.v1.btrfs
Error:
Code: Unknown
Message: path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter
```
Along with several other values, this is a valuable tool for evaluating the
state of components in containerd.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This makes sure the client is always in sync with the server before
performing any type of operations on the container metadata.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The `Check` function returns information about an image's content components
over a content provider. From this information, one can tell which content is
required, present or missing to run an image.
The utility can be demonstrated with the `check` command:
```console
$ ctr images check
REF TYPE DIGEST STATUS SIZE
docker.io/library/alpine:latest application/vnd.docker.distribution.manifest.list.v2+json sha256:f006ecbb824d87947d0b51ab8488634bf69fe4094959d935c0c103f4820a417d incomplete (1/2) 1.5 KiB/1.9 MiB
docker.io/library/postgres:latest application/vnd.docker.distribution.manifest.v2+json sha256:2f8080b9910a8b4f38ff5a55a82e77cb43d88bdbb16d723c71d18493590832e9 complete (13/13) 99.3 MiB/99.3 MiB
docker.io/library/redis:alpine application/vnd.docker.distribution.manifest.v2+json sha256:e633cded055a94202e4ccccb8125b7f383cd6ee56527ab890db643383a2647dd incomplete (6/7) 8.1 MiB/10.0 MiB
docker.io/library/ubuntu:latest application/vnd.docker.distribution.manifest.list.v2+json sha256:60f835698ea19e8d9d3a59e68fb96fb35bc43e745941cb2ea9eaf4ba3029ed8a unavailable (0/?) 0.0 B/?
docker.io/trollin/busybox:latest application/vnd.docker.distribution.manifest.list.v2+json sha256:54a6424f7a2d5f4f27b3d69e5f9f2bc25fe9087f0449d3cb4215db349f77feae complete (2/2) 699.9 KiB/699.9 KiB
```
The above shows us that we have two incomplete images and one that is
unavailable. The incomplete images are those that we know the complete
size of all content but some are missing. "Unavailable" means that the
check could not get enough information about the image to get its full
size.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The SIGUNUSED constant was removed from golang.org/x/sys/unix in
https://go-review.googlesource.com/61771 as it is also removed from the
respective glibc headers.
This means the command
ctr tasks kill SIGUNUSED ...
will no longer work. However, the same effect can be achieved with
ctr tasks kill SIGSYS ...
as SIGSYS has the same value as SIGUNUSED used to have.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Fixes pulling of multi-arch images by limiting the expansion
of the index by filtering to the current default platform.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Content commit is updated to take in a context, allowing
content to be committed within the same context the writer
was in. This is useful when commit may be able to use more
context to complete the action rather than creating its own.
An example of this being useful is for the metadata implementation
of content, having a context allows tests to fully create
content in one database transaction by making use of the context.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
The labels can be very long (e.g. cri-containerd stores a large JSON metadata
blob as `io.cri-containerd.container.metadata`) which renders the output
useless due to all the line wrapping etc.
The information is still available in `ctr containers info «name»`.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Fixes#1431
This adds KillOpts so that a client can specify when they want to kill a
single process or all the processes inside a container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
In order to do more advanced spec generation with images, snapshots,
etc, we need to inject the context and client into the spec generation
code.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
After the rework of server-side defaults, the `ctr snapshot` command
stopped working due to no default snapshotter.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Instead of requiring callers to read the struct fields to check for an
error, provide the exit results via a function instead which is more
natural.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
In all of the examples, its recommended to call `Wait()` before starting
a process/task.
Since `Wait()` is a blocking call, this means it must be called from a
goroutine like so:
```go
statusC := make(chan uint32)
go func() {
status, err := task.Wait(ctx)
if err != nil {
// handle async err
}
statusC <- status
}()
task.Start(ctx)
<-statusC
```
This means there is a race here where there is no guarentee when the
goroutine is going to be scheduled, and even a bit more since this
requires an RPC call to be made.
In addition, this code is very messy and a common pattern for any caller
using Wait+Start.
Instead, this changes `Wait()` to use an async model having `Wait()`
return a channel instead of the code itself.
This ensures that when `Wait()` returns that the client has a handle on
the event stream (already made the RPC request) before returning and
reduces any sort of race to how the stream is handled by grpc since we
can't guarentee that we have a goroutine running and blocked on
`Recv()`.
Making `Wait()` async also cleans up the code in the caller drastically:
```go
statusC, err := task.Wait(ctx)
if err != nil {
return err
}
task.Start(ctx)
status := <-statusC
if status.Err != nil {
return err
}
```
No more spinning up goroutines and more natural error
handling for the caller.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The argument order, naming and behavior of the snapshots command didn't
really follow any of the design constraints or conventions of the
`Snapshotter` interface. This brings the command into line with that
interface definition.
The `snapshot archive` command has been removed as it requires more
thought on design to correctly emit diffs.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
After some analysis, it was found that Content.Reader was generally
redudant to an io.ReaderAt. This change removes `Content.Reader` in
favor of a `Content.ReaderAt`. In general, `ReaderAt` can perform better
over interfaces with indeterminant latency because it avoids remote
state for reads. Where a reader is required, a helper is provided to
convert it into an `io.SectionReader`.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The syscall package is locked down and the comment in [1] advises to
switch code to use the corresponding package from golang.org/x/sys. Do
so and replace usage of package syscall with package
golang.org/x/sys/{unix,windows} where applicable.
[1] https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24
This will also allow to get updates and fixes for syscall wrappers
without having to use a new go version.
Errno, Signal and SysProcAttr aren't changed as they haven't been
implemented in x/sys/. Stat_t from syscall is used if standard library
packages (e.g. os) require it. syscall.ENOTSUP, syscall.SIGKILL and
syscall.SIGTERM are used for cross-platform files.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Snapshotters for run must be created with requested snapshotter.
The order of the options is important to ensure that the snapshotter
is set before the snapshots are created.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
This changes Wait() from returning an error whenever you call wait on a
stopped process/task to returning the exit status from the process.
This also adds the exit status to the Status() call on a process/task so
that a user can Wait(), check status, then cancel the wait to avoid
races in event handling.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This splits up the create and start of an exec process in the shim to
have two separate steps like the initial process. This will allow
better state reporting for individual process along with a more robust
wait for execs.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This change further plumbs the components required for implementing
event filters. Specifically, we now have the ability to filter on the
`topic` and `namespace`.
In the course of implementing this functionality, it was found that
there were mismatches in the events API that created extra serialization
round trips. A modification to `typeurl.MarshalAny` and a clear
separation between publishing and forwarding allow us to avoid these
serialization issues.
Unfortunately, this has required a few tweaks to the GRPC API, so this
is a breaking change. `Publish` and `Forward` have been clearly separated in
the GRPC API. `Publish` honors the contextual namespace and performs
timestamping while `Forward` simply validates and forwards. The behavior
of `Subscribe` is to propagate events for all namespaces unless
specifically filtered (and hence the relation to this particular change.
The following is an example of using filters to monitor the task events
generated while running the [bucketbench tool](https://github.com/estesp/bucketbench):
```
$ ctr events 'topic~=/tasks/.+,namespace==bb'
...
2017-07-28 22:19:51.78944874 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-6-8","pid":25889}
2017-07-28 22:19:51.791893688 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-4-8","pid":25882}
2017-07-28 22:19:51.792608389 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-2-9","pid":25860}
2017-07-28 22:19:51.793035217 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-5-6","pid":25869}
2017-07-28 22:19:51.802659622 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-0-7","pid":25877}
2017-07-28 22:19:51.805192898 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-3-6","pid":25856}
2017-07-28 22:19:51.832374931 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-8-6","id":"bb-ctr-8-6","pid":25864,"exited_at":"2017-07-28T22:19:51.832013043Z"}
2017-07-28 22:19:51.84001249 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-2-9","id":"bb-ctr-2-9","pid":25860,"exited_at":"2017-07-28T22:19:51.839717714Z"}
2017-07-28 22:19:51.840272635 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-7-6","id":"bb-ctr-7-6","pid":25855,"exited_at":"2017-07-28T22:19:51.839796335Z"}
...
```
In addition to the events changes, we now display the namespace origin
of the event in the cli tool.
This will be followed by a PR to add individual field filtering for the
events API for each event type.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Export as a tar (Note: "-" can be used for stdout):
$ ctr images export /tmp/oci-busybox.tar docker.io/library/busybox:latest
Import a tar (Note: "-" can be used for stdin):
$ ctr images import foo/new:latest /tmp/oci-busybox.tar
Note: media types are not converted at the moment: e.g.
application/vnd.docker.image.rootfs.diff.tar.gzip
-> application/vnd.oci.image.layer.v1.tar+gzip
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
e.g. ctr run -t --rm --rootfs /tmp/busybox-rootfs foo /bin/sh
(--rm removes the container but does not remove rootfs dir, of course)
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
In the course of setting out to add filters and address some cleanup, it
was found that we had a few problems in the events subsystem that needed
addressing before moving forward.
The biggest change was to move to the more standard terminology of
publish and subscribe. We make this terminology change across the Go
interface and the GRPC API, making the behavior more familier. The
previous system was very context-oriented, which is no longer required.
With this, we've removed a large amount of dead and unneeded code. Event
transactions, context storage and the concept of `Poster` is gone. This
has been replaced in most places with a `Publisher`, which matches the
actual usage throughout the codebase, removing the need for helpers.
There are still some questions around the way events are handled in the
shim. Right now, we've preserved some of the existing bugs which may
require more extensive changes to resolve correctly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
What started out as a simple PR to remove the "Readonly" column became an
adventure to add a proper type for a "View" snapshot. The short story here is
that we now get the following output:
```
$ sudo ctr snapshot ls
ID PARENT KIND
sha256:08c2295a7fa5c220b0f60c994362d290429ad92f6e0235509db91582809442f3 Committed
testing4 sha256:08c2295a7fa5c220b0f60c994362d290429ad92f6e0235509db91582809442f3 Active
```
In pursuing this output, it was found that the idea of having "readonly" as an
attribute on all snapshots was redundant. For committed, they are always
readonly, as they are not accessible without an active snapshot. For active
snapshots that were views, we'd have to check the type before interpreting
"readonly". With this PR, this is baked fully into the kind of snapshot. When
`Snapshotter.View` is called, the kind of snapshot is `KindView`, and the
storage system reflects this end to end.
Unfortunately, this will break existing users. There is no migration, so they
will have to wipe `/var/lib/containerd` and recreate everything. However, this
is deemed worthwhile at this point, as we won't have to judge validity of the
"Readonly" field when new snapshot types are added.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This changeset:
- adds `mount` subcommand to `ctr snapshot`
- adds `snapshot-name` flag for specifying target snapshot name in both `mount`
and `prepare` snapshot subcommands
Signed-off-by: Sunny Gogoi <me@darkowlzz.space>
Signed-off-by: rajasec <rajasec79@gmail.com>
Updating the usage and errors for ctr run command
Signed-off-by: rajasec <rajasec79@gmail.com>
Updating the usage of run command
Signed-off-by: rajasec <rajasec79@gmail.com>
Reverting back the imports
Signed-off-by: rajasec <rajasec79@gmail.com>
Rather than make a large PR, we can move parts of the dist commands over
piece by piece. This first step moves over the images command. Others
will follow.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Rather than using the more verbose `set-labels` command, we are changing
the command to set labels for various objects to `label`, as it can be
used as a verb. This matches changes in the content store labeling.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
e.g. dist pull --snapshotter btrfs ...; ctr run --snapshotter btrfs ...
(empty string defaults for overlayfs)
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This removes the RuntimeEvent super proto with enums into separate
runtime event protos to be inline with the other events that are output
by containerd.
This also renames the runtime events into Task* events.
Fixes#1071
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This changeset adds `prepare` subcommand to `ctr snapshot` and removes
`prepare` from `dist rootfs` to keep the basic snapshot operation commands
together.
Signed-off-by: Sunny Gogoi <me@darkowlzz.space>
Marshaling Container interface resulted in empty json. Use Container proto
struct to get proper container attributes.
Signed-off-by: Sunny Gogoi <me@darkowlzz.space>
Now that we have most of the services required for use with containerd,
it was found that common patterns were used throughout services. By
defining a central `errdefs` package, we ensure that services will map
errors to and from grpc consistently and cleanly. One can decorate an
error with as much context as necessary, using `pkg/errors` and still
have the error mapped correctly via grpc.
We make a few sacrifices. At this point, the common errors we use across
the repository all map directly to grpc error codes. While this seems
positively crazy, it actually works out quite well. The error conditions
that were specific weren't super necessary and the ones that were
necessary now simply have better context information. We lose the
ability to add new codes, but this constraint may not be a bad thing.
Effectively, as long as one uses the errors defined in `errdefs`, the
error class will be mapped correctly across the grpc boundary and
everything will be good. If you don't use those definitions, the error
maps to "unknown" and the error message is preserved.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Fix the behavior of removing snapshot on container delete.
Adds a flag to keep the snapshot if desired.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Runtime is not printed while container listing due to typo introduced
in #935.
This fixes the Typo.
Signed-off-by: Kunal Kushwaha <kushwaha_kunal_v7@lab.ntt.co.jp>
This moves the shim's API and protos out of the containerd services
package and into the linux runtime package. This is because the shim is
an implementation detail of the linux runtime that we have and it is not
a containerd user facing api.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
When using events, it was found to be fairly unwieldy with a number of
extra packages. For the most part, when interacting with the events
service, we want types of the same version of the service. This has been
accomplished by moving all events types into the events package.
In addition, several fixes to the way events are marshaled have been
included. Specifically, we defer to the protobuf type registration
system to assemble events and type urls, with a little bit sheen on top
of add a containerd.io oriented namespace.
This has resulted in much cleaner event consumption and has removed the
reliance on error prone type urls, in favor of concrete types.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Move existing snapshot command to archive subcommand of snapshot.
Add list command for listing snapshots.
Add usage command for showing snapshot disk usage.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: update events package to include emitter and use envelope proto
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: enable events service and update ctr events to use events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
event listeners
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: helper func for emitting in services
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: improved cli for containers and tasks
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
create event envelope with poster
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: introspect event data to use for type url
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: use pb encoding; add event types
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument content and snapshot services with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument image service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument namespace service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add namespace support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: only send events from namespace requested from client
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: switch to go-events for broadcasting
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
It is unused since 4c1af8fdd8 ("Port ctr to use client") and leaving it
around will just tempt people into writing code with security holes.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
We need a separate API for handing the exit status and deletion of
Exec'd processes to make sure they are properly cleaned up within the
shim and daemon.
Fixes#973
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
To support multi-tenancy, containerd allows the collection of metadata
and runtime objects within a heirarchical storage primitive known as
namespaces. Data cannot be shared across these namespaces, unless
allowed by the service. This allows multiple sets of containers to
managed without interaction between the clients that management. This
means that different users, such as SwarmKit, K8s, Docker and others can
use containerd without coordination. Through labels, one may use
namespaces as a tool for cleanly organizing the use of containerd
containers, including the metadata storage for higher level features,
such as ACLs.
Namespaces
Namespaces cross-cut all containerd operations and are communicated via
context, either within the Go context or via GRPC headers. As a general
rule, no features are tied to namespace, other than organization. This
will be maintained into the future. They are created as a side-effect of
operating on them or may be created manually. Namespaces can be labeled
for organization. They cannot be deleted unless the namespace is empty,
although we may want to make it so one can clean up the entirety of
containerd by deleting a namespace.
Most users will interface with namespaces by setting in the
context or via the `CONTAINERD_NAMESPACE` environment variable, but the
experience is mostly left to the client. For `ctr` and `dist`, we have
defined a "default" namespace that will be created up on use, but there
is nothing special about it. As part of this PR we have plumbed this
behavior through all commands, cleaning up context management along the
way.
Namespaces in Action
Namespaces can be managed with the `ctr namespaces` subcommand. They
can be created, labeled and destroyed.
A few commands can demonstrate the power of namespaces for use with
images. First, lets create a namespace:
```
$ ctr namespaces create foo mylabel=bar
$ ctr namespaces ls
NAME LABELS
foo mylabel=bar
```
We can see that we have a namespace `foo` and it has a label. Let's pull
an image:
```
$ dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.9 s total: 0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```
Now, let's list the image:
```
$ dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
That looks normal. Let's list the images for the `foo` namespace and see
this in action:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```
Look at that! Nothing was pulled in the namespace `foo`. Let's do the
same pull:
```
$ CONTAINERD_NAMESPACE=foo dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.8 s total: 0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```
Wow, that was very snappy! Looks like we pulled that image into out
namespace but didn't have to download any new data because we are
sharing storage. Let's take a peak at the images we have in `foo`:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
Now, let's remove that image from `foo`:
```
$ CONTAINERD_NAMESPACE=foo dist images rm
docker.io/library/redis:latest
```
Looks like it is gone:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```
But, as we can see, it is present in the `default` namespace:
```
$ dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
What happened here? We can tell by listing the namespaces to get a
better understanding:
```
$ ctr namespaces ls
NAME LABELS
default
foo mylabel=bar
```
From the above, we can see that the `default` namespace was created with
the standard commands without the environment variable set. Isolating
the set of shared images while sharing the data that matters.
Since we removed the images for namespace `foo`, we can remove it now:
```
$ ctr namespaces rm foo
foo
```
However, when we try to remove the `default` namespace, we get an error:
```
$ ctr namespaces rm default
ctr: unable to delete default: rpc error: code = FailedPrecondition desc = namespace default must be empty
```
This is because we require that namespaces be empty when removed.
Caveats
- While most metadata objects are namespaced, containers and tasks may
exhibit some issues. We still need to move runtimes to namespaces and
the container metadata storage may not be fully worked out.
- Still need to migrate content store to metadata storage and namespace
the content store such that some data storage (ie images).
- Specifics of snapshot driver's relation to namespace needs to be
worked out in detail.
Signed-off-by: Stephen J Day <stephen.day@docker.com>