Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: update events package to include emitter and use envelope proto
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: enable events service and update ctr events to use events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
event listeners
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: helper func for emitting in services
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: improved cli for containers and tasks
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
create event envelope with poster
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: introspect event data to use for type url
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: use pb encoding; add event types
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument content and snapshot services with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument image service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument namespace service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add namespace support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: only send events from namespace requested from client
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: switch to go-events for broadcasting
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
It is unused since 4c1af8fdd8 ("Port ctr to use client") and leaving it
around will just tempt people into writing code with security holes.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
We need a separate API for handling the exit status and deletion of
Exec'd processes to make sure they are properly cleaned up within the
shim and daemon.
Fixes #973
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
When using WithBlock() on the dialer, the connection timeout must fully
expire before any status is provided to the user about whether they can
even connect to the socket. For example, if the containerd socket is
root-owned and the user tries `dist images ls` without `sudo`, the
default is 30 sec. of "hang" before the command returns.
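As a hedged sketch (not necessarily the actual fix), bounding the blocking
dial with a context keeps the failure fast; the socket path and dialer
wiring below are assumptions:
```go
package main

import (
	"context"
	"log"
	"net"
	"time"

	"google.golang.org/grpc"
)

func main() {
	const address = "/run/containerd/containerd.sock" // assumed socket path

	// Bound the blocking dial so a failure (e.g. a root-owned socket)
	// surfaces in seconds rather than hanging for the full default timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	conn, err := grpc.DialContext(ctx, address,
		grpc.WithInsecure(),
		grpc.WithBlock(),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
		}),
	)
	if err != nil {
		log.Fatalf("failed to connect to containerd: %v", err)
	}
	defer conn.Close()
}
```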
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Replaced pull unpacker with boolean to call unpack.
Added unpack and target to image type.
Updated progress logic for pull.
Added list images to client.
Updated rootfs unpacker to use client.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
To support multi-tenancy, containerd allows the collection of metadata
and runtime objects within a hierarchical storage primitive known as
namespaces. Data cannot be shared across these namespaces, unless
allowed by the service. This allows multiple sets of containers to be
managed without interaction between the clients that manage them. This
means that different users, such as SwarmKit, K8s, Docker and others can
use containerd without coordination. Through labels, one may use
namespaces as a tool for cleanly organizing the use of containerd
containers, including the metadata storage for higher level features,
such as ACLs.
Namespaces
Namespaces cross-cut all containerd operations and are communicated via
context, either within the Go context or via GRPC headers. As a general
rule, no features are tied to a namespace other than organization. This
will be maintained into the future. Namespaces are created as a side-effect of
operating on them or may be created manually. Namespaces can be labeled
for organization. They cannot be deleted unless the namespace is empty,
although we may want to make it so one can clean up the entirety of
containerd by deleting a namespace.
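For illustration, a minimal Go sketch of attaching a namespace to a
context, assuming helpers along the lines described above:
```go
package main

import (
	"context"
	"fmt"

	"github.com/docker/containerd/namespaces"
)

func main() {
	// Attach the namespace to the context; services read it back out on
	// the server side (or forward it as a GRPC header).
	ctx := namespaces.WithNamespace(context.Background(), "foo")

	if ns, ok := namespaces.Namespace(ctx); ok {
		fmt.Println("operating in namespace:", ns)
	}
}
```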
Most users will interface with namespaces by setting the namespace in the
context or via the `CONTAINERD_NAMESPACE` environment variable, but the
experience is mostly left to the client. For `ctr` and `dist`, we have
defined a "default" namespace that will be created upon use, but there
is nothing special about it. As part of this PR we have plumbed this
behavior through all commands, cleaning up context management along the
way.
Namespaces in Action
Namespaces can be managed with the `ctr namespaces` subcommand. They
can be created, labeled and destroyed.
A few commands can demonstrate the power of namespaces for use with
images. First, let's create a namespace:
```
$ ctr namespaces create foo mylabel=bar
$ ctr namespaces ls
NAME LABELS
foo mylabel=bar
```
We can see that we have a namespace `foo` and it has a label. Let's pull
an image:
```
$ dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.9 s total: 0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```
Now, let's list the image:
```
$ dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
That looks normal. Let's list the images for the `foo` namespace and see
this in action:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```
Look at that! Nothing was pulled in the namespace `foo`. Let's do the
same pull:
```
$ CONTAINERD_NAMESPACE=foo dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d45bc46b48e45e8c72c41aedd2a173bcc7f1ea4084a8fcfc5251b1da2a09c0b6: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:a858478874d144f6bfc03ae2d4598e2942fc9994159f2872e39fae88d45bd847: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4cdd94354d2a873333a205a02dbb853dd763c73600e0cf64f60b4bd7ab694875: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c54584150374aa94b9f7c3fbd743adcff5adead7a3cf7207b0e51551ac4a5517: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:71c1f30d820f0457df186531dc4478967d075ba449bd3168a3e82137a47daf03: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1f9221193a65eaf1b0afc4f1d4fbb7f0f209369d2696e1c07671668e150ed2b: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:10a267c67f423630f3afe5e04bbbc93d578861ddcc54283526222f3ad5e895b9: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5b690bc4eaa6434456ceaccf9b3e42229bd2691869ba439e515b28fe1a66c009: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.8 s total: 0.0 B (0.0 B/s)
INFO[0000] unpacking rootfs
INFO[0000] Unpacked chain id: sha256:41719840acf0f89e761f4a97c6074b6e2c6c25e3830fcb39301496b5d36f9b51
```
Wow, that was very snappy! Looks like we pulled that image into our
namespace but didn't have to download any new data because we are
sharing storage. Let's take a peek at the images we have in `foo`:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
Now, let's remove that image from `foo`:
```
$ CONTAINERD_NAMESPACE=foo dist images rm docker.io/library/redis:latest
```
Looks like it is gone:
```
$ CONTAINERD_NAMESPACE=foo dist images ls
REF TYPE DIGEST SIZE
```
But, as we can see, it is present in the `default` namespace:
```
$ dist images ls
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:548a75066f3f280eb017a6ccda34c561ccf4f25459ef8e36d6ea582b6af1decf 72.7 MiB
```
What happened here? We can tell by listing the namespaces to get a
better understanding:
```
$ ctr namespaces ls
NAME LABELS
default
foo mylabel=bar
```
From the above, we can see that the `default` namespace was created with
the standard commands without the environment variable set, isolating
the set of images in each namespace while sharing the data that matters.
Since we removed the images for namespace `foo`, we can remove it now:
```
$ ctr namespaces rm foo
foo
```
However, when we try to remove the `default` namespace, we get an error:
```
$ ctr namespaces rm default
ctr: unable to delete default: rpc error: code = FailedPrecondition desc = namespace default must be empty
```
This is because we require that namespaces be empty when removed.
Caveats
- While most metadata objects are namespaced, containers and tasks may
exhibit some issues. We still need to move runtimes to namespaces and
the container metadata storage may not be fully worked out.
- We still need to migrate the content store to metadata storage and
namespace it so that references to content (i.e. images) are scoped per
namespace.
- The specifics of the snapshot drivers' relation to namespaces need to
be worked out in detail.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The implementations for the storage of metadata have been merged into a
single metadata package where they can share storage primitives and
techniques. This is a prerequisite for the addition of namespaces, which
will require a coordinated layout for records to be organized by
namespace.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This avoids issues with the various deferred error handlers in the event that
`err` is shadowed or named differently, which this function currently avoids
but which is an easy trap to fall into.
Since named return values are all or nothing we need to name the waitGroup too
and adjust the code to suit.
Thanks to Aaron Lehmann for the suggestion, see also
https://github.com/docker/swarmkit/pull/1965#discussion_r118137410
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Split resolver to only return a name with separate methods
for getting a fetcher and pusher. Add implementation for
push.
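Sketched as Go interfaces, the split looks roughly like this (signatures
are assumptions for illustration, not the final API):
```go
import (
	"context"
	"io"

	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// Sketch only: the resolver returns just a name and descriptor, with
// separate methods to obtain a fetcher and pusher.
type Resolver interface {
	// Resolve returns the canonical name and a descriptor for ref.
	Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, err error)

	// Fetcher returns a Fetcher for the resolved reference.
	Fetcher(ctx context.Context, ref string) (Fetcher, error)

	// Pusher returns a Pusher for pushing content under the reference.
	Pusher(ctx context.Context, ref string) (Pusher, error)
}

type Fetcher interface {
	Fetch(ctx context.Context, desc ocispec.Descriptor) (io.ReadCloser, error)
}

type Pusher interface {
	Push(ctx context.Context, desc ocispec.Descriptor, r io.Reader) error
}
```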
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Working from feedback on the existing implementation, we have now
introduced a central metadata object to represent the lifecycle and pin
the resources required to implement what people today know as
containers. This includes the runtime specification and the root
filesystem snapshots. We also allow arbitrary labeling of the container.
Such provisions will bring the containerd definition of container closer
to what is expected by users.
The objects that encompass today's ContainerService, centered around the
runtime, will be known as tasks. These tasks take on the existing
lifecycle behavior of containerd's containers, which means that they are
deleted when they exit. Largely, there are no other changes except for
naming.
The `Container` object will operate purely as a metadata object. No
runtime state will be held on `Container`. It only informs the execution
service on what is required for creating tasks and the resources in use
by that container. The resources referenced by that container will be
deleted when the container is deleted, if not in use. In this sense,
users can create, list, label and delete containers in a similar way as
they do with docker today, without the complexity of runtime locks that
plagues current implementations.
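As a rough sketch of what such a metadata-only object might carry (every
field name here is an assumption for illustration):
```go
import specs "github.com/opencontainers/runtime-spec/specs-go"

// Container is purely metadata; nothing here holds runtime state.
// Field names are illustrative assumptions, not the final API.
type Container struct {
	// ID uniquely identifies the container.
	ID string

	// Labels carry arbitrary user-provided metadata.
	Labels map[string]string

	// Spec is the runtime specification used to create tasks.
	Spec *specs.Spec

	// RootFS names the root filesystem snapshot pinned by this container.
	RootFS string
}
```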
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This moves both the Mount type and mountinfo into a single mount
package.
This also opens up the root of the repo to hold the containerd client
implementation.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update go-runc to 49b2a02ec1ed3e4ae52d30b54a291b75
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add shim to restore creation
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Keep checkpoint path in service
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add C/R to non-shim build
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Checkpoint rw and image
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Pause container on bind checkpoints
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Return dump.log in error on checkpoint failure
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Pause container for checkpoint
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update runc to 639454475cb9c8b861cc599f8bcd5c8c790ae402
For checkpoint to work you need runc version
639454475cb9c8b861cc599f8bcd5c8c790ae402 or later and criu 3.0, as this is what
I have been testing with.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Move restore behind create calls
This removes the restore RPCs in favor of providing the checkpoint
information to the `Create` calls of a container. If provided, the
container will be created/restored from the checkpoint instead of an
existing container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Regen protos after rebase
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds support for signalling a container process by pid.
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
make Ps more extensible
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
ps: windows support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
Update go-runc to master with portability fixes.
Subreaper only exists on Linux, and only Linux runs the shim in a
mount namespace.
With these changes the shim compiles on Darwin, which means the
whole build compiles without errors now.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Remove rootfs service in favor of snapshot service. Adds
diff service for extracting and creating diffs. Diff
creation is not yet implemented. This service allows
pulling or creating images without needing root access to
mount. Additionally in the future this will allow containerd
to ensure extractions happen safely in a chroot if needed.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
The split between provider and ingester was a long-standing division
reflecting the client-side use cases. For the most part, we were
differentiating these for the algorithms that operate on them, but it made
instantiation and use of the types challenging. On the server-side, this
distinction is generally less important. This change unifies these types
and in the process we get a few benefits.
The first is that we now completely access the content store over GRPC.
This was the initial intent and we have now satisfied this goal
completely. There are a few issues around listing content and getting
status, but we resolve these with simple streaming and regexp filters.
More can probably be done to polish this but the result is clean.
Several other content-oriented methods were polished in the process of
unification. We have now properly separated out the `Abort` method to
cancel ongoing or stalled ingest processes. We have also replaced the
`Active` method with a single status method.
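Roughly, the unified surface can be pictured like the sketch below; names
and signatures are assumed from the description above, not the actual
definitions:
```go
import (
	"context"
	"io"

	digest "github.com/opencontainers/go-digest"
)

// Provider is the read side: open content by digest.
type Provider interface {
	Reader(ctx context.Context, dgst digest.Digest) (io.ReadCloser, error)
}

// Ingester is the write side: obtain a writer keyed by reference.
type Ingester interface {
	Writer(ctx context.Context, ref string, size int64, expected digest.Digest) (io.WriteCloser, error)
}

// Status reports on an in-flight ingest.
type Status struct {
	Ref      string
	Offset   int64
	Total    int64
	Expected digest.Digest
}

// Store unifies the two, with status and abort for ongoing ingests.
type Store interface {
	Provider
	Ingester

	Status(ctx context.Context, ref string) (Status, error)
	Abort(ctx context.Context, ref string) error
}
```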
The transition went extremely smoothly. Once the clients were updated to
use the new methods, everything worked as expected on the first
compile.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This updates containerd to use the latest versions of cgroups, fifo,
console, and go-runc from the containerd org.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds pause and unpause to containerd's execution service and the
same commands to the `ctr` client.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Leave in btrfs by default, but add go build tags to exclude it.
`go build -tags containerd_no_btrfs` will leave that driver out.
As the current containerd/btrfs code needs to link to libbtrfs*.so, but not
all distros provide it.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
This mainly fixes Linux vs generic Unix differences, with some
differences between Darwin and FreeBSD (which are close but not
identical). Should make fixing for other Unix platforms easier.
Note there are not yet `runc` equivalents for these platforms;
my current use case is image manipulation for the `moby` tool.
However there is interest in OCI runtime ports for both platforms.
Current status is that MacOS can build and run `ctr`, `dist`
and `containerd` and some operations are supported. FreeBSD 11
still needs some more fixes to continuity for extended attributes.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This allows one to edit content in the content store with their favorite
editor. It is as simple as this:
```console
$ dist content edit sha256:58e1a1bb75db1b5a24a462dd5e2915277ea06438c3f105138f97eb53149673c4
```
The above will pop up your $EDITOR, where you can make changes to the content.
When you are done, save and the new version will be added to the content store.
The digest of the new content will be printed to stdout:
```console
sha256:247f30ac320db65f3314b63b908a3aeaac5813eade6cabc9198b5883b22807bc
```
We can then retrieve the content quite easily:
```console
$ dist content get sha256:247f30ac320db65f3314b63b908a3aeaac5813eade6cabc9198b5883b22807bc
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1278,
"digest": "sha256:4a415e3663882fbc554ee830889c68a33b3585503892cc718a4698e91ef2a526"
},
"annotations": {},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 1905270,
"digest": "sha256:627beaf3eaaff1c0bc3311d60fb933c17ad04fe377e1043d9593646d8ae3bfe1"
}
]
}
```
In this case, an annotations field was added to the original manifest.
While this implementation is very simple, we can add all sorts of validation
and tooling to allow one to edit images inline. Coupled with declaring the
mediatype, we could return specific errors that can allow a user to craft
valid, working modifications to images for testing and profit.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Updates the filemode on the grpc socket to have group write
permission, which is needed to communicate over GRPC. Additionally, ensure
the run directory has the specified group ownership and has group
read and enter permission.
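A minimal sketch of what that amounts to in Go (paths and gid below are
placeholders):
```go
package main

import (
	"log"
	"os"
)

func main() {
	const (
		runDir = "/run/containerd"                 // placeholder path
		sock   = "/run/containerd/containerd.sock" // placeholder path
		gid    = 999                               // placeholder group id
	)
	// Group ownership plus read and enter (execute) permission on the
	// run directory.
	if err := os.Chown(runDir, 0, gid); err != nil {
		log.Fatal(err)
	}
	if err := os.Chmod(runDir, 0750); err != nil {
		log.Fatal(err)
	}
	// Group write on the socket so group members can connect over GRPC.
	if err := os.Chmod(sock, 0660); err != nil {
		log.Fatal(err)
	}
}
```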
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
This adds a config option to set the oom score for the containerd daemon
as well as automatically setting the oom score for the shims launched so
that they are not killed until the very end of an out of memory
condition.
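On Linux this boils down to writing the score into procfs; a minimal
sketch (the helper name is ours, not the daemon's):
```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// setOOMScore writes score to /proc/<pid>/oom_score_adj (range -1000..1000).
func setOOMScore(pid, score int) error {
	path := fmt.Sprintf("/proc/%d/oom_score_adj", pid)
	return os.WriteFile(path, []byte(strconv.Itoa(score)), 0644)
}

func main() {
	// Example: protect the current process somewhat from the OOM killer.
	// Lowering the score typically requires elevated privileges.
	if err := setOOMScore(os.Getpid(), -500); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```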
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Rather than automagically doing this, it is the user's responsibility to
review the output of `containerd config default` and create the config
themselves.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
When wanting to craft a custom config based on the default config,
this adds a way to output the default containerd config to a tempfile.
Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>
With this changeset, image store access now happens completely over
GRPC. No clients manipulate the image store database
directly and the GRPC client is fully featured. The metadata database is
now managed by the daemon and access coordinated via services.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This adds very simple deletion of images by name. We still need to
consider the approach to handling image name, so this may change. For
the time being, it allows one to delete an image entry in the metadata
database.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
We need to set +x on the overlay dirs or, after dropping from root to a
non-root user, an EPERM will occur on exec or other file access.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Allow usage of the experimental docker resolver as a package. There are
very few changes to the consuming code, demonstrating the effectiveness
of the abstraction. This move will allow future contributions to a more
featured resolver implementation.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
As a demonstration of the power of the visitor implementation, we now
report the image size in the `dist images` command. This is the size of
the packed resources as would be pushed into a remote. A similar method
could be added to calculate the unpacked size.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With this changeset, we now have a proof of concept of end to end pull.
Up to this point, the relationship between subsystems has been somewhat
theoretical. We now leverage fetching, the snapshot drivers, the rootfs
service, image metadata and the execution service, validating the proposed
model for containerd. There are a few caveats, including the need to move some
of the access into GRPC services, but the basic components are there.
The first command we will cover here is `dist pull`. This is the analog
of `docker pull` and `git pull`. It performs a full resource fetch for
an image and unpacks the root filesystem into the snapshot drivers. An
example follows:
```console
$ sudo ./bin/dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3b281f2bcae3b25c701d53a219924fffe79bdb74385340b73a539ed4020999c4: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:e4a35914679d05d25e2fccfd310fde1aa59ffbbf1b0b9d36f7b03db5ca0311b0: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4b7726832aec75f0a742266c7190c4d2217492722dfd603406208eaa902648d8: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:338a7133395941c85087522582af182d2f6477dbf54ba769cb24ec4fd91d728f: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:83f12ff60ff1132d1e59845e26c41968406b4176c1a85a50506c954696b21570: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:693502eb7dfbc6b94964ae66ebc72d3e32facd981c72995b09794f1e87bac184: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:622732cddc347afc9360b4b04b46c6f758191a1dc73d007f95548658847ee67e: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:19a7e34366a6f558336c364693df538c38307484b729a36fede76432789f084f: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 1.6 s total: 0.0 B (0.0 B/s)
INFO[0001] unpacking rootfs
```
Note that we haven't integrated rootfs unpacking into the status output, but we
pretty much have what is in docker today (:P). We can see the result of our pull
with the following:
```console
$ sudo ./bin/dist images
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059 1.8 kB
```
The above shows that we have an image called "docker.io/library/redis:latest"
mapped to the given digest marked with a specific format. We get the size of
the manifest right now, not the full image, but we can add more as we need it.
For the most part, this is all that is needed, but a few tweaks to the model
for naming may need to be added. Specifically, we may want to index under a few
different names, including those qualified by hash or matched by tag versions.
We can do more work in this area as we develop the metadata store.
The name shown above can then be used to run the actual container image. We can
do this with the following command:
```console
$ sudo ./bin/ctr run --id foo docker.io/library/redis:latest /usr/local/bin/redis-server
1:C 17 Mar 17:20:25.316 # Warning: no config file specified, using the default config. In order to specify a config file use /usr/local/bin/redis-server /path/to/redis.conf
1:M 17 Mar 17:20:25.317 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 3.2.8 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 1
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
1:M 17 Mar 17:20:25.326 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 17 Mar 17:20:25.326 # Server started, Redis version 3.2.8
1:M 17 Mar 17:20:25.326 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 17 Mar 17:20:25.326 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 17 Mar 17:20:25.326 * The server is now ready to accept connections on port 6379
```
Wow! So, now we are running `redis`!
There are still a few things to work out. Notice that we have to specify the
command as part of the arguments to `ctr run`. This is because we are not yet
reading the image config and converting it to an OCI runtime config. With the
base laid in this PR, adding such functionality should be straightforward.
While this is a _little_ messy, this is great progress. It should be easy
sailing from here.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With this PR, we introduce the concept of image handlers. They support
walking a tree of image resource descriptors for doing various tasks
related to processing them. Handlers can be dispatched sequentially or
in parallel and can be stacked for various effects.
The main functionality we introduce here is parameterized fetch without
coupling format resolution to the process itself. Two important
handlers, `remotes.FetchHandler` and `image.ChildrenHandler` can be
composed to implement recursive fetch with full status reporting. The
approach can also be modified to filter based on platform or other
constraints, unlocking a lot of possibilities.
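In rough Go terms, the composition reads like the sketch below; the helper
names follow the description above, but the exact signatures are
assumptions:
```go
import (
	"context"

	"github.com/docker/containerd/content"
	"github.com/docker/containerd/images"
	"github.com/docker/containerd/remotes"
	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// fetchAll recursively fetches an image's resources by composing handlers:
// fetch each descriptor, then queue its children for dispatch.
func fetchAll(ctx context.Context, ingester content.Ingester, provider content.Provider,
	fetcher remotes.Fetcher, root ocispec.Descriptor) error {
	handler := images.Handlers(
		remotes.FetchHandler(ingester, fetcher), // fetch each descriptor
		images.ChildrenHandler(provider),        // then queue its children
	)
	return images.Dispatch(ctx, handler, root)
}
```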
This also includes some light refactoring in the fetch command, in
preparation for submission of end to end pull.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The service can use the snapshotter directly to get the rootfs.
Removed debug line for mount response.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This reuses the existing shim code and services to let containerd run as
the reaper for all container processes without the use of a shim.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
After receiving feedback during containerd summit walk through of the
pull POC, we found that the resolution flow for names was out of place.
We could see this present in awkward places where we were trying to
re-resolve whether something was a digest or a tag, with extra retries to
various endpoints.
By centering this problem around the question, "what do we write in the
metadata store?", the following interface comes about:
```
Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error)
```
The above takes an "opaque" reference (we'll get to this later) and
returns the canonical name for the object, a content description of the
object and a `Fetcher` that can be used to retrieve the object and its
child resources. We can write `name` into the metadata store, pointing
at the descriptor. Decisions about discovery, trust, provenance, and
distribution are completely abstracted away from the pulling code.
A first response to such a monstrosity is "that is a lot of return
arguments". When we look at the actual, we can see that in practice, the
usage pattern works well, albeit we don't quite demonstrate the utility
of `name`, which will be more apparent later. Designs that allowed
separate resolution of the `Fetcher` and the return of a collected
object were considered. Let's give this a chance before we go
refactoring this further.
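For a feel of the call-site, here is a hedged usage sketch against the
interface above (the `Fetch` signature is an assumption):
```go
import "context"

// pull is a sketch of the call-site; Resolver is the interface shown above.
func pull(ctx context.Context, resolver Resolver, ref string) error {
	name, desc, fetcher, err := resolver.Resolve(ctx, ref)
	if err != nil {
		return err
	}
	// name is what we record in the metadata store, pointing at desc.
	rc, err := fetcher.Fetch(ctx, desc) // assumed signature
	if err != nil {
		return err
	}
	defer rc.Close()
	// ... ingest rc into the content store, keyed by desc ...
	_ = name
	return nil
}
```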
With this change, we introduce a reference package with helpers for
remotes to decompose "docker-esque" references into constituent
components, without arbitrarily enforcing those opinions on the backend.
Ultimately, the name and the reference used to qualify that name are
completely opaque to containerd. Obviously, implementors will need to
show some candor in following some conventions, but the possibilities
are fairly wide. Structurally, we still maintain the concept of the
locator and object but the interpretation is up to the resolver.
For the most part, the `dist` tool operates exactly the same, except
objects can be fetched with a reference:
```
dist fetch docker.io/library/redis:latest
```
The above should work well with a running containerd instance. I
recommend giving this a try with `fetch-object`, as well. With
`fetch-object`, it is easy for one to better understand the intricacies
of the OCI/Docker image formats.
Ultimately, this serves the main purpose of the elusive "metadata
store".
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With the rename of fetch to fetch-object, we now introduce the `fetch`
command. It will fetch all of the resources required for an image into
the content store. We'll still need to follow this up with metadata
registration but this is a good start.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
To make using the `fetch-object` for demonstrations much easier, the
mediatypes are defaulted when a non-digest object identifier is
provided. We also add support for OCI mediatypes, although they are
mostly unavailable.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
To allow us to differentiate between fetching an image, fetching a part of
an image and pulling an image, we now call the `fetch` command the
`fetch-object` command. We can now introduce a command that does the
complete image fetch without creating snapshots, allowing `pull` to
perform the entire process.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Allow deletion of content over the GRPC interface. For now, we are going
with a model that conducts reference management outside of the content
store, in the metadata store, but this design is valid either way.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
When using the fetcher concurrently, the loop modifying the closed-over
`base` parameter was causing urls from different digests to be returned
randomly. We copy the value and then modify it to make it work
correctly.
Luckily, we are using content addressable storage or this would have
been undetectable.
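The essence of the fix, as a standalone sketch: copy the URL value per
request instead of mutating the shared pointer (the URL and digests below
are placeholders):
```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

func main() {
	// base is shared across goroutines in the real code; placeholder here.
	base, _ := url.Parse("https://registry.example.com/v2/library/redis")
	for _, dgst := range []string{"sha256:aaa...", "sha256:bbb..."} { // placeholders
		u := *base // copy the value; concurrent fetches no longer share state
		u.Path = path.Join(u.Path, "blobs", dgst)
		fmt.Println(u.String())
	}
}
```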
Signed-off-by: Stephen J Day <stephen.day@docker.com>
A previous PR placed the version string replacement in the `init`
function in the other commands. This makes this same change consistently
in the `dist` tool.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Because of the plugin findings and having the default runtime built in,
this makes it much better for development and testing.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
For clients which only want to know about one container this is simpler than
searching the result of execution.List.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
After implementing pull, a few changes are required to the content store
interface to make sure that the implementation works smoothly.
Specifically, we work to make sure the predeclaration path for digests
works the same between remote and local writers. Before, we were
hesitant to require the size and digest up front, but it became
clear that having this provided significant benefit.
There are also several cleanups related to naming. We now call the
expected digest `Expected` consistently across the board and `Total` is
used to mark the expected size.
This whole effort comes together to provide a very smooth status
reporting workflow for image pull and push. This will be more obvious
when the bulk of pull code lands.
There are a few other changes to make `content.WriteBlob` more broadly
useful. In accordance with addition for predeclaring expected size when
getting a `Writer`, `WriteBlob` now supports this fully. It will also
resume downloads if provided an `io.Seeker` or `io.ReaderAt`. Coupled
with the `httpReadSeeker` from `docker/distribution`, we should only be
a few lines of code away from resumable downloads.
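In code, the predeclared form reads approximately like this (the signature
is taken from the description above and should be treated as an
assumption):
```go
import (
	"context"
	"io"

	"github.com/docker/containerd/content"
	digest "github.com/opencontainers/go-digest"
)

// writeVerified is a sketch: Total size and Expected digest are declared up
// front, letting WriteBlob verify as it goes and resume partial transfers
// when rd also implements io.Seeker or io.ReaderAt.
func writeVerified(ctx context.Context, cs content.Ingester, ref string,
	rd io.Reader, size int64, expected digest.Digest) error {
	return content.WriteBlob(ctx, cs, ref, rd, size, expected)
}
```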
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This setup will now correctly set the version number from the git tag.
When using `--version`, we will see the binary name, the package it was
built from and a git hash based on the tag:
```console
$./bin/dist -v
./bin/dist github.com/docker/containerd 0b45d91.m
```
Note that in the above example, if we set a tag of `v1.0.0-dev`, that
will show up in the version number, as follows:
```console
$./bin/dist -v
./bin/dist github.com/docker/containerd v1.0.0-dev
```
Once commits are made past that tag, the version number will be
expressed relative to that tag and include a git hash:
```console
$./bin/dist -v
./bin/dist github.com/docker/containerd v1.0.0-dev-1-g7953e96.m
```
Some of these examples include a `.m` postfix. This indicates that the
binary was built from a source tree with local modifications.
We can add a dev tag to start getting 1.0 version numbers for test
builds.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add registration for more subsystems via plugins
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Move content service to separate package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
We've moved to using the config, directly. This removes the argument and
gets rid of a few extra lines of code.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Following from the rest of the work in this branch, we now are porting
the dist command to work directly against the containerd content API.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
After iterating on the GRPC API, the changes required for the actual
implementation are now included in the content store. The main change
is the move to a single, atomic `Ingester.Writer` method for locking
content ingestion on a key. From this, comes several new interface
definitions.
The main benefit here is the clarification between `Status` and `Info`
that came out of the GRPC API. `Status` tells the status of a write,
whereas `Info` is for querying metadata about various blobs.
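As a sketch of the distinction (field names are assumptions, not the
actual types): `Info` answers "what do we know about this blob?", while
`Status` answers "how is this write going?":
```go
import (
	"time"

	digest "github.com/opencontainers/go-digest"
)

// Info describes a committed blob: metadata about content at rest.
type Info struct {
	Digest      digest.Digest
	Size        int64
	CommittedAt time.Time
}

// Status describes an in-flight write keyed by ref.
type Status struct {
	Ref       string
	Offset    int64
	Total     int64
	Expected  digest.Digest
	StartedAt time.Time
	UpdatedAt time.Time
}
```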
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This adds a config file for containerd configuration. It is hard to
have structured data in cli flags, and the config file should be used for
the majority of fields when configuring containerd.
There are still a few flags on the daemon that override config file
values but flags should take a back seat going forward and should be
kept at a minimum.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
It duplicates --log-level, and since --log-level takes precedence over
--debug and has a default value, --debug never takes effect now.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Removed unused requires root test function and updated
tar requires function to use lookup method.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
After trying to explain the complexities of developing with protobuf, I
have now created a command that correctly calculates the import paths
for each package and runs the protobuf command.
The Makefile has been updated accordingly, except we now no longer use
`go generate`. A new target `protos` has been defined. We alias the two,
for the lazy. We leave `go generate` in place for cases where we will
actually use `go generate`.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This changeset adds the simple apply command. It consumes a tar layer
and applies that layer to the specified directory. For the most part, it
is a direct call into Docker's `pkg/archive.ApplyLayer`.
The following demonstrates unpacking the wordpress rootfs into a local
directory `wordpress`:
```
$ ./dist fetch docker.io/library/wordpress 4.5 mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "sudo ./dist apply ./wordpress < $(./dist path -n "+.digest+")"' | xargs -I{} -n1 sh -c "{}"
```
Note that you should have fetched the layers into the local content
store before running the above. Alternatively, you can just read the
manifest from the content store, rather than fetching it. We use fetch
above to avoid having to look up the manifest digest for our demo.
This tool has a long way to go. We still need to incorporate
snapshotting, as well as the ability to calculate the `ChainID` under
subsequent unpacking. Once we have some tools to play around with
snapshotting, we'll be able to incorporate our `rootfs.ApplyLayer`
algorithm that will get us a lot closer to a production worthy system.
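For reference, the OCI image-spec Go helpers can compute a `ChainID` from
an ordered list of layer DiffIDs; a small sketch with placeholder digests:
```go
package main

import (
	"fmt"

	digest "github.com/opencontainers/go-digest"
	"github.com/opencontainers/image-spec/identity"
)

func main() {
	// DiffIDs of the uncompressed layers, in application order (placeholders).
	diffIDs := []digest.Digest{
		"sha256:9a46...", // placeholder
		"sha256:1b2c...", // placeholder
	}
	// ChainID(L1..Ln) = Digest(ChainID(L1..Ln-1) + " " + DiffID(Ln))
	fmt.Println(identity.ChainID(diffIDs))
}
```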
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
in parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, one could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With this changeset we introduce several new things. The first is the
top-level dist command. This is a toolkit that implements various
distribution primitives, such as fetching, unpacking and ingesting.
The first component to this is a simple `fetch` command. It is a
low-level command that takes a "remote", identified by a `locator`, and
an object identifier. Keyed by the locator, this tool can identify a
remote implementation to fetch the content and write it back to standard
out. By allowing this to be the unit of pluggability in fetching
content, we can have quite a bit of flexibility in how we retrieve
images.
The current `fetch` implementation provides anonymous access to docker
hub images, through the namespace `docker.io`. As an example, one can
fetch the manifest for `redis` with the following command:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json
```
Note that we have provided a mediatype "hint", nudging the fetch
implementation to grab the correct endpoint. We can hash the output of
that to fetch the same content by digest:
```
$ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256)
```
Note that the hint is now elided, since we have affixed the content to a
particular hash.
If you are not yet entertained, let's bring `jq` and `xargs` into the
mix for maximum fun. The following incantation fetches the same manifest
and downloads all layers into the convenience of `/dev/null`:
```
$ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null
```
This is just the beginning. We should be able to centralize
configuration around fetch to implement a number of distribution
methodologies that have been challenging or impossible up to this point.
The `locator`, mentioned earlier, is a schemaless URL that provides a
host and path that can be used to resolve the remote. By dispatching on
this common identifier, we should be able to support almost any protocol
and discovery mechanism imaginable.
When this is more solidified, we can roll these up into higher-level
operations that can be orchestrated through the `dist` tool or via GRPC.
What a time to be alive!
Signed-off-by: Stephen J Day <stephen.day@docker.com>
github.com/docker/docker/pkg/archive requires Sirupsen/logrus.
So let's remove sirupsen/logrus for the moment.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
The default terminal setting for a new pty on Linux (unix98) has +ONLCR,
resulting in '\n' writes by a container process being converted to
'\r\n' reads by the managing process. This is quite unexpected. To fix it, make
the terminal sane after opening it by setting -ONLCR.
This fix comes from eea28f480d;
thanks to @cyphar (Aleksa Sarai <asarai@suse.de>).
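A sketch of that fix in Go, using golang.org/x/sys/unix on the pty master
(the device path below is a placeholder):
```go
package main

import (
	"log"
	"os"

	"golang.org/x/sys/unix"
)

// unsetONLCR clears the ONLCR output flag so '\n' from the container is not
// rewritten to '\r\n' on the master side. Linux-only (unix98 ptys).
func unsetONLCR(fd uintptr) error {
	termios, err := unix.IoctlGetTermios(int(fd), unix.TCGETS)
	if err != nil {
		return err
	}
	termios.Oflag &^= unix.ONLCR
	return unix.IoctlSetTermios(int(fd), unix.TCSETS, termios)
}

func main() {
	f, err := os.Open("/dev/ptmx") // placeholder: the pty master fd
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := unsetONLCR(f.Fd()); err != nil {
		log.Fatal(err)
	}
}
```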
Signed-off-by: Wang Long <long.wanglong@huawei.com>
This adds Makefile and cmd/protoc-gen-gogoctrd for generating protobufs.
More adjustments are required and the default target has been stubbed
out for now.
Signed-off-by: Stephen J Day <stephen.day@docker.com>