We move from having our own generated version of the googleapis files to
an upstream version that is present in gogo. As part of this, we update
the protobuf package to 1.0 and make some corrections for slight
differences in the generated code.
The impact of this change is very low.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Update content ingests to use content from another namespace.
Ingests must be committed to make content available and the
client will see the sharing as an ingest which has already
been fully written to, but not completed.
Updated the database version to change the ingest record in
the database from a link key to an object with a link and
expected value. This expected value is used to indicate that
the content already exists and an underlying writer may
not yet exist.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
This implements the Windows snapshotter and diff Apply function.
This allows for Windows layers to be created, and layers to be pulled
from the hub.
Signed-off-by: Darren Stahl <darst@microsoft.com>
Because tasks may be deleted while listing containers, we need to ignore
errors from state requests that are due to a closed error. All of these
get mapped to ErrNotFound, which can be used to filter the entries.
There may be a better fix that does a better job of keeping track of the
intended state of a backend task. The current condition of assuming that
a closed client is a shutdown task may be too naive.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This linter checks for unnecessary type convertions.
Some convertions are whitelisted because their type is different
on 32bit platforms
Signed-off-by: Daniel Nephin <dnephin@gmail.com>
The boltdb image store now manages its own transactions when
one is not provided, but allows the caller to pass in a
transaction through the context. This makes the image store
more similar to the content and snapshot stores. Additionally,
use the reference to the metadata database to mark the content
store as dirty after an image has been deleted. The deletion
of an image means a reference to a piece of content is gone
and therefore garbage collection should be run to check if
any resources can be cleaned up as a result.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Synchronous image delete provides an option image delete to wait
until the next garbage collection deletes after an image is removed
before returning success to the caller.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Add garbage collection as a background process and policy
configuration for configuring when to run garbage collection.
By default garbage collection will run when deletion occurs
and no more than 20ms out of every second.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
By defining a concrete, non-protobuf type for the events interface, we
can completely decouple it from the grpc packages that are expensive at
runtime. This does requires some allocation cost for converting between
types, but the saving for the size of the shim are worth it.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
To avoid importing all of grpc when consuming events, the types of
events have been split in to a separate package. This should allow a
reduction in memory usage in cases where a package is consuming events
but not using the gprc service directly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The locks now retry on the backend side to prevent clients from having
to round trip on locks that might be momentarily held. This exposed some
timing errors in the updated_at fields for content ingest, so we've had
to move that to a separate file to export the monotonic go runtime
timestamps.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Since these are registered and the interface is what matters, these
Service types do not need to be exported.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Prevent checkpoints from getting garbage collected by
adding root labels to unreferenced checkpoint objects.
Mark checkpoints as gc roots.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Add differ options and package with interface.
Update optional values on diff interface to use options.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
With this change, we integrate all the plugin changes into the
introspection service.
All plugins can be listed with the following command:
```console
$ ctr plugins
TYPE ID PLATFORM STATUS
io.containerd.content.v1 content - ok
io.containerd.metadata.v1 bolt - ok
io.containerd.differ.v1 walking linux/amd64 ok
io.containerd.grpc.v1 containers - ok
io.containerd.grpc.v1 content - ok
io.containerd.grpc.v1 diff - ok
io.containerd.grpc.v1 events - ok
io.containerd.grpc.v1 healthcheck - ok
io.containerd.grpc.v1 images - ok
io.containerd.grpc.v1 namespaces - ok
io.containerd.snapshotter.v1 btrfs linux/amd64 error
io.containerd.snapshotter.v1 overlayfs linux/amd64 ok
io.containerd.grpc.v1 snapshots - ok
io.containerd.monitor.v1 cgroups linux/amd64 ok
io.containerd.runtime.v1 linux linux/amd64 ok
io.containerd.grpc.v1 tasks - ok
io.containerd.grpc.v1 version - ok
```
There are few things to note about this output. The first is that it is
printed in the order in which plugins are initialized. This useful for
debugging plugin initialization problems. Also note that even though the
introspection GPRC api is a itself a plugin, it is not listed. This is
because the plugin takes a snapshot of the initialization state at the
end of the plugin init process. This allows us to see errors from each
plugin, as they happen. If it is required to introspect the existence of
the introspection service, we can make modifications to include it in
the future.
The last thing to note is that the btrfs plugin is in an error state.
This is a common state for containerd because even though we load the
plugin, most installations aren't on top of btrfs and the plugin cannot
be used. We can actually view this error using the detailed view with a
filter:
```console
$ ctr plugins --detailed id==btrfs
Type: io.containerd.snapshotter.v1
ID: btrfs
Platforms: linux/amd64
Exports:
root /var/lib/containerd/io.containerd.snapshotter.v1.btrfs
Error:
Code: Unknown
Message: path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter
```
Along with several other values, this is a valuable tool for evaluating the
state of components in containerd.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Updates metadata plugin to require content and
snapshotter plugins be loaded and initializes with
those plugins, keeping the metadata database structure
static after initialization. Service plugins now only
require metadata plugin access snapshotter or content
stores through metadata, which was already required
behavior of the services.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Fixes#1497
This sets the namespace on the context when deleting a namespace so that
the publish event does not fail. Use the same namespace as you are
deleting for the context.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Matching support is now implemented in the platforms package. The
`Parse` function now returns a matcher object that can be used to
match OCI platform specifications. We define this as an interface to
allow the creation of helpers oriented around platform selection.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This field allows a client to store specialized information in the
container metadata rather than having to store this itself and keep
the data in sync with containerd.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
ref: #1464
This tries to solve issues with races around process state. First it
adds the process mutex around the state call so that any state changes,
deletions, etc will be handled in order.
Second, for IsNoExist errors from the runtime, return a stopped state if
a process has been removed from the underlying OCI runtime but not from
the shim yet. This shouldn't happen with the lock from above but its
hare to verify this issue.
Third, handle shim disconnections and return an ErrNotFound.
Forth, don't abort returning all tasks if one task is unable to return
its state.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Content commit is updated to take in a context, allowing
content to be committed within the same context the writer
was in. This is useful when commit may be able to use more
context to complete the action rather than creating its own.
An example of this being useful is for the metadata implementation
of content, having a context allows tests to fully create
content in one database transaction by making use of the context.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
In order to enforce strict handling of snapshotter values on the
container object, the defaults have been moved to the client side. This
ensures that we correctly qualify the snapshotter under use when from
the container at the time it was created, rather than possibly losing
the metadata on a change of default.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Updates the differ service to support calling and configuring
multiple differs. The differs are configured as an ordered list
of differs which will each be attempting until a supported differ
is called.
Additionally a not supported error type was added to allow differs
to be selective of whether the differ arguments are supported by
the differ. This error type corresponds to the GRPC unimplemented error.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Add commit options which allow for setting labels on commit.
Prevents potential race between garbage collector reading labels
after commit and labels getting set.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
After some analysis, it was found that Content.Reader was generally
redudant to an io.ReaderAt. This change removes `Content.Reader` in
favor of a `Content.ReaderAt`. In general, `ReaderAt` can perform better
over interfaces with indeterminant latency because it avoids remote
state for reads. Where a reader is required, a helper is provided to
convert it into an `io.SectionReader`.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This changes Wait() from returning an error whenever you call wait on a
stopped process/task to returning the exit status from the process.
This also adds the exit status to the Status() call on a process/task so
that a user can Wait(), check status, then cancel the wait to avoid
races in event handling.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This change further plumbs the components required for implementing
event filters. Specifically, we now have the ability to filter on the
`topic` and `namespace`.
In the course of implementing this functionality, it was found that
there were mismatches in the events API that created extra serialization
round trips. A modification to `typeurl.MarshalAny` and a clear
separation between publishing and forwarding allow us to avoid these
serialization issues.
Unfortunately, this has required a few tweaks to the GRPC API, so this
is a breaking change. `Publish` and `Forward` have been clearly separated in
the GRPC API. `Publish` honors the contextual namespace and performs
timestamping while `Forward` simply validates and forwards. The behavior
of `Subscribe` is to propagate events for all namespaces unless
specifically filtered (and hence the relation to this particular change.
The following is an example of using filters to monitor the task events
generated while running the [bucketbench tool](https://github.com/estesp/bucketbench):
```
$ ctr events 'topic~=/tasks/.+,namespace==bb'
...
2017-07-28 22:19:51.78944874 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-6-8","pid":25889}
2017-07-28 22:19:51.791893688 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-4-8","pid":25882}
2017-07-28 22:19:51.792608389 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-2-9","pid":25860}
2017-07-28 22:19:51.793035217 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-5-6","pid":25869}
2017-07-28 22:19:51.802659622 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-0-7","pid":25877}
2017-07-28 22:19:51.805192898 +0000 UTC bb /tasks/start {"container_id":"bb-ctr-3-6","pid":25856}
2017-07-28 22:19:51.832374931 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-8-6","id":"bb-ctr-8-6","pid":25864,"exited_at":"2017-07-28T22:19:51.832013043Z"}
2017-07-28 22:19:51.84001249 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-2-9","id":"bb-ctr-2-9","pid":25860,"exited_at":"2017-07-28T22:19:51.839717714Z"}
2017-07-28 22:19:51.840272635 +0000 UTC bb /tasks/exit {"container_id":"bb-ctr-7-6","id":"bb-ctr-7-6","pid":25855,"exited_at":"2017-07-28T22:19:51.839796335Z"}
...
```
In addition to the events changes, we now display the namespace origin
of the event in the cli tool.
This will be followed by a PR to add individual field filtering for the
events API for each event type.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
In the course of setting out to add filters and address some cleanup, it
was found that we had a few problems in the events subsystem that needed
addressing before moving forward.
The biggest change was to move to the more standard terminology of
publish and subscribe. We make this terminology change across the Go
interface and the GRPC API, making the behavior more familier. The
previous system was very context-oriented, which is no longer required.
With this, we've removed a large amount of dead and unneeded code. Event
transactions, context storage and the concept of `Poster` is gone. This
has been replaced in most places with a `Publisher`, which matches the
actual usage throughout the codebase, removing the need for helpers.
There are still some questions around the way events are handled in the
shim. Right now, we've preserved some of the existing bugs which may
require more extensive changes to resolve correctly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
What started out as a simple PR to remove the "Readonly" column became an
adventure to add a proper type for a "View" snapshot. The short story here is
that we now get the following output:
```
$ sudo ctr snapshot ls
ID PARENT KIND
sha256:08c2295a7fa5c220b0f60c994362d290429ad92f6e0235509db91582809442f3 Committed
testing4 sha256:08c2295a7fa5c220b0f60c994362d290429ad92f6e0235509db91582809442f3 Active
```
In pursuing this output, it was found that the idea of having "readonly" as an
attribute on all snapshots was redundant. For committed, they are always
readonly, as they are not accessible without an active snapshot. For active
snapshots that were views, we'd have to check the type before interpreting
"readonly". With this PR, this is baked fully into the kind of snapshot. When
`Snapshotter.View` is called, the kind of snapshot is `KindView`, and the
storage system reflects this end to end.
Unfortunately, this will break existing users. There is no migration, so they
will have to wipe `/var/lib/containerd` and recreate everything. However, this
is deemed worthwhile at this point, as we won't have to judge validity of the
"Readonly" field when new snapshot types are added.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Since we now have a common set of error definitions, mapped to existing
error codes, we no longer need the specialized error codes used for
interaction with linux processes. The main issue was that string
matching was being used to map these to useful error codes. With this
change, we use errors defined in the `errdefs` package, which map
cleanly to GRPC error codes and are recoverable on either side of the
request.
The main focus of this PR was in removin these from the shim. We may
need follow ups to ensure error codes are preserved by the `Tasks`
service.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Update list content command to support filters
Add label subcommand to content in dist tool to update labels
Add uncompressed label on unpack
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
e.g. dist pull --snapshotter btrfs ...; ctr run --snapshotter btrfs ...
(empty string defaults for overlayfs)
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Move content status to list statuses and add single status
to interface.
Updates API to support list statuses and status
Updates snapshot key creation to be generic
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
The primary feature we get with this PR is support for filters and
labels on the image metadata store. In the process of doing this, the
conventions for the API have been converged between containers and
images, providing a model for other services.
With images, `Put` (renamed to `Update` briefly) has been split into a
`Create` and `Update`, allowing one to control the behavior around these
operations. `Update` now includes support for masking fields at the
datastore-level across both the containers and image service. Filters
are now just string values to interpreted directly within the data
store. This should allow for some interesting future use cases in which
the datastore might use the syntax for more efficient query paths.
The containers service has been updated to follow these conventions as
closely as possible.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Allow plugins to be mapped and returned by their ID.
Add skip plugin to allow plugins to decide whether they should
be loaded.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Now that we have most of the services required for use with containerd,
it was found that common patterns were used throughout services. By
defining a central `errdefs` package, we ensure that services will map
errors to and from grpc consistently and cleanly. One can decorate an
error with as much context as necessary, using `pkg/errors` and still
have the error mapped correctly via grpc.
We make a few sacrifices. At this point, the common errors we use across
the repository all map directly to grpc error codes. While this seems
positively crazy, it actually works out quite well. The error conditions
that were specific weren't super necessary and the ones that were
necessary now simply have better context information. We lose the
ability to add new codes, but this constraint may not be a bad thing.
Effectively, as long as one uses the errors defined in `errdefs`, the
error class will be mapped correctly across the grpc boundary and
everything will be good. If you don't use those definitions, the error
maps to "unknown" and the error message is preserved.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
These rpcs only return pids []uint32 so should be named that way in
order to have other rpcs that list Processes such as Exec'd processes.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
A few days ago, we added validation for namespaces. We've decided to
expand these naming rules to include containers. To facilitate this, a
common package `identifiers` now provides a common validation area.
These rules will be extended to apply to task identifiers, snapshot keys
and other areas where user-provided identifiers may be used.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This moves the shim's API and protos out of the containerd services
package and into the linux runtime package. This is because the shim is
an implementation detail of the linux runtime that we have and it is not
a containerd user facing api.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
To simplify use of types, we have consolidate the packages for the mount
and descriptor protobuf types into a single Go package. We also drop the
versioning from the type packages, as these types will remain the same
between versions.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
When using events, it was found to be fairly unwieldy with a number of
extra packages. For the most part, when interacting with the events
service, we want types of the same version of the service. This has been
accomplished by moving all events types into the events package.
In addition, several fixes to the way events are marshaled have been
included. Specifically, we defer to the protobuf type registration
system to assemble events and type urls, with a little bit sheen on top
of add a containerd.io oriented namespace.
This has resulted in much cleaner event consumption and has removed the
reliance on error prone type urls, in favor of concrete types.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This allow returning a more meaningful message when a request context
doesn't hold a namespace.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: update events package to include emitter and use envelope proto
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: enable events service and update ctr events to use events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
event listeners
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: helper func for emitting in services
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: improved cli for containers and tasks
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
create event envelope with poster
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: introspect event data to use for type url
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: use pb encoding; add event types
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument content and snapshot services with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument image service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument namespace service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add namespace support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: only send events from namespace requested from client
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: switch to go-events for broadcasting
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>