Content commit is updated to take in a context, allowing
content to be committed within the same context the writer
was in. This is useful when commit can use the caller's
context to complete the action rather than creating its own.
For example, in the metadata implementation of content,
passing a context allows tests to fully create content in one
database transaction.
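A minimal sketch of the resulting shape, with the interface abbreviated to the commit path only (names approximate the API of this era):

```go
package content

import (
	"context"

	"github.com/opencontainers/go-digest"
)

// Writer is abbreviated here; only the commit path is shown.
type Writer interface {
	// Commit now receives the caller's context, so an implementation
	// such as the metadata store can join a database transaction
	// carried in ctx rather than opening its own.
	Commit(ctx context.Context, size int64, expected digest.Digest) error
}
```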
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Ensure all writers are closed at the end of each test in the
content test suite. Prevents tests from leaving lingering
connections to the content store.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Add commit options which allow for setting labels on commit.
Prevents a potential race between the garbage collector reading labels
after commit and the labels being set.
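A hedged sketch of the option pattern this suggests; `CommitOpt` and `WithLabels` are illustrative names, not necessarily the exact containerd API:

```go
package content

// commitConfig collects per-commit settings.
type commitConfig struct {
	labels map[string]string
}

// CommitOpt mutates the configuration for a single commit.
type CommitOpt func(*commitConfig)

// WithLabels attaches labels atomically with the commit, so the
// garbage collector can never observe the blob without them.
func WithLabels(labels map[string]string) CommitOpt {
	return func(c *commitConfig) { c.labels = labels }
}
```

A caller would then commit as `w.Commit(ctx, size, expected, WithLabels(labels))`, closing the window between commit and label writes.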
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
After some analysis, it was found that Content.Reader was generally
redundant to an io.ReaderAt. This change removes `Content.Reader` in
favor of a `Content.ReaderAt`. In general, `ReaderAt` can perform better
over interfaces with indeterminate latency because it avoids remote
state for reads. Where a reader is required, a helper is provided to
convert it into an `io.SectionReader`.
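The helper amounts to wrapping the `ReaderAt` in an `io.SectionReader` that spans the whole blob. A minimal sketch, assuming the `ReaderAt` also reports its size:

```go
package content

import "io"

// ReaderAt is a positional reader that also reports the blob size.
type ReaderAt interface {
	io.ReaderAt
	Size() int64
}

// NewReader adapts a ReaderAt into the sequential reader some
// callers still need.
func NewReader(ra ReaderAt) *io.SectionReader {
	return io.NewSectionReader(ra, 0, ra.Size())
}
```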
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Export as a tar (Note: "-" can be used for stdout):
$ ctr images export /tmp/oci-busybox.tar docker.io/library/busybox:latest
Import a tar (Note: "-" can be used for stdin):
$ ctr images import foo/new:latest /tmp/oci-busybox.tar
Note: media types are not converted at the moment: e.g.
application/vnd.docker.image.rootfs.diff.tar.gzip
-> application/vnd.oci.image.layer.v1.tar+gzip
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Fixes message on pull about statuses not being found.
These statuses can just be ignored since they are no
longer valid, and holding the database lock while reading
statuses is not necessary.
Closes #1231
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Because the lock on an ingest ref was held regardless of whether a
writer was in use, resuming an existing ingest proved impossible. We now
defer writer locking to the content store backend, where the lock will
be released automatically on closing the writer or on restarting
containerd.
There are still cases where a writer can be abandoned but not closed,
leaving an active ingest, but this is extremely rare and can be resolved
with a daemon restart.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Add content test suite with test for writer.
Update fs and metadata implementations to use test suite.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Update list content command to support filters
Add label subcommand to content in dist tool to update labels
Add uncompressed label on unpack
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Move content status to list statuses and add a single status
method to the interface.
Updates API to support list statuses and status
Updates snapshot key creation to be generic
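A rough sketch of the resulting status surface; field and method names approximate the API described above:

```go
package content

import (
	"context"
	"time"
)

// Status describes a single in-flight ingest.
type Status struct {
	Ref       string
	Offset    int64
	Total     int64
	StartedAt time.Time
	UpdatedAt time.Time
}

// IngestManager pairs the new single-ref Status with ListStatuses.
type IngestManager interface {
	Status(ctx context.Context, ref string) (Status, error)
	ListStatuses(ctx context.Context, filters ...string) ([]Status, error)
}
```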
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Now that we have most of the services required for use with containerd,
it was found that common patterns were used throughout services. By
defining a central `errdefs` package, we ensure that services will map
errors to and from grpc consistently and cleanly. One can decorate an
error with as much context as necessary, using `pkg/errors` and still
have the error mapped correctly via grpc.
We make a few sacrifices. At this point, the common errors we use across
the repository all map directly to grpc error codes. While this seems
positively crazy, it actually works out quite well. The error conditions
that were specific weren't super necessary and the ones that were
necessary now simply have better context information. We lose the
ability to add new codes, but this constraint may not be a bad thing.
Effectively, as long as one uses the errors defined in `errdefs`, the
error class will be mapped correctly across the grpc boundary and
everything will be good. If you don't use those definitions, the error
maps to "unknown" and the error message is preserved.
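A sketch of the intended flow, assuming helpers shaped like the `errdefs` package described here (`ErrNotFound`, `IsNotFound`, `ToGRPC`, `FromGRPC`):

```go
package main

import (
	"fmt"

	"github.com/containerd/containerd/errdefs"
	"github.com/pkg/errors"
)

func lookupImage(name string) error {
	// Decorate the sentinel with context; the class survives wrapping.
	return errors.Wrapf(errdefs.ErrNotFound, "image %q", name)
}

func main() {
	err := lookupImage("docker.io/library/redis:latest")

	// Server side: map the error class onto a grpc status code.
	grpcErr := errdefs.ToGRPC(err)

	// Client side: map back and test the class, never the string.
	if errdefs.IsNotFound(errdefs.FromGRPC(grpcErr)) {
		fmt.Println("not found, message preserved:", grpcErr)
	}
}
```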
Signed-off-by: Stephen J Day <stephen.day@docker.com>
These interfaces allow us to preserve both the checking of error "cause"
and the messages returned from the gRPC API, so that the client gets the
full error reason instead of a default "metadata: not found" in the case
of a missing image.
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
Updates content service to handle lock errors and return
them to the client. The client remote handler has been
updated to retry when a resource is locked until the
resource is unlocked or the expected resource exists.
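A hypothetical sketch of the client-side retry loop; `open`, `exists`, and `errLocked` are illustrative stand-ins for the real client and service calls:

```go
package client

import (
	"context"
	"errors"
	"io"
	"time"
)

// errLocked stands in for the lock error the service returns.
var errLocked = errors.New("resource locked")

// writerWithRetry retries open while the ref is locked, giving up
// once the expected content exists or the context is done.
func writerWithRetry(ctx context.Context, open func() (io.WriteCloser, error), exists func() bool) (io.WriteCloser, error) {
	for {
		w, err := open()
		if err == nil {
			return w, nil
		}
		if !errors.Is(err, errLocked) {
			return nil, err
		}
		// Another writer holds the ref; if the expected content has
		// already landed, there is nothing left to do.
		if exists() {
			return nil, errors.New("content already exists")
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(20 * time.Millisecond):
		}
	}
}
```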
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Open the content store writer with the expected digest before
opening the fetch call to the registry. Add a Copy method
to the content store helpers to allow content copy to take
advantage of the seeking helper code.
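A hedged sketch of that ordering, with signatures approximating this era of the API; `openFetch` stands in for the registry request:

```go
package fetch

import (
	"context"
	"io"

	"github.com/containerd/containerd/content"
	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// fetchBlob acquires the writer (and the ref lock it implies) before
// the registry round trip begins.
func fetchBlob(ctx context.Context, ing content.Ingester, ref string, desc ocispec.Descriptor,
	openFetch func(context.Context) (io.ReadCloser, error)) error {

	w, err := ing.Writer(ctx, ref, desc.Size, desc.Digest)
	if err != nil {
		return err
	}
	defer w.Close()

	rc, err := openFetch(ctx)
	if err != nil {
		return err
	}
	defer rc.Close()

	// Copy verifies size and digest and can reuse the seeking helper
	// code to resume partially completed ingests.
	return content.Copy(ctx, w, rc, desc.Size, desc.Digest)
}
```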
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
The split between provider and ingester was a long-standing division
reflecting the client-side use cases. For the most part, we were
differentiating these for the algorithms that operate on them, but it made
instantiation and use of the types challenging. On the server side, this
distinction is generally less important. This change unifies these types
and in the process we get a few benefits.
The first is that we now completely access the content store over GRPC.
This was the initial intent and we have now satisfied this goal
completely. There are a few issues around listing content and getting
status, but we resolve these with simple streaming and regexp filters.
More can probably be done to polish this but the result is clean.
Several other content-oriented methods were polished in the process of
unification. We have now properly separated out the `Abort` method to
cancel ongoing or stalled ingest processes. We have also replaced the
`Active` method with a single status method.
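A rough sketch of the unified type, reusing the `ReaderAt`, `Writer`, and `Status` shapes sketched earlier; names approximate the API described above:

```go
package content

import (
	"context"

	"github.com/opencontainers/go-digest"
)

// Provider serves committed content and Ingester accepts new
// content; Store unifies them.
type Provider interface {
	ReaderAt(ctx context.Context, dgst digest.Digest) (ReaderAt, error)
}

type Ingester interface {
	Writer(ctx context.Context, ref string, size int64, expected digest.Digest) (Writer, error)
}

type Store interface {
	Provider
	Ingester

	// Status reports a single active ingest by ref, replacing the
	// old Active listing for per-ref queries.
	Status(ctx context.Context, ref string) (Status, error)

	// Abort cancels an ongoing or stalled ingest.
	Abort(ctx context.Context, ref string) error
}
```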
The transition went extremely smoothly. Once the clients were updated to
use the new methods, everything worked as expected on the first
compile.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This mainly fixes Linux vs generic Unix differences, with some
differences between Darwin and FreeBSD (which are close but not
identical). This should make fixing other Unix platforms easier.
Note there are not yet `runc` equivalents for these platforms;
my current use case is image manipulation for the `moby` tool.
However there is interest in OCI runtime ports for both platforms.
Current status is that macOS can build and run `ctr`, `dist`,
and `containerd`, and some operations are supported. FreeBSD 11
still needs some more fixes to continuity for extended attributes.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
When compiling for armv7, the type is `int32` for `Timespec.Sec` and
`Timespec.Nsec` in the syscall package. When passing these values
to `time.Unix`, the values must be converted to `int64`.
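The fix is a plain widening conversion, a no-op on 64-bit platforms and required on armv7 (shown here for Linux):

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// timespecToTime widens Sec and Nsec before calling time.Unix, which
// takes int64s. On armv7 both fields are int32.
func timespecToTime(ts syscall.Timespec) time.Time {
	return time.Unix(int64(ts.Sec), int64(ts.Nsec))
}

func main() {
	fmt.Println(timespecToTime(syscall.Timespec{Sec: 1, Nsec: 0}))
}
```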
Signed-off-by: Kevin Swiber <kswiber@gmail.com>
With this changeset, we now have a proof of concept of end to end pull.
Up to this point, the relationship between subsystems has been somewhat
theoretical. We now leverage fetching, the snapshot drivers, the rootfs
service, image metadata and the execution service, validating the proposed
model for containerd. There are a few caveats, including the need to move some
of the access into GRPC services, but the basic components are there.
The first command we will cover here is `dist pull`. This is the analog
of `docker pull` and `git pull`. It performs a full resource fetch for
an image and unpacks the root filesystem into the snapshot drivers. An
example follows:
```console
$ sudo ./bin/dist pull docker.io/library/redis:latest
docker.io/library/redis:latest: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3b281f2bcae3b25c701d53a219924fffe79bdb74385340b73a539ed4020999c4: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:e4a35914679d05d25e2fccfd310fde1aa59ffbbf1b0b9d36f7b03db5ca0311b0: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4b7726832aec75f0a742266c7190c4d2217492722dfd603406208eaa902648d8: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:338a7133395941c85087522582af182d2f6477dbf54ba769cb24ec4fd91d728f: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:83f12ff60ff1132d1e59845e26c41968406b4176c1a85a50506c954696b21570: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:693502eb7dfbc6b94964ae66ebc72d3e32facd981c72995b09794f1e87bac184: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:622732cddc347afc9360b4b04b46c6f758191a1dc73d007f95548658847ee67e: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:19a7e34366a6f558336c364693df538c38307484b729a36fede76432789f084f: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 1.6 s total: 0.0 B (0.0 B/s)
INFO[0001] unpacking rootfs
```
Note that we haven't integrated rootfs unpacking into the status output, but we
pretty much have what is in docker today (:P). We can see the result of our pull
with the following:
```console
$ sudo ./bin/dist images
REF TYPE DIGEST SIZE
docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059 1.8 kB
```
The above shows that we have an image called "docker.io/library/redis:latest"
mapped to the given digest marked with a specific format. We get the size of
the manifest right now, not the full image, but we can add more as we need it.
For the most part, this is all that is needed, but a few tweaks to the model
for naming may need to be added. Specifically, we may want to index under a few
different names, including those qualified by hash or matched by tag versions.
We can do more work in this area as we develop the metadata store.
The name shown above can then be used to run the actual container image. We can
do this with the following command:
```console
$ sudo ./bin/ctr run --id foo docker.io/library/redis:latest /usr/local/bin/redis-server
1:C 17 Mar 17:20:25.316 # Warning: no config file specified, using the default config. In order to specify a config file use /usr/local/bin/redis-server /path/to/redis.conf
1:M 17 Mar 17:20:25.317 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 3.2.8 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in standalone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 1
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
1:M 17 Mar 17:20:25.326 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 17 Mar 17:20:25.326 # Server started, Redis version 3.2.8
1:M 17 Mar 17:20:25.326 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 17 Mar 17:20:25.326 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 17 Mar 17:20:25.326 * The server is now ready to accept connections on port 6379
```
Wow! So, now we are running `redis`!
There are still a few things to work out. Notice that we have to specify the
command as part of the arguments to `ctr run`. This is because we are not yet
reading the image config and converting it to an OCI runtime config. With the
base laid in this PR, adding such functionality should be straightforward.
While this is a _little_ messy, this is great progress. It should be easy
sailing from here.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
To make restarting after failed pull less racy, we define `Truncate(size
int64) error` on `content.Writer` for the zero offset. Truncating a
writer will dump any existing data and digest state and start from the
beginning. All subsequent writes will start from the zero offset.
For the service, we support this by defining the behavior for a write
that changes the offset. To keep this narrow, we only support writes out
of order at the offset 0, which causes the writer to dump existing data
and reset the local hash.
This makes restarting failed pulls much smoother when there was a
previously encountered error and the source doesn't support arbitrary
seeks or reads at arbitrary offsets. By allowing this to be done while
holding the write lock on a ref, we can restart the full download
without causing a race condition.
Once we implement seeking on the `io.Reader` returned by the fetcher,
this will be less useful, but it is good to ensure that our protocol
properly supports this use case for when streaming is the only option.
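A minimal sketch of the restart path, with `content.Writer` abbreviated to the pieces used; `fetchAll` is an illustrative stand-in for a fetcher that only supports full-stream reads:

```go
package remote

import (
	"context"
	"io"
)

// Writer abbreviates the content.Writer of this era: an io.Writer
// that can also Truncate back to the zero offset.
type Writer interface {
	io.Writer
	Truncate(size int64) error
}

// restart dumps any partial data and digest state, then replays the
// source from the beginning.
func restart(ctx context.Context, w Writer, fetchAll func(context.Context) (io.ReadCloser, error)) error {
	if err := w.Truncate(0); err != nil {
		return err
	}
	rc, err := fetchAll(ctx)
	if err != nil {
		return err
	}
	defer rc.Close()
	_, err = io.Copy(w, rc)
	return err
}
```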
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Allow deletion of content over the GRPC interface. For now, we are going
with a model that conducts reference management outside of the content
store, in the metadata store but this design is valid either way.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
After implementing pull, a few changes are required to the content store
interface to make sure that the implementation works smoothly.
Specifically, we work to make sure the predeclaration path for digests
works the same between remote and local writers. Before, we were
hesitant to require the size and digest up front, but it became
clear that having them provided significant benefit.
There are also several cleanups related to naming. We now call the
expected digest `Expected` consistently across the board and `Total` is
used to mark the expected size.
This whole effort comes together to provide a very smooth status
reporting workflow for image pull and push. This will be more obvious
when the bulk of pull code lands.
There are a few other changes to make `content.WriteBlob` more broadly
useful. In keeping with the addition of predeclaring the expected size
when getting a `Writer`, `WriteBlob` now supports this fully. It will also
resume downloads if provided an `io.Seeker` or `io.ReaderAt`. Coupled
with the `httpReadSeeker` from `docker/distribution`, we should only be
a few lines of code away from resumable downloads.
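A sketch of the resume behavior described above, not the exact helper; the `Writer` and `Status` shapes are abbreviated to what the sketch needs:

```go
package blob

import (
	"context"
	"errors"
	"io"

	"github.com/opencontainers/go-digest"
)

// Writer abbreviates content.Writer: Status reports the offset of a
// partial ingest and Commit validates the expected size and digest.
type Writer interface {
	io.Writer
	Status() (Status, error)
	Commit(ctx context.Context, size int64, expected digest.Digest) error
}

// Status carries only the field this sketch needs.
type Status struct{ Offset int64 }

// writeBlob resumes a partial ingest when the source can seek, as
// docker/distribution's httpReadSeeker can.
func writeBlob(ctx context.Context, w Writer, r io.Reader, total int64, expected digest.Digest) error {
	ws, err := w.Status()
	if err != nil {
		return err
	}
	if ws.Offset > 0 {
		s, ok := r.(io.Seeker)
		if !ok {
			return errors.New("cannot resume: source does not support seeking")
		}
		if _, err := s.Seek(ws.Offset, io.SeekStart); err != nil {
			return err
		}
	}
	if _, err := io.Copy(w, r); err != nil {
		return err
	}
	return w.Commit(ctx, total, expected)
}
```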
Signed-off-by: Stephen J Day <stephen.day@docker.com>