The split between provider and ingester was a long standing division
reflecting the client-side use cases. For the most part, we were
differentiating these for the algorithms that operate them, but it made
instantation and use of the types challenging. On the server-side, this
distinction is generally less important. This change unifies these types
and in the process we get a few benefits.
The first is that we now completely access the content store over GRPC.
This was the initial intent and we have now satisfied this goal
completely. There are a few issues around listing content and getting
status, but we resolve these with simple streaming and regexp filters.
More can probably be done to polish this but the result is clean.
Several other content-oriented methods were polished in the process of
unification. We have now properly seperated out the `Abort` method to
cancel ongoing or stalled ingest processes. We have also replaced the
`Active` method with a single status method.
The transition went extremely smoothly. Once the clients were updated to
use the new methods, every thing worked as expected on the first
compile.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
After iterating on the GRPC API, the changes required for the actual
implementation are now included in the content store. The begin change
is the move to a single, atomic `Ingester.Writer` method for locking
content ingestion on a key. From this, comes several new interface
definitions.
The main benefit here is the clarification between `Status` and `Info`
that came out of the GPRC API. `Status` tells the status of a write,
whereas `Info` is for querying metadata about various blobs.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With this change, we add the following commands to the dist tool:
- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
We demonstrate the utility with the following shell pipeline:
```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```
The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.
Put shortly, this is a parallel layer download.
In a separate shell session, could monitor the active downloads with the
following:
```
$ watch -n0.2 ./dist active
```
For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:
```
$ watch -n0.2 tree .content
```
This will help to understand what is going on internally.
To get access to the layers, you can use the path command:
```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```
When you are done, you can clear out the content with the classic xargs
pipeline:
```
$ ./dist list -q | xargs ./dist delete
```
Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.
From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.
Signed-off-by: Stephen J Day <stephen.day@docker.com>