			The docs have been out of the sync with the actual implementation since 2018. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
		
			
				
	
	
		
# Data Flow

In the past, container systems have hidden the complexity of pulling container
images. This document intends to shed light on that complexity and detail how a
"pull" operation looks from the perspective of a containerd user. We use the
_bundle_ as the target object in this workflow, and walk back from there to
describe the full process. In this context, we describe both pulling an image
and creating a bundle from that image.

With containerd, we redefine the "pull" to comprise the same set of steps
encompassed in prior container engines. In this model, an image defines a
collection of resources that can be used to create a _bundle_. There is no
specific format or object called an image. The goal of splitting the pull into
a set of steps is to resolve the resources that comprise an image, with the
separation providing lifecycle points in the process.

A reference implementation of the complete "pull", performed client-side, will
be provided as part of containerd, but there may not be a single "pull" API
call.

A rough diagram of the dataflow, along with the relevant components, is below.

While the process proceeds left to right in the diagram, this document is
written right to left. By working through this process backwards, we can best
understand the approach employed by containerd.

## Running a Container

For containerd, we'd generally like to retrieve a _bundle_. This is the
runtime, on-disk container layout, which includes the filesystem and
configuration required to run the container.

Generically speaking, we can say we have the following directory:

```
config.json
rootfs/
```

The contents of `config.json` aren't interesting in this context, but for
clarity, it may be the runc config or a containerd-specific configuration file
for setting up a running container. The `rootfs` is a directory where
containerd will set up the runtime container's filesystem.

While containerd doesn't have the concept of an image, we can effectively build
this structure from an image, as projected into containerd. Given this, we can
say that the requirements for running a container are to do the following:

1. Convert the configuration from the container image into the target format
   for the containerd runtime.
2. Reproduce the root filesystem from the container image. While we could
   unpack this into `rootfs` in the bundle, we can also just pass this as a set
   of mounts to the container configuration.

The above defines the framework in which we will operate. Put differently, we
can say that we want to create a bundle by creating these two components of a
bundle.

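To make the layout concrete, here is a minimal sketch that writes such a directory to disk. The helper name `create_bundle` is hypothetical, and the `config.json` it writes holds only a few fields from the OCI runtime configuration (`ociVersion`, `process`, `root`); it is not a complete config that a runtime would accept, and it is not how containerd itself builds bundles.

```python
import json
import tempfile
from pathlib import Path


def create_bundle(bundle_dir: Path, args: list[str]) -> Path:
    """Write a minimal bundle: config.json next to an empty rootfs/ directory."""
    rootfs = bundle_dir / "rootfs"
    rootfs.mkdir(parents=True, exist_ok=True)
    # Illustrative subset of a runtime config; a real one needs many more fields.
    config = {
        "ociVersion": "1.0.2",
        "process": {"args": args, "cwd": "/"},
        "root": {"path": "rootfs"},
    }
    (bundle_dir / "config.json").write_text(json.dumps(config, indent=2))
    return bundle_dir


with tempfile.TemporaryDirectory() as tmp:
    bundle = create_bundle(Path(tmp) / "bundle", ["/bin/sh"])
    layout = sorted(p.name for p in bundle.iterdir())
    print(layout)
```

The `rootfs` directory is left empty here because, as noted above, the root filesystem can also be supplied as a set of mounts rather than unpacked in place.
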
## Creating a Bundle

Now that we've defined what is required to run a container, a _bundle_, we need
to create one.

Let's say we have the following:

```
ctr run ubuntu
```

This does no pulling of images. It only takes the name and creates a _bundle_.
Broken down into steps, the process looks as follows:

1. Look up the digest of the image in the metadata store.
2. Resolve the manifest in the content store.
3. Resolve the layer snapshots in the snapshot subsystem.
4. Transform the config into the target bundle format.
5. Create a runtime snapshot for the rootfs of the container, including resolution of mounts.
6. Run the container.

From this, we can understand the required resources to _pull_ an image:

1. An entry in the metadata store: a name pointing at a particular digest.
2. The manifest must be available in the content store.
3. The result of successively applied layers must be available as a snapshot.

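The lookup chain behind these three resources can be sketched with plain dictionaries standing in for the metadata store, content store, and snapshot subsystem. Everything here is hypothetical: the `resolve` function, the store shapes, and the `top_snapshot` field in the toy manifest (a real image derives its snapshot key from the layer diff IDs in the image config). The point is only to show that resolution fails at the first missing resource.

```python
import hashlib
import json


def digest_of(data: bytes) -> str:
    """Content address of a blob in "algorithm:hex" form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def resolve(name, metadata, content, snapshots):
    """Walk name -> digest -> manifest -> snapshot, failing at the first gap."""
    if name not in metadata:
        raise LookupError(f"no metadata entry for {name!r}")
    manifest_digest = metadata[name]
    if manifest_digest not in content:
        raise LookupError(f"manifest {manifest_digest} not in content store")
    manifest = json.loads(content[manifest_digest])
    key = manifest["top_snapshot"]  # toy field, see the assumptions above
    if key not in snapshots:
        raise LookupError(f"snapshot {key!r} not available")
    return manifest_digest, snapshots[key]


manifest_blob = json.dumps({"top_snapshot": "chain-1"}).encode()
metadata = {"ubuntu": digest_of(manifest_blob)}
content = {digest_of(manifest_blob): manifest_blob}
snapshots = {"chain-1": ["overlay mount (placeholder)"]}

resolved_digest, mounts = resolve("ubuntu", metadata, content, snapshots)
print(resolved_digest, mounts)
```
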
## Unpacking Layers

While this process may be pull or run driven, the idea is quite simple. For
each layer, apply the layer to a snapshot of the previous layer. The result
should be stored under the chain ID (as defined by OCI) of the resulting
application.

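For reference, the OCI image specification defines the chain ID recursively: the chain ID of the first layer is its diff ID, and each later chain ID is the digest of the previous chain ID and the next diff ID joined by a single space. The sketch below computes it for made-up diff IDs; the function name `chain_ids` is ours, and real diff IDs are digests of the uncompressed layer tar archives.

```python
import hashlib


def chain_ids(diff_ids):
    """Return the chain ID after applying each layer in order."""
    chain = []
    for diff_id in diff_ids:
        if not chain:
            chain.append(diff_id)  # ChainID(L0) = DiffID(L0)
        else:
            # ChainID(L0..Ln) = Digest(ChainID(L0..Ln-1) + " " + DiffID(Ln))
            joined = f"{chain[-1]} {diff_id}".encode()
            chain.append("sha256:" + hashlib.sha256(joined).hexdigest())
    return chain


# Placeholder diff IDs for illustration only.
diff_ids = [
    "sha256:" + hashlib.sha256(blob).hexdigest()
    for blob in (b"layer-0", b"layer-1", b"layer-2")
]
ids = chain_ids(diff_ids)
for chain_id in ids:
    print(chain_id)
```

Because each chain ID depends on every layer beneath it, two images that share the same bottom layers in the same order produce the same chain IDs for those layers and can share the corresponding snapshots.
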
## Pulling an Image

With all the above defined, pulling an image simply becomes the following:

1. Fetch the manifest for the image, then verify and store it.
2. Fetch each layer listed in the image manifest, then verify and store it.
3. Store the manifest digest under the provided name.

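The three steps can be sketched against an in-memory stand-in for a registry. This is an illustration under stated assumptions, not containerd's client code: `pull`, `fetch_verified`, the dictionary-based stores, and the toy manifest with only a `layers` list are all hypothetical. The manifest digest is passed in directly, since resolving a name to a location is out of scope here.

```python
import hashlib
import json


def digest_of(data: bytes) -> str:
    """Content address of a blob in "algorithm:hex" form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def fetch_verified(registry, expected, content):
    """Fetch a blob by digest, verify it, and store it in the content store."""
    data = registry[expected]
    if digest_of(data) != expected:
        raise ValueError(f"digest mismatch for {expected}")
    content[expected] = data
    return data


def pull(name, manifest_digest, registry, content, metadata):
    # Step 1: fetch, verify, and store the manifest.
    manifest = json.loads(fetch_verified(registry, manifest_digest, content))
    # Step 2: fetch, verify, and store each layer the manifest lists.
    for layer_digest in manifest["layers"]:
        fetch_verified(registry, layer_digest, content)
    # Step 3: record the manifest digest under the provided name.
    metadata[name] = manifest_digest


# Toy registry keyed by digest.
layers = [b"layer-0 bytes", b"layer-1 bytes"]
manifest_blob = json.dumps({"layers": [digest_of(layer) for layer in layers]}).encode()
registry = {digest_of(blob): blob for blob in layers + [manifest_blob]}

content, metadata = {}, {}
pull("ubuntu", digest_of(manifest_blob), registry, content, metadata)
print(len(content), metadata["ubuntu"])
```

Writing the name last means a failed or interrupted pull never leaves a name pointing at content that is only partially present.
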
Note that we leave off using the name to resolve a particular location. We'll
leave that for another doc!