Mounting as MS_SLAVE here breaks use cases which want to use
rootPropagation=shared in order to expose mounts to the host (and other
containers binding the same subtree), mounting as e.g. MS_SHARED is pointless
in this context so just remove.
Having done this we also need to arrange to manually clean up the mounts on
delete, so do so.
Note that runc will also setup root as required by rootPropagation, defaulting
to MS_PRIVATE.
Fixes#1132.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Since we now have a common set of error definitions, mapped to existing
error codes, we no longer need the specialized error codes used for
interaction with linux processes. The main issue was that string
matching was being used to map these to useful error codes. With this
change, we use errors defined in the `errdefs` package, which map
cleanly to GRPC error codes and are recoverable on either side of the
request.
The main focus of this PR was in removin these from the shim. We may
need follow ups to ensure error codes are preserved by the `Tasks`
service.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
After review, there are cases where having common requirements for
namespaces and identifiers creates contention between applications. One
example is that it is nice to have namespaces comply with domain name
requirement, but that does not allow underscores, which are required for
certain identifiers.
The namespaces validation has been reverted to be in line with RFC 1035.
Existing identifiers has been modified to allow simply alpha-numeric
identifiers, while limiting adjacent separators.
We may follow up tweaks for the identifier charset but this split should
remove the hard decisions.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This removes the RuntimeEvent super proto with enums into separate
runtime event protos to be inline with the other events that are output
by containerd.
This also renames the runtime events into Task* events.
Fixes#1071
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This makes it possible to enable shim debug by adding the following to
`config.toml`:
[plugins.linux]
shim_debug = true
I moved the debug setting from the `client.Config struct` to an argument to
`client.WithStart` since this is the only place it would be used.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
The compiler doesn't spot this, but guru does.
This seems to have become unused in 79e6a93624 ("Fix incorrect reference to
the gRPC runtime name as a binary").
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
We hope that containerd supports any OCI compliant runtime, and not only
runc.
This patch fixes all the error messages to not be completely runc
specific and change the initProcess structure to have its runtime
pointer be called 'runtime' and not 'runc'
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
These rpcs only return pids []uint32 so should be named that way in
order to have other rpcs that list Processes such as Exec'd processes.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Until we have a way to preserve the initial command used to start the
container, we have to default to the default `runc` found on the $PATH.
This code after the last refactor of shim/API is incorrectly using the
gRPC object reference of the v1 runtime as a binary name which causes
os.Exec() errors.
Signed-off-by: Phil Estes <estesp@gmail.com>
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
This moves the shim's API and protos out of the containerd services
package and into the linux runtime package. This is because the shim is
an implementation detail of the linux runtime that we have and it is not
a containerd user facing api.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
To simplify use of types, we have consolidate the packages for the mount
and descriptor protobuf types into a single Go package. We also drop the
versioning from the type packages, as these types will remain the same
between versions.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
When using events, it was found to be fairly unwieldy with a number of
extra packages. For the most part, when interacting with the events
service, we want types of the same version of the service. This has been
accomplished by moving all events types into the events package.
In addition, several fixes to the way events are marshaled have been
included. Specifically, we defer to the protobuf type registration
system to assemble events and type urls, with a little bit sheen on top
of add a containerd.io oriented namespace.
This has resulted in much cleaner event consumption and has removed the
reliance on error prone type urls, in favor of concrete types.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: update events package to include emitter and use envelope proto
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: enable events service and update ctr events to use events service
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
event listeners
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: helper func for emitting in services
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: improved cli for containers and tasks
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
create event envelope with poster
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: introspect event data to use for type url
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: use pb encoding; add event types
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument content and snapshot services with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument image service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: instrument namespace service with events
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: add namespace support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: only send events from namespace requested from client
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
events: switch to go-events for broadcasting
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
We need a separate API for handing the exit status and deletion of
Exec'd processes to make sure they are properly cleaned up within the
shim and daemon.
Fixes#973
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This allows attach of existing fifos to be done without any information
stored on the client side.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This also fixes a deadlock in the shim's reaper where execs would lockup
and/or miss a quick exiting exec process's exit status.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
There are three aspects which need to be covered:
- the runtime needs to restart its event pump when it reconnects (in
loadContainer).
- on the server side shim needs to monitor the stream context so it knows when
the connection goes away.
- if the shim's stream.Send() fails (because the stream died between taking
the event off the channel and calling stream.Send()) then to avoid losing
that event the shim should remember it and send it out first on the next
stream.
The shim's event production machinery only handles producing a single event
stream, so add an interlock to ensure there is only one reader of the
`s.events` channel at a time. Subsequent attempts to use Events will block
until the existing owner is done.
Fixes#921.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Working from feedback on the existing implementation, we have now
introduced a central metadata object to represent the lifecycle and pin
the resources required to implement what people today know as
containers. This includes the runtime specification and the root
filesystem snapshots. We also allow arbitrary labeling of the container.
Such provisions will bring the containerd definition of container closer
to what is expected by users.
The objects that encompass today's ContainerService, centered around the
runtime, will be known as tasks. These tasks take on the existing
lifecycle behavior of containerd's containers, which means that they are
deleted when they exit. Largely, there are no other changes except for
naming.
The `Container` object will operate purely as a metadata object. No
runtime state will be held on `Container`. It only informs the execution
service on what is required for creating tasks and the resources in use
by that container. The resources referenced by that container will be
deleted when the container is deleted, if not in use. In this sense,
users can create, list, label and delete containers in a similar way as
they do with docker today, without the complexity of runtime locks that
plagues current implementations.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This moves both the Mount type and mountinfo into a single mount
package.
This also opens up the root of the repo to hold the containerd client
implementation.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update go-runc to 49b2a02ec1ed3e4ae52d30b54a291b75
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add shim to restore creation
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Keep checkpoint path in service
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add C/R to non-shim build
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Checkpoint rw and image
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Pause container on bind checkpoints
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Return dump.log in error on checkpoint failure
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Pause container for checkpoint
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update runc to 639454475cb9c8b861cc599f8bcd5c8c790ae402
For checkpoint into to work you need runc version
639454475cb9c8b861cc599f8bcd5c8c790ae402 + and criu 3.0 as this is what
I have been testing with.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Move restore behind create calls
This remove the restore RPCs in favor of providing the checkpoint
information to the `Create` calls of a container. If provided, the
container will be created/restored from the checkpoint instead of an
existing container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Regen protos after rebase
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds support for signalling a container process by pid.
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
make Ps more extensible
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
ps: windows support
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
Since an error is returned via the RPC clients will assume (rightly so)
that a call to the Delete() RPC is not necessary.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Update go-runc to master with portability fixes.
Subreaper only exists on Linux, and only Linux runs the shim in a
mount namespace.
With these changes the shim compiles on Darwin, which means the
whole build compiles without errors now.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This updates containerd to use the latest versions of cgroups, fifo,
console, and go-runc from the containerd org.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Fixes#770
Use a wait group to wait for the `io.Copy` go routines to be scheduled
before continuing to start the container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Because of go interface unpacking we need to only set the interface on
the opts when we actually have a socket.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This adds a config option to set the oom score for the containerd daemon
as well as automatically setting the oom score for the shim's lauched so
that they are not killed until the very end of an out of memory
condition.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add functionality for restoring containers after containerd dies and is
restarted with terminated shims.
This ensures that on restore, if a container no longer has a running
shim, containerd will kill and cleanup the container.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This reuses the exiting shim code and services to let containerd run as
the reaper for all container processes without the use of a shim.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The message was defined but the method was returning empty, plumb through the
result from the shim layer.
Compile tested only.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Because of the plugin findings and having the default runtime builtin
this makes it much better for development and testing.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add registration for more subsystems via plugins
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Move content service to separate package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Remove runtime files from containerd
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update supervisor for orphaned containers
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Remove ctr/container.go back to rpc calls
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add attach to loaded container
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add monitor based on epoll for process exits
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Convert pids in containerd to string
This is so that we no longer care about linux or system level pids and
processes in containerd have user defined process id(pid) kinda like the
exec process ids that docker has today.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add reaper back to containerd
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Implement list containers with new process model
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Implement restore of processes
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add NONBLOCK to exit fifo open
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Implement tty reattach
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Fix race in exit pipe creation
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Add delete to shim
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Update shim to use pid-file and not stdout
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Check pointers against nil before dereferencing them. Skip any sections
that are nil, since that's equivalent to having no values defined for
those sections.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This currently logs to a json file with the stream type. This is slow
and hard on the cpu and memory so we need to swich this over to
something like protobufs for the binary logs but this is just a start.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>