Commit Graph

65 Commits

Author SHA1 Message Date
Michael Crosby
951c129bf1 Handle locking and errors for process state
ref: #1464

This tries to solve issues with races around process state.  First it
adds the process mutex around the state call so that any state changes,
deletions, etc will be handled in order.

Second, for IsNoExist errors from the runtime, return a stopped state if
a process has been removed from the underlying OCI runtime but not from
the shim yet.  This shouldn't happen with the lock from above but its
hare to verify this issue.

Third, handle shim disconnections and return an ErrNotFound.

Forth, don't abort returning all tasks if one task is unable to return
its state.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-07 16:22:00 -04:00
Kenfe-Mickael Laventure
1b79170849
linux: Add RuntimeRoot to RuncOptions
This allow specifying wher the OCI runtime should store its state data.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-31 14:35:05 -07:00
Kenfe-Mickael Laventure
ab0cb4e756
linux: Honor RuncOptions if set on container
This also fix the type used for RuncOptions.SystemCgroup, hence introducing
an API break.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-31 14:35:05 -07:00
Kenfe-Mickael Laventure
d541567119
Handle SIGKILL'ed shim while daemon is running
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-29 08:27:44 -07:00
Michael Crosby
967497097a Add procesStates for shim processes
Use the state pattern to handle process transitions from one state to
another and what actions can be performed on a process in a specific
state.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-25 14:03:55 -04:00
Michael Crosby
eb58ecab7c Add null io option
This adds null IO option for efficient handling of IO.
It provides a container directly with `/dev/null` and does not require
any io.Copy within the shim whenever a user does not want the IO of the
container.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-14 13:09:16 -04:00
Michael Crosby
fa3454e54d Update go-runc to b85ac701de5065a66918203dd18f05
This includes fixes for pipe ownership and NullIO options.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-11 17:15:25 -04:00
Michael Crosby
d513dd2bfd Fix race with task checkpoint
Because runc will delete a container after a successful checkpoint we
need to handle a NotFound error from runc on delete.

There is also a race between SIGKILL'ing the shim and it actually
exiting to unmount the tasks rootfs, we need to loop and wait for the
task to actually be reaped before trying to delete the rootfs+bundle.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-10 16:35:03 -04:00
Phil Estes
e9b86af848 Merge pull request #1283 from mlaventure/resurrect-state-dir
Resurrect State directory
2017-08-07 10:41:21 -04:00
Kenfe-Mickael Laventure
8700e23a10
Use root dir when storing temporary checkpoint data
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-03 14:38:18 -07:00
Kenfe-Mickael Laventure
fbe4751b83
Use a temp socket to receive the console from runc
This greatly reduce the risk that we will hit the unix socket maximum path
length.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-03 10:44:10 -07:00
Michael Crosby
bf4838eb71 Shutdown console after process exits
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 17:09:50 -04:00
Michael Crosby
504033e373 Add Get of task and process state
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 13:50:08 -04:00
Michael Crosby
9f08965699 Change Exited/Status to SetExited/ExitStatus
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 13:50:08 -04:00
Daniel Dao
8e53465842
use epoll to manage console i/o in linux
this adds a `platform` interface for shim service to manage platform-specific
behaviors such as I/O (which uses epoll in linux to work around bugs with applications
that closes all consoles i.e. https://github.com/opencontainers/runc/pull/1434
and https://github.com/moby/moby/issues/27202)

Its expected that we only have 1 epollfd per containerd_shim to manage all processes.
Since all the work are done outside of the container runtime, upgrading of runc
is not required and should be done separately.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-07-30 10:50:39 +01:00
Michael Crosby
a0a5cc7787 Add user namespace support to client
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-27 11:06:20 -04:00
Ian Campbell
d42cb88ba2 Loop umount'ing rootfs until there are no more mounts
This is simpler than trying to count how many successful mounts we made.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-07-20 10:50:08 +01:00
Ian Campbell
d63d2ecf6c Simplify mount cleanup on failure by using defer
This avoids someone adding a new error path and forgetting to call the cleanup
function.

We prefer to use an explicit flag to gate the clean rather than relying on `err
!= nil` so we don't have to rely on people never accidentally shadowing the
`err` as seen by the closure.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-07-20 10:50:08 +01:00
Ian Campbell
300f083127 Cleanup mounts if we fail to mount one element of rootfs
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-07-20 10:50:08 +01:00
Ian Campbell
8b365117a2 containerd-shim: Do not remount root MS_SLAVE
Mounting as MS_SLAVE here breaks use cases which want to use
rootPropagation=shared in order to expose mounts to the host (and other
containers binding the same subtree), mounting as e.g. MS_SHARED is pointless
in this context so just remove.

Having done this we also need to arrange to manually clean up the mounts on
delete, so do so.

Note that runc will also setup root as required by rootPropagation, defaulting
to MS_PRIVATE.

Fixes #1132.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-07-20 10:50:08 +01:00
Stephen J Day
6d0bcd5aec
linux, linux/shim: remove error definitions
Since we now have a common set of error definitions, mapped to existing
error codes, we no longer need the specialized error codes used for
interaction with linux processes. The main issue was that string
matching was being used to map these to useful error codes. With this
change, we use errors defined in the `errdefs` package, which map
cleanly to GRPC error codes and are recoverable on either side of the
request.

The main focus of this PR was in removin these from the shim. We may
need follow ups to ensure error codes are preserved by the `Tasks`
service.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-18 15:56:49 -07:00
Stephen J Day
9e5bd5a2dc
namespaces, identifiers: split validation
After review, there are cases where having common requirements for
namespaces and identifiers creates contention between applications.  One
example is that it is nice to have namespaces comply with domain name
requirement, but that does not allow underscores, which are required for
certain identifiers.

The namespaces validation has been reverted to be in line with RFC 1035.
Existing identifiers has been modified to allow simply alpha-numeric
identifiers, while limiting adjacent separators.

We may follow up tweaks for the identifier charset but this split should
remove the hard decisions.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-12 14:46:47 -07:00
Michael Crosby
f93bfb6233 Add Exec IDs
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-06 15:23:08 -07:00
Michael Crosby
4b9a8ee13e Require *T for typeurl interaction
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-06 13:14:48 -07:00
Samuel Ortiz
b67398af15 linux: Make containerd less runc specific
We hope that containerd supports any OCI compliant runtime, and not only
runc.
This patch fixes all the error messages to not be completely runc
specific and change the initProcess structure to have its runtime
pointer be called 'runtime' and not 'runc'

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-07-03 17:45:23 +02:00
Michael Crosby
82d0208aaa Implement options for runtime specific settings
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-29 15:32:45 -07:00
Michael Crosby
7c8acca29a Move runtime interfaces to runtime package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-28 10:10:59 -07:00
Derek McGowan
a5fa3bb923 Merge pull request #1100 from crosbymichael/update-task
Implement task update
2017-06-27 14:39:45 -07:00
Michael Crosby
6ec84ef83c Add namespace to container metrics
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-27 11:54:14 -07:00
Michael Crosby
f36e0193a4 Implement task update
This allows tasks to have their resources updated as they are running.

Fixes #1067

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-26 16:38:49 -07:00
Michael Crosby
990536f2cc Move shim protos into linux pkg
This moves the shim's API and protos out of the containerd services
package and into the linux runtime package. This is because the shim is
an implementation detail of the linux runtime that we have and it is not
a containerd user facing api.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-23 16:21:47 -07:00
Michael Crosby
3b9d9dfa3e Fix error on doulbe Kill calls
This returns a typed error for calls to Kill when the process has
already finished.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-23 13:28:48 -07:00
Kenfe-Mickael Laventure
0917269a9a
shim: Use correct error when trying to expand runc error
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-06-23 11:31:16 -07:00
Stephen J Day
12a6beaeeb
*: update import paths to use versioned services
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-06-21 18:29:06 -07:00
Michael Crosby
94eafaab60 Update GRPC for consistency
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-21 13:34:24 -07:00
Kenfe-Mickael Laventure
5922cfaba8
linux: Bubble up runc error message
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-06-14 10:21:48 -07:00
Kenfe-Mickael Laventure
33598cc5d3
linux: Wrap error with contextual message
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-06-14 10:21:48 -07:00
Michael Crosby
ff598449d1 Add DeleteProcess API for removing execs
We need a separate API for handing the exit status and deletion of
Exec'd processes to make sure they are properly cleaned up within the
shim and daemon.

Fixes #973

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-12 09:32:23 -07:00
Michael Crosby
497db9ac06 Namespace tasks via runc --root
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-06 16:31:00 -07:00
Michael Crosby
a8c5542ba8 Add checkpoint and restore to client package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-06 09:58:33 -07:00
Michael Crosby
00734ab04a Return fifo paths from Shim
This allows attach of existing fifos to be done without any information
stored on the client side.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-01 14:12:02 -07:00
Michael Crosby
ee90a77f63 Rename Image to CheckpointPath in shim
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-05-26 10:06:53 -07:00
Michael Crosby
d7af92e00c Move Mount into mount pkg
This moves both the Mount type and mountinfo into a single mount
package.

This also opens up the root of the repo to hold the containerd client
implementation.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-05-22 16:41:12 -07:00
Derek McGowan
b07504c713 Merge pull request #862 from crosbymichael/checkpoint
Initial Support for Checkpoint && Restore
2017-05-22 15:51:10 -07:00
Michael Crosby
7cc1b64bd8 Add checkpoint and restore
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Update go-runc to 49b2a02ec1ed3e4ae52d30b54a291b75

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add shim to restore creation

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Keep checkpoint path in service

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Add C/R to non-shim build

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Checkpoint rw and image

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Pause container on bind checkpoints

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Return dump.log in error on checkpoint failure

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Pause container for checkpoint

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Update runc to 639454475cb9c8b861cc599f8bcd5c8c790ae402

For checkpoint into to work you need runc version
639454475cb9c8b861cc599f8bcd5c8c790ae402 + and criu 3.0 as this is what
I have been testing with.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Move restore behind create calls

This remove the restore RPCs in favor of providing the checkpoint
information to the `Create` calls of a container.  If provided, the
container will be created/restored from the checkpoint instead of an
existing container.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>

Regen protos after rebase

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-05-22 15:34:45 -07:00
Kenfe-Mickael Laventure
8ca92a2aa8 Ensure shim start & exit events are sent in right order
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-05-19 15:31:18 -07:00
Kenfe-Mickael Laventure
dd16c0583b Revert "Merge pull request #853 from AkihiroSuda/check-rootfs"
This reverts commit c1530b5b76, reversing
changes made to 3695ba77bb.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-05-16 14:48:13 -07:00
Kenfe-Mickaël Laventure
47718b0930 Merge pull request #861 from justincormack/go-runc-port
Portability fixes for containerd shim
2017-05-16 12:07:08 -07:00
Justin Cormack
6a571ecd40 Portability fixes for containerd shim
Update go-runc to master with portability fixes.

Subreaper only exists on Linux, and only Linux runs the shim in a
mount namespace.

With these changes the shim compiles on Darwin, which means the
whole build compiles without errors now.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-05-16 17:13:32 +01:00
Akihiro Suda
7a62734d82 linux: error out if no rootfs specified
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2017-05-14 14:03:26 +00:00