Commit Graph

70 Commits

Author SHA1 Message Date
Michael Crosby
7030a4add0 Close epoller on task stop
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-20 11:29:13 -04:00
Kenfe-Mickael Laventure
92772bd471
linux: Ensure all init children are dead when it exits
This ensure that when using the host pid, we don't let process alive,
preventing Wait() to return until they all die.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-09-01 14:50:56 -07:00
Kenfe-Mickael Laventure
ab0cb4e756
linux: Honor RuncOptions if set on container
This also fix the type used for RuncOptions.SystemCgroup, hence introducing
an API break.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-31 14:35:05 -07:00
Michael Crosby
6b4c4a2937 Update reaper for multipe subscribers
Depends on https://github.com/containerd/go-runc/pull/24

The is currently a race with the reaper where you could miss some exit
events from processes.

The problem before and why the reaper was so complex was because
processes could fork, getting a pid, and then fail on an execve before
we would have time to register the process with the reaper.  This could
cause pids to fill up in a map as a way to reduce the race.

This changes makes the reaper handle multiple subscribers so that the
caller can handle locking, for when they want to wait for a specific
pid, without affecting other callers using the reaper code.

Exit events are broadcast to multiple subscribers, in the case, the runc
commands and container pids that we get from a pid-file.  Locking while
the entire container stats no longs affects runc commands where you want
to call `runc create` and wait until that has been completed.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-31 14:29:47 -04:00
Michael Crosby
c3711c3866 Merge pull request #1319 from mlaventure/handle-sigkilled-shim
Handle sigkilled shim
2017-08-29 14:06:17 -04:00
Kenfe-Mickael Laventure
1c92c0ecbf
Fix panic in CloseIO when not Stdin was allocated for a process
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-29 09:58:48 -07:00
Kenfe-Mickael Laventure
3f34c421d3
Add missing "/tasks/exec-started" event topic
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-29 08:27:44 -07:00
Kenfe-Mickael Laventure
c23f29ebce
containerd-shim: Don't try to delete container twice
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-29 08:27:44 -07:00
Kenfe-Mickael Laventure
eb4abac9f7
linux: Prevent deadlock in reaper.WaitPid()
A deadlock can occurs if `WaitPid()` is called twice before the process
dies.

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-29 08:27:44 -07:00
Kenfe-Mickael Laventure
8a1b03e525
Add ExitedAt to process proto definition
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-21 08:18:02 -07:00
Michael Crosby
d513dd2bfd Fix race with task checkpoint
Because runc will delete a container after a successful checkpoint we
need to handle a NotFound error from runc on delete.

There is also a race between SIGKILL'ing the shim and it actually
exiting to unmount the tasks rootfs, we need to loop and wait for the
task to actually be reaped before trying to delete the rootfs+bundle.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-10 16:35:03 -04:00
Phil Estes
e9b86af848 Merge pull request #1283 from mlaventure/resurrect-state-dir
Resurrect State directory
2017-08-07 10:41:21 -04:00
Kenfe-Mickael Laventure
8700e23a10
Use root dir when storing temporary checkpoint data
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-03 14:38:18 -07:00
Michael Crosby
9f13b414b9 Return exit status from Wait of stopped process
This changes Wait() from returning an error whenever you call wait on a
stopped process/task to returning the exit status from the process.

This also adds the exit status to the Status() call on a process/task so
that a user can Wait(), check status, then cancel the wait to avoid
races in event handling.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-03 17:22:33 -04:00
Michael Crosby
504033e373 Add Get of task and process state
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 13:50:08 -04:00
Michael Crosby
9f08965699 Change Exited/Status to SetExited/ExitStatus
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 13:50:08 -04:00
Michael Crosby
63878d14ea Add create/start to exec processes in shim
This splits up the create and start of an exec process in the shim to
have two separate steps like the initial process.  This will allow
better state reporting for individual process along with a more robust
wait for execs.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-02 13:50:08 -04:00
Stephen J Day
7ed88c1e36
linux/shim: use events.Publisher interface
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-31 14:23:51 -07:00
Stephen Day
92d737f4ae Merge pull request #1259 from dqminh/epoll-io
Use Epoll to perform I/O in linux
2017-07-31 13:47:41 -07:00
Stephen J Day
af2d7f0e55
events: initial support for filters
This change further plumbs the components required for implementing
event filters. Specifically, we now have the ability to filter on the
`topic` and `namespace`.

In the course of implementing this functionality, it was found that
there were mismatches in the events API that created extra serialization
round trips. A modification to `typeurl.MarshalAny` and a clear
separation between publishing and forwarding allow us to avoid these
serialization issues.

Unfortunately, this has required a few tweaks to the GRPC API, so this
is a breaking change. `Publish` and `Forward` have been clearly separated in
the GRPC API. `Publish` honors the contextual namespace and performs
timestamping while `Forward` simply validates and forwards. The behavior
of `Subscribe` is to propagate events for all namespaces unless
specifically filtered (and hence the relation to this particular change.

The following is an example of using filters to monitor the task events
generated while running the [bucketbench tool](https://github.com/estesp/bucketbench):

```
$ ctr events 'topic~=/tasks/.+,namespace==bb'
...
2017-07-28 22:19:51.78944874 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-6-8","pid":25889}
2017-07-28 22:19:51.791893688 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-4-8","pid":25882}
2017-07-28 22:19:51.792608389 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-2-9","pid":25860}
2017-07-28 22:19:51.793035217 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-5-6","pid":25869}
2017-07-28 22:19:51.802659622 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-0-7","pid":25877}
2017-07-28 22:19:51.805192898 +0000 UTC   bb        /tasks/start   {"container_id":"bb-ctr-3-6","pid":25856}
2017-07-28 22:19:51.832374931 +0000 UTC   bb        /tasks/exit   {"container_id":"bb-ctr-8-6","id":"bb-ctr-8-6","pid":25864,"exited_at":"2017-07-28T22:19:51.832013043Z"}
2017-07-28 22:19:51.84001249 +0000 UTC   bb        /tasks/exit   {"container_id":"bb-ctr-2-9","id":"bb-ctr-2-9","pid":25860,"exited_at":"2017-07-28T22:19:51.839717714Z"}
2017-07-28 22:19:51.840272635 +0000 UTC   bb        /tasks/exit   {"container_id":"bb-ctr-7-6","id":"bb-ctr-7-6","pid":25855,"exited_at":"2017-07-28T22:19:51.839796335Z"}
...
```

In addition to the events changes, we now display the namespace origin
of the event in the cli tool.

This will be followed by a PR to add individual field filtering for the
events API for each event type.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-31 12:53:18 -07:00
Michael Crosby
7b6ff6ec89 event forwarding without shim
Fixes #1138

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-31 10:05:24 -04:00
Daniel Dao
8e53465842
use epoll to manage console i/o in linux
this adds a `platform` interface for shim service to manage platform-specific
behaviors such as I/O (which uses epoll in linux to work around bugs with applications
that closes all consoles i.e. https://github.com/opencontainers/runc/pull/1434
and https://github.com/moby/moby/issues/27202)

Its expected that we only have 1 epollfd per containerd_shim to manage all processes.
Since all the work are done outside of the container runtime, upgrading of runc
is not required and should be done separately.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2017-07-30 10:50:39 +01:00
Stephen J Day
a615a6fe5d
events: refactor event distribution
In the course of setting out to add filters and address some cleanup, it
was found that we had a few problems in the events subsystem that needed
addressing before moving forward.

The biggest change was to move to the more standard terminology of
publish and subscribe. We make this terminology change across the Go
interface and the GRPC API, making the behavior more familier. The
previous system was very context-oriented, which is no longer required.

With this, we've removed a large amount of dead and unneeded code. Event
transactions, context storage and the concept of `Poster` is gone. This
has been replaced in most places with a `Publisher`, which matches the
actual usage throughout the codebase, removing the need for helpers.

There are still some questions around the way events are handled in the
shim. Right now, we've preserved some of the existing bugs which may
require more extensive changes to resolve correctly.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-25 15:08:09 -07:00
Derek McGowan
a8504277cc Merge pull request #1209 from stevvooe/remove-errors
linux, linux/shim: remove error definitions
2017-07-18 19:18:23 -07:00
Stephen J Day
1ecb2ea30d
linux/shim: remove redundant topic prefix
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-18 18:19:25 -07:00
Stephen J Day
6d0bcd5aec
linux, linux/shim: remove error definitions
Since we now have a common set of error definitions, mapped to existing
error codes, we no longer need the specialized error codes used for
interaction with linux processes. The main issue was that string
matching was being used to map these to useful error codes. With this
change, we use errors defined in the `errdefs` package, which map
cleanly to GRPC error codes and are recoverable on either side of the
request.

The main focus of this PR was in removin these from the shim. We may
need follow ups to ensure error codes are preserved by the `Tasks`
service.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-18 15:56:49 -07:00
Kenfe-Mickael Laventure
e4beb7c554
Use constants for runtime event topics
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-07-18 14:19:48 +02:00
Kenfe-Mickael Laventure
a578730a94
Update linux events topic
This also remove the duplicate events for Task{Create,Start,Delete}

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-07-18 13:47:28 +02:00
Stephen J Day
9e5bd5a2dc
namespaces, identifiers: split validation
After review, there are cases where having common requirements for
namespaces and identifiers creates contention between applications.  One
example is that it is nice to have namespaces comply with domain name
requirement, but that does not allow underscores, which are required for
certain identifiers.

The namespaces validation has been reverted to be in line with RFC 1035.
Existing identifiers has been modified to allow simply alpha-numeric
identifiers, while limiting adjacent separators.

We may follow up tweaks for the identifier charset but this split should
remove the hard decisions.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-07-12 14:46:47 -07:00
Michael Crosby
2b6d790ff4 Refactor runtime events into Task* types
This removes the RuntimeEvent super proto with enums into separate
runtime event protos to be inline with the other events that are output
by containerd.

This also renames the runtime events into Task* events.

Fixes #1071

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-12 10:57:57 -07:00
Michael Crosby
58da62dd0f Add runtime events for pause,resume,checkpoint
Fixes #1068

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-11 12:38:20 -07:00
Michael Crosby
6578565216 Use event service post for shim events
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-07 16:30:57 -07:00
Michael Crosby
f93bfb6233 Add Exec IDs
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-06 15:23:08 -07:00
Samuel Ortiz
b67398af15 linux: Make containerd less runc specific
We hope that containerd supports any OCI compliant runtime, and not only
runc.
This patch fixes all the error messages to not be completely runc
specific and change the initProcess structure to have its runtime
pointer be called 'runtime' and not 'runc'

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-07-03 17:45:23 +02:00
Michael Crosby
e2d5522435 Change ListProcesses to ListPids
These rpcs only return pids []uint32 so should be named that way in
order to have other rpcs that list Processes such as Exec'd processes.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-28 16:10:41 -07:00
Michael Crosby
040558cf81 Remove runtime.Event types
This uses the events service types for runtime events

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-28 10:47:22 -07:00
Michael Crosby
f36e0193a4 Implement task update
This allows tasks to have their resources updated as they are running.

Fixes #1067

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-26 16:38:49 -07:00
Kenfe-Mickael Laventure
b7f37e778c
containerd-shim: Do not crash when receiving RPC before a Create() is issued
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-06-23 17:51:57 -07:00
Michael Crosby
990536f2cc Move shim protos into linux pkg
This moves the shim's API and protos out of the containerd services
package and into the linux runtime package. This is because the shim is
an implementation detail of the linux runtime that we have and it is not
a containerd user facing api.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-23 16:21:47 -07:00
Michael Crosby
3b9d9dfa3e Fix error on doulbe Kill calls
This returns a typed error for calls to Kill when the process has
already finished.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-23 13:28:48 -07:00
Stephen J Day
12a6beaeeb
*: update import paths to use versioned services
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-06-21 18:29:06 -07:00
Michael Crosby
8b2cf6e8e6 Fix Wait() on process/tasks
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-21 13:48:24 -07:00
Michael Crosby
ff598449d1 Add DeleteProcess API for removing execs
We need a separate API for handing the exit status and deletion of
Exec'd processes to make sure they are properly cleaned up within the
shim and daemon.

Fixes #973

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-12 09:32:23 -07:00
Michael Crosby
497db9ac06 Namespace tasks via runc --root
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-06 16:31:00 -07:00
Michael Crosby
00734ab04a Return fifo paths from Shim
This allows attach of existing fifos to be done without any information
stored on the client side.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-01 14:12:02 -07:00
Michael Crosby
6ff220a116 Merge pull request #939 from ijc25/reconnect-shim-event-stream
Reconnect to shim event stream after containerd restart
2017-06-01 09:52:13 -07:00
Michael Crosby
ebf935d990 Add exec support to client
This also fixes a deadlock in the shim's reaper where execs would lockup
and/or miss a quick exiting exec process's exit status.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-05-31 11:50:23 -07:00
Ian Campbell
a5d246404c Reconnect to shim event stream after containerd restart
There are three aspects which need to be covered:

 - the runtime needs to restart its event pump when it reconnects (in
   loadContainer).
 - on the server side shim needs to monitor the stream context so it knows when
   the connection goes away.
 - if the shim's stream.Send() fails (because the stream died between taking
   the event off the channel and calling stream.Send()) then to avoid losing
   that event the shim should remember it and send it out first on the next
   stream.

The shim's event production machinery only handles producing a single event
stream, so add an interlock to ensure there is only one reader of the
`s.events` channel at a time. Subsequent attempts to use Events will block
until the existing owner is done.

Fixes #921.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-05-31 13:48:44 +01:00
Yanqiang Miao
e4e80fb7b7 delete the redundant
Signed-off-by: Yanqiang Miao <miao.yanqiang@zte.com.cn>
2017-05-26 16:55:49 +08:00
Stephen J Day
539742881d
api/services: define the container metadata service
Working from feedback on the existing implementation, we have now
introduced a central metadata object to represent the lifecycle and pin
the resources required to implement what people today know as
containers. This includes the runtime specification and the root
filesystem snapshots. We also allow arbitrary labeling of the container.
Such provisions will bring the containerd definition of container closer
to what is expected by users.

The objects that encompass today's ContainerService, centered around the
runtime, will be known as tasks. These tasks take on the existing
lifecycle behavior of containerd's containers, which means that they are
deleted when they exit. Largely, there are no other changes except for
naming.

The `Container` object will operate purely as a metadata object. No
runtime state will be held on `Container`. It only informs the execution
service on what is required for creating tasks and the resources in use
by that container. The resources referenced by that container will be
deleted when the container is deleted, if not in use. In this sense,
users can create, list, label and delete containers in a similar way as
they do with docker today, without the complexity of runtime locks that
plagues current implementations.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-05-22 23:27:53 -07:00