Could issues where when exec processes fail the wait block is not
released.
Second, you could not dump stacks if the reaper loop locks up.
Third, the publisher was not waiting on the correct pid.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
By replacing grpc with ttrpc, we can reduce total memory runtime
requirements and binary size. With minimal code changes, the shim can
now be controlled by the much lightweight protocol, reducing the total
memory required per container.
When reviewing this change, take particular notice of the generated shim
code.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The binary name used for executing "containerd publish" was hard-coded
in the shim code, and hence it did not work with customized daemon
binary name. (e.g. `docker-containerd`)
This commit allows specifying custom daemon binary via `containerd-shim
-containerd-binary ...`.
The daemon invokes this command with `os.Executable()` path.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Because of a side-effect import, we have the possibility of pulling in
several unnecessary packages that are used by the plugin and not at
runtime to implement protobuf structures. Setting these imports to
`weak` prevents this from happening, reducing the total import set,
reducing memory usage and binary size.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
To avoid importing all of grpc when consuming events, the types of
events have been split in to a separate package. This should allow a
reduction in memory usage in cases where a package is consuming events
but not using the gprc service directly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
This is needed for users on kernel older than 3.18 so they can avoid EBUSY
errors when trying to unlink, rename or remove a mountpoint that is present in
a shim namespace.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
ref: #1464
This tries to solve issues with races around process state. First it
adds the process mutex around the state call so that any state changes,
deletions, etc will be handled in order.
Second, for IsNoExist errors from the runtime, return a stopped state if
a process has been removed from the underlying OCI runtime but not from
the shim yet. This shouldn't happen with the lock from above but its
hare to verify this issue.
Third, handle shim disconnections and return an ErrNotFound.
Forth, don't abort returning all tasks if one task is unable to return
its state.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This ensure that when using the host pid, we don't let process alive,
preventing Wait() to return until they all die.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
This also fix the type used for RuncOptions.SystemCgroup, hence introducing
an API break.
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
Depends on https://github.com/containerd/go-runc/pull/24
The is currently a race with the reaper where you could miss some exit
events from processes.
The problem before and why the reaper was so complex was because
processes could fork, getting a pid, and then fail on an execve before
we would have time to register the process with the reaper. This could
cause pids to fill up in a map as a way to reduce the race.
This changes makes the reaper handle multiple subscribers so that the
caller can handle locking, for when they want to wait for a specific
pid, without affecting other callers using the reaper code.
Exit events are broadcast to multiple subscribers, in the case, the runc
commands and container pids that we get from a pid-file. Locking while
the entire container stats no longs affects runc commands where you want
to call `runc create` and wait until that has been completed.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Use the state pattern to handle process transitions from one state to
another and what actions can be performed on a process in a specific
state.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This reverts commit 06dc87ae59.
Revert "Change oom metric to const"
This reverts commit e800f08f9f.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
This removes the metric vec that was holding onto all task id and
namespace combinations forever, until containerd was restarted. This
was causing a memory leak with many task.
This also removes the shim cmd where the `Args` is quite large from the
reaper after the shim has been started cutting down on another leak.
This is the first pass through the reaper but more code is required to
fix all the issues when commands are added.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>