Request came from a slack message that shims do not output their versions making
it hard for users and operators to know what version of a shim they have on the
system. This adds a `-v` flag to the shims so that users can see if a shim is
in sync with containerd or what versions of shims that they are running.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
The Err() method should be called after the Scan() loop, not inside it.
Found by: git grep -A3 -F '.Scan()'
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Instead of having several dialer implementations, leave only one in
`pkg/dialer` and call it from `pkg/ttrpcutil`, `runtime/v(1|2)/shim`
which had their own
Closes#3471.
Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
When there is timeout or cancel for create container, killShim will fail
because of canceled context. The shim will be dangling and unmanageable.
Need to use new context to do cleanup.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
The background context aovids shim blocking when the ctx is cancelled
unexpectedly during shim start. But if the shim exits unexpectedly
before opening the pipe, the fd will never be closed.
`onCloseWithShimLog` makes sure that the shim log fd is closed properly
once the shim disconnects.
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
There's no OOM monitoring for the v2 cgroups yet, so it seems unlikely
that there was a leak in that case.
Signed-off-by: Seth Pellegrino <spellegrino@newrelic.com>
Only start watching the cgroup for OOMs when the first process starts
instead of on every process.
Signed-off-by: Seth Pellegrino <spellegrino@newrelic.com>
If the context is cancelled during `shim.Create()`, such as the client
disconnects unexpectedly. The created shim will never be deleted.
What's more, if the context is cancelled during `openShimLog()`, the
fifo will be closed and block the shim output.
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
Previously, the platform was closed as part of the Delete method when the
process was an init for a task and there were no more tasks after its deletion.
This can create problems if another task is created within the shim right after
the delete runs, which results in the platform being closed but the shim
continuing to run.
This change moves closing the platform to the Shutdown method after the shim's
context is canceled, which ensures the platform is only closed once the shim
is sure its done servicing containers.
Signed-off-by: Erik Sipsma <sipsma@amazon.com>
* only shim v2 runc v2 ("io.containerd.runc.v2") is supported
* only PID metrics is implemented. Others should be implemented in separate PRs.
* lots of code duplication in v1 metrics and v2 metrics. Dedupe should be separate PR.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Reized the I/O buffers to align with the size of the kernel buffers with fifos
and move the close aspect of the console to key off of the stdin closing.
Fixes#3738
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
- Our out of tree shim would like to publish events with ttrpc. These
functions should be exposed so our shim doesn't need to reimplement
publisher logic.
Signed-off-by: Kathryn Baldauf <kabaldau@microsoft.com>
The cgroup dependency brings in quite a lot only for WithNamespaceCgroupDeletion,
which is a namespaces.DeleteOpt.
Signed-off-by: Tibor Vass <tibor@docker.com>
Because of the way go handles flags, passing a flag that is not defined
will cause an error. In our case, if we kept this as a flag, then
third-party shims would break when they see this new flag. To fix this,
I moved this new configuration option to an env var. We should use env
vars from here on out to avoid breaking shim compat.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Previously the TTRPC address was generated as "<GRPC address>.ttrpc".
This change now allows explicit configuration of the TTRPC address, with
the default still being the old format if no value is specified.
As part of this change, a new configuration section is added for TTRPC
listener options.
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
When containerd-shim does reaper, the most processes are not init
process. Since json.Decode consumes more CPU resource, we should check
killall option for init process only.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
When using a multi-container shim, the fifo of the 2nd to Nth container
will not be opened when the ctx is done. This will cause an
`ErrReadClosed` that can be ignored.
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
This adds a singleton `timeout` package that will allow services and user
to configure timeouts in the daemon. When a service wants to use a
timeout, it should declare a const and register it's default value
inside an `init()` function for that package. When the default config
is generated, we can use the `timeout` package to provide the available
timeout keys so that a user knows that they can configure.
These show up in the config as follows:
```toml
[timeouts]
"io.containerd.timeout.shim.cleanup" = 5
"io.containerd.timeout.shim.load" = 5
"io.containerd.timeout.shim.shutdown" = 3
"io.containerd.timeout.task.state" = 2
```
Timeouts in the config are specified in seconds.
Timeouts are very hard to get right and giving this power to the user to
configure things is a huge improvement. Machines can be faster and
slower and depending on the CPU or load of the machine, a timeout may
need to be adjusted.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>