https://github.com/containerd/containerd/pull/8143 added an alias for
logrus.Fields and moved over most usages to this alias, but there was
one straggler.
Signed-off-by: Danny Canter <danny@dcantah.dev>
We pass in a callback using the ttrpc.WithOnClose functionality
for shims that use ttrpc, but with the newly added ability to use
GRPC for shims this was left as a follow-up. It doesn't seem like
grpc-go has anything similar so some options (that I could see) are:
This change introduces a new grpcConn wrapper type for the connection
that exposes a method to get notified when the users callback has run,
the same in functionality as TTRPC's `UserOnCloseWait`. The callback
gets passed in in a new `grpcDialContext` function that will:
1. Dial the connection as normal
2. Spin off a goroutine that will monitor the connections state
until it transitions to idle or shutdown and will then run the
callback.
Signed-off-by: Danny Canter <danny@dcantah.dev>
Recent work added the ability to use grpc for shims, it'd be nice to
have a debug (or info perhaps) log to show what protocol and addr the
shim sent over.
Signed-off-by: Danny Canter <danny@dcantah.dev>
- Add Target to mount.Mount.
- Add UnmountMounts to unmount a list of mounts in reverse order.
- Add UnmountRecursive to unmount deepest mount first for a given target, using
moby/sys/mountinfo.
Signed-off-by: Edgar Lee <edgarhinshunlee@gmail.com>
This commit upgrades github.com/containerd/typeurl to use typeurl.Any.
The interface hides gogo/protobuf/types.Any from containerd's Go client.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
This commit removes gogoproto.enumvalue_customname,
gogoproto.goproto_enum_prefix and gogoproto.enum_customname.
All of them make proto-generated Go code more idiomatic, but we already
don't use these enums in our external-surfacing types and they are anyway
not supported by Google's official toolchain (see #6564).
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
After #4906, containerd opens fifo in read/write mode in linux platform
The original comment doesn't correct and is removed by #5174.
```
// original comment
// When using a multi-container shim, the fifo of the 2nd to Nth
// container will not be opened when the ctx is done. This will
// cause an ErrReadClosed that can be ignored.
```
However, we should add comment for checkCopyShimLogError to mention why
we call checkCopyShimLogError. The checkCopyShimLogError, it is to prevent
the flood of expected error messages after task die and the expected
errors depend on platform.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
If the shim has been killed and ttrpc connection has been
closed, the shimErr will not be nil. For this case, the event
subscriber, like moby/moby, might have received the exit or delete
events. Just in case, we should allow ttrpc-callback-on-close to
send the exit and delete events again. And the exit status will
depend on result of shimV2.Delete.
If not, the shim has been delivered the exit and delete events.
So we should remove the task record and prevent duplicate events from
ttrpc-callback-on-close.
Fix: #4769
Signed-off-by: Wei Fu <fuweid89@gmail.com>
The shim delete action needs bundle information to cleanup resources
created by shim. If the cleanup dead shim is called after delete bundle,
the part of resources maybe leaky.
The ttrpc client UserOnCloseWait() can make sure that resources are
cleanup before delete bundle, which synchronizes task deletion and
cleanup deadshim. It might slow down the task deletion, but it can make
sure that resources can be cleanup and avoid EBUSY umount case. For
example, the sandbox container like Kata/Firecracker might have mount
points over the rootfs. If containerd handles task deletion and cleanup
deadshim parallelly, the task deletion will meet EBUSY during umount and
fail to cleanup bundle, which makes case worse.
And also update cleanupAfterDeadshim, which makes sure that
cleanupAfterDeadshim must be called after shim disconnected. In some
case, shim fails to call runc-create for some reason, but the runc-create
already makes runc-init into ready state. If containerd doesn't call shim
deletion, the runc-init process will be leaky and hold the cgroup, which
makes pod terminating :(.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Dependencies may be switching to use the new `%w` formatting
option to wrap errors; switching to use `errors.Is()` makes
sure that we are still able to unwrap the error and detect the
underlying cause.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
For some reason, shimv2 process doesn't exist. The ttrpc doesn't detect
the connection closed by server until delete task. For this case, we
should ignore the ttrpc.ErrClosed and let task manager handle the
cleanup.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
If the context is cancelled during `shim.Create()`, such as the client
disconnects unexpectedly. The created shim will never be deleted.
What's more, if the context is cancelled during `openShimLog()`, the
fifo will be closed and block the shim output.
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
This adds a singleton `timeout` package that will allow services and user
to configure timeouts in the daemon. When a service wants to use a
timeout, it should declare a const and register it's default value
inside an `init()` function for that package. When the default config
is generated, we can use the `timeout` package to provide the available
timeout keys so that a user knows that they can configure.
These show up in the config as follows:
```toml
[timeouts]
"io.containerd.timeout.shim.cleanup" = 5
"io.containerd.timeout.shim.load" = 5
"io.containerd.timeout.shim.shutdown" = 3
"io.containerd.timeout.task.state" = 2
```
Timeouts in the config are specified in seconds.
Timeouts are very hard to get right and giving this power to the user to
configure things is a huge improvement. Machines can be faster and
slower and depending on the CPU or load of the machine, a timeout may
need to be adjusted.
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>