runtime/v2/runc: fix leaking socket path

When runC shimv2 starts, the StartShim interface will re-exec itself as
long-running process, which will read the `address` during initializing.

```happycase
Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

	write address into file

						initializing

							read address

	write back to containerd daemon

						serving

						...

						remove address in Shutdown call
```

However, there is no synchronization after reexec. Then the data race is
like:

```leaking-case
Process

containerd-shim-runc-v1/v2 start             containerd-shim-runc-v1/v2

	initializing socket

	reexec containerd-shim-runc-v1/v2

						initializing

							read address

	write address into file

	write back to containerd daemon

						serving

						...

						fail to remove address
						because of empty address
```

The `address` should be writen into file first before reexec.

And if shutdown the whole service before cleanup temporary
resource (like socket file), the Shutdown caller will receive `ttrpc: closed`
sometime, which depends on go runtime scheduler. Then it also causes leaking
socket files.

Since the shimV2-Delete binary API must be called to cleanup shim temporary
resource and shimV2-runC-v1 doesn't support grouping multi containers in one,
it is safe to remove the socket file in the binary call for shimV2-runC-v1.
But for the shimV2-runC-v2 shim, we still cleanup socket in Shutdown.
Hopefully we can find a way to cleanup socket in shimV2-Delete binary
call.

Fix: #5173

Signed-off-by: Wei Fu <fuweid89@gmail.com>
This commit is contained in:
Wei Fu
2021-03-15 12:30:06 +08:00
parent d8208e2e37
commit d895118c7c
3 changed files with 34 additions and 16 deletions

View File

@@ -239,6 +239,13 @@ func run(id string, initFunc Init, config Config) error {
return err
}
}
// NOTE: If the shim server is down(like oom killer), the address
// socket might be leaking.
if address, err := ReadAddress("address"); err == nil {
_ = RemoveSocket(address)
}
select {
case <-publisher.Done():
return nil
@@ -299,15 +306,11 @@ func serve(ctx context.Context, server *ttrpc.Server, path string) error {
return err
}
go func() {
defer l.Close()
if err := server.Serve(ctx, l); err != nil &&
!strings.Contains(err.Error(), "use of closed network connection") {
logrus.WithError(err).Fatal("containerd-shim: ttrpc server failure")
}
l.Close()
if address, err := ReadAddress("address"); err == nil {
_ = RemoveSocket(address)
}
}()
return nil
}