integration: add ShouldRetryShutdown case based on #7496

Since moby/moby can't handle duplicate exit events well, it's hard for
containerd to retry shutdown when there is an error, such as a canceled
context.

To prevent a regression like #4769, I added a skipped integration case
as a TODO item. We should rethink how to handle the task/shim
lifecycle.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
Wei Fu
2023-08-11 08:06:15 +00:00
parent 8dcb2a6e6d
commit 601699a184
2 changed files with 76 additions and 1 deletions

@@ -458,6 +458,12 @@ func (s *shimTask) delete(ctx context.Context, sandboxed bool, removeTask func(c
 	// If not, the shim has been delivered the exit and delete events.
 	// So we should remove the record and prevent duplicate events from
 	// ttrpc-callback-on-close.
+	//
+	// TODO: It's hard to guarantee that the event is unique and sent only
+	// once. The moby/moby should not rely on that assumption that there is
+	// only one exit event. The moby/moby should handle the duplicate events.
+	//
+	// REF: https://github.com/containerd/containerd/issues/4769
 	if shimErr == nil {
 		removeTask(ctx, s.ID())
 	}
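The TODO above asks the event consumer to tolerate duplicate exit events. A minimal sketch of what that could look like on the consumer side follows. The names `exitKey`, `exitDeduper`, `First`, and `Forget` are hypothetical and are not taken from moby/moby or containerd; the sketch only assumes that an exit can be identified by a container ID plus an exec ID.

```go
package main

import (
	"fmt"
	"sync"
)

// exitKey identifies one exit. Hypothetical type: it assumes a container ID
// plus an exec ID is enough to tell exits apart.
type exitKey struct {
	ContainerID string
	ExecID      string
}

// exitDeduper remembers which exits were already handled so that a second
// delivery of the same exit can be ignored. Hypothetical helper.
type exitDeduper struct {
	mu   sync.Mutex
	seen map[exitKey]struct{}
}

func newExitDeduper() *exitDeduper {
	return &exitDeduper{seen: make(map[exitKey]struct{})}
}

// First reports whether k is being observed for the first time and records it.
func (d *exitDeduper) First(k exitKey) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.seen[k]; ok {
		return false
	}
	d.seen[k] = struct{}{}
	return true
}

// Forget drops k, for example once the task is deleted, so that a later
// reuse of the same IDs is not mistaken for a duplicate.
func (d *exitDeduper) Forget(k exitKey) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.seen, k)
}

func main() {
	d := newExitDeduper()
	k := exitKey{ContainerID: "c1"}
	// Deliver the same exit twice; only the first delivery is handled.
	for i := 0; i < 2; i++ {
		if d.First(k) {
			fmt.Println("handling exit for", k.ContainerID)
		} else {
			fmt.Println("ignoring duplicate exit for", k.ContainerID)
		}
	}
}
```

With a guard like this in the consumer, the producer could retry shutdown and risk re-sending an exit event without corrupting consumer state.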
@@ -466,7 +472,11 @@ func (s *shimTask) delete(ctx context.Context, sandboxed bool, removeTask func(c
 	// Let controller decide when to shutdown.
 	if !sandboxed {
 		if err := s.waitShutdown(ctx); err != nil {
-			log.G(ctx).WithField("id", s.ID()).WithError(err).Error("failed to shutdown shim task")
+			// FIXME(fuweid):
+			//
+			// If the error is context canceled, should we use context.TODO()
+			// to wait for it?
+			log.G(ctx).WithField("id", s.ID()).WithError(err).Error("failed to shutdown shim task and the shim might be leaked")
 		}
 	}