cri: handle sandbox/container exit event separately

The event monitor handles exit events one by one. If there is something wrong about deleting task, it will slow down the terminating Pods. In order to reduce the impact, the exit event watcher should handle exit event separately. If it failed, the watcher should put it into backoff queue and retry it. Signed-off-by: Wei Fu <fuweid89@gmail.com>
2020-10-31 17:39:10 +08:00
parent 643bb9b66d
commit e56de63099
6 changed files with 117 additions and 36 deletions
--- a/pkg/cri/server/container_start.go
+++ b/pkg/cri/server/container_start.go
@@ -148,10 +148,8 @@ func (c *criService) StartContainer(ctx context.Context, r *runtime.StartContain
 		return nil, errors.Wrapf(err, "failed to update container %q state", id)
 	}

-	// start the monitor after updating container state, this ensures that
-	// event monitor receives the TaskExit event and update container state
-	// after this.
-	c.eventMonitor.startExitMonitor(context.Background(), id, task.Pid(), exitCh)
+	// It handles the TaskExit event and update container state after this.
+	c.eventMonitor.startContainerExitMonitor(context.Background(), id, task.Pid(), exitCh)

 	return &runtime.StartContainerResponse{}, nil
 }