Files
kubernetes/pkg
Kubernetes Submit Queue 4796c7b409 Merge pull request #40727 from Random-Liu/handle-cri-in-place-upgrade
Automatic merge from submit-queue

CRI: Handle cri in-place upgrade

Fixes https://github.com/kubernetes/kubernetes/issues/40051.

## How does this PR restart/remove legacy containers/sandboxes?
With this PR, dockershim will convert and return legacy containers and infra containers as regular containers/sandboxes. Then we can rely on the SyncPod logic to stop the legacy containers/sandboxes, and the garbage collector to remove the legacy containers/sandboxes.

To forcibly trigger restart:
* For infra containers, we manually set `hostNetwork` to opposite value to trigger a restart (See [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L389))
* For application containers, they will be restarted with the infra container.
## How does this PR avoid extra overhead when there is no legacy container/sandbox?
For the lack of some labels, listing legacy containers needs extra `docker ps`. We should not introduce constant performance regression for legacy container cleanup. So we added the `legacyCleanupFlag`:
* In `ListContainers` and `ListPodSandbox`, only do extra `ListLegacyContainers` and `ListLegacyPodSandbox` when `legacyCleanupFlag` is `NotDone`.
* When dockershim starts, it will check whether there are legacy containers/sandboxes.
  * If there are none, it will mark `legacyCleanupFlag` as `Done`.
  * If there are any, it will leave `legacyCleanupFlag` as `NotDone`, and start a goroutine periodically check whether legacy cleanup is done.
This makes sure that there is overhead only when there are legacy containers/sandboxes not cleaned up yet.

## Caveats
* In-place upgrade will cause kubelet to restart all running containers.
* RestartNever container will not be restarted.
* Garbage collector sometimes keep the legacy containers for a long time if there aren't too many containers on the node. In that case, dockershim will keep performing extra `docker ps` which introduces overhead.
  * Manually remove all legacy containers will fix this.
  * Should we garbage collect legacy containers/sandboxes in dockershim by ourselves? /cc @yujuhong 
* Host port will not be reclaimed for the lack of checkpoint for legacy sandboxes. https://github.com/kubernetes/kubernetes/pull/39903 /cc @freehan 

/cc @yujuhong @feiskyer @dchen1107 @kubernetes/sig-node-api-reviews 
**Release note**:

```release-note
We should mention the caveats of in-place upgrade in release note.
```
2017-02-03 22:17:56 -08:00
..
2017-01-29 21:41:45 +01:00
2017-02-02 15:26:10 +01:00
2017-01-19 09:50:16 -05:00
2017-02-03 08:15:46 +01:00
2017-01-31 19:14:13 -05:00
2017-02-03 08:15:46 +01:00
2017-01-11 09:09:48 -05:00
2017-01-31 19:14:13 -05:00
2017-01-29 21:41:45 +01:00
2017-02-03 08:15:46 +01:00
2017-01-24 20:56:03 +01:00
2017-02-03 08:15:46 +01:00