Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)
Make CPU manager release CPUs when Pod enters completed phase.
**What this PR does / why we need it**: When CPU manager is enabled, this PR releases allocated CPUs when container is not running and is non-restartable.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52351
**Special notes for your reviewer**:
This bug is only reproduced for pods with `restartPolicy` = `Never` or `OnFailure`. The following output is from a 4 CPU node. This bug can be reproduced as long >= half the cores are requested.
pod1.yaml:
```
apiVersion: v1
kind: Pod
metadata:
name: test-pod1
spec:
containers:
- image: ubuntu
command: ["/bin/bash"]
args: ["-c", "sleep 5"]
name: test-container1
resources:
requests:
cpu: 2
memory: 100Mi
limits:
cpu: 2
memory: 100Mi
restartPolicy: "Never"
```
pod2.yaml:
```
apiVersion: v1
kind: Pod
metadata:
name: test-pod2
spec:
containers:
- image: ubuntu
command: ["/bin/bash"]
args: ["-c", "sleep 5"]
name: test-container1
resources:
requests:
cpu: 2
memory: 100Mi
limits:
cpu: 2
memory: 100Mi
restartPolicy: "Never"
```
Run a local Kubernetes cluster with CPU manager enabled.
```sh
KUBELET_FLAGS='--feature-gates=CPUManager=true --cpu-manager-policy=static --cpu-manager-reconcile-period=1s --kube-reserved=cpu=500m' ./hack/local-up-cluster.sh
```
_Before:_
Create `test-pod1` using pod1.yaml.
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick-in).
Create `test-pod2` using pod2.yaml.
```
./cluster/kubectl.sh create -f pod2.yaml
```
Get all pods in the cluster.
```
./cluster/kubectl.sh get pods -a
NAME READY STATUS RESTARTS AGE
test-pod1 0/1 Completed 0 1m
test-pod2 0/1 not enough cpus available to satisfy request 0 9s
```
_After:_
Create `test-pod1` using pod1.yaml.
```
./cluster/kubectl.sh create -f pod1.yaml
```
Wait for the pod to complete and wait another 90 seconds (give enough time for GC to kick-in).
Create `test-pod2` using pod2.yaml.
```
./cluster/kubectl.sh create -f pod2.yaml
```
Get all pods in the cluster.
```
./cluster/kubectl.sh get pods -a
NAME READY STATUS RESTARTS AGE
test-pod1 0/1 Completed 0 1m
test-pod2 0/1 Completed 0 9s
```
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)
Wait for container cleanup before deletion
We should wait to delete pod API objects until the pod's containers have been cleaned up. See issue: #50268 for background.
This changes the kubelet container gc, which deletes containers belonging to pods considered "deleted".
It adds two conditions under which a pod is considered "deleted", allowing containers to be deleted:
Pods where deletionTimestamp is set, and containers are not running
Pods that are evicted
This PR also changes the function PodResourcesAreReclaimed by making it return false if containers still exist.
The eviction manager will wait for containers of previous evicted pod to be deleted before evicting another pod.
The status manager will wait for containers to be deleted before removing the pod API object.
/assign @vishh
Automatic merge from submit-queue
When faild create pod sandbox record event.
I created pods because of the failure to create a sandbox, but there was no clear message telling me what was the failure, so I wanted to record an event when the sandbox was created.
**Release note**:
```release-note
NONE
```
Whenever pod sandbox needs to be recreated, all containers associated
with it will be killed by kubelet. This change ensures that the init
containers will be rerun in such cases.
The change also refactors the compute logic so that the control flow of
init containers act is more aligned with the regular containers. Unit
tests are added to verify the logic.
Automatic merge from submit-queue (batch tested with PRs 49444, 47864, 48584, 49395, 49118)
Move event type
Change SandboxChanged to a constant and move to the event package below.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45860, 45119, 44525, 45625, 44403)
Make a log line more clear in kuberuntime_manager.go.
Make a log in `podSandboxChanged` more clear.
@yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 43653, 43654, 43652)
CRI: Check nil pointer to avoid kubelet panic.
When working on the containerd kubernetes integration, I casually returns an empty `sandboxStatus.Linux{}`, but it cause kubelet to panic.
This won't happen when runtime returns valid data, but we should not make the assumption here.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 40392, 39242, 40579, 40628, 40713)
optimize podSandboxChanged() function and fix some function notes
Automatic merge from submit-queue (batch tested with PRs 40168, 40165, 39158, 39966, 40190)
CRI: upgrade protobuf to v3
For #38854, this PR upgrades CRI protobuf version to v3, and also updated related packages for confirming to new api.
**Release note**:
```
CRI: upgrade protobuf version to v3.
```