Automatic merge from submit-queue (batch tested with PRs 55114, 52976, 54871, 55122, 55140). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Make CRI logs parsing to a library
**What this PR does / why we need it**:
Make CRI logs parsing to a library.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55136
**Special notes for your reviewer**:
**Release note**:
```release-note
Add CRI log parsing library at pkg/kubelet/apis/cri/logs
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up redundant DNS related codes
**What this PR does / why we need it**:
As https://github.com/kubernetes/kubernetes/pull/54773#discussion_r148904955 described, resolv.conf setup for pod is handled by `generatePodSandboxConfig()`, though we have some redundant DNS related codes in `GenerateRunContainerOptions()` which seems to have no effect.
This PR cleans up the ineffective codes and rearranges the cluster DNS unit test and hopefully it would be less confusing.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55201
**Special notes for your reviewer**:
cc @Random-Liu @phsiao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55034, 55068). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Close the file before renaming in FileStore
Also change the unit test to use a real file system to detect errors
like this.
Automatic merge from submit-queue (batch tested with PRs 53679, 51063). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes to enable Windows CNI
**What this PR does / why we need it**:
This PR has fixed which enables Kubelet to use Windows CNI plugin.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#49646
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix panic in kubelet because of uninitialized map
**What this PR does / why we need it**:
Initialized the uninitialized map in kubelet
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes [#54927](https://github.com/kubernetes/kubernetes/issues/54927)
**Special notes for your reviewer**:
The default value of --enable-controller-attach-detach is true, map will be initialized like:
```
if kl.enableControllerAttachDetach {
if node.Annotations == nil {
node.Annotations = make(map[string]string)
}
...
}
```
if set --enable-controller-attach-detach to false, map will have no Initialized.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: dockershim: remove orphaned checkpoint files
Fixes https://github.com/kubernetes/kubernetes/issues/55070
Currently, `ListPodSandbox()` returns a combined list of sandboxes populated from both the runtime and the dockershim checkpoint files. However the sandboxes in the checkpoint files might not exist anymore.
The kubelet sees the sandbox returned by `ListPodSandbox()` and determines it shouldn't be running and calls `StopPodSandbox()` on it. This generates an error when `StopContainer()` is called as the container does not exist. However the checkpoint file is not cleaned up. This leads to subsequent calls to `StopPodSandbox()` that fail in the same way each time.
This PR removes the checkpoint file if StopContainer fails due to container not found.
The only other place `RemoveCheckpoint()` is called, except if it is corrupt, is from `RemoveSandbox()`. If the container does not exist, what `RemoveSandbox()` would have done has been effectively been done already. So this is just clean up.
@derekwaynecarr @eparis @freehan @dcbw
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
StopPodSandbox should not log when container is already removed
**What this PR does / why we need it**:
StopPodSandbox should not log when a container is already gone. It should only log if it could not stop and the container was still present.
Fixes https://github.com/kubernetes/kubernetes/issues/55021
**Special notes for your reviewer**:
This was seen in our production logs, need to eliminate spam.
**Release note**:
```release-note
NONE
```
Following are part of this commit
+++++++++++++++++++++++++++++++++
* Windows CNI Support
(1) Support to use --network-plugin=cni
(2) Handled platform requirement of calling CNI ADD for all the containers.
(2.1) For POD Infra container, netNs has to be empty
(2.2) For all other containers, sharing the network namespace of POD container,
should pass netNS name as "container:<Pod Infra Container Id>", same as the
NetworkMode of the current container
(2.3) The Windows CNI plugin has to handle this to call into Platform.
Sample Windows CNI Plugin code to be shared soon.
* Sandbox support for Windows
(1) Sandbox support for Windows. Works only with Docker runtime.
(2) Retained CONTAINER_NETWORK as a backward compatibilty flag,
to not break existing deployments using it.
(3) Works only with CNI plugin enabled.
(*) Changes to reinvoke CNI ADD for every new container created. This is hooked up with PodStatus,
but would be ideal to move it outside of this, once we have CNI GET support
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add admission handler for device resources allocation
**What this PR does / why we need it**:
Add admission handler for device resources allocation to fail fast during pod creation
**Which issue this PR fixes**
fixes#51592
**Special notes for your reviewer**:
@jiayingz Sorry, there is something wrong with my branch in #51895. And I think the existing comments in the PR might be too long for others to view. So I closed it and opened the new one, as we have basically reach an agreement on the implement :)
I have covered the functionality and unit test part here, and would set about the e2e part ASAP
/cc @jiayingz @vishh @RenaudWasTaken
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add a file store utility package in kubelet
More and more components checkpoints (i.e., persist their states) in
kubelet. Refurbish and move the implementation in dockershim to a
utility package to improve code reusability.
Automatic merge from submit-queue (batch tested with PRs 52367, 53363, 54989, 54872, 54643). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Lift embedded structure out of ManifestURLHeader field
Related: #53833
```release-note
It is now possible to set multiple manifest url headers via the Kubelet's --manifest-url-header flag. Multiple headers for the same key will be added in the order provided. The ManifestURLHeader field in KubeletConfiguration object (kubeletconfig/v1alpha1) is now a map[string][]string, which facilitates writing JSON and YAML files.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubenet: yield lock while executing CNI plugin.
The CNI plugin can take up to 3 seconds to execute. CNI plugins can safely be
executed in parallel, so yield the lock to speed up pod creation.
This caused problems with the pod latency tests - previously, CNI plugins executed
in under 20ms. Now they must wait for DAD to finish and addresses to leave
tentative state.
Fixes: #54651
**What this PR does / why we need it**:
After upgrading CNI plugins to v0.6 in #51250, the pod latency tests began failing. This is because the plugins, in order to support IPv6, need to wait for DAD to finish. Because this
delay is while the kubenet lock is held, it significantly slows down the pod creation rate.
**Special notes for your reviewer**:
The CNI plugins also do locking for their critical paths, so it is safe to run them concurrently.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54894, 54630, 54828, 54926, 54865). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
set leveled logging (v=4) for 'updating container' message
**What this PR does / why we need it**:
Currently cpu_manager.go logs a line for every pod at every reconcilePeriod (10 sec default) when it reconciles and updates the pod's cpuset setting. This creates a lot of logging information that is not very interesting and we should suppress that by default by increasing the logging level.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#54804
**Special notes for your reviewer**:
I chose V(4) because that seems to be a popular level for messages at this detail. Happy to follow logging guideline if there is any.
**Release note**:
``` kubelet: cpu_manager logs informative reconcile message at V(4) to reduce clutter ```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update volume OWNERS to reflect active sig-storage reviewers
**What this PR does / why we need it**:
Update sig-storage reviewers to add new members and remove those that don't have as much time to review storage PRs. Approvers are unchanged.
**Special notes for your reviewer**:
For all those that have been removed, please approve. If you want to remain as a reviewer, let me know and I will add you back.
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 53962, 54708). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prevent successful containers from restarting with OnFailure restart policy
**What this PR does / why we need it**:
This is a follow-on to #54597 which makes sure that its validation
also applies to pods with a restart policy of OnFailure. This
deficiency was pointed out by @smarterclayton here:
https://github.com/kubernetes/kubernetes/pull/54530#discussion_r147226458
**Which issue this PR fixes** This is another fix to address #54499
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
improve the relation of ExecInContainer and Exec
keep the relation between ExecInContainer and Exec be consistence with PortForward in streaming server
fix#54903
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
The CNI plugin can take up to 3 seconds to execute. CNI plugins can safely be
executed in parallel, so yield the lock to speed up pod creation.
Fixes: #54651
Automatic merge from submit-queue (batch tested with PRs 49762, 52256). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add fake remote runtime service
**What this PR does / why we need it**:
Add fake remote runtime service.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
First step of #45206.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54656, 54552, 54389, 53634, 54408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add file backed state to cpu manager
**What this PR does / why we need it**:
Adds file backed `State` implementation to cpu manger with tests.
Reads from `State` are done from memory, while each write triggers state save to a file.
Any failure in reading the state file results in empty state
Next PR: #54409
Automatic merge from submit-queue (batch tested with PRs 54593, 54607, 54539, 54105). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Removed containers are not always waiting
fixes#54499
The issue was that a container that is removed (during pod deletion, for example), is assumed to be in a "waiting" state.
Instead, we should use the previous container state.
Fetching the most recent status is required to ensure that we accurately reflect the previous state. The status attached to a pod object is often stale.
I verified this by looking through the kubelet logs during a deletion, and verifying that the status updates do not transition from terminated -> pending.
cc @kubernetes/sig-node-bugs @sjenning @smarterclayton @derekwaynecarr @dchen1107
```release-note
Fix an issue where pods were briefly transitioned to a "Pending" state during the deletion process.
```
Automatic merge from submit-queue (batch tested with PRs 54597, 54593, 54081, 54271, 54600). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: check for illegal container state transition
supersedes https://github.com/kubernetes/kubernetes/pull/54530
Puts a state transition check in the kubelet status manager to detect and block illegal transitions; namely from terminated to non-terminated.
@smarterclayton @derekwaynecarr @dashpole @joelsmith @frobware
I confirmed that the reproducer in #54499 does not work with this check in place. The erroneous kubelet status update is rejected:
```
status_manager.go:301] Status update on pod default/test aborted: terminated container test-container attempted illegal transition to non-terminated state
```
After fix https://github.com/kubernetes/kubernetes/pull/54593, I do not see the message with the above mentioned reproducer.