Automatic merge from submit-queue
kubelet: check and enforce minimum docker api version
**What this PR does / why we need it**:
This PR adds enforcing a minimum docker api version (same with what we have do for dockertools).
**Which issue this PR fixes**
Fixes#42696.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Fix tiny typo
**What this PR does / why we need it**:
**Which issue this PR fixes**
Fix type typo introduced by PR #43368.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)
Kubelet:rkt Create any missing hostPath Volumes
When using a `hostPath` inside the `Pod.spec.volumes`, this PR allows to creates any missing directory on the node.
**What this PR does / why we need it**:
With rkt as the container runtime we cannot use `hostPath` volumes if the directory is missing.
**Special notes for your reviewer**:
This PR follows [#39965](https://github.com/kubernetes/kubernetes/pull/39965)
The labels should be
> area/rkt
> area/kubelet
Automatic merge from submit-queue (batch tested with PRs 42998, 42902, 42959, 43020, 42948)
Add Host field to TCPSocketAction
Currently, TCPSocketAction always uses Pod's IP in connection. But when a pod uses the host network, sometimes firewall rules may prevent kubelet from connecting through the Pod's IP.
This PR introduces the 'Host' field for TCPSocketAction, and if it is set to non-empty string, the probe will be performed on the configured host rather than the Pod's IP. This gives users an opportunity to explicitly specify 'localhost' as the target for the above situations.
```release-note
Add Host field to TCPSocketAction
```
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)
Return early from eviction debug helpers if !glog.V(3)
Should keep us from running a bunch of loops needlessly.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43653, 43654, 43652)
CRI: Check nil pointer to avoid kubelet panic.
When working on the containerd kubernetes integration, I casually returns an empty `sandboxStatus.Linux{}`, but it cause kubelet to panic.
This won't happen when runtime returns valid data, but we should not make the assumption here.
/cc @yujuhong @feiskyer
Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)
Use pod sandbox id in checkpoint
**What this PR does / why we need it**: we should log out sandbox id when checkpoint error
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 41139, 41186, 38882, 37698, 42034)
Make kubelet never delete files on mounted filesystems
With bug #27653, kubelet could remove mounted volumes and delete user data.
The bug itself is fixed, however our trust in kubelet is significantly lower.
Let's add an extra version of RemoveAll that does not cross mount boundary
(rm -rf --one-file-system).
It calls lstat(path) three times for each removed directory - once in
RemoveAllOneFilesystem and twice in IsLikelyNotMountPoint, however this way
it's platform independent and the directory that is being removed by kubelet
should be almost empty.
Automatic merge from submit-queue (batch tested with PRs 43533, 43539)
kuberuntime: don't override the pod IP for pods using host network
This fixes the issue of not passing pod IP via downward API for host network pods.
Automatic merge from submit-queue (batch tested with PRs 43465, 43529, 43474, 43521)
kubelet/cni: hook network plugin Status() up to CNI network discovery
Ensure that the plugin returns NotReady status until there is a
CNI network available which can be used to set up pods.
Fixes: https://github.com/kubernetes/kubernetes/issues/43014
I think the only reason it wasn't done like this in the first place was that the dynamic "reread /etc/cni/net.d every 10s forever" was added long after the Status() hook was. What do you think?
@freehan @caseydavenport @luxas @jbeda
Automatic merge from submit-queue (batch tested with PRs 43398, 43368)
CRI: add support for dns cluster first policy
**What this PR does / why we need it**:
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools was updated to support the feature.
This PR updates kuberuntime to support it for all runtimes.
**Which issue this PR fixes**
fixes#43352
**Special notes for your reviewer**:
Candidate for v1.6.
**Release note**:
```release-note
NONE
```
cc @thockin @luxas @vefimova @Random-Liu
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting dns options togather with hostnetwork. This commit rewrites
resolv.conf same as dockertools.
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools
was updated to support the feature. This PR updates kuberuntime to
support it for all runtimes.
Also fixes#43352.
Automatic merge from submit-queue (batch tested with PRs 42828, 43116)
Apply taint tolerations for NoExecute for all static pods.
Fixed https://github.com/kubernetes/kubernetes/issues/42753
**Release note**:
```
Apply taint tolerations for NoExecute for all static pods.
```
cc/ @davidopp
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)
Improve code coverage for pkg/kubelet/status/generate.go
**What this PR does / why we need it**:
Improve code coverage for pkg/kubelet/status/generate.go from #39559
Thanks.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 42747, 43030)
dockershim: remove corrupted sandbox checkpoints
This is a workaround to ensure that kubelet doesn't block forever when
the checkpoint is corrupted.
This is a workaround for #43021
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs
Fixes#42412
**Background**
Support for multiple GPUs is an experimental feature in v1.6.
Container restarts were handled incorrectly which resulted in stranding of GPUs
Kubelet is incorrectly using runtime cache to track running pods which can result in race conditions (as it did in other parts of kubelet). This can result in same GPU being assigned to multiple pods.
**What does this PR do**
This PR tracks assignment of GPUs to containers and returns pre-allocated GPUs instead of (incorrectly) allocating new GPUs.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.
**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.
**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.
cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews
Replaces #42560
Currently, TCPSocketAction always uses Pod's IP in connection. But when a
pod uses the host network, sometimes firewall rules may prevent kubelet
from connecting through the Pod's IP. This PR introduces the 'Host' field
for TCPSocketAction, and if it is set to non-empty string, the probe will
be performed on the configured host rather than the Pod's IP. This gives
users an opportunity to explicitly specify 'localhost' as the target for
the above situations.
Automatic merge from submit-queue
Invalid environment var names are reported and pod starts
When processing EnvFrom items, all invalid keys are collected and
reported as a single event.
The Pod is allowed to start.
fixes#42583