Preserve the HostNetwork property in checkpoints so that kubelet doesn't
attempt network teardown on HostNetwork containers that no longer exist
but are still checkpointed. If the checkpoint indicates the container
used host networking, skip the teardown, since it would fail anyway.
Related: https://github.com/kubernetes/kubernetes/issues/44307#issuecomment-299548609
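A minimal sketch of the resulting guard, assuming a simplified checkpoint type with a `HostNetwork` flag and a hypothetical `teardownPodNetwork` helper standing in for the network plugin's teardown call:

```go
package sketch

// Simplified checkpoint types for illustration; the real dockershim
// checkpoint carries more fields.
type CheckpointData struct {
	HostNetwork bool
}

type PodSandboxCheckpoint struct {
	Namespace string
	Name      string
	Data      *CheckpointData
}

// teardownPodNetwork is a hypothetical stand-in for the network plugin's
// TearDownPod call.
func teardownPodNetwork(namespace, name, sandboxID string) error {
	return nil
}

// stopSandboxNetwork skips teardown for host-network sandboxes: they never
// had a plugin-managed network namespace, so teardown would only fail.
func stopSandboxNetwork(sandboxID string, cp *PodSandboxCheckpoint) error {
	if cp.Data != nil && cp.Data.HostNetwork {
		return nil
	}
	return teardownPodNetwork(cp.Namespace, cp.Name, sandboxID)
}
```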
GenericPLEG's 1s relist() loop races against pod network setup. It
may be called after the infra container has started but before
network setup is done, since PLEG and the runtime's SyncPod() run
in different goroutines.
Track network setup status and don't bother trying to read the pod's
IP address if networking is not yet ready.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1434950
Mar 22 12:18:17 ip-172-31-43-89 atomic-openshift-node: E0322
12:18:17.651013 25624 docker_manager.go:378] NetworkPlugin
cni failed on the status hook for pod 'pausepods22' - Unexpected
command output Device "eth0" does not exist.
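A sketch of the readiness tracking; the type, field, and method names here are illustrative, not the actual dockershim identifiers:

```go
package sketch

import (
	"fmt"
	"sync"
)

// networkReadyTracker records, per sandbox, whether network setup has
// completed. SyncPod marks a sandbox ready only after the plugin's SetUpPod
// succeeds, so the status path never races into a half-configured namespace.
type networkReadyTracker struct {
	mu    sync.Mutex
	ready map[string]bool
}

func (t *networkReadyTracker) setNetworkReady(sandboxID string, ready bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.ready == nil {
		t.ready = make(map[string]bool)
	}
	t.ready[sandboxID] = ready
}

func (t *networkReadyTracker) getNetworkReady(sandboxID string) (bool, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if ready, ok := t.ready[sandboxID]; ok {
		return ready, nil
	}
	return false, fmt.Errorf("network readiness unknown for sandbox %q", sandboxID)
}
```

The status hook can then report an empty pod IP while `getNetworkReady` is false, instead of querying a namespace that has no `eth0` yet.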
Automatic merge from submit-queue (batch tested with PRs 44326, 45768)
[CRI] Forcibly remove container
Forcibly remove running containers in `RemoveContainer`, since `RemovePodSandbox` is required to forcibly remove any running containers in the sandbox. See [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/api/v1alpha1/runtime/api.proto#L35).
cc @feiskyer @Random-Liu
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
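A sketch of the force removal, using the current Docker Go client for illustration (dockershim wraps this behind its own client interface):

```go
package sketch

import (
	"context"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// removeContainer force-removes a container so RemoveContainer succeeds even
// while the container is still running, matching the CRI requirement that
// RemovePodSandbox be able to delete everything inside the sandbox.
func removeContainer(ctx context.Context, cli *client.Client, id string) error {
	return cli.ContainerRemove(ctx, id, types.ContainerRemoveOptions{
		Force:         true, // kill a running container instead of erroring out
		RemoveVolumes: true,
	})
}
```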
Automatic merge from submit-queue (batch tested with PRs 42025, 44169, 43940)
[CRI] Remove all containers in the sandbox
Remove all containers in the sandbox when we remove the sandbox.
/cc @feiskyer @Random-Liu
Signed-off-by: Xianglin Gao <xlgao@zju.edu.cn>
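A sketch of the cleanup loop; the `io.kubernetes.sandbox.id` label key and the modern Docker client are illustrative assumptions about how containers are associated with their sandbox:

```go
package sketch

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

// removeSandboxContainers force-removes every container labeled as belonging
// to the sandbox before the sandbox itself is removed.
func removeSandboxContainers(ctx context.Context, cli *client.Client, sandboxID string) error {
	args := filters.NewArgs()
	args.Add("label", fmt.Sprintf("io.kubernetes.sandbox.id=%s", sandboxID))
	containers, err := cli.ContainerList(ctx, types.ContainerListOptions{
		All:     true, // include stopped containers
		Filters: args,
	})
	if err != nil {
		return err
	}
	for _, c := range containers {
		if err := cli.ContainerRemove(ctx, c.ID, types.ContainerRemoveOptions{Force: true}); err != nil {
			return err
		}
	}
	return nil
}
```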
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting DNS options together with host networking. This commit rewrites
resolv.conf the same way dockertools does.
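A sketch of the rewrite, assuming we already have the host path of the sandbox's resolv.conf and the cluster DNS settings (the function name and signature are illustrative):

```go
package sketch

import (
	"fmt"
	"os"
	"strings"
)

// rewriteResolvFile overwrites the sandbox's resolv.conf with the cluster
// DNS configuration, since docker refuses --dns and --dns-search together
// with host networking.
func rewriteResolvFile(resolvPath string, servers, searches, options []string) error {
	var b strings.Builder
	for _, s := range servers {
		fmt.Fprintf(&b, "nameserver %s\n", s)
	}
	if len(searches) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(searches, " "))
	}
	if len(options) > 0 {
		fmt.Fprintf(&b, "options %s\n", strings.Join(options, " "))
	}
	return os.WriteFile(resolvPath, []byte(b.String()), 0644)
}
```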
In ListPodSandbox(), we
1. List all sandbox docker containers
2. List all sandbox checkpoints. If a checkpoint does not have a
corresponding container in (1), we return a partial result based on
the checkpoint.
The problem is that new PodSandboxes can be created between step (1) and
(2). In those cases, we will see the checkpoints, but not the sandbox
containers. This leads to strange behavior because the partial result
from the checkpoint does not include some critical information. For
example, the creation timestamp would be zero, which would cause kubelet's
garbage collector to remove the sandbox immediately.
This change fixes that by getting the list of checkpoints before listing
all the containers (since in RunPodSandbox we create them in the reverse
order).
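A sketch of the corrected ordering, with trimmed-down illustrative types:

```go
package sketch

// Trimmed-down illustrative types.
type sandboxItem struct {
	ID        string
	CreatedAt int64 // zero when reconstructed from a checkpoint alone
}

type sandboxSource interface {
	ListCheckpointIDs() ([]string, error)
	ListSandboxContainers() ([]sandboxItem, error)
	SandboxFromCheckpoint(id string) sandboxItem
}

// listPodSandbox reads checkpoints *before* containers. Because RunPodSandbox
// creates the checkpoint first and the container second, a sandbox created
// while we are listing now shows up in the container list instead of only as
// a dangling checkpoint with a zero creation timestamp.
func listPodSandbox(src sandboxSource) ([]sandboxItem, error) {
	checkpointIDs, err := src.ListCheckpointIDs()
	if err != nil {
		return nil, err
	}
	containers, err := src.ListSandboxContainers()
	if err != nil {
		return nil, err
	}
	seen := make(map[string]bool, len(containers))
	for _, c := range containers {
		seen[c.ID] = true
	}
	result := containers
	for _, id := range checkpointIDs {
		if !seen[id] {
			// Container gone but checkpoint remains: return a partial entry.
			result = append(result, src.SandboxFromCheckpoint(id))
		}
	}
	return result, nil
}
```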
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
CRI: Make dockershim better implement CRI.
When thinking about CRI Validation test, I found that `PodSandboxStatus.Linux.Namespaces.Options.HostPid` and `PodSandboxStatus.Linux.Namespaces.Options.HostIpc` are not populated. Although they are not used by kuberuntime now, we should populate them to conform to CRI.
/cc @yujuhong @feiskyer
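A sketch of the mapping, using simplified stand-ins for the v1alpha1 CRI `NamespaceOption` and docker `HostConfig` types:

```go
package sketch

// Simplified stand-ins for the v1alpha1 CRI and docker types.
type NamespaceOption struct {
	HostNetwork bool
	HostPid     bool
	HostIpc     bool
}

type hostConfig struct {
	NetworkMode string
	PidMode     string
	IpcMode     string
}

// namespaceOptionFromHostConfig populates the namespace options reported in
// PodSandboxStatus from the sandbox container's HostConfig, so HostPid and
// HostIpc are no longer left at their zero values.
func namespaceOptionFromHostConfig(hc *hostConfig) *NamespaceOption {
	return &NamespaceOption{
		HostNetwork: hc.NetworkMode == "host",
		HostPid:     hc.PidMode == "host",
		HostIpc:     hc.IpcMode == "host",
	}
}
```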
Automatic merge from submit-queue (batch tested with PRs 38212, 38792, 39641, 36390, 39005)
Set MemorySwap to zero on Windows
Fixes https://github.com/kubernetes/kubernetes/issues/39003
@dchen1107 @michmike @kubernetes/sig-node-misc
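A sketch of the per-platform default (the real code selects the value with per-OS source files; a `runtime.GOOS` check is shown here for brevity):

```go
package sketch

import "runtime"

// defaultMemorySwap returns the MemorySwap value for the container's
// HostConfig. Windows does not support memory swap limits, so it must be
// zero there; on Linux, -1 means the swap limit is unrestricted.
func defaultMemorySwap() int64 {
	if runtime.GOOS == "windows" {
		return 0
	}
	return -1
}
```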
We have observed that, after failing to create a container due to "device or
resource busy", docker may end up having inconsistent internal state. One
symptom is that docker will not report the existence of the "failed to create"
container, but if kubelet tries to create a new container with the same name,
docker will error out with a naming conflict message.
To work around this, this commit parses the creation error message and, if
there is a naming conflict, attempts to remove the existing container.
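A sketch of the workaround, using the modern Docker Go client for illustration; the conflict-message format belongs to docker, so the regexp stays deliberately permissive:

```go
package sketch

import (
	"context"
	"regexp"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// conflictRE pulls the blocking container's ID out of docker's
// naming-conflict error message.
var conflictRE = regexp.MustCompile(`is already in use by container ([0-9a-z]+)`)

// recoverFromCreationConflict checks a container-creation error for a naming
// conflict and, if one is found, removes the stale container so that a retry
// of the create can succeed. It reports whether a retry is worthwhile.
func recoverFromCreationConflict(ctx context.Context, cli *client.Client, createErr error) bool {
	matches := conflictRE.FindStringSubmatch(createErr.Error())
	if len(matches) != 2 {
		return false
	}
	staleID := matches[1]
	return cli.ContainerRemove(ctx, staleID, types.ContainerRemoveOptions{Force: true}) == nil
}
```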