cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and force a status update to trigger the pod deletion. However, this
function is called in the periodic clenup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killling all
the containers.
If the container status exists but the container hasn't been created yet, it won't have an ID.
Have the probe wait for a valid status if the container ID is not yet set; otherwise, you'll see the
following cryptic log message from runtime.go: invalid container ID: "".
bridge-nf-call-iptables appears to only be relevant when the containers are
attached to a Linux bridge, which is usually the case with default Kubernetes
setups, docker, and flannel. That ensures that the container traffic is
actually subject to the iptables rules since it traverses a Linux bridge
and bridged traffic is only subject to iptables when bridge-nf-call-iptables=1.
But with other networking solutions (like openshift-sdn) that don't use Linux
bridges, bridge-nf-call-iptables may not be not relevant, because iptables is
invoked at other points not involving a Linux bridge.
The decision to set bridge-nf-call-iptables should be influenced by networking
plugins, so push the responsiblity out to them. If no network plugin is
specified, fall back to the existing bridge-nf-call-iptables=1 behavior.
cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and force a status update to trigger the pod deletion. However, this
function is called in the periodic clenup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod. When batch deleting tens of pods, the rate of
new updates surpasses what the status manager can handle, causing numerous
redundant requests (and the status channel to be full).
This change forces a status update only when detecting the DeletionTimestamp is
set for a terminated pod. Note that for other non-terminated pods, the pod
workers should be responsible for setting the correct status after killling all
the containers.
Node controller is generating a huge amount of logging at v(3) that is
more appropriate for v(5). Split the log into two levels and ensure it
also ends up on one line (so grep works).
The pod manager generates a v(4) pod output on sync that always contains
a newline - since the size of the pod is so excessive in output, kick it
to v(5) for deep debugging (we're pretty happy with this loop).