When the DefaultPodTopologySpread feature is enabled:
If SelectorSpreadPriority is in use, PodTopologySpread is inevitably enabled.
When only EvenPodsSpreadPriority is in use, PodTopologySpread is configured without system defaults.
Change-Id: I2389a585cd8ad0bd35b0d2acae1665cd46908b3e
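A rough sketch of that defaulting rule; the function name, the plain-string
priority names, and the boolean results are illustrative assumptions, not the
real scheduler configuration API:

    package example

    import "k8s.io/apimachinery/pkg/util/sets"

    // configurePodTopologySpread sketches the rule described above. It
    // reports whether PodTopologySpread is enabled and whether the
    // system-default spreading constraints are applied.
    func configurePodTopologySpread(enabledPriorities sets.String) (enabled, systemDefaults bool) {
    	switch {
    	case enabledPriorities.Has("SelectorSpreadPriority"):
    		// SelectorSpreadPriority inevitably enables PodTopologySpread,
    		// with the system-default spreading constraints.
    		return true, true
    	case enabledPriorities.Has("EvenPodsSpreadPriority"):
    		// EvenPodsSpreadPriority alone enables it without defaults.
    		return true, false
    	}
    	return false, false
    }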
When a pod is deleted, it is given a deletion timestamp. However, the
pod might still run for some time during graceful shutdown. During
this time it might still produce CPU utilization metrics and remain in
the Running phase.
Currently the HPA replica calculator attempts to ignore deleted pods
by skipping over them. However, because they are not added to the
ignoredPods set, their metrics are not removed from the average
utilization calculation. This allows pods in the process of shutting
down to drag down the recommended number of replicas by producing
near-0% utilization metrics.
In fact, the ignoredPods set is a misnomer. Those pods are not fully
ignored. When the replica calculator recommends scaling up, 0%
utilization metrics are filled in for those pods to limit the
scale-up. This prevents overscaling when pods take some time to start
up. There should really be 4 sets considered (readyPods, unreadyPods,
missingPods, ignoredPods), not just 3.
This change renames ignoredPods to unreadyPods and keeps the scale-up-
limiting semantics. Another set, an (actually) ignored ignoredPods, is
added; deleted pods are added to it instead of being skipped during
grouping. Both ignoredPods and unreadyPods have their metrics removed
from consideration, but only unreadyPods have 0% utilization metrics
filled in upon scale-up.
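A minimal sketch of the new four-way grouping, assuming a simplified metrics
map and a podReady helper in place of the real replica calculator types:

    package example

    import (
    	v1 "k8s.io/api/core/v1"
    	"k8s.io/apimachinery/pkg/util/sets"
    )

    // groupPods sketches the four sets described above; the metrics map is
    // a stand-in for the real per-pod metrics type.
    func groupPods(pods []*v1.Pod, metrics map[string]int64) (ready, unready, missing, ignored sets.String) {
    	ready, unready, missing, ignored = sets.NewString(), sets.NewString(), sets.NewString(), sets.NewString()
    	for _, pod := range pods {
    		// Deleted pods are now truly ignored: grouped here rather
    		// than skipped, metrics removed, never backfilled with 0%.
    		if pod.DeletionTimestamp != nil {
    			ignored.Insert(pod.Name)
    			continue
    		}
    		if _, found := metrics[pod.Name]; !found {
    			missing.Insert(pod.Name)
    			continue
    		}
    		if !podReady(pod) {
    			// Unready pods keep the old scale-up-limiting
    			// semantics: metrics removed here, 0% filled back
    			// in on scale-up.
    			unready.Insert(pod.Name)
    			continue
    		}
    		ready.Insert(pod.Name)
    	}
    	return ready, unready, missing, ignored
    }

    func podReady(pod *v1.Pod) bool {
    	for _, c := range pod.Status.Conditions {
    		if c.Type == v1.PodReady {
    			return c.Status == v1.ConditionTrue
    		}
    	}
    	return false
    }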
Fix the classic Go loop variable closure bug.
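For context, a minimal illustration of the gotcha and the conventional
shadowing fix:

    package main

    import (
    	"fmt"
    	"sync"
    )

    func main() {
    	var wg sync.WaitGroup
    	for _, s := range []string{"a", "b", "c"} {
    		// Without this shadowing, every goroutine captures the same
    		// loop variable (pre Go 1.22) and may print only "c".
    		s := s
    		wg.Add(1)
    		go func() {
    			defer wg.Done()
    			fmt.Println(s)
    		}()
    	}
    	wg.Wait()
    }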
Also, if we initially fail to set up the rules for one family, don't
try to set up a canary. E.g., on the CI hosts, the kernel ip6tables
modules are not loaded, so any attempt to call ip6tables will fail.
Just log those errors once at startup rather than once a minute.
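A minimal sketch of that behavior, assuming hypothetical ensureBaseRules and
ensureCanary helpers around the real iptables interface:

    package example

    import (
    	"k8s.io/klog/v2"
    	utiliptables "k8s.io/kubernetes/pkg/util/iptables"
    )

    // setupFamily sketches the behavior described above: if the base rules
    // for a family cannot be created (e.g. missing ip6tables kernel
    // modules), log once at startup and skip the canary, instead of
    // retrying and logging every minute. ensureBaseRules and ensureCanary
    // are hypothetical helpers.
    func setupFamily(runner utiliptables.Interface, ensureBaseRules, ensureCanary func(utiliptables.Interface) error) bool {
    	if err := ensureBaseRules(runner); err != nil {
    		klog.Errorf("Failed to set up %s rules; skipping canary: %v", runner.Protocol(), err)
    		return false
    	}
    	if err := ensureCanary(runner); err != nil {
    		klog.Warningf("Failed to set up %s canary: %v", runner.Protocol(), err)
    	}
    	return true
    }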
Discussion is ongoing about how best to handle dual-stack with clouds
and autodetected IPs, but there is at least agreement that people on
bare metal ought to be able to specify two explicit IPs on dual-stack
hosts, so allow that.
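A minimal sketch of the validation this implies, assuming a hypothetical
parseNodeIPs helper rather than kubelet's actual flag handling:

    package example

    import (
    	"fmt"
    	"net"
    	"strings"
    )

    // parseNodeIPs accepts either a single IP or an explicit dual-stack
    // pair; a pair must contain one address of each family.
    func parseNodeIPs(arg string) ([]net.IP, error) {
    	parts := strings.Split(arg, ",")
    	if len(parts) > 2 {
    		return nil, fmt.Errorf("at most two node IPs are allowed, got %q", arg)
    	}
    	var ips []net.IP
    	for _, s := range parts {
    		ip := net.ParseIP(strings.TrimSpace(s))
    		if ip == nil {
    			return nil, fmt.Errorf("invalid IP address %q", s)
    		}
    		ips = append(ips, ip)
    	}
    	// For a pair, require one IPv4 and one IPv6 address.
    	if len(ips) == 2 && (ips[0].To4() != nil) == (ips[1].To4() != nil) {
    		return nil, fmt.Errorf("dual-stack node IPs %q must be one IPv4 and one IPv6 address", arg)
    	}
    	return ips, nil
    }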
Several of the tests in TestNodeAddress() were no-ops because the test
code was only testing that NodeAddresses() returned all of the
expected addresses, but not testing that it was returning them in the
correct order.
The order that NodeAddresses() returns addresses in is very important,
so fix the tests to actually test it.
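The fix amounts to comparing the full ordered slice rather than checking
membership; roughly (with hypothetical got/want names):

    package example

    import (
    	"reflect"
    	"testing"

    	v1 "k8s.io/api/core/v1"
    )

    // assertAddressesInOrder sketches the fixed check: compare the whole
    // ordered slice so an ordering regression fails the test, instead of
    // only asserting that each expected address is present somewhere.
    func assertAddressesInOrder(t *testing.T, got, want []v1.NodeAddress) {
    	if !reflect.DeepEqual(got, want) {
    		t.Errorf("unexpected addresses:\n got: %v\nwant: %v", got, want)
    	}
    }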
One existing test ("NodeIP is external") had its expectedAddresses in
the wrong order, but it seems clear from the test's name that this
ordering was not what it actually intended.
Also, previously testKubeletHostname was "127.0.0.1" which ended up
interacting weirdly with the IPv4-vs-IPv6 sorting code in a way that
made some of the test results confusing if you didn't realize that
testKubeletHostname was an IPv4 address. Fix that by making it an
actual hostname instead, which then preserves the expected sorting.