When a pod is deleted, it is given a deletion timestamp. However, the pod might still run for some time during graceful shutdown. During this time it might still produce CPU utilization metrics and be in the Running phase. Currently the HPA replica calculator attempts to ignore deleted pods by skipping over them. However, because they are not added to the ignoredPods set, their metrics are not removed from the average utilization calculation. This allows pods that are shutting down to drag down the recommended number of replicas by producing near-0% utilization metrics. In fact, the ignoredPods set is a misnomer: those pods are not fully ignored. When the replica calculator recommends a scale-up, 0% utilization metrics are filled in for those pods to limit the scale-up. This prevents overscaling when pods take some time to start up. There should therefore be 4 sets considered (readyPods, unreadyPods, missingPods, ignoredPods), not just 3. This change renames ignoredPods to unreadyPods and keeps the scale-up limiting semantics. Another (actually ignored) set, ignoredPods, is added, and deleted pods are added to it instead of being skipped during grouping. Both ignoredPods and unreadyPods have their metrics removed from consideration, but only unreadyPods have 0% utilization metrics filled in upon scale-up. |
---|---|---|
config | ||
metrics | ||
BUILD | ||
doc.go | ||
horizontal_test.go | ||
horizontal.go | ||
legacy_horizontal_test.go | ||
legacy_replica_calculator_test.go | ||
OWNERS | ||
rate_limiters.go | ||
replica_calculator_test.go | ||
replica_calculator.go |
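
The grouping described in the commit message above can be illustrated with a minimal Go sketch. It is not the code in `replica_calculator.go`: the `Pod` and `Groups` types and the functions `groupPods`, `averageUtilization`, and `effectiveUtilization` are hypothetical simplifications, readiness is reduced to a boolean, and the treatment of missingPods is left out because the commit message does not describe it.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a pod and its CPU metric.
type Pod struct {
	Name        string
	Deleting    bool // a deletion timestamp has been set
	Ready       bool
	HasMetric   bool
	Utilization int // CPU utilization as a percentage of the request
}

// Groups holds the four sets named in the commit message.
type Groups struct {
	Ready, Unready, Missing, Ignored []Pod
}

// groupPods places every pod in exactly one set. Deleted pods go into
// Ignored instead of being skipped, so their metrics can be dropped.
func groupPods(pods []Pod) Groups {
	var g Groups
	for _, p := range pods {
		switch {
		case p.Deleting:
			g.Ignored = append(g.Ignored, p)
		case !p.HasMetric:
			g.Missing = append(g.Missing, p)
		case !p.Ready:
			g.Unready = append(g.Unready, p)
		default:
			g.Ready = append(g.Ready, p)
		}
	}
	return g
}

// averageUtilization averages the metrics of ready pods only. Metrics of
// Unready and Ignored pods are never read. With fillUnready set, every
// unready pod is counted as 0%.
func averageUtilization(g Groups, fillUnready bool) int {
	sum, n := 0, len(g.Ready)
	for _, p := range g.Ready {
		sum += p.Utilization
	}
	if fillUnready {
		n += len(g.Unready)
	}
	if n == 0 {
		return 0
	}
	return sum / n
}

// effectiveUtilization applies the scale-up limit: when the ready-only
// average exceeds the target, unready pods are filled in at 0%.
func effectiveUtilization(g Groups, target int) int {
	avg := averageUtilization(g, false)
	if avg > target {
		return averageUtilization(g, true)
	}
	return avg
}

func main() {
	g := groupPods([]Pod{
		{Name: "a", Ready: true, HasMetric: true, Utilization: 90},
		{Name: "b", Ready: true, HasMetric: true, Utilization: 90},
		{Name: "c", Ready: false, HasMetric: true, Utilization: 10},
		{Name: "d", Deleting: true, Ready: true, HasMetric: true, Utilization: 0},
	})
	fmt.Printf("ready=%d unready=%d missing=%d ignored=%d\n",
		len(g.Ready), len(g.Unready), len(g.Missing), len(g.Ignored))
	fmt.Printf("effective utilization at target 50%%: %d%%\n", effectiveUtilization(g, 50))
}
```

In this sketch the terminating pod `d` contributes nothing to either average, while the unready pod `c` only dampens the result once a scale-up is being considered.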