Add support for scaling to zero pods
minReplicas is allowed to be zero
condition is set once
Based on https://github.com/kubernetes/kubernetes/pull/61423
set original valid condition
add scale to/from zero and invalid metric tests
Scaling up from zero pods ignores tolerance
validate metrics when minReplicas is 0
Document HPA behaviour when minReplicas is 0
Documented minReplicas field in autoscaling APIs
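A minimal sketch of the validation hinted at by the items above, assuming the autoscaling/v2beta2 types; the helper name and the exact rule (scale-to-zero requires an Object or External metric, since pod and resource metrics read nothing at zero pods) are assumptions, not the upstream code:

```go
import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2beta2"
)

// validateMinReplicas (hypothetical name) sketches the rule: minReplicas
// may be 0 only when at least one Object or External metric is configured,
// since pod and resource metrics have nothing to measure at zero pods.
func validateMinReplicas(minReplicas int32, metrics []autoscalingv2.MetricSpec) error {
	if minReplicas > 0 {
		return nil
	}
	for _, m := range metrics {
		if m.Type == autoscalingv2.ObjectMetricSourceType || m.Type == autoscalingv2.ExternalMetricSourceType {
			return nil
		}
	}
	return fmt.Errorf("minReplicas of 0 requires at least one Object or External metric")
}
```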
There are cases where the ReplicaCalculator decides not to change the
current scale. Two important ones are when missing metrics might
change the direction of scaling, and when the recommended scale is
within tolerance of the current scale.
The way that ReplicaCalculator signals its desire not to change the
current scale is by returning the current scale. However the current
scale is from scale.Status.Replicas and can be larger than
scale.Spec.Replicas (e.g. during Deployment rollout with configured
surge). This causes a positive feedback loop because
scale.Status.Replicas is written back into scale.Spec.Replicas,
further increasing the current scale.
This PR fixes the feedback loop by plumbing the replica count from
spec through horizontal.go and replica_calculator.go so the calculator
can punt with the right value.
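A hedged sketch of the change (names illustrative; the real plumbing threads the spec replica count through horizontal.go into replica_calculator.go):

```go
import "math"

// When the recommendation is within tolerance, "punting" must return the
// spec-level replica count. Returning status replicas, which can exceed
// spec during a surge rollout, gets written back into spec and ratchets
// the scale upward on every pass.
func desiredReplicas(usageRatio, tolerance float64, specReplicas, statusReplicas int32) int32 {
	if math.Abs(1.0-usageRatio) <= tolerance {
		return specReplicas // punt with the spec value, not status
	}
	return int32(math.Ceil(usageRatio * float64(statusReplicas)))
}
```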
Handle a case in the Horizontal Pod Autoscaler Controller when scaling
on multiple metrics and one or more of them are missing or invalid.
If all metrics are missing, return an error and leave the isScalingActive
condition set for the last invalid metric.
If some metrics are missing or invalid and others are valid and found:
if the valid metrics would trigger a scale up, ignore the missing metrics
and scale up; if a scale down would be triggered, return an error and
leave the isScalingActive condition set for the last invalid metric.
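Condensed as a sketch (hypothetical helper; the upstream control flow is spread across the calculator and the controller):

```go
import "fmt"

// chooseReplicas condenses the rule above: a scale up driven by the valid
// metrics wins over missing/invalid ones; a scale down does not.
func chooseReplicas(currentReplicas int32, validProposals []int32, lastMetricErr error) (int32, error) {
	if len(validProposals) == 0 {
		// All metrics missing or invalid: surface the last metric error,
		// leaving the ScalingActive condition describing that metric.
		return 0, fmt.Errorf("no valid metrics: %v", lastMetricErr)
	}
	proposal := validProposals[0]
	for _, p := range validProposals[1:] {
		if p > proposal {
			proposal = p // with multiple metrics, the largest proposal wins
		}
	}
	if lastMetricErr != nil && proposal < currentReplicas {
		// Scaling down on partial data is unsafe: the missing metrics
		// might have argued for holding or raising the scale.
		return 0, fmt.Errorf("cannot scale down with invalid metrics: %v", lastMetricErr)
	}
	return proposal, nil
}
```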
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has an explicit InitFlags(), so we add the flags as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
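For reference, the explicit wiring looks like this (a minimal sketch using klog's public API):

```go
import (
	"flag"

	"k8s.io/klog"
)

func init() {
	// Unlike glog, klog does not register its flags implicitly;
	// InitFlags adds them to the given FlagSet (nil means flag.CommandLine).
	klog.InitFlags(nil)
}

func main() {
	flag.Parse()
	klog.Info("logging via klog")
	klog.Flush()
}
```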
HPA will treat the initial size of the autoscalee as a recommendation, to avoid
hastily overriding recommendations made by the HPA (if the HPA set the size and
was then restarted) or by the user (the initial size should be treated as a
human-generated recommendation).
Instead, discard metric values for pods that are unready and have never
been ready (they may report misleading values, the original reason for
introducing the scale-up forbidden window).
Use per pod metric when pod is:
- Ready, or
- Not ready but creation timestamp and last readiness change are more
than 10s apart.
In the latter case we assume the pod was ready but later became unready.
We want to use metrics for such pods because they may be unready precisely
because they were receiving too much load.
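A sketch of that test (the helper name is illustrative and the 10s threshold is passed in; the real check lives in the replica calculator):

```go
import (
	"time"

	v1 "k8s.io/api/core/v1"
)

// shouldUseMetricsFor returns true when the pod's metric should count:
// the pod is Ready, or it is unready but its last readiness transition
// happened well after creation, meaning it was ready at some point and
// presumably became unready under load.
func shouldUseMetricsFor(pod *v1.Pod, delay time.Duration) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type != v1.PodReady {
			continue
		}
		if c.Status == v1.ConditionTrue {
			return true
		}
		return c.LastTransitionTime.Sub(pod.CreationTimestamp.Time) > delay
	}
	return false // no Ready condition recorded yet: treat as never-ready
}
```

Called as `shouldUseMetricsFor(pod, 10*time.Second)` for the window described above.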
Similar to the change we made for `GetObjectMetricReplicas` in the
previous commit. Ensure that `GetExternalMetricReplicas` does not
include unready pods when it is determining how many replicas it desires.
Including unready pods can lead to over-scaling.
We did not change the behavior of `GetExternalPerPodMetricReplicas`, as
the desired behavior there is slightly less clear. We did make some
small naming refactorings to this method, which will make it easier to
ignore unready pods if we decide we want to.
Previously, when `GetObjectMetricReplicas` calculated the desired
replica count, it multiplied the usage ratio by the current number of replicas.
This method caused over-scaling when there were pods that were not ready
for a long period of time. For example, if there were pods A, B, and C,
and only pod A was ready, and the usage ratio was 500%, we would
previously specify 15 pods as the desired replicas (even though really
only one pod was handling the load).
After this change, we now multiply the usage
ratio by the number of ready pods for `GetObjectMetricReplicas`.
In the example above, we'd only desire 5 replica pods.
This change gives `GetObjectMetricReplicas` the same behavior as the
other replica calculator methods. Only `GetExternalMetricReplicas` and
`GetExternalPerPodMetricReplicas` still allow unready pods to impact the
number of desired replicas. I will fix this issue in the following
commit.
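Concretely (a simplified sketch; the real calculator also applies tolerance and missing-pod handling around this step):

```go
import "math"

// With pods A, B, C where only A is ready and a usage ratio of 5.0:
//   before the fix: ceil(5.0 * 3) = 15 replicas (over-scaled)
//   after the fix:  ceil(5.0 * 1) = 5 replicas
func desiredFromRatio(usageRatio float64, readyPods int32) int32 {
	return int32(math.Ceil(usageRatio * float64(readyPods)))
}
```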
There have been a couple of recent bugs in the "normalizing" part of the
`reconcileAutoscaler` method. This part of the code base is responsible
for, among other things, taking the suggested desired replicas based on
the metrics, ensuring it conforms to certain conditions, and updating it
if it does not. Isolate the part that converts the desired replicas
based on a given set of rules into its own function.
We are refactoring this part of the code base to make the logic simpler
and to make it easier to write unit tests.
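The shape of the extracted helper, sketched under the assumption that the rules are a per-pass scale-up cap plus min/max clamping:

```go
// convertDesiredReplicasWithRules isolates the normalization step: take
// the metrics-driven suggestion and force it into the allowed range.
func convertDesiredReplicasWithRules(currentReplicas, desiredReplicas, hpaMinReplicas, hpaMaxReplicas int32) int32 {
	scaleUpLimit := calculateScaleUpLimit(currentReplicas)

	// The effective ceiling is the smaller of maxReplicas and the cap.
	maximumAllowedReplicas := hpaMaxReplicas
	if maximumAllowedReplicas > scaleUpLimit {
		maximumAllowedReplicas = scaleUpLimit
	}
	if desiredReplicas > maximumAllowedReplicas {
		desiredReplicas = maximumAllowedReplicas
	}
	if desiredReplicas < hpaMinReplicas {
		desiredReplicas = hpaMinReplicas
	}
	return desiredReplicas
}

// calculateScaleUpLimit: an assumed heuristic of "at most double the
// current replicas, but always allow at least 4".
func calculateScaleUpLimit(currentReplicas int32) int32 {
	limit := 2 * currentReplicas
	if limit < 4 {
		limit = 4
	}
	return limit
}
```

Isolating this as a pure function of four int32s is what makes the table-driven unit tests mentioned above straightforward.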
Automatic merge from submit-queue (batch tested with PRs 53743, 53564). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Polymorphic Scale Client
This PR introduces a polymorphic scale client based on discovery information that's able to scale scalable resources in arbitrary group-versions, as long as they present the scale subresource in their discovery information.
Currently, it supports `extensions/v1beta1.Scale` and `autoscaling/v1.Scale`, but supporting other versions of scale if/when we produce them should be fairly trivial.
It also updates the HPA to use this client, meaning the HPA will now work on any scalable resource, not just things in the `extensions/v1beta1` API group.
**Release note**:
```release-note
Introduces a polymorphic scale client, allowing HorizontalPodAutoscalers to properly function on scalable resources in any API group.
```
Unblocks #29698
Unblocks #38756
Unblocks #49504
Fixes #38810
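A usage sketch against the new client (API as of the client-go release this PR targets; later releases changed these method signatures, e.g. adding context arguments):

```go
import (
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/scale"
)

// scaleTo resizes any resource that exposes the scale subresource in its
// discovery document; nothing here is specific to Deployments.
func scaleTo(scales scale.ScalesGetter, ns, name string, replicas int32) error {
	gr := schema.GroupResource{Group: "apps", Resource: "deployments"}

	s, err := scales.Scales(ns).Get(gr, name)
	if err != nil {
		return err
	}
	s.Spec.Replicas = replicas
	_, err = scales.Scales(ns).Update(gr, s)
	return err
}
```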
This updates the HPA controller to use the polymorphic scale client from
client-go. This should enable HPAs to work with arbitrary scalable
resources, instead of just those in the extensions API group (meaning we
can deprecate the copy of ReplicationController in extensions/v1beta1).
It also means that the HPA controller now pays attention to the
APIVersion field in `scaleTargetRef` (more specifically, the group part
of it).
Note that currently, discovery information on which resources are
available where is only fetched once (the first time that it's
requested). In the future, we may want a refreshing discovery REST
mapper.
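The group-aware lookup, sketched (close to, but not verbatim, the controller code):

```go
import (
	autoscalingv1 "k8s.io/api/autoscaling/v1"
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// targetMappings resolves scaleTargetRef, honoring its group, into REST
// mappings the scale client can act on; the mapper is fed by discovery.
func targetMappings(mapper apimeta.RESTMapper, hpa *autoscalingv1.HorizontalPodAutoscaler) ([]*apimeta.RESTMapping, error) {
	targetGV, err := schema.ParseGroupVersion(hpa.Spec.ScaleTargetRef.APIVersion)
	if err != nil {
		return nil, err
	}
	targetGK := schema.GroupKind{
		Group: targetGV.Group,
		Kind:  hpa.Spec.ScaleTargetRef.Kind,
	}
	return mapper.RESTMappings(targetGK)
}
```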
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).
Make HPA tolerance a flag
**What this PR does / why we need it**:
Make HPA tolerance configurable as a flag. This change allows us to use
different tolerance values in production/testing.
**Which issue this PR fixes**:
Fixes #18155
**Release note:**
```release-note
Control HPA tolerance through the `horizontal-pod-autoscaler-tolerance` flag.
```
Signed-off-by: mattjmcnaughton <mattjmcnaughton@gmail.com>
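For context, the tolerance dead-band the flag now controls works like this (a sketch; upstream reads the value from the flag instead of a constant):

```go
import "math"

// Within the tolerance band around a usage ratio of 1.0, the HPA holds
// the current scale rather than thrashing on small metric noise.
func withinTolerance(usageRatio, tolerance float64) bool {
	return math.Abs(1.0-usageRatio) <= tolerance
}
```

With the default `horizontal-pod-autoscaler-tolerance` of 0.1, a usage ratio between 0.9 and 1.1 leaves the replica count unchanged.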
Fix #53670
Fix a bug where `desiredReplicas` could be greater than `maxReplicas`
if the original value for `desiredReplicas > scaleUpLimit` and
`scaleUpLimit > maxReplicas`. Previously, when that happened, we would
scale up to `scaleUpLimit`, and then in the next auto-scaling run, scale
down to `maxReplicas`. Address this issue and introduce a regression
test.
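The corrected clamping, sketched with illustrative names: the effective ceiling must be min(scaleUpLimit, maxReplicas), whereas the old code capped at scaleUpLimit without re-checking maxReplicas:

```go
// clampDesired applies the fixed ceiling so desired replicas can never
// overshoot maxReplicas, even when scaleUpLimit > maxReplicas.
func clampDesired(desired, scaleUpLimit, maxReplicas int32) int32 {
	ceiling := maxReplicas
	if ceiling > scaleUpLimit {
		ceiling = scaleUpLimit // rate-limit a single scale-up pass
	}
	if desired > ceiling {
		desired = ceiling
	}
	return desired
}
```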