Commit Graph

60 Commits

Author SHA1 Message Date
Kushagra
80384bbb55 spelling mistake rectified 2022-12-29 17:55:17 +00:00
Kushagra
f380ef8b61 Misleading message when there are no metrics. 2022-12-29 10:57:43 +00:00
Kubernetes Prow Robot
f9f9f0107d
Merge pull request #112544 from abhijit-dev82/master
HPA : Enhance error message to capture POD details
2022-10-28 04:14:30 -07:00
Abhijit
ac56e6f34e HPA : Enhance error message to capture POD details
2022-10-17 14:21:28 +05:30
Kushagra
01b553145c requested changes: fix return type variables 2022-09-22 08:59:02 +00:00
Kushagra
cbea8d2248 requested changes 2022-09-19 08:15:06 +00:00
Kushagra
79f5c7da33 variable name change for better understanding 2022-09-19 05:05:01 +00:00
Kushagra
b75fbda0ed requested changes 2022-09-05 04:54:59 +00:00
Kushagra
bb735bf689 revert for non-utilization metrics 2022-09-05 04:54:59 +00:00
Kushagra
6bb73bae06 FIX: hpa scale down with target >= 100 2022-09-05 04:54:59 +00:00
Kubernetes Prow Robot
5ade6c833f
Merge pull request #110695 from lokichoggio/hpa
code optimization: deal with error first to prevent unnecessary computing
2022-09-01 17:52:04 -07:00
j2gg0s
755098cc31 hpa: rename rebalanceIgnored to scaleUpWithUnready for understanding 2022-08-26 15:36:11 +08:00
Abirdcfly
2bca77a3d9 Update golangci-lint to 1.46.2 and fix errors
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-06-29 17:42:46 +08:00
lokichoggio
a86f1672c3
code optimization 2022-06-21 23:36:07 +08:00
wangyysde
d2abddd909 rename v2beta2 to v2
Signed-off-by: wangyysde <net_use@bzhy.com>

Generate swagger.json.

Use v2 path for hpa_cpu_field.

run update-codegen.sh

Signed-off-by: wangyysde <net_use@bzhy.com>
2021-11-09 10:34:54 +08:00
Mike Dame
7780024916 Wire contexts to Autoscaling controllers 2021-10-12 14:34:05 -04:00
Kubernetes Prow Robot
c5efee02ac
Merge pull request #89465 from shibataka000/84142-cm
Fix HPA bug about unintentional scale out during updating deployment when using PodMetric.
2020-12-16 04:44:21 -08:00
Arjun Naik
0fec7b0f7e Added functionality and API for pod autoscaling based on container resources
Signed-off-by: Arjun Naik <anaik@redhat.com>
2020-10-21 21:10:05 +02:00
Joseph Burnett
1ccaaa768d Ignore deleted pods.
When a pod is deleted, it is given a deletion timestamp. However the
pod might still run for some time during graceful shutdown. During
this time it might still produce CPU utilization metrics and be in a
Running phase.

Currently the HPA replica calculator attempts to ignore deleted pods
by skipping over them. However, because they are never added to the
ignoredPods set, their metrics are not removed from the average
utilization calculation. This allows pods in the process of shutting
down to drag down the recommended number of replicas by producing
near-0% utilization metrics.

In fact the ignoredPods set is a misnomer. Those pods are not fully
ignored. When the replica calculator recommends scaling up, 0%
utilization metrics are filled in for those pods to limit the scale
up. This prevents over-scaling when pods take some time to start up. In
fact, there should be 4 sets considered (readyPods, unreadyPods,
missingPods, ignoredPods), not just 3.

This change renames ignoredPods to unreadyPods and keeps the scale-up
limiting semantics. Another set, ignoredPods (this time actually
ignored), is added; deleted pods are placed in it during grouping
instead of being skipped. Both ignoredPods and unreadyPods have their
metrics removed from consideration, but only unreadyPods have 0%
utilization metrics filled in upon scale-up.
2020-10-14 16:45:06 +02:00
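To make the four-set grouping described in this commit concrete, here is a minimal, self-contained Go sketch of how a replica calculator might bucket pods. The `podInfo` type, field names, and grouping rules are illustrative assumptions, not the actual Kubernetes implementation.

```go
package main

import (
	"fmt"
	"time"
)

// podInfo is a hypothetical, simplified stand-in for the data the HPA
// replica calculator looks at when grouping pods.
type podInfo struct {
	name      string
	deleted   bool // deletion timestamp set
	failed    bool // pod phase is Failed
	ready     bool
	hasMetric bool
	age       time.Duration // time since pod start
}

// groupPods sketches the four disjoint sets described in the commit:
// ready, unready, missing (no metric sample), and ignored (deleted/failed).
func groupPods(pods []podInfo, initPeriod time.Duration) (ready, unready, missing, ignored []string) {
	for _, p := range pods {
		switch {
		case p.deleted || p.failed:
			// Fully ignored: metrics removed and never filled back in.
			ignored = append(ignored, p.name)
		case !p.hasMetric:
			missing = append(missing, p.name)
		case !p.ready || p.age < initPeriod:
			// Unready: metrics removed, but 0% is filled in on scale-up
			// to limit over-scaling while pods start.
			unready = append(unready, p.name)
		default:
			ready = append(ready, p.name)
		}
	}
	return
}

func main() {
	pods := []podInfo{
		{name: "a", ready: true, hasMetric: true, age: 10 * time.Minute},
		{name: "b", deleted: true, hasMetric: true, age: 10 * time.Minute}, // shutting down
		{name: "c", ready: false, hasMetric: true, age: 30 * time.Second},  // starting up
		{name: "d", ready: true, hasMetric: false, age: 10 * time.Minute},  // metric missing
	}
	ready, unready, missing, ignored := groupPods(pods, 5*time.Minute)
	fmt.Println("ready:", ready, "unready:", unready, "missing:", missing, "ignored:", ignored)
}
```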
shibataka000
dbd0d566ed Fix bug about unintentional scale out during updating deployment.
This commit fixes a bug in the calcPlainMetricReplicas function.
The same bug in the GetResourceReplicas function was fixed in #85027.
2020-03-25 15:47:40 +09:00
Kubernetes Prow Robot
c441a1a7dc
Merge pull request #85027 from shibataka000/fix-bug-about-unintentional-scale-out-during-updating-deployment
Fix HPA bug about unintentional scale out during updating deployment.
2020-03-24 04:50:46 -07:00
Alexander Zimmermann
a1c837022c
Fixed Golint errors in pkg/controller/podautoscaler 2020-02-06 17:16:38 +01:00
shibataka000
b7122770f8 Fix bug about unintentional scale out during updating deployment.
During a rolling update with maxSurge=1 and maxUnavailable=0,
len(metrics) is greater than currentReplicas,
which may cause an unintentional scale out.
2019-11-09 06:24:31 +00:00
Rinat Shigapov
d55f037b7d HPA scale-to-zero for custom object/external metrics
Add support for scaling to zero pods

minReplicas is allowed to be zero

condition is set once

Based on https://github.com/kubernetes/kubernetes/pull/61423

set original valid condition

add scale to/from zero and invalid metric tests

Scaling up from zero pods ignores tolerance

validate metrics when minReplicas is 0

Document HPA behaviour when minReplicas is 0

Documented minReplicas field in autoscaling APIs
2019-07-16 08:46:21 -05:00
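As a rough illustration of the "Scaling up from zero pods ignores tolerance" behavior noted in this commit's change list, the following Go sketch shows one way the decision could look. The function name and the 10% tolerance value are assumptions for illustration only, not the controller's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas sketches a usage-ratio recommendation in which the
// tolerance band is skipped when scaling up from zero replicas.
func desiredReplicas(currentReplicas int32, usageRatio, tolerance float64) int32 {
	if currentReplicas == 0 {
		// With zero replicas there is no meaningful baseline, so any
		// positive signal scales up regardless of tolerance.
		if usageRatio > 0 {
			return int32(math.Ceil(usageRatio))
		}
		return 0
	}
	if math.Abs(usageRatio-1.0) <= tolerance {
		return currentReplicas // within tolerance: keep the current scale
	}
	return int32(math.Ceil(usageRatio * float64(currentReplicas)))
}

func main() {
	fmt.Println(desiredReplicas(0, 0.5, 0.1))  // 1: from zero, tolerance is ignored
	fmt.Println(desiredReplicas(3, 1.05, 0.1)) // 3: within tolerance, no change
	fmt.Println(desiredReplicas(3, 2.0, 0.1))  // 6: normal scale up
}
```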
Kubernetes Prow Robot
57eef32041
Merge pull request #79657 from josephburnett/hpastuck
Ignore unschedulable pods
2019-07-10 11:34:29 -07:00
Joseph Burnett
80e279d353 Ignore pending pods.
This change adds pending pods to the ignored set first before
selecting pods missing metrics. Pending pods are always ignored when
calculating scale.

When the HPA decides which pods and metric values to take into account
when scaling, it divides the pods into three disjoint subsets: 1)
ready 2) missing metrics and 3) ignored. First the HPA selects pods
which are missing metrics. Then it selects pods that should be ignored
because they are not ready yet, or are still consuming CPU during
initialization. All the remaining pods go into the ready set. After
the HPA has decided what direction it wants to scale based on the
ready pods, it considers what might have happened if it had the
missing metrics. It makes a conservative guess about what the missing
metrics might have been: 0% if it wants to scale up, 100% if it wants
to scale down. This is a good thing when scaling up, because newly
added pods will likely help reduce the usage ratio, even though their
metrics are missing at the moment. The HPA should wait to see the
results of its previous scale decision before it makes another
one. However when scaling down, it means that many missing metrics can
pin the HPA at high scale, even when load is completely removed. In
particular, when there are many unschedulable pods due to insufficient
cluster capacity, the many missing metrics (assumed to be 100%) can
cause the HPA to avoid scaling down indefinitely.
2019-07-10 12:16:33 +02:00
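The conservative treatment of missing metrics described above can be sketched in a few lines of Go. This is an illustrative example with assumed names (`averageUtilization`, `recommend`) and a made-up tolerance, not the controller's actual code; it also reproduces the stuck-at-high-scale effect this commit addresses.

```go
package main

import (
	"fmt"
	"math"
)

// averageUtilization returns the mean of the given utilization samples,
// expressed as fractions of the target (1.0 == exactly at target).
func averageUtilization(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

// recommend sketches the "conservative guess": missing metrics are assumed
// to be 0% when the ready pods suggest scaling up and 100% when they suggest
// scaling down; the scale only changes if the signal survives that guess.
// Pending pods are simply left out of both lists.
func recommend(readySamples []float64, missingCount, currentReplicas int, tolerance float64) int {
	ratio := averageUtilization(readySamples)
	assumed := 1.0 // scale down: assume missing pods are fully loaded
	if ratio > 1.0 {
		assumed = 0.0 // scale up: assume missing pods are idle
	}
	all := append([]float64{}, readySamples...)
	for i := 0; i < missingCount; i++ {
		all = append(all, assumed)
	}
	newRatio := averageUtilization(all)
	if math.Abs(newRatio-1.0) <= tolerance {
		return currentReplicas // the conservative guess erased the signal
	}
	return int(math.Ceil(newRatio * float64(currentReplicas)))
}

func main() {
	// Two ready pods at 200% of target, one pod missing metrics, scale 3:
	// the assumed 0% sample still leaves a clear scale-up signal.
	fmt.Println(recommend([]float64{2.0, 2.0}, 1, 3, 0.1)) // 4
	// One ready pod at 50% of target but nine pods missing metrics, scale 10:
	// the assumed 100% samples pin the HPA at its current scale, the
	// behavior this commit calls out for unschedulable pods.
	fmt.Println(recommend([]float64{0.5}, 9, 10, 0.1)) // 10
}
```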
Joseph Burnett
39c4875321 There are various reasons that the HPA will decide not to change the
current scale. Two important ones are when missing metrics might
change the direction of scaling, and when the recommended scale is
within tolerance of the current scale.

The way that ReplicaCalculator signals its desire not to change the
current scale is by returning the current scale. However the current
scale is from scale.Status.Replicas and can be larger than
scale.Spec.Replicas (e.g. during Deployment rollout with configured
surge). This causes a positive feedback loop because
scale.Status.Replicas is written back into scale.Spec.Replicas,
further increasing the current scale.

This PR fixes the feedback loop by plumbing the replica count from
spec through horizontal.go and replica_calculator.go so the calculator
can punt with the right value.
2019-07-02 14:21:32 +02:00
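To illustrate why "punting" with the status-derived replica count creates a feedback loop during a surge rollout, here is a hedged Go sketch; the `scaleState` type and the `punt` function are hypothetical, and the numbers only mirror the scenario described above.

```go
package main

import "fmt"

// scaleState is a hypothetical stand-in for scale.Spec.Replicas and
// scale.Status.Replicas during a Deployment rollout with maxSurge=1.
type scaleState struct {
	spec, status int32
}

// punt models a calculator that wants to leave the scale unchanged.
// Returning the status value (the pre-fix behavior) folds the surge pod
// into spec, so the scale ratchets upward on every iteration.
func punt(s scaleState, useSpec bool) int32 {
	if useSpec {
		return s.spec
	}
	return s.status
}

func main() {
	for _, useSpec := range []bool{false, true} {
		s := scaleState{spec: 5, status: 6} // one surge pod during rollout
		for i := 0; i < 3; i++ {
			desired := punt(s, useSpec)
			s.spec = desired      // controller writes the recommendation back
			s.status = s.spec + 1 // rollout keeps one surge pod running
		}
		fmt.Printf("useSpec=%v -> spec=%d\n", useSpec, s.spec)
	}
}
```

With `useSpec=false` the spec climbs from 5 to 8 in three iterations; with `useSpec=true` it stays at 5, which is the behavior the plumbing change restores.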
waynepeking348
b8b1720f12 Fix bug in ObjectPerPodMetricReplicas: initialize replicaCount with currentReplicas 2019-06-05 11:54:03 +00:00
Arjun Naik
c99d505001 Added functionality to use target average value for object metrics
Signed-off-by: Arjun Naik <arjun.rn@gmail.com>
2019-01-23 21:00:05 +01:00
Krzysztof Jastrzebski
985ba931b1 Use informer cache instead of active pod gets in HPA controller. 2018-09-05 11:31:27 +02:00
Kubernetes Submit Queue
2548fb08cd
Merge pull request #68068 from krzysztof-jastrzebski/hpas2
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Change CPU sample sanitization in HPA.

**What this PR does / why we need it**:
Change CPU sample sanitization in HPA.
    Ignore samples if:
    - Pod is being initialized - 5 minutes from start, defined by flag:
        - pod is unready
        - pod is ready but a full metric window hasn't been collected since
        the transition
    - Pod is initialized - 5 minutes from start, defined by flag:
        - Pod has never been ready after the initial readiness period.

**Release notes:**
```release-note
Improve CPU sample sanitization in HPA by taking metric's freshness into account.
```
2018-08-31 10:17:44 -07:00
Krzysztof Jastrzebski
5357bf9eac Change CPU sample sanitization in HPA.
Ignore samples if:
- Pod is being initialized - 5 minutes from start, defined by flag:
    - pod is unready
    - pod is ready but a full metric window hasn't been collected since
    the transition
- Pod is initialized - 5 minutes from start, defined by flag:
    - Pod has never been ready after the initial readiness period.
2018-08-30 23:13:14 +02:00
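A rough Go rendering of these sanitization rules might look like the following; the struct fields, helper name, and the way the "5 minutes from start defined by flag" period is passed in are assumptions for illustration, not the controller's actual API.

```go
package main

import (
	"fmt"
	"time"
)

// cpuSample is a hypothetical view of the data needed to decide whether a
// pod's CPU sample should be ignored, per the rules in the commit message.
type cpuSample struct {
	podStart        time.Time
	ready           bool
	lastTransition  time.Time     // last readiness transition
	metricWindow    time.Duration // length of the CPU metric window
	metricTimestamp time.Time     // end of the CPU metric window
	everReadyAfter  bool          // ever ready after the initial readiness period
}

// ignoreSample returns true when the sample should be discarded.
// initPeriod corresponds to the "5 minutes from start defined by flag".
func ignoreSample(s cpuSample, now time.Time, initPeriod time.Duration) bool {
	initializing := now.Sub(s.podStart) < initPeriod
	if initializing {
		if !s.ready {
			return true // unready while initializing
		}
		// Ready, but a full metric window hasn't elapsed since the transition.
		if s.metricTimestamp.Sub(s.lastTransition) < s.metricWindow {
			return true
		}
		return false
	}
	// Initialized: ignore only if the pod has never been ready after the
	// initial readiness period.
	return !s.everReadyAfter
}

func main() {
	now := time.Now()
	s := cpuSample{
		podStart:        now.Add(-2 * time.Minute),
		ready:           true,
		lastTransition:  now.Add(-20 * time.Second),
		metricWindow:    time.Minute,
		metricTimestamp: now,
	}
	fmt.Println(ignoreSample(s, now, 5*time.Minute)) // true: window not yet full
}
```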
Kubernetes Submit Queue
42c6f1fb28
Merge pull request #67067 from moonek/master
Automatic merge from submit-queue (batch tested with PRs 67067, 67947). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Do not count soft-deleted pods for scaling purposes in HPA controller

**What this PR does / why we need it**:
The metrics of "soft-deleted" pods (pods that are about to be deleted) should probably not matter for scaling purposes, since they'll be gone "soon", whether they're nodelost or just normally deleted.

As long as soft-deleted pods still exist, they prevent normal scale up.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/62845

**Special notes for your reviewer**:

**Release note**:

```release-note
Stop counting soft-deleted pods for scaling purposes in HPA controller to avoid soft-deleted pods incorrectly affecting scale up replica count calculation.
```
2018-08-28 15:08:01 -07:00
moonek
3fedbe48e3 Do not count soft-deleted pods for scaling purposes in HPA controller 2018-08-28 16:27:47 +00:00
Krzysztof Jastrzebski
dfd88dbde0 Remove incorrect glog error from Horizontal Pod Autoscaler. 2018-08-28 09:18:25 +02:00
Mike Dame
c7102ee5dc Implement autoscaling/v2beta2 features in HPA controller 2018-08-27 11:07:52 -04:00
Joachim Bartosik
4fd6a1684d Make HPA more configurable
The duration of the CPU initialization taint and the window of initial
readiness are now controlled by flags.

Add API violation exceptions, following the example of e50340ee23
2018-08-24 13:13:02 +02:00
Joachim Bartosik
7d6676eab1 Improve HPA sample sanitization
After my previous changes the HPA wasn't behaving correctly in the following
situation:

- Pods use a lot of CPU during initialization and become ready right after they initialize,
- Scale up triggers,
- When the new pods become ready the HPA counts their usage (even though it isn't related to any work that needs doing),
- Another scale up follows, even though the existing pods could handle the work without a problem.
2018-08-21 16:22:06 +02:00
Joachim Bartosik
7681c284f5 Remove UpscaleForbiddenWindow
Instead, discard metric values for pods that are unready and have never
been ready (they may report misleading values, which was the original reason for
introducing the scale-up forbidden window).

Use the per-pod metric when the pod is:
- Ready, or
- Not ready but creation timestamp and last readiness change are more
  than 10s apart.

In the latter case we assume the pod was ready but later became unready.
We want to use metrics for such pods because sometimes they are
unready precisely because they were getting too much load.
2018-08-01 17:47:23 +02:00
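The rule for which pods' metrics are used can be sketched as a small predicate. This is an illustrative sketch: the 10s threshold comes from the commit message above, but the function and parameter names are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// useMetricForPod sketches the rule from the commit message: use a pod's
// metric if the pod is ready, or if it is unready but its creation time and
// last readiness transition are more than 10s apart (i.e. it was presumably
// ready earlier and may have become unready because of load).
func useMetricForPod(ready bool, created, lastReadinessChange time.Time) bool {
	if ready {
		return true
	}
	return lastReadinessChange.Sub(created) > 10*time.Second
}

func main() {
	now := time.Now()
	// Never-ready pod whose readiness condition was recorded 5s after creation:
	// its possibly misleading startup metrics are discarded.
	fmt.Println(useMetricForPod(false, now.Add(-5*time.Second), now)) // false
	// Pod that ran for 10 minutes and then became unready, perhaps because it
	// is overloaded: its metrics are kept.
	fmt.Println(useMetricForPod(false, now.Add(-10*time.Minute), now)) // true
}
```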
mattjmcnaughton
d33494d459 GetExternalMetricReplicas ignores unready pods
Similar to the change we made for `GetObjectMetricReplicas` in the
previous commit. Ensure that `GetExternalMetricReplicas` does not
include unready pods when it is determining how many replicas it desires.
Including unready pods can lead to over-scaling.

We did not change the behavior of `GetExternalPerPodMetricReplicas`, as
it is slightly less clear what is the desired behavior. We did make some
small naming refactorings to this method, which will make it easier to
ignore unready pods if we decide we want to.
2018-03-13 22:27:28 -04:00
mattjmcnaughton
7e3bce7b3e GetObjectMetricReplicas ignores unready pods
Previously, when `GetObjectMetricReplicas` calculated the desired
replica count, it multiplied the usage ratio by the current number of replicas.
This method caused over-scaling when there were pods that were not ready
for a long period of time. For example, if there were pods A, B, and C,
and only pod A was ready, and the usage ratio was 500%, we would
previously specify 15 pods as the desired replicas (even though really
only one pod was handling the load).

After this change, we now multiply the usage
ratio by the number of ready pods for `GetObjectMetricReplicas`.
In the example above, we'd only desire 5 replica pods.

This change gives `GetObjectMetricReplicas` the same behavior as the
other replica calculator methods. Only `GetExternalMetricReplicas` and
`GetExternalPerPodMetricReplicas` still allow unready pods to impact the
number of desired replicas. I will fix this issue in the following
commit.
2018-03-07 08:13:01 -05:00
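The worked example above (usage ratio 500%, pods A, B, and C with only A ready) can be reproduced in a few lines of Go; the function name and signature here are illustrative, not the actual `GetObjectMetricReplicas` code.

```go
package main

import (
	"fmt"
	"math"
)

// objectMetricReplicas sketches the before/after behavior: the usage ratio
// is multiplied either by all current replicas (pre-change, over-scales) or
// by only the ready replicas (post-change).
func objectMetricReplicas(usageRatio float64, currentReplicas, readyReplicas int, useReadyOnly bool) int {
	base := currentReplicas
	if useReadyOnly {
		base = readyReplicas
	}
	return int(math.Ceil(usageRatio * float64(base)))
}

func main() {
	// Pods A, B, C exist but only A is ready; the usage ratio is 500%.
	fmt.Println(objectMetricReplicas(5.0, 3, 1, false)) // 15 (old behavior)
	fmt.Println(objectMetricReplicas(5.0, 3, 1, true))  // 5  (new behavior)
}
```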
Beata Skiba
e5f8bfa023 Do not count failed pods as unready in HPA controller
Currently, when performing a scale up, any failed pods (which can be present for example in case of evictions performed by kubelet) will be treated as unready. Unready pods are treated as if they had 0% utilization which will slow down or even block scale up.

After this change, failed pods are ignored in all calculations. This way they influence neither scale-up nor scale-down replica calculations.
2018-03-01 16:21:02 +01:00
Aleksandra Malinowska
e58411c600 Implement external metrics in HPA 2018-02-27 14:10:29 +01:00
mattjmcnaughton
abd46684d4 Make HPA tolerance a flag
Fix #18155

Make HPA tolerance configurable as a flag. This change allows us to use
different tolerance values in production/testing.

Signed-off-by: mattjmcnaughton <mattjmcnaughton@gmail.com>
2017-09-28 22:01:51 -04:00
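For context, the tolerance this commit makes configurable amounts to a dead band around a usage ratio of 1.0; below is a minimal sketch, with an assumed flag name and default, of the kind of check it parameterizes.

```go
package main

import (
	"flag"
	"fmt"
	"math"
)

// tolerance sketches the kind of flag the commit introduces; the flag name
// and the 0.1 default here are illustrative assumptions.
var tolerance = flag.Float64("hpa-tolerance", 0.1,
	"minimum change in usage ratio required to trigger scaling")

// withinTolerance reports whether the usage ratio is close enough to 1.0
// that the HPA should leave the current replica count alone.
func withinTolerance(usageRatio float64) bool {
	return math.Abs(usageRatio-1.0) <= *tolerance
}

func main() {
	flag.Parse()
	fmt.Println(withinTolerance(1.05)) // true with the 0.1 tolerance: no scaling
	fmt.Println(withinTolerance(1.30)) // false: scale
}
```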
Solly Ross
c8690f367b Move consumers of autoscaling/v2alpha1 to v2beta1
This commit updates consumers (mainly the HPA controller, but also the
kubectl printers) of autoscaling/v2alpha1 to autoscaling/v2beta1.
2017-09-05 17:49:30 -04:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
bonowang
bbb0365d8d remove useless code 2017-07-07 17:59:44 +08:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
cde4772928 run ./root-rewrite-all-other-apis.sh, then run make all, pkg/... compiles 2017-06-22 11:30:52 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00