Commit Graph

251 Commits

Author SHA1 Message Date
Mike Dame
dd7e81a8cd Add dry run test for hpa v2beta2 2018-08-27 11:37:22 -04:00
Mike Dame
77d7f9cfa2 Generate files and modifications for autoscaling/v2beta2 and custom_metrics/v1beta2 2018-08-27 11:07:53 -04:00
Mike Dame
c7102ee5dc Implement autoscaling/v2beta2 features in HPA controller 2018-08-27 11:07:52 -04:00
Kubernetes Submit Queue
663551bebd
Merge pull request #67252 from jbartosik/metric-sanitization
Automatic merge from submit-queue (batch tested with PRs 66916, 67252, 67794, 67619, 67328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix HPA sample sanitization

**What this PR does / why we need it**: @mwielgus pointed out a case when HPA fails as a result of my changes to HPA algorithm:
- Have pods that use a lot of CPU during initilization, become ready right after they initialize,
- Trigger a scale up,
- When new pods become ready will will count their usage (even though it's not related to any work that needs doing),
- This triggers another scale up, even though existing pods can handle work, no problem.

The fix is:
- Use all samples for non-cpu metrics.
- Only use CPU samples if:
  - Pod is ready and was started more than 2 minutes ago, or
  - Pod is unready and last readiness change happened more than 10s after it was started.

Reasoning behind this in: https://docs.google.com/document/d/1UdtYedhmCxjaJIQi6hwJMY0eHQQKxlVD8lSHZC1BPOA/edit

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
Replace scale up forbidden window with disregarding CPU samples collected when pod was initializing.
```
2018-08-24 15:25:07 -07:00
Joachim Bartosik
4fd6a1684d Make HPA more configurable
Duration of initialization taint on CPU and window of initial readiness
setting controlled by flags.

Adding API violation exceptions following example of e50340ee23
2018-08-24 13:13:02 +02:00
Davanum Srinivas
9b43d97cd4
Add Labels to various OWNERS files
Will reduce the burden of manually adding labels. Information pulled
from:
https://github.com/kubernetes/community/blob/master/sigs.yaml

Change-Id: I17e661e37719f0bccf63e41347b628269cef7c8b
2018-08-21 13:59:08 -04:00
Joachim Bartosik
7d6676eab1 Improve HPA sample sanitization
After my previous changes HPA wasn't behaving correctly in the following
situation:

- Pods use a lot of CPU during initilization, become ready right after they initialize,
- Scale up triggers,
- When new pods become ready HPA counts their usage (even though it's not related to any work that needs doing),
- Another scale up, even though existing pods can handle work, no problem.
2018-08-21 16:22:06 +02:00
liangwenguo
8f8a7bb83f make the log more readable 2018-08-07 10:00:31 +08:00
Joachim Bartosik
7681c284f5 Remove UpscaleForbiddenWindow
Instead discard metric values for pods that are unready and have never
been ready (they may report misleading values, the original reason for
introducing scale up forbidden window).

Use per pod metric when pod is:
- Ready, or
- Not ready but creation timestamp and last readiness change are more
  than 10s apart.

In the latter case we asume the pod was ready but later became unready.
We want to use metrics for such pods because sometimes such pods are
unready because they were getting too much load.
2018-08-01 17:47:23 +02:00
Joachim Bartosik
086ed3c659 Rename desiredReplicas to expectedDesiredReplicas
Naming fields specifying values expected by test as expected.* is a nice
convention to have, lets follow it.
2018-08-01 17:43:01 +02:00
Kubernetes Submit Queue
1e3d23c5c3
Merge pull request #65907 from jbartosik/hpa-improv-refactor-run-test
Automatic merge from submit-queue (batch tested with PRs 64681, 65907). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make runTest easier to understand

Fewer nested conditions, more checking for incorrect looking test cases.

**What this PR does / why we need it**: Make HPA tests easier to understand.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-07-24 16:28:13 -07:00
Joachim Bartosik
3d1b6b0f6e Make runTest easier to understand
Instead of deducing metric type from details of struct describing it
test cases explicitly specify the metric type they use.
2018-07-24 17:27:17 +02:00
Kubernetes Submit Queue
ab00c609ee
Merge pull request #65901 from jbartosik/hpa-improv-refactor-replica-calc-test
Automatic merge from submit-queue (batch tested with PRs 66175, 66324, 65828, 65901, 66350). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Hpa improv refactor replica calc test

**What this PR does / why we need it**: prepareTestClient generates 4 fake clients, using replicaCalcTestCase object. This PR extracts a separate helper for generating each fake independently.

**Which issue(s) this PR fixes**

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-07-18 16:42:18 -07:00
Joachim Bartosik
9b91a89f3d Chop computeReplicasForMetrics to smaller pieces 2018-07-18 17:09:20 +02:00
Joachim Bartosik
4ef9033549 Extract helpers from prepareTestClient
Each of the subhelpers generates one client.
2018-07-18 17:02:30 +02:00
Dr. Stefan Schimanski
d79cf25497 Update external k8s.io/metrics imports 2018-07-02 10:44:18 +02:00
Jeff Grafton
23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Guoliang Wang
afa2a1cfe5 Fixing wrong unit test naming 2018-05-19 08:09:39 +08:00
David Eads
c5445d3c56 simplify api registration 2018-05-08 18:33:50 -04:00
David Eads
9a48066749 update restmapping to indicate fully qualified resource 2018-05-01 16:34:49 -04:00
David Eads
ef0d1ab819 remove incorrect static restmapper 2018-05-01 07:51:17 -04:00
Mikhail Mazurskiy
468655b76a
Use typed events client directly 2018-04-01 18:57:29 +10:00
Kubernetes Submit Queue
3d4cd0ace3
Merge pull request #61362 from bskiba/test-em-missing
Automatic merge from submit-queue (batch tested with PRs 60373, 61098, 61352, 61359, 61362). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add HPA test for FailedGetExternalMetric

**What this PR does / why we need it**:
Add a HPA test for missing external metrics.

**Release note**:

```
NONE
```
2018-03-21 22:39:22 -07:00
Beata Skiba
dd24087aa3 Add test for FailedGetExternalMetric 2018-03-19 15:22:36 +01:00
mattjmcnaughton
d33494d459 GetExternalMetricReplicas ignores unready pods
Similar to the change we made for `GetObjectMetricReplicas` in the
previous commit. Ensure that `GetExternalMetricReplicas` does not
include unready pods when its determining how many replica it desires.
Including unready pods can lead to over-scaling.

We did not change the behavior of `GetExternalPerPodMetricReplicas`, as
it is slightly less clear what is the desired behavior. We did make some
small naming refactorings to this method, which will make it easier to
ignore unready pods if we decide we want to.
2018-03-13 22:27:28 -04:00
mattjmcnaughton
7e3bce7b3e GetObjectMetricReplicas ignores unready pods
Previously, when `GetObjectMetricReplicas` calculated the desired
replica count, it multiplied the usage ratio by the current number of replicas.
This method caused over-scaling when there were pods that were not ready
for a long period of time. For example, if there were pods A, B, and C,
and only pod A was ready, and the usage ratio was 500%, we would
previously specify 15 pods as the desired replicas (even though really
only one pod was handling the load).

After this change, we now multiple the usage
ratio by the number of ready pods for `GetObjectMetricReplicas`.
In the example above, we'd only desire 5 replica pods.

This change gives `GetObjectMetricReplicas` the same behavior as the
other replica calculator methods. Only `GetExternalMetricReplicas` and
`GetExternalPerPodMetricRepliacs` still allow unready pods to impact the
number of desired replicas. I will fix this issue in the following
commit.
2018-03-07 08:13:01 -05:00
Beata Skiba
e5f8bfa023 Do not count failed pods as unready in HPA controller
Currently, when performing a scale up, any failed pods (which can be present for example in case of evictions performed by kubelet) will be treated as unready. Unready pods are treated as if they had 0% utilization which will slow down or even block scale up.

After this change, failed pods are ignored in all calculations. This way they do not influence neither scale up nor scale down replica calculations.
2018-03-01 16:21:02 +01:00
Aleksandra Malinowska
e58411c600 Implement external metrics in HPA 2018-02-27 14:10:29 +01:00
Maciej Pytel
66f4f9080d Add external metrics client to HPA rest client 2018-02-27 14:10:29 +01:00
Jeff Grafton
ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Matt Brown
151a7d2731 correct typo in HorizontalPodAutoscaler status condition
"succesfully" => "successfully"
2018-01-29 13:01:43 -05:00
Cao Shufeng
4e7398b67b remove duplicated import 2018-01-17 09:34:59 +08:00
mattjmcnaughton
5a165b0387 Add test coverage for metrics/utilization.go
Currently, there is no test coverage for this code. Since it does fairly
important calculations, test coverage seems helpful.
2018-01-06 10:26:51 -05:00
mattjmcnaughton
eb688e098f Add RESTClient Custom metrics empty test
Add testing for a previously untested path, which is tested when getting
resource metrics.
2018-01-05 08:40:24 -05:00
mattjmcnaughton
77e651aed1 Clarify error messages in HPA metrics
With the introduction of the RESTMetrics client, there are two ways to
fetch metrics for auto-scaling. However, they previously shared error
messages. This could be misleading. Make the error message more clearly
show which method is in use.
2018-01-03 21:55:59 -05:00
Jeff Grafton
efee0704c6 Autogenerate BUILD files 2017-12-23 13:12:11 -08:00
mattjmcnaughton
e74838b6ab Refactor reconcileAutoscaler method in hpa
There have been a couple of recent bugs in the "normalizing" part of the
`reconcileAutoscaler` method. This part of the code base is responsible
for, among other things, taking the suggested desired replicas based on
the metrics, ensuring it conforms to certain conditions, and updating it
if it does not. Isolate the part that converts the desired replicas
based on a given set of rules into its own function.

We are refactoring this part of the code base to make the logic simpler
and to make it easier to write unit tests.
2017-11-16 09:42:49 -05:00
Dr. Stefan Schimanski
bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski
2b201ead11 Fix and update comment with api.Scheme 2017-10-30 19:54:02 +01:00
Kevin
4c8539cece use core client with explicit version globally 2017-10-27 15:48:32 +08:00
Kubernetes Submit Queue
ca8d97d673 Merge pull request #53743 from DirectXMan12/feature/polymorphic-scale-client
Automatic merge from submit-queue (batch tested with PRs 53743, 53564). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Polymorphic Scale Client

This PR introduces a polymorphic scale client based on discovery information that's able to scale scalable resources in arbitrary group-versions, as long as they present the scale subresource in their discovery information.

Currently, it supports `extensions/v1beta1.Scale` and `autoscaling/v1.Scale`, but supporting other versions of scale if/when we produce them should be fairly trivial.

It also updates the HPA to use this client, meaning the HPA will now work on any scalable resource, not just things in the `extensions/v1beta1` API group.

**Release note**:
```release-note
Introduces a polymorphic scale client, allowing HorizontalPodAutoscalers to properly function on scalable resources in any API group.
```

Unblocks #29698
Unblocks #38756
Unblocks #49504 
Fixes #38810
2017-10-23 13:39:07 -07:00
Solly Ross
d2b41120ea Make HPA controller use polymorphic scale client
This updates the HPA controller to use the polymorphic scale client from
client-go.  This should enable HPAs to work with arbitrary scalable
resources, instead of just those in the extensions API group (meaning we
can deprecate the copy of ReplicationController in extensions/v1beta1).
It also means that the HPA controller now pays attention to the
APIVersion field in `scaleTargetRef` (more specifically, the group part
of it).

Note that currently, discovery information on which resources are
available where is only fetched once (the first time that it's
requested).  In the future, we may want a refreshing discovery REST
mapper.
2017-10-19 13:21:02 -04:00
hzxuzhonghu
96b48d4386 add event broadcaster logging for all contoller managers 2017-10-19 09:18:43 +08:00
Kubernetes Submit Queue
2d914ee703 Merge pull request #53984 from sttts/sttts-legacyscheme
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

pkg/api: extract Scheme/Registry/Codecs into pkg/api/legacyscheme

This serves as

- a preparation for the pkg/api->pkg/apis/core move
- and makes the dependency to the scheme explicit when vizualizing
  left depenncies.

The later helps with our our efforts to split up the monolithic repo
into self-contained sub-repos, e.g. for kubectl, controller-manager
and kube-apiserver in the future.
2017-10-18 10:49:10 -07:00
Dr. Stefan Schimanski
cad0364e73 Update bazel 2017-10-18 17:24:04 +02:00
Dr. Stefan Schimanski
7773a30f67 pkg/api/legacyscheme: fixup imports 2017-10-18 17:23:55 +02:00
Kubernetes Submit Queue
1d8f1e268f Merge pull request #47699 from supereagle/fix-typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typos: remove duplicated word in comments

**What this PR does / why we need it**: Remove the duplicated word `the` in comments

**Which issue this PR fixes** : fixes #

**Special notes for your reviewer**:

```release-note
NONE
```
2017-10-17 02:35:52 -07:00
Kubernetes Submit Queue
03cb11f020 Merge pull request #52275 from mattjmcnaughton/mattjmcnaughton/18155-hpa-tolerance-should-be-flag
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make HPA tolerance a flag

**What this PR does / why we need it**:
Make HPA tolerance configurable as a flag. This change allows us to use
different tolerance values in production/testing.

**Which issue this PR fixes**: 
Fixes #18155

**Release note:**
```release-note
Control HPA tolerance through the `horizontal-pod-autoscaler-tolerance` flag.
```

Signed-off-by: mattjmcnaughton <mattjmcnaughton@gmail.com>
2017-10-16 16:47:43 -07:00
Jeff Grafton
aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00