kubernetes

Author	SHA1	Message	Date
Rinat Shigapov	d55f037b7d	HPA scale-to-zero for custom object/external metrics Add support for scaling to zero pods minReplicas is allowed to be zero condition is set once Based on https://github.com/kubernetes/kubernetes/pull/61423 set original valid condition add scale to/from zero and invalid metric tests Scaling up from zero pods ignores tolerance validate metrics when minReplicas is 0 Document HPA behaviour when minReplicas is 0 Documented minReplicas field in autoscaling APIs	2019-07-16 08:46:21 -05:00
Joseph Burnett	7382fa464d	Add josephburnett to podautoscaler OWNERS.	2019-07-12 10:20:16 +02:00
Kubernetes Prow Robot	b500c740ee	Merge pull request #79859 from sukeesh/hpa-error-log-fix HPA incorrectly reported condition status	2019-07-11 07:28:55 -07:00
Kubernetes Prow Robot	57eef32041	Merge pull request #79657 from josephburnett/hpastuck Ignore unschedulable pods	2019-07-10 11:34:29 -07:00
Joseph Burnett	80e279d353	Ignore pending pods. This change adds pending pods to the ignored set first, before selecting pods missing metrics. Pending pods are always ignored when calculating scale. When the HPA decides which pods and metric values to take into account when scaling, it divides the pods into three disjoint subsets: 1) ready 2) missing metrics and 3) ignored. First the HPA selects pods which are missing metrics. Then it selects pods that should be ignored because they are not ready yet, or are still consuming CPU during initialization. All the remaining pods go into the ready set. After the HPA has decided what direction it wants to scale based on the ready pods, it considers what might have happened if it had the missing metrics. It makes a conservative guess about what the missing metrics might have been: 0% if it wants to scale up, 100% if it wants to scale down. This is a good thing when scaling up, because newly added pods will likely help reduce the usage ratio, even though their metrics are missing at the moment. The HPA should wait to see the results of its previous scale decision before it makes another one. However, when scaling down, it means that many missing metrics can pin the HPA at high scale, even when load is completely removed. In particular, when there are many unschedulable pods due to insufficient cluster capacity, the many missing metrics (assumed to be 100%) can cause the HPA to avoid scaling down indefinitely.	2019-07-10 12:16:33 +02:00
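The grouping order described in 80e279d353 can be sketched roughly as follows. This is a minimal illustration, not the controller's code: the `pod` type, `groupPods`, and `assumedPercent` are hypothetical names, and "not ready" stands in for the fuller "not ready yet, or still consuming CPU during initialization" check.

```go
package main

import "fmt"

// pod is a hypothetical, simplified view of a pod for this sketch.
type pod struct {
	name      string
	pending   bool // not running yet (e.g. unschedulable)
	ready     bool
	hasMetric bool
}

// groupPods splits pods into three disjoint sets. Pending pods are checked
// first, so they land in ignored even though their metrics are missing.
func groupPods(pods []pod) (ready, missing, ignored []string) {
	for _, p := range pods {
		switch {
		case p.pending:
			ignored = append(ignored, p.name)
		case !p.hasMetric:
			missing = append(missing, p.name)
		case !p.ready:
			ignored = append(ignored, p.name)
		default:
			ready = append(ready, p.name)
		}
	}
	return ready, missing, ignored
}

// assumedPercent is the conservative guess for a pod with missing metrics:
// 0% when the HPA wants to scale up, 100% when it wants to scale down.
func assumedPercent(scalingUp bool) int {
	if scalingUp {
		return 0
	}
	return 100
}

func main() {
	pods := []pod{
		{name: "a", ready: true, hasMetric: true},
		{name: "b", pending: true}, // pending and missing metrics
		{name: "c", ready: true},   // running but missing metrics
		{name: "d", hasMetric: true},
	}
	ready, missing, ignored := groupPods(pods)
	fmt.Println("ready:", ready, "missing:", missing, "ignored:", ignored)
	fmt.Println("assumed on scale up:", assumedPercent(true))
	fmt.Println("assumed on scale down:", assumedPercent(false))
}
```

Because pod "b" is classified as ignored rather than missing, it no longer receives the 100% guess that could pin the HPA at high scale.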
Sukeesh	44c3f0105f	fix incorrect hpa status	2019-07-08 17:27:38 +09:00
Joseph Burnett	39c4875321	There are various reasons that the HPA will decide not to change the current scale. Two important ones are when missing metrics might change the direction of scaling, and when the recommended scale is within tolerance of the current scale. The way that ReplicaCalculator signals its desire to not change the current scale is by returning the current scale. However the current scale is from scale.Status.Replicas and can be larger than scale.Spec.Replicas (e.g. during Deployment rollout with configured surge). This causes a positive feedback loop because scale.Status.Replicas is written back into scale.Spec.Replicas, further increasing the current scale. This PR fixes the feedback loop by plumbing the replica count from spec through horizontal.go and replica_calculator.go so the calculator can punt with the right value.	2019-07-02 14:21:32 +02:00
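The feedback loop described in 39c4875321 can be reproduced with a toy simulation. This is a sketch under stated assumptions: `recommend` and `simulate` are hypothetical names, the surge is fixed at one extra pod, and the usage ratio is held at exactly 1.0 so the calculator always "punts".

```go
package main

import (
	"fmt"
	"math"
)

// recommend returns noChange when the usage ratio is within tolerance of 1.0,
// otherwise ceil(usageRatio * readyPods). Hypothetical simplification.
func recommend(noChange int32, usageRatio, tolerance float64, readyPods int32) int32 {
	if math.Abs(usageRatio-1.0) <= tolerance {
		return noChange
	}
	return int32(math.Ceil(usageRatio * float64(readyPods)))
}

// simulate runs several reconcile rounds during a rollout with a surge of one
// pod. When puntWithSpec is false, the status replica count is returned on
// "no change" and written back into spec, which is the bug being described.
func simulate(puntWithSpec bool, rounds int) int32 {
	spec := int32(4)
	const surge = int32(1)
	for i := 0; i < rounds; i++ {
		status := spec + surge // status can exceed spec during a rollout
		noChange := status
		if puntWithSpec {
			noChange = spec
		}
		spec = recommend(noChange, 1.0, 0.1, status)
	}
	return spec
}

func main() {
	fmt.Println("punt with status replicas:", simulate(false, 3))
	fmt.Println("punt with spec replicas:  ", simulate(true, 3))
}
```

With the status value, spec grows by the surge on every round; with the spec value, it stays put.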
waynepeking348	b8b1720f12	Fix bug of ObjectPerPodMetricReplicas to initialize replicaCount with currentReplicas	2019-06-05 11:54:03 +00:00
GuyTempleton	1efbde2815	Handle invalid metrics when scaling on multiple metrics Handle a case in the Horizontal Pod Autoscaler Controller when scaling on multiple metrics and one or more is missing or invalid. If all metrics are missing - return an error and leave the isScalingActive condition as that for the last invalid metric. If some metrics are missing/invalid and some are valid and found - if a scale up would be triggered by the valid metrics ignore the missing metrics and scale up, if a scale down would be triggered, return an error and leave the isScalingActive condition as that for the last invalid metric.	2019-05-29 23:20:40 +01:00
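The decision rule in 1efbde2815 might look roughly like this. It is an illustrative sketch only: `metricResult`, `combine`, and `errMissing` are hypothetical names, and taking the largest proposal among valid metrics, as well as allowing a "no change" result, are assumptions the commit message does not spell out.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissing is a hypothetical error for an unavailable metric.
var errMissing = errors.New("metric unavailable")

// metricResult is the replica proposal (or error) computed for one metric.
type metricResult struct {
	replicas int32
	err      error
}

// combine picks a replica count from several per-metric results.
//   - all metrics invalid: return an error
//   - some invalid, valid ones would scale down: return an error
//   - some invalid, valid ones would scale up: ignore the invalid ones
func combine(current int32, results []metricResult) (int32, error) {
	var best int32
	var lastErr error
	valid := 0
	for _, r := range results {
		if r.err != nil {
			lastErr = r.err
			continue
		}
		valid++
		if r.replicas > best {
			best = r.replicas
		}
	}
	if valid == 0 {
		return 0, fmt.Errorf("no valid metrics: %v", lastErr)
	}
	if lastErr != nil && best < current {
		return 0, fmt.Errorf("refusing to scale down with invalid metrics: %w", lastErr)
	}
	return best, nil
}

func main() {
	up, err := combine(3, []metricResult{{replicas: 5}, {err: errMissing}})
	fmt.Println(up, err)
	down, err := combine(3, []metricResult{{replicas: 2}, {err: errMissing}})
	fmt.Println(down, err)
}
```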
GuyTempleton	ee4dbbcbff	Add tests for handling scaling on unavailable metrics Add three tests for handling invalid metrics when scaling on multiple metrics - one for scaling up successfully (new behaviour) and two for ensuring we don't scale down (existing behaviour).	2019-05-29 23:11:32 +01:00
Kubernetes Prow Robot	4ebe11a6cb	Merge pull request #76110 from DirectXMan12/infra/prune-owners Prune directxman12 from metrics/autoscaling OWNERS	2019-04-29 14:35:36 -07:00
Davanum Srinivas	7b8c9acc09	remove unused code Change-Id: If821920ec8872e326b7d85437ad8d2620807799d	2019-04-19 08:36:31 -04:00
Joel Smith	f50696adda	Fix potential test flakes in HPA tests TestEventNotCreated and TestAvoidUncessaryUpdates Also, re-work the code so that the lock is never held while writing to the chan	2019-04-17 08:10:33 -06:00
Kubernetes Prow Robot	deb48e331a	Merge pull request #76189 from soltysh/fix_legacy_podautoscaler Fix flaky legacy pod autoscaler test	2019-04-05 14:34:05 -07:00
Kubernetes Prow Robot	1cdb4c965a	Merge pull request #74946 from ialidzhikov/clean-ineffectual-assignments Clean ineffectual assignments	2019-04-05 14:33:53 -07:00
Maciej Szulik	bcfd48c29e	Fix flaky legacy pod autoscaler test The reactor in runTest is set to catch all actions, but eventually it only handles CreateAction without checking action type which might fail sometimes when Patch arrives. This fix ensures we handle only the CreateAction.	2019-04-05 13:20:30 +02:00
Solly Ross	837976cb59	Prune directxman12 from metrics/autoscaling OWNERS Since I'm not really working on metrics or autoscaling stuff any more, I figured it was time to remove myself from the approvers list.	2019-04-03 16:24:51 -07:00
ialidzhikov	c3b2fb0d11	Clean ineffectual assignments Signed-off-by: ialidzhikov <i.alidjikov@gmail.com>	2019-03-23 00:27:07 +02:00
Kubernetes Prow Robot	4499275cb9	Merge pull request #72800 from stewart-yu/stewart-component-base Move config local to every controller in KCM	2019-03-21 19:26:19 -07:00
caiweidong	5fa000e5ed	Change log: avoid printing raw JSON response too frequently	2019-03-13 13:07:01 +08:00
stewart-yu	ecbd5427e7	auto-generated file	2019-03-02 12:55:26 +08:00
stewart-yu	e01ff1641c	move config local to every controllers in kube-controller-manager	2019-03-02 12:54:33 +08:00
Jordan Liggitt	d1e865ee34	Update client callers to use explicit versions	2019-02-26 08:36:30 -05:00
Kubernetes Prow Robot	808f2cf0ef	Merge pull request #72525 from justinsb/owners_should_not_be_executable Remove executable file permission from OWNERS files	2019-02-14 23:55:45 -08:00
Roy Lenferink	b43c04452f	Updated OWNERS files to include link to docs	2019-02-04 22:33:12 +01:00
Kubernetes Prow Robot	a4e3a5cb52	Merge pull request #71561 from anjensan/hpa-fix-current-metrics Fix 'currentMetrics' field for HPA with 'AverageValue' target type	2019-02-04 03:34:52 -08:00
Kubernetes Prow Robot	a3f74bd583	Merge pull request #72872 from arjunrn/object-average-value Added functionality for specifying target average value for object me…	2019-02-01 06:31:50 -08:00
Arjun Naik	c99d505001	Added functionality to use target average value for object metrics Signed-off-by: Arjun Naik <arjun.rn@gmail.com>	2019-01-23 21:00:05 +01:00
Justin SB	dd19b923b7	Remove executable file permission from OWNERS files	2019-01-11 16:42:59 -08:00
Krzysztof Jastrzebski	7498c14218	Update comments in Horizontal Pod Autoscaler Controller.	2019-01-07 10:06:21 +01:00
Krzysztof Jastrzebski	c6ebd126a7	Add request processing HPA into the queue after processing is finished. This fixes a bug with skipping request inserted by resync because previous one hasn't processed yet.	2019-01-04 11:59:57 +01:00
Jordan Liggitt	0ff455e340	generated files	2018-12-19 11:19:12 -05:00
Jordan Liggitt	fd9e9b01b1	Remove uses of extensions/v1beta1 clients	2018-12-19 11:18:53 -05:00
danielqsj	3c055aa4b4	Fix typos like limitting	2018-12-04 11:01:40 +08:00
Andrei Zhlobich	a8c58bcd24	Fix updating 'currentMetrics' field for HPA with 'AverageValue' target	2018-11-29 11:50:33 +01:00
Davanum Srinivas	954996e231	Move from glog to klog - Move from the old github.com/golang/glog to k8s.io/klog - klog as explicit InitFlags() so we add them as necessary - we update the other repositories that we vendor that made a similar change from glog to klog * github.com/kubernetes/repo-infra * k8s.io/gengo/ * k8s.io/kube-openapi/ * github.com/google/cadvisor - Entirely remove all references to glog - Fix some tests by explicit InitFlags in their init() methods Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135	2018-11-10 07:50:31 -05:00
Christoph Blecker	97b2992dc1	Update gofmt for go1.11	2018-10-05 12:59:38 -07:00
Joachim Bartosik	7d7c48a647	HPA stabilizes initial recommendation HPA will treat the initial size of the autoscalee as a recommendation, to avoid hastily overriding recommendations made by HPA (if HPA set the size and was then restarted) or by the user (the initial size should be treated as a human-generated recommendation).	2018-09-19 14:54:55 +02:00
Krzysztof Jastrzebski	985ba931b1	Use informer cache instead of active pod gets in HPA controller.	2018-09-05 11:31:27 +02:00
Krzysztof Jastrzebski	958cba1c82	Replace scale down forbidden window Replacement is scale down stabilization window. HPA will scale down only to max of recommendations it made during that window. More details in https://docs.google.com/document/d/1IdG3sqgCEaRV3urPLA29IDudCufD89RYCohfBPNeWIM	2018-08-31 20:24:38 +02:00
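The scale down stabilization window from 958cba1c82 ("scale down only to max of recommendations it made during that window") can be sketched as below. The `recommendation` type, `stabilize`, and `demoStabilize` are hypothetical names, and the 5-minute window in the demo is an arbitrary value chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// recommendation is a hypothetical record of one past HPA recommendation.
type recommendation struct {
	replicas  int32
	timestamp time.Time
}

// stabilize returns the largest value among the new desired count and all
// recommendations made within the window ending at now.
func stabilize(history []recommendation, desired int32, now time.Time, window time.Duration) int32 {
	result := desired
	cutoff := now.Add(-window)
	for _, r := range history {
		if r.timestamp.After(cutoff) && r.replicas > result {
			result = r.replicas
		}
	}
	return result
}

// demoStabilize applies a 5-minute window to a fixed example history.
func demoStabilize(desired int32) int32 {
	now := time.Date(2018, 8, 31, 12, 0, 0, 0, time.UTC)
	history := []recommendation{
		{replicas: 10, timestamp: now.Add(-10 * time.Minute)}, // outside the window
		{replicas: 6, timestamp: now.Add(-3 * time.Minute)},
		{replicas: 4, timestamp: now.Add(-1 * time.Minute)},
	}
	return stabilize(history, desired, now, 5*time.Minute)
}

func main() {
	fmt.Println(demoStabilize(2)) // held at the highest recent recommendation
	fmt.Println(demoStabilize(8)) // a scale up passes through unchanged
}
```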
Kubernetes Submit Queue	2548fb08cd	Merge pull request #68068 from krzysztof-jastrzebski/hpas2 Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. Change CPU sample sanitization in HPA. What this PR does / why we need it: Change CPU sample sanitization in HPA. Ignore samples if: - Pod is being initialized - 5 minutes from start defined by flag - pod is unready - pod is ready but full window of metric hasn't been collected since transition - Pod is initialized - 5 minutes from start defined by flag: - Pod has never been ready after initial readiness period. Release notes: ```release-note Improve CPU sample sanitization in HPA by taking metric's freshness into account. ```	2018-08-31 10:17:44 -07:00
Krzysztof Jastrzebski	5357bf9eac	Change CPU sample sanitization in HPA. Ignore samples if: - Pod is being initialized - 5 minutes from start defined by flag - pod is unready - pod is ready but full window of metric hasn't been collected since transition - Pod is initialized - 5 minutes from start defined by flag: - Pod has never been ready after initial readiness period.	2018-08-30 23:13:14 +02:00
Kubernetes Submit Queue	42c6f1fb28	Merge pull request #67067 from moonek/master Automatic merge from submit-queue (batch tested with PRs 67067, 67947). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. Do not count soft-deleted pods for scaling purposes in HPA controller What this PR does / why we need it: The metrics of "soft-deleted" pods that are about to be deleted should probably not matter for scaling purposes, since they'll be gone "soon", whether they're nodelost or just normally deleted. As long as soft-deleted pods still exist, they prevent normal scale up. Fixes https://github.com/kubernetes/kubernetes/issues/62845 Release note: ```release-note Stop counting soft-deleted pods for scaling purposes in HPA controller to avoid soft-deleted pods incorrectly affecting scale up replica count calculation. ```	2018-08-28 15:08:01 -07:00
moonek	3fedbe48e3	Do not count soft-deleted pods for scaling purposes in HPA controller	2018-08-28 16:27:47 +00:00
Krzysztof Jastrzebski	dfd88dbde0	Remove incorrect glog error from Horizontal Pod Autoscaler.	2018-08-28 09:18:25 +02:00
Mike Dame	dd7e81a8cd	Add dry run test for hpa v2beta2	2018-08-27 11:37:22 -04:00
Mike Dame	77d7f9cfa2	Generate files and modifications for autoscaling/v2beta2 and custom_metrics/v1beta2	2018-08-27 11:07:53 -04:00
Mike Dame	c7102ee5dc	Implement autoscaling/v2beta2 features in HPA controller	2018-08-27 11:07:52 -04:00
Kubernetes Submit Queue	663551bebd	Merge pull request #67252 from jbartosik/metric-sanitization Automatic merge from submit-queue (batch tested with PRs 66916, 67252, 67794, 67619, 67328). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. Fix HPA sample sanitization What this PR does / why we need it: @mwielgus pointed out a case when HPA fails as a result of my changes to HPA algorithm: - Have pods that use a lot of CPU during initialization, become ready right after they initialize, - Trigger a scale up, - When new pods become ready we will count their usage (even though it's not related to any work that needs doing), - This triggers another scale up, even though existing pods can handle work, no problem. The fix is: - Use all samples for non-cpu metrics. - Only use CPU samples if: - Pod is ready and was started more than 2 minutes ago, or - Pod is unready and last readiness change happened more than 10s after it was started. Reasoning behind this in: https://docs.google.com/document/d/1UdtYedhmCxjaJIQi6hwJMY0eHQQKxlVD8lSHZC1BPOA/edit Release note: ```release-note Replace scale up forbidden window with disregarding CPU samples collected when pod was initializing. ```	2018-08-24 15:25:07 -07:00
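The sample filter listed in 663551bebd can be written as a small predicate. This is a sketch of the rule as stated in that message, using its 2-minute and 10-second thresholds; the `sample` type, `useSample`, and `demoSamples` are hypothetical names, and later commits in this log change the rule again.

```go
package main

import (
	"fmt"
	"time"
)

const (
	cpuWarmup      = 2 * time.Minute  // ready pods must be older than this
	readinessDelay = 10 * time.Second // unready pods: last readiness change must be later than this after start
)

// sample is a hypothetical description of one metric sample and its pod.
type sample struct {
	isCPU                     bool
	podReady                  bool
	sinceStart                time.Duration // time elapsed since the pod started
	readinessChangeAfterStart time.Duration // last readiness change, measured from pod start
}

// useSample reports whether a sample should count towards scaling.
func useSample(s sample) bool {
	if !s.isCPU {
		return true // all non-CPU samples are used
	}
	if s.podReady {
		return s.sinceStart > cpuWarmup
	}
	return s.readinessChangeAfterStart > readinessDelay
}

// demoSamples evaluates the rule on five example samples.
func demoSamples() []bool {
	samples := []sample{
		{isCPU: false, sinceStart: 5 * time.Second},
		{isCPU: true, podReady: true, sinceStart: 30 * time.Second},
		{isCPU: true, podReady: true, sinceStart: 3 * time.Minute},
		{isCPU: true, readinessChangeAfterStart: 5 * time.Second},
		{isCPU: true, readinessChangeAfterStart: time.Minute},
	}
	out := make([]bool, 0, len(samples))
	for _, s := range samples {
		out = append(out, useSample(s))
	}
	return out
}

func main() {
	fmt.Println(demoSamples())
}
```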
Joachim Bartosik	4fd6a1684d	Make HPA more configurable Duration of initialization taint on CPU and window of initial readiness setting controlled by flags. Adding API violation exceptions following example of `e50340ee23`	2018-08-24 13:13:02 +02:00

346 Commits