Deployment should ignore Pods that don't match the selector, even if
they have a ControllerRef pointing to one of the ReplicaSets it owns.
The ReplicaSet itself will orphan the Pod as soon as it syncs.
The list functions in deployment/util are used outside the Deployment
controller itself. Therefore, they don't do actual adoption/orphaning.
However, they still need to avoid listing things that don't belong.
Although Deployment already applied its ControllerRef to adopt matching
ReplicaSets, it actually still used label selectors to list objects that
it controls. That meant it didn't actually follow the rules of
ControllerRef, so it could still fight with other controller types.
This should mean that the special handling for overlapping Deployments
is no longer necessary, since each Deployment will only see objects that
it owns (via ControllerRef).
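A minimal sketch of the ControllerRef-based filtering described above, assuming apimachinery's metav1.GetControllerOf helper; the function name is illustrative, not the controller's actual code:
```go
package controllerref

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// ownedBy reports whether obj's controlling OwnerReference points back at
// the given owner UID; listing by ControllerRef rather than by label
// selector is what keeps two controllers from fighting over the same objects.
func ownedBy(obj metav1.Object, ownerUID types.UID) bool {
	ref := metav1.GetControllerOf(obj)
	return ref != nil && ref.UID == ownerUID
}
```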
Automatic merge from submit-queue (batch tested with PRs 31783, 41988, 42535, 42572, 41870)
controller: ensure deployment rollback is re-entrant
Make rollbacks re-entrant in the Deployment controller, otherwise
fast enqueues of a Deployment may end up in undesired behavior
- redundant rollbacks.
Fixes https://github.com/kubernetes/kubernetes/issues/36703
@kubernetes/sig-apps-bugs
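A hedged sketch of the re-entrancy idea, using illustrative types rather than the real Deployment API: once a rollback has been carried out, the pending-rollback marker is cleared, so a fast re-enqueue of the same Deployment finds nothing left to do.
```go
package rollback

// deployment is an illustrative stand-in for the real API object: only the
// pending-rollback marker matters here.
type deployment struct {
	rollbackTo *int64 // pending rollback target revision; nil when none
}

// syncRollback performs at most one rollback per request: once the rollback
// has been applied, the marker is cleared, so a fast re-enqueue of the same
// Deployment becomes a no-op instead of a redundant rollback.
func syncRollback(d *deployment, rollback func(revision int64) error) error {
	if d.rollbackTo == nil {
		return nil // re-entrant call: the rollback was already handled
	}
	if err := rollback(*d.rollbackTo); err != nil {
		return err
	}
	d.rollbackTo = nil
	return nil
}
```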
Automatic merge from submit-queue
kubectl: respect deployment strategy parameters for rollout status
Fixes https://github.com/kubernetes/kubernetes/issues/40496
`rollout status` now respects the strategy parameters for a RollingUpdate Deployment. This means that it will exit as soon as minimum availability is reached for a rollout (note that if you allow maximum availability, `rollout status` will succeed as soon as the new pods are created)
@janetkuo @AdoHe ptal
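A hedged sketch of the availability check described above; the helper name and parameters are illustrative, using apimachinery's intstr utilities to resolve the strategy parameter:
```go
package rolloutstatus

import "k8s.io/apimachinery/pkg/util/intstr"

// minimumAvailabilityReached resolves the RollingUpdate maxUnavailable
// parameter (an integer or a percentage) against the desired replica count
// and reports whether enough replicas are available for `rollout status`
// to consider the rollout successful.
func minimumAvailabilityReached(replicas, availableReplicas int32, maxUnavailable intstr.IntOrString) (bool, error) {
	unavailable, err := intstr.GetValueFromIntOrPercent(&maxUnavailable, int(replicas), false)
	if err != nil {
		return false, err
	}
	minAvailable := int(replicas) - unavailable
	return int(availableReplicas) >= minAvailable, nil
}
```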
Automatic merge from submit-queue
Small fix to the bootstrap TokenCleaner
Accidentally missed setting options, so the TokenCleaner was stuck in a retry loop. Also moved to using an explicit timer over cached values rather than relying on a short resync timeout.
```release-note
```
Putting this in the 1.6 milestone as this is clearly a bug fix in a new feature.
Automatic merge from submit-queue (batch tested with PRs 42456, 42457, 42414, 42480, 42370)
In DaemonSet e2e test, don't check nodes with NoSchedule taints
Fixes #42345
For example, the master node has an ismaster:NoSchedule taint. We don't expect pods to be created there without a toleration.
cc @marun @lukaszo @kargakis @yujuhong @Random-Liu @davidopp @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42456, 42457, 42414, 42480, 42370)
controller: reduce log verbosity for deployments
Fixes https://github.com/kubernetes/kubernetes/issues/41187
Labeling as a bug fix since I think excessive logging should be considered a bug.
@kubernetes/sig-apps-bugs
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
Implement bulk polling of volumes
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
cc @justinsb
Automatic merge from submit-queue (batch tested with PRs 42365, 42429, 41770, 42018, 35055)
controller: statefulsets respect observed generation
StatefulSets do not update ObservedGeneration even though the API field is in place. This means that clients can never be sure whether the StatefulSet controller has observed the latest spec of a StatefulSet.
@kubernetes/sig-apps-bugs
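A minimal sketch of the pattern, with illustrative types: the controller records the generation it has observed, and clients compare it against metadata.generation.
```go
package statefulset

// statefulSetStatus is an illustrative stand-in for the API status struct.
type statefulSetStatus struct {
	ObservedGeneration *int64
	Replicas           int32
}

// recordObservedGeneration is what the sync loop conceptually does once it
// has fully processed a StatefulSet at a given metadata.generation.
func recordObservedGeneration(status *statefulSetStatus, generation int64) {
	status.ObservedGeneration = &generation
}

// specObserved is how a client can tell whether the controller has seen the
// latest spec: the observed generation must have caught up with the
// object's metadata.generation.
func specObserved(status statefulSetStatus, generation int64) bool {
	return status.ObservedGeneration != nil && *status.ObservedGeneration >= generation
}
```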
Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)
RC/RS: Fully Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings ReplicaSet and ReplicationController into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
Although RC/RS had partially implemented ControllerRef, they didn't use it to determine which controller to sync, or to update expectations. This could lead to instability or controllers getting stuck.
Ref: https://github.com/kubernetes/kubernetes/issues/24433
**Special notes for your reviewer**:
**Release note**:
```release-note
```
cc @erictune @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue
HPA Controller: Use Custom Metrics API
This commit switches over the HPA controller to use the custom metrics
API. It also converts the HPA controller to use the generated client
in k8s.io/metrics for the resource metrics API.
In order to enable support, you must enable
`--horizontal-pod-autoscaler-use-rest-clients` on the
controller-manager, which will switch the HPA controller's MetricsClient
implementation over to use the standard rest clients for both custom
metrics and resource metrics. This requires that at the least resource
metrics API is registered with kube-aggregator, and that the controller
manager is pointed at kube-aggregator. For this to work, Heapster
must be serving the new-style API server (`--api-server=true`).
Before this merges, this will need kubernetes/metrics#2 to merge, and a godeps update to pull that in.
It's also semi-dependent on kubernetes/heapster#1537, but that is not required in order for this to merge.
**Release note**:
```release-note
Allow the Horizontal Pod Autoscaler controller to talk to the metrics API and custom metrics API as standard APIs.
```
Automatic merge from submit-queue
statefulset: wait for pvc cache sync
#42056 switched the statefulset controller to use the pvc shared informer/lister, but accidentally left out waiting for its cache to sync.
cc @kubernetes/sig-apps-pr-reviews @kargakis @foxish @kow3ns @smarterclayton @deads2k
Automatic merge from submit-queue
fix rsListerSynced and podListerSynced for DeploymentController
**What this PR does / why we need it**:
There is a mistake when initializing `DeploymentController`'s `rsListerSynced` and `podListerSynced` in `NewDeploymentController`: they are all initialized from the `Deployment`'s `Informer`, so the `DeploymentController` may be running before the `ReplicaSet` cache and `Pod` cache have been synced.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
A corresponding unit test is indeed necessary, but this bug fix is simple; if tests are required, I will submit another PR later.
**Release note**:
```release-note
```
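A hedged sketch of the principle behind the fix: a controller must gate its workers on the HasSynced functions of all the informers it reads from (Deployments, ReplicaSets, and Pods), not just one of them. The stand-in HasSynced funcs are hypothetical.
```go
package main

import (
	"fmt"

	"k8s.io/client-go/tools/cache"
)

func main() {
	stopCh := make(chan struct{})

	// Hypothetical stand-ins for dInformer/rsInformer/podInformer
	// .Informer().HasSynced. The bug was passing the Deployment informer's
	// HasSynced for all three, so the ReplicaSet and Pod caches were never
	// actually waited on.
	deploymentsSynced := cache.InformerSynced(func() bool { return true })
	replicaSetsSynced := cache.InformerSynced(func() bool { return true })
	podsSynced := cache.InformerSynced(func() bool { return true })

	if !cache.WaitForCacheSync(stopCh, deploymentsSynced, replicaSetsSynced, podsSynced) {
		fmt.Println("caches did not sync")
		return
	}
	fmt.Println("all caches synced; safe to start workers")
}
```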
Automatic merge from submit-queue (batch tested with PRs 41234, 42186, 41615, 42028, 41788)
apimachinery: handle duplicated and conflicting type registration
Double registrations were leading to duplications in `KnownKinds()`. Conflicting registrations with the same GVK but different types were not detected.
Automatic merge from submit-queue (batch tested with PRs 41234, 42186, 41615, 42028, 41788)
Make DaemonSet respect critical pods annotation when scheduling
**What this PR does / why we need it**: #41612
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #41612
**Special notes for your reviewer**:
**Release note**:
```release-note
Make DaemonSet respect critical pods annotation when scheduling.
```
cc @kubernetes/sig-apps-feature-requests @erictune @vishh @liggitt @kargakis @lukaszo @piosz @davidopp
Automatic merge from submit-queue
Enforce Node Allocatable via cgroups
This PR enforces node allocatable across all pods using a top level cgroup as described in https://github.com/kubernetes/community/pull/348
This PR also provides an option to enforce `kubeReserved` and `systemReserved` on user specified cgroups.
This PR will by default make kubelet create top level cgroups even if `kubeReserved` and `systemReserved` are not specified and hence `Allocatable = Capacity`.
```release-note
New Kubelet flag `--enforce-node-allocatable` with a default value of `pods` is added which will make kubelet create a top level cgroup for all pods to enforce Node Allocatable. Optionally, `system-reserved` & `kube-reserved` values can also be specified separated by comma to enforce node allocatable on cgroups specified via `--system-reserved-cgroup` & `--kube-reserved-cgroup` respectively. Note the default value of the latter flags are "".
This feature requires a **Node Drain** prior to upgrade failing which pods will be restarted if possible or terminated if they have a `RestartNever` policy.
```
cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-node-feature-requests
TODO:
- [x] Adjust effective Node Allocatable to subtract hard eviction thresholds
- [x] Add unit tests
- [x] Complete pending e2e tests
- [x] Manual testing
- [x] Get the proposal merged
@dashpole is working on adding support for evictions for enforcing Node allocatable more gracefully. That work will show up in a subsequent PR for v1.6
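A worked sketch of the arithmetic this entry enforces, using apimachinery's resource.Quantity; the numbers are made up for illustration.
```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Hypothetical node values, for illustration only.
	capacity := resource.MustParse("16Gi")
	kubeReserved := resource.MustParse("1Gi")
	systemReserved := resource.MustParse("1Gi")
	hardEviction := resource.MustParse("500Mi")

	// Allocatable = Capacity - kube-reserved - system-reserved - hard eviction thresholds.
	allocatable := capacity.DeepCopy()
	allocatable.Sub(kubeReserved)
	allocatable.Sub(systemReserved)
	allocatable.Sub(hardEviction)

	fmt.Printf("pods cgroup limit (Node Allocatable): %s\n", allocatable.String())
}
```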
Automatic merge from submit-queue (batch tested with PRs 42053, 41282, 42056, 41663, 40927)
Fully remove hand-written listers and informers
Note: the first commit is from #41927. Adding do-not-merge for now as we'll want that to go in first, and then I'll rebase this on top.
Update statefulset controller to use a lister for PVCs instead of a client request. Also replace a unit test's dependency on legacylisters with the generated ones. cc @kargakis @kow3ns @foxish @kubernetes/sig-apps-pr-reviews
Remove all references to pkg/controller/informers and pkg/client/legacylisters, and remove those packages.
@smarterclayton @deads2k this should be it!
cc @gmarek @wojtek-t @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41814, 41922, 41957, 41406, 41077)
pv_controller: Do not report exponential backoff as error.
It's not an error when a recycle/delete/provision operation cannot be started
because it has failed recently. It will be restarted automatically when
backoff expires.
This just pollutes logs without any useful information:
```
E0214 08:00:30.428073 77288 pv_controller.go:1410] error scheduling operaion "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]": Failed to create operation with name "delete-pvc-1fa0e8b4-f2b5-11e6-a8bb-fa163ecb84eb[1fbd52ee-f2b5-11e6-a8bb-fa163ecb84eb]". An operation with that name failed at 2017-02-14 08:00:15.631133152 -0500 EST. No retries permitted until 2017-02-14 08:00:31.631133152 -0500 EST (16s). Last error: "Cannot delete the volume \"11a4faea-bfc7-4713-88b3-dec492480dba\", it's still attached to a node".
```
```release-note
NONE
```
@kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
Make DaemonSets survive taint-based evictions when nodes turn unreachable/notReady
**What this PR does / why we need it**:
DaemonPods shouldn't be deleted by NodeController in case of Node problems.
This PR adds infinite tolerations for the Unreachable/NotReady NoExecute taints, so that daemon pods won't be deleted by the NodeController when a node goes unreachable/notReady.
**Which issue this PR fixes** :
fixes #41738
Related PR: #41133
**Special notes for your reviewer**:
**Release note**:
```release-note
Make DaemonSets survive taint-based evictions when nodes turn unreachable/notReady.
```
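A hedged sketch of the tolerations being added, using the core v1 types; the exact taint key names are an assumption here, and TolerationSeconds is deliberately left nil to mean "tolerate forever".
```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// With TolerationSeconds unset, the tolerations never expire, so daemon
	// pods are not evicted when the node turns NotReady or Unreachable.
	tolerations := []v1.Toleration{
		{
			// Assumed key name for the not-ready taint.
			Key:      "node.alpha.kubernetes.io/notReady",
			Operator: v1.TolerationOpExists,
			Effect:   v1.TaintEffectNoExecute,
		},
		{
			// Assumed key name for the unreachable taint.
			Key:      "node.alpha.kubernetes.io/unreachable",
			Operator: v1.TolerationOpExists,
			Effect:   v1.TaintEffectNoExecute,
		},
	}
	fmt.Printf("daemon pods get %d NoExecute tolerations with no TolerationSeconds\n", len(tolerations))
}
```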
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
Modify CronJob API to add job history limits, cleanup jobs in controller
**What this PR does / why we need it**:
As discussed in #34710: this adds two limits to `CronJobSpec` to bound the number of finished jobs created by a CronJob that are kept around.
**Which issue this PR fixes**: fixes #34710
**Special notes for your reviewer**:
cc @soltysh, please have a look and let me know what you think -- I'll then add end to end testing and update the doc in a separate commit. What is the timeline to get this into 1.6?
The plan:
- [x] API changes
- [x] Changing versioned APIs
- [x] `types.go`
- [x] `defaults.go` (nothing to do)
- [x] `conversion.go` (nothing to do?)
- [x] `conversion_test.go` (nothing to do?)
- [x] Changing the internal structure
- [x] `types.go`
- [x] `validation.go`
- [x] `validation_test.go`
- [x] Edit version conversions
- [x] Edit (nothing to do?)
- [x] Run `hack/update-codegen.sh`
- [x] Generate protobuf objects
- [x] Run `hack/update-generated-protobuf.sh`
- [x] Generate json (un)marshaling code
- [x] Run `hack/update-codecgen.sh`
- [x] Update fuzzer
- [x] Actual logic
- [x] Unit tests
- [x] End to end tests
- [x] Documentation changes and API specs update in separate commit
**Release note**:
```release-note
Add configurable limits to CronJob resource to specify how many successful and failed jobs are preserved.
```
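For illustration, a hedged sketch of setting the two limits described at the top of this entry; the modern batch/v1 group is used here as an assumption (the PR added the fields to the then-alpha CronJob API).
```go
package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	cj := batchv1.CronJob{}
	// Keep the three most recent successful Jobs and one failed Job; the
	// controller cleans up anything older than that.
	cj.Spec.SuccessfulJobsHistoryLimit = int32Ptr(3)
	cj.Spec.FailedJobsHistoryLimit = int32Ptr(1)
	fmt.Println(*cj.Spec.SuccessfulJobsHistoryLimit, *cj.Spec.FailedJobsHistoryLimit)
}
```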
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Add storage.k8s.io/v1 API
The v1 API is a direct copy of the v1beta1 API. This v1 API gets installed and exposed in this PR; I tested that kubectl can create both v1beta1 and v1 StorageClass.
~~Rest of Kubernetes (controllers, examples, tests, ...) still uses the v1beta1 API; I will update it when this PR gets merged, as these changes would get lost among the generated code.~~ Most parts use the v1 API now; it would not compile / run tests without it.
**Release note**:
```
Kubernetes API storage.k8s.io for storage objects is now fully supported and is available as storage.k8s.io/v1. Beta version of the API storage.k8s.io/v1beta1 is still available in this release, however it will be removed in a future Kubernetes release.
Together with the API endpoint, StorageClass annotation "storageclass.beta.kubernetes.io/is-default-class" is deprecated and "storageclass.kubernetes.io/is-default-class" should be used instead to mark a default storage class. The beta annotation is still working in this release, however it won't be supported in the next one.
```
@kubernetes/sig-storage-misc
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
I think this can wait until master is open for 1.7 pulls, given that we're close to the 1.6 freeze.
After this and #41482 go in, the only code left that references legacylisters will be federation, and 1 bit in a stateful set unit test (which I'll clean up in a follow-up).
@resouer I imagine this will conflict with your equivalence class work, so one of us will be doing some rebasing 😄
cc @wojtek-t @gmarek @timothysc @jayunit100 @smarterclayton @deads2k @liggitt @sttts @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)
controller: fix requeueing progressing deployments
Drop the secondary queue and add either ratelimited or after the
required amount of time that we need to wait directly in the main
queue. In this way we can always be sure that we will sync back
the Deployment if its progress has yet to resolve into a complete
(NewReplicaSetAvailable) or TimedOut condition.
This should also simplify the deployment controller a bit.
Fixes https://github.com/kubernetes/kubernetes/issues/39785. Once this change soaks, I will move the test out of the flaky suite.
@kubernetes/sig-apps-misc
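A minimal sketch of the requeueing pattern described above, using client-go's workqueue; the key and delay are illustrative.
```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	defer queue.ShutDown()

	key := "default/my-deployment" // illustrative deployment key

	// Instead of a secondary queue, requeue into the main queue either
	// rate-limited (on errors) ...
	queue.AddRateLimited(key)
	// ... or after exactly the time left until the progress deadline would
	// be exceeded, so the Deployment is re-synced until its progress
	// resolves into a complete or timed-out condition.
	queue.AddAfter(key, 30*time.Second)

	fmt.Println("deployment key queued for a later sync:", key)
}
```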
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)
Refactor ControllerRefManager
**What this PR does / why we need it**:
To prepare for implementing ControllerRef across all controllers (https://github.com/kubernetes/community/pull/298), this pushes the common adopt/orphan logic into ControllerRefManager so each controller doesn't have to duplicate it.
This also shares the adopt/orphan logic between Pods and ReplicaSets, so it lives in only one place.
**Which issue this PR fixes**:
**Special notes for your reviewer**:
**Release note**:
```release-note
```
cc @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361)
Allow multiple mounts in StatefulSet volume zone placement
We have some heuristics that ensure that volumes (and hence stateful set
pods) are spread out across zones. Sadly they forgot to account for
multiple mounts. This PR updates the heuristic to ignore the mount name
when we see something that looks like a statefulset volume, thus
ensuring that multiple mounts end up in the same AZ.
Fix #35695
```release-note
Fix zone placement heuristics so that multiple mounts in a StatefulSet pod are created in the same zone
```
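A hedged sketch of the heuristic: when a PVC name looks like `<mount>-<statefulset>-<ordinal>`, the zone is keyed on the trailing name-ordinal portion so every mount of the same pod lands in the same zone. The parsing and names here are illustrative, not the real provisioning code.
```go
package main

import (
	"fmt"
	"regexp"
)

// trailingNameOrdinal matches the "<name>-<ordinal>" suffix that StatefulSet
// PVC names end with.
var trailingNameOrdinal = regexp.MustCompile(`([^-]+-\d+)$`)

// zoneKey returns the string used to pick a zone for a volume. For
// StatefulSet-looking claims it ignores the mount name, so "data-web-0"
// and "logs-web-0" both key on "web-0" and end up in the same zone.
func zoneKey(pvcName string) string {
	if m := trailingNameOrdinal.FindStringSubmatch(pvcName); m != nil {
		return m[1]
	}
	return pvcName
}

func main() {
	fmt.Println(zoneKey("data-web-0")) // web-0
	fmt.Println(zoneKey("logs-web-0")) // web-0
}
```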
Automatic merge from submit-queue (batch tested with PRs 41146, 41486, 41482, 41538, 41784)
Switch statefulset controller to shared informers
Originally part of #40097
I *think* the controller currently makes a deep copy of a StatefulSet before it mutates it, but I'm not 100% sure. For those who are most familiar with this code, could you please confirm?
@beeps @smarterclayton @ingvagabund @sttts @liggitt @deads2k @kubernetes/sig-apps-pr-reviews @kubernetes/sig-scalability-pr-reviews @timothysc @gmarek @wojtek-t
Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373)
Fix deployment helper - no assumptions on only one new ReplicaSet
#40415
**Release note**:
```release-note
NONE
```
@kubernetes/sig-apps-bugs
Conversions can mutate the underlying object (and ours were).
Make a deepcopy before our first conversion at the very start
of the reconciler method in order to avoid mutating the shared
informer cache during conversion.
Fixes #41768
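A minimal sketch of the pattern, with a hypothetical object type: copy before mutating or converting anything that came out of a shared informer cache.
```go
package cachecopy

// widget is an illustrative stand-in for an API object held in a shared
// informer cache.
type widget struct {
	Name   string
	Labels map[string]string
}

func (w *widget) deepCopy() *widget {
	out := &widget{Name: w.Name, Labels: map[string]string{}}
	for k, v := range w.Labels {
		out.Labels[k] = v
	}
	return out
}

// reconcile must never mutate `cached` directly: that object is shared with
// every other consumer of the informer cache. Work on a private copy, and
// only run conversions or updates against the copy.
func reconcile(cached *widget) *widget {
	obj := cached.deepCopy()
	obj.Labels["converted"] = "true" // safe: only the copy is mutated
	return obj
}
```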
Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
changes to cleanup the volume plugin for recycle
**What this PR does / why we need it**:
Code cleanup. Changing from creating a new interface from the plugin, that then calls a function to recycle a volume, to adding the function to the plugin itself.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #26230
**Special notes for your reviewer**:
Took same approach from closed PR #28432.
Do you want the approach to be the same for NewDeleter(), NewMounter(), and NewUnMounter(), and should they be in this same PR, or should I submit separate PRs for those?
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Make controller-manager resilient to stale serviceaccount tokens
Now that the controller manager is spinning up controller loops using service accounts, we need to be more proactive in making sure the clients will actually work.
Future additional work:
* make a controller that reaps invalid service account tokens (c.f. https://github.com/kubernetes/kubernetes/issues/20165)
* allow updating the client held by a controller with a new token while the controller is running (c.f. https://github.com/kubernetes/kubernetes/issues/4672)
Automatic merge from submit-queue
Convert HPA controller to support HPA v2 mechanics
This PR converts the HPA controller to support the mechanics from HPA v2.
The HPA controller continues to make use of the HPA v1 client, but utilizes
the conversion logic to work with autoscaling/v2alpha1 objects internally.
It is the follow-up PR to #36033 and part of kubernetes/features#117.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Switch service controller to shared informers
Originally part of #40097
cc @deads2k @smarterclayton @gmarek @wojtek-t @timothysc @sttts @liggitt @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41604, 41273, 41547)
Switch pv controller to shared informer
This is WIP because I still need to sort out the bazel files and add 'get storageclasses' to the controller-manager RBAC role.
@jsafrane PTAL and make sure I did not break anything in the PV controller. Do we need to clone the volumes/claims we get from the shared informer before we use them? I could not find a place where we modify them but you would know for certain.
cc @ncdc because I copied what you did in your other PRs.
Automatic merge from submit-queue (batch tested with PRs 41517, 41494, 41163)
Deployment: filter out old RSes that are deleted or with non-zero replicas before cleanup
Fixes #36379
cc @zmerlynn @yujuhong @kargakis @kubernetes/sig-apps-bugs
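A hedged sketch of the filtering, with illustrative types: replica sets that are already being deleted, or that still have replicas, are excluded before cleanup considers them against the revision history limit.
```go
package cleanup

import "time"

// replicaSet is an illustrative stand-in; only the fields the filter needs.
type replicaSet struct {
	Name              string
	Replicas          int32
	DeletionTimestamp *time.Time
}

// filterCleanupCandidates keeps only the old replica sets that are safe to
// delete: not already terminating and scaled down to zero replicas.
func filterCleanupCandidates(oldRSs []*replicaSet) []*replicaSet {
	var out []*replicaSet
	for _, rs := range oldRSs {
		if rs.DeletionTimestamp != nil {
			continue // already being deleted
		}
		if rs.Replicas != 0 {
			continue // still owns pods; not eligible for history cleanup
		}
		out = append(out, rs)
	}
	return out
}
```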
This commit converts the HPA controller over to using the new version of
the HorizontalPodAutoscaler object found in autoscaling/v2alpha1. Note
that while the autoscaler will accept requests for object metrics, the
scale client will return an error on attempts to get object metrics
(since that requires the new custom metrics API, which is not yet
implemented).
This also enables the HPA object in v2alpha1 as a retrievable API
version by default.
Automatic merge from submit-queue
Remove alpha provisioning
This is the first part of https://github.com/kubernetes/features/issues/36
@kubernetes/sig-storage-misc
**Release note**:
```release-note
Alpha version of dynamic volume provisioning is removed in this release. Annotation
"volume.alpha.kubernetes.io/storage-class" does not have any special meaning. A default storage class
and DefaultStorageClass admission plugin can be used to preserve similar behavior of Kubernetes cluster,
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
Automatic merge from submit-queue
Switch serviceaccounts controller to generated shared informers
Originally part of #40097
cc @deads2k @sttts @liggitt @smarterclayton @gmarek @wojtek-t @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Switch resourcequota controller to shared informers
Originally part of #40097
I have had some issues with this change in the past, when I updated `pkg/quota` to use the new informers while `pkg/controller/resourcequota` remained on the old informers. In this PR, both are switched to using the new informers. The issues in the past were lots of flakey test failures in the ResourceQuota e2es, where it would randomly fail to see deletions and handle replenishment. I am hoping that now that everything here is consistently using the new informers, there won't be any more of these flakes, but it's something to keep an eye out for.
I also think `pkg/controller/resourcequota` could be cleaned up. I don't think there's really any need for `replenishment_controller.go` any more since it's no longer running individual controllers per kind to replenish. It instead just uses the shared informer and adds event handlers to it. But maybe we do that in a follow up.
cc @derekwaynecarr @smarterclayton @wojtek-t @deads2k @sttts @liggitt @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 41196, 41252, 41300, 39179, 41449)
controller: cleanup workload controllers a bit
* Switches glog.Errorf to utilruntime.HandleError in DS and RC controllers
* Drops a couple of unused variables in the DS, SS, and Deployment controllers
* Updates some comments
@kubernetes/sig-apps-misc
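A minimal sketch of the first bullet's pattern, using apimachinery's utilruntime helper instead of logging directly; the sync function is hypothetical.
```go
package main

import (
	"fmt"

	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
)

func syncSomething() error {
	return fmt.Errorf("example sync failure")
}

func main() {
	if err := syncSomething(); err != nil {
		// Instead of glog.Errorf(...): HandleError funnels the error through
		// the registered error handlers (logging with basic rate limiting by
		// default), so controllers report errors consistently.
		utilruntime.HandleError(fmt.Errorf("syncing something: %v", err))
	}
}
```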
Automatic merge from submit-queue (batch tested with PRs 41115, 41212, 41346, 41340, 41172)
Enable PodTolerateNodeTaints predicate in DaemonSet controller
Ref #28687, this enables the PodTolerateNodeTaints predicate to the daemonset controller
cc @Random-Liu @dchen1107 @davidopp @mikedanese @kubernetes/sig-apps-pr-reviews @kubernetes/sig-node-pr-reviews @kargakis @lukaszo
```release-note
Make DaemonSet controller respect node taints and pod tolerations.
```
Automatic merge from submit-queue (batch tested with PRs 41137, 41268)
Allow the CertificateController to use any Signer implementation.
**What this PR does / why we need it**:
This will allow developers to create `CertificateController`s with arbitrary `Signer`s, instead of forcing the use of `CFSSLSigner`. It matches the behavior of allowing an arbitrary `AutoApprover` to be passed in the constructor.
**Release note**:
```release-note
NONE
```
CC @mikedanese
Automatic merge from submit-queue (batch tested with PRs 41248, 41214)
Switch hpa controller to shared informer
**What this PR does / why we need it**: switch the hpa controller to use a shared informer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: Only the last commit is relevant. The others are from #40759, #41114, #41148
**Release note**:
```release-note
```
cc @smarterclayton @deads2k @sttts @liggitt @DirectXMan12 @timothysc @kubernetes/sig-scalability-pr-reviews @jszczepkowski @mwielgus @piosz
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute Taints. This feature is beta (as is the rest of taints) and enabled by default. It's gated by the controller-manager's enable-taint-manager flag.
```
Automatic merge from submit-queue (batch tested with PRs 41112, 41201, 41058, 40650, 40926)
Promote TokenReview to v1
Peer to https://github.com/kubernetes/kubernetes/pull/40709
We have multiple features that depend on this API:
- [webhook authentication](https://kubernetes.io/docs/admin/authentication/#webhook-token-authentication)
- [kubelet delegated authentication](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authentication)
- add-on API server delegated authentication
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating the token
This PR promotes the existing v1beta1 API to v1 with no changes
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authentication.k8s.io API group was promoted to v1
```
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
StatefulSet hardening
**What this PR does / why we need it**:
This PR contains the following changes to StatefulSet. Only one change affects the semantics of how the controller operates (this is described in #38418), and that change only brings the controller into conformance with its documented behavior.
1. pcb and pcb controller are removed and their functionality is encapsulated in StatefulPodControlInterface. This class models the design of controller.PodControlInterface and provides an abstraction over clientset.Interface which is useful for testing purposes.
2. IdentityMappers has been removed to clarify what properties of a Pod are mutated by the controller. All mutations are performed in the UpdateStatefulPod method of the StatefulPodControlInterface.
3. The statefulSetIterator and petQueue classes are removed. These classes sorted Pods by CreationTimestamp. This is brittle and not resilient to clock skew. The current control loop, which implements the same logic, is in stateful_set_control.go. The Pods are now sorted and considered by their ordinal indices, as is outlined in the documentation.
4. StatefulSetController now checks to see if the Pods matching a StatefulSet's Selector also match the Name of the StatefulSet. This will make the controller resilient to overlapping StatefulSets, and will be enhanced by the addition of ControllerRefs.
5. The total lines of production code have been reduced, and the total number of unit tests has been increased. All new code has 100% unit coverage giving the module 83% coverage. Tests for StatefulSetController have been added, but it is not practical to achieve greater coverage in unit testing for this code (the e2e tests for StatefulSet cover these areas).
6. Issue #38418 is fixed in that StatefulSet will ensure that all Pods that are predecessors of another Pod are Running and Ready prior to launching a new Pod. This removes the potential for deadlock when a Pod needs to be rescheduled while its predecessor is hung in Pending or Initializing.
7. All reference to pet have been removed from the code and comments.
**Which issue this PR fixes**
fixes #38418, #36859
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue #38418 which, under certain circumstances, could cause StatefulSet to deadlock.
Mediates issue #36859. StatefulSet only acts on Pods whose identity matches the StatefulSet, providing a partial mediation for overlapping controllers.
```
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
Implement TTL controller and use the ttl annotation attached to node in secret manager
For every secret attached to a pod as a volume, Kubelet tries to refresh it every sync period. Currently Kubelet has a TTL cache of the secrets of its pods and the TTL is set to 1 minute. That means that in the large clusters we are targeting (5k nodes, 30 pods/node), given that each pod has a secret associated with the ServiceAccount from its namespace, and with a large enough number of namespaces (where on each node (almost) every pod is from a different namespace), this results in ~30 GETs per node to refresh all secrets every minute, which gives ~2500 QPS for GET secrets to the apiserver.
Apiserver cannot keep up with it very easily.
The desired solution would be to watch for secret changes, but for security reasons we don't want a node watching all secrets, and it is not possible for now to watch only for secrets attached to pods on a given node.
So as a temporary solution, we are introducing an annotation that serves as a suggestion to the kubelet for the TTL of secrets in the cache, and a very simple controller that sets this annotation based on the cluster size (the larger the cluster, the bigger the TTL).
That workaround means that only very local changes are needed in Kubelet, we are creating a well-separated, very simple controller, and once watching only a node's own secrets becomes possible it will be easy to remove the controller and switch to that. It will also allow us to reach our scalability goals.
@dchen1107 @thockin @liggitt
Automatic merge from submit-queue (batch tested with PRs 40917, 41181, 41123, 36592, 41183)
Set all node conditions to Unknown when node is unreachable
**What this PR does / why we need it**:
Sets all node conditions to Unknown when node does not report status/unreachable
**Which issue this PR fixes**
fixes https://github.com/kubernetes/kubernetes/issues/36273
Automatic merge from submit-queue (batch tested with PRs 41037, 40118, 40959, 41084, 41092)
Switch CSR controller to use shared informer
Switch the CSR controller to use a shared informer. Originally part of #40097 but I'm splitting that up into multiple PRs.
I have added a test to try to ensure we don't mutate the cache. It could use some fleshing out for additional coverage but it gets the initial job done, I think.
cc @mikedanese @deads2k @liggitt @sttts @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Update owners file for job and cronjob controller
I've just noticed we have outdated OWNERS files for job and cronjob controllers.
@erictune ptal
@kubernetes/sig-contributor-experience-pr-reviews fyi
Automatic merge from submit-queue (batch tested with PRs 40345, 38183, 40236, 40861, 40900)
refactor approver and signer interfaces to be consistent w.r.t. apiserver interaction
This makes it so that only the controller loop talks to the
API server directly. The signatures for Sign and Approve also
become more consistent, while allowing the Signer to report
conditions (which it wasn't able to do before).
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Promote SubjectAccessReview to v1
We have multiple features that depend on this API:
SubjectAccessReview
- [webhook authorization](https://kubernetes.io/docs/admin/authorization/#webhook-mode)
- [kubelet delegated authorization](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authorization)
- add-on API server delegated authorization
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating access
- A typo was discovered in the SubjectAccessReviewSpec Groups field name
This PR promotes the existing v1beta1 API to v1, with the only change being the typo fix to the groups field. (fixes https://github.com/kubernetes/kubernetes/issues/32709)
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authorization.k8s.io API group was promoted to v1
```
Automatic merge from submit-queue
federation: Refactoring namespaced resources deletion code from kube ns controller and sharing it with fed ns controller
Ref https://github.com/kubernetes/kubernetes/issues/33612
Refactoring the code in the kube namespace controller that deletes all resources in a namespace when the namespace is deleted. Refactored this code into a separate NamespacedResourcesDeleter class and called it from the federation namespace controller.
This is required for enabling cascading deletion of namespaced resources in federation apiserver.
Before this PR, we were directly deleting the namespaced resources and assuming that they go away immediately. With cascading deletion, we will have to wait for the corresponding controllers to first delete the resources from underlying clusters and then delete the resource from federation control plane. NamespacedResourcesDeleter has this waiting logic.
cc @kubernetes/sig-federation-misc @caesarxuchao @derekwaynecarr @mwielgus
Automatic merge from submit-queue
Replace hand-written informers with generated ones
Replace existing uses of hand-written informers with generated ones.
Follow-up commits will switch the use of one-off informers to shared
informers.
This is a precursor to #40097. That PR will switch one-off informers to shared informers for the majority of the code base (but not quite all of it...).
NOTE: this does create a second set of shared informers in the kube-controller-manager. This will be resolved back down to a single factory once #40097 is reviewed and merged.
There are a couple of places where I expanded the # of caches we wait for in the calls to `WaitForCacheSync` - please pay attention to those. I also added in a commented-out wait in the attach/detach controller. If @kubernetes/sig-storage-pr-reviews is ok with enabling the waiting, I'll do it (I'll just need to tweak an integration test slightly).
@deads2k @sttts @smarterclayton @liggitt @soltysh @timothysc @lavalamp @wojtek-t @gmarek @sjenning @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Update daemon set controller OWNERS file
Adding myself as reviewer, adding @mikedanese as approver
cc @kargakis @lukasredynk
Automatic merge from submit-queue (batch tested with PRs 35782, 35831, 39279, 40853, 40867)
genericapiserver: cut off more dependencies – episode 7
Follow-up of https://github.com/kubernetes/kubernetes/pull/40822
approved based on #40363
Automatic merge from submit-queue (batch tested with PRs 40855, 40859)
PV binding: send an event when there are no PVs to bind
This is similar to the scheduler, which says "no nodes available to schedule pods"
when it can't schedule a pod.
@kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 40810, 40695)
Prevent pv controller from forcefully overwrite provisioned volume name
**What this PR does / why we need it**:
This PR adds a fix that prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name. Instead, it overwrites the volume's name only when it is missing. This allows dynamic provisioner implementers to set the name of the volume to a value they choose.
**Which issue this PR fixes**
This PR does not have an affiliated issue, but it will allow PR #38924 to properly implement dynamically provisioned volumes in namespaces other than default.
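A minimal sketch of the behavior change, with an illustrative volume type: the generated PV name is used only as a fallback when the provisioner did not set one.
```go
package pvcontroller

// persistentVolume is an illustrative stand-in; only the relevant field.
type persistentVolume struct {
	Name string
}

// applyGeneratedName keeps a provisioner-chosen volume name and only fills
// in the controller-generated PV name when the volume came back nameless.
// Previously the controller overwrote the name unconditionally, discarding
// whatever the dynamic provisioner had chosen.
func applyGeneratedName(volume *persistentVolume, generatedName string) {
	if volume.Name == "" {
		volume.Name = generatedName
	}
}
```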
Automatic merge from submit-queue (batch tested with PRs 39169, 40719, 38954, 40808, 40689)
Add StatefulSets checks at Service level
Hi!
Please let me propose a very small e2e test suite enhancement.
This PR removes a `TODO` about checking the governing service at the unit-test level (which is hard) and adds this check to the e2e test suite.
Thanks
Sebastian
Automatic merge from submit-queue (batch tested with PRs 34543, 40606)
sync client-go and move util/workqueue
The vision of client-go is that it provides enough utilities to build a reasonable controller. It has been copying `util/workqueue`. This makes it authoritative.
@liggitt I'm getting really close to making client-go authoritative ptal.
approved based on https://github.com/kubernetes/kubernetes/issues/40363
Automatic merge from submit-queue
controller: don't run informers in unit tests when unnecessary
Fixes https://github.com/kubernetes/kubernetes/issues/39908
@mfojtik it seems that using informers makes the deployment controller sync on the initial relist, so this races with the enqueue that these tests are testing.
Automatic merge from submit-queue
Decrease Daemonset burst replicas due to DoS conditions.
**What this PR does / why we need it**:
We are seeing DoS conditions on our Registry if we're running a large cluster with too many daemonsets bursting at once.
**Special notes for your reviewer**:
I decided not to plumb through yet another variable to the command line. Ideally such parameters could be tweaked via a configuration file.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40126, 40565, 38777, 40564, 40572)
Do not swallow error in asw.updateNodeStatusUpdateNeeded
Ref #39056
Bubble the error up to `SetNodeUpdateStatusNeeded` and log it out.
NOTE: This does not modify interface of `SetNodeUpdateStatusNeeded`
Automatic merge from submit-queue (batch tested with PRs 40239, 40397, 40449, 40448, 40360)
move the discovery and dynamic clients
Moved the dynamic client, discovery client, testing/core, and testing/cache to `client-go`. Dependencies on API groups we don't have generated clients for (federation, kubeadm, and imagepolicy) have dropped out.
@caesarxuchao @sttts
approved based on https://github.com/kubernetes/kubernetes/issues/40363
Automatic merge from submit-queue (batch tested with PRs 39538, 40188, 40357, 38214, 40195)
genericapiserver: cut off more dependencies – episode 2
Compare commit subjects.
approved based on #40363
Automatic merge from submit-queue (batch tested with PRs 39064, 40294)
Refactor persistent volume tests
This is an attempt to make the binder tests a bit more concise. The PVCs are being created by a "templating" function. There is also a handful of PVs in the tests, but those vary quite a bit more and I don't think a similar approach would save us much code.
Reference:
https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29006#-KPJuVeDE0O6TvDP9jia
@jsafrane: I hope this is what you have on mind.
Automatic merge from submit-queue (batch tested with PRs 40196, 40143, 40277)
Emit warning event when CronJob cannot determine starting time
**What this PR does / why we need it**:
In #39608, we've modified the error message for when a CronJob has too many unmet starting times to enumerate when figuring out the next starting time. This makes it more "actionable", and the user can now set a deadline to avoid running into this. However, the error message is still only at the controller level AFAIK and thus not exposed to the user. From the user's perspective, there is no way to tell why the CronJob is not scheduling the next instance.
The PR adds a warning event in addition to the error in the controller manager's log.
**Which issue this PR fixes**: This is an addition to PR #39608 regarding #36311.
**Special notes for your reviewer**: cc @soltysh
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40232, 40235, 40237, 40240)
Fixup pet terminology in log and user-facing events
**What this PR does / why we need it**:
Removes some user-facing strings for pet terminology.
Automatic merge from submit-queue
make client-go more authoritative
Builds on https://github.com/kubernetes/kubernetes/pull/40103
This moves a few more support package to client-go for origination.
1. restclient/watch - nodep
1. util/flowcontrol - used interface
1. util/integer, util/clock - used in controllers and in support of util/flowcontrol
Automatic merge from submit-queue
controller: decouple cleanup policy from deployment strategies
Deployments get cleaned up only when they are paused, they get scaled up/down,
or when the strategy that drives rollouts completes. This means that stuck
deployments that fall into none of the above categories will not get cleaned
up. Since cleanup is already safe by itself (we only delete old replica sets
that are synced by the replica set controller and have no replicas) we can
execute it for every deployment when there is no intention to rollback.
Fixes https://github.com/kubernetes/kubernetes/issues/40068
Deployments get cleaned up only when they are paused, they get scaled up/down,
or when the strategy that drives rollouts completes. This means that stuck
deployments that fall into none of the above categories will not get cleaned
up. Since cleanup is already safe by itself (we only delete old replica sets
that are synced by the replica set controller and have no replicas) we can
execute it for every deployment when there is no intention to rollback.
Automatic merge from submit-queue (batch tested with PRs 34763, 38706, 39939, 40020)
Use Statefulset instead in e2e and controller
Quick fix ref: #35534
We should finish the issue to meet v1.6 milestone.
Automatic merge from submit-queue
genericapiserver: cut off pkg/serviceaccount dependency
**Blocked** by pkg/api/validation/genericvalidation to be split up and moved into apimachinery.
Automatic merge from submit-queue
Do not list CronJob unmet starting times beyond deadline
**What this PR does / why we need it**:
See #36311. `getRecentUnmetScheduleTimes` gives up after 100 unmet times to avoid wasting too much CPU or memory generating all the times, as it generates them sequentially.
When concurrency is forbidden, this is conceptually un-necessary: we only need the last unmet start time. This suggests that when concurrency is forbidden, we could generate times by going backward in time from now. This is not very practical as CronJob currently relies on a package that only provides `Next` and no `Prev`. Hand-cooking a `Prev` does not seem like a good idea. I could submit a PR to the cron library to add a `Prev` method, and use that when concurrency is forbidden through something like `getLastUnmetScheduleTime`. This would be `O(1)` and there would be no limit involved.
(edit: actually, even for the other concurrency settings, we only start the last unmet start times -- there is a `TODO` in the controller to actually start all of them, but that is not implemented at the moment. This means the solution would apply, at least temporarily, to all concurrency settings).
cc @soltysh what do you think?
In the meantime, I would suggest doing something simple. Currently, the user has no way to configure anything to ensure that his CronJob will not get stuck if one job takes more than 100 unmet times.
`getRecentUnmetScheduleTimes` starts with an initial time corresponding to the last start (or to the creation of the CronJob, if nothing has started yet). However, when `StartingDeadlineSeconds` is set, the controller will not start anything that is older than the deadline, so if the last start is way beyond the deadline, we are generating potentially lots of unmet start times that will not be considered by the scheduler for scheduling anyway.
Consider a job running every minute, where the last instance has taken 120 minutes. This means there are more than 100 unmet times when we start counting from the last start time.
**The PR makes `getRecentUnmetScheduleTimes` only consider times that do not fall beyond the deadline.** Here, the CronJob can be configured with a `StartingDeadlineSeconds` of, say, 10 minutes. After the 120min job has run, `getRecentUnmetScheduleTimes` will only consider the times in the last 10 minutes from now, and will not get stuck.
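A hedged sketch of the bounded enumeration, using the same robfig/cron package as the benchmark below; the function name and the 100-times cap are illustrative of the controller logic, not the exact code.
```go
package main

import (
	"fmt"
	"time"

	"github.com/robfig/cron"
)

// unmetScheduleTimes lists start times that were missed between earliest and
// now. When a starting deadline is set, anything older than the deadline is
// skipped up front, since the controller would refuse to start it anyway.
func unmetScheduleTimes(schedule string, earliest, now time.Time, startingDeadline *time.Duration) ([]time.Time, error) {
	sched, err := cron.ParseStandard(schedule)
	if err != nil {
		return nil, err
	}
	if startingDeadline != nil {
		if deadlineStart := now.Add(-*startingDeadline); deadlineStart.After(earliest) {
			earliest = deadlineStart
		}
	}
	var times []time.Time
	for t := sched.Next(earliest); t.Before(now); t = sched.Next(t) {
		times = append(times, t)
		if len(times) > 100 {
			return nil, fmt.Errorf("too many missed start times to enumerate")
		}
	}
	return times, nil
}

func main() {
	deadline := 10 * time.Minute
	times, err := unmetScheduleTimes("*/1 * * * ?", time.Now().Add(-2*time.Hour), time.Now(), &deadline)
	fmt.Println(len(times), err) // only ~10 missed times considered, not ~120
}
```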
As a side note on the max. number of unmet times to use as limits in terms of CPU used by the controller: I have run a quick benchmark on my i7 mac. Schedules corresponding to "once a week" tend to be more expensive to generate unmet times for. Just FYI.
```
+--------------+---------------+--------------+
| SCHEDULE | MISSED STARTS | TIMING |
+--------------+---------------+--------------+
| */1 * * * ? | 100 | 383.645µs |
| */30 * * * ? | 100 | 354.765µs |
| 30 1 * * ? | 100 | 1.065124ms |
| 30 1 * * 0 | 100 | 1.80034ms |
| */1 * * * ? | 500 | 1.341365ms |
| */30 * * * ? | 500 | 1.814441ms |
| 30 1 * * ? | 500 | 8.475012ms |
| 30 1 * * 0 | 500 | 10.020613ms |
| */1 * * * ? | 1000 | 2.551697ms |
| */30 * * * ? | 1000 | 4.075813ms |
| 30 1 * * ? | 1000 | 17.674945ms |
| 30 1 * * 0 | 1000 | 19.149324ms |
| */1 * * * ? | 10000 | 25.725531ms |
| */30 * * * ? | 10000 | 87.520022ms |
| 30 1 * * ? | 10000 | 174.29216ms |
| 30 1 * * 0 | 10000 | 196.565748ms |
+--------------+---------------+--------------+
```
using
```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"

	"github.com/olekukonko/tablewriter"
	"github.com/robfig/cron"
)

// timeSchedule measures how long it takes to generate `iterations`
// successive start times for the given cron schedule.
func timeSchedule(schedule string, iterations int) time.Duration {
	sched, err := cron.ParseStandard(schedule)
	if err != nil {
		panic(fmt.Sprintf("Unparseable schedule: %s", err))
	}
	start := time.Now()
	t := time.Now()
	for i := 1; i <= iterations; i++ {
		t = sched.Next(t)
	}
	return time.Since(start)
}

func main() {
	table := tablewriter.NewWriter(os.Stdout)
	table.SetHeader([]string{"Schedule", "Missed starts", "Timing"})
	schedules := []string{"*/1 * * * ?", "*/30 * * * ?", "30 1 * * ?", "30 1 * * 0"}
	iterationNums := []int{100, 500, 1000, 10000}
	for _, iterations := range iterationNums {
		for _, schedule := range schedules {
			table.Append([]string{schedule,
				strconv.Itoa(iterations),
				timeSchedule(schedule, iterations).String()})
		}
	}
	table.Render()
}
```
**Which issue this PR fixes**: fixes #36311
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue
Updated unit tests
@janetkuo updated the flaky unit test to have the same structure with regard to uncasting as the rest of the tests. ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Update deployment equality helper
@mfojtik @janetkuo this is split out of https://github.com/kubernetes/kubernetes/pull/38714 to reduce the size of that PR, ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Made cache.Controller to be interface.
**What this PR does / why we need it**:
#37504
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)
fix nodeStatusUpdateRetry count exceeding condition judgement
When tryUpdateNodeStatus() returns an error (err != nil) but the subsequent nc.kubeClient.Core().Nodes().Get() returns no error (err == nil), then after running nodeStatusUpdateRetry times the for loop ends with err == nil, so we never print the error info or `continue`; the condition judgement is therefore wrong.
This may have caused #38671.
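A hedged sketch of the corrected pattern, with illustrative function names: the loop must break on success and only fall through to error handling when every retry failed.
```go
package main

import "fmt"

const nodeStatusUpdateRetry = 5

// tryUpdate is an illustrative stand-in for tryUpdateNodeStatus.
func tryUpdate(attempt int) error {
	if attempt < 2 {
		return fmt.Errorf("transient update failure")
	}
	return nil
}

func main() {
	var err error
	for i := 0; i < nodeStatusUpdateRetry; i++ {
		if err = tryUpdate(i); err == nil {
			break // success: stop retrying so err stays nil
		}
	}
	// Only report a problem when all retries were exhausted with an error;
	// checking a stale err (or one overwritten by a later, unrelated call,
	// as in the bug described above) would misreport the outcome.
	if err != nil {
		fmt.Printf("update node status exceeds retry count: %v\n", err)
		return
	}
	fmt.Println("node status updated")
}
```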
Automatic merge from submit-queue (batch tested with PRs 39483, 39088, 38787)
daemonset: differentiate between cases in nodeShouldRun
specifically we need to differentiate between wanting to run,
should run and should continue running. This is required to
support all taint effects and will improve reporting and end
user debuggability.
fixes https://github.com/kubernetes/kubernetes/issues/28839 among other things
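A hedged sketch of the three-way distinction, with illustrative inputs; the real predicate consults taints, tolerations, and the scheduling predicates.
```go
package daemon

// nodeConditions is an illustrative summary of what the controller knows
// about scheduling a daemon pod onto one node.
type nodeConditions struct {
	SelectorMatches     bool // node matches the DaemonSet's node selector
	ToleratesNoSchedule bool
	ToleratesNoExecute  bool
}

// shouldRunDaemonPod mirrors the shape of the decision: whether the
// DaemonSet wants to run on the node at all, whether a new pod should be
// scheduled there, and whether an existing pod may keep running.
func shouldRunDaemonPod(c nodeConditions) (wantToRun, shouldSchedule, shouldContinueRunning bool) {
	wantToRun = c.SelectorMatches
	// NoSchedule taints block new pods but do not evict running ones.
	shouldSchedule = wantToRun && c.ToleratesNoSchedule
	// Only NoExecute taints force an existing daemon pod off the node.
	shouldContinueRunning = wantToRun && c.ToleratesNoExecute
	return wantToRun, shouldSchedule, shouldContinueRunning
}
```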
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)
kubelet: request client auth certificates from certificate API.
This fixes kubeadm and --experiment-kubelet-bootstrap.
cc @liggitt
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
HPA Controller: Check for 0-sum request value
In certain conditions in which the set of metrics returned by Heapster
is completely disjoint from the set of pods returned by the API server,
we can have a request sum of zero, which can cause a panic (due to
division by zero). This checks for that condition.
Fixes #39680
**Release note**:
```release-note
Fixes an HPA-related panic due to division-by-zero.
```
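A minimal sketch of the guard: if the summed pod requests are zero (for example when the metrics and pod sets are disjoint), bail out before dividing.
```go
package main

import (
	"errors"
	"fmt"
)

// utilizationRatio divides total metric usage by total requested resources,
// guarding against the zero-request case that previously caused a panic.
func utilizationRatio(usageSum, requestSum int64) (float64, error) {
	if requestSum == 0 {
		return 0, errors.New("no resource requests found for the matched pods; cannot compute utilization")
	}
	return float64(usageSum) / float64(requestSum), nil
}

func main() {
	if _, err := utilizationRatio(500, 0); err != nil {
		fmt.Println("skipping scale decision:", err)
	}
}
```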
Automatic merge from submit-queue
certificates: add a signing profile to the internal types
Here is a strawman of a CertificateSigningProfile type which would be used by the certificates controller when configuring cfssl. Side question: what magnitude of change warrants a design proposal?
@liggitt @gtank
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)
Increase volume reconciliation periods to reduce the impact on AWS.
**What this PR does / why we need it**:
We are currently blocked by API timeouts with PV volumes. See https://github.com/kubernetes/kubernetes/issues/39526. This is a workaround, not a fix.
**Special notes for your reviewer**:
A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods. I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.
**Release note**:
```release-note
Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync. The duration of the syncs can be controlled, and the syncs can be shut off as well.
```
Automatic merge from submit-queue (batch tested with PRs 37845, 39439, 39514, 39457, 38866)
Log a warning message when failed to find kind for resource in garbage collector controller
At this time, I do not think third-party API group/version resources should be taken care of by the garbage collector controller, and this call will actually fail: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/garbagecollector/garbagecollector.go#L565; as a result, the garbage collector controller fails to start.