Automatic merge from submit-queue
Remove alpha provisioning
This is the first part of https://github.com/kubernetes/features/issues/36
@kubernetes/sig-storage-misc
**Release note**:
```release-note
The alpha version of dynamic volume provisioning is removed in this release. The annotation
"volume.alpha.kubernetes.io/storage-class" no longer has any special meaning. A default storage class
and the DefaultStorageClass admission plugin can be used to preserve similar behavior of the cluster;
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
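For illustration only (not part of the PR), here is a hedged Go sketch of creating a StorageClass marked as the cluster default with client-go; the provisioner, annotation key, and client-go version used are assumptions:
```go
// Sketch: creating a default StorageClass with client-go so that PVCs
// without an explicit class are handled by the DefaultStorageClass
// admission plugin. The provisioner and annotation key are assumptions
// for a recent Kubernetes release, not taken from this PR.
package main

import (
	"context"

	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	sc := &storagev1.StorageClass{
		ObjectMeta: metav1.ObjectMeta{
			Name: "standard",
			Annotations: map[string]string{
				// Marks this class as the default used by the
				// DefaultStorageClass admission plugin.
				"storageclass.kubernetes.io/is-default-class": "true",
			},
		},
		Provisioner: "kubernetes.io/gce-pd", // example provisioner
	}
	if _, err := client.StorageV1().StorageClasses().Create(context.TODO(), sc, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}
```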
Automatic merge from submit-queue
Switch serviceaccounts controller to generated shared informers
Originally part of #40097
cc @deads2k @sttts @liggitt @smarterclayton @gmarek @wojtek-t @timothysc @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Change default attach_detach_controller reconciler sync period to 1 minute
When the default reconciler sync period is set to 5 seconds, we often see
rate-limit issues in a large cluster. This PR changes the period to 1
minute to mitigate this problem.
Making this period longer means there may be a window during which the
cached information in the master's attach_detach_controller is out of
date. The node might use this stale information to mount the wrong
device. For GCE PD, since the device path is uniquely associated with the
volume ID, the mount operation will simply fail because of the outdated
information. For AWS, kubelet could previously mount the wrong volume
because a device path could be reused immediately once it became
available. After PR #38818, a device path is only reused after all device
paths have been exhausted, so it is very unlikely that kubelet will
mount the wrong volume using an old device path that had previously been
assigned to the same node.
**Release note**:
```release-note
The default attach_detach_controller sync period is changed to 1 minute to reduce how frequently the cloud provider is queried to check whether volumes are attached.
```
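Purely as an illustrative sketch (the type and function names below are hypothetical), a reconciler with a configurable sync period looks roughly like this; only the 1-minute default mirrors this PR:
```go
// Sketch of a periodic reconciler whose sync period is configurable.
// The names here (Reconciler, syncStates) are hypothetical; only the
// 1-minute default mirrors the change described in this PR.
package reconciler

import "time"

const defaultSyncPeriod = 1 * time.Minute // was 5s before this change

type Reconciler struct {
	syncPeriod time.Duration
	syncStates func() // queries the cloud provider for attach status
}

func (r *Reconciler) Run(stopCh <-chan struct{}) {
	period := r.syncPeriod
	if period == 0 {
		period = defaultSyncPeriod
	}
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			r.syncStates()
		case <-stopCh:
			return
		}
	}
}
```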
Automatic merge from submit-queue (batch tested with PRs 41137, 41268)
Allow the CertificateController to use any Signer implementation.
**What this PR does / why we need it**:
This will allow developers to create `CertificateController`s with arbitrary `Signer`s, instead of forcing the use of `CFSSLSigner`. It matches the behavior of allowing an arbitrary `AutoApprover` to be passed in the constructor.
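As a hedged sketch of what that enables (the interface shape is an assumption, not copied from this PR), a developer-supplied Signer could look like:
```go
// Hypothetical sketch: implementing a custom Signer for the certificate
// controller. The Signer interface shape shown here is an assumption for
// illustration; the PR only requires that some Signer implementation be
// passed to the controller's constructor instead of CFSSLSigner.
package signerexample

import capi "k8s.io/api/certificates/v1"

// Signer approximates the kind of interface the controller accepts.
type Signer interface {
	Sign(csr *capi.CertificateSigningRequest) (*capi.CertificateSigningRequest, error)
}

// loggingSigner wraps another Signer and records which CSRs it signed.
type loggingSigner struct {
	delegate Signer
	signed   []string
}

func (s *loggingSigner) Sign(csr *capi.CertificateSigningRequest) (*capi.CertificateSigningRequest, error) {
	out, err := s.delegate.Sign(csr)
	if err == nil {
		s.signed = append(s.signed, csr.Name)
	}
	return out, err
}
```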
**Release note**:
```release-note
NONE
```
CC @mikedanese
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute taints. This feature is beta (like the rest of taints) and enabled by default. It is gated by the controller manager's --enable-taint-manager flag.
```
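For illustration only (the key/value strings are made up), a NoExecute taint and a toleration with an eviction grace period can be expressed in Go roughly as:
```go
// Sketch: a NoExecute taint and a toleration that lets a pod stay on a
// tainted node for 60 seconds before the taint manager evicts it.
// The key/value strings are invented for illustration.
package taintexample

import v1 "k8s.io/api/core/v1"

var gracePeriod int64 = 60

// Taint placed on a node; pods without a matching toleration are evicted.
var taint = v1.Taint{
	Key:    "example.com/maintenance",
	Value:  "true",
	Effect: v1.TaintEffectNoExecute,
}

// Toleration added to a pod spec; TolerationSeconds bounds how long the
// pod may remain on the node after the taint appears.
var toleration = v1.Toleration{
	Key:               "example.com/maintenance",
	Operator:          v1.TolerationOpEqual,
	Value:             "true",
	Effect:            v1.TaintEffectNoExecute,
	TolerationSeconds: &gracePeriod,
}
```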
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
ResyncPeriod Comment
ResyncPeriod Comment:
// ResyncPeriod returns a function which generates a duration each time it is
// invoked; this is so that multiple controllers don't get into lock-step and all
// hammer the apiserver with list requests simultaneously.
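A minimal Go sketch of such a function (illustrative; the in-tree implementation may differ in details):
```go
// Sketch of a jittered resync period: each call returns a different
// duration between minResyncPeriod and 2*minResyncPeriod, so controllers
// started together do not relist from the apiserver in lock-step.
package resync

import (
	"math/rand"
	"time"
)

func ResyncPeriod(minResyncPeriod time.Duration) func() time.Duration {
	return func() time.Duration {
		factor := rand.Float64() + 1 // uniform in [1, 2)
		return time.Duration(float64(minResyncPeriod.Nanoseconds()) * factor)
	}
}
```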
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
Implement TTL controller and use the ttl annotation attached to node in secret manager
For every secret attached to a pod as a volume, Kubelet tries to refresh it every sync period. Currently Kubelet keeps a TTL cache of its pods' secrets, with the TTL set to 1 minute. In the large clusters we are targeting (5k nodes, 30 pods/node), given that each pod has a secret associated with the ServiceAccount from its namespace, and with a large enough number of namespaces (where on each node (almost) every pod is from a different namespace), this results in ~30 GETs per node to refresh all secrets every minute, which adds up to ~2500 QPS of secret GETs to the apiserver.
The apiserver cannot easily keep up with that.
The desired solution would be to watch for secret changes, but for security reasons we don't want a node watching all secrets, and it is not currently possible to watch only the secrets attached to pods on a given node.
So as a temporary solution, we are introducing an annotation that serves as a suggestion to kubelet for the TTL of secrets in its cache, and a very simple controller that sets this annotation based on the cluster size (the larger the cluster, the bigger the TTL).
This workaround means that only very local changes are needed in Kubelet, the new controller is small and well separated, and once watching "my secrets" becomes possible it will be easy to remove the controller and switch to that. It also allows us to reach our scalability goals.
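For illustration, a hedged sketch of how such a controller might map cluster size to a suggested TTL; the thresholds and function name are invented, not the controller's actual table:
```go
// Sketch: map the number of nodes in the cluster to a suggested TTL for
// kubelet's secret cache. The thresholds below are invented for
// illustration; the real controller's table may differ.
package ttlexample

import "time"

func secretTTLForClusterSize(numNodes int) time.Duration {
	switch {
	case numNodes <= 100:
		return 0 // small cluster: kubelet can refresh on every sync
	case numNodes <= 500:
		return 30 * time.Second
	case numNodes <= 1000:
		return 1 * time.Minute
	default:
		return 5 * time.Minute
	}
}
```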
@dchen1107 @thockin @liggitt
Automatic merge from submit-queue (batch tested with PRs 40345, 38183, 40236, 40861, 40900)
refactor approver and signer interfaces to be consistent w.r.t. apiserver interaction
This makes it so that only the controller loop talks to the
API server directly. The signatures for Sign and Approve also
become more consistent, while allowing the Signer to report
conditions (which it wasn't able to do before).
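A hedged sketch of the resulting flow, assuming a recent client-go; the names and exact signatures are illustrative, not the in-tree interfaces:
```go
// Hypothetical sketch of the refactored flow: Sign only mutates the CSR
// (certificate and/or status conditions); the controller loop alone
// writes the result back to the apiserver. Names and signatures are
// illustrative, assuming a recent client-go, not the exact in-tree code.
package csrflow

import (
	"context"

	capi "k8s.io/api/certificates/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

type Signer interface {
	Sign(csr *capi.CertificateSigningRequest) (*capi.CertificateSigningRequest, error)
}

func handleCSR(client kubernetes.Interface, signer Signer, csr *capi.CertificateSigningRequest) error {
	signed, err := signer.Sign(csr) // may also append status conditions
	if err != nil {
		return err
	}
	// Only the controller loop talks to the apiserver.
	_, err = client.CertificatesV1().CertificateSigningRequests().UpdateStatus(
		context.TODO(), signed, metav1.UpdateOptions{})
	return err
}
```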
Automatic merge from submit-queue
Remove packages which are now apimachinery
Removes all the content from the packages that were moved to `apimachinery`. This will force all vendoring projects to figure out what's wrong. I had to leave many empty marker packages behind to have verify-godep succeed on vendoring heapster.
@sttts straight deletes and simple adds
Automatic merge from submit-queue (batch tested with PRs 39947, 39936, 39902, 39859, 39915)
don't lie about starting the controllers in the controller manager
We print "started" even if the controller didn't actually start.
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)
add --controllers to controller manager
Adds a `--controllers` flag to the `kube-controller-manager` to indicate which controllers are enabled and disabled. From the help:
```
--controllers stringSlice A list of controllers to enable. '*' enables all on-by-default controllers, 'foo' enables the controller named 'foo', '-foo' disables the controller named 'foo'.
All controllers: certificatesigningrequests, cronjob, daemonset, deployment, disruption, endpoint, garbagecollector, horizontalpodautoscaling, job, namespace, podgc, replicaset, replicationcontroller, resourcequota, serviceaccount, statefulset
```
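As an illustrative sketch (not the controller-manager's actual parsing code), the enable/disable semantics of the flag amount to roughly:
```go
// Sketch of the --controllers semantics: '*' turns on all on-by-default
// controllers, 'foo' enables controller foo, '-foo' disables it. This is
// illustrative, not the controller-manager's actual implementation.
package controllersflag

func isControllerEnabled(name string, controllers []string, enabledByDefault bool) bool {
	hasStar := false
	for _, c := range controllers {
		if c == name {
			return true
		}
		if c == "-"+name {
			return false
		}
		if c == "*" {
			hasStar = true
		}
	}
	if !hasStar {
		// No wildcard: only explicitly named controllers run.
		return false
	}
	return enabledByDefault
}

// Example: isControllerEnabled("cronjob", []string{"*", "-cronjob"}, true) == false
```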
Automatic merge from submit-queue
make discovery static when extensions/thirdpartyresources is not enabled
This should be a bug fix: if `extensions/thirdpartyresources` is enabled, the result of `Discovery().ServerPreferredNamespacedResources` becomes dynamic, so we make `discoverResourcesFn` static only when `extensions/thirdpartyresources` is not enabled.
Automatic merge from submit-queue (batch tested with PRs 39150, 38615)
Add work queues to PV controller
PV controller should not use Controller.Requeue, as it is not available in
shared informers. We need to implement our own work queues instead, where we
can enqueue volumes/claims as we want.
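A minimal sketch of the intended pattern using client-go's workqueue package (the actual volume/claim processing is elided):
```go
// Sketch: a rate-limited work queue of claim keys, since shared informers
// do not provide Controller.Requeue. Processing logic is elided; only the
// enqueue/dequeue pattern from k8s.io/client-go/util/workqueue is shown.
package pvqueue

import "k8s.io/client-go/util/workqueue"

func runClaimWorker(syncClaim func(key string) error) {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	defer queue.ShutDown()

	// Enqueue a claim key (e.g. from an informer event handler).
	queue.Add("default/my-claim")

	for {
		key, quit := queue.Get()
		if quit {
			return
		}
		if err := syncClaim(key.(string)); err != nil {
			queue.AddRateLimited(key) // retry with backoff
		} else {
			queue.Forget(key) // clear rate-limiting history
		}
		queue.Done(key)
	}
}
```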
Automatic merge from submit-queue (batch tested with PRs 37959, 36221)
Recycle Pod Template Check
The kube-controller-manager has two command line arguments (--pv-recycler-pod-template-filepath-hostpath and --pv-recycler-pod-template-filepath-nfs) that specify a recycle pod template. The recycle pod template might not contain the volume that shall be recycled.
A check is added to make sure that the recycle pod template contains at least one volume.
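As an illustrative sketch (the function name is hypothetical), the added check amounts to something like:
```go
// Sketch of the added validation: reject a recycle pod template that has
// no volumes, since the volume to be recycled must replace one of them.
// The function name is hypothetical.
package recyclecheck

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func validateRecyclePodTemplate(pod *v1.Pod, flagName string) error {
	if len(pod.Spec.Volumes) < 1 {
		return fmt.Errorf("%s: recycle pod template must contain at least one volume", flagName)
	}
	return nil
}
```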
cc: @jsafrane
Currently, the HPA considers unready pods the same as ready pods when
looking at their CPU and custom metric usage. However, pods frequently
use extra CPU during initialization, so we want to consider them
separately.
This commit causes the HPA to consider unready pods as having 0 CPU
usage when scaling up, and to ignore them when scaling down. If, when
scaling up, factoring in the unready pods at 0 CPU would cause a
downscale instead, we simply choose not to scale. Otherwise, we scale
up by the reduced amount calculated by factoring the pods in at zero
CPU usage.
The effect is that unready pods cause the autoscaler to be a bit more
conservative -- large increases in CPU usage can still cause scales,
even with unready pods in the mix, but will not cause the scale factors
to be as large, in anticipation of the new pods later becoming ready and
handling load.
Similarly, if there are pods for which no metrics have been retrieved,
these pods are treated as having 100% of the requested metric when
scaling down, and 0% when scaling up. As above, this cannot change the
direction of the scale.
This commit also changes the HPA to ignore superfluous metrics -- as
long as metrics for all ready pods are present, the HPA will make scaling
decisions. Currently, this only works for CPU. For custom metrics, we
cannot identify which metrics belong to which pods if we get superfluous
metrics, so we abort the scale.
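A hedged sketch of the scale-up rule described above (names and structure are illustrative, not the HPA controller's actual code):
```go
// Sketch of the scale-up rule: unready pods are counted as using 0 CPU;
// if that adjustment would turn a scale-up into a scale-down, no scaling
// happens. Names are illustrative only.
package hpaexample

import "math"

func desiredReplicasOnScaleUp(readyUsage []int64, unreadyCount int, requestPerPod int64, targetUtilization float64) int {
	var total int64
	for _, u := range readyUsage {
		total += u // unready pods contribute 0
	}
	allPods := len(readyUsage) + unreadyCount
	if allPods == 0 || requestPerPod <= 0 || targetUtilization <= 0 {
		return allPods
	}
	utilization := float64(total) / float64(int64(allPods)*requestPerPod)
	desired := int(math.Ceil(float64(allPods) * utilization / targetUtilization))
	if desired <= allPods {
		// With unready pods counted at zero usage this is no longer a
		// scale-up, so choose not to scale at all.
		return allPods
	}
	return desired
}
```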
Automatic merge from submit-queue
Enable HPA controller based on autoscaling/v1 api group
ref #29778
```release-note
Enable HPA controller based on autoscaling/v1 api group.
```
Automatic merge from submit-queue
lister-gen updates
- Remove "zz_generated." prefix from generated lister file names
- Add support for expansion interfaces
- Switch to new generated JobLister
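For illustration, a small usage sketch of a generated lister obtained from a shared informer factory, assuming a recent client-go package layout:
```go
// Sketch: reading Jobs through the generated JobLister backed by a shared
// informer, assuming a recent client-go package layout.
package listerexample

import (
	"time"

	batchv1 "k8s.io/api/batch/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

func listJobs(client kubernetes.Interface, stopCh <-chan struct{}) ([]*batchv1.Job, error) {
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	jobLister := factory.Batch().V1().Jobs().Lister()
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	// List all jobs in "default" from the informer's local cache.
	return jobLister.Jobs("default").List(labels.Everything())
}
```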
@deads2k @liggitt @sttts @mikedanese @caesarxuchao for the lister-gen changes
@soltysh @deads2k for the informer / job controller changes