1. pcb and pcb controller are removed and their functionality is
encapsulated in StatefulPodControlInterface.
2. IdentityMappers has been removed to clarify what properties of a Pod are
mutated by the controller. All mutations are performed in the
UpdateStatefulPod method of the StatefulPodControlInterface.
3. The statefulSetIterator and petQueue classes are removed. These classes
sorted Pods by CreationTimestamp. This is brittle and not resilient to
clock skew. The current control loop, which implements the same logic,
is in stateful_set_control.go. The Pods are now sorted and considered by
their ordinal indices, as is outlined in the documentation.
4. StatefulSetController now checks to see if the Pods matching a
StatefulSet's Selector also match the Name of the StatefulSet. This will
make the controller resilient to overlapping, and will be enhanced by
the addition of ControllerRefs.
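As a rough illustration of points 3 and 4, the sketch below orders Pods by the ordinal encoded in their names and drops Pods whose name does not match the set. It assumes Pod names follow the `<set-name>-<ordinal>` pattern; the helper names `parentAndOrdinal` and `sortByOrdinal` are hypothetical and are not taken from stateful_set_control.go.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// podNameRegex splits a Pod name of the assumed form "<set-name>-<ordinal>".
var podNameRegex = regexp.MustCompile(`^(.*)-([0-9]+)$`)

// parentAndOrdinal returns the set name and ordinal encoded in a Pod name,
// or ("", -1) if the name does not follow the expected pattern.
func parentAndOrdinal(podName string) (string, int) {
	m := podNameRegex.FindStringSubmatch(podName)
	if m == nil {
		return "", -1
	}
	ordinal, err := strconv.Atoi(m[2])
	if err != nil {
		return "", -1
	}
	return m[1], ordinal
}

// sortByOrdinal keeps only the Pods whose name matches setName and orders
// them by ordinal index; creation timestamps play no role.
func sortByOrdinal(setName string, podNames []string) []string {
	var owned []string
	for _, name := range podNames {
		if parent, _ := parentAndOrdinal(name); parent == setName {
			owned = append(owned, name)
		}
	}
	sort.Slice(owned, func(i, j int) bool {
		_, a := parentAndOrdinal(owned[i])
		_, b := parentAndOrdinal(owned[j])
		return a < b
	})
	return owned
}

func main() {
	pods := []string{"web-10", "web-2", "db-0", "web-0"}
	fmt.Println(sortByOrdinal("web", pods)) // [web-0 web-2 web-10]
}
```

Sorting on the parsed integer rather than on the name string is what keeps `web-10` after `web-2`.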
Automatic merge from submit-queue
Update owners file for job and cronjob controller
I've just noticed we have outdated OWNERS files for job and cronjob controllers.
@erictune ptal
@kubernetes/sig-contributor-experience-pr-reviews fyi
Automatic merge from submit-queue (batch tested with PRs 40345, 38183, 40236, 40861, 40900)
refactor approver and signer interfaces to be consistent w.r.t. apiserver interaction
This makes it so that only the controller loop talks to the
API server directly. The signatures for Sign and Approve also
become more consistent, while allowing the Signer to report
conditions (which it wasn't able to do before).
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Promote SubjectAccessReview to v1
We have multiple features that depend on this API:
SubjectAccessReview
- [webhook authorization](https://kubernetes.io/docs/admin/authorization/#webhook-mode)
- [kubelet delegated authorization](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authorization)
- add-on API server delegated authorization
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating access
- A typo was discovered in the SubjectAccessReviewSpec Groups field name
This PR promotes the existing v1beta1 API to v1, with the only change being the typo fix to the groups field. (fixes https://github.com/kubernetes/kubernetes/issues/32709)
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authorization.k8s.io API group was promoted to v1
```
Automatic merge from submit-queue
federation: Refactoring namespaced resources deletion code from kube ns controller and sharing it with fed ns controller
Ref https://github.com/kubernetes/kubernetes/issues/33612
Refactors the code in the kube namespace controller that deletes all resources in a namespace when the namespace is deleted. The code now lives in a separate NamespacedResourcesDeleter class, which the federation namespace controller calls as well.
This is required for enabling cascading deletion of namespaced resources in federation apiserver.
Before this PR, we were directly deleting the namespaced resources and assuming that they go away immediately. With cascading deletion, we will have to wait for the corresponding controllers to first delete the resources from underlying clusters and then delete the resource from federation control plane. NamespacedResourcesDeleter has this waiting logic.
cc @kubernetes/sig-federation-misc @caesarxuchao @derekwaynecarr @mwielgus
Automatic merge from submit-queue
Replace hand-written informers with generated ones
Replace existing uses of hand-written informers with generated ones.
Follow-up commits will switch the use of one-off informers to shared
informers.
This is a precursor to #40097. That PR will switch one-off informers to shared informers for the majority of the code base (but not quite all of it...).
NOTE: this does create a second set of shared informers in the kube-controller-manager. This will be resolved back down to a single factory once #40097 is reviewed and merged.
There are a couple of places where I expanded the # of caches we wait for in the calls to `WaitForCacheSync` - please pay attention to those. I also added in a commented-out wait in the attach/detach controller. If @kubernetes/sig-storage-pr-reviews is ok with enabling the waiting, I'll do it (I'll just need to tweak an integration test slightly).
@deads2k @sttts @smarterclayton @liggitt @soltysh @timothysc @lavalamp @wojtek-t @gmarek @sjenning @derekwaynecarr @kubernetes/sig-scalability-pr-reviews
Automatic merge from submit-queue
Update daemon set controller OWNERS file
Adding myself as reviewer, adding @mikedanese as approver
cc @kargakis @lukasredynk
Automatic merge from submit-queue (batch tested with PRs 35782, 35831, 39279, 40853, 40867)
genericapiserver: cut off more dependencies – episode 7
Follow-up of https://github.com/kubernetes/kubernetes/pull/40822
approved based on #40363
Automatic merge from submit-queue (batch tested with PRs 40855, 40859)
PV binding: send an event when there are no PVs to bind
This is similar to scheduler that says "no nodes available to schedule pods"
when it can't schedule a pod.
@kubernetes/sig-storage-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 40810, 40695)
Prevent pv controller from forcefully overwriting provisioned volume name
**What this PR does / why we need it**:
This PR adds a fix that prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name. Instead, it overwrites the volume's name only when it is missing. This allows dynamic provisioner implementers to set the name of the volume to a value that they choose.
**Which issue this PR fixes**
This PR does not have an issue affiliated, but it will allow PR #38924 to properly implement dynamically provisioned volume in namespaces other than default.
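The "only when it is missing" rule can be sketched as follows. This is a minimal illustration, not the PV controller's code: the `volume` type and the `ensureName` helper are hypothetical stand-ins.

```go
package main

import "fmt"

// volume is a hypothetical stand-in for a provisioned PersistentVolume.
type volume struct {
	Name string
}

// ensureName assigns the generated name only when the provisioner left the
// name empty; a name chosen by the provisioner is preserved.
func ensureName(v *volume, generated string) {
	if v.Name == "" {
		v.Name = generated
	}
}

func main() {
	custom := &volume{Name: "my-provisioner-vol"}
	unnamed := &volume{}
	ensureName(custom, "pvc-1234")
	ensureName(unnamed, "pvc-1234")
	fmt.Println(custom.Name, unnamed.Name) // my-provisioner-vol pvc-1234
}
```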
Automatic merge from submit-queue (batch tested with PRs 39169, 40719, 38954, 40808, 40689)
Add StatefulSets checks at Service level
Hi!
Please let me propose some very small e2e testsuite enhancement.
This PR removes a `TODO` about checking the governing service at unit test level (which is hard) and adds this check to the e2e testsuite.
Thanks
Sebastian
This fix prevents the PV controller from forcefully overwriting the provisioned volume's name with the generated PV name. Instead, it allows dynamic provisioner implementers to set the name of the volume to a value that they choose.
Automatic merge from submit-queue (batch tested with PRs 34543, 40606)
sync client-go and move util/workqueue
The vision of client-go is that it provides enough utilities to build a reasonable controller. It has been copying `util/workqueue`. This makes it authoritative.
@liggitt I'm getting really close to making client-go authoritative ptal.
approved based on https://github.com/kubernetes/kubernetes/issues/40363
Automatic merge from submit-queue
controller: don't run informers in unit tests when unnecessary
Fixes https://github.com/kubernetes/kubernetes/issues/39908
@mfojtik it seems that using informers makes the deployment sync for the initial relist so this races with the enqueue that these tests are testing.
Automatic merge from submit-queue
Decrease Daemonset burst replicas due to DoS conditions.
**What this PR does / why we need it**:
We are seeing DoS conditions on our Registry when running a large cluster with too many daemonsets bursting at once.
**Special notes for your reviewer**:
I decided not to plumb through yet another variable to the command line. Ideally such parameters could be tweaked via a configuration file.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40126, 40565, 38777, 40564, 40572)
Do not swallow error in asw.updateNodeStatusUpdateNeeded
Ref #39056
Bubble the error up to `SetNodeUpdateStatusNeeded` and log it out.
NOTE: This does not modify interface of `SetNodeUpdateStatusNeeded`
Automatic merge from submit-queue (batch tested with PRs 40239, 40397, 40449, 40448, 40360)
move the discovery and dynamic clients
Moved the dynamic client, discovery client, testing/core, and testing/cache to `client-go`. Dependencies on API groups we don't have generated clients for have dropped out: federation, kubeadm, and imagepolicy.
@caesarxuchao @sttts
approved based on https://github.com/kubernetes/kubernetes/issues/40363
Automatic merge from submit-queue (batch tested with PRs 39538, 40188, 40357, 38214, 40195)
genericapiserver: cut off more dependencies – episode 2
Compare commit subjects.
approved based on #40363
Automatic merge from submit-queue (batch tested with PRs 39064, 40294)
Refactor persistent volume tests
This is an attempt to make the binder tests a bit more concise. The PVCs are now created by a "templating" function. There is also a handful of PVs in the tests, but those vary quite a bit more and I don't think a similar approach would save us much code.
Reference:
https://reviewable.kubernetes.io/reviews/kubernetes/kubernetes/29006#-KPJuVeDE0O6TvDP9jia
@jsafrane: I hope this is what you have in mind.
Automatic merge from submit-queue (batch tested with PRs 40196, 40143, 40277)
Emit warning event when CronJob cannot determine starting time
**What this PR does / why we need it**:
In #39608, we modified the error message shown when a CronJob has too many unmet starting times to enumerate in order to determine the next starting time. This makes it more "actionable", and the user can now set a deadline to avoid running into this. However, the error message is still only logged at the controller level AFAIK and thus not exposed to the user. From the user's perspective, there is no way to tell why the CronJob is not scheduling the next instance.
The PR adds a warning event in addition to the error in the controller manager's log.
**Which issue this PR fixes**: This is an addition to PR #39608 regarding #36311.
**Special notes for your reviewer**: cc @soltysh
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 40232, 40235, 40237, 40240)
Fixup pet terminology in log and user-facing events
**What this PR does / why we need it**:
Removes some user-facing strings for pet terminology.
Automatic merge from submit-queue
make client-go more authoritative
Builds on https://github.com/kubernetes/kubernetes/pull/40103
This moves a few more support packages to client-go for origination.
1. restclient/watch - nodep
1. util/flowcontrol - used interface
1. util/integer, util/clock - used in controllers and in support of util/flowcontrol
Automatic merge from submit-queue
controller: decouple cleanup policy from deployment strategies
Deployments get cleaned up only when they are paused, they get scaled up/down,
or when the strategy that drives rollouts completes. This means that stuck
deployments that fall into none of the above categories will not get cleaned
up. Since cleanup is already safe by itself (we only delete old replica sets
that are synced by the replica set controller and have no replicas) we can
execute it for every deployment when there is no intention to rollback.
Fixes https://github.com/kubernetes/kubernetes/issues/40068
Automatic merge from submit-queue (batch tested with PRs 34763, 38706, 39939, 40020)
Use Statefulset instead in e2e and controller
Quick fix ref: #35534
We should finish the issue to meet v1.6 milestone.
Automatic merge from submit-queue
genericapiserver: cut off pkg/serviceaccount dependency
**Blocked** by pkg/api/validation/genericvalidation to be split up and moved into apimachinery.
Automatic merge from submit-queue
Do not list CronJob unmet starting times beyond deadline
**What this PR does / why we need it**:
See #36311. `getRecentUnmetScheduleTimes` gives up after 100 unmet times to avoid wasting too much CPU or memory generating all the times, as it generates them sequentially.
When concurrency is forbidden, this is conceptually un-necessary: we only need the last unmet start time. This suggests that when concurrency is forbidden, we could generate times by going backward in time from now. This is not very practical as CronJob currently relies on a package that only provides `Next` and no `Prev`. Hand-cooking a `Prev` does not seem like a good idea. I could submit a PR to the cron library to add a `Prev` method, and use that when concurrency is forbidden through something like `getLastUnmetScheduleTime`. This would be `O(1)` and there would be no limit involved.
(edit: actually, even for the other concurrency settings, we only start the last unmet start times -- there is a `TODO` in the controller to actually start all of them, but that is not implemented at the moment. This means the solution would apply, at least temporarily, to all concurrency settings).
cc @soltysh what do you think?
In the meantime, I would suggest doing something simple. Currently, the user has no way to configure anything to ensure that their CronJob will not get stuck if one job takes more than 100 unmet times.
`getRecentUnmetScheduleTimes` starts with an initial time corresponding to the last start (or to the creation of the CronJob, if nothing has started yet). However, when `StartingDeadlineSeconds` is set, the controller will not start anything that is older than the deadline, so if the last start is way beyond the deadline, we are generating potentially lots of unmet start times that will not be considered by the scheduler for scheduling anyway.
Consider a job running every minute, where the last instance has taken 120 minutes. This means there are more than 100 unmet times when we start counting from the last start time.
**The PR makes `getRecentUnmetScheduleTimes` only consider times that do not fall beyond the deadline.** Here, the CronJob can be configured with a `StartingDeadlineSeconds` of, say, 10 minutes. After the 120min job has run, `getRecentUnmetScheduleTimes` will only consider the times in the last 10 minutes from now, and will not get stuck.
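The effect of the deadline can be sketched with the standard library alone. This is an illustration under stated assumptions, not the controller's implementation: a fixed interval (`every`) stands in for the cron library's `Next`, and `unmetTimes` and `countMissed` are hypothetical names. The limit of 100 is the one mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxUnmet is the give-up threshold described in the text.
const maxUnmet = 100

// unmetTimes lists the start times in (earliest, now] for a fixed-interval
// schedule. When deadline is non-nil, counting starts no earlier than
// now-deadline, so starts beyond the deadline are never generated.
func unmetTimes(lastStart, now time.Time, every time.Duration, deadline *time.Duration) ([]time.Time, error) {
	earliest := lastStart
	if deadline != nil {
		if cutoff := now.Add(-*deadline); cutoff.After(earliest) {
			earliest = cutoff
		}
	}
	var starts []time.Time
	for t := earliest.Add(every); !t.After(now); t = t.Add(every) {
		starts = append(starts, t)
		if len(starts) > maxUnmet {
			return nil, errors.New("too many missed start times to list")
		}
	}
	return starts, nil
}

// countMissed replays the example from the text: a job scheduled every minute
// whose last start was minutesSinceLastStart ago. A deadlineMinutes value
// of zero or less means no deadline is set.
func countMissed(minutesSinceLastStart, deadlineMinutes int) (int, error) {
	now := time.Date(2017, time.January, 1, 12, 0, 0, 0, time.UTC)
	lastStart := now.Add(-time.Duration(minutesSinceLastStart) * time.Minute)
	var deadline *time.Duration
	if deadlineMinutes > 0 {
		d := time.Duration(deadlineMinutes) * time.Minute
		deadline = &d
	}
	starts, err := unmetTimes(lastStart, now, time.Minute, deadline)
	return len(starts), err
}

func main() {
	fmt.Println(countMissed(120, 0))  // 0 too many missed start times to list
	fmt.Println(countMissed(120, 10)) // 10 <nil>
}
```

With a 10-minute deadline the 120-minute gap yields only 10 candidate times, so the limit of 100 is never reached.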
As a side note on the max. number of unmet times to use as limits in terms of CPU used by the controller: I have run a quick benchmark on my i7 mac. Schedules corresponding to "once a week" tend to be more expensive to generate unmet times for. Just FYI.
```
+--------------+---------------+--------------+
| SCHEDULE | MISSED STARTS | TIMING |
+--------------+---------------+--------------+
| */1 * * * ? | 100 | 383.645µs |
| */30 * * * ? | 100 | 354.765µs |
| 30 1 * * ? | 100 | 1.065124ms |
| 30 1 * * 0 | 100 | 1.80034ms |
| */1 * * * ? | 500 | 1.341365ms |
| */30 * * * ? | 500 | 1.814441ms |
| 30 1 * * ? | 500 | 8.475012ms |
| 30 1 * * 0 | 500 | 10.020613ms |
| */1 * * * ? | 1000 | 2.551697ms |
| */30 * * * ? | 1000 | 4.075813ms |
| 30 1 * * ? | 1000 | 17.674945ms |
| 30 1 * * 0 | 1000 | 19.149324ms |
| */1 * * * ? | 10000 | 25.725531ms |
| */30 * * * ? | 10000 | 87.520022ms |
| 30 1 * * ? | 10000 | 174.29216ms |
| 30 1 * * 0 | 10000 | 196.565748ms |
+--------------+---------------+--------------+
```
using
```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"

	"github.com/olekukonko/tablewriter"
	"github.com/robfig/cron"
)

// timeSchedule measures how long it takes to generate `iterations`
// consecutive start times for the given cron schedule.
func timeSchedule(schedule string, iterations int) time.Duration {
	sched, err := cron.ParseStandard(schedule)
	if err != nil {
		panic(fmt.Sprintf("Unparseable schedule: %s", err))
	}
	start := time.Now()
	t := time.Now()
	for i := 1; i <= iterations; i++ {
		t = sched.Next(t)
	}
	return time.Since(start)
}

func main() {
	table := tablewriter.NewWriter(os.Stdout)
	table.SetHeader([]string{"Schedule", "Missed starts", "Timing"})
	schedules := []string{"*/1 * * * ?", "*/30 * * * ?", "30 1 * * ?", "30 1 * * 0"}
	iterationNums := []int{100, 500, 1000, 10000}
	// One table row per (iteration count, schedule) pair.
	for _, iterations := range iterationNums {
		for _, schedule := range schedules {
			table.Append([]string{schedule,
				strconv.Itoa(iterations),
				timeSchedule(schedule, iterations).String()})
		}
	}
	table.Render()
}
```
**Which issue this PR fixes**: fixes #36311
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue
Updated unit tests
@janetkuo updated the flaky unit test to have the same structure with regard to uncasting as the rest of the tests. ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Update deployment equality helper
@mfojtik @janetkuo this is split out of https://github.com/kubernetes/kubernetes/pull/38714 to reduce the size of that PR, ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Made cache.Controller to be interface.
**What this PR does / why we need it**:
#37504
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39661, 39740, 39801, 39468, 39743)
fix nodeStatusUpdateRetry count exceeding condition judgement
When tryUpdateNodeStatus() returns an error (err != nil) but the following nc.kubeClient.Core().Nodes().Get() call succeeds (err == nil), err is nil again when the for loop ends after nodeStatusUpdateRetry attempts. As a result we neither print the error nor run continue, so the condition check appears to be wrong.
Maybe caused #38671
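The pattern described above can be sketched generically. This is not the node controller's code: `updateWithRetry`, `retries`, and `errUpdate` are hypothetical names, and the two function arguments stand in for tryUpdateNodeStatus() and the Get() call.

```go
package main

import (
	"errors"
	"fmt"
)

// retries stands in for nodeStatusUpdateRetry.
const retries = 3

// errUpdate is a sample error used by the demo below.
var errUpdate = errors.New("update failed")

// updateWithRetry retries update and refreshes state through get after each
// failure. It reports the last update error, so a successful get cannot hide
// the fact that every update attempt failed.
func updateWithRetry(update, get func() error) error {
	var updateErr error
	for i := 0; i < retries; i++ {
		if updateErr = update(); updateErr == nil {
			return nil
		}
		// Keep the refresh error in its own variable; reusing updateErr here
		// would reset it to nil whenever the refresh succeeds.
		if getErr := get(); getErr != nil {
			return getErr
		}
	}
	return updateErr
}

func main() {
	alwaysFail := func() error { return errUpdate }
	refreshOK := func() error { return nil }
	fmt.Println(updateWithRetry(alwaysFail, refreshOK)) // update failed
}
```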
Automatic merge from submit-queue (batch tested with PRs 39483, 39088, 38787)
daemonset: differentiate between cases in nodeShouldRun
specifically we need to differentiate between wanting to run,
should run and should continue running. This is required to
support all taint effects and will improve reporting and end
user debuggability.
fixes https://github.com/kubernetes/kubernetes/issues/28839 among other things
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)
kubelet: request client auth certificates from certificate API.
This fixes kubeadm and --experiment-kubelet-bootstrap.
cc @liggitt
Automatic merge from submit-queue (batch tested with PRs 39694, 39383, 39651, 39691, 39497)
HPA Controller: Check for 0-sum request value
In certain conditions in which the set of metrics returned by Heapster
is completely disjoint from the set of pods returned by the API server,
we can have a request sum of zero, which can cause a panic (due to
division by zero). This checks for that condition.
Fixes #39680
**Release note**:
```release-note
Fixes an HPA-related panic due to division-by-zero.
```
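The zero-sum check for the HPA fix above can be sketched as below. This is a simplified illustration, not the HPA controller's code: `utilizationPercent` and `errNoRequests` are hypothetical names, and plain maps keyed by pod name stand in for the Heapster metrics and the API server's pod list.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRequests is returned when no pod appears in both inputs.
var errNoRequests = errors.New("no matching pods with resource requests")

// utilizationPercent returns usage as a percentage of requests, summed over
// the pods present in both maps. If the two sets are disjoint the request sum
// is zero, so the function returns an error instead of dividing.
func utilizationPercent(usage, requests map[string]int64) (int64, error) {
	var usageTotal, requestTotal int64
	for pod, used := range usage {
		request, ok := requests[pod]
		if !ok {
			continue // metric for a pod the API server did not return
		}
		usageTotal += used
		requestTotal += request
	}
	if requestTotal == 0 {
		return 0, errNoRequests
	}
	return usageTotal * 100 / requestTotal, nil
}

func main() {
	fmt.Println(utilizationPercent(
		map[string]int64{"a": 50}, map[string]int64{"a": 100})) // 50 <nil>
	fmt.Println(utilizationPercent(
		map[string]int64{"x": 50}, map[string]int64{"a": 100})) // 0 no matching pods with resource requests
}
```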
Automatic merge from submit-queue
certificates: add a signing profile to the internal types
Here is a strawman of a CertificateSigningProfile type which would be used by the certificates controller when configuring cfssl. Side question: what magnitude of change warrants a design proposal?
@liggitt @gtank
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)
Increasing times on reconciling volumes fixing impact to AWS.
**What this PR does / why we need it**:
We are currently blocked by API timeouts with PV volumes. See https://github.com/kubernetes/kubernetes/issues/39526. This is a workaround, not a fix.
**Special notes for your reviewer**:
A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods. I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.
**Release note**:
```release-note
Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync. The duration of the syncs can be controlled, and the syncs can be shut off as well.
```
Automatic merge from submit-queue (batch tested with PRs 37845, 39439, 39514, 39457, 38866)
Log a warning message when failed to find kind for resource in garbage collector controller
At this time, I do not think third-party API group version resources should be handled by the garbage collector controller. This call actually fails: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/garbagecollector/garbagecollector.go#L565, and as a result the garbage collector controller fails to start.
Automatic merge from submit-queue
Remove jobs that do not exist from active list of CronJob
**What this PR does / why we need it**: This PR modifies the controller for CronJob to remove from the active job list any job that does not exist anymore, to avoid staying blocked in active state forever. See #37957.
**Which issue this PR fixes**: fixes #37957
**Special notes for your reviewer**:
**Release note**:
```
```
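The pruning of the CronJob active list described above can be sketched as follows, assuming jobs are identified by name. `pruneActive` is a hypothetical helper and not the controller's function.

```go
package main

import "fmt"

// pruneActive returns the entries of active that still exist, preserving
// their order. Jobs that were deleted out from under the CronJob are dropped,
// so they can no longer keep it blocked in the active state.
func pruneActive(active []string, existing map[string]bool) []string {
	kept := make([]string, 0, len(active))
	for _, name := range active {
		if existing[name] {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	active := []string{"job-1", "job-2"}
	existing := map[string]bool{"job-2": true}
	fmt.Println(pruneActive(active, existing)) // [job-2]
}
```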
Automatic merge from submit-queue (batch tested with PRs 39284, 39367)
Remove HostRecord annotation (beta feature)
The annotation has made it to GA so this code should be deleted.
**Release note**:
```release-note
The 'endpoints.beta.kubernetes.io/hostnames-map' annotation is no longer supported. Users can use the 'Endpoints.subsets[].addresses[].hostname' field instead.
```
Automatic merge from submit-queue (batch tested with PRs 39092, 39126, 37380, 37093, 39237)
Endpoints with TolerateUnready annotation, should list Pods in state terminating
**What this PR does / why we need it**:
We are using preStop lifecycle hooks to gracefully remove a node from a cluster. This hook is potentially long running and after the preStop hook is fired, the DNS resolution of the soon to be stopped Pod is failing, which causes a failure there.
**Special notes for your reviewer**:
Would be great to backport that to 1.4, 1.3
**Release note**:
```release-note
Endpoints, that tolerate unready Pods, are now listing Pods in state Terminating as well
```
@bprashanth
Automatic merge from submit-queue (batch tested with PRs 39075, 39350, 39353)
Move pkg/api.{Context,RequestContextMapper} into pkg/genericapiserver/api/request
**Based on #39350**
Automatic merge from submit-queue
DaemonSet ObservedGeneration
Extracting the ObservedGeneration part from #31693. It also implements #7328 for DaemonSets.
cc @kargakis
Automatic merge from submit-queue (batch tested with PRs 39150, 38615)
Add work queues to PV controller
PV controller should not use Controller.Requeue, as it is not available in
shared informers. We need to implement our own work queues instead, where we
can enqueue volumes/claims as we want.
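To illustrate what such a queue gives the controller, here is a deliberately small, single-goroutine sketch. It is a hypothetical stand-in, not the workqueue package used by the controller: `keyQueue` only shows FIFO ordering with duplicate keys collapsed while they are pending.

```go
package main

import "fmt"

// keyQueue is a hypothetical, non-thread-safe work queue: FIFO order, with
// duplicate keys collapsed while they are still pending.
type keyQueue struct {
	order   []string
	pending map[string]bool
}

func newKeyQueue() *keyQueue {
	return &keyQueue{pending: map[string]bool{}}
}

// Add enqueues key unless it is already waiting to be processed.
func (q *keyQueue) Add(key string) {
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get removes and returns the oldest key; ok is false when the queue is empty.
func (q *keyQueue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := newKeyQueue()
	q.Add("default/claim-a")
	q.Add("default/claim-a") // collapsed with the pending entry
	q.Add("default/claim-b")
	for key, ok := q.Get(); ok; key, ok = q.Get() {
		// A worker that fails to sync a key can simply call q.Add(key) again.
		fmt.Println("sync", key)
	}
}
```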
Automatic merge from submit-queue
Avoid unnecessary memory allocations
Low-hanging fruit in saving memory allocations. During our 5000-node kubemark runs I've seen this:
ControllerManager:
- 40.17% k8s.io/kubernetes/pkg/util/system.IsMasterNode
- 19.04% k8s.io/kubernetes/pkg/controller.(*PodControllerRefManager).Classify
Scheduler:
- 42.74% k8s.io/kubernetes/plugin/pkg/scheduler/algorithm/predicates.(*MaxPDVolumeCountChecker).filterVolumes
This PR is eliminating all of those.
Automatic merge from submit-queue (batch tested with PRs 39093, 34273)
start breaking up controller manager into two pieces
This PR addresses: https://github.com/kubernetes/features/issues/88
This commit starts breaking the controller manager into two pieces, namely,
1. cloudprovider dependent piece
2. cloudprovider agnostic piece
the controller manager has the following control loops -
- nodeController
- volumeController
- routeController
- serviceController
- replicationController
- endpointController
- resourceQuotaController
- namespaceController
- deploymentController
etc..
among the above controller loops,
- nodeController
- volumeController
- routeController
- serviceController
are cloud provider dependent. As kubernetes has evolved tremendously, it has become difficult
for different cloudproviders (currently 8), to make changes and iterate quickly. Moreover, the
cloudproviders are constrained by the kubernetes build/release lifecycle. This commit is the first
step in moving towards a kubernetes code base where cloud providers specific code will move out of
the core repository, and will be maintained by the cloud providers themselves.
I have added a new cloud provider called "external", which signals the controller-manager that
cloud provider specific loops are being run by another controller. I have added these changes in such
a way that the existing cloud providers are not affected. This change is completely backwards compatible, and does not require any changes to the way kubernetes is run today.
Finally, along with the controller-manager, the kubelet also has cloud-provider specific code, and that will be addressed in a different commit/issue.
@alena1108 @ibuildthecloud @thockin @dchen1107
**Special notes for your reviewer**:
@thockin - I'm making this **WIP** PR to ensure that I don't stray too far from everyone's view of how we should make this change. As you can see, only one controller, namely `nodecontroller`, can be disabled with the `--cloudprovider=external` flag at the moment. I'm working on cleaning up the `rancher-controller-manger` that I wrote to test this.
Secondly, I'd like to use this PR to address cloudprovider specific code in kubelet and api-server.
**Kubelet**
Kubelet uses provider specific code for node registration and for checking node-status. I thought of two ways to divide the kubelet:
- We could start a cloud provider specific kubelet on each host as a part of kubernetes, and this cloud-specific-kubelet does node registration and node-status checks.
- Create a kubelet plugin for each provider, which will be started by kubelet as a long running service. This plugin can be packaged as a binary.
I'm leaning towards the first option. That way, kubelet does not have to manage another process, and we can offload the process management of the cloud-provider-specific-kubelet to something like systemd.
@dchen1107 @thockin what do you think?
**Kube-apiserver**
Kube-apiserver uses provider specific code for distributing ssh keys to all the nodes of a cluster. Do you have any suggestions about how to address this?
**Release note**:
``` release-note
```
Automatic merge from submit-queue
Fix DaemonSet cache mutation
**What this PR does / why we need it**: stops the DaemonSetController from mutating the DaemonSet shared informer cache
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #38985
cc @deads2k @mikedanese @lavalamp @smarterclayton
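The general rule behind the DaemonSet cache fix above is to copy an object read from a shared cache before changing it. The sketch below uses hypothetical stand-in types (`daemonSet`, `daemonSetStatus`) and a hand-written `deepCopy`; it does not reproduce the controller's code.

```go
package main

import "fmt"

// daemonSetStatus and daemonSet are hypothetical, trimmed-down stand-ins.
type daemonSetStatus struct {
	NumberReady int
}

type daemonSet struct {
	Name   string
	Labels map[string]string
	Status daemonSetStatus
}

// deepCopy returns a copy that shares no mutable state with the receiver.
func (d *daemonSet) deepCopy() *daemonSet {
	out := *d
	out.Labels = make(map[string]string, len(d.Labels))
	for k, v := range d.Labels {
		out.Labels[k] = v
	}
	return &out
}

// updateStatus works on a private copy, so the object held by the shared
// cache (and seen by every other reader) stays unchanged.
func updateStatus(cached *daemonSet, ready int) *daemonSet {
	toUpdate := cached.deepCopy()
	toUpdate.Status.NumberReady = ready
	return toUpdate
}

func main() {
	cached := &daemonSet{Name: "logger", Labels: map[string]string{"app": "logger"}}
	updated := updateStatus(cached, 3)
	fmt.Println(cached.Status.NumberReady, updated.Status.NumberReady) // 0 3
}
```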
Automatic merge from submit-queue (batch tested with PRs 36888, 38180, 38855, 38590)
Fix variable shadowing in exponential backoff when deleting volumes
While https://github.com/kubernetes/kubernetes/pull/38339 implemented exponential backoff on
volume deletion, that PR suffers from a minor bug when error thrown on volume deletion is anything other than `VolumeInUse` errors - in which case exponential backoff will not work.
This PR fixes that. This PR also makes unit tests more deterministic because exponential backoff changed the way operations are permitted.
CC @jsafrane @childsb @wongma7
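For readers unfamiliar with this class of bug, the sketch below shows how `:=` inside a nested block can shadow an outer `err` in Go. It is a generic illustration, not the code changed by this PR: `deleteShadowed`, `deleteFixed`, and `errDelete` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errDelete is a sample error used by the demo below.
var errDelete = errors.New("delete failed")

// deleteShadowed has the bug: ":=" inside the if block declares a second err,
// so the outer err that later logic (such as backoff) inspects is never set.
func deleteShadowed(hasPlugin bool, deleteVolume func() error) error {
	var err error
	if hasPlugin {
		err := deleteVolume() // new variable, scoped to this block
		if err != nil {
			fmt.Println("delete failed:", err)
		}
	}
	return err // always nil
}

// deleteFixed assigns with "=" so the outer err carries the failure.
func deleteFixed(hasPlugin bool, deleteVolume func() error) error {
	var err error
	if hasPlugin {
		err = deleteVolume()
		if err != nil {
			fmt.Println("delete failed:", err)
		}
	}
	return err
}

func main() {
	failing := func() error { return errDelete }
	fmt.Println(deleteShadowed(true, failing) == nil) // true: the error is lost
	fmt.Println(deleteFixed(true, failing) == nil)    // false: the error is reported
}
```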
Automatic merge from submit-queue (batch tested with PRs 38426, 38917, 38891, 38935)
if statement must be true
**What this PR does / why we need it**:
If len(metrics.Items) == 0, the function has already returned, so the check `if len(metrics.Items) > 0` is redundant: it is always true.
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Add dsStoreSynced so we also wait on this cache when starting the
DaemonSetController.
Switch to using a fake clientset in the unit tests.
Fix TestNumberReadyStatus so it doesn't expect the cache to be mutated.
Automatic merge from submit-queue (batch tested with PRs 34353, 33837, 38878)
Revert "daemonset: bail out after we enqueue once"
I get overzealous sometimes.
Reverts kubernetes/kubernetes#38780
Automatic merge from submit-queue
Fix Recreate for Deployments and stop using events in e2e tests
Fixes https://github.com/kubernetes/kubernetes/issues/36453 by removing events from the deployment tests. The test about events during a Rolling deployment is redundant so I just removed it (we already have another test specifically for Rolling deployments).
Closes https://github.com/kubernetes/kubernetes/issues/32567 (preferred to use pod LISTs instead of a new status API field for replica sets that would add many more writes to replica sets).
@kubernetes/deployment
Automatic merge from submit-queue
daemonset: bail out after we enqueue once
This isn't terrible because we dedup in the queue but it's a waste of
cycles.
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)
Rename "release_1_5" clientset to just "clientset"
We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset"; clientset development will be done in this directory.
@kubernetes/sig-api-machinery @deads2k
```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
Automatic merge from submit-queue
controller: adopt pods only when controller is not deleted
When a replica set is deleted it continues to adopt pods, which causes the worker that handles it to error out because the adoption is [always cancelled](59c313730c/pkg/controller/controller_ref_manager.go (L110)) in the controller reference manager.
```
E1212 14:40:31.245773 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-73c3m_791e16cb-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.258462 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-73c3m_791e16cb-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.259131 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-73c3m_791e16cb-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.259149 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-wrmt8_791e3d46-c070-11e6-a234-68f72840e7df because the controlller is being deleted
I1212 14:40:31.268012 7964 deployment_controller.go:314] Error syncing deployment e2e-tests-deployment-2rr3m/test-rollover-deployment: Operation cannot be fulfilled on deployments.extensions "test-rollover-deployment": the object has been modified; please apply your changes to the latest version and try again
E1212 14:40:31.277252 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-73c3m_791e16cb-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.277276 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-wrmt8_791e3d46-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.277287 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-bmqpn_81482114-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.289148 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-b6s4x_82fa8343-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.289169 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-73c3m_791e16cb-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.289176 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-wrmt8_791e3d46-c070-11e6-a234-68f72840e7df because the controlller is being deleted
E1212 14:40:31.289181 7964 replica_set.go:616] cancel the adopt attempt for pod e2e-tests-deployment-2rr3m_test-rollover-deployment-1981456318-bmqpn_81482114-c070-11e6-a234-68f72840e7df because the controlller is being deleted
```
@kubernetes/deployment @caesarxuchao
Automatic merge from submit-queue
Curating Owners: pkg/controller
cc @jsafrane @mikedanese @bprashanth @derekwaynecarr @thockin @saad-ali
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone **lgtms** and then someone
experienced in the project **approves**), we are adding new reviewers to
existing owners files.
## If You Care About the Process:
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
perfectly: people that have made mechanical code changes (e.g. change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
## TLDR:
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull request is editable; please edit the OWNERS file to add
the names of people that should be reviewing code in the future in the **reviewers** section. You probably do NOT need to modify the **approvers** section.
3. Notify me if you want some OWNERS file to be removed. Being an approver or reviewer
of a parent directory makes you a reviewer/approver of the subdirectories too, so not all
OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue
bump log level on service status update
ref: https://github.com/kubernetes/kubernetes/issues/38349
I tried to reproduce the problem in #38349 and failed. I am not sure why the service status update failed and why the service controller skipped the status update in the next round. What I have observed is that if a service status update fails due to a conflict, the next round of processServiceUpdate corrects it.
Bumping log level to get a better signal when it occurs.
Automatic merge from submit-queue (batch tested with PRs 38608, 38299)
controller: set unavailableReplicas correctly when scaling down
```
deployment_controller.go:299] Error syncing deployment
e2e-tests-kubectl-2l7xx/e2e-test-nginx-deployment:
Deployment.extensions "e2e-test-nginx-deployment" is invalid:
status.unavailableReplicas: Invalid value: -1:
must be greater than or equal to 0
```
The validation error above usually occurs when a Deployment is
scaled down. In such a case we should default unavailableReplicas
to 0 instead of making an invalid API call.
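A minimal sketch of the clamping, assuming unavailableReplicas is derived as desired replicas minus available replicas (the function name and that formula are assumptions for illustration, not copied from the controller):

```go
package main

import "fmt"

// unavailableReplicas returns desired minus available, defaulting to 0
// when the difference would be negative, as can happen while old pods
// are still available during a scale-down.
func unavailableReplicas(desired, available int32) int32 {
	unavailable := desired - available
	if unavailable < 0 {
		return 0
	}
	return unavailable
}

func main() {
	fmt.Println(unavailableReplicas(3, 1)) // scaling up: 2 pods still missing
	fmt.Println(unavailableReplicas(2, 3)) // scaling down: clamped to 0
}
```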
@kubernetes/deployment
Automatic merge from submit-queue
Remove json serialization annotations from internal types
fixes #3933
Internal types should never be serialized, and including json serialization tags on them makes it possible to accidentally do that without realizing it.
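To illustrate the distinction, here is a small sketch using Go's standard `encoding/json` package only. The `internalMeta` and `versionedMeta` types and the `toVersioned` helper are hypothetical and much simpler than the real API types. Without tags, an accidentally serialized internal type comes out with Go field names, which makes the mistake easy to spot; the versioned type carries the tags and is the one meant to be encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalMeta is a hypothetical internal type: no json tags, so it is
// not meant to be serialized.
type internalMeta struct {
	Name            string
	ResourceVersion string
}

// versionedMeta is a hypothetical external (versioned) type and carries
// the serialization tags.
type versionedMeta struct {
	Name            string `json:"name"`
	ResourceVersion string `json:"resourceVersion,omitempty"`
}

// toVersioned converts the internal representation before encoding.
func toVersioned(in internalMeta) versionedMeta {
	return versionedMeta{Name: in.Name, ResourceVersion: in.ResourceVersion}
}

// mustJSON marshals v and panics on error; for this example only.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	in := internalMeta{Name: "web", ResourceVersion: "7"}
	fmt.Println(mustJSON(in))              // {"Name":"web","ResourceVersion":"7"}
	fmt.Println(mustJSON(toVersioned(in))) // {"name":"web","resourceVersion":"7"}
}
```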
fixes in this PR:
* types
* [x] remove json tags from internal types
* [x] fix references from serialized types to internal ObjectMeta
* generation
* [x] remove generated json codecs for internal types (they should never be used)
* kubectl
* [x] fix `apply` to operate on versioned object
* [x] fix sorting by field to operate on versioned object
* [x] fix `--record` to build annotation patch using versioned object
* hpa
* [x] fix unmarshaling to internal CustomMetricTargetList in validation
* thirdpartyresources
* [x] fix encoding API responses using internal ObjectMeta
* tests
* [x] fix tests to use versioned objects when checking encoded content
* [x] fix tests passing internal objects to generic printers
follow ups (will open tracking issues or additional PRs):
- [ ] remove json tags from internal kubeconfig types (`kubectl config set` pathfinding needs to work against external type)
- [ ] HPA should version CustomMetricTargetList serialization in annotations
- [ ] revisit how TPR resthandlers encode objects
- [ ] audit and add tests for printer use (human-readable printer requires internal versions, generic printers require external versions)
- [ ] add static analysis tests preventing new internal types from adding tags
- [ ] add static analysis tests requiring json tags on external types (and enforcing lower-case first letter)
- [ ] add more tests for `kubectl get` exercising known and unknown types with all output options