If the namespace controller encounters an error trying to delete a
single GroupVersionResource, add the error to an aggregated list of
errors and continue attempting to delete all the GroupVersionResources
instead of stopping at the first error. Return the aggregated error list
(if any) when done. This allows us to delete as much of the content in
the namespace as we can in each pass.
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)
Refactor reconciler volume log and error messages
**What this PR does / why we need it**:
Utilizes volume-specific error and log messages introduced in #44969, inside files that also log volume information.
Specifically:
- pkg/kubelet/volumemanager/reconciler/reconciler.go,
- pkg/controller/volume/attachdetach/reconciler/reconciler.go, and
- pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go
**Which issue this PR fixes**: fixes #40905
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Move all API related annotations into annotation_key_constants.go
Separate from #45869. See https://github.com/kubernetes/kubernetes/pull/45869#discussion_r116839411 for details.
This PR does nothing but move constants around :)
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45709, 41939)
delete err when return _
Signed-off-by: yupengzte <yu.peng36@zte.com.cn>
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 45247, 45810, 45034, 45898, 45899)
Apiregistration v1alpha1→v1beta1
Promoting apiregistration api from v1alpha1 to v1beta1.
API Registration is responsible for registering an API `Group`/`Version` with
another Kubernetes-like API server. The `APIService` holds information
about the other API server in the `APIServiceSpec` type as well as the general
`TypeMeta` and `ObjectMeta`. The `APIServiceSpec` type has the main
configuration needed to do the aggregation. Any request for the
specified `Group`/`Version` will be directed to the service defined by
`ServiceReference` (on port 443) after validating the target using the provided
`CABundle`, or skipping validation if the development flag `InsecureSkipTLSVerify`
is set. `Priority` controls the order of this API group in the overall
discovery document.
The return status is a set of conditions for this aggregation. Currently
there is only one condition, named "Available". If it is true, API server
requests will be redirected to the specified API server.
```release-note
API Registration is now in beta.
```
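To make the shape of the description above concrete, here is a rough sketch. The structs are simplified, hypothetical mirrors of the types named in the text, not the real API definitions, and `isAvailable` and `verifiesTarget` are illustrative helpers that do not come from the PR.

```go
package main

import "fmt"

// Simplified, hypothetical mirrors of the types named in the description;
// the real definitions carry more fields and metadata.
type ServiceReference struct {
	Namespace string
	Name      string
}

type APIServiceSpec struct {
	Service               ServiceReference
	Group                 string
	Version               string
	InsecureSkipTLSVerify bool
	CABundle              []byte
	Priority              int64
}

type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// isAvailable reports whether the "Available" condition is present and true.
func isAvailable(conds []Condition) bool {
	for _, c := range conds {
		if c.Type == "Available" {
			return c.Status == "True"
		}
	}
	return false
}

// verifiesTarget reports whether the target would be validated against the
// CA bundle rather than skipped via the development flag.
func verifiesTarget(spec APIServiceSpec) bool {
	return !spec.InsecureSkipTLSVerify && len(spec.CABundle) > 0
}

func main() {
	spec := APIServiceSpec{
		Service:  ServiceReference{Namespace: "example-ns", Name: "example-api"},
		Group:    "example.io",
		Version:  "v1",
		CABundle: []byte("placeholder"),
		Priority: 100,
	}
	fmt.Println(verifiesTarget(spec), isAvailable([]Condition{{Type: "Available", Status: "True"}}))
}
```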
Automatic merge from submit-queue (batch tested with PRs 45664, 45861)
Fix #45213: Syncing jobs would return error when podController exception
**What this PR does / why we need it**:
Jobcontroller: Syncing jobs would return error when podController exception
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes #45213
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 44337, 45775, 45832, 45574, 45758)
daemoncontroller.go:format for
**What this PR does / why we need it**:
format for.
delete redundant para.
make code clean.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44748, 45692)
Limiting client go packages visibility, round 3
Continue the work in the merged PR https://github.com/kubernetes/kubernetes/pull/45258
These packages in client-go will be gone after #44065 is fixed:
pkg/api/helper, pkg/api/util, internal version of api groups, API install packages.
This PR removes the dependency on these packages and adds Bazel visibility rules to prevent relapse.
Automatic merge from submit-queue (batch tested with PRs 45685, 45572, 45624, 45723, 45733)
resource quota full resync was removed in error
**What this PR does / why we need it**:
The quota controller should have had a full resync interval, but it was inadvertently removed in the move to shared informers.
**Which issue this PR fixes**
This fixes quota recalculation happening at the specified interval.
**Special notes for your reviewer**:
**Release note**:
```release-note
the resource quota controller was not adding quota to be resynced at proper interval
```
change import of client-go/api/helper to kubernetes/api/helper
remove unnecessary use of client-go/api.registry
change use of client-go/pkg/util to kubernetes/pkg/util
remove dependency on client-go/pkg/apis/extensions
remove unnecessary invocation of k8s.io/client-go/extension/install
change use of k8s.io/client-go/pkg/apis/authentication to v1
Automatic merge from submit-queue (batch tested with PRs 45304, 45006, 45527)
increase the QPS for namespace controller
The namespace controller is really chatty. Especially to discovery since that involves two requests for every API version available. This bumps the QPS and burst on the namespace controller to avoid being stuck waiting.
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)
cloud initialize node in external cloud controller
@thockin This PR adds support in the `cloud-controller-manager` to initialize nodes (instead of kubelet, which did it previously)
This also adds support in the kubelet to skip node cloud initialization when `--cloud-provider=external`
Specifically,
Kubelet
1. The kubelet has a new flag called `--provider-id` which uniquely identifies a node in an external DB
2. The kubelet sets a node taint - called "ExternalCloudProvider=true:NoSchedule" if cloudprovider == "external"
Cloud-Controller-Manager
1. The cloud-controller-manager listens on "AddNode" events, and then processes nodes that start with the above taint. It performs the cloud node initialization steps that were previously done by the kubelet.
2. On addition of node, it figures out the zone, region, instance-type, removes the above taint and updates the node.
3. Then periodically queries the cloudprovider for node addresses (which was previously done by the kubelet) and updates the node if there are new addresses
```release-note
NONE
```
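The initialization flow listed above can be sketched roughly as follows. All types here are simplified stand-ins, `CloudInfo` is a hypothetical lookup result, and the `example/...` label keys are placeholders rather than the labels the controller actually writes.

```go
package main

import "fmt"

// externalTaintKey is the taint key quoted in the description above.
const externalTaintKey = "ExternalCloudProvider"

type Taint struct {
	Key, Value, Effect string
}

type Node struct {
	Name   string
	Labels map[string]string
	Taints []Taint
}

// CloudInfo is a hypothetical result of querying the cloud provider.
type CloudInfo struct {
	Zone, Region, InstanceType string
}

// initializeNode fills in cloud details and removes the taint. It returns
// false and leaves the node untouched when the taint is absent.
func initializeNode(n *Node, info CloudInfo) bool {
	kept := make([]Taint, 0, len(n.Taints))
	found := false
	for _, t := range n.Taints {
		if t.Key == externalTaintKey {
			found = true
			continue
		}
		kept = append(kept, t)
	}
	if !found {
		return false
	}
	if n.Labels == nil {
		n.Labels = map[string]string{}
	}
	// Placeholder label keys, chosen only for this sketch.
	n.Labels["example/zone"] = info.Zone
	n.Labels["example/region"] = info.Region
	n.Labels["example/instance-type"] = info.InstanceType
	n.Taints = kept
	return true
}

func main() {
	n := &Node{
		Name:   "node-1",
		Taints: []Taint{{Key: externalTaintKey, Value: "true", Effect: "NoSchedule"}},
	}
	info := CloudInfo{Zone: "zone-a", Region: "region-1", InstanceType: "small"}
	fmt.Println(initializeNode(n, info), n.Labels, len(n.Taints))
}
```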
Automatic merge from submit-queue
fix the typos of e.g.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 43732, 45413)
Extend timeouts in timed_workers_test
Fix #45375
If this isn't enough, I'll rewrite it to allow injectable timers.
Automatic merge from submit-queue (batch tested with PRs 43732, 45413)
Handle maxUnavailable larger than spec.replicas
**What this PR does / why we need it**:
Handle maxUnavailable larger than spec.replicas
**Which issue this PR fixes**
fixes #42479
**Special notes for your reviewer**:
None
**Release note**:
```
NONE
```
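One plausible way to handle this case is to cap the value at the replica count. The sketch below assumes that approach; `effectiveMaxUnavailable` is a hypothetical helper name, not the function changed in this PR.

```go
package main

import "fmt"

// effectiveMaxUnavailable caps maxUnavailable at the desired replica count,
// since more pods than exist cannot be unavailable.
func effectiveMaxUnavailable(maxUnavailable, replicas int32) int32 {
	if maxUnavailable > replicas {
		return replicas
	}
	return maxUnavailable
}

func main() {
	fmt.Println(effectiveMaxUnavailable(5, 2))
	fmt.Println(effectiveMaxUnavailable(1, 2))
}
```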
Automatic merge from submit-queue
stateful_pod_control.go: format the code
**What this PR does / why we need it**:
1. Improve the quality of the code.
2. Reduce redundant parameters.
3. Add one comma.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Update token controller test to test async retry
Fixes #44819
https://github.com/kubernetes/kubernetes/pull/44625 changed the token controller to queue a retry if the live service account's resourceVersion did not match our cache.
This updates the unit test that was testing that condition to test async queue behavior (which this condition now drives)
Automatic merge from submit-queue (batch tested with PRs 42477, 44462)
Use storage.v1 instead of v1beta1
storage.v1beta1 was used to work around GKE which did not expose v1. Now that GKE is updated, we can switch everything to v1.
This is simple sed v1beta1 -> v1 + enabled a new test + changed preference of exposed interfaces in `storage/install/install.go`.
@msau42, PTAL and let me know when GKE is updated with storage v1 API and this PR can be actually merged.
@kubernetes/sig-storage-pr-reviews
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44741, 44853, 44572, 44797, 44439)
controller: fix saturation check in Deployments
Fixes https://github.com/kubernetes/kubernetes/issues/44436
@kubernetes/sig-apps-bugs
I'll cherry-pick this back to 1.6 and 1.5
Automatic merge from submit-queue (batch tested with PRs 40060, 44860, 44865, 44825, 44162)
servicecontroller: remove unused zone field
The zone field was unused, and this complicated e.g. #39996
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44862, 42241, 42101, 43181, 44147)
Feature/hpa upscale downscale delay configurable
**What this PR does / why we need it**:
Makes the "upscale forbidden window" and "downscale forbidden window" durations configurable via kube-controller-manager arguments. These are options of the horizontal pod autoscaler.
**Special notes for your reviewer**:
Please have a look @DirectXMan12 , the PR as discussed in Slack.
**Release note**:
```
Make the "upscale forbidden window" and "downscale forbidden window" durations configurable via kube-controller-manager arguments. These are options of the horizontal pod autoscaler. They are currently hardcoded to 3 minutes for upscale and 5 minutes for downscale, but a cluster administrator might want to change them to suit their needs.
```
Automatic merge from submit-queue (batch tested with PRs 43575, 44672)
Update deployment and daemonset completeness checks
Taking maxUnavailable into account for deployment completeness has caused a lot of confusion (https://github.com/kubernetes/kubernetes/issues/44395, https://github.com/kubernetes/kubernetes/issues/44657, https://github.com/kubernetes/kubernetes/issues/40496, and others as well, I am sure), so I am willing to stop using it and require all of the new Pods for a Deployment to be available before the Deployment is considered complete. As a result, both `rollout status` and ProgressDeadlineSeconds will not succeed in cases where a 1-pod Deployment never becomes successful because its Pod never transitions to ready.
@kubernetes/sig-apps-api-reviews thoughts?
```release-note
Deployments and DaemonSets are now considered complete once all of the new pods are up and running - affects `kubectl rollout status` (and ProgressDeadlineSeconds for Deployments)
```
Fixes https://github.com/kubernetes/kubernetes/issues/44395
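A rough sketch of the stricter completeness rule described above, assuming it reduces to comparing status counters against the desired replica count. `Status` and `complete` are simplified, hypothetical names, and the real check may include further conditions.

```go
package main

import "fmt"

// Status is a simplified stand-in for the relevant status counters.
type Status struct {
	Replicas          int32
	UpdatedReplicas   int32
	AvailableReplicas int32
}

// complete requires every desired replica to be updated and available;
// maxUnavailable plays no part in the decision.
func complete(specReplicas int32, s Status) bool {
	return s.UpdatedReplicas == specReplicas &&
		s.Replicas == specReplicas &&
		s.AvailableReplicas == specReplicas
}

func main() {
	// A 1-pod Deployment whose Pod never becomes ready is not complete.
	fmt.Println(complete(1, Status{Replicas: 1, UpdatedReplicas: 1, AvailableReplicas: 0}))
	fmt.Println(complete(1, Status{Replicas: 1, UpdatedReplicas: 1, AvailableReplicas: 1}))
}
```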
Automatic merge from submit-queue
Exclude master from LoadBalancer / NodePort
The servicecontroller documents that the master is excluded from the
LoadBalancer / NodePort, but this is broken for clusters where we are
using taints for the master (as introduced in 1.6), instead of marking
the master as unschedulable.
This restores the desired documented behaviour, by excluding nodes that
are labeled as masters with the new 1.6 labels, even if they use the new
1.6 taints.
Fix #33884
```release-note
Exclude nodes labeled as master from LoadBalancer / NodePort; restores documented behaviour
```
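The filtering described above might look roughly like this. The label key `node-role.kubernetes.io/master` is an assumption about which 1.6 label is meant, and `Node` and `eligible` are simplified, hypothetical names.

```go
package main

import "fmt"

// masterLabel is an assumed label key; the description only says
// "the new 1.6 labels".
const masterLabel = "node-role.kubernetes.io/master"

type Node struct {
	Name          string
	Labels        map[string]string
	Unschedulable bool
}

// eligible returns the names of nodes that may back a LoadBalancer or
// NodePort: schedulable nodes that are not labeled as masters.
func eligible(nodes []Node) []string {
	var names []string
	for _, n := range nodes {
		if n.Unschedulable {
			continue
		}
		if _, isMaster := n.Labels[masterLabel]; isMaster {
			continue
		}
		names = append(names, n.Name)
	}
	return names
}

func main() {
	nodes := []Node{
		{Name: "master-0", Labels: map[string]string{masterLabel: ""}},
		{Name: "worker-0"},
		{Name: "worker-1", Unschedulable: true},
	}
	fmt.Println(eligible(nodes))
}
```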
Automatic merge from submit-queue
Improve Service controller's code coverage a little bit
**What this PR does / why we need it**:
Improves the code coverage for Service Controller
Before
```
go test --cover ./pkg/controller/service
ok k8s.io/kubernetes/pkg/controller/service 0.101s coverage: 23.4% of statements
```
After
```
go test --cover ./pkg/controller/service/
ok k8s.io/kubernetes/pkg/controller/service 0.094s coverage: 62.0% of statements
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
More unit testing
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44625, 43594, 44756, 44730)
Check for terminating Pod prior to launching successor in StatefulSet
Modifies the sync loop for the StatefulSet controller to check if a Pod is terminating before launching its successor. Fixes #44229. Should be cherry-picked into the 1.6 branch.
**Which issue this PR fixes**
fixes #44229
```release-note
NONE
```
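The ordering rule can be pictured with the sketch below, which walks replicas in ordinal order and refuses to create a later Pod while an earlier one is still terminating. `Pod` and `nextStep` are hypothetical simplifications; the real sync loop tracks more state than this.

```go
package main

import "fmt"

// Pod is a simplified stand-in; a nil entry means the ordinal has no Pod.
type Pod struct {
	Terminating bool
}

// nextStep returns the action for the first ordinal that needs attention:
// "create" for a missing Pod, "wait" for a terminating one, or "done".
func nextStep(pods []*Pod) (string, int) {
	for i, p := range pods {
		switch {
		case p == nil:
			return "create", i
		case p.Terminating:
			return "wait", i
		}
	}
	return "done", -1
}

func main() {
	// Ordinal 0 is terminating and ordinal 1 is missing: wait rather than
	// launching ordinal 1.
	fmt.Println(nextStep([]*Pod{{Terminating: true}, nil}))
}
```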
Automatic merge from submit-queue (batch tested with PRs 44625, 43594, 44756, 44730)
Retry secret reference addition on conflict
* Tolerates leading or trailing etcd reads when fetching liveServiceAccount - fixes #25416
* Tolerates conflicts when updating the service account with the secret reference (does RetryOnConflict before deleting token and completely restarting the flow) - fixes #44054
Automatic merge from submit-queue
More RC/RS controller logging updates
We were comparing the address of the old and new RC.spec.replicas and we
have to compare the values. This only affects logging.
Update RS controller to match RC controller to log when spec.replicas
changes, not status.replicas.
@kargakis @janetkuo @sttts @liggitt
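The bug class described above is easy to reproduce in isolation. In this sketch, the two helper names are hypothetical; the first compares pointer addresses and the second compares the values they point to.

```go
package main

import "fmt"

// changedByAddress compares the pointers themselves, so two distinct
// variables always look "changed" even when they hold the same value.
func changedByAddress(old, cur *int32) bool {
	return old != cur
}

// changedByValue compares the pointed-to values, treating a nil on either
// side as a change only if the other side is non-nil.
func changedByValue(old, cur *int32) bool {
	if old == nil || cur == nil {
		return old != cur
	}
	return *old != *cur
}

func main() {
	oldReplicas, curReplicas := int32(3), int32(3)
	fmt.Println(changedByAddress(&oldReplicas, &curReplicas))
	fmt.Println(changedByValue(&oldReplicas, &curReplicas))
}
```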
Automatic merge from submit-queue (batch tested with PRs 41498, 44487)
Use len of pods in stateful set error
**What this PR does / why we need it**:
Sync stateful set reports wrong error, we need to fix it.
**Release note**:
```release-note
`NONE`
```
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)
Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with an attached PV is deleted afterwards, the controller will never detach the pod's attached volumes. To prevent this, the controller should try to recover the state from the node's status and figure out which volumes to detach. This requires some changes in the volume providers too: the only information available from the nodes is the volume name and the device path. The controller needs to find the correct volume plugin and reconstruct the volume spec from the name alone. This also required a small change in the volume plugin interface.
Fixes Issue #34242.
cc: @jsafrane @jingxu97
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
Job: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings Job into full compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that Job does not fight with other controllers over control of Pods.
Ref: #24433
**Special notes for your reviewer**:
**Release note**:
```release-note
Job controller now respects ControllerRef to avoid fighting over Pods.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
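The ownership check at the heart of ControllerRef can be sketched as follows: a controller only acts on Pods whose controlling owner reference points back at it. `OwnerReference` is a simplified stand-in for the real type, and `controllerUID` and `managedBy` are hypothetical helpers.

```go
package main

import "fmt"

// OwnerReference is a simplified stand-in; Controller marks the single
// owner that manages the object.
type OwnerReference struct {
	UID        string
	Controller *bool
}

// controllerUID returns the UID of the controlling owner, if any.
func controllerUID(owners []OwnerReference) (string, bool) {
	for _, o := range owners {
		if o.Controller != nil && *o.Controller {
			return o.UID, true
		}
	}
	return "", false
}

// managedBy reports whether the object is controlled by the given owner UID.
func managedBy(owners []OwnerReference, uid string) bool {
	got, ok := controllerUID(owners)
	return ok && got == uid
}

func main() {
	isController := true
	owners := []OwnerReference{
		{UID: "other-owner"},
		{UID: "job-1", Controller: &isController},
	}
	fmt.Println(managedBy(owners, "job-1"), managedBy(owners, "other-owner"))
}
```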
Automatic merge from submit-queue (batch tested with PRs 42177, 42176, 44721)
CronJob: Respect ControllerRef
**What this PR does / why we need it**:
This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings CronJob into compliance with ControllerRef. See the individual commit messages for details.
**Which issue this PR fixes**:
This ensures that other controllers do not fight over control of objects that a CronJob owns.
**Special notes for your reviewer**:
**Release note**:
```release-note
CronJob controller now respects ControllerRef to avoid fighting with other controllers.
```
cc @erictune @kubernetes/sig-apps-pr-reviews
When the attach/detach controller crashes and a pod with an attached PV is deleted
afterwards, the controller will never detach the pod's attached volumes. To
prevent this, the controller should try to recover the state from the node's
status.