Tested with 2000 nodes, this actually meets the GCE API specifications
(which is nutty). Previous PR (#25178) was based on a mistaken
understanding of a poorly documented set of limitations, and even
poorer testing, for which I am embarrassed.
This change implements the desiredStateOfWorld populator to sync up with
the pod informer. It periodically checks each pod in the
desiredStateOfWorld and verifies whether it is still in the pod informer
cache. If it is not, it is removed from the desiredStateOfWorld.
In OpenStack Mitaka, the name field for members was added as an optional
field, but it does not exist in Liberty. Therefore, the current
LBaaS v2 implementation will not work in Liberty.
Lots of comments describing the heuristics, how they fit together, and the
limitations.
In particular, we can't guarantee correct volume placement if the set of
zones changes between volume allocations.
Automatic merge from submit-queue
Deployment controller's cleanupUnhealthyReplicas should respect minReadySeconds
```release-note
Fixed an issue where a Deployment may be scaled down further than allowed by maxUnavailable when minReadySeconds is set.
```
Fixes #26834
Detected by a flake in the deployment rollover e2e test (the only test that specifies `minReadySeconds`).
cc @kubernetes/deployment @pwittrock
cc @mqliang who first added `cleanupUnhealthyReplicas` in deployment controller
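The heart of the fix is that a ready pod only counts as available once it has been ready for at least minReadySeconds, so cleanup must not scale down past maxUnavailable based on readiness alone. Below is a minimal sketch of such an availability check, with a hypothetical `isAvailable` helper rather than the controller's actual code:
```go
package main

import (
	"fmt"
	"time"
)

// isAvailable reports whether a pod that became ready at readyAt should count
// as available, given the Deployment's minReadySeconds. Pods that are ready
// but not yet available must not be counted when deciding how far cleanup can
// scale down.
func isAvailable(readyAt time.Time, minReadySeconds int32, now time.Time) bool {
	if minReadySeconds == 0 {
		return true
	}
	return now.Sub(readyAt) >= time.Duration(minReadySeconds)*time.Second
}

func main() {
	now := time.Now()
	justReady := now.Add(-2 * time.Second)
	longReady := now.Add(-30 * time.Second)

	fmt.Println(isAvailable(justReady, 10, now)) // false: ready, but not available yet
	fmt.Println(isAvailable(longReady, 10, now)) // true: ready for longer than minReadySeconds
}
```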
Automatic merge from submit-queue
Reapply ScheduledJob tests (2ab885a53a)
Re-applied the ScheduledJob tests (#25737) which were reverted due to an integration test error in #27184.
The problem was in `TestBatchGroupBackwardCompatibility`, which tests backwards compatibility for storing jobs (`extensions/v1beta1` vs `batch/v1`) and is not needed for `batch/v2alpha1`. I've added a skip to the aforementioned test for that group. See `test/integration/master_test.go` for the actual fix.
@caesarxuchao @mikedanese ptal
@piosz @jszczepkowski @erictune fyi
Automatic merge from submit-queue
GCE provider: Limit Filter calls to regexps rather than insane blobs
Filters can't exceed 4k, and GET requests against the GCE API are also limited, so these break down in different ways at different cluster counts. Fix it by introducing an advisory `node-instance-prefix` configuration in the GCE provider that can hint the `EnsureLoadBalancer`/`UpdateLoadBalancer` code (and the firewall creation/update code). If it's not there, or wrong (a hostname that's registered violates it), just ignore it and grab the whole project.
Fixes #27731
Filters can't exceed 4k, and GET requests against the GCE API are also
limited, so these break down in different ways at different cluster
counts. Fix it by introducing an advisory node-instance-prefix
configuration in the GCE provider that can hint the
EnsureLoadBalancer/UpdateLoadBalancer code (and the firewall
creation/update code). If it's not there, or wrong (a hostname that's
registered violates it), just ignore it and grab the whole project.
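A rough sketch of the prefix-to-filter idea, with a hypothetical `instanceFilter` helper standing in for the GCE provider code:
```go
package main

import (
	"fmt"
	"strings"
)

// instanceFilter returns the list filter to use against the GCE API. With a
// trustworthy node-instance-prefix we can send a short regexp filter; if the
// hint is missing, or a registered hostname violates it, fall back to listing
// the whole project.
func instanceFilter(nodeInstancePrefix string, hostnames []string) string {
	if nodeInstancePrefix == "" {
		return "" // no hint: list every instance in the project
	}
	for _, h := range hostnames {
		if !strings.HasPrefix(h, nodeInstancePrefix) {
			return "" // hint is wrong: ignore it rather than miss instances
		}
	}
	// A prefix regexp stays tiny regardless of cluster size, well under the
	// 4k filter limit.
	return fmt.Sprintf("name eq %q", nodeInstancePrefix+".*")
}

func main() {
	fmt.Println(instanceFilter("gke-prod-", []string{"gke-prod-node-1", "gke-prod-node-2"}))
	fmt.Println(instanceFilter("gke-prod-", []string{"gke-prod-node-1", "rogue-node"}))
}
```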
When kubelet starts a pod that refers to a non-existing PV, PVC or Node, it
should clearly show that the requested object does not exist.
The previous "PersistentVolumeClaim 'default/ceph-claim-wm' is not in cache"
looks like a random kubelet hiccup, while "PersistentVolumeClaim
'default/ceph-claim-wm' not found" suggests that the object may not exist at
all and it might be a user error.
Fixes #27523
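A tiny before/after illustration of the wording change (simplified, not the kubelet's actual error paths):
```go
package main

import "fmt"

func main() {
	claim := "default/ceph-claim-wm"

	// Before: reads like a transient cache problem inside the kubelet.
	before := fmt.Errorf("PersistentVolumeClaim %q is not in cache", claim)

	// After: makes it clear the object itself may not exist, i.e. likely a user error.
	after := fmt.Errorf("PersistentVolumeClaim %q not found", claim)

	fmt.Println(before)
	fmt.Println(after)
}
```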
Automatic merge from submit-queue
Remove pod mutation for volumes annotated with supplemental groups
Removes the pod mutation added in #20490 -- partially resolves #27197 from the standpoint of making the feature inactive in 1.3. Our plan is to make this work correctly in 1.4.
@kubernetes/sig-storage
Automatic merge from submit-queue
swap FIRSTSEEN/LASTSEEN columns in `kubectl get event -w`
```release-note
Show LASTSEEN, the sorting key, as the first column in `kubectl get event` output
```
Not having LASTSEEN as the first column can confuse users into thinking
that events are not delivered in order.
Fixes #27060
Automatic merge from submit-queue
Kubelet Volume Manager Wait For Attach Detach Controller and Backoff on Error
* Closes https://github.com/kubernetes/kubernetes/issues/27483
* Modified the Attach/Detach controller to report `Node.Status.AttachedVolumes` on successful attach (unique volume name along with device path).
* Modified the Kubelet Volume Manager to wait for the Attach/Detach controller to report success before proceeding with attach.
* Closes https://github.com/kubernetes/kubernetes/issues/27492
* Implemented an exponential backoff mechanism for the volume manager and attach/detach controller to prevent operations (attach/detach/mount/unmount/wait for controller attach/etc.) from executing back to back unchecked (see the backoff sketch below).
* Closes https://github.com/kubernetes/kubernetes/issues/26679
* Modified volume `Attacher.WaitForAttach()` methods to use the device path reported by the Attach/Detach controller in `Node.Status.AttachedVolumes` instead of calling out to cloud providers.
Modify the attach/detach controller to keep track of volumes to report as
attached in the Node VolumeToAttach status.
Modify the kubelet volume manager to wait for the volume to show up in the
Node VolumeToAttach status.
Implement exponential backoff for errors in the volume manager and
attach/detach controller.
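A minimal sketch of the exponential backoff mechanism, with illustrative types rather than the real operation executor:
```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// exponentialBackoff is a minimal sketch of the mechanism: each consecutive
// failure of an operation doubles the wait before the next attempt, up to a
// cap. The real volume manager and attach/detach controller track this per
// operation; the types here are illustrative.
type exponentialBackoff struct {
	initial time.Duration
	max     time.Duration
	current time.Duration
	lastTry time.Time
}

func (b *exponentialBackoff) ready(now time.Time) bool {
	return now.Sub(b.lastTry) >= b.current
}

func (b *exponentialBackoff) failed(now time.Time) {
	b.lastTry = now
	if b.current == 0 {
		b.current = b.initial
		return
	}
	b.current *= 2
	if b.current > b.max {
		b.current = b.max
	}
}

func main() {
	attach := func() error { return errors.New("attach failed") } // stand-in volume operation
	b := &exponentialBackoff{initial: 500 * time.Millisecond, max: 4 * time.Second}

	for i := 0; i < 6; i++ {
		now := time.Now()
		if !b.ready(now) {
			fmt.Println("still backing off, waiting", b.current)
			time.Sleep(b.current)
			continue
		}
		if err := attach(); err != nil {
			b.failed(time.Now())
			fmt.Println("attach failed, next attempt no sooner than", b.current, "from now")
		}
	}
}
```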
Automatic merge from submit-queue
rkt: Map kubelet's `--stage1-image` flag to rkt's `--stage1-name` flag.
This enables rkt to use a cached stage1 image instead of unpacking the stage1 image every time for every pod.
After this change, users need to preload the stage1 images in order to enable rkt to find the stage1 image with the name specified by this flag.
Also, the cloud config is modified to pre-load the stage1 images.
cc @kubernetes/sig-rktnetes @kubernetes/sig-node
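A rough sketch of the flag mapping, with a hypothetical `stage1Arg` helper standing in for the kubelet's rkt runtime code:
```go
package main

import "fmt"

// stage1Arg converts the kubelet's --stage1-image value into the argument
// handed to rkt. Using --stage1-name lets rkt resolve a preloaded stage1
// image from its local store instead of unpacking an image file for every pod.
func stage1Arg(stage1Image string) []string {
	if stage1Image == "" {
		return nil // let rkt fall back to its default stage1
	}
	return []string{fmt.Sprintf("--stage1-name=%s", stage1Image)}
}

func main() {
	fmt.Println(stage1Arg("coreos.com/rkt/stage1-coreos:1.8.0")) // value is illustrative
	fmt.Println(stage1Arg(""))
}
```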
Automatic merge from submit-queue
clarify kubectl recursive flag description
Clarify the description of the recursive flag in `kubectl` so that it's more intuitive to the user.
This should make it into v1.3, as the rest of the recursive feature PRs will be available in 1.3.
Automatic merge from submit-queue
kubectl describe node is allocatable aware
`kubectl describe node` will render node.status.allocatable if present.
In addition, it will report allocated resources relative to node.status.allocatable, if present, instead of capacity.
The old code was confusing if you set up system-reserved and kube-reserved, since allocated resource percentages were relative to node capacity rather than the schedulable amount of resources.
This is a small but valuable usability improvement, so I think it would be good to make the 1.3 milestone.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @kubernetes/kubectl @davidopp
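A simplified sketch of the percentage change, with a hypothetical `percentOf` helper; the real code works with resource quantities rather than raw millicore integers:
```go
package main

import "fmt"

// percentOf returns requested as a percentage of the amount the scheduler can
// actually place against: allocatable when reported, otherwise capacity.
func percentOf(requestedMilliCPU, capacityMilliCPU, allocatableMilliCPU int64) float64 {
	denom := capacityMilliCPU
	if allocatableMilliCPU > 0 {
		denom = allocatableMilliCPU
	}
	return float64(requestedMilliCPU) / float64(denom) * 100
}

func main() {
	// A 4-core node with 1 core held back via system-reserved/kube-reserved.
	fmt.Printf("vs capacity:    %.0f%%\n", percentOf(1500, 4000, 0))    // misleading
	fmt.Printf("vs allocatable: %.0f%%\n", percentOf(1500, 4000, 3000)) // what can actually be scheduled
}
```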
Automatic merge from submit-queue
httplog: Increase stack size
The previous size of 2KB was, in practice, always filled mostly by HTTP
server-related frames well above the panic itself, and in some cases was
truncated before anything of real value was printed.
This increases the stack size so that panics are printed in full (well, except for really large ones).
cc @lavalamp
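For illustration, capturing a stack into a fixed-size buffer, as with `runtime.Stack`, silently truncates whatever doesn't fit; the standalone sketch below shows the effect of the buffer size and is not httplog's code:
```go
package main

import (
	"fmt"
	"runtime"
)

// stackTrace captures the current goroutine's stack into a buffer of the
// given size; whatever doesn't fit is simply cut off.
func stackTrace(size int) string {
	buf := make([]byte, size)
	n := runtime.Stack(buf, false)
	return string(buf[:n])
}

func main() {
	// 2KB is the old buffer size mentioned above; a deep handler stack can
	// exhaust it with server frames before the panic site ever appears.
	fmt.Println(stackTrace(2 * 1024))
}
```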
This enables rkt to use a cached stage1 image instead of unpacking the
stage1 image every time for every pod.
After this change, users need to preload the stage1 images in order to
enable rkt to find the stage1 image with the name specified by this flag.
Hot attach of a disk to a SCSI controller will work only if the
controller type is lsilogic-sas or paravirtual. This patch filters
the existing controllers for these types; if it doesn't find one, it
creates a new SCSI controller.
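A sketch of that selection logic, with simplified stand-in types instead of the vSphere API objects:
```go
package main

import "fmt"

// scsiController is a simplified stand-in for a vSphere virtual SCSI controller.
type scsiController struct {
	name string
	kind string // e.g. "lsilogic-sas", "paravirtual", "buslogic"
}

// pickController returns an existing controller whose type supports hot attach
// (lsilogic-sas or paravirtual), or reports that a new one must be created.
func pickController(existing []scsiController) (scsiController, bool) {
	for _, c := range existing {
		if c.kind == "lsilogic-sas" || c.kind == "paravirtual" {
			return c, true
		}
	}
	return scsiController{}, false
}

func main() {
	controllers := []scsiController{{name: "scsi0", kind: "buslogic"}}
	if c, ok := pickController(controllers); ok {
		fmt.Println("hot attaching disk to existing controller", c.name)
	} else {
		fmt.Println("no compatible controller found, creating a new SCSI controller")
	}
}
```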
Automatic merge from submit-queue
Allow disabling of dynamic provisioning
Allow administrators to opt out of dynamic provisioning. Provisioning is still on by default, which is the current behavior.
Per a conversation with @jsafrane, a boolean toggle was added and plumbed through into the controller. Deliberate disabling will simply return nil from `provisionClaim`, whereas a misconfigured provisioner will continue on and generate error events for the PVC.
@kubernetes/rh-storage @saad-ali @thockin @abhgupta
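A minimal sketch of the opt-out behavior, with illustrative names rather than the controller's real types:
```go
package main

import "fmt"

// pvController is a simplified stand-in for the PV controller with the new
// opt-out flag plumbed in.
type pvController struct {
	enableDynamicProvisioning bool
}

// provisionClaim deliberately does nothing when provisioning is disabled;
// when enabled it would go on to invoke the configured provisioner (stubbed
// out here with a print).
func (c *pvController) provisionClaim(claim string) error {
	if !c.enableDynamicProvisioning {
		return nil // administrator opted out: silently skip provisioning
	}
	fmt.Println("provisioning volume for claim", claim)
	return nil
}

func main() {
	on := &pvController{enableDynamicProvisioning: true}   // default behavior
	off := &pvController{enableDynamicProvisioning: false} // admin opt-out

	_ = on.provisionClaim("default/data")
	_ = off.provisionClaim("default/data")
}
```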
Automatic merge from submit-queue
Fill PV.Status.Message with deleter/recycler errors.
Instead of an empty `Message`, `kubectl describe pv` now shows:
```
Name: nfs
Labels: <none>
Status: Failed
Claim: default/nfs
Reclaim Policy: Recycle
Access Modes: RWX
Capacity: 1Mi
Message: Recycler failed: Pod was active on the node longer than specified deadline
Source:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 10.999.999.999
Path: /
ReadOnly: false
```
This is actually a regression since 1.2.
@kubernetes/sig-storage
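A small sketch of recording the failure reason on the PV status (an illustrative helper, not the controller's actual code):
```go
package main

import (
	"errors"
	"fmt"
)

// persistentVolumeStatus is a trimmed-down stand-in for PV.Status.
type persistentVolumeStatus struct {
	Phase   string
	Message string
}

// markFailed records why recycling or deletion failed so that
// `kubectl describe pv` can show the reason instead of an empty Message.
func markFailed(status *persistentVolumeStatus, prefix string, err error) {
	status.Phase = "Failed"
	status.Message = fmt.Sprintf("%s: %v", prefix, err)
}

func main() {
	status := &persistentVolumeStatus{Phase: "Released"}
	recyclerErr := errors.New("Pod was active on the node longer than specified deadline")
	markFailed(status, "Recycler failed", recyclerErr)
	fmt.Printf("Status:  %s\nMessage: %s\n", status.Phase, status.Message)
}
```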
Automatic merge from submit-queue
Allow emitting PersistentVolume events.
Similarly to Nodes, PersistentVolumes are not in any namespace and we should
not block events on them. Currently, these events are rejected with
`Event "nfs.145841cf9c8cfaf0" is invalid: involvedObject.namespace: Invalid value: "": does not match involvedObject`
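A sketch of the relaxed rule, with trimmed-down stand-in types instead of the real API objects and validation code:
```go
package main

import (
	"errors"
	"fmt"
)

// involvedObject is a trimmed-down reference to the object an event is about.
type involvedObject struct {
	Kind      string
	Namespace string
}

// validateEventNamespace sketches the relaxed rule: events for non-namespaced
// objects (Nodes, PersistentVolumes, ...) may carry an empty namespace, while
// events for namespaced objects must still match the object's namespace.
func validateEventNamespace(eventNamespace string, obj involvedObject, objectIsNamespaced bool) error {
	if !objectIsNamespaced {
		return nil // e.g. PersistentVolume: don't reject the event
	}
	if obj.Namespace != eventNamespace {
		return errors.New("involvedObject.namespace does not match event namespace")
	}
	return nil
}

func main() {
	pv := involvedObject{Kind: "PersistentVolume", Namespace: ""}
	fmt.Println(validateEventNamespace("default", pv, false)) // <nil>: PV events are no longer rejected

	pod := involvedObject{Kind: "Pod", Namespace: "kube-system"}
	fmt.Println(validateEventNamespace("default", pod, true)) // still rejected for namespaced objects
}
```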
Automatic merge from submit-queue
add unit and integration tests for rbac authorizer
This PR adds lots of tests for the RBAC authorizer.
The plan over the next couple of days is to add a lot more test cases.
Updates #23396
cc @erictune