Automatic merge from submit-queue
update deployment and replicaset listers
Updates the deployment lister to avoid copies and updates the deployment controller to use shared informers.
Pushing WIP to see which tests are broken.
Automatic merge from submit-queue
Delete evicted pet
If pet was evicted by kubelet - it will stuck in this state forever.
By analogy to regular pod we need to re-create pet so that it will
be re-scheduled to another node, so in order to re-create pet
and preserve consitent naming we will delete it in petset controller
and create after that.
fixes: https://github.com/kubernetes/kubernetes/issues/31098
Automatic merge from submit-queue
Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.
This PR partially fixes issue #29324
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.
This PR partially fixes issue #29324
Automatic merge from submit-queue
PetSet replica count status test
**What this PR does / why we need it**: It adds a test for PetSet status replica count. It should fail now, but will pass when https://github.com/kubernetes/kubernetes/pull/32117 is merged.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#31965
**Special notes for your reviewer**: It will need to be rebased after #32117 is merged in, don't need detailed review before that.
**Release note**:
```release-note
NONE
```
Added fakeKubeClient and other fake types needed to test what is sent to
API when replica count is updated. These fakes can be extended for
other tests.
Automatic merge from submit-queue
Fix TestCreateWithNonExistentOwner
Fix#30228
As https://github.com/kubernetes/kubernetes/issues/30228#issuecomment-248779567 described, the GC did delete the garbage, it's the test logic failed.
The test used to rely on `gc.QueuesDrained()`, which could return before the GC finished processing. It seems to be the only possible reason of the test failure. Hence, this PR changed the test to poll for the deletion of garbage.
Automatic merge from submit-queue
MinReadySeconds / AvailableReplicas for ReplicaSets
This PR adds minReadySeconds and availableReplicas for replica sets / replication controllers
Partially addresses https://github.com/kubernetes/kubernetes/issues/28381
cc: @mfojtik
@bgrant0607 for the api changes, @janetkuo for controller changes
Automatic merge from submit-queue
PetSet returns valid replica count in status
**What this PR does / why we need it**: It prevents the PetSet replica count to be invalid regardless of pods not being created due to
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#31965
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.
Also, if we want to use different values for the Node.Name (which is
an important step for making installation easier), we need to keep
better control over this.
A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
Automatic merge from submit-queue
Refactor volume controller parameters into a structure
`persistentvolumecontroller.NewPersistentVolumeController` has 11 arguments now,
put them into a structure.
Also, rename `NewPersistentVolumeController` to `NewController`, `persistentvolume`
is already name of the package.
Fixes#30219
We had another bug where we confused the hostname with the NodeName.
To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.
A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName
Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
Automatic merge from submit-queue
controller: don't retry deployments with overlapping selectors
Returning an error will cause the deployment to be requeued. We should
just emit an event for deployments with overlapping selectors and silently
drop then out of the queue. This should be transitioned to a Condition
once we have them.
@kubernetes/deployment ptal
In order to determine whether a node should run its daemon pod,
DaemonController creates a dummy pod based on DaemonSet's template and
then uses scheduler predicates (currently GeneralPredicates) to test
whether such pod can be run by the node. The problem was that
DaemonController was not setting Namespace for the dummy pod. This was
not affecting currently used GeneralPredicates but this problem could
bite later when some namespace-dependent predicates are added to
GeneralPredicates or directly to DaemonController's node checks
(e.g. pod affinity).
Stumbled upon it while working on e2e test for #31136
Returning an error will cause the deployment to be requeued. We should
just emit an event for deployments with overlapping selectors and silently
drop then out of the queue. This should be transitioned to a Condition
once we have them.
persistentvolumecontroller.NewPersistentVolumeController has 11 arguments now,
put them into a structure.
Also, rename NewPersistentVolumeController to NewController, persistentvolume
is already name of the package.
Fixes#30219
Automatic merge from submit-queue
Do not report error when deleting an attached volume
Persistent volume controller should not send warning events to a PV and mark the PV as failed when the volume is still attached.
This happens when a user quickly deletes a pod and associated PVC - PV is slowly detaching, while the PVC is already deleted and the PV enters Failed phase.
`Deleter.Deleter` can now return `tryAgainError`, which is sent as INFO to the PV to let the user know we did not forget to delete the PV, however the PV stays in Released state. The controller tries again in the next sync (15 seconds by default).
Fixes#31511
Automatic merge from submit-queue
simplify RC listers
Make the RC and SVC listers use the common list functions that more closely match client APIs, are consistent with other listers, and avoid unnecessary copies.
Automatic merge from submit-queue
Allow garbage collection to work against different API prefixes
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Allows OpenShift to use the GC
Automatic merge from submit-queue
Remove hacks from ScheduledJobs cron spec parsing
Previusly `github.com/robfig/cron` library did not allow passing cron spec without seconds. First commit updates the library, which has additional method ParseStandard which follows the standard cron spec, iow. minute, hour, day of month, month, day of week.
@janetkuo @erictune as promised in #30227 I've updated the library and now I'm updating it in k8s
Automatic merge from submit-queue
Move HighWaterMark to the top of the struct in order to fix arm, second time
ref: #33117
Sorry for not fixing everyone at once, but I seriously wasn't prepared for that quick LGTM 😄, so here's the other half.
@lavalamp
> lgtm, but seriously, this is terrible, we probably have this bug all over. And what if someone embeds the etcdWatcher struct in something else not at the top? We need the compiler to enforce things like this, it just can't be done manually. Can you file or link a golang issue for this?
I totally agree! There isn't currently a way of programmatically detecting this unfortunately.
I guess @davecheney or @minux can explain better to you why it's so hard.
This is noted in https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/multi-platform.md as a corner case indeed.
@pwittrock This should be cherrypicked toghether with #33117
Automatic merge from submit-queue
Fix race condition in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.
So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.
So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.
This PR also changes a unit test due to the workflow changes
Automatic merge from submit-queue
Send recycle events from pod to pv.
This allows users to diagnose what's wrong with recycler. Recycler pods are started automatically with a cryptic name and they are deleted immediately when they finish.
e.g, `kubectl describe pv` could show that NFS cannot be mounted (and how many pods have tried it):
```
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
59m 59m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(5421800e-347b-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
53m 53m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(3c9809e5-347c-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
46m 46m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(250dd2a2-347d-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
40m 40m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(0d84ea33-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
33m 33m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(f5fb63bf-347e-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
27m 27m 1 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Unable to mount volumes for pod "recycler-for-nfs_default(de7128fd-347f-11e6-a79b-3c970e965218)": timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
1h 3m 75 {persistentvolume-controller } Normal RecyclerPod Recycler pod: Successfully assigned recycler-for-nfs to 127.0.0.1
1h 3m 76 {persistentvolume-controller } Normal RecyclerPod Recycler pod: Pod was active on the node longer than specified deadline
1h 1m 12 {persistentvolume-controller } Warning RecyclerPod Recycler pod: Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "recycler-for-nfs"/"default". list of unattached/unmounted volumes=[vol]
20m 1m 4 {persistentvolume-controller } Warning RecyclerPod (events with common reason combined)
```
These steps were necessary:
- added event watcher to volume.RecycleVolumeByWatchingPodUntilCompletion
- pass all these events through volume plugins to volume controller
- rework volume.RecycleVolumeByWatchingPodUntilCompletion unit tests to a table (too much copy-paste)
- fix all unit tests along the way
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
Automatic merge from submit-queue
Fix FakeNodeHandler Update behaviour
Two problems:
1. Get is always using Existing nodes slice, and you will for sure miss any updated data
2. Each Update adds a duplicate node entry to UpdatedNodes slice
For the 1st, we will try to find a node in UpdatedNodes slice (same as for the List).
2nd - append only if there is no node with same name as updated, if there is we will replace object in UpdatedNodes slice.