kubernetes


Jing Xu 69b9f9b1f0 Fix issue in node status updating VolumeAttached list

During volume detach, the following sequence can occur in the reconciler:

1. A pod is being deleted.
2. The volume is removed from reportedAsAttached, so the node status
updater updates the volumeAttached list.
3. Detach fails due to some issue.
4. The volume is added back to reportedAsAttached.
5. The reconciler loops over the volume again and removes it from
reportedAsAttached.
6. Detach is not triggered because of exponential backoff; the detach call
fails with an exponential backoff error.
7. Another pod that uses the same volume is added on the same node.
8. The reconciler loops and does NOT try to trigger detach anymore.

At this point, the volume is still attached and in the actual state, but
the volumeAttached list in the node status no longer contains it, which
blocks the kubelet from mounting the volume.

The first-round fix was to add the volume back into the list of volumes
that need to be reported as attached at step 6, when the detach call fails
with an error (exponential backoff). However, this might cause a
performance issue if detach keeps failing for a while. During this time,
the volume is repeatedly removed from and added back to the node status,
which causes a surge of API calls.

So we changed the logic to first check whether the operation is safe to
retry, meaning there is no pending operation and it is not within the
exponential backoff period, before calling detach. This way we avoid
repeatedly removing the volume from and adding it back to the node status.

Change-Id: I5d4e760c880d72937d34b9d3e904ecad125f802e

2021-10-05 09:44:35 -07:00

actual_state_of_world_test.go

Recover CSI volumes from dangling attachments

2020-12-11 18:31:53 -08:00

actual_state_of_world.go

Fix issue in node status updating VolumeAttached list

2021-10-05 09:44:35 -07:00

desired_state_of_world_test.go

Add list of pods that use a volume to multiattach events

2018-01-24 13:22:03 +01:00

desired_state_of_world.go

Merge pull request #78105 from cwdsuzhou/narrow_down_lock

2019-06-14 04:08:23 -07:00