Commit Graph

77 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
c34b359bd7 Merge pull request #45923 from verult/cxing/NodeStatusUpdaterFix
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)

Node status updater now deletes the node entry in attach updates...

… when node is missing in NodeInformer cache.

- Added RemoveNodeFromAttachUpdates as part of node status updater operations.



**What this PR does / why we need it**: Fixes issue of unnecessary node status updates when node is deleted.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42438

**Special notes for your reviewer**: Unit tested added, but a more comprehensive test involving the attach detach controller requires certain testing functionality that is currently absent, and will require larger effort. Will be added at a later time.

There is an edge case caused by the following steps:
1) A node is deleted and restarted. The node exists, but is not yet recognized by Kubernetes.
2) A pod requiring a volume attach with nodeName specifically set to this node.

This would make the pod stuck in ContainerCreating state. This is low-pri since it's a specific edge case that can be avoided.

**Release note**:

```release-note
NONE
```
2017-05-26 12:58:02 -07:00
Cheng Xing
f9dc2d5ca3 Node status updater now deletes the node entry in attach updates when node is missing in NodeInformer cache. Fixes #42438.
- Added RemoveNodeFromAttachUpdates as part of node status updater operations.
2017-05-24 18:31:47 -07:00
NickrenREN
add091b1fb fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
2017-05-25 09:27:24 +08:00
Alexander Block
06baeb33b2 Don't try to attach volumes which are already attached to other nodes 2017-05-18 06:56:30 +02:00
Hemant Kumar
951a36aac7 Add Keepterminatedpodvolumes as a annotation on node
and lets make sure that controller respects it
and doesn't detaches mounted volumes.
2017-05-11 22:31:14 -04:00
NickrenREN
0861688237 add and clear err message in RemoveVolumeFromReportAsAttached 2017-05-08 09:37:21 +08:00
Tomas Smetana
852c44ae59 Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with attached PV is deleted
afterwards the controller will never detach the pod's attached volumes. To
prevent this the controller should try to recover the state from the nodes
status.
2017-04-20 13:04:50 +02:00
Mike Danese
a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
Justin Santa Barbara
1d357b334f volumes: simplify append-to-slice code 2017-02-28 10:37:28 -05:00
Justin Santa Barbara
0ee71ef214 volumes: add comment on getNodeAndVolume
Add comments on getNodeAndVolume to explain the code - it is a little
subtle, and it confused me on first reading.
2017-02-28 10:30:10 -05:00
Harry Zhang
70941f65bf Do not swallow error in volume 2017-01-25 21:29:48 +08:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Jeff Grafton
20d221f75c Enable auto-generating sources rules 2017-01-05 14:14:13 -08:00
Mike Danese
161c391f44 autogenerated 2016-12-29 13:04:10 -08:00
Mike Danese
c87de85347 autoupdate BUILD files 2016-12-12 13:30:07 -08:00
rkouj
638ef1b977 SetNodeUpdateStatusNeeded whenever nodeAdd event is received 2016-11-30 21:12:34 -08:00
Chao Xu
bcc783c594 run hack/update-all.sh 2016-11-23 15:53:09 -08:00
Chao Xu
7eeb71f698 cmd/kube-controller-manager 2016-11-23 15:53:09 -08:00
Jing Xu
abbde43374 Add sync state loop in master's volume reconciler
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.

This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.

To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.

More details are explained in PR #33760
2016-10-28 09:24:53 -07:00
Mike Danese
3b6a067afc autogenerated 2016-10-21 17:32:32 -07:00
Jing Xu
9e8edf6baf Fix issue in updating device path when volume is attached multiple times
When volume is attached, it is possible that the actual state
already has this volume object (e.g., the volume is attached to multiple
nodes, or volume was detached and attached again). We need to update the
device path in such situation, otherwise, the device path would be stale
information and cause kubelet mount to the wrong device.

This PR partially fixes issue #29324
2016-10-03 17:14:23 -07:00
Justin Santa Barbara
54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Jing Xu
14cad206f5 Fix race conditino in setting node statusUpdateNeeded flag
This PR fixes the race condition in setting node statusUpdateNeeded flag
in master's attachdetach controller. This flag is used to indicate
whether a node status has been updated by the node_status_updater or
not. When updater finishes update a node status, it is set to false.
When the node status is changed such as volume is detached or new volume
is attached to the node, the flag is set to true so that updater can
update the status again. The previous workflow has a race condition as
follows
1. updater gets the currently attached volume list from the node which needs to be
updated.
2. A new volume A is attached to the same node right after 1 and set the
flag to TRUE
3. updater updates the node attached volume list (which does not include volume A) and then set the flag to FALSE.
The result is that volume A will be never added to the attached volume
list so at node side, this volume is never attached.

So in this PR, the flag is set to FALSE when updater tries to get the
attached volume list (as in an atomic operation). So in the above
example, after step 2, the flag will be TRUE again, in step 3, updater
does not set the flag if updates is sucessful. So after that, flag is
still TRUE and in next round of update, the node status will be updated.

This PR also changes a unit test due to the workflow changes
2016-09-22 14:02:30 -07:00
Jing Xu
efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. This cause of this issue is a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
Jing Xu
b9157b7524 Post event message for volume attachment
This PR is to add event message when attaching volume fails to help
users to debug. For detach failure, may address in a different PR since
it requires more data structure change.
2016-09-01 16:24:36 -07:00
Paul Morie
c884297990 Fix collisions issues / timeouts for mounts
For non-attachable volumes, do not call GetVolumeName on the plugin and instead
generate a unique name based on the identity of the pod and the name of the volume
within the pod.
2016-07-27 17:53:50 -04:00
saadali
0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00