Commit Graph

132 Commits

Author SHA1 Message Date
Dr. Stefan Schimanski
1d053c4f7c controllers: simplify deepcopy calls 2017-08-29 19:21:24 +02:00
mtanino
5ff9dc0b3b WaitForAttach refactoring for iSCSI attacher/detacher
This change is prerequisite for implementing iSCSI attacher
and detacher.

In order to use chap authentication at iSCSI plugin after
implementing attacher and detacher, secret is needed at
AttachDisk() which is called from WaitForAttach().
To obtain secret, pod information is required, but
WaitForAttach() doesn't pass pod information inside.

This patch adds 'pod' as an argument of WaitForAttach()
and adds changes to drivers who implements WaitForAttach().

Fixes #48953
2017-08-26 17:21:34 -04:00
Cheng Xing
396c3c7c6f Adding dynamic Flexvolume plugin discovery capability, using filesystem watch. 2017-08-25 11:42:32 -07:00
Kubernetes Submit Queue
b75d423979 Merge pull request #51066 from vmware/MultiAttachVolumeIssueVsphere
Automatic merge from submit-queue

Allow attach of volumes to multiple nodes for vSphere

This is a fix for issue #50944 which doesn't allow a volume to be attached to a new node after the node is powered off where the volume was previously attached.

Current behaviour:
One of the cluster worker nodes was powered off in vCenter.
Pods running on this node have been rescheduled on different nodes but got stuck in ContainerCreating. It failed to attach the volume on the new node with error "Multi-Attach error for volume pvc-xxx, Volume is already exclusively attached to one node and can't be attached to another" and hence the application running in the pod has no data available because the volume is not attached to the new node. Since the volume is still attached to powered off node, any attempt to attach the volume on the new node failed with error "Multi-Attach error". It's stuck for 6 minutes until attach/detach controller forcefully tried to detach the volume on the powered off node. After the end of 6 minutes when volume is detached on powered off node, the volume is now successfully attached on the new node and application has now the data available.

What is expected to happen:
I would want the attach/detach controller to go ahead with the attach of the volume on new node where the pod got provisioned instead of waiting for the volume to be detached on the powered off node. It is ok to eventually delete the volume on the powered off node after 6 minutes. This way the application downtime is low and pods are up as soon as possible.

The current fix ignore, vSphere volumes/persistent volume to check for multi-attach scenario in attach/detach controller.

@jingxu97 @saad-ali : Can you please take a look at it.

@tusharnt @divyenpatel @rohitjogvmw @luomiao 

```release-note
Allow attach of volumes to multiple nodes for vSphere
```
2017-08-23 14:32:31 -07:00
Kubernetes Submit Queue
70632276bb Merge pull request #50806 from verult/VolumeNotYetAttached
Automatic merge from submit-queue (batch tested with PRs 50806, 48789, 49922, 49935, 50438)

On AttachDetachController node status update, do not retry when node …

…doesn't exist but keep the node entry in cache.



**What this PR does / why we need it**: An alternative fix for https://github.com/kubernetes/kubernetes/issues/42438 which also fixes #50721.

Instead of removing the node entry entirely from the node status update cache (which prevents the node from ever being updated even when it recovers), here the node status updater does nothing, so that there won't be an update retry until the node is re-added, where the cache entry is set to true.

Will cherry pick to prior versions after this is merged.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50721 

**Release Note**:
``` release-note
On AttachDetachController node status update, do not retry when node doesn't exist but keep the node entry in cache.
```

/assign @jingxu97 
/cc @saad-ali 
/sig storage
/release-note
2017-08-22 19:45:27 -07:00
Balu Dontu
cfdff1ae46 Multi-Attach volume fix for vSphere 2017-08-21 18:06:29 -07:00
Cheng Xing
1234d2f500 On AttachDetachController node status update, do not retry when node doesn't exist but keep the node entry in cache 2017-08-16 15:42:15 -07:00
Jan Safranek
bc0e170d9c Add pluginName to VolumeHost.GetMouter
Different plugins can get different mounter, depending where the mount
utilities are.
2017-08-14 12:16:26 +02:00
Jan Safranek
282404cbc9 Add Exec interface to VolumeHost
This exec should be used by volume plugins to execute mount utilities.
It will eventually execute things in mount containers.
2017-08-14 12:16:25 +02:00
Jeff Grafton
a7f49c906d Use buildozer to delete licenses() rules except under third_party/ 2017-08-11 09:32:39 -07:00
Jeff Grafton
33276f06be Use buildozer to remove deprecated automanaged tags 2017-08-11 09:31:50 -07:00
Hemant Kumar
f4e792ed42 Log attach detach controller skipping pods at higher priority
This will help us in tracking down problems related to pods
not getting added to desired state of world because of events
arriving out of order or some other problem related to that.
2017-07-28 13:23:28 -04:00
supereagle
adc0eef43e remove duplicated import and wrong alias name of api package 2017-07-25 10:04:25 +08:00
Jacob Simpson
b565f53822 update-bazel.sh 2017-07-17 15:06:08 -07:00
Chao Xu
9d489c8504 manual changes 2017-07-17 15:05:38 -07:00
Jacob Simpson
a765b8cfca Migrate api.Scheme to scheme.Scheme 2017-07-17 15:05:38 -07:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
Alexander Block
61275ad8d4 Fix flaky test Test_Run_OneVolumeAttachAndDetachMultipleNodesWithReadWriteMany
Only relying on the NewAttacher/Detacher call counts is not enough as they
happen in parallel to the testing/verification code and thus the actual
attaching/detaching may not be done yet, resulting in flaky test results.

Fixes #46244
2017-07-11 18:21:50 +02:00
Kubernetes Submit Queue
c662e1d7d8 Merge pull request #46949 from xingzhou/typo
Automatic merge from submit-queue

Fixed a comment typo

Typo fix

Fixed #48414 

**Release note**:
```
None
```
2017-07-03 11:33:36 -07:00
Kubernetes Submit Queue
d19a2841e3 Merge pull request #47645 from jsafrane/integration-test-speedup
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)

Speed up attach/detach controller integration tests

Internal attach/detach controller timers should be configurable and tests should use much shorter values.

`reconcilerSyncDuration` is deliberately left out of `TimerConfig` because it's the only one that's not a constant one, it's configurable by user.

Fixes #47129 

Before:
```
--- PASS: TestPodDeletionWithDswp (63.21s)
--- PASS: TestPodUpdateWithWithADC (13.68s)
--- PASS: TestPodUpdateWithKeepTerminatedPodVolumes (13.55s)
--- PASS: TestPodAddedByDswp (183.01s)
--- PASS: TestPersistentVolumeRecycler (12.55s)
--- PASS: TestPersistentVolumeDeleter (12.54s)
--- PASS: TestPersistentVolumeBindRace (3.51s)
--- PASS: TestPersistentVolumeClaimLabelSelector (12.50s)
--- PASS: TestPersistentVolumeClaimLabelSelectorMatchExpressions (12.54s)
--- PASS: TestPersistentVolumeMultiPVs (3.05s)
--- PASS: TestPersistentVolumeMultiPVsPVCs (4.36s)
--- PASS: TestPersistentVolumeControllerStartup (7.29s)
--- PASS: TestPersistentVolumeProvisionMultiPVCs (5.02s)
--- PASS: TestPersistentVolumeMultiPVsDiffAccessModes (12.48s)
ok  	k8s.io/kubernetes/test/integration/volume	359.727s
```

After:
```
--- PASS: TestPodDeletionWithDswp (3.71s)
--- PASS: TestPodUpdateWithWithADC (3.63s)
--- PASS: TestPodUpdateWithKeepTerminatedPodVolumes (3.70s)
--- PASS: TestPodAddedByDswp (5.68s)
--- PASS: TestPersistentVolumeRecycler (12.54s)
--- PASS: TestPersistentVolumeDeleter (12.55s)
--- PASS: TestPersistentVolumeBindRace (3.55s)
--- PASS: TestPersistentVolumeClaimLabelSelector (12.50s)
--- PASS: TestPersistentVolumeClaimLabelSelectorMatchExpressions (12.52s)
--- PASS: TestPersistentVolumeMultiPVs (3.98s)
--- PASS: TestPersistentVolumeMultiPVsPVCs (3.85s)
--- PASS: TestPersistentVolumeControllerStartup (7.18s)
--- PASS: TestPersistentVolumeProvisionMultiPVCs (5.23s)
--- PASS: TestPersistentVolumeMultiPVsDiffAccessModes (12.48s)
ok  	k8s.io/kubernetes/test/integration/volume	103.267s
```

PV controller tests are the slowest ones now.

@kubernetes/sig-storage-pr-reviews 
/assign @gnufied 

```release-note
NONE
```
2017-06-27 14:08:17 -07:00
Kubernetes Submit Queue
18362beb0d Merge pull request #42254 from justinsb/volumes_dont_leak_nodestatusupdateneeded
Automatic merge from submit-queue

volumes: SetNodeStatusUpdateNeeded on error

If an error happened during the UpdateNodeStatuses loop, there were some
code paths where we would not call SetNodeStatusUpdateNeeded, leaking
the state.  Add it to all paths by adding a function.

Part of #40583

```release-note
NONE
```
2017-06-22 21:43:04 -07:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f2d3220a11 run root-rewrite-import-client-go-api-types 2017-06-22 11:30:59 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Kubernetes Submit Queue
d0a2beb1e7 Merge pull request #42249 from justinsb/volumes_logging
Automatic merge from submit-queue (batch tested with PRs 42252, 42251, 42249, 47512, 47887)

volumes: Add logging when removing node fails

Part of #40583

```release-note
NONE
```
2017-06-21 22:13:30 -07:00
Kubernetes Submit Queue
b795ec7de0 Merge pull request #42251 from justinsb/simplify_append
Automatic merge from submit-queue (batch tested with PRs 42252, 42251, 42249, 47512, 47887)

volumes: simplify append-to-slice code

Minor simplification - can append to empty/nil slice.

Part of #40583

```release-note
NONE
```
2017-06-21 22:13:27 -07:00
Kubernetes Submit Queue
bebe346d5f Merge pull request #42252 from justinsb/volumes_raise_loglevels
Automatic merge from submit-queue (batch tested with PRs 42252, 42251, 42249, 47512, 47887)

volumes: promote some logs from info -> warning

Part of #40583

```release-note
NONE
```
2017-06-21 22:13:24 -07:00
Kubernetes Submit Queue
2df2247a82 Merge pull request #42250 from justinsb/volumes_getnodeandvolume_comment
Automatic merge from submit-queue

volumes: add comment on getNodeAndVolume

Add comments on getNodeAndVolume to explain the code - it is a little
subtle, and it confused me on first reading.

Part of #40583

```release-note
NONE
```
2017-06-20 15:07:47 -07:00
Jan Safranek
b28790a63b Speed up attach/detach controller integration tests
Internal attach/detach controller timers should be configurable and tests
should use much shorter values.

reconcilerSyncDuration is deliberately left out of TimerConfig because it's
the only one that's not a constant one, it's configurable by user.
2017-06-16 12:15:04 +02:00
Xing Zhou
750d0d8730 Fixed a comment typo 2017-06-05 10:47:59 +08:00
Justin Santa Barbara
d420531f95 volumes: SetNodeStatusUpdateNeeded on error
If an error happened during the UpdateNodeStatuses loop, there were some
code paths where we would not call SetNodeStatusUpdateNeeded, leaking
the state.  Add it to all paths by adding a function.

Part of #40583
2017-06-01 00:32:20 -04:00
Shyam Jeedigunta
4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Kubernetes Submit Queue
0aad9d30e3 Merge pull request #44897 from msau42/local-storage-plugin
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)

Local storage plugin

**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes.  Scheduler predicate will direct already-bound PVCs to the node that the local PV is at.  PVC binding still happens independently.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
Part of #43640

**Release note**:

```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume.  These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
2017-05-30 23:20:02 -07:00
Kubernetes Submit Queue
c34b359bd7 Merge pull request #45923 from verult/cxing/NodeStatusUpdaterFix
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)

Node status updater now deletes the node entry in attach updates...

… when node is missing in NodeInformer cache.

- Added RemoveNodeFromAttachUpdates as part of node status updater operations.



**What this PR does / why we need it**: Fixes issue of unnecessary node status updates when node is deleted.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42438

**Special notes for your reviewer**: Unit tested added, but a more comprehensive test involving the attach detach controller requires certain testing functionality that is currently absent, and will require larger effort. Will be added at a later time.

There is an edge case caused by the following steps:
1) A node is deleted and restarted. The node exists, but is not yet recognized by Kubernetes.
2) A pod requiring a volume attach with nodeName specifically set to this node.

This would make the pod stuck in ContainerCreating state. This is low-pri since it's a specific edge case that can be avoided.

**Release note**:

```release-note
NONE
```
2017-05-26 12:58:02 -07:00
Cheng Xing
f9dc2d5ca3 Node status updater now deletes the node entry in attach updates when node is missing in NodeInformer cache. Fixes #42438.
- Added RemoveNodeFromAttachUpdates as part of node status updater operations.
2017-05-24 18:31:47 -07:00
NickrenREN
add091b1fb fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
2017-05-25 09:27:24 +08:00
Justin Santa Barbara
35be997c2f volumes: promote some logs from info -> warning
Part of #40583
2017-05-23 22:36:42 -04:00
Michelle Au
6ade5461ad Add GetNodeLabels to VolumeHost interface 2017-05-22 14:44:06 -07:00
Alexander Block
06baeb33b2 Don't try to attach volumes which are already attached to other nodes 2017-05-18 06:56:30 +02:00
Kubernetes Submit Queue
6dbe853e29 Merge pull request #45544 from ianchakeres/reconciler-err-cleanup
Automatic merge from submit-queue (batch tested with PRs 45990, 45544, 45745, 45742, 45678)

Refactor reconciler volume log and error messages

**What this PR does / why we need it**:
Utilizes volume-specific error and log messages introduced in #44969, inside files that also log volume information. 

Specifically: 

- pkg/kubelet/volumemanager/reconciler/reconciler.go, 
- pkg/controller/volume/attachdetach/reconciler/reconciler.go, and
- pkg/kubelet/volumemanager/populator/desired_state_of_world_populator.go


**Which issue this PR fixes** : fixes #40905

**Special notes for your reviewer**:

**Release note**:

```release-note
```
NONE
2017-05-17 18:40:51 -07:00
Ian Chakeres
b1315f4491 Refactor reconciler volume log and error messages 2017-05-11 22:33:17 -07:00
Hemant Kumar
951a36aac7 Add Keepterminatedpodvolumes as a annotation on node
and lets make sure that controller respects it
and doesn't detaches mounted volumes.
2017-05-11 22:31:14 -04:00
Hemant Kumar
9a1a9cbe08 detach the volume when pod is terminated
Make sure volume is detached when pod is terminated because
of any reason and not deleted from api server.
2017-05-11 22:18:22 -04:00
NickrenREN
0861688237 add and clear err message in RemoveVolumeFromReportAsAttached 2017-05-08 09:37:21 +08:00
Tomas Smetana
3064fe4d39 Fix issue #44757: Flaky Test_AttachDetachControllerRecovery 2017-04-21 12:43:54 +02:00
Tomas Smetana
852c44ae59 Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with attached PV is deleted
afterwards the controller will never detach the pod's attached volumes. To
prevent this the controller should try to recover the state from the nodes
status.
2017-04-20 13:04:50 +02:00
NickrenREN
5cafb9042b find and add active pods for dswp
loops through the list of active pods and ensures that each one exists in the desired state of the world cache
2017-04-18 11:21:37 +08:00
Matthew Wong
e1ce33d944 WaitForCacheSync before running attachdetach controller 2017-04-17 14:02:33 -04:00
Mike Danese
a05c3c0efd autogenerated 2017-04-14 10:40:57 -07:00
Andy Goldstein
e63fcf708d Make controller Run methods consistent
- startup/shutdown logging
- wait for cache sync logging
- defer utilruntime.HandleCrash()
- wait for stop channel before exiting
2017-04-14 07:27:45 -04:00