Commit Graph

97 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
d19a2841e3 Merge pull request #47645 from jsafrane/integration-test-speedup
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)

Speed up attach/detach controller integration tests

Internal attach/detach controller timers should be configurable and tests should use much shorter values.

`reconcilerSyncDuration` is deliberately left out of `TimerConfig` because it's the only one that's not a constant one, it's configurable by user.

Fixes #47129 

Before:
```
--- PASS: TestPodDeletionWithDswp (63.21s)
--- PASS: TestPodUpdateWithWithADC (13.68s)
--- PASS: TestPodUpdateWithKeepTerminatedPodVolumes (13.55s)
--- PASS: TestPodAddedByDswp (183.01s)
--- PASS: TestPersistentVolumeRecycler (12.55s)
--- PASS: TestPersistentVolumeDeleter (12.54s)
--- PASS: TestPersistentVolumeBindRace (3.51s)
--- PASS: TestPersistentVolumeClaimLabelSelector (12.50s)
--- PASS: TestPersistentVolumeClaimLabelSelectorMatchExpressions (12.54s)
--- PASS: TestPersistentVolumeMultiPVs (3.05s)
--- PASS: TestPersistentVolumeMultiPVsPVCs (4.36s)
--- PASS: TestPersistentVolumeControllerStartup (7.29s)
--- PASS: TestPersistentVolumeProvisionMultiPVCs (5.02s)
--- PASS: TestPersistentVolumeMultiPVsDiffAccessModes (12.48s)
ok  	k8s.io/kubernetes/test/integration/volume	359.727s
```

After:
```
--- PASS: TestPodDeletionWithDswp (3.71s)
--- PASS: TestPodUpdateWithWithADC (3.63s)
--- PASS: TestPodUpdateWithKeepTerminatedPodVolumes (3.70s)
--- PASS: TestPodAddedByDswp (5.68s)
--- PASS: TestPersistentVolumeRecycler (12.54s)
--- PASS: TestPersistentVolumeDeleter (12.55s)
--- PASS: TestPersistentVolumeBindRace (3.55s)
--- PASS: TestPersistentVolumeClaimLabelSelector (12.50s)
--- PASS: TestPersistentVolumeClaimLabelSelectorMatchExpressions (12.52s)
--- PASS: TestPersistentVolumeMultiPVs (3.98s)
--- PASS: TestPersistentVolumeMultiPVsPVCs (3.85s)
--- PASS: TestPersistentVolumeControllerStartup (7.18s)
--- PASS: TestPersistentVolumeProvisionMultiPVCs (5.23s)
--- PASS: TestPersistentVolumeMultiPVsDiffAccessModes (12.48s)
ok  	k8s.io/kubernetes/test/integration/volume	103.267s
```

PV controller tests are the slowest ones now.

@kubernetes/sig-storage-pr-reviews 
/assign @gnufied 

```release-note
NONE
```
2017-06-27 14:08:17 -07:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f2d3220a11 run root-rewrite-import-client-go-api-types 2017-06-22 11:30:59 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Kubernetes Submit Queue
d0a2beb1e7 Merge pull request #42249 from justinsb/volumes_logging
Automatic merge from submit-queue (batch tested with PRs 42252, 42251, 42249, 47512, 47887)

volumes: Add logging when removing node fails

Part of #40583

```release-note
NONE
```
2017-06-21 22:13:30 -07:00
Jan Safranek
b28790a63b Speed up attach/detach controller integration tests
Internal attach/detach controller timers should be configurable and tests
should use much shorter values.

reconcilerSyncDuration is deliberately left out of TimerConfig because it's
the only one that's not a constant one, it's configurable by user.
2017-06-16 12:15:04 +02:00
Shyam Jeedigunta
4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Kubernetes Submit Queue
0aad9d30e3 Merge pull request #44897 from msau42/local-storage-plugin
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)

Local storage plugin

**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes.  Scheduler predicate will direct already-bound PVCs to the node that the local PV is at.  PVC binding still happens independently.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
Part of #43640

**Release note**:

```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume.  These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
2017-05-30 23:20:02 -07:00
NickrenREN
add091b1fb fix regression in UX experience for double attach volume
send event when volume is not allowed to multi-attach
2017-05-25 09:27:24 +08:00
Michelle Au
6ade5461ad Add GetNodeLabels to VolumeHost interface 2017-05-22 14:44:06 -07:00
Hemant Kumar
951a36aac7 Add Keepterminatedpodvolumes as a annotation on node
and lets make sure that controller respects it
and doesn't detaches mounted volumes.
2017-05-11 22:31:14 -04:00
Hemant Kumar
9a1a9cbe08 detach the volume when pod is terminated
Make sure volume is detached when pod is terminated because
of any reason and not deleted from api server.
2017-05-11 22:18:22 -04:00
Tomas Smetana
852c44ae59 Fix issue #34242: Attach/detach should recover from a crash
When the attach/detach controller crashes and a pod with attached PV is deleted
afterwards the controller will never detach the pod's attached volumes. To
prevent this the controller should try to recover the state from the nodes
status.
2017-04-20 13:04:50 +02:00
NickrenREN
5cafb9042b find and add active pods for dswp
loops through the list of active pods and ensures that each one exists in the desired state of the world cache
2017-04-18 11:21:37 +08:00
Matthew Wong
e1ce33d944 WaitForCacheSync before running attachdetach controller 2017-04-17 14:02:33 -04:00
Andy Goldstein
e63fcf708d Make controller Run methods consistent
- startup/shutdown logging
- wait for cache sync logging
- defer utilruntime.HandleCrash()
- wait for stop channel before exiting
2017-04-14 07:27:45 -04:00
Tomas Smetana
6898bc60ce Attach/detach controller: fix potential race in constructor 2017-03-17 13:34:53 +01:00
Justin Santa Barbara
b7edfda828 volumes: Add logging when removing node fails 2017-02-28 10:17:33 -05:00
deads2k
fd34b11e13 react to informer updates 2017-02-13 09:18:32 -05:00
Andy Goldstein
70c6087600 Replace hand-written informers with generated ones
Replace existing uses of hand-written informers with generated ones.
 Follow-up commits will switch the use of one-off informers to shared
 informers.
2017-02-06 13:49:27 -05:00
deads2k
8a12000402 move client/record 2017-01-31 19:14:13 -05:00
deads2k
b0b156b381 make tools/cache authoritative 2017-01-25 08:29:45 -05:00
Wojciech Tyczynski
bf7138652f SecretVolume using secret manager 2017-01-23 16:10:01 +01:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Kubernetes Submit Queue
7c3fff1a95 Merge pull request #39551 from chrislovecnm/reconciler-time-increases
Automatic merge from submit-queue (batch tested with PRs 39628, 39551, 38746, 38352, 39607)

Increasing times on reconciling volumes fixing impact to AWS.

#**What this PR does / why we need it**:

We are currently blocked by API timeouts with PV volumes.  See https://github.com/kubernetes/kubernetes/issues/39526.  This is a workaround, not a fix.

**Special notes for your reviewer**:

A second PR will be dropped with CLI cobra options in it, but we are starting with increasing the reconciliation periods.  I am dropping this without major testing and will test on our AWS account. Will be marked WIP until I run smoke tests.

**Release note**:

```release-note
Provide kubernetes-controller-manager flags to control volume attach/detach reconciler sync.  The duration of the syncs can be controlled, and the syncs can be shut off as well. 
```
2017-01-10 11:54:15 -08:00
chrislovecnm
ac49139c9f updates from review 2017-01-09 17:20:19 -07:00
chrislovecnm
a973c38c7d The capability to control duration via controller-manager flags,
and the option to shut off reconciliation.
2017-01-09 16:47:13 -07:00
NickrenREN
639572ac68 fix redundant alias and remove unused function 2017-01-09 17:13:09 +08:00
rkouj
d5f7610b82 Refactor operation_executor to make it unit testable 2016-12-27 15:12:16 -08:00
Chao Xu
03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Jordan Liggitt
6819706adf
Pass addressable values to DeepCopy 2016-12-08 14:16:01 -05:00
Hemant Kumar
fcf5d79be7 Add integration tests for desire state of world populator
This adds tests for code introduced here :
https://github.com/kubernetes/kubernetes/issues/26994

Via integration test we can now verify that if pod delete
event is somehow missed by AttachDetach controller - it still
get cleaned up by Desired State of World populator.
2016-12-06 06:52:52 -05:00
Kubernetes Submit Queue
c552f8918b Merge pull request #37727 from rkouj/bug-fix-upgrade-test
Automatic merge from submit-queue

SetNodeUpdateStatusNeeded whenever nodeAdd event is received

**What this PR does / why we need it**:
Bug fix and SetNodeStatusUpdateNeeded for a node whenever its api object is added. This is to ensure that we don't lose the attached list of volumes in the node when its api object is deleted and recreated.

fixes https://github.com/kubernetes/kubernetes/issues/37586
         https://github.com/kubernetes/kubernetes/issues/37585


**Special notes for your reviewer**:


<!--  Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access) 
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`. 
-->
2016-12-02 05:44:57 -08:00
rkouj
638ef1b977 SetNodeUpdateStatusNeeded whenever nodeAdd event is received 2016-11-30 21:12:34 -08:00
deads2k
d973158a4e make controller manager use specified stop channel 2016-11-28 15:02:21 -05:00
Chao Xu
7eeb71f698 cmd/kube-controller-manager 2016-11-23 15:53:09 -08:00
Rajat Ramesh Koujalagi
d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00
Paul Morie
4722cb299b Remove GetRootContext from VolumeHost 2016-11-03 12:21:19 -04:00
Jing Xu
abbde43374 Add sync state loop in master's volume reconciler
At master volume reconciler, the information about which volumes are
attached to nodes is cached in actual state of world. However, this
information might be out of date in case that node is terminated (volume
is detached automatically). In this situation, reconciler assume volume
is still attached and will not issue attach operation when node comes
back. Pods created on those nodes will fail to mount.

This PR adds the logic to periodically sync up the truth for attached volumes kept in the actual state cache. If the volume is no longer attached to the node, the actual state will be updated to reflect the truth. In turn, reconciler will take actions if needed.

To avoid issuing many concurrent operations on cloud provider, this PR
tries to add batch operation to check whether a list of volumes are
attached to the node instead of one request per volume.

More details are explained in PR #33760
2016-10-28 09:24:53 -07:00
Justin Santa Barbara
54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Mike Danese
a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Jing Xu
efaceb28cc Fix race condition in updating attached volume between master and node
This PR tries to fix issue #29324. This cause of this issue is a race
condition happens when marking volumes as attached for node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow as follows,
1. When volume is attached sucessfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When detach request comes in, it will check whether it is safe to
detach now. If the check passes, remove the volume from volumesToReportAsAttached
to indicate the volume is no longer considered as attached now.
Afterwards, reconciler tries to update node status and trigger detach
operation. If any of these operation fails, the volume is added back to
the volumesToReportAsAttached list showing that it is still attached.

These steps should make sure that kubelet get the right (might be
outdated) information about which volume is attached or not. It also
garantees that if detach operation is pending, kubelet should not
trigger any mount operations.
2016-09-12 13:51:08 -07:00
Kubernetes Submit Queue
6ce405c6ee Merge pull request #27778 from screeley44/k8-vol-executor
Automatic merge from submit-queue

Add Events for operation_executor to show status of mounts, failed/successful to show in describe events

Fixes #27590 
@saad-ali @pmorie @erinboyd

After talking with @pmorie last week about the above issue, I decided to poke around and see if I could remedy.  The refactoring broke my previous UXP merged PR's that correctly showed failed mount errors in the describe events.  However, Not sure I implemented correctly, but it tested out and seems to be working, let me know what I missed or if this is not the correct approach.

```
Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned nfs-bb-pod1 to 127.0.0.1
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedMount	Unable to mount volumes for pod "nfs-bb-pod1_default(a94f64f1-37c9-11e6-9aa5-52540073d346)": timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  44s		44s		1	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "nfs-bb-pod1"/"default". list of unattached/unmounted volumes=[nfsvol]
  38s		38s		1	{kubelet }				Warning		FailedMount	Unable to mount volumes for pod "a94f64f1-37c9-11e6-9aa5-52540073d346": Mount failed: exit status 32
Mounting arguments: nfs1.rhs:/opt/data99 /var/lib/kubelet/pods/a94f64f1-37c9-11e6-9aa5-52540073d346/volumes/kubernetes.io~nfs/nfsvol nfs []
Output: mount.nfs: Connection timed out

Resolution hint: Check and make sure the NFS Server exists (ensure that correct IPAddress/Hostname was given) and is available/reachable.
Also make sure firewall ports are open on both client and NFS Server (2049 v4 and 2049, 20048 and 111 for v3).
Use commands telnet <nfs server> <port> and showmount <nfs server> to help test connectivity.
```
2016-08-19 08:27:48 -07:00
Scott Creeley
782d7d9815 Add Events for operation_executor to show status of mounts, failed or successful 2016-08-17 09:53:47 -04:00
Avesh Agarwal
52a60fe3be Fix default resource limits (node capacities) for downward api volumes 2016-08-16 14:41:17 -04:00
saadali
afd8a58e5c Reduce DSW populator sleep period from 5 min to 1 2016-07-20 01:03:04 -07:00
saadali
0dd17fff22 Reorganize volume controllers and manager 2016-07-01 18:50:25 -07:00