This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).
This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
Automatic merge from submit-queue
volume integration: wait for PVs before creating PVCs
The test should wait until all volumes are processed by volume controller (i.e. in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity 50000000000 but got pvc-2 capacity 52000000000".
Fixes#27179 (together with #26894)
The test should wait until all volumes are processed by volume controller (i.e.
in the controller cache) before creating a PVC.
Without that, the "best" matching PV could not be in the cache and controller
might bind the PVC to suboptiomal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity
50000000000 but got pvc-2 capacity 52000000000".
When we create a PV, we should created it withoud Spec.ClaimRef.UID.
In rare cases, when 'PV added' event with UID is processed before 'PVC
added' (created by for loop few lines above), the controller does not know
a PVC with this UID and considers the PV as released. Reclaim policy is
then executed and the PV is deleted and it's never bound.
With UID="", the controller waits for the PVC to get created and binds
it.
Different tests should use different objects and watchers - I noticed
sometimes an event from old tests leaked into subsequent test in the
same function.
And add some logs.
The test tries to bind configured nr. of PVs to the same nr. of PVCs.
'100' is used by default, which should take ~1-3 seconds (depends on log level).
Periodic sync is needed in rare cases, which may add another 10 seconds. - cache
from #25881 will help here and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance.
Set KUBE_INTEGRATION_PV_* env. variables as described in
persistent_volume_test.go and run the tests:
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PV_SYNC_PERIOD=10s KUBE_INTEGRATION_PV_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
Log level '2' is useful to get timestamps of various events like
'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs:
claims are bound'.
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
- create 100 PV, ranging from 0 to 99GB; create 1 PVC to claim 50GB. Verify only one PV is bound and rest are pending
- create 2 PVs with different access modes (RWM, RWO), 1 PVC to claim RWM PV. Verify RWM is bound and RWO is not bound.
Signed-off-by: Huamin Chen <hchen@redhat.com>
- Expand Attacher/Detacher interfaces to break up work more
explicitly.
- Add arguments to all functions to avoid having implementers store
the data needed for operations.
- Expand unit tests to check that Attach, Detach, WaitForAttach,
WaitForDetach, MountDevice, and UnmountDevice get call where
appropriet.
Recycle controller tries to recycle or delete a PV several times.
It stores count of failed attempts and timestamp of the last attempt in
annotations of the PV.
By default, the controller tries to recycle/delete a PV 3 times in
10 minutes interval. These values are configurable by
kube-controller-manager --pv-recycler-maximum-retry=X --pvclaimbinder-sync-period=Y
arguments.
Fixes#19860 (it may be easier to look at the issue to see exact sequence
to reproduce the bug and understand the fix).
When PersistentVolumeProvisionerController.reconcileClaim() is called with the
same claim in short succession (e.g. the claim is created by an user and
at the same time periodic check of all claims is scheduled), the second
reconcileClaim() call gets an old copy of the claim as its parameter.
The method should always reload the claim to get a fresh copy with all
annotations, possibly added by previous reconcileClaim() call.
The same applies to PersistentVolumeClaimBinder.syncClaim().
Also update all the test to store claims in "fake" API server before calling
syncClaim and reconcileClaim.
Combine the fields that will be used for content transformation
(content-type, codec, and group version) into a single struct in client,
and then pass that struct into the rest client and request. Set the
content-type when sending requests to the server, and accept the content
type as primary.
Will form the foundation for content-negotiation via the client.