Commit Graph

132 Commits

Author SHA1 Message Date
k8s-merge-robot
9aeeef1d81 Merge pull request #26414 from jsafrane/reduce-sync-period
Automatic merge from submit-queue

Reduce volume controller sync period

fixes #24236 and most probably also fixes #25294.
Needs #25881! With the cache, binder is not affected by sync period. Without the cache, binding of 1000 PVCs takes more than 5 minutes (instead of ~70 seconds).

15 seconds were chosen by fair 2d10 roll :-)
2016-05-30 05:54:51 -07:00
k8s-merge-robot
5643b7498f Merge pull request #25881 from jsafrane/devel/pv-add-cache
Automatic merge from submit-queue

volume controller: Add cache with the latest version of PVs and PVCs

When the controller binds a PV to PVC, it saves both objects to etcd. However, there is still an old version of these objects in the controller Informer cache. So, when a new PVC comes, the PV is still seen as available and may get bound to the new PVC. This will be blocked by etcd, still, it creates unnecessary traffic that slows everything down.

To make everything worse, when periodic sync with the old PVC is performed, this PVC is seen by the controller as Pending (while it's already Bound on etcd) and will be bound to a different PV. Writing to this PV won't be blocked by etcd, only subsequent write of the PVC fails. So, the controller will need to roll back the PV in another transaction(s). The controller can keep itself pretty busy this way.

Also, we save bound PVs (and PVCs) as two transactions - we save say PV.Spec first and then .Status. The controller gets "PV.Spec updated" event from etcd and tries to fix the Status, as it seems to the controller it's outdated. This write again fails - there already is a correct version in etcd.

As we can't influence the Informer cache, it is read-only to the controller, this patch introduces second cache in the controller, which holds latest and greatest version on PVs and PVCs to prevent these useless writes to etcd . It gets updated with events from etcd *and* after etcd confirms successful save of PV/PVC modified by the controller.

The cache stores only *pointers* to PVs/PVCs, so in ideal case it shares the actual object data with the informer cache. They will diverge only for a short time when the controller modifies something and the informer cache did not get update events yet.

@kubernetes/sig-storage
2016-05-30 04:13:18 -07:00
Jan Safranek
2aa9f1dd8f Reduce volume controller sync period 2016-05-30 09:59:31 +02:00
Alex Mohr
9803393a67 Merge pull request #25960 from jsafrane/do-not-sort-bind
volume controller: Speed up binding by not sorting volumes
2016-05-26 15:47:14 -07:00
k8s-merge-robot
da7d3c189a Merge pull request #25869 from jsafrane/devel/operation-logs
Automatic merge from submit-queue

volume controller: use better operation names

Using volume/claim.UID in the operation name is not really useful, as UIDs are not logged by rest of the controller. On the other hand, volume.Name and claim.Namespace/Name is logged pretty often and it would help to log these also in operation name. Still, I'd prefer to have the operation name really unique to be protected from users deleting a volume and quickly creating another one with the same name, so UID is still part of the operation name.

This has been already proven to be very useful in controller debugging.
2016-05-25 17:58:07 -07:00
Brendan Burns
88663fc58b Add some extra checking in the tests to prevent flakes. 2016-05-23 16:25:02 -07:00
k8s-merge-robot
62a8394eb4 Merge pull request #25263 from jsafrane/devel/adopt-recycle-pod
Automatic merge from submit-queue

volume recycler: Don't start a new recycler pod if one already exists.

Recycling is a long duration process and when the recycler controller is restarted in the meantime, it should not start a new recycler pod if there is one already running.

This means that the recycler pod must have deterministic name based on name of the recycled PV, we then get name conflicts when creating the pod.

Two things need to be changed:

- recycler controller and recycler plugins must pass the PV.Name to place, where the pod is created. This is most of the patch and it should be pretty straightforward.

- create recycler pod with deterministic name and check "already exists" error.

When at it, remove useless 'resourceVersion' argument and make log messages starting with lowercase.

There is an unit test to check the behavior + there is an e2e test that checks that regular recycling is not broken (it does not try to run two recycler pods in parallel as the recycler is single-threaded now).
2016-05-21 02:28:26 -07:00
Jan Safranek
c7da3abd5b volume controller: Speed up binding by not sorting volumes
The binder sorts all available volumes first, then it filters out volumes
that cannot be bound by processing each volume in a loop and then finds
the smallest matching volume by binary search.

So, if we process every available volume in a loop, we can also remember the
smallest matching one and save us potentially long sorting (and quick binary
search).
2016-05-20 12:26:39 +02:00
Jan Safranek
0279232360 volume controller: Add cache with the latest version of PVs and PVCs
When the controller binds a PV to PVC, it saves both objects to etcd.
However, there is still an old version of these objects in the controller
Informer cache. So, when a new PVC comes, the PV is still seen as available
and may get bound to the new PVC. This will be blocked by etcd, still, it
creates unnecessary traffic that slows everything down.

Also, we save bound PV/PVC as two transactions - we save PV/PVC.Spec first
and then .Status. The controller gets "PV/PVC.Spec updated" event from etcd
and tries to fix the Status, as it seems to the controller it's outdated.
This write again fails - there already is a correct version in etcd.

We can't influence the Informer cache, it is read-only to the controller.

To prevent these useless writes to etcd, this patch introduces second cache
in the controller, which holds latest and greatest version on PVs and PVCs.
It gets updated with events from etcd *and* after etcd confirms successful
save of PV/PVC modified by the controller.

The cache stores only *pointers* to PVs/PVCs, so in ideal case it shares the
actual object data with the informer cache. They will diverge only when
the controller modifies something and the informer cache did not get update
events yet.
2016-05-19 16:09:06 +02:00
Jan Safranek
e9a6ec29a0 volume controller: use better operation names
Using volume/claim.UID in the operation name is not really useful, as UIDs
are not logged by rest of the controller. On the other hand, volume.Name and
claim.Namespace/Name is logged pretty often and it would help to log these
also in operation name.

This has been already proven to be very useful in controller debugging.
2016-05-19 14:19:33 +02:00
Jan Safranek
0ee9160f88 volume recycler: Don't start a new recycler pod if one already exists.
Recycling is a long duration process and when the recycler controller is
restarted in the meantime, it should not start a new recycler pod if there is
one already running.

This means that the recycler pod must have deterministic name based on name
of the recycled PV, we then get name conflicts when creating the pod.

Two things need to be changed:
- recycler controller and recycler plugins must pass the PV.Name to place,
  where the pod is created.

- create recycler pod with deterministic name and check "already exists" error.

When at it, remove useless 'resourceVersion' argument and make log messages
starting with lowercase.
2016-05-19 12:58:25 +02:00
Jan Safranek
61d630ddf7 volume controller: Fix method name in a log message
It's deleteVolume, not deleteClaim.
2016-05-19 12:54:17 +02:00
Jan Safranek
01b20d8e77 Generate shorter provisioned PV names.
GCE PD names are generated out of provisioned PV.Name, therefore it should be
as short as possible and still unique.
2016-05-18 10:06:51 +02:00
Jan Safranek
79b91b9ee0 Refactor persistent volume initialization
There should be only one initialization function, shared by the real
controller and unit tests.
2016-05-18 10:06:51 +02:00
Jan Safranek
7f549511e2 Big move and rename
- remove persistentvolume_ prefix from all files
- split controller.go into controller.go and controller_base.go (to have them
  under 1500 lines for github)
2016-05-18 10:06:51 +02:00
Jan Safranek
c5fe1f943c Fixed binder logging
- we need the original volume/claim in error paths
- don't report version conflicts as errors (they happen pretty often and we
  recover from them)
2016-05-18 10:06:51 +02:00
Jan Safranek
41adcc5496 Speed up binding of provisioned volumes
This fixes e2e test for provisioning - it expects that provisioned volumes
are bound quickly.

Majority of this patch is update of test framework needs to initialize the
controller appropriately.
2016-05-18 10:06:51 +02:00
Jan Safranek
c6f05c8056 provisioning: Add unit testso for provisioning errors. 2016-05-18 10:06:51 +02:00
Jan Safranek
c24b33793c unit test: Add possibility to inject kubeclient errors. 2016-05-18 10:06:51 +02:00
Jan Safranek
92dc159ab6 Delete provisioned volumes that are not needed.
We should delete volumes that are provisioned for a claim and the claim
gets bound to different volume during the provisioning.
2016-05-18 10:06:51 +02:00
Jan Safranek
9fb0f7a3fd provisioning: Unit tests 2016-05-18 10:06:51 +02:00
Jan Safranek
514d595881 provisioning: Implement provisioner 2016-05-18 10:06:51 +02:00
Jan Safranek
75b0e2ad63 provisioning: Refactor volume plugins.
NewPersistentVolumeTemplate() and Provision() are merged into one call.
2016-05-18 10:06:51 +02:00
Jan Safranek
dd7890c362 delete: Implement Deleter 2016-05-18 10:06:51 +02:00
Jan Safranek
22e68d4622 recycler: unit tests
- Add reclaim policy to newVolume() call.
- Implement reactor Volumes().Get().
- Implement mock volume plugin.
- Add recycler tests.
- Add a synchronization condition to controller.scheduleOperation
  - we need to pause the controller here, let the test to do some bad things
    to the controller and test error cases in recycleVolumeOperation.

Test framework gets more and more complicated... But this is the last piece,
I promise.
2016-05-18 10:06:24 +02:00
Jan Safranek
a08d826ca5 Make a separate functions to emit events and change status.
These two seem to be always used together.
2016-05-18 10:06:24 +02:00
Jan Safranek
1feb346830 recycler: implement recycler
Also update the old unit test to pass. New unit tests will be added in
subsequent commit.
2016-05-18 10:06:24 +02:00
Jan Safranek
56cae2dc20 unit test framework: Wait for all running operations to finish during all tests. 2016-05-18 10:06:24 +02:00
Jan Safranek
cf68370371 recycler: Maintain a list of long-running operations.
We need to keep list of running recyclers, deleters and provisioners in
memory in order not to start a new recycling/deleting/provisioning twice
for the same volume/claim.

This will be eventually replaced by GoRoutineMap from PR #24838.
2016-05-18 10:06:24 +02:00
Jan Safranek
4e47f69cba recycler: Implement volume host interfaces.
We need the controller to implement volume.VolumeHost interface to be able
to call recycle plugins.
2016-05-18 10:06:24 +02:00
Jan Safranek
a17f0d5949 Move release logic to standalone function. 2016-05-18 10:06:23 +02:00
Jan Safranek
7b73384fda Add controller method tests. 2016-05-18 10:05:14 +02:00
Jan Safranek
af295719f6 Add events. 2016-05-17 15:14:11 +02:00
Jan Safranek
61019b2401 Process deleted PVs
To speed up marking claims as "lost".
2016-05-17 15:14:10 +02:00
Jan Safranek
50b61ae168 Add "multi-sync" tests.
These test will call syncVolume/syncClaim until they reach consistent state.
2016-05-17 15:14:09 +02:00
Jan Safranek
f4f252e81c Implement syncVolume. 2016-05-17 15:14:08 +02:00
Jan Safranek
5949b956f5 Implement syncClaim with bound claims. 2016-05-17 15:14:06 +02:00
Jan Safranek
eff6b50b93 Bind unbound claims in syncClaim. 2016-05-17 15:14:06 +02:00
Jan Safranek
e620bfc9cc Add unit test framework.
It's quite complicated one, see subsequent commits for usage.
2016-05-17 15:14:05 +02:00
Jan Safranek
a195802d3e Make standalone function to check for (pre-)bound volumes.
Note the semantic change, we now check for UID=""
2016-05-17 15:14:04 +02:00
Jan Safranek
20305f9235 Don't process events until fully initialized.
We do not want to process any volume / claim events until both PV and claim
caches are fully loaded.
2016-05-17 15:14:03 +02:00
Jan Safranek
71aa892a86 Implement volume controller skeleton.
This is a simple controller that watches changes of PersistentVolumes and
PersistentVolumeClaims.
2016-05-17 15:14:02 +02:00
Jan Safranek
b86e5923b2 Rename types.go to persistentvolume_index.go
With some changes:
- make some method private, nobody seems to use them.
- adapt to framework.NewIndexerInformer instead of using custom cache.
2016-05-17 15:14:01 +02:00
Jan Safranek
6fa527a460 Remove all three PersistentVolume controllers.
We will add new ones gradually in smaller chunks.
2016-05-10 17:57:54 +02:00
Prashanth Balasubramanian
6bc3052551 PetSet alpha controller 2016-05-04 18:39:17 -07:00
gmarek
3171aac57c Generated clients can return their RESTClients, RESTClient can return its RateLimiter 2016-04-27 22:15:10 +02:00
Sami Wagiaalla
99744df6ee Add loging to the recently recycled PV section 2016-04-04 10:07:03 -04:00
Andy Goldstein
5ce2e6d576 Check claimRef UID when processing a recycled PV, take 2
Reorder code a bit so it doesn't allow a case where you get some error other than "not found"
combined with a non-nil Claim.

Add test case.
2016-03-28 15:25:17 -04:00
saadali
79012f6d53 Rename volume.Builder to Mounter and volume.Cleaner to Unmounter 2016-03-25 11:29:58 -07:00
Sami Wagiaalla
b77abe56a2 Check claimRef UID when processing a recycled PV 2016-03-16 17:00:05 -04:00