The implementation consists of:
- identifying all places where VolumeSource.PersistentVolumeClaim has
a special meaning and then ensuring that the same code path is taken
for an ephemeral volume, with the ownership check (the PVC must be
owned by its pod)
- adding a controller that produces the PVCs for each embedded
VolumeSource.EphemeralVolume (see the sketch below)
- relaxing the PVC protection controller such that it removes
the finalizer before the pod is deleted (only if the
GenericEphemeralVolume feature is enabled): this is needed to
break a cycle where foreground deletion of the pod blocks on
removing the PVC, which in turn waits for deletion of the pod
The controller was derived from the endpointslices controller.
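To make the ownership relation concrete, here is a minimal sketch of
how such a controller could build the claim. It uses the current
core/v1 field names (Ephemeral, VolumeClaimTemplate), which may differ
from the names at the time of this commit, and the helper name newPVC
is hypothetical rather than the actual controller code:

    package ephemeral

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // newPVC builds the claim for one generic ephemeral volume of a
    // pod. The pod is set as the controlling owner; that owner
    // reference is what the ownership check verifies before the claim
    // is used, and it lets garbage collection delete the PVC together
    // with the pod.
    func newPVC(pod *v1.Pod, vol *v1.Volume) *v1.PersistentVolumeClaim {
        isTrue := true
        template := vol.Ephemeral.VolumeClaimTemplate
        return &v1.PersistentVolumeClaim{
            ObjectMeta: metav1.ObjectMeta{
                // The claim is named after the pod and the volume.
                Name:      pod.Name + "-" + vol.Name,
                Namespace: pod.Namespace,
                OwnerReferences: []metav1.OwnerReference{{
                    APIVersion:         "v1",
                    Kind:               "Pod",
                    Name:               pod.Name,
                    UID:                pod.UID,
                    Controller:         &isTrue,
                    BlockOwnerDeletion: &isTrue,
                }},
            },
            Spec: template.Spec,
        }
    }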
endpointSliceTracker creates a set of resource versions for each
service. Resource versions may be removed from the set when
endpointslices are deleted, but the set itself and its key in the map
are never deleted, leading to a memory leak.
This patch deletes the set when the service is deleted, and stops
initializing an empty set when the "read-only" methods "Has" and
"Stale" are called.
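A simplified sketch of the fixed tracker (types reduced to plain maps;
the real code lives in the endpointslice controller):

    package endpointslice

    import (
        "sync"

        "k8s.io/apimachinery/pkg/types"
    )

    type tracker struct {
        lock sync.Mutex
        // service -> endpointslice name -> last observed resource version
        versionsByService map[types.NamespacedName]map[string]string
    }

    // Has reports whether the slice is tracked for the service. It no
    // longer initializes an empty set for an unknown service as a side
    // effect of a read-only lookup.
    func (t *tracker) Has(svc types.NamespacedName, slice string) bool {
        t.lock.Lock()
        defer t.lock.Unlock()
        versions, ok := t.versionsByService[svc]
        if !ok {
            return false
        }
        _, ok = versions[slice]
        return ok
    }

    // DeleteService drops the whole set and its key in the map when
    // the service is deleted, which is what plugs the leak.
    func (t *tracker) DeleteService(svc types.NamespacedName) {
        t.lock.Lock()
        defer t.lock.Unlock()
        delete(t.versionsByService, svc)
    }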
Give ownership to pkg/scheduler/framework/plugins/volumebinding
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I4bd89b1745a2be0e458601056ab905bdd6692195
Currently kube-proxy defaults the min-sync-period for
iptables to 0. However, as explained by Dan Winship:
"With minSyncPeriod: 0, you run iptables-restore 100 times.
With minSyncPeriod: 1s, you run iptables-restore once.
With minSyncPeriod: 10s, you also run iptables-restore once,
but you might have to wait 10 seconds first."
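The following is only a simplified illustration of the coalescing
behaviour described in the quote, not kube-proxy's actual rate
limiter; all names here are made up:

    package main

    import (
        "sync"
        "time"
    )

    // coalescer runs fn at most once per minInterval, collapsing a
    // burst of Trigger calls into a single deferred run: the "100
    // calls, one iptables-restore" behaviour from the quote.
    type coalescer struct {
        mu          sync.Mutex
        fn          func()
        minInterval time.Duration
        lastRun     time.Time
        pending     bool
    }

    func (c *coalescer) Trigger() {
        c.mu.Lock()
        defer c.mu.Unlock()
        if c.pending {
            return // a run is already scheduled; coalesce this call
        }
        elapsed := time.Since(c.lastRun)
        if elapsed >= c.minInterval {
            c.lastRun = time.Now()
            go c.fn() // enough time has passed: sync immediately
            return
        }
        // Too soon: schedule exactly one run for when the minimum
        // interval expires; later triggers fold into it.
        c.pending = true
        time.AfterFunc(c.minInterval-elapsed, func() {
            c.mu.Lock()
            c.pending = false
            c.lastRun = time.Now()
            c.mu.Unlock()
            c.fn()
        })
    }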
Previously, we didn't check the contents of the result after calling
out to the plugin endpoint. This could have resulted in errors if the
plugin returned either 'nil' or an empty result. This patch adds that
check.
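A sketch of the added check with hypothetical types (the concrete
result type depends on the plugin API in question):

    package plugin

    import "errors"

    // allocateResult stands in for whatever the plugin endpoint
    // returns.
    type allocateResult struct {
        Devices []string
    }

    // checkResult rejects the two cases that previously slipped
    // through: a nil result and an empty one.
    func checkResult(res *allocateResult) error {
        if res == nil {
            return errors.New("plugin returned a nil result")
        }
        if len(res.Devices) == 0 {
            return errors.New("plugin returned an empty result")
        }
        return nil
    }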
Previously, we were passing the variable 'devices' to this function,
when we should have been passing 'allocated'. This bug crept in due to a
variable name change that didn't propagate its way through the entire
function. The tests added in the previous commit would have caught this.
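An illustrative reconstruction of this class of bug, with made-up
names (the real identifiers come from the code being patched, not
shown here):

    package allocator

    // allocate narrows the incoming device list and must do its
    // bookkeeping on the narrowed slice, not the original argument.
    func allocate(devices []string) []string {
        allocated := pick(devices)

        // The bug: after "devices" was partially renamed to
        // "allocated", this call still passed the full input list.
        record(allocated) // fixed: pass the allocated subset
        return allocated
    }

    func pick(devices []string) []string {
        if len(devices) == 0 {
            return nil
        }
        return devices[:1] // stand-in selection logic
    }

    func record(allocated []string) {
        _ = allocated // stand-in for the real bookkeeping
    }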
If no potential victims could be found, there is no need to evaluate the node
again, since its state didn't change.
It's safe to return and thus prevent scheduling from running the filter plugins
again.
NOTE:
A node that is filtered out by the filter plugins could pass them later
if something changes on that node, e.g. pods terminating on that node.
Previously, such a change could be caught either by the normal `schedule`
path or by `preempt` (pods may terminate while the preemption logic is
finding candidate nodes and re-evaluating the filter plugins).
This shouldn't be preemption's job, though: since the `schedule` routine
effectively runs continuously (its interval is "zero"), letting `schedule`
pick up such changes frees `preempt` from work that is irrelevant to
preemption.
For the same reason, a couple of test cases, as well as the logic that
checks for the existence of victim pods, are removed, since that
situation can no longer occur after this change.
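In sketch form (simplified, with hypothetical helper names; the real
logic lives in the scheduler's preemption code):

    package preemption

    // findPotentialVictims stands in for the logic that collects
    // lower-priority pods on the node that could be evicted.
    func findPotentialVictims(nodeName string) []string {
        return nil // stub for illustration
    }

    // selectVictimsOnNode shows the early return: with no potential
    // victims, the node's state cannot have changed in the pod's
    // favor, so re-running the filter plugins against it is pointless.
    func selectVictimsOnNode(nodeName string) ([]string, bool) {
        potentialVictims := findPotentialVictims(nodeName)
        if len(potentialVictims) == 0 {
            return nil, false // nothing to preempt; give up on this node
        }
        // ... re-evaluate the filter plugins with the victims removed ...
        return potentialVictims, true
    }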
Signed-off-by: Dave Chen <dave.chen@arm.com>