These events are currently emitted for a pod using a generic ephemeral volume:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3s default-scheduler 0/1 nodes are available: 1 persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume" not found.
Warning FailedScheduling 2s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
The one about "persistentvolumeclaim not found" is potentially confusing. It
occurs because the scheduler typically checks the pod before the ephemeral
volume controller had a chance to create the PVC.
This is a bit easier to understand:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4s default-scheduler 0/1 nodes are available: 1 waiting for ephemeral volume controller to create the persistentvolumeclaim "my-csi-app-inline-volume-my-csi-volume".
Warning FailedScheduling 2s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
This metrics is measured in seconds so it makes no sense starting from
1000 as init value. This breaks also the scheduler e2e metric thus make
users unable to compute, for example, their SLO for the scheduler.
Even if this metric is deprecated, it should behave correctly until it is
completely removed to avoid user confusion.
For example, for each volume created, the minimum value exposed
as a metric is 16.6min (1000sec/60) which is obviously wrong as logic.
In this commit, we migrate bucket creation to start from reasonable
numbers, copying the incrementation from the conventions that the
scheduler follows itself.
Signed-off-by: dntosas <ntosas@gmail.com>
All dependencies of VolumeBinding plugin from
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package moved to
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go
Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it only used in VolumeBinding plugin and (e2e) tests.
More described in issue #89930 and PR #102953.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
This PR also looses the check to allow zero since the API doc has
explained that value zero indicates no waiting.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The implementation consists of
- identifying all places where VolumeSource.PersistentVolumeClaim has
a special meaning and then ensuring that the same code path is taken
for an ephemeral volume, with the ownership check
- adding a controller that produces the PVCs for each embedded
VolumeSource.EphemeralVolume
- relaxing the PVC protection controller such that it removes
the finalizer already before the pod is deleted (only
if the GenericEphemeralVolume feature is enabled): this is
needed to break a cycle where foreground deletion of the pod
blocks on removing the PVC, which waits for deletion of the pod
The controller was derived from the endpointslices controller.
And give ownership to pkg/scheduler/framework/plugins/volumebinding
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I4bd89b1745a2be0e458601056ab905bdd6692195
This uses the information provided by a CSI driver deployment for
checking whether a node has access to enough storage to create the
currently unbound volumes, if the CSI driver opts into that checking
with CSIDriver.Spec.VolumeCapacity != false.
This resolves a TODO from commit 95b530366a.
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
The scheduler doesn't really need to know in detail which reasons
rendered a node unusable for a node. All it needs from the volume
binder is a list of reasons that it then can present to the user.
This seems a bit cleaner. But the main reason for the change is that
it simplifies the checking of CSI inline volumes and perhaps later
capacity checking. Both will lead to new failure reasons, which then
can be added without changing the interface.