UnschedulableAndUnresolvable
This change adds an additional check in the volumebinding scheduler
plugin to handle PVC with phase ClaimLost which will allow the
scheduler to return UnschedulableAndUnresolvable during the PreFilter
stage and skip the rest of the node evaluation since the PVC is
bound to a PV that does not exist.
Without this change, the FailedScheduling error message would look like:
0/10 nodes are available: 2 node(s) had taint {node/test: true},
that the pod didn't tolerate, 6 node(s) had taint {node/unhealthy: true},
that the pod didn't tolerate, 2 pvc(s) bound to non-existent pv(s)
Which is still evaluating every single node to determine that the pod
cannot be scheduled because the PVC is bound to a non-existent PV
With this change, the FailedScheduling error message would look like:
0/10 nodes are available: 1 persistentvolumeclaim "foo" bound
to non-existent persistentvolume "bar"
Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
This change will make the message more clear when there
is a case of PVC(s) bound to PV(s) that no longer exists
and scheduler does not select the node due to this issue.
Previous error message would look like:
0/2 nodes are available: 2 pvc(s) bound to non-existent pv(s)
Updated message looks like:
0/2 nodes are available: 2 node(s) unavailable due to one or more
pvc(s) bound to non-existent pv(s)
For larger clusters with many different reasons of nodes that
are not available, the current message can be very misleading for
users to think that there are many PVCs lost due to PVs deleted but
in fact it could be just a single PVC case but many nodes not selected
by the scheduler due to this case.
Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
All dependencies of VolumeBinding plugin from
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package moved to
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go
Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it only used in VolumeBinding plugin and (e2e) tests.
More described in issue #89930 and PR #102953.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
The scheduler doesn't really need to know in detail which reasons
rendered a node unusable for a node. All it needs from the volume
binder is a list of reasons that it then can present to the user.
This seems a bit cleaner. But the main reason for the change is that
it simplifies the checking of CSI inline volumes and perhaps later
capacity checking. Both will lead to new failure reasons, which then
can be added without changing the interface.