Currently, the status code returned is `Unschedulable` when an internal error
found, the `Unschedulable` status is built from a `FitError` which means no
fit nodes found without a internal error.
Instead of build an Unschedulable status from the `FitError`, return the
Error status directly.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The feature gate gets locked to "true", with the goal to remove it in two
releases.
All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.
Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
The name concatenation and ownership check were originally considered small
enough to not warrant dedicated functions, but the intent of the code is more
readable with them.
Previously, the situation was ignored, which might have had the effect that Pod
scheduling continued (?) even though the Pod+PVC weren't known to be in an
acceptable state.
When adding the ephemeral volume feature, the special case for
PersistentVolumeClaim volume sources in kubelet's host path and node
limits checks was overlooked. An ephemeral volume source is another
way of referencing a claim and has to be treated the same way.
UnschedulableAndUnresolvable
This change adds an additional check in the volumebinding scheduler
plugin to handle PVC with phase ClaimLost which will allow the
scheduler to return UnschedulableAndUnresolvable during the PreFilter
stage and skip the rest of the node evaluation since the PVC is
bound to a PV that does not exist.
Without this change, the FailedScheduling error message would look like:
0/10 nodes are available: 2 node(s) had taint {node/test: true},
that the pod didn't tolerate, 6 node(s) had taint {node/unhealthy: true},
that the pod didn't tolerate, 2 pvc(s) bound to non-existent pv(s)
Which is still evaluating every single node to determine that the pod
cannot be scheduled because the PVC is bound to a non-existent PV
With this change, the FailedScheduling error message would look like:
0/10 nodes are available: 1 persistentvolumeclaim "foo" bound
to non-existent persistentvolume "bar"
Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
This change will make the message more clear when there
is a case of PVC(s) bound to PV(s) that no longer exists
and scheduler does not select the node due to this issue.
Previous error message would look like:
0/2 nodes are available: 2 pvc(s) bound to non-existent pv(s)
Updated message looks like:
0/2 nodes are available: 2 node(s) unavailable due to one or more
pvc(s) bound to non-existent pv(s)
For larger clusters with many different reasons of nodes that
are not available, the current message can be very misleading for
users to think that there are many PVCs lost due to PVs deleted but
in fact it could be just a single PVC case but many nodes not selected
by the scheduler due to this case.
Signed-off By: Yibo Zhuang <yibzhuang@gmail.com>
Checking for all topology labels is not backwards compatible. Clusters were nodes don't have zone labels effectively have default spreading disabled.
Change only applies to system defaults.
Some plugins expect the new feature gate struct. We can inject that additional
parameter via a helper function instead of having to repeat the same anonymous
function for each plugin.