Merge pull request #55933 from bsalamat/starvation3
Automatic merge from submit-queue (batch tested with PRs 54316, 53400, 55933, 55786, 55794). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Add support to take nominated pods into account during scheduling to avoid starvation of higher priority pods **What this PR does / why we need it**: When a pod preempts lower priority pods, the preemptor gets a "nominated node name" annotation. We call such a pod a nominated pod. This PR adds the logic to take such nominated pods into account when scheduling other pods on the same node that the nominated pod is expected to run. This is needed to avoid starvation of preemptor pods. Otherwise, lower priority pods may fill up the space freed after preemption before the preemptor gets a chance to get scheduled. **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Fixes #54501 **Special notes for your reviewer**: This PR is built on top of #55109 and includes all the changes there as well. **Release note**: ```release-note Add support to take nominated pods into account during scheduling to avoid starvation of higher priority pods. ``` /sig scheduling ref/ #47604
This commit is contained in:
@@ -49,8 +49,9 @@ type ScheduleAlgorithm interface {
|
||||
Schedule(*v1.Pod, NodeLister) (selectedMachine string, err error)
|
||||
// Preempt receives scheduling errors for a pod and tries to create room for
|
||||
// the pod by preempting lower priority pods if possible.
|
||||
// It returns the node where preemption happened, a list of preempted pods, and error if any.
|
||||
Preempt(*v1.Pod, NodeLister, error) (selectedNode *v1.Node, preemptedPods []*v1.Pod, err error)
|
||||
// It returns the node where preemption happened, a list of preempted pods, a
|
||||
// list of pods whose nominated node name should be removed, and error if any.
|
||||
Preempt(*v1.Pod, NodeLister, error) (selectedNode *v1.Node, preemptedPods []*v1.Pod, cleanupNominatedPods []*v1.Pod, err error)
|
||||
// Predicates() returns a pointer to a map of predicate functions. This is
|
||||
// exposed for testing.
|
||||
Predicates() map[string]FitPredicate
|
||||
|
Reference in New Issue
Block a user