Commit Graph

651 Commits

Author SHA1 Message Date
AxeZhan
be48c93689 Sched framework: expose NodeInfo in all functions of PluginsRunner interface 2023-12-15 11:30:06 +08:00
Kubernetes Prow Robot
4189053453
Merge pull request #121755 from kerthcet/fix/node-update-event
Fix nodeUpdate event missing some potential changes
2023-12-13 22:36:03 +01:00
Kensei Nakada
3b8f25dfdd fix: disable SchedulerQueueingHints feature flag by default 2023-12-13 04:16:43 +00:00
kerthcet
e5b86c1034 Fix node update event will miss some potential changes
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-27 15:33:47 +08:00
kerthcet
f77a4543d1 Unregister events in schedulingGates plugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-06 10:01:13 +08:00
Mengjiao Liu
b0a73213d6 kube-scheduler: convert the remaining part to use contextual logging 2023-10-24 17:56:48 +08:00
Kubernetes Prow Robot
6d7d249372
Merge pull request #121077 from chrishenzie/readwriteoncepod-ga
Graduate ReadWriteOncePod to GA
2023-10-24 05:26:05 +02:00
Kubernetes Prow Robot
441d4b54ae
Merge pull request #120397 from ty-dc/StaticCheck
cleanup: omit comparison with bool constants
2023-10-24 05:25:52 +02:00
Kubernetes Prow Robot
5a4e792e06
Merge pull request #120534 from pohly/dra-scheduler-ssa-as-fallback
dra scheduler: fall back to SSA for PodSchedulingContext updates
2023-10-23 21:06:58 +02:00
Kubernetes Prow Robot
581552eaf0
Merge pull request #116065 from sanposhiho/match-label-key-alternative
feature(scheduler): implement matchLabelKeys in PodAffinity and PodAntiAffinity
2023-10-23 18:39:13 +02:00
Chris Henzie
2dbd405583 Graduate ReadWriteOncePod to GA 2023-10-20 10:40:39 -07:00
Kensei Nakada
cb5dc46edf feature(scheduler): simplify QueueingHint by introducing new statuses 2023-10-19 11:02:11 +00:00
Kensei Nakada
d5d3c26337 feature(scheduler): implement matchLabelKeys in PodAffinity and PodAntiAffinity 2023-10-18 11:28:02 +00:00
Kubernetes Prow Robot
3ac83f528d
Merge pull request #119290 from carlory/add-logger
the scheduling queue logs the error and treats it as QueueAfterBackoff
2023-09-22 08:10:49 -07:00
carlory
0105a002bc when the hint fn returns error, the scheduling queue logs the error and treats it as QueueAfterBackoff.
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

Co-authored-by: Kante Yin <kerthcet@gmail.com>

Co-authored-by: XsWack <xushiwei5@huawei.com>
2023-09-21 09:40:44 +08:00
Mengjiao Liu
a7466f44e0 Change the scheduler plugins PluginFactory function to use context parameter to pass logger
- Migrated pkg/scheduler/framework/plugins/nodevolumelimits to use contextual logging
- Fix golangci-lint validation failed
- Check for plugins creation err
2023-09-20 17:49:54 +08:00
Patrick Ohly
7cac1dcf67 dra scheduler: fall back to SSA for PodSchedulingContext updates
During scheduler_perf testing, roughly 10% of the PodSchedulingContext update
operations failed with a conflict error. Using SSA would avoid that, but
performance measurements showed that this causes a considerable
slowdown (primarily because of the slower encoding with JSON instead of
protobuf, but also because server-side processing is more expensive).

Therefore a normal update is tried first and SSA only gets used when there has
been a conflict. Using SSA in that case instead of giving up outright is better
because it avoids another scheduling attempt.
2023-09-15 15:05:38 +02:00
Kensei Nakada
0d3eafdfa3
fix(scheduling_queue): always put Pods with no unschedulable plugins into activeQ/backoffQ (#119105)
* always put Pods with no unschedulable plugins into activeQ/backoffQ

* address review comments
2023-09-11 09:30:11 -07:00
Kensei Nakada
87d49a51be fix(queue_test): make sure the first bind failure via counter 2023-09-06 19:47:54 +00:00
tao.yang
b35357b6c0 cleanup: omit comparison with bool constants
Signed-off-by: tao.yang <tao.yang@daocloud.io>
2023-09-05 10:24:38 +08:00
kerthcet
e6dfdb240f Output the error message for better analylsis
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-09-04 20:31:48 +08:00
kerthcet
c29234d3e1 Add integration tests for all bind plugins skipped in TestBindPlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 22:05:58 +08:00
Patrick Ohly
2472291790 api: introduce separate VolumeResourceRequirements struct
PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To avoid such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.

The `Claims` field gets removed because risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI. Such files
have never made sense and should be fixed.

Code that uses the struct definitions needs to be updated.
2023-08-21 15:31:28 +02:00
SataQiu
ef7d404702 using wait.PollUntilContextTimeout instead of deprecated wait.Poll for pkg/scheduler
using wait.PollUntilContextTimeout instead of deprecated wait.Poll for test/integration/scheduler

using wait.PollUntilContextTimeout instead of deprecated wait.Poll for test/e2e/scheduling

using wait.ConditionWithContextFunc for PodScheduled/PodIsGettingEvicted/PodScheduledIn/PodUnschedulable/PodSchedulingError
2023-08-17 17:25:09 +08:00
git-jxj
a5b3a4b738
cleanup: Update deprecated FromInt to FromInt32 (#119858)
* redo commit

* apply suggestions from liggitt

* update Parse function based on suggestions
2023-08-16 09:33:01 -07:00
Kubernetes Prow Robot
130a5a423f
Merge pull request #119785 from sanposhiho/waitonpermit-fiterror
fix: register the plugin rejects Pods in WaitOnPermit to UnschedulablePlugins
2023-08-15 23:13:04 -07:00
Kubernetes Prow Robot
57212647e9
Merge pull request #119769 from Huang-Wei/bug/prefilter-preemption
Fix a bug that PostFilter plugin may don't function if previous PreFilter plugins return Skip
2023-08-15 23:12:50 -07:00
Kubernetes Prow Robot
67c33faddd
Merge pull request #117631 from skitt/intstr-fromint32-testing
Test: use new intstr functions
2023-08-15 15:16:27 -07:00
Kensei Nakada
cf3f0bd778 fix: register the plugin rejects Pods in WaitOnPermit to UnschedulablePlugins 2023-08-12 07:18:01 +00:00
Wei Huang
765f3916c2
Fix a bug that PostFilter plugin may not function if previous PreFilter plugins return Skip 2023-08-10 13:43:00 -07:00
Kensei Nakada
050c0437e6 fix: broadcast when pod is pushed back to activeQ directly in AddUnschedulableIfNotPresent 2023-08-09 03:32:14 +00:00
Kensei Nakada
c7e7eee554
feature(scheduling_queue): track events per Pods (#118438)
* feature(sscheduling_queue): track events per Pods

* fix typos

* record events in one slice and make each in-flight Pod to refer it

* fix: use Pop() in test before AddUnschedulableIfNotPresent to register in-flight Pods

* eliminate MakeNextPodFuncs

* call Done inside the scheduling queue

* fix comment

* implement done() not to require lock in it

* fix UTs

* improve the receivedEvents implementation based on suggestions

* call DonePod when we don't call AddUnschedulableIfNotPresent

* fix UT

* use queuehint to filter out events for in-flight Pods

* fix based on suggestion from aldo

* fix based on suggestion from Wei

* rename lastEventBefore → previousEvent

* fix based on suggestion

* address comments from aldo

* fix based on the suggestion from Abdullah

* gate in-flight Pods logic by the SchedulingQueueHints feature gate
2023-07-17 15:53:07 -07:00
carlory
0599b3caa0 change the QueueingHintFn to pass a logger 2023-07-13 00:56:41 +08:00
kerthcet
47ef977ddd Direct reference to the packages
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-08 12:03:46 +08:00
kerthcet
c0eb0caf4a Support fine-gained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
kerthcet
278a8376e1 Fix: fiterror in permit plugin not handled perfectly
We only added failed plulgins, but actually this will not work unless
we make the status with a fitError because we only copy the failured plugins
to podInfo if it is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
d9714078f8
Merge pull request #118551 from sanposhiho/event-to-register
feature(scheduler): implement ClusterEventWithHint to filter out useless events
2023-06-26 06:41:45 -07:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Kensei Nakada
be14b026e3 fix the integration test 2023-06-11 04:47:18 +00:00
Mengjiao Liu
074900e81b scheduler: update the scheduler interface and cache methods to use contextual logging 2023-05-29 13:26:32 +08:00
Patrick Ohly
9e9a6cde4b test/integration/scheduler: fix data races
The plugins get called by scheduler goroutines. At least the polling seems to
be done concurrently and thus needs locking.

Locking the PreBindPlugin state is less obvious. It might be that the scheduler
is really done with the test pod, but that ordering doesn't seem to be enough
for the race detector. It's simpler to add mutex locking.
2023-05-17 17:10:09 +02:00
Stephen Kitt
3418ceaca6
test: replace intstr.FromInt with intstr.FromInt32
This touches cases where FromInt() is used on numeric constants, or
values which are already int32s, or int variables which are defined
close by and can be changed to int32s with little impact.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-05-10 09:34:16 +02:00
Kante Yin
859359ad6a Fix strict linting
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-05-04 10:25:10 +08:00
Kante Yin
a7035f5459 Pass Context to StartTestServer
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-05-04 10:25:09 +08:00
Kante Yin
2d866ec2fc Teardown only scheduler in integration tests
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-05-04 10:09:24 +08:00
Aldo Culquicondor
d1dfa89953
Add integration test for DefaultBinder
Change-Id: I71ea08104024403a7d9ebcf3725fc3ff17997229
2023-03-14 13:57:11 -04:00
Andrea Tosatto
cae19f9e85 Remove deprecated pod-eviction-timeout flag from controller-manager 2023-03-07 18:14:18 +00:00
kerthcet
e5c812bbe7 Remove CLI flag enable-taint-manager
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-03-07 18:11:49 +00:00
Kante Yin
014be8444a Make sure resoruces will be cleaned up when initializing error
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-02-17 17:10:38 +08:00
Chris Henzie
dbc7d8ded0 feat: support preemption for pods using ReadWriteOncePod PVCs
PVCs using the ReadWriteOncePod access mode can only be referenced by a
single pod. When a pod is scheduled that uses a ReadWriteOncePod PVC,
return "Unschedulable" if the PVC is already in-use in the cluster.

To support preemption, the "VolumeRestrictions" scheduler plugin
computes cycle state during the PreFilter phase. This cycle state
contains the number of references to the ReadWriteOncePod PVCs used by
the pod-to-be-scheduled.

During scheduler simulation (AddPod and RemovePod), we add and remove
reference counts from the cycle state if they use any of these
ReadWriteOncePod PVCs.

In the Filter phase, the scheduler checks if there are any PVC reference
conflicts, and returns "Unschedulable" if there is a conflict.

This is a required feature for the ReadWriteOncePod beta. See for more context:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/2485-read-write-once-pod-pv-access-mode#beta
2023-01-30 10:59:22 -08:00