3077 Commits

Author SHA1 Message Date
AxeZhan
2863b3d1ab Revert "refactor: simplify RunScorePlugins for readability + performance"
This reverts commit a7eb7ed5c6.
2023-07-20 10:50:32 +08:00
Kubernetes Prow Robot
15450a3f02 Merge pull request #119318 from codefromthecrypt/CycleState-docs
Improve docs on framework.CycleState
2023-07-18 07:19:10 -07:00
Adrian Cole
89ab733760 Improve docs on framework.CycleState
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Kante Yin <kerthcet@gmail.com>
2023-07-18 14:48:20 +08:00
Kensei Nakada
c7e7eee554 feature(scheduling_queue): track events per Pods (#118438)
* feature(scheduling_queue): track events per Pods

* fix typos

* record events in one slice and make each in-flight Pod refer to it

* fix: use Pop() in test before AddUnschedulableIfNotPresent to register in-flight Pods

* eliminate MakeNextPodFuncs

* call Done inside the scheduling queue

* fix comment

* implement done() not to require lock in it

* fix UTs

* improve the receivedEvents implementation based on suggestions

* call DonePod when we don't call AddUnschedulableIfNotPresent

* fix UT

* use queuehint to filter out events for in-flight Pods

* fix based on suggestion from aldo

* fix based on suggestion from Wei

* rename lastEventBefore → previousEvent

* fix based on suggestion

* address comments from aldo

* fix based on the suggestion from Abdullah

* gate in-flight Pods logic by the SchedulingQueueHints feature gate
2023-07-17 15:53:07 -07:00
Kensei Nakada
34640772ed implement SchedulerQueueingHints feature gate 2023-07-14 12:31:27 +00:00
carlory
0599b3caa0 change the QueueingHintFn to pass a logger 2023-07-13 00:56:41 +08:00
Patrick Ohly
6f1a29520f scheduler/dra: reduce pod scheduling latency
This is a combination of two related enhancements:
- By implementing a PreEnqueue check, the initial pod scheduling
  attempt for a pod with a claim template gets avoided when the claim
  does not exist yet.
- By implementing cluster event checks, only those pods get
  scheduled for which something changed, and they get scheduled
  immediately without delay.
2023-07-12 11:17:04 +02:00
Patrick Ohly
e01db32573 scheduler util: handle cache.DeletedFinalStateUnknown in As
Informer callbacks must be prepared to get cache.DeletedFinalStateUnknown as
the deleted object. They can use that as a hint that some information may have
been missed, but typically they just retrieve the stored object inside it.
2023-07-12 11:07:59 +02:00
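
The tombstone handling described in the commit above can be sketched as follows. This is a minimal illustration, not the scheduler's actual `As` helper: `deletedFinalStateUnknown` is a local stand-in for client-go's `cache.DeletedFinalStateUnknown`, and `pod` and `asPod` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// deletedFinalStateUnknown is a local stand-in (assumption) for client-go's
// cache.DeletedFinalStateUnknown, which carries the last known object state.
type deletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// pod is a hypothetical, simplified object type used only in this sketch.
type pod struct{ Name string }

// asPod returns the *pod carried by obj, unwrapping a tombstone if needed.
// The second result reports whether obj arrived wrapped in a tombstone,
// which callers may treat as a hint that some events were missed.
func asPod(obj interface{}) (*pod, bool, error) {
	tombstone := false
	if t, ok := obj.(deletedFinalStateUnknown); ok {
		obj = t.Obj
		tombstone = true
	}
	p, ok := obj.(*pod)
	if !ok {
		return nil, tombstone, fmt.Errorf("expected *pod, got %T", obj)
	}
	return p, tombstone, nil
}

func main() {
	wrapped := deletedFinalStateUnknown{Key: "default/web", Obj: &pod{Name: "web"}}
	p, stale, err := asPod(wrapped)
	fmt.Println(p.Name, stale, err)
}
```

A delete handler written this way accepts both a plain object and a tombstone, so it does not fail a type assertion when the informer missed the final delete event.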
Patrick Ohly
ef48efc736 scheduler dynamicresources: minor logging improvements
This makes some complex values a bit more readable.
2023-07-12 11:07:59 +02:00
Kubernetes Prow Robot
e0dafe57a3 Merge pull request #117351 from pohly/dra-generated-resource-claim-names
DRA: generated resource claim names
2023-07-11 10:33:11 -07:00
Patrick Ohly
444d23bd2f dra: generated name for ResourceClaim from template
Generating the name avoids all potential name collisions. It's not clear how
much of a problem that was because users can avoid them and the deterministic
names for generic ephemeral volumes have not led to reports from users. But
using generated names is not too hard either.

What makes it relatively easy is that the new pod.status.resourceClaimStatus
map stores the generated name for kubelet and node authorizer, i.e. the
information in the pod is sufficient to determine the name of the
ResourceClaim.

The resource claim controller becomes a bit more complex and now needs
permission to modify the pod status. The new failure scenario of "ResourceClaim
created, updating pod status fails" is handled with the help of a new special
"resource.kubernetes.io/pod-claim-name" annotation that together with the owner
reference identifies exactly for what a ResourceClaim was generated, so
updating the pod status can be retried for existing ResourceClaims.

The transition from deterministic names is handled with a special case for that
recovery code path: a ResourceClaim with no annotation and a name that follows
the Kubernetes <= 1.27 naming pattern is assumed to be generated for that pod
claim and gets added to the pod status.

There's no immediate need for it, but just in case that it may become relevant,
the name of the generated ResourceClaim may also be left unset to record that
no claim was needed. Components processing such a pod can skip whatever they
normally would do for the claim. To ensure that they do and also cover other
cases properly ("no known field is set", "must check ownership"),
resourceclaim.Name gets extended.
2023-07-11 14:23:48 +02:00
Kubernetes Prow Robot
c95b16b280 Merge pull request #118608 from utam0k/podtopologyspread-prescore-skip
Return Skip in PodTopologySpread#PreScore under specific conditions
2023-07-10 09:27:07 -07:00
Kubernetes Prow Robot
0ae9aaacfa Merge pull request #118271 from tangwz/add_nodeports_prefilter_skip_status
feat(NodePorts): return Skip status in PreFilter
2023-07-09 20:49:04 -07:00
Kubernetes Prow Robot
09899b986f Merge pull request #118926 from mengjiao-liu/improve-scheduler-use-cmp.Diff
scheduler test: Use cmp.Diff instead of reflect.DeepEqual for pkg/scheduler/internal/cache
2023-07-08 21:51:04 -07:00
Gunju Kim
7286d122fb Mark pods with restartable init containers as UnschedulableAndUnresolvable
This marks the pods with restartable init containers as
`UnschedulableAndUnresolvable` if the feature gate is disabled to avoid
the inconsistency in resource calculation between the scheduler and the
older kubelet.
2023-07-08 07:26:13 +09:00
kerthcet
c0eb0caf4a Support fine-grained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
Kubernetes Prow Robot
b07a843cb5 Merge pull request #119046 from kerthcet/fix/handle-unschedule-plugins
Fix fitError in Permit plugin not handled perfectly
2023-07-06 21:01:03 -07:00
kerthcet
278a8376e1 Fix: fitError in permit plugin not handled perfectly
We only added failed plugins, but actually this will not work unless
we make the status with a fitError because we only copy the failed plugins
to podInfo if it is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
aeed7da616 Merge pull request #119077 from sanposhiho/follow-up-hint
clean up the implementation around QueueingHintFn
2023-07-06 13:39:15 -07:00
Kensei Nakada
be0db3f93d clean up the implementation around QueueingHintFn 2023-07-06 16:07:39 +00:00
tangwz
1bf2f6c9c0 feat(NodePorts): return Skip status in PreFilter 2023-07-06 08:42:08 +08:00
Kubernetes Prow Robot
293c1b8378 Merge pull request #118025 from AxeZhan/score-metrics
feature(scheduler): plugin_evaluation_total metric support preScore/score
2023-07-05 05:14:56 -07:00
Mengjiao Liu
443bf3b01b scheduler test: Use cmp.Diff instead of reflect.DeepEqual for pkg/scheduler/internal/cache 2023-07-05 16:00:25 +08:00
Kubernetes Prow Robot
0852a2759a Merge pull request #118965 from mengjiao-liu/use-cmp.Diff-scheduler-queue
scheduler test: Use cmp.Diff instead of reflect.DeepEqual for pkg/scheduler/internal/queue/
2023-07-04 05:29:05 -07:00
Heba Elayoty
d548983dbb Use table-driven test for TestPerPodSchedulingMetrics
Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>
2023-06-29 14:51:55 -07:00
Shingo Omura
d53762ec3a remove unnecessary comment in pkg/scheduler/framework.QueueingHintFn
event is not passed to QueueingHintFn, but a comment about it still exists.
event is unnecessary in QueueingHintFn because QueueingHintFn is used in
ClusterEventWithHint and ClusterEventWithHint already has ClusterEvent.

Signed-off-by: Shingo Omura <everpeace@gmail.com>
2023-06-29 21:22:20 +09:00
Kubernetes Prow Robot
3a9c639d5a Merge pull request #118312 from mengjiao-liu/improve-scheduler-cache-test
scheduler: add test name and remove redundant test tables to improve cache_test.go
2023-06-29 02:51:36 -07:00
Mengjiao Liu
72294e4eff scheduler test: Use cmp.Diff instead of reflect.DeepEqual for pkg/scheduler/internal/queue/ 2023-06-29 15:28:42 +08:00
utam0k
ef26510164 Return Skip in PodTopologySpread#PreScore under specific conditions
Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-06-28 12:08:10 +00:00
Kubernetes Prow Robot
52457842d1 Merge pull request #117055 from cyclinder/csi_migration
remove CSI-migration gate
2023-06-28 04:28:31 -07:00
kidddddddddddddddddddddd
9c7166ff63 wait for eventhandlers to sync before run scheduler 2023-06-27 23:19:34 +08:00
Kubernetes Prow Robot
d9714078f8 Merge pull request #118551 from sanposhiho/event-to-register
feature(scheduler): implement ClusterEventWithHint to filter out useless events
2023-06-26 06:41:45 -07:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Heba Elayoty
902c711fb4 Unset gated pod info timestamp in addToActiveQ
Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>
2023-06-21 14:16:08 -07:00
Kubernetes Prow Robot
bc8e312857 Merge pull request #117903 from sourcelliu/dynamic
feature(DynamicResources): return Skip in PreFilter
2023-06-20 17:48:20 -07:00
Kubernetes Prow Robot
9740bc0e0a Merge pull request #118606 from sanposhiho/refactor-score
refactor: simplify RunScorePlugins for readability + performance
2023-06-13 21:41:57 -07:00
Mengjiao Liu
22de2c27d1 scheduler: improve cache_test.go
- Add test name to enhance test readability
- Remove redundant test tables
2023-06-12 19:02:50 +08:00
Kensei Nakada
a7eb7ed5c6 refactor: simplify RunScorePlugins for readability + performance 2023-06-11 03:29:05 +00:00
Kubernetes Prow Robot
4483bf66fe Merge pull request #116635 from mengjiao-liu/contextual-logging-plugin-interpodaffinity
Migrated `pkg/scheduler/framework/plugins/interpodaffinity` to contextual logging
2023-06-09 08:14:13 -07:00
Kubernetes Prow Robot
d58492b19c Merge pull request #114688 from sanposhiho/sanposhiho/scheduling-one-score
feature(schedule_one): use heap to find the highest score node
2023-06-08 15:40:12 -07:00
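
The idea named in the commit title above, using a heap to find the highest-scoring node, can be sketched with Go's `container/heap`. This is an illustration under assumptions, not the scheduler's implementation: `nodeScore`, `scoreHeap`, and `topNodes` are hypothetical names, and tie-breaking between equal scores is not modeled here.

```go
package main

import (
	"container/heap"
	"fmt"
)

// nodeScore is a hypothetical pairing of a node name with its total score.
type nodeScore struct {
	Name  string
	Score int64
}

// scoreHeap is a max-heap of nodeScore ordered by Score.
type scoreHeap []nodeScore

func (h scoreHeap) Len() int           { return len(h) }
func (h scoreHeap) Less(i, j int) bool { return h[i].Score > h[j].Score }
func (h scoreHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *scoreHeap) Push(x any)        { *h = append(*h, x.(nodeScore)) }
func (h *scoreHeap) Pop() any {
	old := *h
	n := len(old)
	item := old[n-1]
	*h = old[:n-1]
	return item
}

// topNodes returns up to k entries in descending score order. Building the
// heap is O(n) and each pop is O(log n), so a full sort is avoided when
// only the first few entries are needed.
func topNodes(scores []nodeScore, k int) []nodeScore {
	h := make(scoreHeap, len(scores))
	copy(h, scores) // leave the caller's slice untouched
	heap.Init(&h)
	var out []nodeScore
	for i := 0; i < k && h.Len() > 0; i++ {
		out = append(out, heap.Pop(&h).(nodeScore))
	}
	return out
}

func main() {
	scores := []nodeScore{{"node-a", 10}, {"node-b", 42}, {"node-c", 7}}
	fmt.Println(topNodes(scores, 1)[0].Name)
}
```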
Kubernetes Prow Robot
2057a48ee5 Merge pull request #114771 from sanposhiho/scheduling_perf_scheduler_scheduling_attempt_duration_seconds
feature(scheduler_perf): distinguish result in scheduler_scheduling_attempt_duration_seconds metric result
2023-06-07 06:18:13 -07:00
Yuan Chen
9eaa50cc82 Rename scheduler queue variables for consistency 2023-06-05 09:02:06 -07:00
SataQiu
410b6023d6 scheduler: fix code style issues for pkg/scheduler 2023-06-05 17:29:49 +08:00
cyclinder
8e4228a8c1 remove CSI-migration gate 2023-06-04 18:40:17 +08:00
Kensei Nakada
a4ea058cc7 feature(scheduler_perf): distinguish result in scheduler_scheduling_attempt_duration_seconds metric result 2023-06-02 14:45:55 +00:00
Mengjiao Liu
6d23da045f Migrated pkg/scheduler/framework/plugins/interpodaffinity to use contextual logging 2023-06-01 18:24:54 +08:00
likakuli
5a14573258 clean: use info instead of error to log queue closed message when scheduler exit
Signed-off-by: likakuli <1154584512@qq.com>
2023-05-31 11:07:24 +08:00
Mengjiao Liu
074900e81b scheduler: update the scheduler interface and cache methods to use contextual logging 2023-05-29 13:26:32 +08:00
Kensei Nakada
0535e74224 feature(schedule_one): use heap to find the highest score node 2023-05-27 11:34:32 +00:00
Kubernetes Prow Robot
f7cfb5f02f Merge pull request #118257 from pohly/dra-scheduler-plugin-loopvar-fix
dra scheduler plugin test: fix loopvar bug and "reserve" expected data
2023-05-26 06:06:53 -07:00
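
The entry above does not say what form the loopvar bug took. As a generic illustration only, the classic Go loop-variable pitfall and its usual fix look like the sketch below; `testCase` and `collect` are hypothetical names, not code from the DRA plugin test.

```go
package main

import "fmt"

// testCase is a hypothetical table-driven test entry.
type testCase struct {
	name string
	want int
}

// collect returns one pointer per test case. Before Go 1.22 the range
// variable was shared across iterations, so taking its address without the
// `tc := tc` copy would make every pointer refer to the last element.
// The copy is harmless on Go 1.22 and later.
func collect(cases []testCase) []*testCase {
	var out []*testCase
	for _, tc := range cases {
		tc := tc // per-iteration copy
		out = append(out, &tc)
	}
	return out
}

func main() {
	cases := []testCase{{"first", 1}, {"second", 2}}
	for _, p := range collect(cases) {
		fmt.Println(p.name, p.want)
	}
}
```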