Commit Graph

1473 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
84b4972903
Merge pull request #122317 from pacoxu/revert-122058-scheduler-first-fit
Revert "Scheduler first fit"
2023-12-14 12:57:34 +01:00
Kubernetes Prow Robot
68ef2423e8
Merge pull request #122288 from sanposhiho/revert-qhint-nodeunschedulable
Revert "scheduler/NodeUnschedulable: reduce pod scheduling latency"
2023-12-14 10:49:19 +01:00
Paco Xu
1160521a4f
Revert "Scheduler first fit" 2023-12-14 17:27:25 +08:00
Kubernetes Prow Robot
7e4751964e
Merge pull request #122080 from SataQiu/clean-test-20231128
clean up unused parameters for volume zone unit test
2023-12-14 06:17:04 +01:00
Kubernetes Prow Robot
84424a8c19
Merge pull request #122068 from caohe/fix-multi-point
fix(scheduler): fix incorrect loop logic in MultiPoint to avoid a plugin being loaded multiple times
2023-12-14 05:10:37 +01:00
Kubernetes Prow Robot
517091cdc5
Merge pull request #122058 from aleksandra-malinowska/scheduler-first-fit
Scheduler first fit
2023-12-14 05:10:19 +01:00
Kubernetes Prow Robot
b155d51f97
Merge pull request #122041 from uniemimu/cleanup
remove unnecessary fmt.Sprintf call
2023-12-14 05:10:11 +01:00
Kubernetes Prow Robot
5322af7f9e
Merge pull request #122022 from sanposhiho/extender-fix
fix: requeue pods rejected by Extenders properly
2023-12-14 05:10:01 +01:00
Kubernetes Prow Robot
f708c47469
Merge pull request #122017 from sanposhiho/doc-shared-lister
fix(doc): elaborate the documentation of SnapshotSharedLister
2023-12-14 05:09:52 +01:00
Kubernetes Prow Robot
de2f38f8a8
Merge pull request #122014 from sanposhiho/owner
put storage related plugins under SIG-Storage reviewing
2023-12-14 05:09:43 +01:00
Kubernetes Prow Robot
f4240cbf92
Merge pull request #121953 from utam0k/not-found-pvc-lister
return not-found errors properly from fake listers
2023-12-14 03:53:45 +01:00
Kubernetes Prow Robot
fb011badd7
Merge pull request #121670 from kerthcet/bug/add-status-message
Fix empty status message in logging
2023-12-13 22:34:51 +01:00
Kubernetes Prow Robot
badc4102ac
Merge pull request #121572 from Prateek462003/myFeature
Added Logging for all the enabled plugins in each extension point
2023-12-13 22:34:06 +01:00
Kubernetes Prow Robot
e13f098c8e
Merge pull request #121387 from KunWuLuan/SidercarContainerChecking
move SidecarContainers featureGate checking
2023-12-13 21:26:09 +01:00
Kubernetes Prow Robot
74afd1a06f
Merge pull request #119539 from HirazawaUi/remove-not-register-event-code
remove unregistered event code
2023-12-13 21:25:33 +01:00
Kensei Nakada
7aeecc42a4 Revert "scheduler/NodeUnschedulable: reduce pod scheduling latency"
This reverts commit 28dbe8a34d.
2023-12-13 03:18:02 +00:00
Kensei Nakada
329b873e4e Revert "scheduler/nodeaffinity: reduce pod scheduling latency"
This reverts commit 1d88bf9789.
2023-12-13 02:57:45 +00:00
carlory
9e1adced5d noderesourcefit: scheduler queueing hints
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>
2023-12-13 10:02:52 +08:00
AxeZhan
210ed2ebbd add preScore for volumeBinding 2023-12-06 15:35:35 +08:00
caohe
1f5738df84 fix(scheduler): fix incorrect loop logic in MultiPoint to avoid a plugin being loaded multiple times
Signed-off-by: caohe <caohe9603@gmail.com>
2023-11-29 20:14:18 +08:00
Aleksandra Malinowska
199dc03bdd Don't evaluate extra nodes if there's no score plugin defined 2023-11-28 10:39:49 +01:00
SataQiu
c86189d1de clean up unused parameters for volume zone unit test 2023-11-28 11:37:41 +08:00
hub-Prateek
a601ebd6b6 Changed the log message 2023-11-26 11:41:42 +05:30
Ukri Niemimuukko
02b0dc98dd remove unnecessary fmt.Sprintf call
default_preemption.go has an unnecessary fmt.Sprintf call which
triggers common code checkers. This removes it.

Signed-off-by: Ukri Niemimuukko <ukri.niemimuukko@intel.com>
2023-11-25 21:36:04 +02:00
hub-Prateek
eb45a8f2f5 Added comments 2023-11-24 11:01:15 +05:30
hub-Prateek
76be319571 Optimized the code 2023-11-24 10:58:33 +05:30
hub-Prateek
5c99f3a24e Logged the return value of ListPlugins 2023-11-24 00:19:42 +05:30
Kensei Nakada
468e2dac81 fix: requeue pods rejected by Extenders properly 2023-11-23 13:20:02 +00:00
Kensei Nakada
03b8241fce put storage related plugins under SIG-Storage reviewing 2023-11-23 08:35:49 +00:00
Kensei Nakada
4d9df1134f fix(doc): elaborate the documentation of SnapshotSharedLister 2023-11-23 08:19:26 +00:00
Kensei Nakada
52ff7f8dc0 fix(framework): elaborate the document on PostFilter 2023-11-23 05:18:49 +00:00
hub-Prateek
9cb2d1cf6d Removed Comments 2023-11-22 22:32:19 +05:30
hub-Prateek
1dca49157a Utilized ListPlugins method 2023-11-22 02:13:55 +05:30
utam0k
aba817ac1d
return not-found errors properly from fake listers
Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-11-20 19:14:08 +09:00
Kensei Nakada
005e85c4d3
fix(framework): remove the mention about what happens with nil from EventsToRegister 2023-11-18 15:47:31 +09:00
lianghao208
34e620d18c Support score extension function in preemption. 2023-11-15 15:28:20 +08:00
Kubernetes Prow Robot
5ce0bd95cc
Merge pull request #121677 from kerthcet/cleanup/remove-evnet
Unregister events in schedulingGates for performance
2023-11-10 05:03:33 +01:00
kunwuluan
a00a610d15
move SidecarContainers featureGate checking
to PreFilter

Signed-off-by: KunWuLuan <kunwuluan@gmail.com>
2023-11-06 10:46:52 +08:00
kerthcet
f77a4543d1 Unregister events in schedulingGates plugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-06 10:01:13 +08:00
Patrick Ohly
2a23061f6c scheduler: fix performance regression at -v3 + contextual logging
The logging instrumentation for contextual logging that was added for 1.29
slowed down the scheduler (i.e. logging verbosity <= 3) by a significant
percentage (-28.66% for SchedulingBasic/5000Nodes at -v3) if (and only if!)
contextual logging was enabled.

Retrieving the logger from the context causes no measurable slowdown; it's only
the various WithName/WithValues calls which cause this.

By being more careful about when to use those, the performance impact can be
avoided:
- At -v3 or lower, only `WithValues("pod")` is used once per scheduling cycle.
  This has the intended effect that all log messages for the cycle include the
  pod information. Once contextual logging is GA, "pod" key/value pairs can
  be removed from all log calls.
- At -v4 or higher, richer log entries get produced where `WithValues` is also
  used for the node (when applicable) and `WithName` is used for the current
  operation and plugin.

With these changes, enabling contextual logging causes no measurable slowdown
at -v3 or lower. At -v4, the slowdown depends on the test case (-30.51%
throughput for SchedulingBasic/5000Nodes, no change for
SchedulingCSIPVs/5000Nodes). For some unknown reason (measuring bias?),
SchedulingCSIPVs/500Nodes has a ~3% *higher* throughput with contextual
logging.
2023-11-03 17:28:55 +01:00
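The verbosity-dependent enrichment described in the commit above can be illustrated with a minimal Go sketch. It is not the scheduler's code: the `logger` type, `cycleLogger`, and `pluginLogger` are hypothetical stand-ins for a contextual logger, and the pod, node, and plugin names are made up. It only shows the idea that `WithValues("pod")` is paid once per cycle, while the node and plugin enrichment is skipped below -v4.

```go
package main

import (
	"fmt"
	"strings"
)

// logger is a hypothetical, minimal stand-in for a contextual logger.
type logger struct {
	name string
	kvs  []string
}

// WithValues returns a copy carrying one more key/value pair.
// The copy is the per-call cost that the commit tries to avoid.
func (l logger) WithValues(key, value string) logger {
	kvs := make([]string, len(l.kvs), len(l.kvs)+1)
	copy(kvs, l.kvs)
	return logger{name: l.name, kvs: append(kvs, key+"="+value)}
}

// WithName returns a copy with an extra name segment.
func (l logger) WithName(name string) logger {
	if l.name != "" {
		name = l.name + "/" + name
	}
	return logger{name: name, kvs: l.kvs}
}

// String renders the logger's prefix for demonstration purposes.
func (l logger) String() string {
	return strings.TrimSpace(l.name + " " + strings.Join(l.kvs, " "))
}

// cycleLogger is called once per scheduling cycle at any verbosity.
func cycleLogger(base logger, pod string) logger {
	return base.WithValues("pod", pod)
}

// pluginLogger adds node and plugin context only at -v4 or higher;
// below that it returns the cycle logger unchanged.
func pluginLogger(cycle logger, verbosity int, node, plugin string) logger {
	if verbosity < 4 {
		return cycle
	}
	return cycle.WithName(plugin).WithValues("node", node)
}

func main() {
	cycle := cycleLogger(logger{}, "default/web")
	fmt.Println(pluginLogger(cycle, 3, "node-1", "NodeResourcesFit"))
	fmt.Println(pluginLogger(cycle, 4, "node-1", "NodeResourcesFit"))
}
```

At verbosity 3 the plugin-level call returns the cycle logger as-is, so the hot path performs no additional copies.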
kerthcet
5bf63036c7 Make EnablePodSchedulingReadiness public
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-03 11:44:56 +08:00
hub-Prateek
7b60e7e2a3 Added plugins enabled at each extension point 2023-11-01 23:03:13 +05:30
kerthcet
fade7463cd Add String() to framework status
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-01 17:01:36 +08:00
Kubernetes Prow Robot
d84ee0ba69
Merge pull request #121632 from kerthcet/fix/runscoreplugins
Fix panic when process RunScorePlugins for cap out of range
2023-10-31 13:14:32 +01:00
kerthcet
b02aad42fa Fix panic when process RunScorePlugins for cap out of range
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-10-31 16:02:16 +08:00
Kensei Nakada
c7842d9c63 narrow down the scope of EnqueueExtensions to subscribe less cluster events 2023-10-27 14:14:37 +00:00
Kubernetes Prow Robot
fd5c406112
Merge pull request #120933 from mengjiao-liu/contextual-logging-scheduler-remaining-part
kube-scheduler: convert the remaining part to use contextual logging
2023-10-27 10:30:58 +02:00
Kubernetes Prow Robot
53e91942bf
Merge pull request #117780 from sourcelliu/schedulinggates
Improve the performance of schedulinggates
2023-10-26 18:14:32 +02:00
Kubernetes Prow Robot
2749509f35
Merge pull request #121469 from sanposhiho/renamerename
cleanup: rename failedPlugin to plugin in framework.Status
2023-10-25 20:16:43 +02:00
Kensei Nakada
27bb66fd7b cleanup: rename failedPlugin to plugin in framework.Status 2023-10-25 12:03:56 +00:00
Kubernetes Prow Robot
9aa04752e7
Merge pull request #118463 from testwill/replace_loop
chore: slice replace loop
2023-10-24 15:04:39 +02:00
Mengjiao Liu
b0a73213d6 kube-scheduler: convert the remaining part to use contextual logging 2023-10-24 17:56:48 +08:00
Kubernetes Prow Robot
6d7d249372
Merge pull request #121077 from chrishenzie/readwriteoncepod-ga
Graduate ReadWriteOncePod to GA
2023-10-24 05:26:05 +02:00
Kubernetes Prow Robot
5a4e792e06
Merge pull request #120534 from pohly/dra-scheduler-ssa-as-fallback
dra scheduler: fall back to SSA for PodSchedulingContext updates
2023-10-23 21:06:58 +02:00
Kensei Nakada
fb6b10997a cleanup: remove useless test 2023-10-22 04:41:59 +00:00
Chris Henzie
2dbd405583 Graduate ReadWriteOncePod to GA 2023-10-20 10:40:39 -07:00
Kubernetes Prow Robot
8d4ccd67e3
Merge pull request #119517 from sanposhiho/block-status
feature(scheduler): simplify QueueingHintFn by introducing new statuses
2023-10-20 19:24:31 +02:00
Kensei Nakada
4f5bc7e8d7 fix based on reviews 2023-10-20 02:53:06 +00:00
Michal Wozniak
32fdb55192 Use Patch instead of SSA for Pod Disruption condition 2023-10-19 21:00:19 +02:00
Kensei Nakada
cb5dc46edf feature(scheduler): simplify QueueingHint by introducing new statuses 2023-10-19 11:02:11 +00:00
Kubernetes Prow Robot
ff4ba92b1c
Merge pull request #119155 from carlory/fix-118893-1
nodeaffinity: scheduler queueing hints
2023-10-19 09:31:27 +02:00
Kubernetes Prow Robot
78b34aa8fc
Merge pull request #116938 from olderTaoist/fix-image-locality
fix ImageLocality plugin score is inconsistent
2023-10-19 06:15:26 +02:00
olderTaoist
5d5958e338 fix ImageLocality plugin score is inconsistent 2023-10-17 09:38:03 +08:00
Kubernetes Prow Robot
46c307868f
Merge pull request #119176 from carlory/fix-118893-2
nodeports: scheduler queueing hints
2023-10-10 19:07:07 +02:00
Kubernetes Prow Robot
e224fc75ca
Merge pull request #116885 from mengjiao-liu/contextual-logging-scheduler-plugin-examples
Migrated `pkg/scheduler/framework/plugins/examples/` to use contextual logging
2023-10-09 20:32:46 +02:00
Mengjiao Liu
9cca527c4b Migrated pkg/scheduler/framework/plugins/examples/ to use contextual logging 2023-10-09 11:43:17 +08:00
carlory
7cba35f651 nodeports: scheduler queueing hints
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-10-08 11:34:43 +08:00
HirazawaUi
b20bc79a60 remove not register event code 2023-10-07 21:59:01 +08:00
bzsuni
6200eb04af use generic sets in scheduler
Signed-off-by: bzsuni <bingzhe.sun@daocloud.io>
2023-09-28 21:31:33 +08:00
Kubernetes Prow Robot
9c5698f514
Merge pull request #116803 from mengjiao-liu/contextual-logging-scheduler-plugin-volumebinding
Migrated `pkg/scheduler/framework/plugins/volumebinding` to contextual logging
2023-09-27 15:04:38 -07:00
carlory
1d88bf9789 scheduler/nodeaffinity: reduce pod scheduling latency
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>
2023-09-25 09:41:22 +08:00
Kubernetes Prow Robot
3ac83f528d
Merge pull request #119290 from carlory/add-logger
the scheduling queue logs the error and treats it as QueueAfterBackoff
2023-09-22 08:10:49 -07:00
Mengjiao Liu
3eb6c4d368 Migrated pkg/scheduler/framework/plugins/volumebinding to contextual logging 2023-09-21 11:28:12 +08:00
carlory
0105a002bc when the hint fn returns error, the scheduling queue logs the error and treats it as QueueAfterBackoff.
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

Co-authored-by: Kante Yin <kerthcet@gmail.com>

Co-authored-by: XsWack <xushiwei5@huawei.com>
2023-09-21 09:40:44 +08:00
Mengjiao Liu
a7466f44e0 Change the scheduler plugins PluginFactory function to use context parameter to pass logger
- Migrated pkg/scheduler/framework/plugins/nodevolumelimits to use contextual logging
- Fix golangci-lint validation failed
- Check for plugins creation err
2023-09-20 17:49:54 +08:00
charles-chenzz
c8b9d64d81 scheduler test: unify util to fake pod. 2023-09-18 20:05:01 +08:00
Patrick Ohly
7cac1dcf67 dra scheduler: fall back to SSA for PodSchedulingContext updates
During scheduler_perf testing, roughly 10% of the PodSchedulingContext update
operations failed with a conflict error. Using SSA would avoid that, but
performance measurements showed that this causes a considerable
slowdown (primarily because of the slower encoding with JSON instead of
protobuf, but also because server-side processing is more expensive).

Therefore a normal update is tried first and SSA only gets used when there has
been a conflict. Using SSA in that case instead of giving up outright is better
because it avoids another scheduling attempt.
2023-09-15 15:05:38 +02:00
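The "normal update first, SSA only after a conflict" strategy from the commit above can be sketched in Go as follows. This is an illustration under stated assumptions, not the scheduler's implementation: `store`, `update`, `apply`, `publish`, and `errConflict` are hypothetical stand-ins for the API client, a regular update, a server-side apply, and a conflict error.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for an API server conflict error (hypothetical).
var errConflict = errors.New("conflict")

// store is a hypothetical stand-in for the API server state of one object.
type store struct {
	resourceVersion int
	selectedNode    string
}

// update is the cheap path: it fails with errConflict when the caller's
// resourceVersion is stale.
func (s *store) update(rv int, node string) error {
	if rv != s.resourceVersion {
		return errConflict
	}
	s.selectedNode = node
	s.resourceVersion++
	return nil
}

// apply is the more expensive path: it sets the field regardless of the
// version the caller last observed.
func (s *store) apply(node string) {
	s.selectedNode = node
	s.resourceVersion++
}

// publish tries update first and uses apply only after a conflict,
// instead of giving up and forcing another scheduling attempt.
// It reports which path was taken.
func publish(s *store, rv int, node string) (string, error) {
	err := s.update(rv, node)
	if err == nil {
		return "update", nil
	}
	if errors.Is(err, errConflict) {
		s.apply(node)
		return "apply", nil
	}
	return "", err
}

func main() {
	s := &store{resourceVersion: 5}
	path, _ := publish(s, 5, "node-a")
	fmt.Println(path, s.selectedNode)
	path, _ = publish(s, 5, "node-b") // stale version triggers the fallback
	fmt.Println(path, s.selectedNode)
}
```

The common, conflict-free case stays on the cheaper path, and the expensive path is taken only for the minority of calls that would otherwise fail.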
Stephen Kitt
3cb0b520d6
Scheduler CSI tests: switch maxVols to int32
This ends up stored in an int32 Count, use the target type throughout
to avoid narrowing conversions.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-15 09:52:50 +02:00
wackxu
28dbe8a34d scheduler/NodeUnschedulable: reduce pod scheduling latency
Signed-off-by: wackxu <xushiwei5@huawei.com>
2023-09-14 10:23:43 +08:00
Stephen Kitt
9990307146
kube-scheduler: drop deprecated pointer package
This replaces deprecated k8s.io/utils/pointer functions with their ptr
equivalent.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-13 09:42:19 +02:00
Kubernetes Prow Robot
db49b13ccd
Merge pull request #120252 from kerthcet/cleanup/framework-import
Move framework testing libraries to the right place
2023-09-12 17:44:11 -07:00
kerthcet
6fbb8ec7e4 Move scheduler testing utils to /scheduler/testing
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-09-12 13:42:38 +08:00
Patrick Ohly
6f9140e421 DRA scheduler: stop allocating before deallocation
This fixes a test flake:

    [sig-node] DRA [Feature:DynamicResourceAllocation] multiple nodes reallocation [It] works
    /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:552

      [FAILED] number of deallocations
      Expected
          <int64>: 2
      to equal
          <int64>: 1
      In [It] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:651 @ 09/05/23 14:01:54.652

This can be reproduced locally with

    stress -p 10 go test ./test/e2e -args -ginkgo.focus=DynamicResourceAllocation.*reallocation.works  -ginkgo.no-color -v=4 -ginkgo.v

Log output showed that the sequence of events leading to this was:
- claim gets allocated because of selected node
- a different node has to be used, so PostFilter sets
  claim.status.deallocationRequested
- the driver deallocates
- before the scheduler can react and select a different node,
  the driver allocates *again* for the original node
- the scheduler asks for deallocation again
- the driver deallocates again (causing the test failure)
- eventually the pod runs

The fix is to disable allocations first by removing the selected node and then
starting to deallocate.
2023-09-11 10:56:17 +02:00
Kubernetes Prow Robot
a64a3e16ec
Merge pull request #120253 from pohly/dra-scheduler-podschedulingcontext-updates
dra scheduler: refactor PodSchedulingContext updates
2023-09-08 02:48:14 -07:00
Patrick Ohly
5c7dac2d77 dra scheduler: refactor PodSchedulingContext updates
Instead of modifying the PodSchedulingContext and then creating or updating it,
now the required changes (selected node, potential nodes) are tracked and the
actual input for an API call is created if (and only if) needed at the end.

This makes the code easier to read and change. In particular, replacing the
Update call with Patch or Apply is easy.
2023-09-08 08:06:06 +02:00
Kubernetes Prow Robot
2d5b6f16f5
Merge pull request #120213 from pohly/dra-scheduler-resourceclass-missing
dra: resourceclass missing
2023-09-06 23:47:09 -07:00
Patrick Ohly
c682d2b8c5 scheduler: add ResourceClass events
When filtering fails because a ResourceClass is missing, we can treat the pod
as "unschedulable" as long as we then also register a cluster event that wakes
up the pod. This is more efficient than periodically retrying.
2023-09-06 11:14:08 +02:00
Kubernetes Prow Robot
cd91351dff
Merge pull request #117720 from kerthcet/feat/remove-selector-spread
Remove deprecated selectorSpread
2023-08-29 00:25:22 -07:00
Kubernetes Prow Robot
3e910875a7
Merge pull request #120125 from kerthcet/cleanup/write-to-cycle
Make sure skipped score plugins always returned
2023-08-28 15:13:20 -07:00
Kubernetes Prow Robot
029d518970
Merge pull request #117588 from kerthcet/cleanup/use-genericset
Avoid duplicated dots in pod status when preempting
2023-08-28 08:39:44 -07:00
kerthcet
580f83ab4a Avoid duplicated dots in pod condition
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 22:36:36 +08:00
kerthcet
855b445d28 Remove deprecated selectorSpread
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 22:11:33 +08:00
Kubernetes Prow Robot
faf1b5d655
Merge pull request #114685 from AxeZhan/dynamicresources
dynamic resource allocation: optimize class.SuitableNodes usage
2023-08-28 04:43:43 -07:00
kerthcet
3d583398fe Avoid building the error msg twice
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 17:13:39 +08:00
Kubernetes Prow Robot
b910deb3a1
Merge pull request #120000 from kerthcet/cleanup/no-duplication
Remove duplicate codes in framework RemovePod
2023-08-24 04:22:20 -07:00
kerthcet
ab01848134 Make sure skip score plugins always returned
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-24 13:39:47 +08:00
kerthcet
9ee94b0204 Remove duplicate codes in framework RemovePod
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-23 18:23:41 +08:00
Patrick Ohly
2472291790 api: introduce separate VolumeResourceRequirements struct
PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To avoid such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.

The `Claims` field gets removed because risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI. Such files
have never made sense and should be fixed.

Code that uses the struct definitions needs to be updated.
2023-08-21 15:31:28 +02:00
Kubernetes Prow Robot
ea3318cb71
Merge pull request #119971 from kwakubiney/chore/include-pod-uid-in-event-log
chore: attach pod UID to event log
2023-08-21 04:13:22 -07:00
Kubernetes Prow Robot
312dc127a9
Merge pull request #118923 from AxeZhan/volume_zone_csi
[Scheduler]Translate beta label to ga in volume_zone
2023-08-17 20:20:28 -07:00