The logging instrumentation for contextual logging that was added for 1.29
slowed down the scheduler by a significant percentage even at the default
logging verbosity (<= 3): -28.66% for SchedulingBasic/5000Nodes at -v3. This
happened if (and only if!) contextual logging was enabled.
Retrieving the logger from the context causes no measurable slowdown; the
slowdown comes from the various WithName/WithValues calls.
By being more careful about when to use those, the performance impact can be
avoided:
- At -v3 or lower, only `WithValues("pod")` is used once per scheduling cycle.
This has the intended effect that all log messages for the cycle include the
pod information. Once contextual logging is GA, "pod" key/value pairs can
be removed from all log calls.
- At -v4 or higher, richer log entries are produced: `WithValues` is also used
for the node (when applicable) and `WithName` for the current operation and
plugin.
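A rough sketch of that pattern (not the actual scheduler code), using the
k8s.io/klog/v2 contextual logging helpers; the function name, the "Filter"
operation name and the pod/node parameters are purely illustrative:

    package scheduling

    import (
        "context"

        v1 "k8s.io/api/core/v1"
        "k8s.io/klog/v2"
    )

    // enrichContext attaches the pod once per scheduling cycle and only pays
    // for the additional WithName/WithValues calls when -v4 is enabled.
    func enrichContext(ctx context.Context, pod *v1.Pod, node *v1.Node) context.Context {
        logger := klog.FromContext(ctx)

        // Cheap enough at any verbosity: one WithValues call per cycle so
        // that all log entries for this cycle include the pod.
        logger = klog.LoggerWithValues(logger, "pod", klog.KObj(pod))

        // Richer entries only at -v4 or higher. The check happens once, the
        // enriched logger keeps its original level.
        if logger.V(4).Enabled() {
            logger = klog.LoggerWithName(logger, "Filter")
            logger = klog.LoggerWithValues(logger, "node", klog.KObj(node))
        }

        return klog.NewContext(ctx, logger)
    }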
With these changes, enabling contextual logging causes no measurable slowdown
at -v3 or lower. At -v4, the slowdown depends on the test case (-30.51%
throughput for SchedulingBasic/5000Nodes, no change for
SchedulingCSIPVs/5000Nodes). For some unknown reason (measurement bias?),
SchedulingCSIPVs/500Nodes has a ~3% *higher* throughput with contextual
logging.
During scheduler_perf testing, roughly 10% of the PodSchedulingContext update
operations failed with a conflict error. Using SSA would avoid that, but
performance measurements showed that this causes a considerable
slowdown (primarily because of the slower encoding with JSON instead of
protobuf, but also because server-side processing is more expensive).
Therefore a normal update is tried first, and SSA is only used when that
update ran into a conflict. Using SSA in that case instead of giving up
outright is better because it avoids another scheduling attempt.
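A rough sketch of that fallback (not the actual plugin code), assuming a typed
clientset, the resource.k8s.io/v1alpha2 API and its generated apply
configurations; the helper function and the field manager name are
illustrative:

    package scheduling

    import (
        "context"

        resourcev1alpha2 "k8s.io/api/resource/v1alpha2"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        resourceapply "k8s.io/client-go/applyconfigurations/resource/v1alpha2"
        "k8s.io/client-go/kubernetes"
    )

    // setSelectedNode tries a normal update first (fast protobuf encoding)
    // and falls back to server-side apply only when the update hit a conflict.
    func setSelectedNode(ctx context.Context, client kubernetes.Interface, schedulingCtx *resourcev1alpha2.PodSchedulingContext, selectedNode string) error {
        schedulingCtx = schedulingCtx.DeepCopy()
        schedulingCtx.Spec.SelectedNode = selectedNode
        _, err := client.ResourceV1alpha2().PodSchedulingContexts(schedulingCtx.Namespace).Update(ctx, schedulingCtx, metav1.UpdateOptions{})
        if !apierrors.IsConflict(err) {
            return err
        }

        // SSA is slower (JSON encoding, more work on the server), but using
        // it here avoids giving up and needing another scheduling attempt.
        apply := resourceapply.PodSchedulingContext(schedulingCtx.Name, schedulingCtx.Namespace).
            WithSpec(resourceapply.PodSchedulingContextSpec().WithSelectedNode(selectedNode))
        _, err = client.ResourceV1alpha2().PodSchedulingContexts(schedulingCtx.Namespace).Apply(ctx, apply, metav1.ApplyOptions{FieldManager: "kube-scheduler", Force: true})
        return err
    }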
This fixes a test flake:
[sig-node] DRA [Feature:DynamicResourceAllocation] multiple nodes reallocation [It] works
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:552
[FAILED] number of deallocations
Expected
<int64>: 2
to equal
<int64>: 1
In [It] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:651 @ 09/05/23 14:01:54.652
This can be reproduced locally with
stress -p 10 go test ./test/e2e -args -ginkgo.focus=DynamicResourceAllocation.*reallocation.works -ginkgo.no-color -v=4 -ginkgo.v
Log output showed that the sequence of events leading to this was:
- claim gets allocated because of the selected node
- a different node has to be used, so PostFilter sets
claim.status.deallocationRequested
- the driver deallocates
- before the scheduler can react and select a different node,
the driver allocates *again* for the original node
- the scheduler asks for deallocation again
- the driver deallocates again (causing the test failure)
- eventually the pod runs
The fix is to first disable allocation by removing the selected node and only
then to request deallocation.
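A rough sketch of that ordering with the resource.k8s.io/v1alpha2 types; the
function name and the error handling are simplified and not the actual plugin
code:

    package scheduling

    import (
        "context"

        resourcev1alpha2 "k8s.io/api/resource/v1alpha2"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // requestDeallocation first clears the selected node so that the driver
    // cannot allocate for it again, and only then asks for deallocation.
    func requestDeallocation(ctx context.Context, client kubernetes.Interface, schedulingCtx *resourcev1alpha2.PodSchedulingContext, claim *resourcev1alpha2.ResourceClaim) error {
        // Step 1: disable further allocation attempts for the old node.
        schedulingCtx = schedulingCtx.DeepCopy()
        schedulingCtx.Spec.SelectedNode = ""
        if _, err := client.ResourceV1alpha2().PodSchedulingContexts(schedulingCtx.Namespace).Update(ctx, schedulingCtx, metav1.UpdateOptions{}); err != nil {
            return err
        }

        // Step 2: now it is safe to ask the driver to deallocate.
        claim = claim.DeepCopy()
        claim.Status.DeallocationRequested = true
        _, err := client.ResourceV1alpha2().ResourceClaims(claim.Namespace).UpdateStatus(ctx, claim, metav1.UpdateOptions{})
        return err
    }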
Instead of modifying the PodSchedulingContext and then creating or updating it,
the required changes (selected node, potential nodes) are now tracked and the
actual object for an API call is only constructed at the end, if (and only if)
a call is needed.
This makes the code easier to read and change. In particular, replacing the
Update call with Patch or Apply is easy.
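For illustration, a sketch of that approach; the struct, its fields and the
method are made up for this example and need not match the actual code:

    package scheduling

    import (
        "context"

        resourcev1alpha2 "k8s.io/api/resource/v1alpha2"
    )

    // podSchedulingState tracks the changes that the plugin wants to make.
    // The pointer fields stay nil as long as no change is needed.
    type podSchedulingState struct {
        // schedulingCtx is the object from the informer cache, nil if it
        // does not exist yet.
        schedulingCtx *resourcev1alpha2.PodSchedulingContext

        selectedNode   *string
        potentialNodes *[]string
    }

    // publish creates or updates the PodSchedulingContext, but only if one
    // of the tracked fields actually has to change.
    func (s *podSchedulingState) publish(ctx context.Context) error {
        if s.selectedNode == nil && s.potentialNodes == nil {
            return nil // nothing to change, no API call
        }
        // Construct the API object here and issue a single Create, Update,
        // Patch or Apply call. Omitted in this sketch.
        return nil
    }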
When filtering fails because a ResourceClass is missing, we can treat the pod
as "unschedulable" as long as we also register a cluster event that wakes the
pod up again when a ResourceClass gets added. This is more efficient than
periodically retrying.
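A sketch of how that can look in the plugin, assuming the scheduler
framework's EnqueueExtensions interface and a GVK constant for ResourceClass
(names roughly as in the 1.29 framework API, otherwise illustrative):

    package scheduling

    import (
        "k8s.io/kubernetes/pkg/scheduler/framework"
    )

    type dynamicResources struct{}

    // EventsToRegister tells the scheduler which cluster events may make a
    // pod that this plugin marked as unschedulable schedulable again.
    func (pl *dynamicResources) EventsToRegister() []framework.ClusterEventWithHint {
        return []framework.ClusterEventWithHint{
            // A ResourceClass that was missing during Filter may get created
            // (or updated) later; that should wake up the pod.
            {Event: framework.ClusterEvent{Resource: framework.ResourceClass, ActionType: framework.Add | framework.Update}},
        }
    }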