3414 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
084d6c4968 Merge pull request #125699 from pohly/scheduler-framework-logging
scheduler: fix klog.KObjSlice when applied to []*NodeInfo
2024-06-26 01:50:23 -07:00
Patrick Ohly
719a49cc13 scheduler: fix klog.KObjSlice when applied to []*NodeInfo
The DRA plugin does that. It didn't actually work and only printed an error
message about NodeInfo not implementing klog.KMetadata. That's not a compile-time
check due to limitations with Go generics and had been missed earlier.
2024-06-26 08:11:31 +02:00
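To illustrate why a check like this can only fail at run time, here is a minimal standard-library sketch. It does not use klog; `metadata`, `pod`, `nodeInfo` and `describeSlice` are hypothetical stand-ins introduced for this example. The helper takes `any`, so the compiler accepts a slice whose elements lack the required method.

```go
package main

import (
	"fmt"
	"reflect"
)

// metadata is a hypothetical stand-in for the interface a logging helper expects.
type metadata interface {
	GetName() string
}

// pod implements metadata.
type pod struct{ name string }

func (p *pod) GetName() string { return p.name }

// nodeInfo deliberately does not implement metadata.
type nodeInfo struct{ name string }

// describeSlice accepts any value, so the element type is only checked
// at run time, one element at a time, via a type assertion.
func describeSlice(arg any) (string, error) {
	v := reflect.ValueOf(arg)
	if v.Kind() != reflect.Slice {
		return "", fmt.Errorf("expected a slice, got %T", arg)
	}
	names := make([]string, 0, v.Len())
	for i := 0; i < v.Len(); i++ {
		elem := v.Index(i).Interface()
		m, ok := elem.(metadata)
		if !ok {
			return "", fmt.Errorf("element %d of type %T does not implement metadata", i, elem)
		}
		names = append(names, m.GetName())
	}
	return fmt.Sprint(names), nil
}

func main() {
	out, err := describeSlice([]*pod{{name: "a"}, {name: "b"}})
	fmt.Println(out, err)

	// Compiles without complaint, but reports an error when executed.
	out, err = describeSlice([]*nodeInfo{{name: "n1"}})
	fmt.Println(out, err)
}
```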
Kubernetes Prow Robot
d0579b6f9c Merge pull request #125683 from likakuli/fix-benchmarkupdatesnapshot
clean: add nodeinfo to cache
2024-06-25 14:18:39 -07:00
Kubernetes Prow Robot
59673f0f37 Merge pull request #125578 from nayihz/fix_sche_queue_update
skip update pod that exist in scheduling cycle
2024-06-25 14:18:19 -07:00
Kubernetes Prow Robot
8c478a06d8 Merge pull request #124595 from pohly/dra-scheduler-assume-cache-eventhandlers
DRA: scheduler event handlers via assume cache
2024-06-25 11:56:28 -07:00
likakuli
ea6ca270b5 clean: add nodeinfo to cache
Signed-off-by: likakuli <1154584512@qq.com>
2024-06-25 21:29:05 +08:00
Patrick Ohly
1b63639d31 DRA scheduler: use assume cache to list claims
This finishes the transition to the assume cache as source of truth for the
current set of claims.

The tests have to be adapted. It's not enough anymore to directly put objects
into the informer store because that doesn't change the assume cache
content. Instead, normal Create/Update calls and waiting for the cache update
are needed.
2024-06-25 14:00:25 +02:00
Patrick Ohly
9a6f3b9388 scheduler: central ResourceClaim assume cache
This enables connecting the event handler for ResourceClaim to the assume
cache, which addresses a theoretical race condition.

It may also be useful for implementing the autoscaler support, because now
the autoscaler can modify the content of the cache.
2024-06-25 14:00:25 +02:00
Patrick Ohly
dea16757ef scheduler: AddEventHandler for assume cache
This enables using the assume cache for cluster events.
2024-06-25 14:00:25 +02:00
Patrick Ohly
639f86915b scheduler: add FIFO queue
This is a basic implementation of a first-in-first-out queue with unbounded
size. It's useful for cases where a channel with fixed size might deadlock.

The caller is responsible for locking.
2024-06-25 13:56:15 +02:00
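A minimal sketch of such a queue, assuming a slice-backed generic type. The name `FIFO` and its methods are hypothetical and not taken from the commit. As the message says, locking is left to the caller.

```go
package main

import "fmt"

// FIFO is a hypothetical unbounded first-in-first-out queue.
// It is not safe for concurrent use: the caller must hold a lock.
type FIFO[T any] struct {
	items []T
}

// Push appends an item at the tail. It never blocks, unlike a send
// on a channel with a fixed buffer size.
func (q *FIFO[T]) Push(item T) {
	q.items = append(q.items, item)
}

// Pop removes and returns the oldest item.
// The second result is false when the queue is empty.
func (q *FIFO[T]) Pop() (T, bool) {
	var zero T
	if len(q.items) == 0 {
		return zero, false
	}
	item := q.items[0]
	q.items[0] = zero // drop the reference so it can be garbage collected
	q.items = q.items[1:]
	return item, true
}

// Len returns the number of queued items.
func (q *FIFO[T]) Len() int { return len(q.items) }

func main() {
	q := &FIFO[string]{}
	q.Push("first")
	q.Push("second")
	for q.Len() > 0 {
		item, _ := q.Pop()
		fmt.Println(item)
	}
}
```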
nayihz
26dcab1146 skip update pod that exist in scheduling cycle 2024-06-24 17:11:09 +08:00
Kubernetes Prow Robot
8c508c5480 Merge pull request #125527 from sanposhiho/gated-pods-filter-out-bug
fix: skip isPodWorthRequeuing only when SchedulingGates gates the pod
2024-06-21 12:22:55 -07:00
Kensei Nakada
98a3182398 correct comment 2024-06-20 23:48:42 +00:00
Kensei Nakada
2304806cbe elaborate comment more 2024-06-20 23:43:41 +00:00
Kensei Nakada
fa8da84835 remove fixme comment 2024-06-20 23:36:25 +00:00
Kensei Nakada
2c4dc6b65b elaborate comments 2024-06-20 23:36:05 +00:00
Kubernetes Prow Robot
a008776ec9 Merge pull request #125279 from HirazawaUi/add-poddeleted-queueinghintfn
Add QueueingHintFn for pod events in VolumeRestriction plugin
2024-06-19 12:22:41 -07:00
Kubernetes Prow Robot
64355780d9 Merge pull request #125495 from pohly/dra-scheduler-fix-parameter-indexing
DRA: fix indexing of generated parameters
2024-06-18 04:10:38 -07:00
Kubernetes Prow Robot
ab8ad49b47 Merge pull request #125533 from kaisoz/sched-test-disruption-target-cond
scheduler: Test that the DisruptionTarget condition is added at preemption time
2024-06-18 01:14:28 -07:00
Tomas Tormo
8d7c113434 Test that the DisruptionTarget condition is added at preemption 2024-06-17 16:59:52 +00:00
HirazawaUi
f9693e0c0a Implement QueueingHintFn for pod deleted event 2024-06-17 22:42:04 +08:00
Kensei Nakada
dd3af9a85b fix: skip isPodWorthRequeuing only when SchedulingGates gates the pod 2024-06-17 01:14:34 +00:00
Patrick Ohly
e0fce54d02 DRA: fix indexing of generated parameters
The claim parameter key didn't include the namespace of the claim. In the case
where two namespaces used the exact same parameter reference, the "too many
generated parameters" case got triggered incorrectly and lookup could have
returned an object from the wrong namespace.

Found while running the E2E tests in parallel:

              message: 'running PreFilter plugin "DynamicResources": multiple generated claim
                parameters for ConfigMap. dra-8794/parameters-3 found: [dra-4729/parameters-4
                dra-7328/parameters-4 dra-8794/parameters-4 dra-3402/parameters-4 dra-6156/parameters-4
                dra-1839/parameters-4 dra-7434/parameters-4 dra-6504/parameters-4]'
2024-06-13 17:27:04 +02:00
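The collision can be sketched as follows. This assumes a simple string key; the `claim` type and both key functions are hypothetical, and the real index key format is not shown in the commit message.

```go
package main

import "fmt"

// claim is a hypothetical, reduced view of a claim and its parameter reference.
type claim struct {
	Namespace string
	ParamKind string
	ParamName string
}

// keyWithoutNamespace mirrors the kind of key that collides when two
// namespaces use the same parameter reference.
func keyWithoutNamespace(c claim) string {
	return c.ParamKind + "/" + c.ParamName
}

// keyWithNamespace includes the claim's namespace, so identical
// references in different namespaces map to different keys.
func keyWithNamespace(c claim) string {
	return c.ParamKind + "/" + c.Namespace + "/" + c.ParamName
}

func main() {
	a := claim{Namespace: "dra-4729", ParamKind: "ConfigMap", ParamName: "parameters-4"}
	b := claim{Namespace: "dra-7328", ParamKind: "ConfigMap", ParamName: "parameters-4"}

	fmt.Println(keyWithoutNamespace(a) == keyWithoutNamespace(b))
	fmt.Println(keyWithNamespace(a) == keyWithNamespace(b))
}
```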
Kubernetes Prow Robot
9c8c61aee4 Merge pull request #122234 from AxeZhan/podUpdateEvent
[Scheduler]Put pod into the correct queue during podUpdate
2024-06-12 12:28:17 -07:00
AxeZhan
d66f8f9413 schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
Patrick Ohly
c339eafb76 scheduler: allow PreBind to return "Pending" and "Unschedulable"
Any error result from PreBind was treated as a pod scheduling failure. This was
overlooked when moving blocking API calls in the DRA plugin into a PreBind
implementation, leading to:

    E0604 15:45:50.980929  306340 schedule_one.go:1048] "Error scheduling pod; retrying" err="waiting for resource driver" pod="test/test-draqld28"

That's because DRA's PreBind does some updates in the apiserver, then returns
Pending to wait for the outcome.

The fix is to allow PreBind to return the same special status codes as other
extension points.
2024-06-06 15:28:08 +02:00
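A rough sketch of the distinction, using simplified stand-in types rather than the scheduler framework's own. `code`, `action` and `afterPreBind` are hypothetical names, and what the scheduler actually does with each status is an assumption here.

```go
package main

import "fmt"

// code is a hypothetical, reduced set of plugin status codes.
type code int

const (
	success code = iota
	failure
	unschedulable
	pending
)

// action is what the caller does after PreBind returns.
type action string

const (
	proceedToBind action = "bind"
	waitAndRetry  action = "wait"
	reportError   action = "error"
)

// afterPreBind treats Pending and Unschedulable as non-error outcomes,
// as other extension points already do, instead of lumping every
// non-success status into the error path.
func afterPreBind(c code) action {
	switch c {
	case success:
		return proceedToBind
	case unschedulable, pending:
		return waitAndRetry
	default:
		return reportError
	}
}

func main() {
	for _, c := range []code{success, pending, unschedulable, failure} {
		fmt.Println(c, afterPreBind(c))
	}
}
```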
AxeZhan
cf73c9d93c remove EvaluatedNodes field in Diagnosis struct 2024-06-04 14:20:55 +08:00
Kubernetes Prow Robot
cfe5a7d03a Merge pull request #125213 from carlory/fix-dra-flaky
fix dra flaky test on TestPlugin
2024-06-03 13:32:10 -07:00
Kubernetes Prow Robot
8bd36c60bd Merge pull request #125197 from gabesaba/prefilter_perf
[scheduler] absent key in NodeToStatusMap implies UnschedulableAndUnresolvable
2024-06-03 07:35:41 -07:00
Gabe
c8f0ea1a54 Don't fill in NodeToStatusMap with UnschedulableAndUnresolvable 2024-05-31 15:52:16 +00:00
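The "absent key implies a default" convention can be sketched as a lookup with a fallback. The types below are hypothetical simplifications, not the scheduler's actual NodeToStatusMap.

```go
package main

import "fmt"

// code is a hypothetical, reduced set of status codes.
type code int

const (
	success code = iota
	unschedulable
	unschedulableAndUnresolvable
)

// nodeToStatus stores a status only for nodes that need an explicit entry.
type nodeToStatus map[string]code

// get returns the stored status, or unschedulableAndUnresolvable when
// the node has no entry. The map then does not have to be filled in
// for every such node.
func (m nodeToStatus) get(node string) code {
	if c, ok := m[node]; ok {
		return c
	}
	return unschedulableAndUnresolvable
}

func main() {
	statuses := nodeToStatus{"node-a": unschedulable}
	fmt.Println(statuses.get("node-a") == unschedulable)
	fmt.Println(statuses.get("node-b") == unschedulableAndUnresolvable)
}
```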
carlory
2794baf4c0 fix dra flaky test on TestPlugin 2024-05-30 23:22:37 +08:00
Kubernetes Prow Robot
ee2c1ffa80 Merge pull request #124630 from carlory/fix-123731
DRA: scheduler: index claim and class parameters to simplify lookup
2024-05-29 14:38:14 -07:00
Gabe
7ea3bf4db4 Revert "scheduler: preallocation for NodeToStatusMap"
This reverts commit 9fcd791c01.
2024-05-29 14:09:58 +00:00
carlory
3072987fcc DRA: scheduler: index claim and class parameters to simplify lookup 2024-05-27 15:57:10 +08:00
Kubernetes Prow Robot
0f584a9b86 Merge pull request #124933 from AxeZhan/fix_panic
[Scheduler] Use allNodes when calculating nextStartNodeIndex
2024-05-21 10:29:35 -07:00
AxeZhan
d6d1e6ad8a base on allNodes when calculating nextStartNodeIndex 2024-05-18 00:30:38 +08:00
NoicFank
31a4b13238 enhancement(scheduler): share waitingPods among profiles 2024-05-17 17:07:27 +08:00
Toru Komatsu
5722db7aa3 QueueingHint for CSILimit when deleting pods (#121508)
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-05-14 11:07:11 -07:00
Kensei Nakada
9cd62186e8 cleanup: eliminate unnecessary NodeToStatusMap creation 2024-05-11 12:14:22 +00:00
Kubernetes Prow Robot
9d87fa215d Merge pull request #124735 from AxeZhan/evaluatedNodes
Change EvaluatedNodes to count Nodes that reach Filter phase only
2024-05-09 22:43:22 -07:00
AxeZhan
bcf1c55837 evaluated nodes only consider filter stage 2024-05-10 12:40:12 +08:00
Kubernetes Prow Robot
df074ed002 Merge pull request #124546 from carlory/remove-rbd
CephRBD volume plugin and its csi migration support are removed
2024-05-09 20:50:12 -07:00
Kubernetes Prow Robot
db82fd1604 Merge pull request #124618 from gabesaba/gated_performance
Filter gated pods before calling isPodWorthRequeuing
2024-05-09 11:33:23 -07:00
carlory
c8e91b9bc2 CephRBD volume plugin and its csi migration support were removed in this release 2024-05-09 22:55:34 +08:00
Kubernetes Prow Robot
e798b9c269 Merge pull request #124714 from sanposhiho/prealloc
scheduler: preallocation for NodeToStatusMap
2024-05-07 07:07:58 -07:00
Kensei Nakada
9fcd791c01 scheduler: preallocation for NodeToStatusMap 2024-05-07 00:01:24 +00:00
Kubernetes Prow Robot
8240d882ab Merge pull request #124500 from carlory/scheduler-deprecate-non-csi-plugins
scheduler deprecates non-csi volumelimit plugins
2024-05-06 08:03:04 -07:00
Kubernetes Prow Robot
ade0d2140a Merge pull request #124578 from sanposhiho/scheduler_perf_scheduler_plugin_execution_duration_seconds
support `scheduler_plugin_execution_duration_seconds` in scheduler_perf
2024-05-05 06:40:44 -07:00
Kubernetes Prow Robot
f97ac220fd Merge pull request #124666 from chengjoey/ut-for-123465
add integration test for pod with pvc has node-affinity to non-existent/illegal nodes
2024-05-03 05:50:00 -07:00
joey
a56cc6b100 add integration test for pod with pvc has node-affinity to non-existent/existent nodes
Signed-off-by: joey <zchengjoey@gmail.com>
2024-05-03 19:45:31 +08:00