Commit Graph

1486 Commits

Author SHA1 Message Date
Kevin Klues
21a0dd1d70 dra scheduler: create default claim/class parameters instead of nil
Without this, the scheduler was crashing in newClaimController() in
pkg/scheduler/framework/plugins/dynamicresources/structuredparameters.go

The code in newClaimController() assumes that the parameters are not nil.
Furthermore it assumes that there is at least one DriverRequest populated in
order to allocate any resources to a claim.

This PR adds logic to define default claim/class parameters that will allow
allocation to proceed even if an end user doesn't provide any class or claim
parameters themselves.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-03-11 13:57:16 +00:00
Patrick Ohly
251b3859b0 dra scheduler: consider in-flight allocation for resource calculation
Storing a modified claim with allocation and the original resource version in
the assume cache was not reliable: if an update was received, it replaced the
modified claim and the resource that was reserved for the claim might have been
used for some other claim.

To fix this, the in-flight claims are now stored in the map instead of just a
boolean and the status stored there overrides whatever is in the assume cache.

Logging got extended to diagnose this problem better. It started to occur in
E2E tests after splitting the claim update so that first the finalizer is set
and then the status, because setting the finalizer triggered an update.
2024-03-07 22:26:16 +01:00
Patrick Ohly
0b6a0d686a dra api: rename NodeResourceSlice -> ResourceSlice
While currently those objects only get published by the kubelet for node-local
resources, this could change once we also support network-attached
resources. Dropping the "Node" prefix enables such a future extension.

The NodeName in ResourceSlice and StructuredResourceHandle then becomes
optional. The kubelet still needs to provide one and it must match its own node
name, otherwise it doesn't have permission to access ResourceSlice objects.
2024-03-07 22:22:55 +01:00
Patrick Ohly
d4d5ade7f5 dra: add "named resources" structured parameter model
Like the current device plugin interface, a DRA driver using this model
announces a list of resource instances. In contrast to device plugins, this
list is made available to the scheduler together with attributes that can be
used to select suitable instances when they are not all alike.

Because this is the first structured parameter model, some checks that
previously were not possible, in particular "is one structured parameter field
set", now gets enabled. Adding another structured parameter model will be
similar.

The applyconfigs code generator assumes that all types in an API are defined in
a single package. If it wasn't for that, it would be possible to place the
"named resources" types in separate packages, which makes their names in the Go
code more natural and provides an indication of their stability level because
the package name could include a version.
2024-03-07 22:21:16 +01:00
Patrick Ohly
096e948905 dra scheduler: support structured parameters
When a claim uses structured parameters, as indicated by the resource class
flag, the scheduler is responsible for allocating it. To do this it needs to
gather information about available node resources by watching
NodeResourceSlices and then match the in-tree claim parameters against those
resources.
2024-03-07 22:21:04 +01:00
Patrick Ohly
eb1470d60d scheduler: fix assume cache with no index
The assume cache in the volumbinding plugin can be created with no separate
index, but List then failed because it tried to use the empty index name
instead of using the store's List function.
2024-03-07 16:09:44 +01:00
Kubernetes Prow Robot
bc00c9eef0
Merge pull request #123366 from kerthcet/feat/support-initcontainer
Consider initContainer images in pod scheduling
2024-03-05 08:24:30 -08:00
Kubernetes Prow Robot
6929a11f69
Merge pull request #123481 from sanposhiho/mindomain-stable
graduate MinDomainsInPodTopologySpread to stable
2024-03-04 17:18:53 -08:00
Kubernetes Prow Robot
e4a14fe0f5
Merge pull request #123575 from Huang-Wei/pod-scheduling-readiness-stable
Graduate PodSchedulingReadiness to stable
2024-03-03 22:29:38 -08:00
Tim Hockin
467d5d745c
Get rid of unused API type NodeResources 2024-03-01 15:13:50 -08:00
Wei Huang
01db4ae9e7
Graduate PodSchedulingReadiness to stable 2024-02-28 23:18:44 -08:00
Kensei Nakada
58a826a59a graduate MinDomainsInPodTopologySpread to stable 2024-02-28 10:42:29 +00:00
Aleksandra Malinowska
dd1e617ba0
Scheduler first fit (#123384)
* Don't evaluate extra nodes if there's no score plugin defined

* Fix existing unit test (add no op scoring plugin)

* Add unit tests for no score plugin scenario

* address review comments

* add a test with non-filter, non-scoring extender
2024-02-26 11:07:19 -08:00
kerthcet
65faa9c680 Consider initContainer images in pod scheduling
Co-authored-by:     xiaomudk <xiaomudk@gmail.com>
Co-authored-by:     kerthcet <kerthcet@gmail.com>
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-02-19 14:17:57 +08:00
kerthcet
b3ba6bda2b Add missed clusterEvents to UnrollWildCardResource
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-02-19 11:55:50 +08:00
AxeZhan
630ff96f9d Revert "Scheduler first fit" 2024-02-14 20:43:59 +08:00
Kubernetes Prow Robot
ad19beaa83
Merge pull request #123117 from kerthcet/fix/wild-resource
Fix registered wildcard clusterEvents doesn't work in scheduler requeueing
2024-02-09 10:34:15 -08:00
Kubernetes Prow Robot
e566bd7769
Merge pull request #121952 from sanposhiho/optimize-csi
add(nodevolumelimits): return UnschedulableAndUnresolvable when PVC is not found
2024-02-06 07:16:28 -08:00
kerthcet
f97dec2840 Add comments about wildcard clusterEvent
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-02-05 11:46:59 +08:00
kerthcet
d81023db30 When matching clusterEvent, we should consider the "*" additionally
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-02-04 14:59:26 +08:00
Toru Komatsu
3a4c35cc89
Comment on QHint for CSILimit when CSINodes are added (#122758)
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-02-02 22:16:20 -08:00
Kubernetes Prow Robot
278ea691e0
Merge pull request #122946 from NoicFank/enhance-sheduler-waiting-pods
enhancement(scheduler): share waitingPods among profiles
2024-02-02 02:11:32 -08:00
NoicFank
227c1915db enhancement(scheduler): share waitingPods among profiles 2024-02-01 10:06:23 +08:00
Kubernetes Prow Robot
c606448922
Merge pull request #122996 from Huang-Wei/cleanup-dra-postfilter
DRA: always returns Unschedulable in PostFilter
2024-01-27 08:19:44 -08:00
Kubernetes Prow Robot
02aaad0de9
Merge pull request #121876 from pohly/dra-reserve-during-pod-binding
dra: reserve + publish during pod binding
2024-01-26 19:58:01 +01:00
Wei Huang
ceabc4aba8
DRA: always returns Unschedulable in PostFilter 2024-01-26 09:44:00 -08:00
Patrick Ohly
6cf4203751 dra scheduler: reformat code
By continuing with the next item in the if clause, the else is no longer needed
and indention can be reduced.
2024-01-26 10:58:03 +01:00
Patrick Ohly
a809a6353b scheduler: publish PodSchedulingContext during PreBind
Blocking API calls during a scheduling cycle like the DRA plugin is doing slow
down overall scheduling, i.e. also affecting pods which don't use DRA.

It is easy to move the blocking calls into a goroutine while the scheduling
cycle ends with "pod unschedulable". The hard part is handling an error when
those API calls then fail in the background. There is a solution for that
(see https://github.com/kubernetes/kubernetes/pull/120963), but it's complex.

Instead, publishing the modified PodSchedulingContext can also be done
later. In the more common case of a pod which is ready for binding except for
its claims, that'll be in PreBind, which runs in a separate goroutine already.

In the less common case that a pod cannot be scheduled, that'll be in
Unreserve which is still blocking.
2024-01-26 10:58:03 +01:00
Patrick Ohly
5d1509126f dra: patch ReservedFor during PreBind
This moves adding a pod to ReservedFor out of the main scheduling cycle into
PreBind. There it is done concurrently in different goroutines. For claims
which were specifically allocated for a pod (the most common case), that
usually makes no difference because the claim is already reserved.

It starts to matter when that pod then cannot be scheduled for other reasons,
because then the claim gets unreserved to allow deallocating it. It also
matters for claims that are created separately and then get used multiple times
by different pods.

Because multiple pods might get added to the same claim rapidly independently
from each other, it makes sense to do all claim status updates via patching:
then it is no longer necessary to have an up-to-date copy of the claim because
the patch operation will succeed if (and only if) the patched claim is valid.

Server-side-apply cannot be used for this because a client always has to send
the full list of all entries that it wants to be set, i.e. it cannot add one
entry unless it knows the full list.
2024-01-26 10:58:03 +01:00
Kubernetes Prow Robot
6c493a1ef9
Merge pull request #122969 from kerthcet/fix/claim
[DRA] Fix indexing the error value in unavailableClaim
2024-01-25 17:34:11 +01:00
kerthcet
7801173f6e get the error claim in dra
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-01-25 23:22:50 +08:00
kerthcet
8371e4cf93 quick break when met
Signed-off-by: kerthcet <kerthcet@gmail.com>
2024-01-23 19:40:15 +08:00
Kubernetes Prow Robot
c6887b1c00
Merge pull request #117803 from sourcelliu/preFilterState
Optimize the performance of the Clone method of preFilterState
2024-01-19 10:57:20 +01:00
amewayne
71c3593f85 support nodeAnnotationsChanged event to trigger rescheduling 2024-01-10 22:38:54 +08:00
Kubernetes Prow Robot
919d4624a0
Merge pull request #122503 from sunbinnnnn/scheduler-extender-support-ignore-bind
Support ignore scheduler extender error when binding
2024-01-08 17:30:44 +01:00
Kubernetes Prow Robot
5b979a3a53
Merge pull request #122498 from Gekko0114/close
Allow framework plugins to be closed
2024-01-08 17:30:36 +01:00
Neil Sun
87816ffb2c Support ignore scheduler extender error when binding
Signed-off-by: sunbinnnnn <sunbinnnnn@hotmail.com>
2024-01-08 21:06:25 +08:00
Kubernetes Prow Robot
b529e6ff1c
Merge pull request #122622 from nayihz/cleanup_comment
swap originalPod and modifiedPod to match the comments
2024-01-06 14:20:50 +01:00
nayihz
edff1c3b2f swap originalPod and modifiedPod to match the comments. 2024-01-06 19:07:18 +08:00
moriya
288c00c0c7 Allow framework plugins to be closed 2024-01-06 10:11:19 +09:00
Kensei Nakada
09abd6be5a address reviews 2024-01-02 02:10:41 +00:00
Kensei Nakada
5ab2317947 run all PreFilter when the preemption will happen later in the same scheduling cycle 2024-01-01 09:44:06 +00:00
Kubernetes Prow Robot
3be9a8cc73
Merge pull request #122351 from sanposhiho/doc-update-for-add
doc: make it clear that how newly scheduled Pods are interpreted in cluster events
2023-12-31 08:04:43 +01:00
Kensei Nakada
e1e035e3a8 doc: make it clear that newly scheduled Pods are Pod/Add events 2023-12-31 05:58:12 +00:00
Kensei Nakada
bf1b3a161b add(nodevolumelimits): return UnschedulableAndUnresolvable when PVC is not found 2023-12-30 23:00:56 +00:00
Kubernetes Prow Robot
afa3f114d6
Merge pull request #117024 from sanposhiho/nodeaffinity-pre-score-skip
feature(NodeAffinity): return Skip in PreScore when nothing to do in Score
2023-12-27 16:53:51 +01:00
Kubernetes Prow Robot
e2a6ce713c
Merge pull request #122415 from pohly/dra-scheduler-deallocation-fix
dra scheduler: fix incorrect tracking of claim candidates for reallocation
2023-12-25 10:45:00 +01:00
Aleksandra Malinowska
e19be41f58 Don't evaluate extra nodes if there's no score plugin defined 2023-12-21 13:29:46 +01:00
Patrick Ohly
b0d4a8cd6d dra scheduler: fix incorrect tracking of claim candidates for reallocation
When dealing with unschedulable pods, the intent was to deallocate only claims
which are allocated and use delayed allocation. That if check wasn't handled
correctly, causing also claims with immediate allocation to be considered as
candidates.

Found during code reading, probably has never occurred in practice yet.
2023-12-20 09:04:01 +01:00
Kubernetes Prow Robot
2b5c0c281d
Merge pull request #122310 from weilaaa/use_buildin_max_min_instead
use build-in max and min func to instead of k8s.io/utils/integer funcs
2023-12-18 19:25:34 +01:00