Commit Graph

3509 Commits

Author SHA1 Message Date
Abhishek Kr Srivastav
9d10ddb060 Fix Go vet errors for master golang
Co-authored-by: Rajalakshmi-Girish <rajalakshmi.girish1@ibm.com>
Co-authored-by: Abhishek Kr Srivastav <Abhishek.kr.srivastav@ibm.com>
2025-01-08 15:11:34 +05:30
Kubernetes Prow Robot
f7d6fad111 Merge pull request #128431 from NoicFank/automated-cherry-pick-of-#128307-upstream-release-1.31
Automated cherry pick of #128307: bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it
2024-11-12 09:13:07 +00:00
NoicFank
2d540ade5f bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it.
Introducing PDB into preemption had disrupted the ordering of pods in the victims,
which would lead to picking the wrong victim node with a higher-priority pod on it.
2024-10-30 15:36:30 +08:00
AxeZhan
d8d31947dc tests for nodes with different nodeName and name 2024-09-24 06:41:05 +00:00
AxeZhan
fdca80f8dc manually revert #109877 2024-09-24 06:41:05 +00:00
Wei Huang
9eec84c67f fix a scheduler preemption issue that victim is not patched properly 2024-08-14 11:48:11 -07:00
Kubernetes Prow Robot
39a80796b6 Merge pull request #122628 from sanposhiho/pod-smaller-events
add(scheduler/framework): implement smaller Pod update events
2024-07-23 18:01:46 -07:00
Kubernetes Prow Robot
a00181d4d4 Merge pull request #121902 from carlory/kep-3751-pv-controller
[kep-3751] pvc bind pv with vac
2024-07-23 11:02:13 -07:00
Kubernetes Prow Robot
43691598da Merge pull request #126227 from sanposhiho/queueing_hint_execution_duration_seconds
feature: support queueing_hint_execution_duration_seconds metric
2024-07-23 02:12:29 -07:00
Kensei Nakada
3f59d9fc4c fix typo 2024-07-23 17:43:21 +09:00
carlory
3a6a4830df pvc bind pv with vac 2024-07-23 15:04:11 +08:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Patrick Ohly
9f36c8d718 DRA: add DRAControlPlaneController feature gate for "classic DRA"
In the API, the effect of the feature gate is that alpha fields get dropped on
create. They get preserved during updates if already set. The
PodSchedulingContext registration is *not* restricted by the feature gate.
This enables deleting stale PodSchedulingContext objects after disabling
the feature gate.

The scheduler checks the new feature gate before setting up an informer for
PodSchedulingContext objects and when deciding whether it can schedule a
pod. If any claim depends on a control plane controller, the scheduler bails
out, leading to:

    Status:       Pending
    ...
      Warning  FailedScheduling             73s   default-scheduler  0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.

The rest of the changes prepare for testing the new feature separately from
"structured parameters". The goal is to have base "dra" jobs which just enable
and test those, then "classic-dra" jobs which add DRAControlPlaneController.
2024-07-22 18:09:34 +02:00
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
91d7882e86 DRA: new API for 1.31
This is a complete revamp of the original API. Some of the key
differences:
- refocused on structured parameters and allocating devices
- support for constraints across devices
- support for allocating "all" or a fixed amount
  of similar devices in a single request
- no class for ResourceClaims, instead individual
  device requests are associated with a mandatory
  DeviceClass

For the sake of simplicity, optional basic types (ints, strings) where the null
value is the default are represented as values in the API types. This makes Go
code simpler because it doesn't have to check for nil (consumers) and values
can be set directly (producers). The effect is that in protobuf, these fields
always get encoded because `opt` only has an effect for pointers.
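
The trade-off described above can be sketched in Go. This is only an illustration: the type names `PointerStyle` and `ValueStyle`, the field `Count`, and both helper functions are hypothetical and are not taken from the resource.k8s.io API types.

```go
package main

import "fmt"

// PointerStyle is a hypothetical type that models an optional field as a
// pointer: nil means "unset", so every consumer has to check for nil.
type PointerStyle struct {
	Count *int64
}

// ValueStyle is a hypothetical type that models the same field as a plain
// value: the Go zero value doubles as the default.
type ValueStyle struct {
	Count int64
}

// countFromPointer shows the nil check that consumers need with pointers.
func countFromPointer(p PointerStyle) int64 {
	if p.Count == nil {
		return 0 // treat "unset" as the default
	}
	return *p.Count
}

// countFromValue needs no check: the field can be read directly.
func countFromValue(v ValueStyle) int64 {
	return v.Count
}

func main() {
	// Producers of the pointer variant need an addressable variable.
	n := int64(3)
	p := PointerStyle{Count: &n}

	// Producers of the value variant can set the field directly.
	v := ValueStyle{Count: 3}

	fmt.Println("pointer style:", countFromPointer(p))
	fmt.Println("value style:", countFromValue(v))
	fmt.Println("unset pointer:", countFromPointer(PointerStyle{}))
	fmt.Println("unset value:", countFromValue(ValueStyle{}))
}
```

With the value variant, "unset" and "explicitly set to the zero value" cannot be told apart, which matches the choice above to use values only where the default is that zero value.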

The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new
"request" field. This is considered acceptable because the entire `claims`
field in the pod spec is still alpha.

The implementation is complete enough to bring up the apiserver.
Adapting other components follows.
2024-07-22 18:09:34 +02:00
Kubernetes Prow Robot
8b8f84c6a7 Merge pull request #125862 from sanposhiho/cleanup-nominated
cleanup: remove duplicated AddNominatedPod
2024-07-22 06:50:03 -07:00
Kensei Nakada
2a51bd81fa fix: async metric recording 2024-07-22 21:32:19 +09:00
杨朱 · Kiki
bc3c07091b Fix a bug where the target pod doesn't become schedulable within 5 minutes when a deleted pod uses the same PVC with the ReadWriteOncePod access mode. (#126263)
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>
2024-07-22 01:20:34 -07:00
Patrick Ohly
8a629b9f15 DRA: remove "sharable" from claim allocation result
Now all claims are shareable up to the limit imposed by the size of the
"reservedFor" array.

This is one of the agreed simplifications for 1.31.
2024-07-21 17:28:14 +02:00
Patrick Ohly
de5742ae83 DRA: remove immediate allocation
As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate
allocation is one of those features which can be removed because it makes no
sense for structured parameters and the justification for classic DRA is weak.
2024-07-21 17:28:14 +02:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Kensei Nakada
82a54e8cc8 cleanup: remove duplicated addNominatedPodUnlocked 2024-07-21 16:04:25 +09:00
Kensei Nakada
fa8092f838 support UpdatePodScaleDown instead of UpdatePodRequest 2024-07-20 19:20:38 +09:00
Kensei Nakada
0dee497876 fix: make updatePodOther private 2024-07-20 17:49:46 +09:00
Kensei Nakada
0b133c7fa9 modify test 2024-07-20 17:44:57 +09:00
Kensei Nakada
e46fe0b673 register UpdatePodOther to a general Update 2024-07-20 17:44:57 +09:00
Kensei Nakada
066826d476 fix wordings 2024-07-20 17:44:57 +09:00
Kensei Nakada
4283ab5df3 use PodUpdateOther internally 2024-07-20 17:44:55 +09:00
Kensei Nakada
0cd1ee4259 add(scheduler/framework): implement smaller Pod update events 2024-07-20 17:44:23 +09:00
Kubernetes Prow Robot
64ba17c605 Merge pull request #125571 from liggitt/filter-auth-02-sar
add field and label selectors to authorization
2024-07-19 15:30:01 -07:00
Jordan Liggitt
03d48b7683 Move CEL env initialization out of package init()
This ensures compatibility version and feature gates can be initialized
before cached CEL environments are created.
2024-07-19 15:06:48 -04:00
Kubernetes Prow Robot
6f3f115378 Merge pull request #126222 from macsko/dont_lock_activeq_twice_in_activate_in_scheduling_queue
Don't lock activeQ twice when activating pod in scheduling queue
2024-07-19 12:03:10 -07:00
bells17
e1aa8197ed volumebinding: scheduler queueing hints - CSIStorageCapacity (#124961)
* volumebinding: scheduler queueing hints - CSIStorageCapacity

* Fixed points mentioned in the review

* Fixed points mentioned in the review

* Update pkg/scheduler/framework/plugins/volumebinding/volume_binding.go

Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

* Update pkg/scheduler/framework/plugins/volumebinding/volume_binding_test.go

Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

* Fixed points mentioned in the review

* Update volume_binding.go

Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

---------

Co-authored-by: Kensei Nakada <handbomusic@gmail.com>
2024-07-19 07:53:52 -07:00
Kensei Nakada
7ef3cf5d07 feature: support queueing_hint_execution_duration_seconds metric 2024-07-19 23:13:07 +09:00
Kubernetes Prow Robot
01eb9f4754 Merge pull request #125929 from sanposhiho/requeueing-metrics
add: implement event_handling_duration_seconds metric
2024-07-19 04:43:00 -07:00
Maciej Skoczeń
7421ded6f9 Don't lock activeQ twice when activating pod in scheduling queue 2024-07-19 09:18:42 +00:00
Kensei Nakada
9ff3227b15 add: implement event_handling_duration_seconds metric 2024-07-18 18:16:57 +09:00
Kubernetes Prow Robot
24fbb13eaf Merge pull request #126113 from googs1025/enqueueExtensions_refactor
scheduler: Add ctx param and error return to EnqueueExtensions.EventsToRegister()
2024-07-18 00:53:25 -07:00
googs1025
a3978e8315 scheduler: Add ctx param and error return to EnqueueExtensions.EventsToRegister() 2024-07-18 12:22:17 +08:00
Kubernetes Prow Robot
5d40866fae Merge pull request #125994 from carlory/fix-job-api
clean up code after PodDisruptionConditions was promoted to GA
2024-07-17 14:37:09 -07:00
Kubernetes Prow Robot
d879103c28 Merge pull request #125820 from macsko/add_separate_lock_for_pod_nominator_scheduling_queue
Add a separate lock for pod nominator in scheduling queue
2024-07-17 12:06:10 -07:00
Maciej Skoczeń
5def93b10a Add a separate lock for pod nominator in scheduling queue 2024-07-17 07:58:59 +00:00
bells17
4c3c4128af volumebinding: scheduler queueing hints - StorageClass 2024-07-17 15:03:17 +09:00
bells17
aceb4468b6 volumebinding: scheduler queueing hints - PersistentVolumeClaim 2024-07-16 12:48:50 +09:00
Kubernetes Prow Robot
ae1caa40a2 Merge pull request #125961 from Jerry-yz/master
Chore: fix scheduler code comment typos
2024-07-15 19:27:30 -07:00
Kubernetes Prow Robot
c7dab2a507 Merge pull request #125280 from HirazawaUi/add-pvc-events-queueinghintfn
Add QueueingHintFn for pvc events in VolumeRestriction plugin
2024-07-12 11:47:30 -07:00
HirazawaUi
cd13be8654 Add QueueingHintFn for pvc events in VolumeRestriction plugin 2024-07-13 00:25:39 +08:00
Kubernetes Prow Robot
bae59799e9 Merge pull request #126050 from sanposhiho/refactor-move
cleanup: refactor how smaller Node events are extracted
2024-07-12 08:13:12 -07:00
Kubernetes Prow Robot
31062790a1 Merge pull request #125855 from googs1025/refactor_scheduler_ut
chore: call close framework when finishing
2024-07-12 05:14:35 -07:00
Hiroyuki Moriya
52a622ad6d volumezone: scheduler queueing hints: pv (#125001)
* volumezone: scheduler queueing hints

* add_comment
2024-07-12 05:14:27 -07:00