1044 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
53e91942bf
Merge pull request #117780 from sourcelliu/schedulinggates
Improve the performance of schedulinggates
2023-10-26 18:14:32 +02:00
Kubernetes Prow Robot
2749509f35
Merge pull request #121469 from sanposhiho/renamerename
cleanup: rename failedPlugin to plugin in framework.Status
2023-10-25 20:16:43 +02:00
Kensei Nakada
27bb66fd7b cleanup: rename failedPlugin to plugin in framework.Status 2023-10-25 12:03:56 +00:00
Kubernetes Prow Robot
9aa04752e7
Merge pull request #118463 from testwill/replace_loop
chore: slice replace loop
2023-10-24 15:04:39 +02:00
Mengjiao Liu
b0a73213d6 kube-scheduler: convert the remaining part to use contextual logging 2023-10-24 17:56:48 +08:00
Kubernetes Prow Robot
6d7d249372
Merge pull request #121077 from chrishenzie/readwriteoncepod-ga
Graduate ReadWriteOncePod to GA
2023-10-24 05:26:05 +02:00
Kubernetes Prow Robot
5a4e792e06
Merge pull request #120534 from pohly/dra-scheduler-ssa-as-fallback
dra scheduler: fall back to SSA for PodSchedulingContext updates
2023-10-23 21:06:58 +02:00
Chris Henzie
2dbd405583 Graduate ReadWriteOncePod to GA 2023-10-20 10:40:39 -07:00
Kensei Nakada
cb5dc46edf feature(scheduler): simplify QueueingHint by introducing new statuses 2023-10-19 11:02:11 +00:00
Kubernetes Prow Robot
ff4ba92b1c
Merge pull request #119155 from carlory/fix-118893-1
nodeaffinity: scheduler queueing hints
2023-10-19 09:31:27 +02:00
Kubernetes Prow Robot
46c307868f
Merge pull request #119176 from carlory/fix-118893-2
nodeports: scheduler queueing hints
2023-10-10 19:07:07 +02:00
Kubernetes Prow Robot
e224fc75ca
Merge pull request #116885 from mengjiao-liu/contextual-logging-scheduler-plugin-examples
Migrated `pkg/scheduler/framework/plugins/examples/` to use contextual logging
2023-10-09 20:32:46 +02:00
Mengjiao Liu
9cca527c4b Migrated pkg/scheduler/framework/plugins/examples/ to use contextual logging 2023-10-09 11:43:17 +08:00
carlory
7cba35f651 nodeports: scheduler queueing hints
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-10-08 11:34:43 +08:00
HirazawaUi
b20bc79a60 remove not register event code 2023-10-07 21:59:01 +08:00
bzsuni
6200eb04af use generic sets in scheduler
Signed-off-by: bzsuni <bingzhe.sun@daocloud.io>
2023-09-28 21:31:33 +08:00
Kubernetes Prow Robot
9c5698f514
Merge pull request #116803 from mengjiao-liu/contextual-logging-scheduler-plugin-volumebinding
Migrated `pkg/scheduler/framework/plugins/volumebinding` to contextual logging
2023-09-27 15:04:38 -07:00
carlory
1d88bf9789 scheduler/nodeaffinity: reduce pod scheduling latency
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>
2023-09-25 09:41:22 +08:00
Kubernetes Prow Robot
3ac83f528d
Merge pull request #119290 from carlory/add-logger
the scheduling queue logs the error and treats it as QueueAfterBackoff
2023-09-22 08:10:49 -07:00
Mengjiao Liu
3eb6c4d368 Migrated pkg/scheduler/framework/plugins/volumebinding to contextual logging 2023-09-21 11:28:12 +08:00
carlory
0105a002bc when the hint fn returns error, the scheduling queue logs the error and treats it as QueueAfterBackoff.
Co-authored-by: Kensei Nakada <handbomusic@gmail.com>

Co-authored-by: Kante Yin <kerthcet@gmail.com>

Co-authored-by: XsWack <xushiwei5@huawei.com>
2023-09-21 09:40:44 +08:00
Mengjiao Liu
a7466f44e0 Change the scheduler plugins PluginFactory function to use context parameter to pass logger
- Migrated pkg/scheduler/framework/plugins/nodevolumelimits to use contextual logging
- Fix golangci-lint validation failure
- Check for plugins creation err
2023-09-20 17:49:54 +08:00
Patrick Ohly
7cac1dcf67 dra scheduler: fall back to SSA for PodSchedulingContext updates
During scheduler_perf testing, roughly 10% of the PodSchedulingContext update
operations failed with a conflict error. Using SSA would avoid that, but
performance measurements showed that this causes a considerable
slowdown (primarily because of the slower encoding with JSON instead of
protobuf, but also because server-side processing is more expensive).

Therefore a normal update is tried first and SSA only gets used when there has
been a conflict. Using SSA in that case instead of giving up outright is better
because it avoids another scheduling attempt.
2023-09-15 15:05:38 +02:00
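
The fallback described in the commit above (try a normal update first, use SSA only after a conflict) can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: the `updater` interface, `errConflict`, `fakeClient`, and `updateWithFallback` are hypothetical names, and real code would most likely detect conflicts with `apierrors.IsConflict` from `k8s.io/apimachinery` instead of a sentinel error.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical stand-in for an API server conflict error.
var errConflict = errors.New("conflict")

// updater is a hypothetical client offering a cheap Update and a more
// expensive server-side Apply.
type updater interface {
	Update(obj string) error
	Apply(obj string) error
}

// updateWithFallback tries the normal update first and only falls back to
// Apply when the update failed with a conflict. Other errors are returned
// unchanged.
func updateWithFallback(c updater, obj string) (usedApply bool, err error) {
	err = c.Update(obj)
	if err == nil {
		return false, nil
	}
	if !errors.Is(err, errConflict) {
		return false, err
	}
	return true, c.Apply(obj)
}

// fakeClient counts calls and can be told to report a conflict on Update.
type fakeClient struct {
	conflictOnUpdate bool
	updates, applies int
}

func (f *fakeClient) Update(string) error {
	f.updates++
	if f.conflictOnUpdate {
		return errConflict
	}
	return nil
}

func (f *fakeClient) Apply(string) error {
	f.applies++
	return nil
}

func main() {
	for _, conflict := range []bool{false, true} {
		c := &fakeClient{conflictOnUpdate: conflict}
		usedApply, err := updateWithFallback(c, "podschedulingcontext")
		fmt.Printf("conflict=%v usedApply=%v err=%v\n", conflict, usedApply, err)
	}
}
```

In this sketch the expensive path runs only in the conflict case, which matches the trade-off the commit message describes: the slower encoding is paid for only when it avoids another scheduling attempt.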
Stephen Kitt
3cb0b520d6
Scheduler CSI tests: switch maxVols to int32
This ends up stored in an int32 Count, use the target type throughout
to avoid narrowing conversions.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-15 09:52:50 +02:00
wackxu
28dbe8a34d scheduler/NodeUnschedulable: reduce pod scheduling latency
Signed-off-by: wackxu <xushiwei5@huawei.com>
2023-09-14 10:23:43 +08:00
Stephen Kitt
9990307146
kube-scheduler: drop deprecated pointer package
This replaces deprecated k8s.io/utils/pointer functions with their ptr
equivalent.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-13 09:42:19 +02:00
Kubernetes Prow Robot
db49b13ccd
Merge pull request #120252 from kerthcet/cleanup/framework-import
Move framework testing libraries to the right place
2023-09-12 17:44:11 -07:00
kerthcet
6fbb8ec7e4 Move scheduler testing utils to /scheduler/testing
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-09-12 13:42:38 +08:00
Patrick Ohly
6f9140e421 DRA scheduler: stop allocating before deallocation
This fixes a test flake:

    [sig-node] DRA [Feature:DynamicResourceAllocation] multiple nodes reallocation [It] works
    /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:552

      [FAILED] number of deallocations
      Expected
          <int64>: 2
      to equal
          <int64>: 1
      In [It] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:651 @ 09/05/23 14:01:54.652

This can be reproduced locally with

    stress -p 10 go test ./test/e2e -args -ginkgo.focus=DynamicResourceAllocation.*reallocation.works  -ginkgo.no-color -v=4 -ginkgo.v

Log output showed that the sequence of events leading to this was:
- claim gets allocated because of selected node
- a different node has to be used, so PostFilter sets
  claim.status.deallocationRequested
- the driver deallocates
- before the scheduler can react and select a different node,
  the driver allocates *again* for the original node
- the scheduler asks for deallocation again
- the driver deallocates again (causing the test failure)
- eventually the pod runs

The fix is to disable allocations first by removing the selected node and then
starting to deallocate.
2023-09-11 10:56:17 +02:00
Kubernetes Prow Robot
a64a3e16ec
Merge pull request #120253 from pohly/dra-scheduler-podschedulingcontext-updates
dra scheduler: refactor PodSchedulingContext updates
2023-09-08 02:48:14 -07:00
Patrick Ohly
5c7dac2d77 dra scheduler: refactor PodSchedulingContext updates
Instead of modifying the PodSchedulingContext and then creating or updating it,
now the required changes (selected node, potential nodes) are tracked and the
actual input for an API call is created if (and only if) needed at the end.

This makes the code easier to read and change. In particular, replacing the
Update call with Patch or Apply is easy.
2023-09-08 08:06:06 +02:00
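
The refactoring described in the commit above (track the required changes, then build the API input only if something actually changed) might look roughly like this. It is a simplified sketch under assumptions: `schedulingSpec`, `pendingChanges`, `apply`, and `sameNodes` are hypothetical names, and the real PodSchedulingContext type has more fields than shown here.

```go
package main

import "fmt"

// schedulingSpec is a hypothetical, simplified stand-in for the spec of a
// PodSchedulingContext.
type schedulingSpec struct {
	SelectedNode   string
	PotentialNodes []string
}

// pendingChanges records requested changes without touching the stored
// object. A nil field means "no change requested".
type pendingChanges struct {
	selectedNode   *string
	potentialNodes *[]string
}

// sameNodes reports whether two node lists have identical contents.
func sameNodes(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// apply returns the desired spec and whether it differs from the current
// one, i.e. whether an API call is needed at all.
func (p pendingChanges) apply(current schedulingSpec) (schedulingSpec, bool) {
	desired := current
	if p.selectedNode != nil {
		desired.SelectedNode = *p.selectedNode
	}
	if p.potentialNodes != nil {
		desired.PotentialNodes = append([]string(nil), (*p.potentialNodes)...)
	}
	changed := desired.SelectedNode != current.SelectedNode ||
		!sameNodes(desired.PotentialNodes, current.PotentialNodes)
	return desired, changed
}

func main() {
	current := schedulingSpec{SelectedNode: "node-a", PotentialNodes: []string{"node-a", "node-b"}}

	same := "node-a"
	_, needed := pendingChanges{selectedNode: &same}.apply(current)
	fmt.Println("update needed:", needed)

	other := "node-b"
	desired, needed := pendingChanges{selectedNode: &other}.apply(current)
	fmt.Println("update needed:", needed, "selected:", desired.SelectedNode)
}
```

Because the desired state is computed in one place, swapping the final call for a Patch or Apply only affects the code that consumes the returned spec.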
Kubernetes Prow Robot
2d5b6f16f5
Merge pull request #120213 from pohly/dra-scheduler-resourceclass-missing
dra: resourceclass missing
2023-09-06 23:47:09 -07:00
Patrick Ohly
c682d2b8c5 scheduler: add ResourceClass events
When filtering fails because a ResourceClass is missing, we can treat the pod
as "unschedulable" as long as we then also register a cluster event that wakes
up the pod. This is more efficient than periodically retrying.
2023-09-06 11:14:08 +02:00
Kubernetes Prow Robot
cd91351dff
Merge pull request #117720 from kerthcet/feat/remove-selector-spread
Remove deprecated selectorSpread
2023-08-29 00:25:22 -07:00
Kubernetes Prow Robot
029d518970
Merge pull request #117588 from kerthcet/cleanup/use-genericset
Avoid duplicated dots in pod status when preempting
2023-08-28 08:39:44 -07:00
kerthcet
855b445d28 Remove deprecated selectorSpread
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 22:11:33 +08:00
Kubernetes Prow Robot
faf1b5d655
Merge pull request #114685 from AxeZhan/dynamicresources
dynamic resource allocation: optimize class.SuitableNodes usage
2023-08-28 04:43:43 -07:00
kerthcet
3d583398fe Avoid building the error msg twice
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-28 17:13:39 +08:00
Patrick Ohly
2472291790 api: introduce separate VolumeResourceRequirements struct
PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To prevent such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.

The `Claims` field gets removed because the risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI. Such files
have never made sense and should be fixed.

Code that uses the struct definitions needs to be updated.
2023-08-21 15:31:28 +02:00
Kubernetes Prow Robot
312dc127a9
Merge pull request #118923 from AxeZhan/volume_zone_csi
[Scheduler] Translate beta label to ga in volume_zone
2023-08-17 20:20:28 -07:00
AxeZhan
af26ebd0fa translate beta label to ga in volume_zone 2023-08-18 00:31:09 +08:00
SataQiu
ef7d404702 using wait.PollUntilContextTimeout instead of deprecated wait.Poll for pkg/scheduler
using wait.PollUntilContextTimeout instead of deprecated wait.Poll for test/integration/scheduler

using wait.PollUntilContextTimeout instead of deprecated wait.Poll for test/e2e/scheduling

using wait.ConditionWithContextFunc for PodScheduled/PodIsGettingEvicted/PodScheduledIn/PodUnschedulable/PodSchedulingError
2023-08-17 17:25:09 +08:00
AxeZhan
47fec59a31 parse node selector in prefilter 2023-08-14 16:39:46 +08:00
wackxu
a9d26ac7c7 Optimize the code of NodeUnschedulable to reduce TolerationsTolerateTaint function calls
Signed-off-by: wackxu <xushiwei5@huawei.com>
2023-07-18 21:00:05 +08:00
carlory
0599b3caa0 change the QueueingHintFn to pass a logger 2023-07-13 00:56:41 +08:00
Patrick Ohly
6f1a29520f scheduler/dra: reduce pod scheduling latency
This is a combination of two related enhancements:
- By implementing a PreEnqueue check, the initial pod scheduling
  attempt for a pod with a claim template gets avoided when the claim
  does not exist yet.
- By implementing cluster event checks, only those pods get
  scheduled for which something changed, and they get scheduled
  immediately without delay.
2023-07-12 11:17:04 +02:00
Patrick Ohly
ef48efc736 scheduler dynamicresources: minor logging improvements
This makes some complex values a bit more readable.
2023-07-12 11:07:59 +02:00
Kubernetes Prow Robot
e0dafe57a3
Merge pull request #117351 from pohly/dra-generated-resource-claim-names
DRA: generated resource claim names
2023-07-11 10:33:11 -07:00
Patrick Ohly
444d23bd2f dra: generated name for ResourceClaim from template
Generating the name avoids all potential name collisions. It's not clear how
much of a problem that was because users can avoid them and the deterministic
names for generic ephemeral volumes have not led to reports from users. But
using generated names is not too hard either.

What makes it relatively easy is that the new pod.status.resourceClaimStatus
map stores the generated name for kubelet and node authorizer, i.e. the
information in the pod is sufficient to determine the name of the
ResourceClaim.

The resource claim controller becomes a bit more complex and now needs
permission to modify the pod status. The new failure scenario of "ResourceClaim
created, updating pod status fails" is handled with the help of a new special
"resource.kubernetes.io/pod-claim-name" annotation that together with the owner
reference identifies exactly for what a ResourceClaim was generated, so
updating the pod status can be retried for existing ResourceClaims.

The transition from deterministic names is handled with a special case for that
recovery code path: a ResourceClaim with no annotation and a name that follows
the Kubernetes <= 1.27 naming pattern is assumed to be generated for that pod
claim and gets added to the pod status.

There's no immediate need for it, but just in case that it may become relevant,
the name of the generated ResourceClaim may also be left unset to record that
no claim was needed. Components processing such a pod can skip whatever they
normally would do for the claim. To ensure that they do and also cover other
cases properly ("no known field is set", "must check ownership"),
resourceclaim.Name gets extended.
2023-07-11 14:23:48 +02:00
Kubernetes Prow Robot
c95b16b280
Merge pull request #118608 from utam0k/podtopologyspread-prescore-skip
Return Skip in PodTopologySpread#PreScore under specific conditions
2023-07-10 09:27:07 -07:00
Kubernetes Prow Robot
0ae9aaacfa
Merge pull request #118271 from tangwz/add_nodeports_prefilter_skip_status
feat(NodePorts): return Skip status in PreFilter
2023-07-09 20:49:04 -07:00
Gunju Kim
7286d122fb
Mark pods with restartable init containers as UnschedulableAndUnresolvable
This marks the pods with restartable init containers as
`UnschedulableAndUnresolvable` if the feature gate is disabled to avoid
the inconsistency in resource calculation between the scheduler and the
older kubelet.
2023-07-08 07:26:13 +09:00
tangwz
1bf2f6c9c0 feat(NodePorts): return Skip status in PreFilter 2023-07-06 08:42:08 +08:00
utam0k
ef26510164
Return Skip in PodTopologySpread#PreScore under specific conditions
Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-06-28 12:08:10 +00:00
Kubernetes Prow Robot
52457842d1
Merge pull request #117055 from cyclinder/csi_migration
remove CSI-migration gate
2023-06-28 04:28:31 -07:00
Kubernetes Prow Robot
d9714078f8
Merge pull request #118551 from sanposhiho/event-to-register
feature(scheduler): implement ClusterEventWithHint to filter out useless events
2023-06-26 06:41:45 -07:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Kubernetes Prow Robot
bc8e312857
Merge pull request #117903 from sourcelliu/dynamic
feature(DynamicResources): return Skip in PreFilter
2023-06-20 17:48:20 -07:00
Kubernetes Prow Robot
4483bf66fe
Merge pull request #116635 from mengjiao-liu/contextual-logging-plugin-interpodaffinity
Migrated `pkg/scheduler/framework/plugins/interpodaffinity` to contextual logging
2023-06-09 08:14:13 -07:00
guoguangwu
1d9eed9f95 chore: slice replace loop 2023-06-05 22:40:53 +08:00
SataQiu
410b6023d6 scheduler: fix code style issues for pkg/scheduler 2023-06-05 17:29:49 +08:00
cyclinder
8e4228a8c1 remove CSI-migration gate 2023-06-04 18:40:17 +08:00
Mengjiao Liu
6d23da045f Migrated pkg/scheduler/framework/plugins/interpodaffinity to use contextual logging 2023-06-01 18:24:54 +08:00
Mengjiao Liu
074900e81b scheduler: update the scheduler interface and cache methods to use contextual logging 2023-05-29 13:26:32 +08:00
Kubernetes Prow Robot
f7cfb5f02f
Merge pull request #118257 from pohly/dra-scheduler-plugin-loopvar-fix
dra scheduler plugin test: fix loopvar bug and "reserve" expected data
2023-05-26 06:06:53 -07:00
Patrick Ohly
7a6b4a9215 dra scheduler plugin test: fix loopvar bug and "reserve" expected data
The `listAll` function returned a slice where all pointers referred to the same
instance. That instance had the value of the last list entry. As a result, unit
tests only compared that element.

During the reserve phase, the first claim gets reserved in two test
cases. Those two tests must expect that change. That hadn't been noticed before
because that first claim didn't get compared.
2023-05-25 15:10:05 +02:00
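
The loop variable bug described in the commit above can be reproduced with a small sketch. The `claim` type and both `listAll` variants are hypothetical and do not come from the test file. The shared variable is declared explicitly outside the loop, so the example behaves the same on every Go version (Go 1.22 changed `for` loops to create a fresh variable per iteration).

```go
package main

import "fmt"

// claim is a hypothetical stand-in for the listed objects.
type claim struct{ Name string }

// listAllBuggy stores the address of one shared variable for every element,
// so all returned pointers refer to the same instance, which ends up holding
// the last element.
func listAllBuggy(items []claim) []*claim {
	var out []*claim
	var item claim // shared across iterations, as in pre-1.22 range loops
	for i := range items {
		item = items[i]
		out = append(out, &item)
	}
	return out
}

// listAllFixed takes the address of each slice element instead.
func listAllFixed(items []claim) []*claim {
	out := make([]*claim, 0, len(items))
	for i := range items {
		out = append(out, &items[i])
	}
	return out
}

// names collects the Name of every pointed-to claim.
func names(ptrs []*claim) []string {
	var result []string
	for _, p := range ptrs {
		result = append(result, p.Name)
	}
	return result
}

func main() {
	items := []claim{{"a"}, {"b"}, {"c"}}
	fmt.Println(names(listAllBuggy(items))) // prints [c c c]
	fmt.Println(names(listAllFixed(items))) // prints [a b c]
}
```

With the buggy variant, comparing results against expected data effectively checks only the last element, which is how the incorrect "reserve" expectations went unnoticed.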
Mengjiao Liu
1c05cf1d51 kube-scheduler: NewFramework function to pass the context parameter
Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>
2023-05-23 10:17:34 +08:00
Kubernetes Prow Robot
c7c41d27b4
Merge pull request #117834 from NoicFank/cleanup-scheduler-node-must-not-nil-in-snapshot
cleanup useless null pointer check about nodeInfo.Node() from snapshot for in-tree plugins
2023-05-20 15:16:18 -07:00
dingzhu lurong
ed26fcf5b8 cleanup useless null pointer check about nodeInfo.Node() from snapshot for in-tree plugins 2023-05-20 22:53:43 +08:00
Kubernetes Prow Robot
da1b9df26c
Merge pull request #118032 from kerthcet/cleanup/interpodaffinity2
Chore: cleanup in interPodAffinity
2023-05-17 14:00:33 -07:00
Kubernetes Prow Robot
53772982be
Merge pull request #116829 from mengjiao-liu/contextual-logging-scheduler-plugin-volumezone
Migrated the volumezone scheduler plugin to use contextual logging
2023-05-16 09:53:35 -07:00
kerthcet
3ac7497361 Chore: cleanup in interpodaffinity
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-05-16 14:46:15 +08:00
mantuliu
6e2ea32fc8 feature(DynamicResources): return Skip in PreFilter 2023-05-15 00:06:08 +08:00
Kubernetes Prow Robot
58e13496d6
Merge pull request #116842 from mengjiao-liu/contextual-logging-scheduler-runtime
Migrated `pkg/scheduler/framework/runtime` to use contextual logging
2023-05-11 10:59:02 -07:00
Mengjiao Liu
fe728996ca scheduler test: call frameworkruntime.WithLogger function for contextual logging 2023-05-11 15:46:08 +08:00
utam0k
c0611b6bb3
Return Skip in InterPodAffinity#PreScore under specific conditions
This commit updates the InterPodAffinity PreScore to return a Skip status when the following conditions are met:
1. There are no nodes to score.
2. The incoming pod has no inter-pod affinities && the `IgnorePreferredTermsOfExistingPods` option is enabled.

Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-05-10 13:02:23 +00:00
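
The two Skip conditions listed in the commit above can be expressed as a small predicate. This is only a sketch: `preScoreInput` and `shouldSkipPreScore` are hypothetical names rather than the plugin's API, and it assumes that either condition on its own is enough to skip.

```go
package main

import "fmt"

// preScoreInput is a hypothetical, flattened view of what the check needs.
type preScoreInput struct {
	NodeCount                          int
	PodHasAffinity                     bool
	IgnorePreferredTermsOfExistingPods bool
}

// shouldSkipPreScore reports whether scoring can be skipped. Assumption:
// either condition alone is sufficient.
func shouldSkipPreScore(in preScoreInput) bool {
	// Condition 1: there are no nodes to score.
	if in.NodeCount == 0 {
		return true
	}
	// Condition 2: the incoming pod has no inter-pod affinities and the
	// preferred terms of existing pods are ignored.
	return !in.PodHasAffinity && in.IgnorePreferredTermsOfExistingPods
}

func main() {
	cases := []preScoreInput{
		{NodeCount: 0, PodHasAffinity: true},
		{NodeCount: 3, PodHasAffinity: false, IgnorePreferredTermsOfExistingPods: true},
		{NodeCount: 3, PodHasAffinity: false, IgnorePreferredTermsOfExistingPods: false},
		{NodeCount: 3, PodHasAffinity: true, IgnorePreferredTermsOfExistingPods: true},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> skip=%v\n", c, shouldSkipPreScore(c))
	}
}
```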
mantuliu
e6900f5ead Optimize the performance of the Clone method of preFilterState 2023-05-05 19:53:12 +08:00
mantuliu
887654160f Improve the performance of schedulinggates 2023-05-04 18:09:49 +08:00
Kubernetes Prow Robot
47f1bd9f80
Merge pull request #117649 from SataQiu/scheduler-remove-v1beta2-20230427
scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration component config
2023-05-03 09:54:41 -07:00
Kubernetes Prow Robot
0d67dd689b
Merge pull request #117683 from utam0k/skip-topologyspread-empty
Add check to skip PodTopologySpread PreFilter if no constraints are specified
2023-05-03 06:48:24 -07:00
SataQiu
1f7c07f355 scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration 2023-05-03 21:43:19 +08:00
utam0k
d82684e691
Add check to skip PodTopologySpread PreFilter if no constraints are specified
This commit adds a check in the PodTopologySpread PreFilter function to
return a Skip status if there are no topology spread constraints specified.
This prevents unnecessary processing and filtering for pods that don't have
any topology spread constraints.
This change is a part of the work for issue #114399.

Signed-off-by: utam0k <k0ma@utam0k.jp>
2023-05-03 04:39:00 +00:00
Kubernetes Prow Robot
8353d4623b
Merge pull request #117427 from cbandy/pkg-testing-setenv
Replace os.Setenv with testing.T.Setenv in tests
2023-04-29 08:28:16 -07:00
Kubernetes Prow Robot
b44482a37c
Merge pull request #116797 from mengjiao-liu/contextual-looging-scheduler-plugin-podtopologyspread
Migrated `pkg/scheduler/framework/plugins/podtopologyspread` to contextual logging
2023-04-27 12:28:27 -07:00
Kubernetes Prow Robot
a38efaccc0
Merge pull request #116748 from mengjiao-liu/contextual-logging-scheduler-plugin-noderesource
Migrated `pkg/scheduler/framework/plugins/noderesources` to contextual logging
2023-04-27 12:28:15 -07:00
Kubernetes Prow Robot
5170c25609
Merge pull request #116835 from mengjiao-liu/contextual-logging-scheduler-plugin-preemption
Migrated `pkg/scheduler/framework/preemption & defaultpreemption` to use contextual logging
2023-04-27 11:10:16 -07:00
Kubernetes Prow Robot
87f3acf7f6
Merge pull request #115398 from tangwz/add_NodeVolumeLimits_PreFilter
feat(NodeVolumeLimits): return Skip in PreFilter
2023-04-27 01:44:14 -07:00
Mengjiao Liu
7f370d651d Migrated pkg/scheduler/framework/plugins/podtopologyspread to contextual logging 2023-04-27 15:55:09 +08:00
Mengjiao Liu
54e6f609ce Migrated pkg/scheduler/framework/plugins/noderesources to contextual logging 2023-04-27 14:46:13 +08:00
Mengjiao Liu
37a9260d5c Migrate pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go to use contextual logging 2023-04-27 11:28:19 +08:00
Mengjiao Liu
eeb1399383 Migrated pkg/scheduler/framework/preemption to use contextual logging 2023-04-27 11:28:14 +08:00
tangwz
8ed861889a feat(NodeVolumeLimits): return Skip in PreFilter 2023-04-26 20:17:04 +08:00
Maciej Borsz
5c584269a7 avoid volume copy in checkAttachableInlineVolume 2023-04-19 20:10:22 +00:00
Chris Bandy
d38ac7e7c6 Replace os.Setenv with testing.T.Setenv in tests
T.Setenv ensures that the environment is returned to its prior state
when the test ends. It also panics when called from a parallel test to
prevent racy test interdependencies.
2023-04-17 20:39:46 -05:00
Kubernetes Prow Robot
94a15929cf
Merge pull request #116408 from ChenLingPeng/fit
skip pod resource check when request is zero
2023-04-17 11:44:45 -07:00
Kubernetes Prow Robot
242702cb86
Merge pull request #116940 from sarab97/sarab/feat/sets
Use the generic Set in scheduler
2023-04-11 19:17:14 -07:00
Kubernetes Prow Robot
365ac69fc4
Merge pull request #116845 from major1201/fix_binder_typo
fix GetPodVolumeClaims in comments
2023-04-11 18:20:05 -07:00
Kubernetes Prow Robot
e77ca49022
Merge pull request #114898 from AxeZhan/volumerestrictions
feature(volume_restrictions): return Skip in PreFilter
2023-04-11 15:35:04 -07:00
sarab
8d18ae6fc2 Use the generic Set in scheduler 2023-04-09 11:34:17 +05:30
kidddddddddddddddddddddd
8d644fbc72 return skip in volumerestrictions 2023-03-23 23:14:24 +08:00