kubernetes/pkg/scheduler/framework/plugins
Patrick Ohly ecbafb8de5 DRA: fix scheduler/resource claim controller race
There was a race caused by having to update claim finalizer and status in two
different operations:
- Resource claim controller removes the allocation, but does not yet
  get to remove the finalizer.
- Scheduler prepares an allocation without adding the finalizer
  because the finalizer is still there.
- Controller removes the finalizer.
- Scheduler adds the allocation.

This is an invalid state. Automatic checking found this during the execution of
the "with translated parameters on single node.*supports sharing a claim
sequentially" E2E test, but only when run stand-alone. When running in
parallel (as in the CI), the bad outcome of the race did not occur.

The fix is to check that the finalizer is still set when adding the
allocation. The apiserver doesn't check that because it doesn't know which
finalizer goes with the allocation result. It could check for "some finalizer",
but that is not guaranteed to be correct (could be some unrelated one).

Checking the finalizer can only be done with a JSON patch. Despite the
complications, having the ability to add multiple pods concurrently to
ReservedFor seems worth it (avoids expensive rescheduling or a local retry
loop).

The resource claim controller doesn't need this: it can do a normal update,
which implicitly checks ResourceVersion.
2024-06-27 15:03:06 +02:00
defaultbinder Change the scheduler plugins PluginFactory function to use context parameter to pass logger 2023-09-20 17:49:54 +08:00
defaultpreemption Test that the DisruptionTarget condition is added at preemption 2024-06-17 16:59:52 +00:00
dynamicresources DRA: fix scheduler/resource claim controller race 2024-06-27 15:03:06 +02:00
examples Migrated pkg/scheduler/framework/plugins/examples/ to use contextual logging 2023-10-09 11:43:17 +08:00
feature schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
helper kube-scheduler: add taints filtering logic consistent with TaintToleration plugin for PodTopologySpread plugin 2022-09-10 09:04:30 +08:00
imagelocality Consider initContainer images in pod scheduling 2024-02-19 14:17:57 +08:00
interpodaffinity fix: node added with matched pod anti-affinity topologyKey 2024-04-12 11:08:44 +08:00
names Remove deprecated selectorSpread 2023-08-28 22:11:33 +08:00
nodeaffinity feature(NodeAffinity): return Skip in PreScore when nothing to do in Score 2023-12-18 12:00:10 +00:00
nodename Change the scheduler plugins PluginFactory function to use context parameter to pass logger 2023-09-20 17:49:54 +08:00
nodeports Optimize klog output 2024-03-26 18:53:29 +08:00
noderesources schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
nodeunschedulable scheduler/NodeUnschedulable: reduce pod scheduling latency 2023-12-16 20:50:11 +08:00
nodevolumelimits QueueingHint for CSILimit when deleting pods (#121508) 2024-05-14 11:07:11 -07:00
podtopologyspread register Node/UpdateNodeTaint event to plugins which has Node/Add only, doesn't have Node/UpdateNodeTaint 2024-03-16 14:13:06 +00:00
queuesort Change the scheduler plugins PluginFactory function to use context parameter to pass logger 2023-09-20 17:49:54 +08:00
schedulinggates schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
tainttoleration schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
testing Change the scheduler plugins PluginFactory function to use context parameter to pass logger 2023-09-20 17:49:54 +08:00
volumebinding CephRBD volume plugin ( ) and its csi migration support were removed in this release 2024-05-09 22:55:34 +08:00
volumerestrictions Implement QueueingHintFn for pod deleted event 2024-06-17 22:42:04 +08:00
volumezone register Node/UpdateNodeTaint event to plugins which has Node/Add only, doesn't have Node/UpdateNodeTaint 2024-03-16 14:13:06 +00:00
README.md scheduler/framework/plugins: delete moved docs 2021-02-16 13:26:27 +00:00
registry.go schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00

Scheduler Framework Plugins

Moved here.