605 Commits

Author SHA1 Message Date
Draven
79f05c65a0 fix: use ">" instead of ">=" in resource allocation 2020-09-30 11:06:17 +08:00
Arghya Sadhu
8154dc95b5 wrap errors in selectorspread and podtoplogyspread plugin 2020-09-28 21:28:54 +05:30
Arghya Sadhu
ad415b9f54 wrap errors in service affinity plugin 2020-09-27 11:28:31 +05:30
Arghya Sadhu
ff3c751afc wrap errors in taint-toleration plugin 2020-09-26 21:16:41 +05:30
SataQiu
8c51c9955c wrap errors from DefaultPreemption, ImageLocality and NodeAffinity plugins 2020-09-25 10:38:27 +08:00
Kubernetes Prow Robot
44cd4fcedc
Merge pull request #95001 from arghya88/deprecate-scheduler-metrics
deprecate scheduler metrics
2020-09-24 13:57:48 -07:00
Kubernetes Prow Robot
48cb3b2b4f
Merge pull request #94850 from SataQiu/structed-log-20200917
Using structured logging in scheduler framework runtime
2020-09-24 09:24:06 -07:00
SataQiu
47c58c3785 using structured logging in scheduler framework runtime 2020-09-24 21:54:31 +08:00
Arghya Sadhu
078b355da3 deprecate scheduler metrics BindingLatency and SchedulingAlgorithmPreemptionEvaluationDuration 2020-09-23 17:15:13 +05:30
Arghya Sadhu
c62f0dd165 removing deprecated scheduler metrics 2020-09-22 21:04:15 +05:30
Kubernetes Prow Robot
965137a992
Merge pull request #94692 from alculquicondor/wrap_errors_min
Wrap errors from VolumeBinding and DefaultBinder plugins
2020-09-15 18:27:34 -07:00
Wei Huang
185ba08fcd
Move podPassesBasicChecks() to VolumeBinding plugin 2020-09-11 13:54:02 -07:00
Aldo Culquicondor
7fb40fc03c Wrap errors on VolumeBinding plugin
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I23053528ac6857124fddd7f9fa26e122202ff4bd
Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-09-10 16:22:16 -04:00
Aldo Culquicondor
94985e28ac Wrap errors on DefaultBinder plugin
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I2e3c8aa2c1a2a5102e9110b6cff91d66a79a90f1
2020-09-10 16:22:16 -04:00
Aldo Culquicondor
aefcfcc627 Wrap errors when running Bind plugins
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I29f8d3ea219a5cf667cf718545e8dfff971ca6ec
2020-09-10 16:22:15 -04:00
Aldo Culquicondor
a482d7ef8e Wrap errors when running PreBind plugins
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I31bf35d7e96b1cebb285cf03ffad310d83224d9c
Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-09-10 16:22:12 -04:00
Aldo Culquicondor
34102140e2 Hold error in framework's Status
To allow obtaining and comparing the original error.

Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: Ibcef89f7b876a273ecc24548f8d204966e0e6059
2020-09-10 10:52:36 -04:00
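The idea behind "Hold error in framework's Status" (keeping the original error so callers can obtain and compare it) can be illustrated with a minimal Go sketch. The `Status`, `AsStatus`, `AsError`, `bind`, and `isNotFound` names below are hypothetical simplifications for illustration, not the scheduler framework's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a hypothetical stand-in for a scheduler status code.
type Code int

const (
	Success Code = iota
	Error
)

// Status is a simplified, hypothetical status type that keeps the original
// error value instead of only its message.
type Status struct {
	code Code
	err  error
}

// AsStatus wraps err in a Status with the Error code.
func AsStatus(err error) *Status {
	return &Status{code: Error, err: err}
}

// AsError returns the stored error, or nil for a nil or successful Status.
func (s *Status) AsError() error {
	if s == nil || s.code == Success {
		return nil
	}
	return s.err
}

// errNotFound is an illustrative sentinel error.
var errNotFound = errors.New("pod not found")

// bind simulates a plugin call that fails and wraps the underlying error
// with %w so the chain stays inspectable.
func bind(pod string) *Status {
	return AsStatus(fmt.Errorf("binding pod %q: %w", pod, errNotFound))
}

// isNotFound reports whether the status carries errNotFound anywhere in
// its error chain.
func isNotFound(s *Status) bool {
	return errors.Is(s.AsError(), errNotFound)
}

func main() {
	st := bind("web-0")
	fmt.Println(st.AsError())
	fmt.Println(isNotFound(st))
}
```

Because the error is stored rather than flattened to a string, `errors.Is` can still match the sentinel after several layers of wrapping.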
Wei Huang
4f7ae54f3e
fixup: add podLister as a member field of DefaultPreemption 2020-09-08 18:27:23 -07:00
Wei Huang
52bf6ba8ba
Preemption plugin to fetch pod from informer cache 2020-09-08 18:27:23 -07:00
Kubernetes Prow Robot
44ecd80cf1
Merge pull request #91557 from chendave/cleanup
cleanup: remove useless methods
2020-09-08 06:27:44 -07:00
Kubernetes Prow Robot
d239cdfbc0
Merge pull request #94059 from ahg-g/ahg-anti-affinity
Track pods with required anti-affinity
2020-09-08 04:54:12 -07:00
Kubernetes Prow Robot
b837699f74
Merge pull request #94125 from soulxu/only_includes_all_nodes_for_preferred
Only process all nodes when incoming pod has no preferred affinity
2020-09-07 14:01:58 -07:00
Kubernetes Prow Robot
217d89a59f
Merge pull request #93843 from soulxu/fast_path_podaffinity
Fast return when no any matched anti-affinity terms
2020-09-04 03:31:55 -07:00
Yuan Chen
f6f9bf3e76 Update comments in pkg/scheduler/framework/v1alpha1/interface.go 2020-08-31 15:51:42 -07:00
Kubernetes Prow Robot
2382628f16
Merge pull request #93246 from lixiaobing1/lxb-remove-algorithm
remove some notes about scheduler/algorithm
2020-08-28 06:36:18 -07:00
Kubernetes Prow Robot
72350a3cf7
Merge pull request #93083 from Huang-Wei/dupe-import-alias
Remove duplicate path imports
2020-08-28 04:34:53 -07:00
Kubernetes Prow Robot
27c9bd4fd4
Merge pull request #92939 from yuanchen8911/patch-1
Fix an error in PreBindPlugin comment
2020-08-27 19:08:04 -07:00
Kubernetes Prow Robot
a4103dfeaf
Merge pull request #92819 from chendave/cleanup_machine
Change the node name from "machine" to "node"
2020-08-27 19:06:57 -07:00
Kubernetes Prow Robot
9062c43b76
Merge pull request #94211 from soulxu/cleanup_candidates
Initialize candidate directly instead of iterating the array of candidates
2020-08-27 16:08:06 -07:00
Kubernetes Prow Robot
a5160414e0
Merge pull request #93706 from SimpCosm/fix/scheduler-plugin-comment
Fix an error in NodeUnschedulable plugin comment
2020-08-27 16:06:48 -07:00
Kubernetes Prow Robot
19064f4738
Merge pull request #93669 from Mr-Linus/patch-1
Remove unnecessary conversion
2020-08-27 16:06:25 -07:00
He Jie Xu
0e8bd4c550 Initialize candidate directly instead of iterating the array of candidates
Use the existing victimsMap to get the victims, which makes it easy to build
each candidate directly.
2020-08-27 23:29:36 +08:00
He Jie Xu
ccd8eb3b1b Only process all nodes when incoming pod has no preferred affinity
Currently, the interpodaffinity plugin processes all nodes whenever the incoming
pod has affinity. It actually only needs all nodes when the incoming pod
has preferred affinity. Restricting it to that case reduces the number of nodes
that need to be processed.
2020-08-25 16:13:17 +08:00
Linus Lee 李俊江
e55856048d Remove unnecessary conversion
Update framework.go

Update framework.go

Update framework.go

remove unnecessary conversion

remove unnecessary conversion

remove unnecessary conversion

remove unnecessary conversion
2020-08-25 10:07:11 +08:00
Kubernetes Prow Robot
1c93be24ee
Merge pull request #93629 from cofyc/93009
fix flaky TestVolumeBinding unit test
2020-08-21 08:33:40 -07:00
Abdullah Gharaibeh
a8873e1a43 Track pods with required anti-affinity
This is a performance optimization that reduces the overhead of inter-pod affinity PreFilter calculations. It basically
eliminates that overhead when no pods in the cluster use required pod anti-affinity. This offered a 20% improvement on 5k clusters for preferred anti-affinity benchmarks.
2020-08-21 10:09:21 -04:00
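The optimization described in "Track pods with required anti-affinity" can be sketched roughly as follows: each node keeps a separate list of the pods that declare required anti-affinity, so a PreFilter-style pass only visits those pods. The `Pod`, `NodeInfo`, and `podsToCheck` names are assumptions made for this sketch and do not mirror the real scheduler types.

```go
package main

import "fmt"

// Pod is a hypothetical, minimal pod description.
type Pod struct {
	Name                    string
	HasRequiredAntiAffinity bool
}

// NodeInfo keeps all pods on a node plus a second list that only holds
// pods with required anti-affinity.
type NodeInfo struct {
	Pods                         []Pod
	PodsWithRequiredAntiAffinity []Pod
}

// AddPod records the pod and, when relevant, also indexes it in the
// anti-affinity list.
func (n *NodeInfo) AddPod(p Pod) {
	n.Pods = append(n.Pods, p)
	if p.HasRequiredAntiAffinity {
		n.PodsWithRequiredAntiAffinity = append(n.PodsWithRequiredAntiAffinity, p)
	}
}

// podsToCheck counts how many existing pods an anti-affinity pass would
// have to examine when it iterates only over the tracked list.
func podsToCheck(nodes []*NodeInfo) int {
	total := 0
	for _, n := range nodes {
		total += len(n.PodsWithRequiredAntiAffinity)
	}
	return total
}

func main() {
	node := &NodeInfo{}
	node.AddPod(Pod{Name: "plain"})
	node.AddPod(Pod{Name: "exclusive", HasRequiredAntiAffinity: true})
	fmt.Println(len(node.Pods), podsToCheck([]*NodeInfo{node}))
}
```

When no pod in the cluster uses required anti-affinity, the tracked lists are empty and the pass has nothing to iterate over.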
Aldo Culquicondor
dfe9e413d9 Keep track of remaining pods when a node is deleted.
The apiserver is expected to send pod deletion events that might arrive at a different time. However, sometimes a node could be recreated without its pods being deleted.

Partial revert of https://github.com/kubernetes/kubernetes/pull/86964

Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I51f683e5f05689b711c81ebff34e7118b5337571
2020-08-13 14:24:01 -04:00
lixiaobing1
7920de5b57 remove some notes about scheduler/algorithm 2020-08-13 10:01:54 +08:00
He Jie Xu
75ccb90407 Fast return when no any matched anti-affinity terms
When checking the incoming pod's anti-affinity rules, there is a chance to
return early when there are no matching anti-affinity terms in the
whole cluster.
2020-08-10 14:53:10 +08:00
houmin
868dd41a96 Fix an error in NodeUnschedulable plugin comment 2020-08-10 11:20:23 +08:00
Mike Dame
012245c5b9 Add LabelSelector validation in Pod Affinity/AntiAffinity Filter and Score plugins
The lack of this validation on incoming pods causes unpredictable cluster outcomes
when later calculating affinity results against existing pods (see #92714). This fix
quickly addresses the main source where these problems should be caught.

It is unfortunately difficult to add this validation directly to the API server due
to the fact that it may break migrations with existing pods that fail this check. This
is a compromise to address the current issue.
2020-08-07 12:17:40 -04:00
Shingo Omura
ef1fab7642
expose Run[Pre]ScorePlugins functions in PluginRunner interface 2020-08-04 22:50:13 +09:00
Yecheng Fu
96d0408a89 fix TestVolumeBinding unit test 2020-08-03 07:06:06 +08:00
Abdullah Gharaibeh
5e81a2de98 Optimize VolumeRestrictions scheduler plugin 2020-07-22 23:00:01 -04:00
Wei Huang
bc04d73330
remove duplicate path import 2020-07-14 16:34:09 -07:00
Wei Huang
4e8ccf0187
Refactor and expose common preemption functions 2020-07-11 23:17:21 -07:00
Kubernetes Prow Robot
d06ff65943
Merge pull request #92876 from Huang-Wei/pdbLister
Add pdbLister as a member field of struct DefaultPreemption
2020-07-11 20:57:42 -07:00
Kubernetes Prow Robot
016c2f64de
Merge pull request #92840 from adtac/listers
selectorspread: access listers in plugin instantiation
2020-07-11 20:56:23 -07:00
Kubernetes Prow Robot
36b4c2942b
Merge pull request #92815 from Huang-Wei/bypass-prefilter-svcaffinity
Bypass PreFilter in ServiceAfffinity if AffinityLabels arg is not present
2020-07-10 15:43:11 -07:00
Kubernetes Prow Robot
fbc9cf0894
Merge pull request #92797 from ahg-g/ahg-prefilter
Return a FitError when PreFilter fails with unschedulable status
2020-07-10 15:42:31 -07:00
Kubernetes Prow Robot
0cb7e320a5
Merge pull request #92784 from pohly/generic-ephemeral-inline-volumes
generic ephemeral inline volumes
2020-07-10 15:41:46 -07:00
Dave Chen
a1b2a7765d Change the node name from "machine" to "node"
The latest change on master renamed the node name from "machine" to "node"
but did not update all the affected code, which made some test cases
invalid.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-07-10 10:17:58 +08:00
Patrick Ohly
ff3e5e06a7 GenericEphemeralVolume: initial implementation
The implementation consists of
- identifying all places where VolumeSource.PersistentVolumeClaim has
  a special meaning and then ensuring that the same code path is taken
  for an ephemeral volume, with the ownership check
- adding a controller that produces the PVCs for each embedded
  VolumeSource.EphemeralVolume
- relaxing the PVC protection controller such that it removes
  the finalizer already before the pod is deleted (only
  if the GenericEphemeralVolume feature is enabled): this is
  needed to break a cycle where foreground deletion of the pod
  blocks on removing the PVC, which waits for deletion of the pod

The controller was derived from the endpointslices controller.
2020-07-09 23:29:24 +02:00
Yuan Chen
57de07064f
Fix a typo in PreBindPlugin comment
"before a pod is being scheduled" -> "before a pod is bound"
2020-07-09 10:51:14 -07:00
Kubernetes Prow Robot
3a5e7ea986
Merge pull request #92752 from chendave/skip_preemption
Cut off the cost to run filter plugins when no victim pods are found
2020-07-09 09:10:10 -07:00
Kubernetes Prow Robot
70e09f2c24
Merge pull request #88842 from angao/fit-arg
add args for NodeResourcesFit plugin
2020-07-09 05:04:10 -07:00
Kubernetes Prow Robot
55d77ade67
Merge pull request #92489 from alculquicondor/sig-storage-ownership
Add SIG storage owner aliases
2020-07-09 00:05:20 -07:00
Kubernetes Prow Robot
94a08e159a
Merge pull request #92387 from pohly/csi-storage-capacity
CSI storage capacity check
2020-07-09 00:04:59 -07:00
Wei Huang
9d377eb655
Add pdbLister as a member field of struct DefaultPreemption 2020-07-07 12:25:53 -07:00
Adhityaa Chandrasekar
832a53acdb selectorspread: access listers in plugin instantiation 2020-07-07 14:45:28 +00:00
Aldo Culquicondor
27ec356d76 Add SIG storage owner aliases
And give ownership to pkg/scheduler/framework/plugins/volumebinding

Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I4bd89b1745a2be0e458601056ab905bdd6692195
2020-07-07 10:26:16 -04:00
Dave Chen
028af0970f Cut off the cost to run filter plugins when no victim pods are found
If no potential victims could be found, there is no need to evaluate the node
again, since its state didn't change.

It's safe to return and thus prevent scheduling from running the filter plugins
again.

NOTE:
A node that is filtered out by filter plugins could pass the filter plugins if
there is a change on that node, e.g. pod termination on that node.

Previously, this could be caught either by the normal `schedule` or by `preempt` (pods
are terminated while the preemption logic tries to find the nodes and re-evaluates
the filter plugins).

This shouldn't be handled by preemption, though. Since the `schedule` routine
is always running when the interval is "zero", letting `schedule`
take care of it relieves `preempt` of work that is unrelated to preemption.

For this reason, a couple of test cases, as well as the logic that checks for the existence
of victim pods, are removed, as that situation can no longer happen after the change.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-07-07 09:55:34 +08:00
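The early return described in "Cut off the cost to run filter plugins when no victim pods are found" might look roughly like the sketch below. `Pod`, `selectVictims`, and the `runFilters` callback are hypothetical names chosen for illustration; the real preemption code is considerably more involved.

```go
package main

import "fmt"

// Pod is a hypothetical, minimal pod description.
type Pod struct {
	Name     string
	Priority int
}

// selectVictims returns the lower-priority pods on a node that could be
// preempted. If there are none, it returns immediately without calling
// runFilters, since removing nothing cannot change the filter result.
func selectVictims(preemptor Pod, nodePods []Pod, runFilters func(remaining []Pod) bool) ([]Pod, bool) {
	var victims, remaining []Pod
	for _, p := range nodePods {
		if p.Priority < preemptor.Priority {
			victims = append(victims, p)
		} else {
			remaining = append(remaining, p)
		}
	}
	if len(victims) == 0 {
		// No potential victims: skip the (expensive) filter plugins.
		return nil, false
	}
	if !runFilters(remaining) {
		return nil, false
	}
	return victims, true
}

func main() {
	calls := 0
	filters := func(remaining []Pod) bool { calls++; return true }
	preemptor := Pod{Name: "high", Priority: 10}

	_, ok := selectVictims(preemptor, []Pod{{Name: "a", Priority: 10}}, filters)
	fmt.Println(ok, calls)

	victims, ok := selectVictims(preemptor, []Pod{{Name: "b", Priority: 1}}, filters)
	fmt.Println(len(victims), ok, calls)
}
```

The counter in `main` only exists to show that the filter callback is not invoked on the no-victims path.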
Abdullah Gharaibeh
c98dee4945 Return a FitError when PreFilter fails with unschedulable status 2020-07-06 15:02:07 -04:00
Patrick Ohly
0efbbe8555 CSIStorageCapacity: check for sufficient storage in volume binder
This uses the information provided by a CSI driver deployment for
checking whether a node has access to enough storage to create the
currently unbound volumes, if the CSI driver opts into that checking
with CSIDriver.Spec.VolumeCapacity != false.

This resolves a TODO from commit 95b530366a.
2020-07-06 19:20:10 +02:00
Wei Huang
07583bf95b
Bypass PreFilter in ServiceAfffinity if AffinityLabels arg is not present 2020-07-05 23:37:04 -07:00
Kubernetes Prow Robot
86096addb1
Merge pull request #92689 from chendave/fix_testcase
Fix the nits found in the testcases of `PodTopologySpread`
2020-07-03 20:31:26 -07:00
Kubernetes Prow Robot
19883b50f8
Merge pull request #92604 from soulxu/fix_preemption_with_nominated_node
The Pod is eligible to preempt when previous nominanted node is UnschedulableAndUnresolvable
2020-07-03 05:03:01 -07:00
Dave Chen
3e65fe4378 Change the exception to avoid the cost of preemption
A node whose labels don't contain the required topologyKeys in `Constraints`
cannot be resolved by preempting the pods on that node.

One use case that easily reproduces the issue is:
- set `alwaysCheckAllPredicates` to true.
- one node contains all the required topologyKeys but fails predicates
  such as 'taint'.
- another node doesn't hold all the required topologyKeys, and thus returns the `Unschedulable`
  status code.
- the scheduler will try to preempt the pods with lower priorities on the above node.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-07-03 10:17:31 +08:00
He Jie Xu
b3741f344e The Pod is eligible to preempt when previous nominanted node is UnschedulableAndUnresolvable
If the Pod's previous nominated node is UnschedulableAndUnresolvable from previous
filtering, it should be considered for preemption again.
2020-07-03 08:57:45 +08:00
Dave Chen
41fd19760e Fix the nits found in the testcases of PodTopologySpread
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-07-02 12:37:46 +08:00
Wei Huang
7362fccdd7
Polish unit tests of defaultpreemptio plugin 2020-06-30 14:05:48 -07:00
Kubernetes Prow Robot
784b0738b5
Merge pull request #92578 from zhouya0/fix_preemt_comment
Fix scheduler preemt function comment
2020-06-29 18:35:27 -07:00
Kubernetes Prow Robot
281023790f
Merge pull request #92501 from rakeshreddybandi/rename-plugin
Rename DefaultPodTopologySpread plugin #91994
2020-06-29 18:34:58 -07:00
zhouya0
59f9a7d81e Fix preemt function comment 2020-06-28 18:29:55 +08:00
Kubernetes Prow Robot
4fc5c1eda2
Merge pull request #92391 from adtac/adtac/reserve-failure
scheduler: run Unreserve if Reserve fails
2020-06-27 16:04:14 -07:00
RAKESH REDDY BANDI
d44a20f9ca Rename DefaultPodTopologySpread plugin #91994 2020-06-27 13:46:31 -04:00
Kubernetes Prow Robot
ad29e168dc
Merge pull request #92108 from Huang-Wei/postfilter-impl-4
[postfilter-impl-4] Move Preempt() to defaultpreemption package.
2020-06-27 09:02:15 -07:00
Adhityaa Chandrasekar
1b223b861a scheduler: run Unreserve if Reserve fails
If a reserve plugin's Reserve method returns an error, there could be
previously allocated resources from successfully completed reserve
plugins that must be unallocated by the corresponding Unreserve
operation. Since Unreserve operations are idempotent, this patch runs
the Unreserve operation of ALL reserve plugins when a Reserve operation
fails.
2020-06-26 20:41:33 +00:00
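The behavior described in "scheduler: run Unreserve if Reserve fails" can be sketched as follows: when any Reserve call fails, Unreserve runs on all reserve plugins, relying on Unreserve being idempotent. The `ReservePlugin` interface here is a simplified assumption with string pod names, and `fakePlugin` and `runReserve` are hypothetical helpers, not the framework's actual signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// ReservePlugin is a simplified, hypothetical interface that requires both
// Reserve and Unreserve.
type ReservePlugin interface {
	Name() string
	Reserve(pod string) error
	Unreserve(pod string)
}

// fakePlugin tracks reservations in a map and can be told to fail.
type fakePlugin struct {
	name     string
	fail     bool
	reserved map[string]bool
}

func (p *fakePlugin) Name() string { return p.name }

func (p *fakePlugin) Reserve(pod string) error {
	if p.fail {
		return errors.New("no capacity")
	}
	p.reserved[pod] = true
	return nil
}

// Unreserve is idempotent: deleting a missing key is a no-op.
func (p *fakePlugin) Unreserve(pod string) { delete(p.reserved, pod) }

// runReserve calls Reserve on each plugin in order. On the first failure it
// calls Unreserve on ALL plugins, including ones that never reserved.
func runReserve(plugins []ReservePlugin, pod string) error {
	for _, pl := range plugins {
		if err := pl.Reserve(pod); err != nil {
			for _, u := range plugins {
				u.Unreserve(pod)
			}
			return fmt.Errorf("running Reserve plugin %q: %w", pl.Name(), err)
		}
	}
	return nil
}

func main() {
	a := &fakePlugin{name: "a", reserved: map[string]bool{}}
	b := &fakePlugin{name: "b", fail: true, reserved: map[string]bool{}}
	err := runReserve([]ReservePlugin{a, b}, "web-0")
	fmt.Println(err)
	fmt.Println(a.reserved["web-0"])
}
```

Calling Unreserve on every plugin avoids having to track which plugins had already succeeded, at the cost of some redundant no-op calls.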
Wei Huang
058e3d4258
Move Preempt() and its related functions to defaultpreemption package
Refactor genericScheduler and signature of preemption funcs
  - remove podNominator from genericScheduler
  - simplify signature of preemption functions

Make Preempt() private
2020-06-25 12:33:51 -07:00
Adhityaa Chandrasekar
ec83143342 scheduler: merge Reserve and Unreserve plugins
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
2020-06-24 21:10:35 +00:00
Kubernetes Prow Robot
8adcd7978e
Merge pull request #92268 from alculquicondor/ext-point-profile
Add profile label to framework_extension_point_duration_seconds
2020-06-24 13:31:37 -07:00
Kubernetes Prow Robot
c6d2b223fb
Merge pull request #92222 from cofyc/fix92186
Share pod volume binding cache via framework.CycleState
2020-06-24 13:31:21 -07:00
Yecheng Fu
f899976b41 fixup 2020-06-24 14:14:03 +08:00
Aldo Culquicondor
698eda3079 Add profile label to scheduler extension point metrics
Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-06-23 15:30:22 -04:00
Yecheng Fu
22d874993c build files 2020-06-23 22:18:33 +08:00
Yecheng Fu
4627b419b4 tests only 2020-06-23 22:18:33 +08:00
Yecheng Fu
ee4d7410be Share pod volume binding cache via framework.CycleState 2020-06-23 22:18:33 +08:00
Dave Chen
e1d61b621a Scheduler: remove the misleading comments in NodeResourcesBalancedAllocation
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-06-23 17:33:02 +08:00
Wei Huang
d99cc01646
Register and enable defaultpreemption plugin
- Enable defaultpreemption as a PostFilter plugin
- Remove legacy hard-coded preemption logic
2020-06-22 17:22:27 -07:00
Ali Farah
a22e115a0e Split scheduler framework implementation into new runtime package 2020-06-22 00:23:43 +10:00
Kubernetes Prow Robot
5ed7b1afb8
Merge pull request #92012 from Huang-Wei/postfilter-impl-2
[postfilter-impl-2] Introduce a defaultpreemption PostFilter plugin
2020-06-19 21:51:42 -07:00
Kubernetes Prow Robot
9c3f648300
Merge pull request #91705 from mrkm4ntr/revert-assumed-in-unreserve
Revert assumed PVs and PVCs in unreserve extension point
2020-06-19 21:50:54 -07:00
Kubernetes Prow Robot
5968bc4653
Merge pull request #92247 from chendave/skiptopology
Skip `PreScore` when the `TopologySpreadConstraints` is specified
2020-06-19 11:37:44 -07:00
Wei Huang
196056d7fe
Introduce a defaultpreemption PostFilter plugin
- Add a defaultpreemption PostFilter plugin
- Make g.Preempt() stateless
    - make g.Preempt() stateless
    - make g.getLowerPriorityNominatedPods() stateless
    - make g.processPreemptionWithExtenders() stateless
2020-06-19 09:13:55 -07:00
Shintaro Murakami
79ab958996 Revert assumed PVs and PVCs in unreserve extension point 2020-06-19 17:39:42 +09:00
Dave Chen
068c69d743 Skip PreScore when the TopologySpreadConstraints is specified
`DefaultPodTopologySpread` needn't score when `TopologySpreadConstraints`
is specified.

`PreScore` needn't run in that case either, which cuts the cost of `PreScore` where
possible.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-06-18 18:01:56 +08:00
Dave Chen
9ebd872e71 Explicitly declare the interfaces for extension points
This makes it easier to catch issues at compile time, and it
aligns with other plugins, e.g. the "InterPodAffinity" plugin.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-06-17 15:11:44 +08:00
Dave Chen
8f0c329758 cleanup: update invalid comments in plugin of InterPodAffinity
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-06-16 14:11:59 +08:00
Yecheng Fu
814a6f2acd remove FakeVolumeBinderConfig and test new statues and states 2020-06-12 10:00:19 +08:00
Yecheng Fu
c4138361e4 Fail fast in PreFilter phase and return UnschedulableAndUnresolvable if immediate PVCs are not bound 2020-06-12 10:00:19 +08:00