The status can be used by (Pre)Filter plugins to indicate that
preemption wouldn't change the decision of the filter.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
- Rename 'topologyPairsPodSpreadMap' to 'podSpreadCache'
- New struct `criticalPaths criticalPaths`
- Add unified method `*criticalPaths.update()` for:
- regular update
- addPod in preemption case
- remotePod in preemption case
Filter() is called simultaneously, so the member of its (fake) implementation
cannot be written without lock.
The issue can be triggered by:
go test k8s.io/kubernetes/pkg/scheduler/core --race --count=50
Make the cache implement NodeLister and expose it to the priority
functions. This way, the priority functions make use of a single cache,
the scheduler's, instead of mixing it with the lister's caches.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
update bazel build
fix get plugin config method
initialize only needed plugins
fix unit test
fix import duplicate package
update bazel
add docstrings
add weight field to plugin
add plugin to v1alpha1
add plugins at appropriate extension points
remove todo statement
fix import package file path
set plugin json schema
add plugin unit test to option
initial plugin in test integration
initialize only needed plugins
update bazel
rename func
change plugins needed logic
remove v1 alias
change the comment
fix alias shorter
remove blank line
change docstrings
fix map bool to struct
add some docstrings
add unreserve plugin
fix docstrings
move variable inside the for loop
make if else statement cleaner
remove plugin config from reserve plugin unit test
add plugin config and reduce unnecessary options for unit test
update bazel
fix race condition
fix permit plugin integration
change plugins to be pointer
change weight to int32
fix package alias
initial queue sort plugin
rename unreserve plugin
redesign plugin struct
update docstrings
check queue sort plugin amount
fix error message
fix condition
change plugin struct
add disabled plugin for unit test
fix docstrings
handle nil plugin set
This moves the priority types from the algorithm package
to priorities package.
Idea is to move the type to the packages where it is
implemented. This will ease the future refactor process.
This moves the type `ScheduleAlgorithm` from `pkg/scheduler/algorithm`
to `pkg/scheduler/core`. The reason for this move is to fix our import
dependency graph and allow predicate & priority types to be moved into
their appropriate packages.
The new location makes sense because `core` is the only package that
exports an implementation of this type.
With the alpha scheduling queue we move pods from unschedulable to
active on certain events without a backoff. As a result we can cause
starvation issues if high priority pods are in the unschedulable queue.
Implement a backoff mechanism for pods being moved to active.
Closes#56721
- don't update nominatedMap cache when Pop() an element from activeQ
- instead, delete the nominated info from cache when it's "assumed"
- unit test behavior adjusted
- expose SchedulingQueue in factory.Config
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
Loop over priorityConfigs seperately. The node loop can only safely
modify result[i][index]. Before this change it sometimes modified
result[i] concurrently with other loops.
Fixes: 7164967662
==================== Test output for //pkg/scheduler/core:go_default_test:
==================
WARNING: DATA RACE
Read at 0x00c0005e8ed0 by goroutine 22:
k8s.io/kubernetes/pkg/scheduler/core.PrioritizeNodes.func2()
pkg/scheduler/core/generic_scheduler.go:667 +0x2ea
k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue.ParallelizeUntil.func1()
staging/src/k8s.io/client-go/util/workqueue/parallelizer.go:65 +0x9e
Previous write at 0x00c0005e8ed0 by goroutine 21:
k8s.io/kubernetes/pkg/scheduler/core.PrioritizeNodes.func2()
pkg/scheduler/core/generic_scheduler.go:668 +0x450
k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue.ParallelizeUntil.func1()
staging/src/k8s.io/client-go/util/workqueue/parallelizer.go:65 +0x9e
Goroutine 22 (running) created at:
k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue.ParallelizeUntil()
staging/src/k8s.io/client-go/util/workqueue/parallelizer.go:57 +0x1a3
k8s.io/kubernetes/pkg/scheduler/core.PrioritizeNodes()
pkg/scheduler/core/generic_scheduler.go:682 +0x592
k8s.io/kubernetes/pkg/scheduler/core.(*genericScheduler).Schedule()
pkg/scheduler/core/generic_scheduler.go:186 +0x77d
k8s.io/kubernetes/pkg/scheduler/core.TestGenericScheduler.func1()
pkg/scheduler/core/generic_scheduler_test.go:464 +0x91f
testing.tRunner()
GOROOT/src/testing/testing.go:827 +0x162
Goroutine 21 (running) created at:
k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue.ParallelizeUntil()
staging/src/k8s.io/client-go/util/workqueue/parallelizer.go:57 +0x1a3
k8s.io/kubernetes/pkg/scheduler/core.PrioritizeNodes()
pkg/scheduler/core/generic_scheduler.go:682 +0x592
k8s.io/kubernetes/pkg/scheduler/core.(*genericScheduler).Schedule()
pkg/scheduler/core/generic_scheduler.go:186 +0x77d
k8s.io/kubernetes/pkg/scheduler/core.TestGenericScheduler.func1()
pkg/scheduler/core/generic_scheduler_test.go:464 +0x91f
testing.tRunner()
GOROOT/src/testing/testing.go:827 +0x162
==================
--- FAIL: TestGenericScheduler (0.01s)
--- FAIL: TestGenericScheduler/test_6 (0.00s)
testing.go:771: race detected during execution of test
testing.go:771: race detected during execution of test
FAIL
- snapshot equivalence cache generation numbers before snapshotting the
scheduler cache
- skip update when generation does not match live generation
- keep the node and increment its generation to invalidate it instead of
deletion
- use predicates order ID as key to improve performance
Automatic merge from submit-queue (batch tested with PRs 67555, 68196). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Not split nodes when searching for nodes but doing it all at once
**What this PR does / why we need it**:
Not split nodes when searching for nodes but doing it all at once.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
@bsalamat
This is a follow up PR of #66733.
https://github.com/kubernetes/kubernetes/pull/66733#discussion_r205932531
**Release note**:
```release-note
Not split nodes when searching for nodes but doing it all at once.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Complement unit test case TestNodesWherePreemptionMightHelp for scheduler/core
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 66862, 67618). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use sync.map to scale equiv class cache better
**What this PR does / why we need it**:
Change the current lock in first level ecache into `sync.Map`, which is known for scaling better than `sync. Mutex ` on machines with >8 CPUs
ref: https://golang.org/pkg/sync/#Map
And the code is much cleaner in this way.
5k Nodes, 10k Pods benchmark with ecache enabled in 64 cores VM:
```bash
// before
BenchmarkScheduling/5000Nodes/0Pods-64 10000 17550089 ns/op
// after
BenchmarkScheduling/5000Nodes/0Pods-64 10000 16975098 ns/op
```
Comparing to current implementation, the improvement after this change is noticeable, and the test is stable in 8, 16, 64 cores VM.
**Special notes for your reviewer**:
**Release note**:
```release-note
Use sync.map to scale ecache better
```
This adds counters to equiv. cache reads & writes. Reads are labeled by
hit/miss, while writes are labeled to indicate whether the write was
discarded.
This will give us visibility into,
- hit rate of cache reads
- ratio of reads to writes
- rate of discarded writes
Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add space for output
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66291, 66471, 66499). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve unit test TestZeroRequest
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66468
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Extender preemption should respect IsInterested()
**What this PR does / why we need it**:
Extender preemption should respect IsInterested()
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66289
**Special notes for your reviewer**:
The bug is reported and the first commit is co-authored by: @chenchun
**Release note**:
```release-note
Extender preemption should respect IsInterested()
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve scheduler's performance by eliminating sorting of nodes by their score
**What this PR does / why we need it**:
Profiling scheduler, I noticed that scheduler spends a significant amount of time in sorting the nodes after we score them to find nodes with the highest score. Finding nodes with the highest score does not need sorting the array. This PR replaces the sort with a linear scan.
Eliminating the sort results in over 10% improvement in throughput of the scheduler.
Before (3 runs for 5000 nodes, scheduling 1000 pods in a cluster running 2000 pods):
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 20682552 ns/op
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 20464729 ns/op
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 21188906 ns/op
After:
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 18485866 ns/op
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 18457749 ns/op
BenchmarkScheduling/5000Nodes/2000Pods-12 1000 18418200 ns/op
**Release note**:
```release-note
Improve scheduler's performance by eliminating sorting of nodes by their score.
```
Automatic merge from submit-queue (batch tested with PRs 65388, 64995). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add more conditions to the list of predicate failures that won't be resolved by preemption
**What this PR does / why we need it**:
Adds more conditions to the list of predicate failures that won't be resolved by preemption. This change can potentially improve performance of preemption by avoiding the nodes that won't be able to schedule the pending pod no matter how many other pods are removed from them.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Add more conditions to the list of predicate failures that won't be resolved by preemption.
```
/sig scheduling
This moves the equivalence cache implementation out of the 'core'
package and into k8s.io/kubernetes/pkg/scheduler/core/equivalence.
Separating the equiv. cache from the genericScheduler implementation
make their interaction points easier to follow, and prevents us from
accidentally accessing unexported fields.
Automatic merge from submit-queue (batch tested with PRs 64142, 64426, 62910, 63942, 64548). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
scheduler: further cleanup of equivalence cache
**What this PR does / why we need it**:
This improves comments and simplifies some names/logic in equivalence_cache.go, as well as changing the order of some items in the file.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/kind cleanup
Automatic merge from submit-queue (batch tested with PRs 64252, 64307, 64163, 64378, 64179). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove unused parameter (pod) in `pkg/scheduler/core/generic_scheduler`
**What this PR does / why we need it**:
Remove unused parameter (pod) in `pkg/scheduler/core/generic_scheduler`
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
This makes the lookup behave like a normal map lookup, so it is easier
for readers to follow the logic. It also inverts the "invalid" bool to
an "ok" bool because `!invalid` is a double negative.
Automatic merge from submit-queue (batch tested with PRs 63434, 64172, 63975, 64180, 63755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Optimize the lock which in the RunPredicate
**What this PR does / why we need it**:
Enhance the performance of scheduler
- Change the lock in the RunPredicate from lock to rlock
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Could solve part of #63784
**Special notes for your reviewer**:
_Run benchmark test by scheduler_perf_:
`Before` BenchmarkScheduling/1000Nodes/0Pods-32 1000 11689758 ns/op
`After` BenchmarkScheduling/1000Nodes/0Pods-32 1000 5951510 ns/op
_Run integration (density) test by scheduler_perf_:
Schedule 3000 Pods On 3000 Nodes
`Before` rate 19 per second on average
`After` rate 58 per second on average
_Cpu profile test result_:
`Before` [click](https://cdn.rawgit.com/godliness/files/master/63784_before.svg)
`After` [click](https://cdn.rawgit.com/godliness/files/master/63784_after.svg)
**Release note**:
```release-note
`None`
```
/sig scheduling
/cc @misterikkit
/cc @bsalamat
/cc @ravisantoshgudimetla
/cc @resouer
Automatic merge from submit-queue (batch tested with PRs 64174, 64187, 64216, 63265, 64223). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Do not use DeepEqual to compare slices in test.
This wraps DeepEqual with a helper that considers nil slices and empty
slices to be equal.
Scheduler code might use a nil slice or empty slice to represent an
empty list, so tests should not be sensitive to the difference. Tests
could fail because DeepEqual considers nil to be different from an empty
slice.
**What this PR does / why we need it**:
Avoid breaking tests in cases where application behavior is not changed.
**Special notes for your reviewer**:
This brittle test keeps breaking in a number of my PRs. Hoping to get this fix merged independently.
**Release note**:
```release-note
NONE
```
/sig scheduling
/kind cleanup
This wraps DeepEqual with a helper that considers nil slices and empty
slices to be equal.
Scheduler code might use a nil slice or empty slice to represent an
empty list, so tests should not be sensitive to the difference. Tests
could fail because DeepEqual considers nil to be different from an empty
slice.
Automatic merge from submit-queue (batch tested with PRs 63283, 64032, 64159, 64126, 64098). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove unused code of (pkg/scheduler)
**What this PR does / why we need it**:
/kind cleanup
remove unused code
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63598, 63913, 63459, 63963, 60464). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Check nodeInfo before ecache predicate
**What this PR does / why we need it**:
There's chances during test when nodeInfo is nil which may cause ecache predicate fail with nil pointer.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63427
**Special notes for your reviewer**:
Not sure how to reproduce the original issue yet. i.e. why and when `nodeInfo` will become nil in tests is not clear to me, that's why I label it as WIP.
cc @bsalamat who may have more inputs.
**Release note**:
```release-note
NONE
```
This changes two methods in EquivalenceCache to be unexported, because
they should no longer be called by users of this type. (Even users in
the same package!)
The purpose of this map is to combine two predicate results before
writing to the equivalence cache. However, the branch that combines
results is unreachable.
1. Combining results happens in the second iteration of the outer loop.
2. There is only a second iteration when podsAdded is true.
3. We skip equiv. cache when podsAdded is true.
This method combines "lookup" and "update" into one operation. The
benefit is that this method call is very similar to running an ordinary
predicate, so callers can simplify their code.
Automatic merge from submit-queue (batch tested with PRs 62937, 63105, 63031, 63174). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "Revert "Revert revert of equivalence class hash calculation i…
…n scheduler""
This reverts commit 4386751b5d.
**What this PR does / why we need it**:
This re-introduces the change from https://github.com/kubernetes/kubernetes/pull/58555 which changes how the scheduler computes equivalence classes of pods. I believe we have fixed the flakiness observed previously (https://github.com/kubernetes/kubernetes/issues/61512, https://github.com/kubernetes/kubernetes/issues/62921). I have run the test in question a few dozen times without a failure.
```bash
make test-integration WHAT="./test/integration/scheduler" KUBE_TEST_ARGS="-run TestPreemptionStarvation" GOFLAGS="-v"
```
/ref https://github.com/kubernetes/kubernetes/issues/58222
**Special notes for your reviewer**:
I had to resolve several merge conflicts. I think I resolved them correctly, but keep an eye out for anything silly.
**Release note**:
```release-note
NONE
```
/sig scheduling
Because the scheduler takes a snapshot of cache data at the start of
each scheduling cycle, updates to the equivalence cache should be
skipped if there was a cache update during the cycle.
If the current NodeInfo becomes stale while we evaluate predicates, we
will not write any results into the equivalence cache. We will still use
the results for the current scheduling cycle, though.
Automatic merge from submit-queue (batch tested with PRs 60919, 60953, 61085, 61083, 60971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Sched cache resync
**What this PR does / why we need it**: Scheduler cache comparer
A debug tool that collects resources from api server and compares it
with the scheduler cache. It currently only compares the node list, but
it should be easy to extend. The compare is triggered by signal USER2,
by doing
kill -12 ${SCHED_PID}
The compare result goes to scheduler log.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Towards #60860
**Special notes for your reviewer**: @bsalamat
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 60898, 60912, 60753, 61002, 60796). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change to fix scheduler extender error return message
**What this PR does / why we need it**:
As of now, scheduler always logs extender endpoint without verb like "filter", "prioritize" etc. With this change, we are including the verb as well while logging which helps in debugging
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 60898, 60912, 60753, 61002, 60796). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert revert of equivalence class hash calculation in scheduler
**What this PR does / why we need it**:
NOTE: This is a revert revert of https://github.com/kubernetes/kubernetes/pull/58555
But since the original PR has been changed, I have to copy the original changes and resend this new PR. See: https://github.com/kubernetes/kubernetes/pull/58555#issuecomment-364345972
And I kept @misterikkit 's change as the first commit (by co-author feature of github) in the history.
We decide to do revert revert because #58989 has been fixed, which should help to improve the time consumed by integration test.
**But** we should still pay attention to integration tests to see if there's frequent timeout happen.
**Special notes for your reviewer**:
**Release note**:
```release-note
Improve equivalence class hash calculation in scheduler
```
Automatic merge from submit-queue (batch tested with PRs 60759, 60531, 60923, 60851, 58717). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implement preemption for extender with a verb and new interface
**What this PR does / why we need it**:
This is an alternative way of implementing #51656
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#51656
**Special notes for your reviewer**:
We will also want to compare with #56296 to see which one is the best solution. See: https://github.com/kubernetes/kubernetes/pull/56296#discussion_r163381235
cc @ravigadde @bsalamat
**Release note**:
```release-note
Implement preemption for extender with a verb and new interface
```
Automatic merge from submit-queue (batch tested with PRs 59740, 59728, 60080, 60086, 58714). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
more concise to merge the slice
**What this PR does / why we need it**:
more concise to merge the slice
**Special notes for your reviewer**:
Pods in scheduler cache contains both the scheduled pods and those not
scheduled yet in scheduling queue. This commit adds the second group of
pods into consideration while comparing the cache.
UID uniquely identifies pods across lifecycles, while namespace/name
could be 2 different pods across lifecycles. This could result in
tricky scheduler bugs.
Fixes#60966
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.
This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.
Add benchmark for equivalence hashing.
Change equivalence hash function.
This changes the equivalence class hashing function to use as inputs all
the Pod fields which are read by FitPredicates. Before we used a
combination of OwnerReference and PersistentVolumeClaim info, which was
a close approximation. The new method ensures that hashing remains
correct regardless of controller behavior.
The PVCSet field can be removed from equivalencePod because it is
implicitly included in the Volume list.
Tests are now broken.
Move equivalence class hash code.
This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.
In the process, making the hashing function and hashing function factory
both injectable dependencies is removed.
Fix equivalence cache hash tests.
Co-authored-by: Jonathan Basseri <misterikkit@google.com>
Co-authored-by: Harry Zhang <resouer@gmail.com>
Add verb and preemption for scheduler extender
Update bazel
Use simple preemption in extender
Use node name instead of v1.Node
Fix support method
Fix preemption dup
Remove uneeded logics
Remove nodeInfo from param to extender
Update bazel for scheduler types
Mock extender cache with nodeInfo
Add nodeInfo as extender cache
Choose node name or node based on cache flag
Always return meta victims in result
Automatic merge from submit-queue (batch tested with PRs 60106, 59510, 60263, 60063, 59088). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reuse the `min*Nodes` slices in order to save GC time
**What this PR does / why we need it**:
Reuse the `min*Nodes` slices to save GC time when executing `pickOneNodeForPreemption`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59748
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve scheduling queue's logic
**What this PR does / why we need it**:
Improves scheduling queue's code based on some recent comments on [the original PR](https://github.com/kubernetes/kubernetes/pull/55109).
This PR does not fix any bugs or make any change of behavior.
**Release note**:
```release-note
NONE
```
/sig scheduling
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Format some import statements in scheduler pkg
**What this PR does / why we need it**:
As the title says, apply `goimports` on some files under `pkg/scheduler` pkg.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName
**What this PR does / why we need it**:
Our investigations show that there is a performance regression in the new scheduling queue which is not enabled by default and is enabled only if "priority and preemption" which is an alpha feature is enabled. This PR is an important performance improvement for those who want to use priority and preemption in larger clusters.
The PR adds a hash table to track nominated Pods so that finding such Pods will be faster.
Other than improving performance, we don't expect this PR to change behavior of scheduler.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
ref/ #56032
ref/ #57471
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/sig scheduling
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Avoid race condition when updating equivalence cache
**What this PR does / why we need it**:
Lock the ecache to update the ecache on each predicate running, to avoid race condition.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fix#58507
**Special notes for your reviewer**:
None
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Replace nominateNodeName annotation with PodStatus.NominatedNodeName
**What this PR does / why we need it**:
Replaces nominateNodeName annotation with PodStatus.NominatedNodeName in scheudler's logic. We don't expect any logic/behavior changes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
ref #57471
/sig scheduling
cc: @k82cn @aveshagarwal @resouer
Automatic merge from submit-queue (batch tested with PRs 59394, 58769, 59423, 59363, 59245). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure euqiv hash calculation is per schedule
**What this PR does / why we need it**:
Currently, equiv hash is calculated per schedule, but also, per node. This is a potential cause of dragging integration test, see #58881
We should ensure this only happens once during scheduling of specific pod no matter how many nodes we have.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58989
**Special notes for your reviewer**:
**Release note**:
```release-note
Ensure euqiv hash calculation is per schedule
```
This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.
In the process, making the hashing function and hashing function factory
both injectable dependencies is removed.