Commit Graph

414 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
4cca6a89a0 Merge pull request #66862 from resouer/sync-map
Automatic merge from submit-queue (batch tested with PRs 66862, 67618). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Use sync.map to scale equiv class cache better

**What this PR does / why we need it**:

Change the current lock in the first level ecache into `sync.Map`, which is known to scale better than `sync.Mutex` on machines with >8 CPUs.

ref: https://golang.org/pkg/sync/#Map
 
And the code is much cleaner in this way.

5k Nodes, 10k Pods benchmark with ecache enabled on a 64-core VM:

```bash
// before
BenchmarkScheduling/5000Nodes/0Pods-64             10000          17550089 ns/op

// after
BenchmarkScheduling/5000Nodes/0Pods-64             10000          16975098 ns/op
```
Compared to the current implementation, the improvement after this change is noticeable (about 3.3% lower ns/op in the run above), and the test is stable on 8-, 16-, and 64-core VMs.
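
The change described above can be pictured with a minimal sketch, assuming a first level cache that maps node names to per-node caches. The names `firstLevel`, `nodeCache`, `getOrCreate`, and `remove` are hypothetical and are not taken from the PR; only `sync.Map` and its `Load`, `LoadOrStore`, and `Delete` methods come from the Go standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache is a hypothetical stand-in for the per-node cache.
type nodeCache struct {
	mu      sync.RWMutex
	results map[string]bool // predicate name -> cached fit result
}

// firstLevel maps node name -> *nodeCache. sync.Map replaces an
// explicit mutex around a plain map.
type firstLevel struct {
	nodes sync.Map
}

// getOrCreate returns the cache for a node, creating it on first use.
// LoadOrStore guarantees that concurrent callers agree on one value.
func (f *firstLevel) getOrCreate(node string) *nodeCache {
	if v, ok := f.nodes.Load(node); ok {
		return v.(*nodeCache)
	}
	v, _ := f.nodes.LoadOrStore(node, &nodeCache{results: map[string]bool{}})
	return v.(*nodeCache)
}

// remove drops a node's cache, for example when the node is deleted.
func (f *firstLevel) remove(node string) {
	f.nodes.Delete(node)
}

func main() {
	var f firstLevel
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f.getOrCreate("node-1")
		}()
	}
	wg.Wait()

	a, b := f.getOrCreate("node-1"), f.getOrCreate("node-1")
	fmt.Println("same cache:", a == b)

	f.remove("node-1")
	_, ok := f.nodes.Load("node-1")
	fmt.Println("still present:", ok)
}
```

The fast path is a plain `Load`, so readers that find an existing entry never allocate a new `nodeCache`.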

**Special notes for your reviewer**:

**Release note**:

```release-note
Use sync.map to scale ecache better
```
2018-08-21 00:24:01 -07:00
Kubernetes Submit Queue
ef388fee53 Merge pull request #66948 from mohamed-mehany/anti-affinity-optimization
Automatic merge from submit-queue (batch tested with PRs 67041, 66948).

Anti affinity optimization

**What this PR does / why we need it**:
This pull request optimizes the lookup of anti-affinity rules of existing pods.
The optimization maps each topology value to a list of pods running on nodes that match this value, and stores that map in the pod metadata. Accordingly, when validating anti-affinity rules of existing pods, we only check those running on nodes whose topology values match those of the current candidate node for scheduling.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63937

**Special notes for your reviewer**:
/sig scalability
/sig scheduling
**Release note**:

```release-note
improve performance of anti-affinity predicate of default scheduler.
```
2018-08-17 19:14:08 -07:00
Ahmad Diaa
b4c7d190cd using set instead of lists for topologyPairsMaps attributes 2018-08-18 01:02:48 +02:00
Ahmad Diaa
0f4c3064fd created struct for topologyPairs maps 2018-08-18 01:02:48 +02:00
Ahmad Diaa
f6659e4543 further enhancements removing matchingTerms from metadata 2018-08-18 01:02:47 +02:00
Mohamed Mehany
3fb6912d08 add topologyValue map to reduce search space 2018-08-18 01:02:46 +02:00
Bobby (Babak) Salamat
2860743c86 Autogenerated files 2018-08-17 11:18:52 -07:00
Bobby (Babak) Salamat
abb70aee98 Add a scheduler config argument to set the percentage of nodes to score 2018-08-17 11:18:51 -07:00
Bobby (Babak) Salamat
a5045d107e Add NodeTree to the scheduler cache 2018-08-17 09:56:51 -07:00
Bobby (Babak) Salamat
c1896c97ea Add a node tree that allows iterating over nodes in regions and zones 2018-08-17 09:56:51 -07:00
Kubernetes Submit Queue
eeb3389f3b Merge pull request #63260 from misterikkit/ecache-metrics
Automatic merge from submit-queue.

scheduler: add metrics to equivalence cache

This adds counters to equiv. cache reads & writes. Reads are labeled by
hit/miss, while writes are labeled to indicate whether the write was
discarded.

This will give us visibility into,
- hit rate of cache reads
- ratio of reads to writes
- rate of discarded writes



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/63259

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-08-17 01:10:51 -07:00
Kubernetes Submit Queue
825548df95 Merge pull request #67464 from misterikkit/deadcode
Automatic merge from submit-queue (batch tested with PRs 67461, 67464, 67416).

Delete dead code in pkg/scheduler

**What this PR does / why we need it**:
This is just some cleanup. I found some unused code while evaluating the scheduler code.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
/kind cleanup
/sig scheduling
2018-08-15 20:09:09 -07:00
Jonathan Basseri
fbf3d2b84c Delete dead code in pkg/scheduler.
This deletes some unused functions from the `Configurator` interface.
2018-08-15 17:14:38 -07:00
Jonathan Basseri
a77e3bd16b Delete dead code.
This removes a fake Cache implementation that is not used anywhere
(anymore).
2018-08-15 17:14:37 -07:00
Jonathan Basseri
b874d2789b Add metrics to equivalence cache.
This adds counters to equiv. cache reads & writes. Reads are labeled by
hit/miss, while writes are labeled to indicate whether the write was
discarded.

This will give us visibility into,
- hit rate of cache reads
- ratio of reads to writes
- rate of discarded writes
2018-08-15 15:51:13 -07:00
Wei Huang
976797c0b8 fix an issue in NodeInfo.Clone()
- usedPorts is a map-in-map struct, add fix to ensure it's deep copied
- updated unit test
2018-08-15 13:31:16 -07:00
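
The `NodeInfo.Clone()` fix above concerns a map whose values are themselves maps. A minimal sketch of why a shallow copy is not enough follows; the types `portSet` and `usedPorts` here are simplified, hypothetical shapes (IP to set of ports), not the scheduler's actual definitions.

```go
package main

import "fmt"

// portSet is a set of port numbers.
type portSet map[int32]struct{}

// usedPorts is a map-in-map: IP -> set of ports in use.
type usedPorts map[string]portSet

// shallowCopy copies only the outer map; inner maps stay shared
// with the original, so writes through the copy leak back.
func (u usedPorts) shallowCopy() usedPorts {
	out := make(usedPorts, len(u))
	for ip, ports := range u {
		out[ip] = ports
	}
	return out
}

// deepCopy also copies every inner map, so the clone is independent.
func (u usedPorts) deepCopy() usedPorts {
	out := make(usedPorts, len(u))
	for ip, ports := range u {
		inner := make(portSet, len(ports))
		for p := range ports {
			inner[p] = struct{}{}
		}
		out[ip] = inner
	}
	return out
}

func main() {
	orig := usedPorts{"0.0.0.0": {80: {}}}

	shallow := orig.shallowCopy()
	shallow["0.0.0.0"][443] = struct{}{} // also mutates orig
	fmt.Println("after shallow write, orig ports:", len(orig["0.0.0.0"]))

	deep := orig.deepCopy()
	deep["0.0.0.0"][8080] = struct{}{} // orig is unaffected
	fmt.Println("after deep write, orig ports:", len(orig["0.0.0.0"]))
}
```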
Kubernetes Submit Queue
d7634dcf23 Merge pull request #66856 from charrywanganthony/scheduler_space
Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923).

add space for output

**Release note**:
```release-note
NONE
```
2018-08-14 17:55:11 -07:00
Kubernetes Submit Queue
6274590518 Merge pull request #66656 from wackxu/fixappversion
Automatic merge from submit-queue.

use apps/v1 version for scheduler

/kind cleanup

**Release note**:

```release-note
NONE
```
2018-08-11 23:25:33 -07:00
Avesh Agarwal
be741feb1a Output volumes (total capacity and requests) too along with cpu and memory
when the feature BalanceAttachedNodeVolumes is used.
2018-08-07 15:40:33 -04:00
Avesh Agarwal
ea7f711ae2 Fix incorrect reporting of total request including current pod in the
resource allocation priority function.
2018-08-07 15:37:55 -04:00
Harry Zhang
17d0190706 Use sync.map to scale ecache better 2018-08-07 14:06:09 +08:00
Chao Wang
895b6d441d add space for output 2018-08-01 18:08:31 +08:00
Kubernetes Submit Queue
f4d8220df5 Merge pull request #65616 from cofyc/fix56163
Automatic merge from submit-queue (batch tested with PRs 65570, 65616).

Retry scheduling on StorageClass events

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56163

**Special notes for your reviewer**:

I have taken over #60006.
It's hard to test in e2e because we cannot tell which event triggered the rescheduling of a pod (periodic service/node events also move pods to the active queue). ~~I'll add integration tests for this functionality after [this PR](https://github.com/kubernetes/kubernetes/pull/65296) gets merged.~~ (already added)

**Release note**:

```release-note
NONE
```
2018-07-31 19:18:00 -07:00
Kubernetes Submit Queue
0e9b1dd20f Merge pull request #66671 from hanxiaoshuai/cleanup07261
Automatic merge from submit-queue (batch tested with PRs 63955, 66685, 66671).

remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go

**What this PR does / why we need it**:
remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-26 21:05:11 -07:00
Kubernetes Submit Queue
fea4ad2783 Merge pull request #66670 from foxyriver/fix-log
Automatic merge from submit-queue.

fix error log

**What this PR does / why we need it**:

fix error log



**Release note**:

```release-note
NONE
```
2018-07-26 19:43:19 -07:00
Mayank Kumar
a5b6d805ea Use GetControllerOf from apimachinery and remove kubernetes copy 2018-07-26 12:20:35 -07:00
hangaoshuai
f3fb9e0f33 remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go 2018-07-26 21:01:50 +08:00
foxyriver
3b4f250c4a fix error log 2018-07-26 19:48:48 +08:00
wackxu
ab35fa0414 update bazel 2018-07-26 17:37:29 +08:00
xushiwei 00425595
fed8572745 use apps/v1 version for scheduler 2018-07-26 17:37:29 +08:00
Kubernetes Submit Queue
e4465b6e2f Merge pull request #66599 from cofyc/fixfeaturegate
Automatic merge from submit-queue (batch tested with PRs 66540, 66599).

Invalidate CheckVolumeBinding predicate only when VolumeScheduling feature is enabled

**What this PR does / why we need it**:

Invalidate CheckVolumeBinding predicate only when VolumeScheduling feature is enabled.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-26 01:55:17 -07:00
Kubernetes Submit Queue
84a15d0291 Merge pull request #66540 from hanxiaoshuai/fixut0724
Automatic merge from submit-queue (batch tested with PRs 66540, 66599).

replace predicates string with corresponding const in TestDefaultPredicates

**What this PR does / why we need it**:
replace predicates string with corresponding const in TestDefaultPredicates. Unify with the const in func defaultPredicates().
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-26 01:55:14 -07:00
Bobby (Babak) Salamat
be55371ff2 minor cleanup of selector_spreading priority function 2018-07-25 13:43:37 -07:00
Yecheng Fu
d2fc875489 Invalidate CheckVolumeBinding predicate only when VolumeScheduling
feature is enabled.
2018-07-25 15:11:23 +08:00
Kubernetes Submit Queue
4dbcf32b3c Merge pull request #66471 from islinwb/improve_TestZeroRequest
Automatic merge from submit-queue (batch tested with PRs 66291, 66471, 66499).

Improve unit test TestZeroRequest

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66468

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-24 13:59:58 -07:00
Kubernetes Submit Queue
2119d349b0 Merge pull request #66291 from resouer/fix-extender
Automatic merge from submit-queue.

Extender preemption should respect IsInterested()

**What this PR does / why we need it**:

Extender preemption should respect IsInterested()

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66289 

**Special notes for your reviewer**:

The bug was reported, and the first commit co-authored, by @chenchun.

**Release note**:

```release-note
Extender preemption should respect IsInterested()
```
2018-07-24 13:48:38 -07:00
hangaoshuai
2c59a683a2 replace predicates string with corresponding const in TestDefaultPredicates 2018-07-24 14:27:36 +08:00
Weibin Lin
972e78748a add pod UID 2018-07-23 10:44:31 +08:00
Harry Zhang
d644162a29 Extender preemption should respect IsInterested()
Co-authored-by: Harry Zhang <resouer@gmail.com>
Co-authored-by: Chun Chen <ramichen@tencent.com>
2018-07-23 10:13:38 +08:00
Weibin Lin
5449d153bb Improve unit test TestZeroRequest 2018-07-23 09:15:19 +08:00
Kubernetes Submit Queue
4797c8df8f Merge pull request #63665 from xchapter7x/pkg-scheduler-core
Automatic merge from submit-queue.

use subtest for table units (pkg/scheduler/core)

**What this PR does / why we need it**: Update scheduler's unit table tests to use subtest

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281
/ref #63267

**Release note**:

```release-note
This PR will leverage subtests on the existing table tests for the scheduler units.
Some refactoring of error/status messages and functions to align with new approach.

```
2018-07-21 01:52:30 -07:00
Kubernetes Submit Queue
827aa934ac Merge pull request #66397 from gnufied/fix-default-max-volume-ebs
Automatic merge from submit-queue (batch tested with PRs 66410, 66398, 66061, 66397, 65558).

Fix volume limit for EBS on m5 and c5 instances

This is a fix for lower volume limits on m5 and c5 instance types while we wait for https://github.com/kubernetes/features/issues/554 to land GA.

This problem became urgent because many of our users are trying to migrate to those instance types in light of the spectre/meltdown vulnerabilities, but the lower volume limit on those instance types often causes cluster instability. They can work around this by configuring the scheduler with a lower limit, but that becomes difficult when the cluster is mixed.

The newer default limits were picked from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html

Text about spectre/meltdown is available at https://community.bitnami.com/t/spectre-variant-2/54961/5

/sig storage
/sig scheduling

```release-note
Fix volume limit for EBS on m5 and c5 instance types
```
2018-07-20 18:51:11 -07:00
John Calabrese
ad234e58be use subtest for table units
remove duplicate testname from error msg

remove subtest for test setup loop

do not break on test failure

  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203571355

remove duplicate test.name in output

  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574001
  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574012
2018-07-20 16:02:50 -04:00
Yecheng Fu
8f0373792f Retry scheduling on various events. 2018-07-20 09:54:34 +08:00
Kubernetes Submit Queue
795b7da8b0 Merge pull request #65714 from resouer/fix-63784
Automatic merge from submit-queue.

Re-design equivalence class cache to two level cache

**What this PR does / why we need it**:

The current ecache introduced a global lock across all the nodes, and this patch assigns an ecache per node to eliminate that global lock. The improvements in scheduling performance and throughput are both significant.

**CPU Profile Result** 

Machine: 32-core 60GB GCE VM

1k nodes 10k pods bench test (we've highlighted the critical function):

1. Current default scheduler with ecache enabled:
![equivalence class cache bench test 001](https://user-images.githubusercontent.com/1701782/42196992-51b0a32a-7eb3-11e8-89ee-f13383091a00.jpeg)
2. Current default scheduler with ecache disabled:
![equivalence class cache bench test 002](https://user-images.githubusercontent.com/1701782/42196993-51eb0c68-7eb3-11e8-9326-1a7762072863.jpeg)
3. Current default scheduler with this patch and ecache enabled:
![equivalence class cache bench test 003](https://user-images.githubusercontent.com/1701782/42196994-52280ed8-7eb3-11e8-8100-690e2af2cf2f.jpeg)

**Throughput Test Result** 

1k nodes 3k pods `scheduler_perf` test: 

Current default scheduler, ecache is disabled:
```bash
Minimal observed throughput for 3k pod test: 200
PASS
ok      k8s.io/kubernetes/test/integration/scheduler_perf    30.091s
```
With this patch, ecache is enabled:
```bash
Minimal observed throughput for 3k pod test: 556
PASS
ok      k8s.io/kubernetes/test/integration/scheduler_perf    11.119s
```

**Design and implementation:**

The idea is to re-design ecache into a "two level cache".

The first level cache holds the global lock across nodes. Synchronization is needed only when a node is added or deleted, which happens far less frequently.

The second level cache is assigned per node, and its lock is restricted to that node, so there is no need to hold the global lock during the whole predicate cycle. For more detail, please check [the original discussion](https://github.com/kubernetes/kubernetes/issues/63784#issuecomment-399848349).
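
The two level layout can be sketched as follows. This is a simplified illustration under stated assumptions, not the patch itself: the types `cache` and `nodeCache`, the method names, and the use of a predicate name as the cache key are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache is the second level: one per node, with its own lock.
type nodeCache struct {
	mu      sync.RWMutex
	results map[string]bool // predicate name -> cached fit result
}

func (n *nodeCache) lookup(pred string) (fit, ok bool) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	fit, ok = n.results[pred]
	return fit, ok
}

func (n *nodeCache) store(pred string, fit bool) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.results[pred] = fit
}

// cache is the first level: node name -> *nodeCache. Its write lock
// is taken only when a node is added or removed.
type cache struct {
	mu    sync.RWMutex
	nodes map[string]*nodeCache
}

func newCache() *cache { return &cache{nodes: map[string]*nodeCache{}} }

func (c *cache) addNode(name string) *nodeCache {
	c.mu.Lock()
	defer c.mu.Unlock()
	if n, ok := c.nodes[name]; ok {
		return n
	}
	n := &nodeCache{results: map[string]bool{}}
	c.nodes[name] = n
	return n
}

func (c *cache) removeNode(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.nodes, name)
}

// node takes only a brief read lock; all predicate work afterwards
// uses the per-node lock.
func (c *cache) node(name string) (*nodeCache, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	n, ok := c.nodes[name]
	return n, ok
}

func main() {
	c := newCache()
	c.addNode("node-1")
	if n, ok := c.node("node-1"); ok {
		n.store("PodFitsResources", true)
		fit, hit := n.lookup("PodFitsResources")
		fmt.Println("fit:", fit, "cache hit:", hit)
	}
	c.removeNode("node-1")
	_, ok := c.node("node-1")
	fmt.Println("node still cached:", ok)
}
```

Because each node has its own lock, predicate evaluation for different nodes can proceed in parallel without contending on a single mutex.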

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63784

**Special notes for your reviewer**:

~~Tagged as WIP to make sure this does not break existing code and tests, we can start review after CI is happy.~~

**Release note**:

```release-note
Re-design equivalence class cache to two level cache
```
2018-07-19 16:16:02 -07:00
Hemant Kumar
45b8107378 Fix volume limit for EBS on m5 and c5 instances 2018-07-19 16:27:52 -04:00
Kubernetes Submit Queue
357decc9db Merge pull request #63666 from xchapter7x/pkg-scheduler-factory
Automatic merge from submit-queue (batch tested with PRs 58487, 63666).

use subtest for table units (pkg/scheduler/factory)

**What this PR does / why we need it**: Update scheduler's unit table tests to use subtest

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281
/ref #63267

**Release note**:

```release-note
This PR will leverage subtests on the existing table tests for the scheduler units.
Some refactoring of error/status messages and functions to align with new approach.

```
2018-07-19 02:09:06 -07:00
Harry Zhang
e5a7a4caf7 First level ecache for nodeMap
Use new cache map in scheduler

Add an integration test

Move init before scheduling

Add lock for first level cache
2018-07-18 15:11:59 +08:00
Harry Zhang
17977478e7 RWLock for cache 2018-07-18 15:11:59 +08:00
Nikhita Raghunath
c166743272 scheduler: fix panic while removing node from imageStates cache 2018-07-16 11:42:28 +05:30