Commit Graph

35 Commits

Author SHA1 Message Date
Guoliang Wang
761cf41427 Move pkg/scheduler/schedulercache -> pkg/scheduler/cache 2018-05-31 22:55:34 +08:00
Jonathan Basseri
18a8184dce Add warnings about cache invalidation.
Part of https://github.com/kubernetes/kubernetes/pull/63040 is the
assumption that scheduler cache updates must happen before equivalence
cache updates for any given informer event.

The reason for this is that the equivalence cache implementation checks
the main cache for staleness while holding the equiv. cache write lock.

case 1: If an informer invalidates an equiv. cache entry before the
staleness check, then we know that the main cache update completed.

case 2: If an informer blocks trying to grab the equiv. cache lock, then
invalidation will occur right after the potentially stale update is
written.

This patch adds a note to places where we invalidate the equivalence
cache so that hopefully nobody violates this invariant.
2018-05-22 15:15:37 -07:00
Kubernetes Submit Queue
0cf3788419 Merge pull request #63174 from misterikkit/equivHash
Automatic merge from submit-queue (batch tested with PRs 62937, 63105, 63031, 63174). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert "Revert "Revert revert of equivalence class hash calculation i…

…n scheduler""

This reverts commit 4386751b5d.



**What this PR does / why we need it**:
This re-introduces the change from https://github.com/kubernetes/kubernetes/pull/58555 which changes how the scheduler computes equivalence classes of pods. I believe we have fixed the flakiness observed previously (https://github.com/kubernetes/kubernetes/issues/61512, https://github.com/kubernetes/kubernetes/issues/62921). I have run the test in question a few dozen times without a failure.

```bash
make test-integration WHAT="./test/integration/scheduler" KUBE_TEST_ARGS="-run TestPreemptionStarvation" GOFLAGS="-v"
```

/ref https://github.com/kubernetes/kubernetes/issues/58222

**Special notes for your reviewer**:
I had to resolve several merge conflicts. I think I resolved them correctly, but keep an eye out for anything silly.

**Release note**:

```release-note
NONE
```
/sig scheduling
2018-04-26 16:40:19 -07:00
Da K. Ma
2c10d15ae5 Do not schedule pod to the node under PID pressure.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-04-26 10:07:42 +08:00
Jonathan Basseri
eace2d08d0 Revert "Revert "Revert revert of equivalence class hash calculation in scheduler""
This reverts commit 4386751b5d.
2018-04-25 16:11:59 -07:00
Bobby (Babak) Salamat
a073dfdbd9 Fix scheduler Pod informers to receive events when pods are scheduled by other schedulers. 2018-04-23 11:07:53 -07:00
Kubernetes Submit Queue
1e39d68ecb Merge pull request #62243 from resouer/fix-62068
Automatic merge from submit-queue (batch tested with PRs 59592, 62308, 62523, 62635, 62243). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Separate pod priority from preemption

**What this PR does / why we need it**:
Users request to split priority and preemption feature gate so they can use priority separately.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62068 

**Special notes for your reviewer**:

~~I kept use `ENABLE_POD_PRIORITY` as ENV name for gce cluster scripts for backward compatibility reason. Please let me know if other approach is preffered.~~

~~This is a potential **break change** as existing clusters will be affected, we may need to include this in 1.11 maybe?~~

TODO: update this doc https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/

[Update] Usage: in config file for scheduler:
```yaml
apiVersion: componentconfig/v1alpha1
kind: KubeSchedulerConfiguration
...
disablePreemption: true
```

**Release note**:

```release-note
Split PodPriority and PodPreemption feature gate
```
2018-04-19 14:50:27 -07:00
Harry Zhang
4f0bd4121e Disable pod preemption by config 2018-04-12 21:11:51 -07:00
zhangxiaoyu-zidif
a7771ef58b spec.SchedulerName should be spec.schedulerName in kube-scheduler help 2018-04-07 18:06:17 +08:00
Kubernetes Submit Queue
71f150422c Merge pull request #62180 from msau42/binding-predicate
Automatic merge from submit-queue (batch tested with PRs 61918, 62180, 62198). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use provided node object in volume binding predicate

**What this PR does / why we need it**:
Autoscaler creates fake node objects, so we should use the provided node object instead of looking up the node from the informer.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #62178

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-04-06 11:56:07 -07:00
Michelle Au
8d1cd819ec Use provided node object in volume binding predicate 2018-04-05 14:35:55 -07:00
wackxu
4aa4255cf1 remove pvc node affinity update check since beta NodeAffinity is immutable 2018-04-03 09:41:40 +08:00
wackxu
6c0e588915 use filed NodeAffinity instead of annotation for scheduler 2018-03-28 16:08:15 +08:00
Bobby (Babak) Salamat
4386751b5d Revert "Revert revert of equivalence class hash calculation in scheduler" 2018-03-23 17:34:49 -07:00
Kubernetes Submit Queue
2d864f2359 Merge pull request #60953 from anfernee/sched-cache-resync
Automatic merge from submit-queue (batch tested with PRs 60919, 60953, 61085, 61083, 60971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Sched cache resync

**What this PR does / why we need it**:  Scheduler cache comparer
    
    A debug tool that collects resources from api server and compares it
    with the scheduler cache. It currently only compares the node list, but
    it should be easy to extend. The compare is triggered by signal USER2,
    by doing
    
      kill -12 ${SCHED_PID}
    
    The compare result goes to scheduler log.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Towards #60860

**Special notes for your reviewer**: @bsalamat 

**Release note**:
```release-note
None
```
2018-03-20 20:34:28 -07:00
Yongkun Anfernee Gui
cda749c237 Pod comparer should count pods in scheduling queue
Pods in scheduler cache contains both the scheduled pods and those not
scheduled yet in scheduling queue. This commit adds the second group of
pods into consideration while comparing the cache.
2018-03-14 10:29:42 -07:00
Yongkun Anfernee Gui
eba9528753 Add cache comparison for pods and pdbs 2018-03-09 15:10:26 -08:00
Yongkun Anfernee Gui
fda0d07eb6 Scheduler cache comparer
A debug tool that collects resources from api server and compares it
with the scheduler cache. It currently only compares the node list, but
it should be easy to extend. The compare is triggered by signal USER2,
by doing

  kill -12 ${SCHED_PID}

The compare result goes to scheduler log.

Towards #60860
2018-03-09 15:10:22 -08:00
Jonathan Basseri
f5ab6d5ad4 [PATCH] Fix equiv. cache invalidation of Node condition.
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.

This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.

Add benchmark for equivalence hashing.

Change equivalence hash function.

This changes the equivalence class hashing function to use as inputs all
the Pod fields which are read by FitPredicates. Before we used a
combination of OwnerReference and PersistentVolumeClaim info, which was
a close approximation. The new method ensures that hashing remains
correct regardless of controller behavior.

The PVCSet field can be removed from equivalencePod because it is
implicitly included in the Volume list.

Tests are now broken.

Move equivalence class hash code.

This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.

In the process, making the hashing function and hashing function factory
both injectable dependencies is removed.

Fix equivalence cache hash tests.

Co-authored-by: Jonathan Basseri <misterikkit@google.com>
Co-authored-by: Harry Zhang <resouer@gmail.com>
2018-03-04 13:02:28 -08:00
Yang Guo
8d880506fe Support cluster-level extended resources in kubelet and kube-scheduler
Co-authored-by: Yang Guo <ygg@google.com>
Co-authored-by: Chun Chen <chenchun.feed@gmail.com>
2018-02-27 17:25:30 -08:00
Harry Zhang
136e5398ed Use consts as predicate name in handlers 2018-02-15 15:18:27 -08:00
Kubernetes Submit Queue
4949b63e0a Merge pull request #59335 from resouer/equiv-vol
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fixes volume predicate handler for equiv class

**What this PR does / why we need it**:

Per discussion in #58797 , we are missing some predicate handler in factory.go.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58797

Ref #58222

**Special notes for your reviewer**:

Kindly ping @msau42

**Release note**:

```release-note
Fixes volume predicate handler for equiv class
```
2018-02-14 20:57:08 -08:00
Bobby (Babak) Salamat
69d62a9288 Improve performance of scheduling queue by adding a hash map to track all pods in with a nominatedNodeName. 2018-02-09 14:07:29 -08:00
tossmilestone
3fdacfead5 Fix golint errors in pkg/scheduler based on golint check 2018-02-08 15:22:47 +08:00
Kubernetes Submit Queue
5a4b160cf0 Merge pull request #59281 from bsalamat/nominated_node
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Replace nominateNodeName annotation with PodStatus.NominatedNodeName

**What this PR does / why we need it**:
Replaces nominateNodeName annotation with PodStatus.NominatedNodeName in scheudler's logic. We don't expect any logic/behavior changes.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

ref #57471

/sig scheduling
cc: @k82cn @aveshagarwal @resouer
2018-02-07 15:27:43 -08:00
Yang Guo
f69eaa3b18 kube-scheduler: Use default predicates/prioritizers if policy config does not specify them 2018-02-06 13:01:33 -08:00
Harry Zhang
bff62d2c86 Equiv class volume fixes
Generated bazel
2018-02-05 16:58:43 -08:00
Bobby (Babak) Salamat
bfd950e471 Replace nominateNodeName annotation with PodStatus.NominatedNodeName in scheudler logic 2018-02-02 13:06:33 -08:00
Bobby (Babak) Salamat
2274e93b64 Revert "Change equivalence class hashing function" 2018-01-26 18:13:15 -08:00
Jonathan Basseri
466a499fcb Move equivalence class hash code.
This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.

In the process, making the hashing function and hashing function factory
both injectable dependencies is removed.
2018-01-24 17:15:42 -08:00
Jonathan Basseri
59f0a99909 Fix equiv. cache invalidation of Node condition.
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.

This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.
2018-01-24 17:07:52 -08:00
Bobby (Babak) Salamat
79601acb2c Add better event handling for deleted Pods 2018-01-23 12:03:35 -08:00
junxu
5deb5f4913 Rename func name according TODO 2018-01-15 00:08:59 -05:00
Wang Guoliang
b8526cd077 -Add scheduler optimization options, short circuit all predicates if one predicate fails 2018-01-13 18:18:55 +08:00
Jonathan Basseri
30b89d830b Move scheduler code out of plugin directory.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.

Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
2018-01-05 15:05:01 -08:00