kubernetes

Author	SHA1	Message	Date
Kubernetes Submit Queue	4cca6a89a0	Merge pull request #66862 from resouer/sync-map Automatic merge from submit-queue (batch tested with PRs 66862, 67618). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Use sync.map to scale equiv class cache better What this PR does / why we need it: Change the current lock in first level ecache into `sync.Map`, which is known for scaling better than `sync.Mutex` on machines with >8 CPUs ref: https://golang.org/pkg/sync/#Map And the code is much cleaner in this way. 5k Nodes, 10k Pods benchmark with ecache enabled in 64 cores VM: ```bash // before BenchmarkScheduling/5000Nodes/0Pods-64 10000 17550089 ns/op // after BenchmarkScheduling/5000Nodes/0Pods-64 10000 16975098 ns/op ``` Comparing to current implementation, the improvement after this change is noticeable, and the test is stable in 8, 16, 64 cores VM. Special notes for your reviewer: Release note: ```release-note Use sync.map to scale ecache better ```	2018-08-21 00:24:01 -07:00
Kubernetes Submit Queue	ef388fee53	Merge pull request #66948 from mohamed-mehany/anti-affinity-optimization Automatic merge from submit-queue (batch tested with PRs 67041, 66948). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Anti affinity optimization What this PR does / why we need it: This pull request aims to optimize the performance of anti-affinity rules lookup of existing pods This optimization maps the topology values to a list of pods running on nodes that match this value and store that map in the pod metadata. Accordingly, when validating anti-affinity rules of existing pods we will only check those running on nodes with similar topology values to the current candidate (node) for scheduling. Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #63937 Special notes for your reviewer: /sig scalability /sig scheduling Release note: ```release-note improve performance of anti-affinity predicate of default scheduler. ```	2018-08-17 19:14:08 -07:00
Ahmad Diaa	b4c7d190cd	using set instead of lists for topologyPairsMaps attributes	2018-08-18 01:02:48 +02:00
Ahmad Diaa	0f4c3064fd	created struct for topologyPairs maps	2018-08-18 01:02:48 +02:00
Ahmad Diaa	f6659e4543	further enhancements removing matchingTerms from metadata	2018-08-18 01:02:47 +02:00
Mohamed Mehany	3fb6912d08	add topologyValue map to reduce search space	2018-08-18 01:02:46 +02:00
Bobby (Babak) Salamat	2860743c86	Autogenerated files	2018-08-17 11:18:52 -07:00
Bobby (Babak) Salamat	abb70aee98	Add a scheduler config argument to set the percentage of nodes to score	2018-08-17 11:18:51 -07:00
Bobby (Babak) Salamat	a5045d107e	Add NodeTree to the scheduler cache	2018-08-17 09:56:51 -07:00
Bobby (Babak) Salamat	c1896c97ea	Add a node tree that allows iterating over nodes in regions and zones	2018-08-17 09:56:51 -07:00
Kubernetes Submit Queue	eeb3389f3b	Merge pull request #63260 from misterikkit/ecache-metrics Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. scheduler: add metrics to equivalence cache This adds counters to equiv. cache reads & writes. Reads are labeled by hit/miss, while writes are labeled to indicate whether the write was discarded. This will give us visibility into, - hit rate of cache reads - ratio of reads to writes - rate of discarded writes What this PR does / why we need it: Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes https://github.com/kubernetes/kubernetes/issues/63259 Special notes for your reviewer: Release note: ```release-note NONE ```	2018-08-17 01:10:51 -07:00
Kubernetes Submit Queue	825548df95	Merge pull request #67464 from misterikkit/deadcode Automatic merge from submit-queue (batch tested with PRs 67461, 67464, 67416). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Delete dead code in pkg/scheduler What this PR does / why we need it: This is just some cleanup. I found some unused code while evaluating the scheduler code. Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes # Special notes for your reviewer: Release note: ```release-note NONE ``` /kind cleanup /sig scheduling	2018-08-15 20:09:09 -07:00
Jonathan Basseri	fbf3d2b84c	Delete dead code in pkg/scheduler. This deletes some unused functions from the `Configurator` interface.	2018-08-15 17:14:38 -07:00
Jonathan Basseri	a77e3bd16b	Delete dead code. This removes a fake Cache implementation that is not used anywhere (anymore).	2018-08-15 17:14:37 -07:00
Jonathan Basseri	b874d2789b	Add metrics to equivalence cache. This adds counters to equiv. cache reads & writes. Reads are labeled by hit/miss, while writes are labeled to indicate whether the write was discarded. This will give us visibility into, - hit rate of cache reads - ratio of reads to writes - rate of discarded writes	2018-08-15 15:51:13 -07:00
Wei Huang	976797c0b8	fix an issue in NodeInfo.Clone() - usedPorts is a map-in-map struct, add fix to ensure it's deep copied - updated unit test	2018-08-15 13:31:16 -07:00
Kubernetes Submit Queue	d7634dcf23	Merge pull request #66856 from charrywanganthony/scheduler_space Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. add space for output Release note: ```release-note NONE ```	2018-08-14 17:55:11 -07:00
Kubernetes Submit Queue	6274590518	Merge pull request #66656 from wackxu/fixappversion Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. use apps/v1 version for scheduler /kind cleanup Release note: ```release-note NONE ```	2018-08-11 23:25:33 -07:00
Avesh Agarwal	be741feb1a	Output volumes (total capacity and requests) too along with cpu and memory when the feature BalanceAttachedNodeVolumes is used.	2018-08-07 15:40:33 -04:00
Avesh Agarwal	ea7f711ae2	Fix incorrect reporting of total request including current pod in the resource allocation priority function.	2018-08-07 15:37:55 -04:00
Harry Zhang	17d0190706	Use sync.map to scale ecache better	2018-08-07 14:06:09 +08:00
Chao Wang	895b6d441d	add space for output	2018-08-01 18:08:31 +08:00
Kubernetes Submit Queue	f4d8220df5	Merge pull request #65616 from cofyc/fix56163 Automatic merge from submit-queue (batch tested with PRs 65570, 65616). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Retry scheduling on StorageClass events What this PR does / why we need it: Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #56163 Special notes for your reviewer: I have taken over #60006. It's hard to test in e2e, because we cannot know reschedule of pod is triggered by which event (periodically service/node events will move pods to active queue too). ~~I'll add integration tests for this functionality after [this PR](https://github.com/kubernetes/kubernetes/pull/65296) get merged.~~ (already added) Release note: ```release-note NONE ```	2018-07-31 19:18:00 -07:00
Kubernetes Submit Queue	0e9b1dd20f	Merge pull request #66671 from hanxiaoshuai/cleanup07261 Automatic merge from submit-queue (batch tested with PRs 63955, 66685, 66671). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go What this PR does / why we need it: remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes # Special notes for your reviewer: Release note: ```release-note NONE ```	2018-07-26 21:05:11 -07:00
Kubernetes Submit Queue	fea4ad2783	Merge pull request #66670 from foxyriver/fix-log Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. fix error log What this PR does / why we need it: fix error log Release note: ```release-note NONE ```	2018-07-26 19:43:19 -07:00
Mayank Kumar	a5b6d805ea	Use GetControllerOf from apimachinery and remove kubernetes copy	2018-07-26 12:20:35 -07:00
hangaoshuai	f3fb9e0f33	remove unused code in pkg/scheduler/algorithm/scheduler_interface_test.go	2018-07-26 21:01:50 +08:00
foxyriver	3b4f250c4a	fix error log	2018-07-26 19:48:48 +08:00
wackxu	ab35fa0414	update bazel	2018-07-26 17:37:29 +08:00
xushiwei 00425595	fed8572745	use apps/v1 version for scheduler	2018-07-26 17:37:29 +08:00
Kubernetes Submit Queue	e4465b6e2f	Merge pull request #66599 from cofyc/fixfeaturegate Automatic merge from submit-queue (batch tested with PRs 66540, 66599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Invalidate CheckVolumeBinding predicate only when VolumeScheduling feature is enabled What this PR does / why we need it: Invalidate CheckVolumeBinding predicate only when VolumeScheduling feature is enabled. Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes # Special notes for your reviewer: Release note: ```release-note NONE ```	2018-07-26 01:55:17 -07:00
Kubernetes Submit Queue	84a15d0291	Merge pull request #66540 from hanxiaoshuai/fixut0724 Automatic merge from submit-queue (batch tested with PRs 66540, 66599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. replace predicates string with corresponding const in TestDefaultPredicates What this PR does / why we need it: replace predicates string with corresponding const in TestDefaultPredicates. Unify with the const in func defaultPredicates(). Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes # Special notes for your reviewer: Release note: ```release-note NONE ```	2018-07-26 01:55:14 -07:00
Bobby (Babak) Salamat	be55371ff2	minor cleanup of selector_spreading priority function	2018-07-25 13:43:37 -07:00
Yecheng Fu	d2fc875489	Invalidate CheckVolumeBinding predicate only when VolumeScheduling feature is enabled.	2018-07-25 15:11:23 +08:00
Kubernetes Submit Queue	4dbcf32b3c	Merge pull request #66471 from islinwb/improve_TestZeroRequest Automatic merge from submit-queue (batch tested with PRs 66291, 66471, 66499). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Improve unit test TestZeroRequest What this PR does / why we need it: Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #66468 Special notes for your reviewer: Release note: ```release-note NONE ```	2018-07-24 13:59:58 -07:00
Kubernetes Submit Queue	2119d349b0	Merge pull request #66291 from resouer/fix-extender Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Extender preemption should respect IsInterested() What this PR does / why we need it: Extender preemption should respect IsInterested() Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #66289 Special notes for your reviewer: The bug is reported and the first commit is co-authored by: @chenchun Release note: ```release-note Extender preemption should respect IsInterested() ```	2018-07-24 13:48:38 -07:00
hangaoshuai	2c59a683a2	replace predicates string with corresponding const in TestDefaultPredicates	2018-07-24 14:27:36 +08:00
Weibin Lin	972e78748a	add pod UID	2018-07-23 10:44:31 +08:00
Harry Zhang	d644162a29	Extender preemption should respect IsInterested() Co-authored-by: Harry Zhang <resouer@gmail.com> Co-authored-by: Chun Chen <ramichen@tencent.com>	2018-07-23 10:13:38 +08:00
Weibin Lin	5449d153bb	Improve unit test TestZeroRequest	2018-07-23 09:15:19 +08:00
Kubernetes Submit Queue	4797c8df8f	Merge pull request #63665 from xchapter7x/pkg-scheduler-core Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. use subtest for table units (pkg/scheduler/core) What this PR does / why we need it: Update scheduler's unit table tests to use subtest Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Special notes for your reviewer: breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281 /ref #63267 Release note: ```release-note This PR will leverage subtests on the existing table tests for the scheduler units. Some refactoring of error/status messages and functions to align with new approach. ```	2018-07-21 01:52:30 -07:00
Kubernetes Submit Queue	827aa934ac	Merge pull request #66397 from gnufied/fix-default-max-volume-ebs Automatic merge from submit-queue (batch tested with PRs 66410, 66398, 66061, 66397, 65558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Fix volume limit for EBS on m5 and c5 instances This is a fix for lower volume limits on m5 and c5 instance types while we wait for https://github.com/kubernetes/features/issues/554 to land GA. This problem became urgent because many of our users are trying to migrate to those instance types in light of spectre/meltdown vulnerability but lower volume limit on those instance types often causes cluster instability. Yes they can workaround by configuring the scheduler with lower limit but often this becomes somewhat difficult to do when cluster is mixed. The newer default limits were picked from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html Text about spectre/meltdown is available on - https://community.bitnami.com/t/spectre-variant-2/54961/5 /sig storage /sig scheduling ```release-note Fix volume limit for EBS on m5 and c5 instance types ```	2018-07-20 18:51:11 -07:00
John Calabrese	ad234e58be	use subtest for table units remove duplicate testname from error msg remove subtest for test setup loop do not break on test failure https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203571355 remove duplicate test.name in output https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574001 https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574012	2018-07-20 16:02:50 -04:00
Yecheng Fu	8f0373792f	Retry scheduling on various events.	2018-07-20 09:54:34 +08:00
Kubernetes Submit Queue	795b7da8b0	Merge pull request #65714 from resouer/fix-63784 Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Re-design equivalence class cache to two level cache What this PR does / why we need it: The current ecache introduced a global lock across all the nodes, and this patch tried to assign ecache per node to eliminate that global lock. The improvement of scheduling performance and throughput are both significant. CPU Profile Result Machine: 32-core 60GB GCE VM 1k nodes 10k pods bench test (we've highlighted the critical function): 1. Current default scheduler with ecache enabled: ![equivlance class cache bench test 001](https://user-images.githubusercontent.com/1701782/42196992-51b0a32a-7eb3-11e8-89ee-f13383091a00.jpeg) 2. Current default scheduler with ecache disabled: ![equivlance class cache bench test 002](https://user-images.githubusercontent.com/1701782/42196993-51eb0c68-7eb3-11e8-9326-1a7762072863.jpeg) 3. Current default scheduler with this patch and ecache enabled: ![equivlance class cache bench test 003](https://user-images.githubusercontent.com/1701782/42196994-52280ed8-7eb3-11e8-8100-690e2af2cf2f.jpeg) Throughput Test Result 1k nodes 3k pods `scheduler_perf` test: Current default scheduler, ecache is disabled: ```bash Minimal observed throughput for 3k pod test: 200 PASS ok k8s.io/kubernetes/test/integration/scheduler_perf 30.091s ``` With this patch, ecache is enabled: ```bash Minimal observed throughput for 3k pod test: 556 PASS ok k8s.io/kubernetes/test/integration/scheduler_perf 11.119s ``` Design and implementation: The idea is: we re-designed ecache into a "two level cache". The first level cache holds the global lock across nodes and sync is needed only when node is added or deleted, which is of much lower frequency. 
The second level cache is assigned per node and its lock is restricted to per node level, thus there's no need to bother the global lock during whole predicate process cycle. For more detail, please check [the original discussion](https://github.com/kubernetes/kubernetes/issues/63784#issuecomment-399848349). Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #63784 Special notes for your reviewer: ~~Tagged as WIP to make sure this does not break existing code and tests, we can start review after CI is happy.~~ Release note: ```release-note Re-design equivalence class cache to two level cache ```	2018-07-19 16:16:02 -07:00
Hemant Kumar	45b8107378	Fix volume limit for EBS on m5 and c5 instances	2018-07-19 16:27:52 -04:00
Kubernetes Submit Queue	357decc9db	Merge pull request #63666 from xchapter7x/pkg-scheduler-factory Automatic merge from submit-queue (batch tested with PRs 58487, 63666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. use subtest for table units (pkg/scheduler/factory) What this PR does / why we need it: Update scheduler's unit table tests to use subtest Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Special notes for your reviewer: breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281 /ref #63267 Release note: ```release-note This PR will leverage subtests on the existing table tests for the scheduler units. Some refactoring of error/status messages and functions to align with new approach. ```	2018-07-19 02:09:06 -07:00
Harry Zhang	e5a7a4caf7	First level ecache for nodeMap Use new cache map in scheduler Add an integration test Move init before scheduling Add lock for first level cache	2018-07-18 15:11:59 +08:00
Harry Zhang	17977478e7	RWLock for cache	2018-07-18 15:11:59 +08:00
Nikhita Raghunath	c166743272	scheduler: fix panic while removing node from imageStates cache	2018-07-16 11:42:28 +05:30
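
The two-level equivalence cache described in the entries for #65714 and #66862 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the scheduler's actual code: the names `Cache`, `nodeCache`, `forNode`, `removeNode`, `lookup`, and `store`, and the use of a plain string predicate key, are hypothetical. The first level maps node names to per-node caches through `sync.Map`, so it only changes when nodes are added or removed; the second level holds cached predicate results behind a per-node lock.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache is a hypothetical second-level cache. There is one per node,
// and its lock only guards results for that node.
type nodeCache struct {
	mu      sync.RWMutex
	results map[string]bool // predicate key -> cached fit result
}

// lookup returns the cached result for key and whether it was present.
func (n *nodeCache) lookup(key string) (fit, ok bool) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	fit, ok = n.results[key]
	return fit, ok
}

// store records a predicate result for this node only.
func (n *nodeCache) store(key string, fit bool) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.results[key] = fit
}

// Cache is a hypothetical first-level cache: node name -> *nodeCache.
// sync.Map replaces an explicit global lock around the node map.
type Cache struct {
	nodes sync.Map
}

// forNode returns the per-node cache, creating it on first use.
func (c *Cache) forNode(name string) *nodeCache {
	if v, ok := c.nodes.Load(name); ok {
		return v.(*nodeCache)
	}
	v, _ := c.nodes.LoadOrStore(name, &nodeCache{results: make(map[string]bool)})
	return v.(*nodeCache)
}

// removeNode drops everything cached for a deleted node.
func (c *Cache) removeNode(name string) { c.nodes.Delete(name) }

func main() {
	c := &Cache{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine touches a different node, so none of them
			// contends on another node's lock.
			c.forNode(fmt.Sprintf("node-%d", i)).store("PodFitsResources", i%2 == 0)
		}(i)
	}
	wg.Wait()

	fit, ok := c.forNode("node-2").lookup("PodFitsResources")
	fmt.Println(fit, ok) // prints "true true"

	c.removeNode("node-2")
	_, ok = c.forNode("node-2").lookup("PodFitsResources")
	fmt.Println(ok) // prints "false"
}
```

In this sketch the per-node lock is taken only while reading or writing that node's results, which matches the stated goal of keeping predicate evaluation off any global lock. Whether `sync.Map` actually outperforms a mutex-guarded map depends on the access pattern and core count, so it is worth benchmarking as the PR does.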

414 Commits