kubernetes

Author	SHA1	Message	Date
Kubernetes Submit Queue	bc7ccfe93b	Merge pull request #50106 from julia-stripe/improve-scheduler-error-handling Automatic merge from submit-queue Retry scheduling pods after errors more consistently in scheduler What this PR does / why we need it: This fixes 2 places in the scheduler where pods can get stuck in Pending forever. In both these places, errors happen and `sched.config.Error` is not called afterwards. This is a problem because `sched.config.Error` is responsible for requeuing pods to retry scheduling when there are issues (see `plugin/pkg/scheduler/factory/factory.go`, L958, at `2540b333b2`), so if we don't call `sched.config.Error` then the pod will never get scheduled (unless the scheduler is restarted). One of these (where it returns when `ForgetPod` fails instead of continuing and reporting an error) is a regression from [this refactor](https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19L234), and with the old behavior (`plugin/pkg/scheduler/scheduler.go`, L233-L237, at `80f26fa8a8`) the error was reported correctly. As far as I can tell, changing the error handling in that refactor wasn't intentional. When AssumePod fails there's never been an error reported, but I think adding this will help the scheduler recover when something goes wrong instead of letting pods possibly never get scheduled. This will help prevent issues like https://github.com/kubernetes/kubernetes/issues/49314 in the future. Release note: ```release-note Fix incorrect retry logic in scheduler ```	2017-08-07 01:35:17 -07:00
sakeven	e3537425e1	getHashEquivalencePod also returns if equivalence pod is found Signed-off-by: sakeven <jc5930@sina.cn>	2017-08-07 09:27:37 +08:00
Kubernetes Submit Queue	fa5877de18	Merge pull request #47408 from shiywang/follow-go-code-style Automatic merge from submit-queue (batch tested with PRs 47416, 47408, 49697, 49860, 50162) follow our go code style: error->err Fixes https://github.com/kubernetes/kubernetes/issues/50189 ```release-note NONE ```	2017-08-05 03:22:54 -07:00
Kubernetes Submit Queue	90a45b2df3	Merge pull request #49547 from k82cn/k8s_42001_0 Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547) Task 0: Added node taints labels and feature flags What this PR does / why we need it: Added node taint const for node condition. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): part of #42001 Release note: ```release-note None ```	2017-08-04 14:29:42 -07:00
Julia Evans	2d9c6dfae8	Handle errors more consistently in scheduler	2017-08-04 12:00:22 -07:00
Kubernetes Submit Queue	898b1b3330	Merge pull request #50028 from julia-stripe/fix-incorrect-scheduler-bind-call Automatic merge from submit-queue Fix incorrect call to 'bind' in scheduler I previously submitted https://github.com/kubernetes/kubernetes/pull/49661 -- I'm not sure if that PR is too big or what, but this is an attempt at a smaller PR that makes progress on the same issue and is easier to review. What this PR does / why we need it: In this refactor (https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19R223) the scheduler code was refactored into separate `bind` and `assume` functions. When that happened, `bind` was called with `pod` as an argument. The argument to `bind` should be the assumed pod, not the original pod. Evidence that `assumedPod` is the correct argument to `bind` and not `pod`: `plugin/pkg/scheduler/scheduler.go`, L229-L234, at `80f26fa8a8` (and it says `assumed` in the function signature for `bind`, even though it's not called with the assumed pod as an argument). This is an issue (and causes #49314, where pods that fail to bind to a node get stuck indefinitely) in the following scenario: 1. The pod fails to bind to the node 2. `bind` calls `ForgetPod` with the `pod` argument 3. Since `ForgetPod` is expecting the assumed pod as an argument (because that's what's in the scheduler cache), it fails with an error like `scheduler cache ForgetPod failed: pod test-677550-rc-edit-namespace/nginx-jvn09 state was assumed on a different node` 4. The pod gets lost forever because of some incomplete error handling (which I haven't addressed here in the interest of making a simpler PR) In this PR I've fixed the call to `bind` and modified the tests to make sure that `ForgetPod` gets called with the correct argument (the assumed pod) when binding fails. Which issue this PR fixes: fixes #49314 Special notes for your reviewer: Release note: ```release-note ```	2017-08-04 10:33:10 -07:00
Julia Evans	d584bf4d50	Fix incorrect call to 'bind' in scheduler	2017-08-03 13:55:00 -07:00
Harry Zhang	f8309d7598	Update generated files	2017-08-03 23:03:52 +08:00
Harry Zhang	a0787358b5	Cover get equivalence cache in core Fix testing method	2017-08-03 23:03:52 +08:00
Klaus Ma	c8ecd92269	Moved node condition check into Predicates.	2017-08-03 15:39:11 +08:00
Avesh Agarwal	0dad8dd459	Do not allow empty topology key for pod affinities.	2017-08-02 09:41:29 -04:00
supereagle	a1c880ece3	update generated deepcopy code	2017-07-31 22:33:00 +08:00
Klaus Ma	ec4aa192cc	Added taints node by condition feature flag.	2017-07-31 19:30:34 +08:00
Kubernetes Submit Queue	9350afd772	Merge pull request #48976 from supereagle/cleanup-api-package Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430) Remove duplicated import and wrong alias name of api package What this PR does / why we need it: Which issue this PR fixes: fixes #48975 Special notes for your reviewer: /assign @caesarxuchao Release note: ```release-note NONE ```	2017-07-25 12:14:38 -07:00
vikaschoudhary16	df4f4d341b	Enhance scheduler cache unit tests to cover OIR in pod spec Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>	2017-07-25 06:35:23 -04:00
Klaus Ma	c85e4dc1de	Added node taints labels.	2017-07-25 15:21:51 +08:00
Kubernetes Submit Queue	e623fed778	Merge pull request #48636 from jingxu97/July/allocatable Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494) Fix issues for local storage allocatable feature This PR fixes the following issues: 1. Use ResourceStorageScratch instead of ResourceStorage API to represent local storage capacity 2. In eviction manager, use container manager instead of node provider (kubelet) to retrieve the node capacity and reserved resources. Node provider (kubelet) has a feature gate so that storagescratch information may not be exposed if feature gate is not set. On the other hand, container manager has all the capacity and allocatable resource information. This PR fixes issue #47809	2017-07-24 19:30:33 -07:00
supereagle	adc0eef43e	remove duplicated import and wrong alias name of api package	2017-07-25 10:04:25 +08:00
Kubernetes Submit Queue	2faf7ff2bc	Merge pull request #36238 from resouer/eclass-2-dev Automatic merge from submit-queue (batch tested with PRs 48043, 48200, 49139, 36238, 49130) Implement equivalence cache by caching and re-using predicate result The last part of #30844. I opened a new PR instead of overwriting the old one because we changed some basic assumptions by allowing an equivalence cache item to be invalidated by individual predicate. The idea of this PR is based on the discussion in https://github.com/kubernetes/kubernetes/issues/32024 - [x] Pods belonging to the same controllerRef are considered equivalent - [x] `podFitsOnNode` will use the cached predicate result if it's available - [x] Equivalence cache will be updated when a fresh predicate run is done - [x] `factory.go` will invalidate specific predicate cache(s) based on the object change - [x] Since `schedule` and `bind` are async, we need to optimistically invalidate affected cache(s) before `bind` - [x] Full unit tests of affected files - [x] e2e test to verify cache update/invalidate workflow - [x] performance test results - [x] Some related nit fixes are expected to result in `needs-rebase`, so they are split into: #36060 #35968 #37512 cc @wojtek-t @davidopp	2017-07-19 01:57:32 -07:00
Kubernetes Submit Queue	2492477f0d	Merge pull request #49110 from xiangpengzhao/remove-annotation-affinity Automatic merge from submit-queue (batch tested with PRs 49055, 49128, 49132, 49134, 49110) Remove affinity annotations leftover What this PR does / why we need it: This is a further cleanup for affinity annotations, following #47869. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # ref: #47869 Special notes for your reviewer: - I remove the commented test cases and just leave TODOs instead. I think converting these untestable test cases for now is not necessary. We can add new test cases in future. - I remove the e2e test case `validates that embedding the JSON PodAffinity and PodAntiAffinity setting as a string in the annotation value work` because we have a test case `validates that InterPod Affinity and AntiAffinity is respected if matching` to test the same thing. /cc @aveshagarwal @bsalamat @gyliu513 @k82cn @timothysc Release note: ```release-note NONE ```	2017-07-18 21:54:25 -07:00
Kubernetes Submit Queue	5bbdfc6661	Merge pull request #48544 from sttts/sttts-typed-deepcopy-1.8 Automatic merge from submit-queue (batch tested with PRs 46094, 48544, 48807, 49102, 44174) Static deepcopy – phase 1 This PR is the follow-up of https://github.com/kubernetes/kubernetes/pull/36412, replacing the dynamic reflection based deepcopy with static DeepCopy+DeepCopyInto methods on API types. This PR does not yet include the code dropping the cloner from the scheme and all the porting of the calls to scheme.Copy. This will be part of a follow-up "Phase 2" PR. A couple of the commits will go in first: - [x] audit: fix deepcopy registration https://github.com/kubernetes/kubernetes/pull/48599 - [x] apimachinery+apiserver: separate test types in their own packages #48601 - [x] client-go: remove TPR example #48604 - [x] apimachinery: remove unneeded GetObjectKind() impls #48608 - [x] sanity check against origin, that OpenShift's types are fine for static deepcopy https://github.com/deads2k/origin/pull/34 TODO after review here: - [x] merge https://github.com/kubernetes/gengo/pull/32 and update vendoring commit	2017-07-18 11:20:51 -07:00
Harry Zhang	f817b8a6f6	Update generated bazel	2017-07-18 23:58:32 +08:00
Harry Zhang	0e8517875e	Update factory.go informers to update equivalence cache Fix tombstone Add e2e to verify equivalence cache Addressing nits in factory.go and e2e Update build files	2017-07-18 23:55:01 +08:00
Kubernetes Submit Queue	686e93bbf1	Merge pull request #48333 from sakeven/master Automatic merge from submit-queue (batch tested with PRs 48333, 48806, 49046) use v1.ResourcePods instead of hard coding "pods" Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: use v1.ResourcePods instead of hard coding 'pods' Special notes for your reviewer: Release note: ``` NONE ```	2017-07-18 06:24:58 -07:00
xiangpengzhao	d9d3396566	Remove affinity annotations leftover	2017-07-18 19:42:52 +08:00
Dr. Stefan Schimanski	8dd0989b39	Update generated code	2017-07-18 09:28:49 +02:00
Dr. Stefan Schimanski	39d95b9b06	deepcopy: add interface deepcopy funcs - add DeepCopyObject() to runtime.Object interface - add DeepCopyObject() via deepcopy-gen - add DeepCopyObject() manually - add DeepCopySelector() to selector interfaces - add custom DeepCopy func for TableRow.Cells	2017-07-18 09:28:47 +02:00
Kubernetes Submit Queue	9995212ed3	Merge pull request #48869 from sakeven/rm_error Automatic merge from submit-queue [Scheduler] Remove error since err is always nil Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: No need to log error since err is always nil. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # Special notes for your reviewer: Release note: ``` NONE ```	2017-07-17 23:08:39 -07:00
Jacob Simpson	29c1b81d4c	Scripted migration from clientset_generated to client-go.	2017-07-17 15:05:37 -07:00
Kubernetes Submit Queue	a9afb931d4	Merge pull request #48805 from sakeven/use_const Automatic merge from submit-queue (batch tested with PRs 48262, 48805) [Scheduler] Use const value maxPriority instead of immediate value 10 Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: Use const value maxPriority instead of immediate value 10. Special notes for your reviewer: Release note: ``` NONE ```	2017-07-17 04:40:53 -07:00
Kubernetes Submit Queue	4f6af5faa4	Merge pull request #48451 from sakeven/fix/ForgetPod_first_after_bind_failed Automatic merge from submit-queue forget pod first after binding failed Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: In the scheduler cache implementation, `FinishBinding` marks the pod as expired, and the pod is then cleaned up after ttl seconds. `ForgetPod`, meanwhile, checks whether the pod is assumed and reports an error if it is not. So if binding fails and the ttl (now 30s) is too short, `ForgetPod` returns an error and we won't record the `BindingRejected` event. Although it's rare, we shouldn't depend on the value of ttl. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # Special notes for your reviewer: Release note: ``` NONE ```	2017-07-17 03:27:41 -07:00
sakeven	e9aee2b249	forget pod first after bind failed Signed-off-by: sakeven <jc5930@sina.cn>	2017-07-17 16:46:49 +08:00
Kubernetes Submit Queue	94bca5ffef	Merge pull request #47309 from xiang90/util Automatic merge from submit-queue (batch tested with PRs 47309, 47187) scheduler/util: remove bad print format Fix https://github.com/kubernetes/kubernetes/issues/18834	2017-07-16 20:00:54 -07:00
sakeven	6aeb77aa6a	Use const value maxPriority instead of immediate value 10 Signed-off-by: sakeven <jc5930@sina.cn>	2017-07-17 10:33:44 +08:00
Kubernetes Submit Queue	0c74c36b70	Merge pull request #46930 from k82cn/sched_integ_test Automatic merge from submit-queue (batch tested with PRs 47417, 47638, 46930) Added scheduler integration test owners. What this PR does / why we need it: Add OWNER file into scheduler integration test. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # N/A Release note: ```release-note-none ```	2017-07-16 16:33:05 -07:00
Kubernetes Submit Queue	b039c6e185	Merge pull request #47106 from gyliu513/ecache-test Automatic merge from submit-queue Improved code coverage for equivalence cache. What this PR does / why we need it: Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # Special notes for your reviewer: Release note: ```release-note none ```	2017-07-15 01:05:44 -07:00
Jing Xu	bb1920edcc	Fix issues for local storage allocatable feature This PR fixes the following issues: 1. Use ResourceStorageScratch instead of ResourceStorage API to represent local storage capacity 2. In eviction manager, use container manager instead of node provider (kubelet) to retrieve the node capacity and reserved resources. Node provider (kubelet) has a feature gate so that storagescratch information may not be exposed if feature gate is not set. On the other hand, container manager has all the capacity and allocatable resource information.	2017-07-13 12:06:19 -07:00
sakeven	d9c65bce5c	use v1.ResourcePods instead of hard coding 'pods' Signed-off-by: sakeven <jc5930@sina.cn>	2017-07-13 18:20:47 +08:00
sakeven	5435268e06	remove error since err is always nil Signed-off-by: sakeven <jc5930@sina.cn>	2017-07-13 17:45:14 +08:00
Kubernetes Submit Queue	eb196f8c9b	Merge pull request #48405 from k82cn/k8s_44188_1 Automatic merge from submit-queue (batch tested with PRs 48405, 48742, 48748, 48571, 48482) Removed scheduler dependencies to testapi. What this PR does / why we need it: When refactoring the scheduler to use client-go and k8s.io/api, we also need to remove the dependency on testapi. I prefer to include only import/BUILD changes for #44188, so I created a separate PR for other enhancement removal. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): partially fixes #44188 Release note: ```release-note-none ```	2017-07-12 08:05:13 -07:00
Kubernetes Submit Queue	b8f1bb4105	Merge pull request #48614 from xing-yang/function_name Automatic merge from submit-queue (batch tested with PRs 46865, 48661, 48598, 48658, 48614) Fix function names in the comments This patch fixes function and type names in the comments in predicates.go. What this PR does / why we need it: It fixes function and type names in the comments in predicates.go. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes # This does not have an issue # because it is a trivial fix. Special notes for your reviewer: Release note: ```release-note ```	2017-07-12 03:02:22 -07:00
Kubernetes Submit Queue	33718a8fae	Merge pull request #48335 from sakeven/fix/close_resp_Body Automatic merge from submit-queue (batch tested with PRs 48402, 47203, 47460, 48335, 48322) HTTPExtender: should close resp.Body even when StatusCode not ok Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: close resp.Body even when StatusCode isn't ok Special notes for your reviewer: Release note: ``` NONE ```	2017-07-11 21:01:37 -07:00
Xing Yang	e94e50c999	Fix function and type names in the comments This patch fixes function and type names in the comments in predicates.go.	2017-07-10 04:59:58 -07:00
Cao Shufeng	0c577c47d5	Use glog.f when a format string is passed ref: https://godoc.org/github.com/golang/glog I use the following commands to search all the invalid usage: $ grep "glog.Warning(" -r | grep % $ grep "glog.Info(" * -r | grep % $ grep "glog.Error(" * -r | grep % $ grep ").Info(" * -r | grep % | grep "glog.V("	2017-07-10 19:04:03 +08:00
Guangya Liu	cc719382ab	Commit-1: Improved code coverage for equivalence cache. Improved coverage for functions: 1) PredicateWithECache 2) UpdateCachedPredicateItem	2017-07-09 19:08:04 +08:00
Kubernetes Submit Queue	093dd52db2	Merge pull request #48337 from sakeven/fix/validation_test Automatic merge from submit-queue scheduler: fix validation test Signed-off-by: sakeven <jc5930@sina.cn> What this PR does / why we need it: Without setting `Weight`, `ValidatePolicy` will report ``` Priority for extender http://127.0.0.1:8081/extender should have a positive weight applied to it ``` Besides, testing ValidatePolicy with ```if ValidatePolicy(extenderPolicy) == nil``` doesn't seem like a good approach, because we can't determine the specific reason that causes an error. Special notes for your reviewer: Release note: ``` NONE ```	2017-07-07 22:38:28 -07:00
Shiyang Wang	9a96ff94af	follow our go code style: error->err	2017-07-07 09:34:38 +08:00
Kubernetes Submit Queue	e773c88b0a	Merge pull request #48399 from k82cn/ordered_pkgs Automatic merge from submit-queue (batch tested with PRs 48399, 48450, 48144) Group and order imported packages. Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #N/A Release note: ```release-note-none ```	2017-07-05 08:58:36 -07:00
sakeven	86c453a192	scheduler: fix validation test Signed-off-by: sakeven <jc5930@sina.cn>	2017-07-05 14:36:53 +08:00
Kubernetes Submit Queue	2f1ea7efcf	Merge pull request #47515 from zhangxiaoyu-zidif/replace-scheduler-havesame Automatic merge from submit-queue (batch tested with PRs 47043, 48448, 47515, 48446) Refactor slice intersection What this PR does / why we need it: In worst case, the original method is O(N^2), while current method is 3 * O(N). I think it is better. Release note: ```release-note NONE ```	2017-07-04 09:12:26 -07:00
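
The slice intersection refactor in 2f1ea7efcf is described as going from O(N^2) in the worst case to linear time. A minimal sketch of that idea follows, assuming string slices and hypothetical helper names (`haveSameQuadratic`, `haveSameLinear`); the actual function in the scheduler may be shaped differently.

```go
package main

import "fmt"

// haveSameQuadratic reports whether a and b share at least one element
// by comparing every pair: O(len(a) * len(b)) in the worst case.
// Hypothetical name, for illustration only.
func haveSameQuadratic(a, b []string) bool {
	for _, x := range a {
		for _, y := range b {
			if x == y {
				return true
			}
		}
	}
	return false
}

// haveSameLinear builds a set from a and then probes it with each
// element of b: O(len(a) + len(b)) on average.
// Hypothetical name, for illustration only.
func haveSameLinear(a, b []string) bool {
	seen := make(map[string]struct{}, len(a))
	for _, x := range a {
		seen[x] = struct{}{}
	}
	for _, y := range b {
		if _, ok := seen[y]; ok {
			return true
		}
	}
	return false
}

func main() {
	a := []string{"zone-a", "zone-b"}
	b := []string{"zone-c", "zone-b"}
	fmt.Println(haveSameQuadratic(a, b), haveSameLinear(a, b))
}
```

The map-based version trades a small allocation for the lower time bound, which only pays off once the slices are large enough for the nested loop to matter.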


988 Commits
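
The scheduler fixes in bc7ccfe93b and 898b1b3330 describe one pattern: forget the assumed pod (not the original one) when binding fails, and always call the error handler so the pod is requeued. The sketch below illustrates that flow under stated assumptions. Every type and name in it (`pod`, `cache`, `scheduler`, `onError`, `runDemo`) is a hypothetical stand-in, not the real scheduler API.

```go
package main

import (
	"errors"
	"fmt"
)

// pod is a hypothetical stand-in for the real pod type.
type pod struct {
	name string
	node string // empty until a node has been assumed
}

// cache records which node each pod was assumed onto.
type cache struct {
	assumed map[string]string
}

func (c *cache) AssumePod(p *pod) error {
	if _, ok := c.assumed[p.name]; ok {
		return fmt.Errorf("pod %s is already assumed", p.name)
	}
	c.assumed[p.name] = p.node
	return nil
}

// ForgetPod succeeds only for the assumed copy of the pod (same node),
// which is why passing the original pod fails.
func (c *cache) ForgetPod(p *pod) error {
	if node, ok := c.assumed[p.name]; !ok || node != p.node {
		return fmt.Errorf("pod %s state was assumed on a different node", p.name)
	}
	delete(c.assumed, p.name)
	return nil
}

type scheduler struct {
	cache   *cache
	bind    func(assumed *pod) error
	onError func(p *pod, err error) // expected to requeue the pod
}

func (s *scheduler) scheduleOne(p *pod, host string) {
	assumed := *p
	assumed.node = host
	if err := s.cache.AssumePod(&assumed); err != nil {
		s.onError(p, err) // report the failure so the pod is retried
		return
	}
	if err := s.bind(&assumed); err != nil {
		// Forget the assumed pod; keep going even if that fails.
		if ferr := s.cache.ForgetPod(&assumed); ferr != nil {
			fmt.Println("forget failed:", ferr)
		}
		s.onError(p, err)
	}
}

// runDemo schedules one pod with a bind function that always fails.
func runDemo() (requeued, stillAssumed int) {
	c := &cache{assumed: map[string]string{}}
	s := &scheduler{
		cache:   c,
		bind:    func(*pod) error { return errors.New("binding rejected") },
		onError: func(*pod, error) { requeued++ },
	}
	s.scheduleOne(&pod{name: "nginx"}, "node-1")
	return requeued, len(c.assumed)
}

func main() {
	requeued, stillAssumed := runDemo()
	fmt.Println(requeued, stillAssumed)
}
```

With the failing bind in `runDemo`, the error handler runs once and the cache ends up empty. Passing the original `p` to `ForgetPod` instead would hit the "assumed on a different node" error described in 898b1b3330.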