Commit Graph

148 Commits

Author SHA1 Message Date
wojtekt
7b6bcdf780 Autogenerated code 2019-10-24 20:21:00 +02:00
draveness
1163a1d51e feat: update taint nodes by condition to GA 2019-10-19 09:17:41 +08:00
wojtekt
cf9203501e Swtich nodelifecyclecontroller to coordination/v1 2019-10-16 10:59:02 +02:00
Krzysztof Siedlecki
b1dfa83be6 using pod pointers in node lifecycle controller 2019-10-14 12:44:43 +02:00
Yassine TIJANI
c1487840bc move util/metrics to component-base
Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
2019-10-08 14:42:31 +02:00
Krzysztof Siedlecki
a07a3a6878 adding pods to MarkPodsNotReady parameters 2019-10-02 14:22:58 +02:00
Krzysztof Siedlecki
8f48896709 adding pods to DeletePods parameters 2019-10-02 13:11:23 +02:00
Krzysztof Siedlecki
99eeab35a3 adding fakeGetPodsAssignedToNode 2019-09-30 11:03:36 +02:00
Krzysztof Siedlecki
2f3e8463ef eviction processing refactor 2019-09-19 17:15:32 +02:00
Krzysztof Siedlecki
029b72b553 adding lock to node data map 2019-09-12 10:23:24 +02:00
Kubernetes Prow Robot
d6bc4eb853
Merge pull request #81624 from logicalhan/cm-migration
migrate controller-manager metrics to stability framework
2019-08-29 05:30:09 -07:00
Han Kang
59db3ac27e migrate controller-manager metrics to stability framework 2019-08-28 12:26:57 -07:00
Clayton Coleman
2888e6e923
Node lifecycle controller should use a label for excluding nodes
The current mechanism for excluding "master" nodes based on node names
is fragile and should be fixed by using a label exclusion similar to
service load balancers. The legacy code path is preserved behind a
defaulted-on gate and will be removed in the future.
2019-08-28 10:29:08 -04:00
Kubernetes Prow Robot
927f45191e
Merge pull request #81527 from yastij/move-controller-util
move WaitForCacheSync to the sharedInformer package
2019-08-27 00:52:54 -07:00
Krzysztof Siedlecki
07b039862b moving podInformer to node controller scope 2019-08-23 15:59:59 +02:00
Yassine TIJANI
7e4c3096fe move WaitForCacheSync to the sharedInformer package
Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
2019-08-22 16:13:41 +01:00
Krzysztof Siedlecki
6842e11f7e removing redundant code 2019-08-21 17:22:06 +02:00
Ted Yu
c811b2267f Simplify checking in getMinTolerationTime 2019-08-05 10:45:05 -07:00
Maciej Borsz
98e9c78c13 Migrate TaintManager to use watch for listing pods instead of expensive
listing pods call.
2019-06-26 13:35:59 +02:00
qingsenLi
5c23e393cf duplicated klog.V(5) when had a if klog.V(5) 2019-05-24 00:49:38 +08:00
Andrew Kim
c919139245 update import of generic featuregate code from k8s.io/apiserver/pkg/util/feature -> k8s.io/component-base/featuregate 2019-05-08 10:01:50 -04:00
Guangwen Feng
5714041f54 Fix comments about node health monitor
Signed-off-by: Guangwen Feng <fenggw-fnst@cn.fujitsu.com>
2019-03-27 12:18:23 +08:00
Kubernetes Prow Robot
4499275cb9
Merge pull request #72800 from stewart-yu/stewart-component-base
Move config local to every controller in KCM
2019-03-21 19:26:19 -07:00
WanLinghao
686b526de4 remove unused functions in pkg/controller/nodelifecycle/node_lifecycle_controller.go 2019-03-12 11:01:47 +08:00
Yu-Ju Hong
bd2301a628 nodelifecycle controller: reconcile node OS/arch labels 2019-03-06 17:26:23 -08:00
stewart-yu
ecbd5427e7 auto-generated file 2019-03-02 12:55:26 +08:00
stewart-yu
e01ff1641c move config local to every controllers in kube-controller-manager 2019-03-02 12:54:33 +08:00
Daniel (Shijun) Qian
d648ba856b Move pkg/api/v1/node to pkg/util/node (#73656)
* merge pkg/api/v1/node with pkg/util/node

* update test case for utilnode

* remove package pkg/api/v1/node

* move isNodeReady to internal func

* Split GetNodeCondition into e2e and controller pkg

* fix import errors
2019-02-26 11:05:32 -08:00
Kubernetes Prow Robot
dc634b3fd3
Merge pull request #70934 from mikeweiwei/my_fix3
Fix typos
2019-02-20 12:53:52 -08:00
Kubernetes Prow Robot
cbcc2b26fe
Merge pull request #72904 from ping035627/k8s-190115
Correct the count of evictionsNumber
2019-02-18 18:40:09 -08:00
Kubernetes Prow Robot
808f2cf0ef
Merge pull request #72525 from justinsb/owners_should_not_be_executable
Remove executable file permission from OWNERS files
2019-02-14 23:55:45 -08:00
PingWang
1ff8fac138 Correct the count of evictionsNumber
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

check nodeutil.SwapNodeControllerTaint's return

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2019-02-11 10:15:10 +08:00
Kubernetes Prow Robot
b50c643be0
Merge pull request #73540 from rlenferink/patch-5
Updated OWNERS files to include link to docs
2019-02-08 09:05:56 -08:00
Davanum Srinivas
b975573385
move pkg/kubelet/apis/well_known_labels.go to staging/src/k8s.io/api/core/v1/
Co-Authored-By: Weibin Lin <linweibin1@huawei.com>

Change-Id: I163b2f2833e6b8767f72e2c815dcacd0f4e504ea
2019-02-05 13:39:07 -05:00
Roy Lenferink
b43c04452f Updated OWNERS files to include link to docs 2019-02-04 22:33:12 +01:00
Shiv Nagarajan
36ee154243 remove deprecated taints from 1.9 2019-01-16 21:20:57 -05:00
Justin SB
dd19b923b7
Remove executable file permission from OWNERS files 2019-01-11 16:42:59 -08:00
Kubernetes Prow Robot
c5616157a0
Merge pull request #71668 from mtaufen/node-lifecycle-metrics
export metrics from node lifecycle controller workqueues
2019-01-07 20:24:12 -08:00
Michael Taufen
0ab928c9d6 export metrics from node lifecycle controller workqueues 2019-01-07 15:27:35 -08:00
Kubernetes Prow Robot
0d63cf9caa
Merge pull request #67037 from Huang-Wei/cleanup-ood
cleanup logic related with OutOfDisk
2018-12-20 17:30:27 -08:00
Jordan Liggitt
0ff455e340 generated files 2018-12-19 11:19:12 -05:00
Weibin Lin
842bd1e1ec update deployment, daemonset, replicaset, statefulset to apps/v1 2018-12-19 10:46:45 -05:00
Wei Huang
8f87e71e0c
cleanup logic related with OutOfDisk
- cleanup OOD logic in scheduling and node controller
- update comments and testcases
2018-12-18 11:28:02 -08:00
andrewsykim
5329f09663 consolidate node deletion logic between node lifecycle and cloud node controller 2018-12-03 13:33:53 -05:00
mikeweiwei
31bbc17b2a Fix typos 2018-11-12 09:43:53 +08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
yameiwang
de36fc8f4b remove kubeClient != nil when using glog.Fatalf 2018-10-24 21:48:06 +08:00
yameiwang
f4713d43c3 nodeController should send events to api server when nodeController eviction happens 2018-10-24 00:42:38 +08:00
k8s-ci-robot
d425258532
Merge pull request #69788 from ravisantoshgudimetla/taint-based-eviction
Add test cases for taintbasedevictions
2018-10-18 06:34:31 -07:00
k8s-ci-robot
aad6437aa9
Merge pull request #64061 from wgliang/master.remove-unused-code-pkg-controller
remove unused code of (pkg/controller)
2018-10-17 19:54:05 -07:00
k8s-ci-robot
6f4b768c94
Merge pull request #65350 from liggitt/simplify-taint-manager-key
Simplify taint manager workqueue keys
2018-10-17 18:39:03 -07:00
Zhen Wang
7bb61c566d Put node lease lister behind feature gate 2018-10-17 09:41:30 -07:00
Jordan Liggitt
9503c64f27 Simplify taint manager workqueue keys 2018-10-17 10:47:14 -04:00
ravisantoshgudimetla
d281d566b3 Add test cases for taintbasedevictions 2018-10-16 18:56:45 -04:00
tanshanshan
b7c7966b9f Move pkg/scheduler/algorithm/well_known_labels.go out 2018-10-13 09:10:00 +08:00
Zhen Wang
e35d808aa2 NodeLifecycleController treats node lease renewal as a heartbeat signal 2018-10-11 16:07:15 -07:00
Guoliang Wang
b1ac6df4dc remove unused code of (pkg/controller) 2018-10-09 08:15:30 +08:00
Walter Fender
f3f46d5f5a Moving the cloudprovider interface to staging.
Individual implementations are not yet being moved.
Fixed all dependencies which call the interface.
Fixed golint exceptions to reflect the move.
Added project info as per @dims and
https://github.com/kubernetes/kubernetes-template-project.
Added dims to the security contacts.
Fixed minor issues.
Added missing template files.
Copied ControllerClientBuilder interface to cp.
This allows us to break the only dependency on K8s/K8s.
Added TODO to ControllerClientBuilder.
Fixed GoDeps.
Factored in feedback from JustinSB.
2018-10-04 14:41:20 -07:00
Zhen Wang
88e7e186f0 Rename node status to node health in NodeLifecycleController
Since we are going to treat both node status and node lease as node
heartbeat/health signals, this PR makes the renmae changes, so that the
follow-up PRs are easier to review.
2018-10-01 23:19:50 -07:00
k8s-ci-robot
d1e24acee7
Merge pull request #68644 from Pingan2017/nodecondition
NodePIDPressure condition should set to unknown when node lost connne…
2018-09-25 18:12:44 -07:00
Pingan2017
3b19c33be5 NodePIDPressure condition should set to unknown when node lost connnection with contorl 2018-09-14 08:50:05 +08:00
Da K. Ma
97ba8b477a Using node name to improve node controller performance.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-09-11 21:43:19 +08:00
Klaus Ma
85a19b109a Taint node in paralle.
Signed-off-by: Klaus Ma <klaus1982.cn@gmail.com>
2018-09-01 09:57:02 +08:00
Wei Huang
7c024273a4
fix an issue that scheduling doesn't respect NodeLost status of a node
- if Node is in UnknowStatus, apply unreachable taint with NoSchedule effect
- some internal data structure refactoring
- update unit test
2018-08-27 11:46:15 -07:00
Jordan Liggitt
52abbeffe6
Fix out of bounds error on non-64-bit machines 2018-06-28 16:29:52 -04:00
Da K. Ma
66d558dfd3 Removed unused vars.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-06-24 10:48:49 +08:00
Jeff Grafton
23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jordan Liggitt
8448ee4e12
Remove item from taint manager workqueue on completion 2018-06-21 17:11:41 -04:00
Kubernetes Submit Queue
06ea14a5d6
Merge pull request #63471 from ceshihao/taint_behavior_consistent
Automatic merge from submit-queue (batch tested with PRs 65032, 63471, 64104, 64672, 64427). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make taint behavior consistent for NoSchedule

**What this PR does / why we need it**:
Make taint behavior consistent.
If `TaintNodesByCondition ` is enable, taint node with `NotReady:NoSchedule`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63420

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```
2018-06-20 04:23:13 -07:00
Da K. Ma
254900832b Volunteer to maintain nodelifecycle 2018-06-08 10:17:08 +08:00
wojtekt
f7cf33e218 Parallelize taint manager 2018-05-30 14:46:48 +02:00
Maciej Szulik
a3dd7ca9ee
increase timeout in TestCancelAndReadd
the flakes referenced in #51704 were still seen downstream. the current timeout approach is known to be faulty, but fixing the tests has not been prioritized. this increases the timeout sufficiently to avoid flakes in the meantime
2018-05-17 10:26:17 -04:00
ceshihao
4eb72d7bcd simplify code and add unit test for NotReady taint 2018-05-14 06:55:42 +00:00
ceshihao
842ae0bc22 Make taint behavior consistent, taint node with NotReady:NoSchedule 2018-05-09 13:36:05 +00:00
Jesse Haka
de967b717d PR #59323, fix bug and remove one api call, add node util dependency to cloud controller 2018-04-22 20:32:26 +03:00
Mike Danese
f427531179 boring 2018-04-18 09:55:57 -07:00
Mikhail Mazurskiy
468655b76a
Use typed events client directly 2018-04-01 18:57:29 +10:00
Kubernetes Submit Queue
8c00efe653
Merge pull request #60831 from resouer/fix-race
Automatic merge from submit-queue (batch tested with PRs 60574, 60666, 60831, 60877, 60357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix data race in node lifecycle controller

**What this PR does / why we need it**:
Encountered this bug during fixing: https://github.com/kubernetes/kubernetes/pull/60753

There's a data race for `zoneNoExecuteTainter `.

```
--- PASS: TestTaintNodeByCondition (5.72s)
PASS
==================
WARNING: DATA RACE
Write at 0x00c421a8d2f0 by goroutine 1472:
  runtime.mapassign_faststr()
      /usr/local/go/src/runtime/hashmap_fast.go:598 +0x0
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).addPodEvictorForNewZone()
      /root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/controller/nodelifecycle/node_lifecycle_controller.go:1053 +0x37d
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).monitorNodeStatus()
      

Previous read at 0x00c421a8d2f0 by goroutine 1471:
  runtime.mapiterinit()
      /usr/local/go/src/runtime/hashmap.go:709 +0x0
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).doNoExecuteTaintingPass()
      /root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/controller/nodelifecycle/node_lifecycle_controller.go:459 +0xec
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).(k8s.io/kubernetes/pkg/controller/nodelifecycle.doNoExecuteTaintingPass)-fm()
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix data race in node lifecycle controller
```
2018-03-20 08:34:40 -07:00
Kubernetes Submit Queue
1be8e8bb59
Merge pull request #60562 from Pingan2017/cleanupnode
Automatic merge from submit-queue (batch tested with PRs 60457, 60331, 54970, 58731, 60562). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

clean up unused const in node_lifecycle_controller.go

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-19 23:42:22 -07:00
Da K. Ma
b23db30765 Added unscheduable taint.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-03-16 09:13:08 +08:00
Harry Zhang
da29bd2cbe Fix data race in node lifecycle controller 2018-03-06 00:18:11 -08:00
Pingan2017
822d21f88a clean up unused const in node_lifecycle_controller.go 2018-02-28 15:34:47 +08:00
Kubernetes Submit Queue
5b98dbcfe5
Merge pull request #60008 from k82cn/k8s_54313_2
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Taint node when it under PID pressure.

Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:
```release-note
If TaintNodesByCondition enabled, taint node when it under PID pressure 
```
2018-02-20 03:13:28 -08:00
Kubernetes Submit Queue
96ec318718
Merge pull request #59842 from ixdy/update-rules_go-02-2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

 Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies

**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.

**Release note**:

```release-note
NONE
```
2018-02-19 22:23:05 -08:00
Da K. Ma
6bda1bec6e Taint node when it under PID pressure.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-02-17 10:55:29 +08:00
Jeff Grafton
ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Aleksandra Malinowska
2d54ba3e0f
Revert "add node shutdown taint" 2018-02-16 12:24:27 +01:00
Kubernetes Submit Queue
27daaab224
Merge pull request #59323 from zetaab/nodetaint
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add node shutdown taint

**What this PR does / why we need it**: we need node stopped taint in order to detach volumes immediately without waiting timeout. More info in issue ticket #58635 

**Which issue(s) this PR fixes** 
Fixes #58635

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-02-15 09:52:10 -08:00
Di Xu
48388fec7e fix all the typos across the project 2018-02-11 11:04:14 +08:00
Jesse Haka
6665fa7144 taint also node controller
fix function

fix gofmt

fix function return value

fix tests

skip notimplemented error

remove factory unused

in openstack we should try to find instanceid from all states instead of ACTIVE, all other cloudproviders do this already

fix tests and lint

fix gofmt

fix nodelifecycletest

fix lint errors
2018-02-10 15:41:24 +02:00
Kubernetes Submit Queue
0656d030a7
Merge pull request #38320 from liggitt/golang-ratelimit
Automatic merge from submit-queue (batch tested with PRs 59158, 38320, 59059, 55516, 59357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Switch from juju/ratelimit to golang.org/x/time/rate

Replaces juju/ratelimit with golang.org/x/time/rate
xref https://github.com/kubernetes/steering/issues/21

Requires removing the Saturation() method on the rate limiter. In the process of attempting to contribute it to the `golang.org/x/time/rate` implementation, it became clear that what it was calculating was not very useful when combined with periodic polling. See discussion in https://go-review.googlesource.com/c/time/+/29958#message-4caffc11669cadd90e2da4c05122cfec50ea6a22

```release-note
NONE
```
2018-02-05 12:40:34 -08:00
Seth Jennings
e994ce1f7d nodelifecycle: set OutOfDisk unknown on node timeout 2018-02-02 14:15:36 -06:00
Jordan Liggitt
a9ed90f227
Remove Saturation() from rate limiter interface 2018-01-19 11:48:51 -05:00
Kubernetes Submit Queue
bd4d511a40
Merge pull request #57852 from misterikkit/moveScheduler
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move scheduler out of plugin directory

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
This is but one step toward resolving the referenced issue.
/ref #57579

**Special notes for your reviewer**:

**Release note**:

```release-note
Default scheduler code is moved out of the plugin directory.
plugin/pkg/scheduler -> pkg/scheduler
plugin/cmd/kube-scheduler -> cmd/kube-scheduler
```
/sig scheduling
2018-01-05 22:20:13 -08:00
Jonathan Basseri
85c5862552 Fix scheduler refs in BUILD files.
Update references to moved scheduler code.
2018-01-05 15:05:01 -08:00
Jonathan Basseri
30b89d830b Move scheduler code out of plugin directory.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.

Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
2018-01-05 15:05:01 -08:00
Marek Grabowski
cd7e578489 Re-add nodecontroller OWNERS file 2018-01-05 16:08:30 +00:00
Walter Fender
9187b343e1 Split the NodeController into lifecycle and ipam pieces.
Prepatory work fpr removing cloud provider dependency from node
controller running in Kube Controller Manager. Splitting the node
controller into its two major pieces life-cycle and CIDR/IP
management. Both pieces currently need the the cloud system to do their work.
Removing lifecycles dependency on cloud will be fixed ina followup PR.

Moved node scheduler code to live with node lifecycle controller.
Got the IPAM/Lifecycle split completed. Still need to rename pieces.
Made changes to the utils and tests so they would be in the appropriate
package.
Moved the node based ipam code to nodeipam.
Made the relevant tests pass.
Moved common node controller util code to nodeutil.
Removed unneeded pod informer sync from node ipam controller.
Fixed linter issues.
Factored in  feedback from @gmarek.
Factored in feedback from @mtaufen.
Undoing unneeded change.
2018-01-04 12:48:08 -08:00