Commit Graph

44 Commits

Author SHA1 Message Date
Yu-Ju Hong
bd2301a628 nodelifecycle controller: reconcile node OS/arch labels 2019-03-06 17:26:23 -08:00
Daniel (Shijun) Qian
d648ba856b Move pkg/api/v1/node to pkg/util/node (#73656)
* merge pkg/api/v1/node with pkg/util/node

* update test case for utilnode

* remove package pkg/api/v1/node

* move isNodeReady to internal func

* Split GetNodeCondition into e2e and controller pkg

* fix import errors
2019-02-26 11:05:32 -08:00
Kubernetes Prow Robot
dc634b3fd3
Merge pull request #70934 from mikeweiwei/my_fix3
Fix typos
2019-02-20 12:53:52 -08:00
Kubernetes Prow Robot
cbcc2b26fe
Merge pull request #72904 from ping035627/k8s-190115
Correct the count of evictionsNumber
2019-02-18 18:40:09 -08:00
PingWang
1ff8fac138 Correct the count of evictionsNumber
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

check nodeutil.SwapNodeControllerTaint's return

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2019-02-11 10:15:10 +08:00
Shiv Nagarajan
36ee154243 remove deprecated taints from 1.9 2019-01-16 21:20:57 -05:00
Kubernetes Prow Robot
c5616157a0
Merge pull request #71668 from mtaufen/node-lifecycle-metrics
export metrics from node lifecycle controller workqueues
2019-01-07 20:24:12 -08:00
Michael Taufen
0ab928c9d6 export metrics from node lifecycle controller workqueues 2019-01-07 15:27:35 -08:00
Kubernetes Prow Robot
0d63cf9caa
Merge pull request #67037 from Huang-Wei/cleanup-ood
cleanup logic related with OutOfDisk
2018-12-20 17:30:27 -08:00
Weibin Lin
842bd1e1ec update deployment, daemonset, replicaset, statefulset to apps/v1 2018-12-19 10:46:45 -05:00
Wei Huang
8f87e71e0c
cleanup logic related with OutOfDisk
- cleanup OOD logic in scheduling and node controller
- update comments and testcases
2018-12-18 11:28:02 -08:00
andrewsykim
5329f09663 consolidate node deletion logic between node lifecycle and cloud node controller 2018-12-03 13:33:53 -05:00
mikeweiwei
31bbc17b2a Fix typos 2018-11-12 09:43:53 +08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
yameiwang
de36fc8f4b remove kubeClient != nil when using glog.Fatalf 2018-10-24 21:48:06 +08:00
yameiwang
f4713d43c3 nodeController should send events to api server when nodeController eviction happens 2018-10-24 00:42:38 +08:00
k8s-ci-robot
6f4b768c94
Merge pull request #65350 from liggitt/simplify-taint-manager-key
Simplify taint manager workqueue keys
2018-10-17 18:39:03 -07:00
Zhen Wang
7bb61c566d Put node lease lister behind feature gate 2018-10-17 09:41:30 -07:00
Jordan Liggitt
9503c64f27 Simplify taint manager workqueue keys 2018-10-17 10:47:14 -04:00
tanshanshan
b7c7966b9f Move pkg/scheduler/algorithm/well_known_labels.go out 2018-10-13 09:10:00 +08:00
Zhen Wang
e35d808aa2 NodeLifecycleController treats node lease renewal as a heartbeat signal 2018-10-11 16:07:15 -07:00
Walter Fender
f3f46d5f5a Moving the cloudprovider interface to staging.
Individual implementations are not yet being moved.
Fixed all dependencies which call the interface.
Fixed golint exceptions to reflect the move.
Added project info as per @dims and
https://github.com/kubernetes/kubernetes-template-project.
Added dims to the security contacts.
Fixed minor issues.
Added missing template files.
Copied ControllerClientBuilder interface to cp.
This allows us to break the only dependency on K8s/K8s.
Added TODO to ControllerClientBuilder.
Fixed GoDeps.
Factored in feedback from JustinSB.
2018-10-04 14:41:20 -07:00
Zhen Wang
88e7e186f0 Rename node status to node health in NodeLifecycleController
Since we are going to treat both node status and node lease as node
heartbeat/health signals, this PR makes the renmae changes, so that the
follow-up PRs are easier to review.
2018-10-01 23:19:50 -07:00
k8s-ci-robot
d1e24acee7
Merge pull request #68644 from Pingan2017/nodecondition
NodePIDPressure condition should set to unknown when node lost connne…
2018-09-25 18:12:44 -07:00
Pingan2017
3b19c33be5 NodePIDPressure condition should set to unknown when node lost connnection with contorl 2018-09-14 08:50:05 +08:00
Da K. Ma
97ba8b477a Using node name to improve node controller performance.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-09-11 21:43:19 +08:00
Klaus Ma
85a19b109a Taint node in paralle.
Signed-off-by: Klaus Ma <klaus1982.cn@gmail.com>
2018-09-01 09:57:02 +08:00
Wei Huang
7c024273a4
fix an issue that scheduling doesn't respect NodeLost status of a node
- if Node is in UnknowStatus, apply unreachable taint with NoSchedule effect
- some internal data structure refactoring
- update unit test
2018-08-27 11:46:15 -07:00
Da K. Ma
66d558dfd3 Removed unused vars.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-06-24 10:48:49 +08:00
ceshihao
4eb72d7bcd simplify code and add unit test for NotReady taint 2018-05-14 06:55:42 +00:00
ceshihao
842ae0bc22 Make taint behavior consistent, taint node with NotReady:NoSchedule 2018-05-09 13:36:05 +00:00
Jesse Haka
de967b717d PR #59323, fix bug and remove one api call, add node util dependency to cloud controller 2018-04-22 20:32:26 +03:00
Kubernetes Submit Queue
8c00efe653
Merge pull request #60831 from resouer/fix-race
Automatic merge from submit-queue (batch tested with PRs 60574, 60666, 60831, 60877, 60357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix data race in node lifecycle controller

**What this PR does / why we need it**:
Encountered this bug during fixing: https://github.com/kubernetes/kubernetes/pull/60753

There's a data race for `zoneNoExecuteTainter `.

```
--- PASS: TestTaintNodeByCondition (5.72s)
PASS
==================
WARNING: DATA RACE
Write at 0x00c421a8d2f0 by goroutine 1472:
  runtime.mapassign_faststr()
      /usr/local/go/src/runtime/hashmap_fast.go:598 +0x0
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).addPodEvictorForNewZone()
      /root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/controller/nodelifecycle/node_lifecycle_controller.go:1053 +0x37d
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).monitorNodeStatus()
      

Previous read at 0x00c421a8d2f0 by goroutine 1471:
  runtime.mapiterinit()
      /usr/local/go/src/runtime/hashmap.go:709 +0x0
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).doNoExecuteTaintingPass()
      /root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/controller/nodelifecycle/node_lifecycle_controller.go:459 +0xec
  k8s.io/kubernetes/pkg/controller/nodelifecycle.(*Controller).(k8s.io/kubernetes/pkg/controller/nodelifecycle.doNoExecuteTaintingPass)-fm()
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix data race in node lifecycle controller
```
2018-03-20 08:34:40 -07:00
Kubernetes Submit Queue
1be8e8bb59
Merge pull request #60562 from Pingan2017/cleanupnode
Automatic merge from submit-queue (batch tested with PRs 60457, 60331, 54970, 58731, 60562). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

clean up unused const in node_lifecycle_controller.go

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-19 23:42:22 -07:00
Da K. Ma
b23db30765 Added unscheduable taint.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
2018-03-16 09:13:08 +08:00
Harry Zhang
da29bd2cbe Fix data race in node lifecycle controller 2018-03-06 00:18:11 -08:00
Pingan2017
822d21f88a clean up unused const in node_lifecycle_controller.go 2018-02-28 15:34:47 +08:00
Kubernetes Submit Queue
5b98dbcfe5
Merge pull request #60008 from k82cn/k8s_54313_2
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Taint node when it under PID pressure.

Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:
```release-note
If TaintNodesByCondition enabled, taint node when it under PID pressure 
```
2018-02-20 03:13:28 -08:00
Da K. Ma
6bda1bec6e Taint node when it under PID pressure.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-02-17 10:55:29 +08:00
Aleksandra Malinowska
2d54ba3e0f
Revert "add node shutdown taint" 2018-02-16 12:24:27 +01:00
Jesse Haka
6665fa7144 taint also node controller
fix function

fix gofmt

fix function return value

fix tests

skip notimplemented error

remove factory unused

in openstack we should try to find instanceid from all states instead of ACTIVE, all other cloudproviders do this already

fix tests and lint

fix gofmt

fix nodelifecycletest

fix lint errors
2018-02-10 15:41:24 +02:00
Seth Jennings
e994ce1f7d nodelifecycle: set OutOfDisk unknown on node timeout 2018-02-02 14:15:36 -06:00
Jonathan Basseri
30b89d830b Move scheduler code out of plugin directory.
This moves plugin/pkg/scheduler to pkg/scheduler and
plugin/cmd/kube-scheduler to cmd/kube-scheduler.

Bulk of the work was done with gomvpkg, except for kube-scheduler main
package.
2018-01-05 15:05:01 -08:00
Walter Fender
9187b343e1 Split the NodeController into lifecycle and ipam pieces.
Prepatory work fpr removing cloud provider dependency from node
controller running in Kube Controller Manager. Splitting the node
controller into its two major pieces life-cycle and CIDR/IP
management. Both pieces currently need the the cloud system to do their work.
Removing lifecycles dependency on cloud will be fixed ina followup PR.

Moved node scheduler code to live with node lifecycle controller.
Got the IPAM/Lifecycle split completed. Still need to rename pieces.
Made changes to the utils and tests so they would be in the appropriate
package.
Moved the node based ipam code to nodeipam.
Made the relevant tests pass.
Moved common node controller util code to nodeutil.
Removed unneeded pod informer sync from node ipam controller.
Fixed linter issues.
Factored in  feedback from @gmarek.
Factored in feedback from @mtaufen.
Undoing unneeded change.
2018-01-04 12:48:08 -08:00