Commit Graph

193 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
06c1f2ba2c Merge pull request #34707 from gmarek/master
Automatic merge from submit-queue

Bump log level in case of Node eviction
2016-10-13 05:37:10 -07:00
gmarek
41278b4c6b Bump log level in case of Node eviction 2016-10-13 13:26:16 +02:00
gmarek
8b7e9d303c Handle DeletedFinalStateUnknown in NodeController 2016-10-13 11:31:04 +02:00
Justin Santa Barbara
dc2a9d6f5c Fix typo: initilizing -> initializing 2016-10-07 13:29:22 -04:00
deads2k
0961784a9b switch node controller to shared informers 2016-09-29 09:16:41 -04:00
gmarek
cb0a13c1e5 Move orphaned Pod deletion logic to PodGC 2016-09-28 13:58:31 +02:00
Justin Santa Barbara
54195d590f Use strongly-typed types.NodeName for a node name
We had another bug where we confused the hostname with the NodeName.

To avoid this happening again, and to make the code more
self-documenting, we use types.NodeName (a typedef alias for string)
whenever we are referring to the Node.Name.

A tedious but mechanical commit therefore, to change all uses of the
node name to use types.NodeName

Also clean up some of the (many) places where the NodeName is referred
to as a hostname (not true on AWS), or an instanceID (not true on GCE),
etc.
2016-09-27 10:47:31 -04:00
Mike Danese
a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
gmarek
c40a36cab0 Change the eviction metric type and fix rate-limited-timed-queue 2016-09-08 12:20:51 +02:00
Kubernetes Submit Queue
9dfe8df7cd Merge pull request #32107 from xingzhou/km_bug
Automatic merge from submit-queue

Used goroutine to launch node controller's internalPodInformer.

Fixes #32103
2016-09-06 11:50:28 -07:00
Kubernetes Submit Queue
afef4b6938 Merge pull request #32070 from gmarek/nodecontroller
Automatic merge from submit-queue

Sleep between NodeStatus update retries

Just a thing I found when looking into other problems.

This is pretty much no-risk change fixing wrong behavior. Do you think it should go in 1.4? @pwittrock
2016-09-06 03:51:20 -07:00
Xing Zhou
46c302b4c2 Used goroutine to launch node controller's internalPodInformer.
Node controller's internalPodInformer will block main thread
if it is not started as a go routine. This patch fixed this
by runing internalPodInformer as a go routine.
2016-09-06 15:39:13 +08:00
Wojciech Tyczynski
b69d516763 NodeController listing nodes from apiserver cache 2016-09-05 16:52:57 +02:00
gmarek
bac603afd6 Sleep between NodeStatus update retries 2016-09-05 12:29:28 +02:00
gmarek
ea2d19f5d7 Remove unused argument to NodeController.Run 2016-08-30 14:24:56 +02:00
gmarek
5d8cb17efa Add cluster health metrics to NodeController 2016-08-18 15:11:10 +02:00
Kubernetes Submit Queue
f9190ed61a Merge pull request #30138 from gmarek/flags
Automatic merge from submit-queue

Expose flags for new NodeEviction logic in NodeController

Fix #28832
Last PR from the NodeController NodeEviction logic series. 

cc @davidopp @lavalamp @mml
2016-08-18 00:41:28 -07:00
bprashanth
15c9826061 Nodecontroller doesn't flip readiness on pods if kubeletVersion < 1.2.0 2016-08-17 15:33:35 -07:00
gmarek
4cf698ef04 Expose flags for new NodeEviction logic in NodeController 2016-08-17 10:43:24 +02:00
AdoHe
b2ab4c6d9b fix node controller event uid issue 2016-08-14 09:41:20 +08:00
Dominika Hodovska
816f6d32ca Collapse duplicate informer creation paths 2016-08-04 09:02:13 +02:00
gmarek
66224ce0bd Change eviction logic in NodeController and make it Zone-aware 2016-08-02 14:21:52 +02:00
k8s-merge-robot
59836d6dbd Merge pull request #24841 from sjenning/shared-informer
Automatic merge from submit-queue

update node controller to use shared pod informer

continuing work from #24470 and #23575
2016-08-02 03:45:01 -07:00
Matt T. Proud
76aab29ede pkg/controller/node/nodecontroller: simplify mutex
Similar to #29598, we can rely on the zero-value construction behavior
to embed `sync.Mutex` into parent structs.
2016-07-26 07:06:16 +02:00
Seth Jennings
db6026c82a node controller use shared pod informer 2016-07-20 15:26:19 -05:00
Seth Jennings
6d77f53af4 refactor maybeDeleteTerminatingPod 2016-07-20 15:26:19 -05:00
gmarek
56006fac43 Retry assigning CIDRs 2016-07-18 17:06:04 +02:00
Prashanth Balasubramanian
2f9516db30 List all nodes and occupy cidr map before starting allocations 2016-07-16 13:54:01 -07:00
gmarek
f6b1c316e9 Allow switching rate limiter inside RateLimitedQueue 2016-07-14 15:38:14 +02:00
gmarek
5677a9845e Split NodeController rate limiters between zones 2016-07-13 14:09:19 +02:00
gmarek
fd600ab65c Add hooks for cluster health detection 2016-07-12 15:10:58 +02:00
gmarek
7524da877e Reduce tightness of coupling in NodeController 2016-07-12 11:00:41 +02:00
gmarek
7f5f9d3a6f Move CIDR allocation logic away from nodecontroller.go 2016-07-12 09:40:43 +02:00
David McMahon
ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Oleg Shaldybin
a58b4cf59d Don't panic in NodeController if pod update fails
Previously it was trying to use a nil pod variable if error was returned
from the pod update call.
2016-06-28 11:54:13 -07:00
goltermann
218645b346 Fix several spelling errors in comments. 2016-06-17 10:41:18 -07:00
gmarek
7cac170214 AllocateOrOccupyCIDR returs quickly 2016-05-31 09:11:42 +02:00
k8s-merge-robot
577cdf937d Merge pull request #26415 from wojtek-t/network_not_ready
Automatic merge from submit-queue

Add a NodeCondition "NetworkUnavaiable" to prevent scheduling onto a node until the routes have been created 

This is new version of #26267 (based on top of that one).

The new workflow is:
- we have an "NetworkNotReady" condition
- Kubelet when it creates a node, it sets it to "true"
- RouteController will set it to "false" when the route is created
- Scheduler is scheduling only on nodes that doesn't have "NetworkNotReady ==true" condition

@gmarek @bgrant0607 @zmerlynn @cjcullen @derekwaynecarr @danwinship @dcbw @lavalamp @vishh
2016-05-29 03:06:59 -07:00
Alex Robinson
d577550dd0 Merge pull request #26054 from gmarek/flags
Make service-range flag in controller-manager optional
2016-05-27 14:26:15 -07:00
gmarek
7bdf480340 Node is NotReady until the Route is created 2016-05-27 19:29:51 +02:00
Zach Loafman
cb69960742 nodecontroller: Fix log message on successful update 2016-05-25 14:44:15 -07:00
gmarek
08385b2c5f Make service-range flag in controller-manager optional 2016-05-23 09:37:53 +02:00
gmarek
1d89d2f2d2 Add few log lines to NodeController 2016-05-23 08:49:11 +02:00
mqliang
17d5a302bb make podcidr mask size configurable 2016-05-20 20:44:40 +08:00
mqliang
cf7a3475f3 Don't allow node controller to allocate into service CIDR range 2016-05-20 20:44:40 +08:00
mqliang
69b8453fa0 cidr allocator 2016-05-20 20:44:40 +08:00
gmarek
6d27009db1 NodeController doesn't evict Pods if no Nodes are Ready 2016-05-17 23:03:21 +02:00
mqliang
c10f43a2e5 implement AddIndexers for SharedIndexInformer 2016-05-06 21:23:18 +08:00
mqliang
9011207f18 add namespace index to rc and pod 2016-05-06 17:12:36 +08:00
gmarek
3171aac57c Generated clients can return their RESTClients, RESTClient can return its RateLimiter 2016-04-27 22:15:10 +02:00
k8s-merge-robot
72e51dacfe Merge pull request #24034 from AdoHe/log_spam
Automatic merge from submit-queue

remove log spam from nodecontroller

@thockin @quinton-hoole ptal.
2016-04-21 12:11:05 -07:00
AdoHe
e52f71f78d remove log spam from nodecontroller 2016-04-10 22:51:29 -04:00
k8s-merge-robot
7d7ca5ab72 Merge pull request #23608 from caesarxuchao/mv-typed-clients
Automatic merge from submit-queue

Move typed clients into clientset folder

Move typed clients from `pkg/client/typed/` to `pkg/client/clientset_generated/${clientset_name}/typed`.

The first commit changes the client-gen, the last commit updates the doc, other commits are just moving things around.

@lavalamp @krousey
2016-04-02 19:31:40 -07:00
Chao Xu
49559a3332 Generate the typed clients under the clientset folder 2016-03-31 15:28:45 -07:00
André Cruz
34a273f1eb Fixed typo. 2016-03-29 16:10:11 +01:00
k8s-merge-robot
e44ad7a083 Merge pull request #22735 from resouer/throttle-dev
Auto commit by PR queue bot
2016-03-26 06:44:48 -07:00
goltermann
32d569d6c7 Fixing all the "composite literal uses unkeyed fields" Vet errors. 2016-03-25 15:25:09 -07:00
harry
8472cfa214 Refactor throttle into util pkg
Fix missing throttle.go
2016-03-25 08:32:23 +08:00
Mike Danese
e0431d8409 Revert "Revert "continuously delete pods on nodes that don't exist""
This reverts commit da0a72f2c2.
2016-03-08 11:00:35 -08:00
Marek Grabowski
da0a72f2c2 Revert "continuously delete pods on nodes that don't exist" 2016-03-08 09:38:53 +01:00
Mike Danese
c404e7c6d1 continuously delete pods on nodes that don't exist 2016-03-07 16:01:33 -08:00
k8s-merge-robot
dc46ae031d Merge pull request #22336 from cjcullen/evict
Auto commit by PR queue bot
2016-03-07 08:42:40 -08:00
CJ Cullen
e7fc608df7 Immediately evict pods and delete node when cloud says node is gone. 2016-03-07 06:07:51 -08:00
Chao Xu
828fe5df2f remove V(2) 2016-03-03 14:54:33 -08:00
k8s-merge-robot
ec77d0841d Merge pull request #22376 from caesarxuchao/add-log-nodecontroller
Auto commit by PR queue bot
2016-03-02 13:36:34 -08:00
Chao Xu
f8dd7fe1de log the succussful forceful deletion 2016-03-02 11:18:55 -08:00
CJ Cullen
506df24c1f Revert "Evict pods w/o rate-limit when cloud says node is gone." 2016-03-01 11:04:18 -08:00
k8s-merge-robot
fec00b535f Merge pull request #21187 from cjcullen/evict
Auto commit by PR queue bot
2016-03-01 10:45:22 -08:00
Kris
e664ef922f Move restclient to its own package 2016-02-29 12:05:13 -08:00
CJ Cullen
3a8c7a7074 Evict pods w/o rate-limit when cloud says node is gone. 2016-02-29 10:33:50 -08:00
Mike Danese
a50bc3da10 revert deletePods changes in #22100 2016-02-27 08:40:50 -08:00
Mike Danese
c1a7e280a3 fix pod eviction for gracefully terminationg pods 2016-02-26 17:34:15 -08:00
Fabio Yeon
7d0684e9c4 Merge pull request #21628 from smarterclayton/suppress_debug_logging
Reduce volume of logs generated at v(3)
2016-02-26 15:47:31 -08:00
gmarek
e99ad585ce Decrease verbosity in NodeController 2016-02-23 17:03:38 +01:00
Clayton Coleman
ae2f6a833a Reduce volume of logs generated at v(3)
Node controller is generating a huge amount of logging at v(3) that is
more appropriate for v(5). Split the log into two levels and ensure it
also ends up on one line (so grep works).

The pod manager generates a v(4) pod output on sync that always contains
a newline - since the size of the pod is so excessive in output, kick it
to v(5) for deep debugging (we're pretty happy with this loop).
2016-02-20 15:29:05 -05:00
Andy Goldstein
49fb450a36 Fix type for DaemonSet informer 2016-02-18 14:54:39 -05:00
k8s-merge-robot
5acdb92126 Merge pull request #21177 from laushinka/spelling-fixes
Auto commit by PR queue bot
2016-02-18 10:29:49 -08:00
gmarek
0d7f3bca0a Increase logging verbosity to help debugging #21458 2016-02-18 11:25:38 +01:00
laushinka
7ef585be22 Spelling fixes inspired by github.com/client9/misspell 2016-02-18 06:58:05 +07:00
Chao Xu
97aecd002a remove underscore in imported pkg names 2016-02-16 10:54:51 -08:00
CJ Cullen
52b16129dc Re-GET nodes during CIDR allocation (to avoid cascading bad resource version). 2016-02-12 13:08:54 -08:00
Jan Chaloupka
4389b3f0d6 Rewritte util.* -> wait.* wherever reasonable 2016-02-07 12:02:20 +01:00
Chao Xu
184440f8ef rename release_1_2 to internalclientset 2016-02-05 14:02:28 -08:00
Chao Xu
1b047f8e67 rename legacy to core 2016-02-04 14:26:56 -08:00
Chao Xu
f9f5736b01 grep sed 2016-02-03 13:06:07 -08:00
Chao Xu
fe7887f1ec replace the client with clientset in controllers 2016-02-02 20:28:45 -08:00
harry
1032067ff9 Replace runtime reference by pkg 2016-02-01 21:06:44 +08:00
Matt Liggett
0ba1b49b42 When a node becomes unreachable, do not evict DaemonSet-managed pods.
Part of the graduation requirement for DaemonSet spelled out in #15310.
2016-01-28 11:08:52 -08:00
David Oppenheimer
85f88b8645 Run gofmt. 2016-01-25 23:25:38 -08:00
David Oppenheimer
2a6da5871a Revert #19580 and #17754. Fixes #20048. 2016-01-25 23:19:22 -08:00
Alex Mohr
94b2490eba Merge pull request #19580 from WeixuZhuang/node
resolve the bug when cluster CIDR is not /8
2016-01-21 10:34:19 -08:00
Minhan Xia
01829432db update pod status once node becomes NotReady 2016-01-14 17:41:36 -08:00
Weixu Zhuang
2af68d25b8 resolve the bug when cluster CIDR is not /8
We will have the rigth formula to generate correct maxCIDRs now.
Previous code assume cluster CIDR is /8 which may not be true.
Now it generates maxCIDR based on the info of cluster IP.
2016-01-13 11:55:19 -08:00
Weixu Zhuang
3928bd6e76 Fix TODO in pkg/controller/nodecontroller.go line 472
The code now calculates and find out the CIDRs for every node in any sync period.
I will fix this TODO by maintaining a set for available CIDRs left. Firstly, I will
insert 256 CIDRs into the available set. Once someone get one CIDR, remove this CIDR
from the available set. If one node get deleted, we will reinsert the CIDR associates
with this node back to available CIDR. Once there are nothing left in available CIDR set,
generate another 256 CIDRs and insert them into the available set. As a result, we do not
need to generate CIDRs in every monitor process and we only need to assign CIDR to node
which does not have it.

This commit also fix the error that CIDR may overflow when we use the function
generateCIDRs. There will be no more ip overflowing, all assigan CIDR will be valid
2015-12-28 11:15:38 -08:00
Wojciech Tyczynski
960808bf08 Switch to versioned ListOptions in client. 2015-12-14 14:26:09 +01:00
Wojciech Tyczynski
b0fcb5adef Pass ListOptions to List in ListWatch. 2015-12-07 11:53:53 +01:00
Mike Danese
f784be33ac actually validate semver in node controller rather than prefix checking 2015-12-03 14:49:44 -08:00
Wojciech Tyczynski
6dcb689d4e Simplify List() signature in clients. 2015-12-03 09:54:07 +01:00
k8s-merge-robot
8a8639d7af Merge pull request #17863 from wojtek-t/only_list_options_in_watch
Auto commit by PR queue bot
2015-12-02 06:28:28 -08:00
Wojciech Tyczynski
8343c8ce6c Pass ListOptions to List() methods. 2015-12-01 15:00:36 +01:00