Commit Graph

31450 Commits

Author SHA1 Message Date
Weibin Lin
86e35b4463 add islinwb to pkg/util/ipset reviewers list 2018-06-16 11:40:52 +08:00
Kubernetes Submit Queue
a6e61e7452 Merge pull request #64838 from krzysied/scheduling_latency_metric_fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding summary metric for scheduling latency

**What this PR does / why we need it**:
Re-introduces histogram metrics for the backward compatibility.
Changes SchedulingLatency metric to satisfy prometheus best practice.
ref #64316

**Release note**:

```release-note
NONE
```
2018-06-15 08:50:07 -07:00
Brendan Burns
e3d2242388 Detect a missing key error and print a cleaner message. 2018-06-14 21:52:51 -07:00
Kubernetes Submit Queue
3abba25160 Merge pull request #65049 from xujieasd/iptables-typo
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

iptables proxier_test typo

**What this PR does / why we need it**:
The definition of `makeTestService` is
```
func makeTestService(namespace, name string, svcFunc func(*api.Service)) api.Service {
...
}
```
but in function `TestClusterIPReject`, use  
makeTestService(svcPortName.Namespace, svcPortName.`Namespace`, func(svc *api.Service)  
should be  
makeTestService(svcPortName.Namespace, svcPortName.`Name`, func(svc *api.Service)  

I think it's a typo

/area kube-proxy

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-14 18:23:21 -07:00
Kubernetes Submit Queue
a2de1398f8 Merge pull request #65034 from caesarxuchao/json-case-sensitive
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make kubernetes json serializer case sensitive

This PR imported the latest jsoniterator library so that case sensitivity during unmarhsaling is optional. The PR also set Kubernetes json serializer to be case sensitive.

Kubernetes json serializer had been case sensitive for 1.1-1.7 as we were using ugorji. This PR restores the behavior.

Fix #64612.

```release-notes
Kubernetes json deserializer is now case-sensitive as it was before 1.8.
If your config files contains fields with wrong case, the config files will be now invalid.
```
2018-06-14 15:41:26 -07:00
Kubernetes Submit Queue
0f87069384 Merge pull request #64630 from nicksardo/fix-op-rate
Automatic merge from submit-queue (batch tested with PRs 64272, 64630). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

GCE: Fix operation polling and error handling

Cloud functions using the generated API are bursting operation GET calls because we don't wait a minimum amount of time.

Fixes #64712
Fixes #64858

**Changes**
- `operationPollInterval` is now 1.5 seconds instead of 3 seconds.
-  `operationPollRateLimiter` is now configured with 5 QPS / 5 burst instead of 10 QPS / 10 burst.
- `gceRateLimiter` is now configured with a `MinimumRateLimiter` to wait the above `operationPollInterval` duration _before_ waiting on the token rate limiter.
- Operations are now rate limited on the very first GET call.
- Operations are polled until `DONE` or context times out (even if operations.get fails continuously).
- Compute operations are checked for errors when they're recognized as `DONE`. 
- All "wrapper" funcs now generate a context with an hour timeout.


`ingress-gce` will need to update its vendor and utilize the `MinimumRateLimiter` as well. Since ingress creates rate limiters based off flags, we'll need to check the resource type and operation while parsing the flags and wrap the appropriate one.

**Special notes for your reviewer**:
/assign bowei
/cc bowei

**Fix Example**
Creating an external load balancer

without fix:  https://pastebin.com/raw/NNkeNWS3  
with fix: https://pastebin.com/raw/x2iMLW5S (a difference of about 200 GET calls)

**Release note**:
```release-note
GCE: Fixes operation polling to adhere to the specified interval. Furthermore, operation errors are now returned instead of ignored.
```
2018-06-14 14:11:15 -07:00
Chao Xu
72a0dc1122 fix schema for kubeproxyconfig/v1alph1 2018-06-14 12:52:17 -07:00
Chao Xu
7b0ffb8410 make json serializer case sensitive 2018-06-14 12:29:27 -07:00
Nick Sardo
787f3a6386 Use context with timeout instead of context.Background 2018-06-14 11:20:38 -07:00
Nick Sardo
115ddc5a8e Wait a minimum amount of time for polling operations 2018-06-14 11:20:34 -07:00
Krzysztof Siedlecki
e32910a544 Readding summary metrics 2018-06-14 15:05:12 +02:00
Krzysztof Siedlecki
0547bbf744 Revert "Fixing scheduling latency metrics"
This reverts commit 0e833bfc83.
2018-06-14 14:50:12 +02:00
vikaschoudhary16
e8119dc134 Start plugin watcher after initialization of all kubelet components 2018-06-14 01:03:37 -04:00
xuzhonghu
578d7e266b rm dead auto gen code 2018-06-14 10:07:48 +08:00
WanLinghao
f16470c3f1 This patch adds limit to the TokenRequest expiration time. It constrains a TokenRequest's expiration time to avoid extreme value which could harm the cluster. 2018-06-14 09:31:50 +08:00
Andrew Lytvynov
2c0f043957 Re-use private key after failed CSR
If we create a new key on each CSR, if CSR fails the next attempt will
create a new one instead of reusing previous CSR.

If approver/signer don't handle CSRs as quickly as new nodes come up,
they can pile up and approver would keep handling old abandoned CSRs and
Nodes would keep timing out on startup.
2018-06-13 13:12:43 -07:00
Mike Dame
a8b7c94620 marshal bytes to return as string with kubectl config view -o jsonpath 2018-06-13 15:25:34 -04:00
Seth Jennings
f1551798e4 reduce logging for backoff situations 2018-06-13 13:25:20 -05:00
dbdd4us
5835d9bde4 fix update node condition 2018-06-13 19:58:41 +08:00
Dr. Stefan Schimanski
1208437f84 Update generated files 2018-06-13 12:35:13 +02:00
hangaoshuai
0a00829875 fix bug excludeCIDRs was not assign in func NewProxier 2018-06-13 12:34:37 +02:00
xujieasd
368cb99d0b fix iptables_test typo 2018-06-13 15:12:40 +08:00
Brendan Burns
804ee25b1e Make CredentialProvider config loading deterministic. 2018-06-12 21:39:46 -07:00
Kubernetes Submit Queue
b05a61e299 Merge pull request #65030 from deads2k/cli-74-experimental
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

mark kubectl wait as experimental

Per @smarterclayton comment in https://github.com/kubernetes/kubernetes/pull/64034

/assign @smarterclayton
2018-06-12 16:12:52 -07:00
Kubernetes Submit Queue
bb7e14429d Merge pull request #64922 from dcbw/dcbw-dockershim-network-approver
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dockershim/network: add dcbw to OWNERS as an approver

I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.

```release-note
NONE
```
@freehan @thockin @kubernetes/sig-network-pr-reviews
2018-06-12 13:31:15 -07:00
Kubernetes Submit Queue
e7bdebd5f1 Merge pull request #65009 from mfojtik/ds-02-add-node-indexer
Automatic merge from submit-queue (batch tested with PRs 64974, 65009, 65018). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

daemon: add custom node indexer

**What this PR does / why we need it**:

<img width="863" alt="screen shot 2018-06-11 at 20 54 03" src="https://user-images.githubusercontent.com/44136/41279030-ad842020-6e2b-11e8-80d4-0a71ee415d30.png">

Based on this CPU profile, it looks like a lot of CPU cycles/cores are spend by retrieving a list of **all** pods in the cluster. On large clusters with multiple daemonset this might lead to locking the shared pod informer List() of every other controller that might need it or use it.

The indexer in the PR will index the pods based on nodeName assigned for these pods. That means the amount of pods returned from the ByIndex() function is fairly small and the call should be fast.

Additionally we can also use this index to check whether a node already run the pod without listing all pods in the cluster again.

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-06-12 12:58:13 -07:00
David Eads
4a180331a9 mark kubectl wait as experimental 2018-06-12 15:51:43 -04:00
Kubernetes Submit Queue
67ebbc675a Merge pull request #64862 from feiskyer/win-cni
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Revert #64189: Fix Windows CNI for the sandbox case

**What this PR does / why we need it**:

This reverts PR #64189, which breaks DNS for Windows containers.

Refer https://github.com/kubernetes/kubernetes/pull/64189#issuecomment-395248704

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64861

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

cc @madhanrm @PatrickLang @alinbalutoiu @dineshgovindasamy
2018-06-12 11:18:01 -07:00
Pavithra Ramesh
2d10c8a066 Handle empty clusterID in loadbalancer naming 2018-06-12 10:45:34 -07:00
Kubernetes Submit Queue
7e41ab4ed3 Merge pull request #64768 from krzysied/scale_retries
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adding scale error retries

**What this PR does / why we need it**:
ScaleWithRetries will retry all retryable errors, not only conflict error.
ref #63030

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-12 09:31:46 -07:00
ruicao
95c232ee07 Typo fix: toto -> to 2018-06-12 23:12:39 +08:00
Krzysztof Siedlecki
8a3c2dcc6d Adding scale error retries 2018-06-12 11:23:16 +02:00
Michal Fojtik
6517e250cd daemon: add custom node indexer 2018-06-12 11:10:10 +02:00
Kubernetes Submit Queue
52603a78ab Merge pull request #64969 from mfojtik/volume-01-fix-allocations
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

volume: decrease memory allocations for debugging messages

**What this PR does / why we need it**:

<img width="1769" alt="screen shot 2018-06-11 at 13 15 31" src="https://user-images.githubusercontent.com/44136/41230128-ebf7233c-6d7e-11e8-899d-6251a5fde236.png">

On large clusters, where the glog is not running on V(5) using the format as: `glog.V(5).Infof(fmt.Sprintf(....))` will cause the code inside `Infof()` to be ran and generate a tons of memory allocations even if the output of those messages are not returned to the console...

This patch should reduce those calls and also the string allocations done by message generation.

**Release note**:
```release-note
NONE
```
2018-06-11 22:05:42 -07:00
Dong Liu
df494e924a Move out azure_loadbalancer.md to cloud provider repository 2018-06-12 10:11:26 +08:00
Kubernetes Submit Queue
8e03228c1a Merge pull request #64643 from dashpole/memcg_poll
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use unix.EpollWait to determine when memcg events are available to be Read

**What this PR does / why we need it**:
This fixes a file descriptor leak introduced in https://github.com/kubernetes/kubernetes/pull/60531 when the `--experimental-kernel-memcg-notification` kubelet flag is enabled.  The root of the issue is that `unix.Read` blocks indefinitely when reading from an event file descriptor and there is nothing to read.  Since we refresh the memcg notifications, these reads accumulate until the memcg threshold is crossed, at which time all reads complete.  However, if the node never comes under memory pressure, the node can run out of file descriptors.

This PR changes the eviction manager to use `unix.EpollWait` to wait, with a 10 second timeout, for events to be available on the eventfd.  We only read from the eventfd when there is an event available to be read, preventing an accumulation of `unix.Read` threads, and allowing the event file descriptors to be reclaimed by the kernel.

This PR also breaks the creation, and updating of the memcg threshold into separate portions, and performs creation before starting the periodic synchronize calls.  It also moves the logic of configuring memory thresholds into memory_threshold_notifier into a separate file.

This also reverts https://github.com/kubernetes/kubernetes/pull/64582, as the underlying leak that caused us to disable it for testing is fixed here.

Fixes #62808

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority critical-urgent
2018-06-11 17:29:19 -07:00
Kubernetes Submit Queue
d61b681d24 Merge pull request #64903 from WanLinghao/token_projected_fix
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix a bug of wrong parameters which could cause token projection failure

**What this PR does / why we need it**:
a parameter wrong order bug
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-11 17:29:16 -07:00
Brendan Burns
84dd19eb23 Remove an old TODO. 2018-06-11 13:04:32 -07:00
David Ashpole
b7deb6d9e0 fix eviction event formatting 2018-06-11 11:38:00 -07:00
David Ashpole
93b6d026d9 fix memcg fd leak 2018-06-11 11:37:50 -07:00
Kubernetes Submit Queue
e6f64d0a79 Merge pull request #64916 from mfojtik/ds-01-improve-mem-usage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

improve memory footprint of daemonset simulate

**What this PR does / why we need it**:

This is an alternative for https://github.com/kubernetes/kubernetes/pull/64915 (it might be not needed if that PR will merge)

During memory profiling of OpenShift, we noticed a significant amount of object allocations done by `IsControlledBy()`. @sttts found that the `GetObjectReferences()` method is doing a deep-copy of the object references. 

![screen shot 2018-06-08 at 14 22 59](https://user-images.githubusercontent.com/44136/41157922-7af953f2-6b27-11e8-9a16-bda8c3edfe07.png)

This PR simplify the `IsControlledBy()` to just iterate over the ownerRefs, without copying them. 

**Release note**:

```release-note
NONE
```
2018-06-11 08:56:20 -07:00
Michal Fojtik
97f546d249 volume: decrease memory allocations for debugging messages 2018-06-11 13:52:38 +02:00
andyzhangx
b9c07dc7a1 set EnableHTTPSTrafficOnly in storageAccount creation 2018-06-11 07:10:24 +00:00
liangwei
a270d14a00 complete ipvs proxier test 2018-06-09 16:00:54 +08:00
Kubernetes Submit Queue
5aa8d690a1 Merge pull request #64554 from hanxiaoshuai/fix05312
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix bug excludeCIDRs was not assign in func NewProxier

**What this PR does / why we need it**:
fix bug excludeCIDRs was not assign in func NewProxier
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-06-09 00:01:29 -07:00
Michal Fojtik
60ef68c87d improve memory footprint of daemonset simulate 2018-06-08 19:59:12 +02:00
John Calabrese
f415558c30 use subtest for table units
implement PR feedback

replace errorf + return with fatalf

  https://github.com/kubernetes/kubernetes/pull/63662#discussion_r192540370
  https://github.com/kubernetes/kubernetes/pull/63662#discussion_r192540457
2018-06-08 11:44:24 -04:00
Dan Williams
37792076b4 dockershim/network: add dcbw to OWNERS as an approver
I've been involved with the kubelet network code, including most
of this code, for a couple years and contributed a good number
of PRs for these directories. I've also been a SIG Network
co-lead for couple years.

I've also been on the CNI maintainers team for a couple years.
2018-06-08 10:06:19 -05:00
Huamin Chen
11c6126aa0 update rbd and cephfs volume owners
Signed-off-by: Huamin Chen <hchen@redhat.com>
2018-06-08 11:04:23 -04:00
John Calabrese
7735dd6843 use subtest for table units
apply consistent format to name strings

  - #63661 (comment)
  - #63661 (comment)

inline subtest logic

  https://github.com/kubernetes/kubernetes/pull/63661/files#r192540031

remove duplicate messaging in subtest errors
2018-06-08 10:52:28 -04:00