Commit Graph

30224 Commits

Author SHA1 Message Date
stewart-yu
ffbd7b22b3 remove the unnecessary comments in tryRegisterWithAPIServer for externalID removed in PR#61877 2018-07-25 11:23:56 +08:00
Kubernetes Submit Queue
9e0c4a6095 Merge pull request #66488 from linyouchong/pr-0723-csi-labelmanager
Automatic merge from submit-queue (batch tested with PRs 66464, 66488). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use glog instead of fmt

**What this PR does / why we need it**:
Use glog instead of fmt

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
NONE
```

/sig storage
2018-07-24 19:03:04 -07:00
Kubernetes Submit Queue
35c3764bbb Merge pull request #66464 from wongma7/round-overflow
Automatic merge from submit-queue (batch tested with PRs 66464, 66488). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Avoid overflowing int64 in RoundUpSize and return error if overflow int

**What this PR does / why we need it**:
There are many places in plugins (some I may have missed) where we naively convert the result of `resource.Quantity.Value()`, which is an int64, to an int, which may be only 32 bits long.

Background, optional to read :): Kubernetes canonicalizes resource.Quantities, and from what I have seen while testing PVC creation, `decimalSI` is the default. If a quantity is in `decimalSI` format and its value in bytes would overflow an int64, e.g. `10E`, nothing happens. If it is in `binarySI` format and its value in bytes would overflow an int64, e.g. `10Ei`, it is set down to 2^63-1 and there's no overflow of the field value. But there may be overflow later in the code, which is what this PR is addressing.

* Change `RoundUpSize` implementation to avoid overflowing `int64`
* Add `RoundUp*Int` functions for use when an `int` is expected instead of an `int64`, because `int` may be 32 bits and naively doing `int($INT64_VALUE)` can lead to silent overflow. These functions return an error if overflow has occurred.
* Rename `*GB` variables to `*GiB` where appropriate for maximum clarity
* Use `RoundUpToGiB` instead of `RoundUpSize` where possible
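
The overflow-avoiding rounding and the checked `int` conversion described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `roundUpSize` and `roundUpSizeInt` are hypothetical lowercase stand-ins for `RoundUpSize` and the `RoundUp*Int` functions, and the sketch assumes a non-negative size and a positive allocation unit.

```go
package main

import "fmt"

// GiB is the number of bytes in one gibibyte.
const GiB int64 = 1024 * 1024 * 1024

// roundUpSize (hypothetical name) returns how many allocation units are
// needed to hold volumeSizeBytes. It divides first and then bumps the result
// when there is a remainder, so it never evaluates
// volumeSizeBytes+allocationUnitBytes-1, which can overflow int64.
// Assumes volumeSizeBytes >= 0 and allocationUnitBytes > 0.
func roundUpSize(volumeSizeBytes, allocationUnitBytes int64) int64 {
	units := volumeSizeBytes / allocationUnitBytes
	if volumeSizeBytes%allocationUnitBytes > 0 {
		units++
	}
	return units
}

// roundUpSizeInt (hypothetical name) does the same rounding but returns an
// int. It reports an error when the result does not survive the round trip
// through int, which can happen on platforms where int is 32 bits.
func roundUpSizeInt(volumeSizeBytes, allocationUnitBytes int64) (int, error) {
	units := roundUpSize(volumeSizeBytes, allocationUnitBytes)
	if int64(int(units)) != units {
		return 0, fmt.Errorf("size %d in units of %d bytes overflows int",
			volumeSizeBytes, allocationUnitBytes)
	}
	return int(units), nil
}
```

For example, `roundUpSize(3*GiB+1, GiB)` returns 4, and passing the largest int64 as the size still yields a sane result because no addition is performed before the division.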

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**: please review carefully as we don't have e2e tests for most plugins!

**Release note**:

```release-note
NONE
```
edit: remove 'we do not need to worry about...'. yes we do, i worded that badly :))
2018-07-24 19:03:01 -07:00
Kubernetes Submit Queue
1e3d23c5c3 Merge pull request #65907 from jbartosik/hpa-improv-refactor-run-test
Automatic merge from submit-queue (batch tested with PRs 64681, 65907). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Make runTest easier to understand

Fewer nested conditions, more checking for incorrect looking test cases.

**What this PR does / why we need it**: Make HPA tests easier to understand.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-07-24 16:28:13 -07:00
Kubernetes Submit Queue
4dbcf32b3c Merge pull request #66471 from islinwb/improve_TestZeroRequest
Automatic merge from submit-queue (batch tested with PRs 66291, 66471, 66499). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve unit test TestZeroRequest

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66468

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-24 13:59:58 -07:00
Kubernetes Submit Queue
2119d349b0 Merge pull request #66291 from resouer/fix-extender
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Extender preemption should respect IsInterested()

**What this PR does / why we need it**:

Extender preemption should respect IsInterested()

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66289 

**Special notes for your reviewer**:

The bug is reported and the first commit is co-authored by: @chenchun

**Release note**:

```release-note
Extender preemption should respect IsInterested()
```
2018-07-24 13:48:38 -07:00
Joachim Bartosik
3d1b6b0f6e Make runTest easier to understand
Instead of deducing metric type from details of struct describing it
test cases explicitly specify the metric type they use.
2018-07-24 17:27:17 +02:00
linyouchong
f2e92776bc Use glog instead of fmt 2018-07-24 09:46:56 +08:00
Kubernetes Submit Queue
c2b2b01e01 Merge pull request #66352 from juanvallejo/jvallejo/switch-logs-cmd-externals
Automatic merge from submit-queue (batch tested with PRs 66352, 66504). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

update logs cmd to use external versions

**Release note**:
```release-note
NONE
```

Continues the pattern established across other kubectl commands, working with external objects throughout.

Depends on https://github.com/kubernetes/kubernetes/pull/66398

cc @deads2k @soltysh
2018-07-23 15:17:05 -07:00
Kubernetes Submit Queue
42d91ff9de Merge pull request #66506 from verb/remove-docker-pid-sharing
Automatic merge from submit-queue (batch tested with PRs 62423, 66180, 66492, 66506, 65242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove kubelet docker shared pid flag

**What this PR does / why we need it**:
The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superseded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #41938

**Special notes for your reviewer**:
/assign @yujuhong 

**Release note**:

```release-note
The --docker-disable-shared-pid kubelet flag has been removed. PID namespace sharing can instead be enabled per pod using the ShareProcessNamespace option.
```
2018-07-23 12:32:14 -07:00
Kubernetes Submit Queue
2beab8623c Merge pull request #66180 from kkmsft/user_assigned_msi
Automatic merge from submit-queue (batch tested with PRs 62423, 66180, 66492, 66506, 65242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add user assigned MSI support

**What this PR does / why we need it**:
Adds support for generating tokens via user assigned MSI.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes # 

**Special notes for your reviewer**:

**Release note**:

```release-note
Add support for using User Assigned MSI (https://docs.microsoft.com/en-us/azure/active-directory/managed-service-identity/overview) with Kubernetes cluster on Azure.
```
2018-07-23 12:32:06 -07:00
juanvallejo
94fbb48dfc switch logs to use external versions 2018-07-23 14:40:16 -04:00
Kubernetes Submit Queue
d244fa9441 Merge pull request #62423 from nckturner/eks-approvers-reviewers
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add myself, Micah to reviewers

**Release note**:
```release-note
NONE
```

Signed-off-by: Nick Turner <nic@amazon.com>
2018-07-23 11:21:37 -07:00
Matthew Wong
093e231289 Avoid overflowing int64 in RoundUpSize and return error if overflow int 2018-07-23 13:48:45 -04:00
Kubernetes Submit Queue
4e0a60a44c Merge pull request #66487 from islinwb/add_uid
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

add pod UID

**What this PR does / why we need it**:
Add pod UID. The test passes, but it logs the following errors when run with `GOFLAGS=-v`:
```
E0723 09:18:18.393249   45452 node_info.go:477] Cannot get pod key, err: Cannot get cache key for pod with empty UID
E0723 09:18:18.393440   45452 node_info.go:490] Cannot get pod key, err: Cannot get cache key for pod with empty UID
```
 
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-23 02:31:04 -07:00
Kubernetes Submit Queue
49670bee18 Merge pull request #66429 from andyzhangx/acr-sp-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix acr could not be listed in sp issue

**What this PR does / why we need it**:
After granting a service principal (sp) access to Azure ACR, pulling an image from ACR fails; after waiting about 15-30 minutes (or restarting kubelet directly), the pull succeeds. The root cause is that `servicePrincipalToken` needs to be refreshed when calling `registryClient.List`; otherwise it always returns an empty registry list. The image pull error looks like the following:
```
Events:
  Type     Reason                 Age              From                               Message
  ----     ------                 ----             ----                               -------
  Warning  FailedScheduling       8m (x3 over 8m)  default-scheduler                  0/1 nodes are available: 1 Insufficient cpu.
  Normal   Scheduled              8m               default-scheduler                  Successfully assigned nginx-server-776564f79c-zhtjk to aks-nodepool1-20881069-0
  Normal   SuccessfulMountVolume  8m               kubelet, aks-nodepool1-20881069-0  MountVolume.SetUp succeeded for volume "default-token-4t7tk"
  Normal   SuccessfulMountVolume  8m               kubelet, aks-nodepool1-20881069-0  MountVolume.SetUp succeeded for volume "pvc-5c1f0521-739f-11e8-9b69-0a58ac1f09c2"
  Warning  Failed                 8m (x5 over 8m)  kubelet, aks-nodepool1-20881069-0  Error: ImagePullBackOff
  Normal   BackOff                8m (x5 over 8m)  kubelet, aks-nodepool1-20881069-0  Back-off pulling image "andyacr.azurecr.io/nginx-server:1.0.0"
  Warning  Failed                 8m (x2 over 8m)  kubelet, aks-nodepool1-20881069-0  Error: ErrImagePull
  Warning  Failed                 8m (x2 over 8m)  kubelet, aks-nodepool1-20881069-0  Failed to pull image "andyacr.azurecr.io/nginx-server:1.0.0": rpc error: code = Unknown desc = Error response from daemon: Get https://andyacr.azurecr.io/v2/nginx-server/manifests/1.0.0: unauthorized: authentication required
```

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65225

**Special notes for your reviewer**:
After discussing with dong, `registryClient.List` is not necessary; instead we return `{"*.azurecr.io", "*.azurecr.cn", "*.azurecr.de", "*.azurecr.us"}` as the aws and gce code does, and the URL matching takes care of the rest.
I will cherry-pick this PR to all supported versions, since every version has this issue.
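
As a rough illustration of how wildcard entries like these can be matched against a registry host, here is a minimal sketch. The helper `matchesACR` is hypothetical and uses a plain suffix check; it is not the matching logic the credential provider actually uses.

```go
package main

import "strings"

// acrWildcards lists the registry patterns quoted in the notes above.
var acrWildcards = []string{"*.azurecr.io", "*.azurecr.cn", "*.azurecr.de", "*.azurecr.us"}

// matchesACR (hypothetical helper) reports whether host falls under one of
// the wildcard patterns. A pattern "*.azurecr.io" is treated as "any
// non-empty prefix followed by .azurecr.io".
func matchesACR(host string) bool {
	for _, pattern := range acrWildcards {
		suffix := strings.TrimPrefix(pattern, "*") // e.g. ".azurecr.io"
		if len(host) > len(suffix) && strings.HasSuffix(host, suffix) {
			return true
		}
	}
	return false
}
```

With this sketch, a host such as `andyacr.azurecr.io` matches without any call that lists registries, which is the point of returning static wildcards.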

**Release note**:

```
fix acr could not be listed in sp issue
```

/sig azure
/assign @feiskyer @khenidak @brendandburns @karataliu
2018-07-22 21:18:26 -07:00
Weibin Lin
972e78748a add pod UID 2018-07-23 10:44:31 +08:00
Harry Zhang
d644162a29 Extender preemption should respect IsInterested()
Co-authored-by: Harry Zhang <resouer@gmail.com>
Co-authored-by: Chun Chen <ramichen@tencent.com>
2018-07-23 10:13:38 +08:00
Weibin Lin
5449d153bb Improve unit test TestZeroRequest 2018-07-23 09:15:19 +08:00
Lee Verberne
7c558fb7bb Remove kubelet-level docker shared pid flag
The --docker-disable-shared-pid flag has been deprecated since 1.10 and
has been superseded by ShareProcessNamespace in the pod API, which is
scheduled for beta in 1.12.
2018-07-22 16:54:44 +02:00
Kubernetes Submit Queue
4797c8df8f Merge pull request #63665 from xchapter7x/pkg-scheduler-core
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

use subtest for table units (pkg/scheduler/core)

**What this PR does / why we need it**: Update scheduler's unit table tests to use subtest
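
For readers unfamiliar with the pattern, a table test converted to subtests looks roughly like the sketch below. The function `clamp` and its test cases are hypothetical and are not taken from the scheduler; only `testing.T.Run` is the real API being illustrated.

```go
package main

import "testing"

// clamp is a hypothetical function under test: it limits v to [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// TestClamp runs each table entry as its own named subtest via t.Run.
func TestClamp(t *testing.T) {
	tests := []struct {
		name             string
		v, lo, hi, want int
	}{
		{name: "below range", v: -1, lo: 0, hi: 3, want: 0},
		{name: "inside range", v: 2, lo: 0, hi: 3, want: 2},
		{name: "above range", v: 5, lo: 0, hi: 3, want: 3},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			got := clamp(test.v, test.lo, test.hi)
			if got != test.want {
				// The subtest name is reported by the framework, so the
				// message does not need to repeat it.
				t.Errorf("got %d, want %d", got, test.want)
			}
		})
	}
}
```

Because a failing `t.Errorf` inside a subtest only marks that subtest as failed, the remaining table entries still run, and a single case can be selected with `go test -run 'TestClamp/above_range'`.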

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281
/ref #63267

**Release note**:

```release-note
This PR will leverage subtests on the existing table tests for the scheduler units.
Some refactoring of error/status messages and functions to align with new approach.

```
2018-07-21 01:52:30 -07:00
Kubernetes Submit Queue
819604e2ed Merge pull request #65558 from apelisse/dry-run-feature-gate
Automatic merge from submit-queue (batch tested with PRs 66410, 66398, 66061, 66397, 65558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

dry-run: Create feature-gate flag

Creates a feature gate flag for dry-run. Currently, the dry-run query parameter blocks every request that carries it. Once the feature is implemented, the flag will allow the parameter to pass if enabled.
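
The gating behavior can be pictured with the sketch below. Everything in it is an assumption for illustration: the helper `checkDryRun`, the query key `dryRun`, and passing the gate state as a plain boolean are not taken from this PR.

```go
package main

import (
	"errors"
	"net/url"
)

// errDryRunDisabled is a hypothetical error returned when a request asks for
// dry-run while the feature gate is off.
var errDryRunDisabled = errors.New("dry-run is not enabled")

// checkDryRun (hypothetical helper) rejects any request that carries the
// dry-run query parameter unless the feature gate is enabled. Requests
// without the parameter are never affected. The key "dryRun" is assumed.
func checkDryRun(query url.Values, dryRunEnabled bool) error {
	if _, present := query["dryRun"]; present && !dryRunEnabled {
		return errDryRunDisabled
	}
	return nil
}
```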

cc @jennybuckley @deads2k @liggitt @lavalamp 

**Release note**:

```release-note
NONE
```
2018-07-20 18:51:14 -07:00
Kubernetes Submit Queue
827aa934ac Merge pull request #66397 from gnufied/fix-default-max-volume-ebs
Automatic merge from submit-queue (batch tested with PRs 66410, 66398, 66061, 66397, 65558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix volume limit for EBS on m5 and c5 instances

This is a fix for lower volume limits on m5 and c5 instance types while we wait for https://github.com/kubernetes/features/issues/554 to land GA.

This problem became urgent because many of our users are trying to migrate to those instance types in light of the spectre/meltdown vulnerabilities, but the lower volume limit on those instance types often causes cluster instability. They can work around this by configuring the scheduler with a lower limit, but that is often difficult when the cluster is mixed.

The newer default limits were picked from https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/volume_limits.html

Text about spectre/meltdown is available on - https://community.bitnami.com/t/spectre-variant-2/54961/5

/sig storage
/sig scheduling

```release-note
Fix volume limit for EBS on m5 and c5 instance types
```
2018-07-20 18:51:11 -07:00
Kubernetes Submit Queue
35ff6ea207 Merge pull request #66398 from deads2k/cli-04-make-logs-generic-again
Automatic merge from submit-queue (batch tested with PRs 66410, 66398, 66061, 66397, 65558). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix logs command to be generic for all resources again

--all-containers should not have been allowed as it was because it only worked for pods.  This approach does not make sense for a polymorphic command.  Rather than roll it back, I'll take the time to make it generic.  Because of this and other pods-only options, we now have inconsistencies with the command that should be addressed separately.

@CaoShuFeng 
/assign @juanvallejo @soltysh 
@kubernetes/sig-cli-maintainers 

```release-note
NONE
```
2018-07-20 18:51:05 -07:00
John Calabrese
ad234e58be use subtest for table units
remove duplicate testname from error msg

remove subtest for test setup loop

do not break on test failure

  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203571355

remove duplicate test.name in output

  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574001
  https://github.com/kubernetes/kubernetes/pull/63665#discussion_r203574012
2018-07-20 16:02:50 -04:00
Kubernetes Submit Queue
53ee0c8652 Merge pull request #65660 from mtaufen/incremental-refactor-kubelet-node-status
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor kubelet node status setters, add test coverage

This internal refactor moves the node status setters to a new package, explicitly injects dependencies to facilitate unit testing, and adds individual unit tests for the setters.

I gave each setter a distinct commit to facilitate review.

Non-goals:
- I intentionally excluded the class of setters that return a "modified" boolean, as I want to think more carefully about how to cleanly handle the behavior, and this PR is already rather large.
- I would like to clean up the status update control loops as well, but that belongs in a separate PR.

```release-note
NONE
```
2018-07-20 12:12:24 -07:00
Kubernetes Submit Queue
6c500be080 Merge pull request #66218 from atlassian/handle-errors
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Handle errors

**What this PR does / why we need it**:
This is a followup PR for https://github.com/kubernetes/kubernetes/pull/64664 to handle errors returned from `.AddToScheme()` in places where they are not handled.

**Release note**:
```release-note
NONE
```
/kind cleanup
/sig api-machinery
/cc @sttts
2018-07-20 12:12:15 -07:00
Kubernetes Submit Queue
58aa10d213 Merge pull request #66406 from liggitt/pod-printing-panic
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix panic printing pods with nominatedNode names

Fixes #66379

```release-note
kubectl: fixes a panic displaying pods with nominatedNodeName set
```
2018-07-20 12:12:12 -07:00
David Eads
5ba07364ee fix logs command to be generic for all resources again 2018-07-20 15:10:44 -04:00
Antoine Pelisse
9e7b140450 dry-run: Create feature-gate flag 2018-07-20 11:40:06 -07:00
Krishnakumar R
2554c53bb3 Add user assigned MSI support for azure cloudprovider. 2018-07-20 08:39:16 -07:00
Kubernetes Submit Queue
8b4cdd0f85 Merge pull request #66378 from sngchlko/fix-type-in-csi-plugin
Automatic merge from submit-queue (batch tested with PRs 66098, 66389, 66400, 66413, 66378). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix a typo in csiPlugin comment

**What this PR does / why we need it**:
Fix a typo in csiPlugin comment.

**Release note**:
```release-note
NONE
```
2018-07-20 05:30:18 -07:00
Kubernetes Submit Queue
a4a2e6d61e Merge pull request #66400 from nicksardo/fix-err-code
Automatic merge from submit-queue (batch tested with PRs 66098, 66389, 66400, 66413, 66378). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

GCE: Return correct error type and HTTP Status code for operation errors

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #66399

**Special notes for your reviewer**:
/assign bowei, zihongz, rramkumar
/cc bowei

**Release note**:
```release-note
GCE: Fixes loadbalancer creation and deletion issues appearing in 1.10.5.
```
2018-07-20 05:30:12 -07:00
Kubernetes Submit Queue
e74a68e4c5 Merge pull request #66389 from bertinatto/metrics_pv_controller
Automatic merge from submit-queue (batch tested with PRs 66098, 66389, 66400, 66413, 66378). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add metrics in PV Controller

**What this PR does / why we need it**:

This PR adds a few metrics described in the [Metrics Spec](https://docs.google.com/document/d/1Fh0T60T_y888LsRwC51CQHO75b2IZ3A34ZQS71s_F0g/edit#heading=h.ys6pjpbasqdu) (PV Controller only):

Additional metrics for PV Controller:
* Total provision and deletion time
* Number of times PV provisioning and deletion failed

**Release note**:

```release-note
NONE
```
2018-07-20 05:30:09 -07:00
andyzhangx
a7e328c211 fix acr sp access issue 2018-07-20 08:39:31 +00:00
Kubernetes Submit Queue
b68c9440da Merge pull request #66242 from feiskyer/instance-az
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add initial availability zones support for Azure nodes

**What this PR does / why we need it**:

The first part of [Azure Availability Zone feature](https://github.com/kubernetes/features/issues/586).

This PR adds initial availability zone (AZ) support for Azure nodes. With this PR, Azure nodes with AZ will have label `failure-domain.beta.kubernetes.io/zone=<region>-<zoneID>`, e.g. `southeastasia-1`.

It also updates instance metadata api-version to 2017-12-01, which is required for AZ.
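
The label format can be made concrete with a small sketch. The helper `zoneLabelValue` is hypothetical; only the label key and the `<region>-<zoneID>` shape come from the description above.

```go
package main

import "fmt"

// zoneLabelKey is the label key quoted in the description above.
const zoneLabelKey = "failure-domain.beta.kubernetes.io/zone"

// zoneLabelValue (hypothetical helper) joins a region and a zone ID into the
// <region>-<zoneID> form, e.g. "southeastasia" and "1" give "southeastasia-1".
func zoneLabelValue(region, zoneID string) string {
	return fmt.Sprintf("%s-%s", region, zoneID)
}
```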

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

VirtualMachineScaleSetVM doesn't have AZ info yet. It will be supported later after new Azure Go SDK releases.

**Release note**:

```release-note
Azure nodes with an availability zone will now have the label `failure-domain.beta.kubernetes.io/zone=<region>-<zoneID>`.
```

/kind feature
/sig azure

/assign @brendandburns @khenidak @andyzhangx
2018-07-20 00:18:47 -07:00
Jordan Liggitt
bd559e247c tolerate missing column headers in server-side print output 2018-07-19 20:55:01 -04:00
Jordan Liggitt
dc5f615152 Send correct headers for pod printing 2018-07-19 20:55:00 -04:00
Kubernetes Submit Queue
795b7da8b0 Merge pull request #65714 from resouer/fix-63784
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Re-design equivalence class cache to two level cache

**What this PR does / why we need it**:

The current ecache introduces a global lock across all nodes; this patch assigns an ecache per node to eliminate that global lock. The improvements in scheduling performance and throughput are both significant.

**CPU Profile Result** 

Machine: 32-core 60GB GCE VM

1k nodes 10k pods bench test (we've highlighted the critical function):

1. Current default scheduler with ecache enabled:
![equivalence class cache bench test 001](https://user-images.githubusercontent.com/1701782/42196992-51b0a32a-7eb3-11e8-89ee-f13383091a00.jpeg)
2. Current default scheduler with ecache disabled:
![equivalence class cache bench test 002](https://user-images.githubusercontent.com/1701782/42196993-51eb0c68-7eb3-11e8-9326-1a7762072863.jpeg)
3. Current default scheduler with this patch and ecache enabled:
![equivalence class cache bench test 003](https://user-images.githubusercontent.com/1701782/42196994-52280ed8-7eb3-11e8-8100-690e2af2cf2f.jpeg)

**Throughput Test Result** 

1k nodes 3k pods `scheduler_perf` test: 

Current default scheduler, ecache is disabled:
```bash
Minimal observed throughput for 3k pod test: 200
PASS
ok      k8s.io/kubernetes/test/integration/scheduler_perf    30.091s
```
With this patch, ecache is enabled:
```bash
Minimal observed throughput for 3k pod test: 556
PASS
ok      k8s.io/kubernetes/test/integration/scheduler_perf    11.119s
```

**Design and implementation:**

The idea is: we re-designed ecache into a "two level cache". 

The first level cache holds the global lock across nodes; sync is needed only when a node is added or deleted, which happens much less frequently.

The second level cache is assigned per node and its lock is restricted to the per-node level, so there is no need to take the global lock during the whole predicate process cycle. For more detail, please check [the original discussion](https://github.com/kubernetes/kubernetes/issues/63784#issuecomment-399848349).
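
The locking structure described above can be sketched as follows. This is a simplified illustration under assumed names (`twoLevelCache`, `nodeCache`, string keys, boolean results); the real ecache types and keys differ.

```go
package main

import "sync"

// nodeCache is the second level: one instance per node, guarded by its own
// lock so that work on different nodes does not contend.
type nodeCache struct {
	mu      sync.RWMutex
	results map[string]bool // hypothetical: predicate key -> cached fit result
}

// twoLevelCache is the first level: a map from node name to its nodeCache,
// guarded by a global lock that is written only on node add or delete.
type twoLevelCache struct {
	mu    sync.RWMutex
	nodes map[string]*nodeCache
}

func newTwoLevelCache() *twoLevelCache {
	return &twoLevelCache{nodes: make(map[string]*nodeCache)}
}

// addNode takes the global write lock; this happens only when a node appears.
func (c *twoLevelCache) addNode(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.nodes[name]; !ok {
		c.nodes[name] = &nodeCache{results: make(map[string]bool)}
	}
}

// removeNode takes the global write lock; this happens only on node deletion.
func (c *twoLevelCache) removeNode(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.nodes, name)
}

// forNode fetches the per-node cache under a brief global read lock. After
// that, callers use only the per-node lock.
func (c *twoLevelCache) forNode(name string) (*nodeCache, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	n, ok := c.nodes[name]
	return n, ok
}

// store records a result under the per-node lock only.
func (n *nodeCache) store(key string, fit bool) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.results[key] = fit
}

// lookup reads a result under the per-node lock only.
func (n *nodeCache) lookup(key string) (fit, ok bool) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	fit, ok = n.results[key]
	return fit, ok
}
```

In this sketch a caller obtains the `nodeCache` once per node and then performs all `store` and `lookup` calls against it, so the global lock stays out of the hot path.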

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63784

**Special notes for your reviewer**:

~~Tagged as WIP to make sure this does not break existing code and tests, we can start review after CI is happy.~~

**Release note**:

```release-note
Re-design equivalence class cache to two level cache
```
2018-07-19 16:16:02 -07:00
Hemant Kumar
45b8107378 Fix volume limit for EBS on m5 and c5 instances 2018-07-19 16:27:52 -04:00
Nick Sardo
808bc227ae Return correct error type and HTTP Status code for operation errors 2018-07-19 13:18:29 -07:00
Kubernetes Submit Queue
8770d12494 Merge pull request #65572 from yue9944882/fixes-admission-operation-mismatch-for-create-on-update
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fixes operation for "create on update"

**What this PR does / why we need it**:

Set operation to `admission.Create` for create-on-update requests.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65553

**Special notes for your reviewer**:

**Release note**:

```release-note
Checks CREATE admission for create-on-update requests instead of UPDATE admission
```
2018-07-19 10:42:54 -07:00
Kubernetes Submit Queue
d2cc34fb07 Merge pull request #65771 from smarterclayton/untyped
Automatic merge from submit-queue (batch tested with PRs 65771, 65849). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add a new conversion path to replace GenericConversionFunc

reflect.Call is very expensive. We currently use a switch block as part of AddGenericConversionFunc to avoid the bulk of top level a->b conversion for our primary types which is hand-written. Instead of having these be handwritten, we should generate them.

The pattern for generating them looks like:

```
scheme.AddConversionFunc(&v1.Type{}, &internal.Type{}, func(a, b interface{}, scope conversion.Scope) error {
  return Convert_v1_Type_to_internal_Type(a.(*v1.Type), b.(*internal.Type), scope)
})
```

which matches AddDefaultObjectFunc (which proved out the approach last year). The conversion machinery should then do a simple map lookup based on the incoming types and invoke the function.  Like defaulting, it's up to the caller to match the types to arguments, which we do by generating this code.  This bypasses reflect.Call and in the future allows Golang mid-stack inlining to optimize this code.
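
The map lookup described above can be sketched in a self-contained way. All names here (`converter`, `typePair`, `v1Pod`, `internalPod`) are hypothetical, and the `conversion.Scope` argument from the pattern is left out to keep the sketch short; this is not the apimachinery implementation.

```go
package main

import (
	"fmt"
	"reflect"
)

// typePair identifies a conversion by its source and destination types.
type typePair struct {
	src, dst reflect.Type
}

// conversionFunc is an untyped conversion; the registered closure is
// responsible for asserting the concrete types.
type conversionFunc func(a, b interface{}) error

// converter (hypothetical) holds registered conversions keyed by type pair.
type converter struct {
	funcs map[typePair]conversionFunc
}

func newConverter() *converter {
	return &converter{funcs: make(map[typePair]conversionFunc)}
}

// addConversionFunc registers fn for the types of the sample values a and b.
func (c *converter) addConversionFunc(a, b interface{}, fn conversionFunc) {
	c.funcs[typePair{reflect.TypeOf(a), reflect.TypeOf(b)}] = fn
}

// convert looks up the function by the incoming types and calls it directly,
// with no reflect.Call involved.
func (c *converter) convert(a, b interface{}) error {
	fn, ok := c.funcs[typePair{reflect.TypeOf(a), reflect.TypeOf(b)}]
	if !ok {
		return fmt.Errorf("no conversion registered from %T to %T", a, b)
	}
	return fn(a, b)
}

// v1Pod and internalPod are hypothetical versioned and internal types.
type v1Pod struct{ Name string }
type internalPod struct{ Name string }

// convertV1PodToInternalPod is a hypothetical typed, point-to-point conversion.
func convertV1PodToInternalPod(in *v1Pod, out *internalPod) error {
	out.Name = in.Name
	return nil
}
```

A registration then wraps the typed function in a closure that performs the type assertions, for example `c.addConversionFunc(&v1Pod{}, &internalPod{}, func(a, b interface{}) error { return convertV1PodToInternalPod(a.(*v1Pod), b.(*internalPod)) })`, which mirrors the shape of the generated snippet above.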

As part of this change I strengthened registration of custom functions to be generated instead of hand registered, and also strengthened error checking of the generator when it sees a manual conversion to error out.  Since custom functions are automatically used by the generator, we don't really have a case for not registering the functions.

Once this is fully tested out, we can remove the reflection based path and the old registration methods, and all conversion will work from point to point methods (whether generated or custom).

Much of the need for the reflection path has been removed by changes to generation (to omit fields) and changes to Go (to make assigning equivalent structs easy).

```release-note
NONE
```
2018-07-19 09:29:00 -07:00
Fabio Bertinatto
a15cc29442 Add extra metrics for PV Controller
Specifically:

* Total provision time
* Total PV deletion time
* Number of times PV provisioning failed
* Number of times PV deletion failed
2018-07-19 15:36:37 +02:00
Fabio Bertinatto
97e63985dc Return error in provisionClaimOperation 2018-07-19 15:27:40 +02:00
Seungcheol Ko
43f805b7bd Fix a typo in csiPlugin comment 2018-07-19 21:01:09 +09:00
Kubernetes Submit Queue
357decc9db Merge pull request #63666 from xchapter7x/pkg-scheduler-factory
Automatic merge from submit-queue (batch tested with PRs 58487, 63666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

use subtest for table units (pkg/scheduler/factory)

**What this PR does / why we need it**: Update scheduler's unit table tests to use subtest

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:
breaks up PR: https://github.com/kubernetes/kubernetes/pull/63281
/ref #63267

**Release note**:

```release-note
This PR will leverage subtests on the existing table tests for the scheduler units.
Some refactoring of error/status messages and functions to align with new approach.

```
2018-07-19 02:09:06 -07:00
Kubernetes Submit Queue
5299b6c6b8 Merge pull request #66319 from tallclair/psp-path
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cleanup & fix PodSecurityPolicy field path usage

I noticed the field paths were incorrect for a bunch of PodSecurityPolicy validation errors. This PR fixes the errors, and makes it more explicit what the paths are pointing to in some cases.

**Release note**:
```release-note
NONE
```

/kind cleanup
/sig auth
2018-07-18 22:13:50 -07:00
Tim Allclair
5ace0f03d8 Cleanup & fix PodSecurityPolicy field path usage 2018-07-18 17:47:32 -07:00
Kubernetes Submit Queue
afcc156806 Merge pull request #66350 from aveshagarwal/master-rhbz-1601378
Automatic merge from submit-queue (batch tested with PRs 66175, 66324, 65828, 65901, 66350). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Start cloudResourceSyncsManager before getNodeAnyWay (initializeModules) to avoid kubelet getting stuck in retrieving node addresses from a cloudprovider.

**What this PR does / why we need it**:
This PR starts cloudResourceSyncsManager before getNodeAnyWay (initializeModules) otherwise kubelet gets stuck in setNodeAddress->kl.cloudResourceSyncManager.NodeAddresses() (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L470) forever retrieving node addresses from a cloud provider, and due to this cloudResourceSyncsManager will not be started at all.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
None
```

@ingvagabund @derekwaynecarr @sjenning @kubernetes/sig-node-bugs
2018-07-18 16:42:22 -07:00