Commit Graph

1547 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
9e56e58647 Merge pull request #47177 from rrati/aws-additional-logging
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)

Added logging to AWS api calls. #46969

Additionally logging of when AWS API calls start and end to help diagnose problems with kubelet on cloud provider nodes not reporting node status periodically.  There's some inconsistency in logging around this PR we should discuss.

IMO, the API logging should be at a higher level than most other types of logging as you would probably only want it in limited instances.  For most cases that is easy enough to do, but there are some calls which have some logging around them already, namely in the instance groups.  My preference would be to keep the existing logging as it and just add the new API logs around the API call.
2017-07-20 15:08:20 -07:00
Davanum Srinivas
6139f9ab89 Avoid looking up instance id until we need it
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)

The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
2017-07-20 14:40:10 -04:00
ymqytw
9b393a83d4 update godep 2017-07-20 11:03:49 -07:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Kubernetes Submit Queue
3660ff466f Merge pull request #49235 from dims/allow-cinder-scenarios-without-load-balancer
Automatic merge from submit-queue (batch tested with PRs 49276, 49235)

Don't fail fast if LoadBalancer section is missing

**What this PR does / why we need it**:

We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-07-20 08:18:09 -07:00
Davanum Srinivas
8fd21d67a8 Don't fail fast if LoadBalancer section is missing
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
2017-07-20 07:42:28 -04:00
Di Xu
2cddfd0db9 fix bug when azure cloud provider configuration file is not specified 2017-07-20 17:29:09 +08:00
Kubernetes Submit Queue
acc19cafa4 Merge pull request #49231 from dims/tolerate-flavor-info-keys
Automatic merge from submit-queue

Tolerate Flavor information for computing instance type

**What this PR does / why we need it**:
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api

So let's just inspect what is present and use that to figure out
the instance type.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-07-20 00:33:46 -07:00
Kubernetes Submit Queue
de71cc50d5 Merge pull request #49261 from heidecke/on-premises
Automatic merge from submit-queue

Fix on-premises term in error string and comments for aws provider

**What this PR does / why we need it**: fix for correct terminology of "on-premises" over "on-premise"

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a

**Special notes for your reviewer**: Updated error string while doing a scrub for the incorrect term in the docs (kubernetes/kubernetes.github.io#4413).

**Release note**:

```release-note
NONE
```
2017-07-19 23:03:26 -07:00
Kubernetes Submit Queue
ea18935670 Merge pull request #45540 from edevil/azure_extra_logging
Automatic merge from submit-queue (batch tested with PRs 49083, 45540, 46862)

Add extra logging to azure API get calls

**What this PR does / why we need it**:

This PR adds extra logging for external calls to the Azure API, specifically get calls.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

This will help troubleshoot problems arising from the usage of this cloudprovider. For example, it looks like #43516 is caused by a call to the cloudprovider taking too much time.
2017-07-19 21:18:25 -07:00
Luke Heidecke
c8b6924424 Fix on-premises term in error string and comments 2017-07-20 13:04:33 +09:00
Kubernetes Submit Queue
ecadada7ef Merge pull request #48967 from jackfrancis/azure-lb-backoff
Automatic merge from submit-queue (batch tested with PRs 49218, 48253, 48967, 48460, 49230)

additional backoff in azure cloudprovider

Fixes #48971

**What this PR does / why we need it**:

We want to be able to opt in to backoff retry logic for kubelet-originating request behavior: node IP address resolution and node load balancer pool membership enforcement.

**Special notes for your reviewer**:

The use-case for this is azure cloudprovider clusters with large node counts, especially during cluster installation, or other scenarios when lots of nodes come online at once and attempt to register all resources with the backend API. To allow clusters at scale more control over the API request rate in-cluster, backoff config has the ability to meaningful slow down this rate, when appropriate.

**Release note**:

```additional backoff in azure cloudprovider
```
2017-07-19 20:05:34 -07:00
Davanum Srinivas
c197e6238d Tolerate Flavor information for computing instance type
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api

So let's just inspect what is present and use that to figure out
the instance type.
2017-07-19 16:06:53 -04:00
Brendan Burns
38b1b74f82 Fix up imds, also refactor for better testing. 2017-07-19 12:53:08 -07:00
Kubernetes Submit Queue
a0e7114ab3 Merge pull request #48702 from FengyunPan/cloudprovider-rackspace
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)

Rackspace for cloud-controller-manager

This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.

This is part of #47257 cc @wlan0

**Release note**:
```release-note
NONE
```
2017-07-18 20:06:14 -07:00
André Cruz
4071a36c12 Add extra logging to azure API calls 2017-07-18 14:40:28 +01:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
Nick Sardo
9b29f42fc5 Further removal of Gets from Creates 2017-07-15 19:41:21 -07:00
Robert Rati
92f030ca24 Added logging to AWS api calls. #46969 2017-07-14 21:37:05 -04:00
Jack Francis
f76ef29512 backing off az.getIPForMachine in az.NodeAddresses
also rate limiting the call to az.getVirtualMachine inside az.getIPForMachine
2017-07-14 17:13:40 -07:00
Kubernetes Submit Queue
8c8f562204 Merge pull request #48872 from FengyunPan/fix-order
Automatic merge from submit-queue (batch tested with PRs 48890, 46893, 48872, 48896)

Fix the order of deletion

1. EnsureLoadBalancer can't delete pool without deleting members,
   just let EnsureLoadBalancerDeleted do it.
2. Add some friendly error message

**Release note**:
```release-note
NONE
```
2017-07-14 16:49:53 -07:00
Jack Francis
2525ef9983 VirtualMachinesClient.Get backoff in lb pool logic
EnsureHostInPool() submits a GET to azure API for VM info. We’re seeing this on agent node kubelets and would like to enable configurable backoff engagement for 4xx responses to be able to slow down the rate of reconciliation, when appropriate.
2017-07-14 15:16:47 -07:00
Kubernetes Submit Queue
df47592d5a Merge pull request #48854 from colemickens/msi
Automatic merge from submit-queue (batch tested with PRs 47066, 48892, 48933, 48854, 48894)

azure: msi: add managed identity field, logic

**What this PR does / why we need it**: Enables managed service identity support for the Azure cloudprovider. "Managed Service Identity" allows us to ask the Azure Compute infra to provision an identity for the VM. Users can then retrieve the identity and assign it RBAC permissions to talk to Azure ARM APIs for the purpose of the cloudprovider needs.

Per the commit text:
```
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
```

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a 

**Special notes for your reviewer**: none

**Release note**:

```release-note
azure: support retrieving access tokens via managed identity extension
```

cc: @brendandburns @jdumars @anhowe
2017-07-14 12:50:55 -07:00
Kubernetes Submit Queue
8532cdfd69 Merge pull request #48886 from mikedanese/cleanup
Automatic merge from submit-queue

remove some people from OWNERS so they don't get reviews anymore

These are googlers who don't work on the project anymore but are still
getting reviews assigned to them:
- @bprashanth
- @rjnagal
- @vmarmol
2017-07-14 11:46:10 -07:00
Kubernetes Submit Queue
9e97b5249b Merge pull request #46360 from khenidak/azure-pd-final
Automatic merge from submit-queue

Azure PD (Managed/Blob)

This is exactly the same code as this [PR](https://github.com/kubernetes/kubernetes/pull/41950). It has a clean set of generated items. We created a separate PR to accelerate the accept/merge the PR

CC @colemickens 
CC @brendandburns 

**What this PR does / why we need it**:

1. Adds K8S support for Azure Managed Disks. 
2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account). 
3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.    
2. Addresses the current issues with Blob Disks:
..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted. 
..* Adds support to move disks between nodes.
..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.  

The existing AzureDisk is used as is. Additional "kind" property was added  allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).

Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.
2017-07-14 09:57:51 -07:00
Khaled Henidak & Andy Zhang
677e593d86 Add Azure managed disk support 2017-07-14 14:09:44 +08:00
Cole Mickens
8f55afd0cb azure: refactor azure.go to make auth reusable 2017-07-13 14:27:37 -07:00
Cole Mickens
4521c2312c azure: msi: add managed identity field, logic
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
2017-07-13 14:27:37 -07:00
Minhan Xia
a471140e13 fix gce cloud provider projects api 2017-07-13 14:00:02 -07:00
Mike Danese
c201553f27 remove some people from OWNERS so they don't get reviews anymore
These are googlers who don't work on the project anymore but are still
getting reviews assigned to them:
- bprashanth
- rjnagal
- vmarmol
2017-07-13 10:02:21 -07:00
gmarek
afe1a2c71b Revert "Merge pull request #48560 from nicksardo/gce-network-project"
This reverts commit d4881dd491, reversing
changes made to b5c4346130.
2017-07-13 18:34:24 +02:00
FengyunPan
a1be23679c Fix the order of deletion
1. EnsureLoadBalancer can't delete pool without deleting members,
   just let EnsureLoadBalancerDeleted do it.
2. Add some friendly error message
2017-07-13 21:10:23 +08:00
Kubernetes Submit Queue
74f1943774 Merge pull request #48849 from nicksardo/gce-panic-fix
Automatic merge from submit-queue (batch tested with PRs 48555, 48849)

GCE: Fix panic when service loadbalancer has static IP address

Fixes #48848 

```release-note
Fix service controller crash loop when Service with GCP LoadBalancer uses static IP (#48848, @nicksardo)
```
2017-07-12 23:59:03 -07:00
Nick Sardo
98368d974e Remove address getter from CreateAddress(Region and Global) 2017-07-12 20:06:18 -07:00
Kubernetes Submit Queue
3c080e83c7 Merge pull request #48642 from freehan/gce-api-endpint
Automatic merge from submit-queue

Support GCE alpha/beta api endpoint override

fixes: https://github.com/kubernetes/kubernetes/issues/48568
2017-07-12 18:23:37 -07:00
Kubernetes Submit Queue
30e865e456 Merge pull request #48829 from vmware/vsphere-ByProviderID
Automatic merge from submit-queue (batch tested with PRs 48781, 48817, 48830, 48829, 48053)

vSphere for cloud-controller-manager

**What this PR does / why we need it**:
This is to implement the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods for cloud-controller-manager for vSphere cloud provider.

Currently vSphere cloud provider only supports VMs in the same folder.
Thus `NodeAddressesByProviderID` is similar to `NodeAddresses` with a simple ProviderID to NodeName translation.

`InstanceTypeByProviderID`  returns nil as same as `InstanceType`.

**Which issue this PR fixes**
Part of Issue https://github.com/kubernetes/kubernetes/issues/47257

**Release note**:
```NONE
```
2017-07-12 15:11:14 -07:00
Minhan Xia
3e8b4a27c4 use overrided api endpoint in gce cloud provider 2017-07-12 15:10:13 -07:00
Kubernetes Submit Queue
d230956280 Merge pull request #48243 from brendandburns/imds
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)

Add initial support for the Azure instance metadata service.

Part of fixing #46632

@colemickens @rootfs @jdumars @kris-nova
2017-07-12 14:08:13 -07:00
Kubernetes Submit Queue
5ed8734649 Merge pull request #48801 from FengyunPan/fix-panic
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)

Fix panic of DeleteRoute()

Fix #48800
It should be 'addr_pairs', not 'routes'.

**Release note**:
```release-note
NONE
```
2017-07-12 14:08:07 -07:00
Minhan Xia
811597926a support GCE alpha beta API override 2017-07-12 13:46:52 -07:00
Kubernetes Submit Queue
aeb326e9bc Merge pull request #48704 from FengyunPan/remove-dead-code
Automatic merge from submit-queue

Remove dead code for OpenStack provider

**Release note**:
```release-note
NONE
```
2017-07-12 13:06:04 -07:00
Miao Luo
d327ac6c76 vSphere for cloud-controller-manager
Implement NodeAddressesByProviderID and InstanceTypeByProviderID for vsphere cloud provider.
2017-07-12 11:35:16 -07:00
Brendan Burns
29a0c6f56a Code updates for new SDK. 2017-07-12 06:09:31 -07:00
Kubernetes Submit Queue
3ade1a155d Merge pull request #47593 from fgimenez/cloudprovider-openstack-byid
Automatic merge from submit-queue (batch tested with PRs 47948, 48631, 48693, 48549, 47593)

OpenStack for cloud-controller-manager

**What this PR does / why we need it**:
This implements the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods used by the cloud-controller-manager to the OpenStack provider. The instance type returned is the flavor name, for consistency `InstanceType` has been implemented too returning the same value.

```release-note
NONE
```

This is part of #47257 cc @wlan0
2017-07-12 04:04:00 -07:00
FengyunPan
cd29146317 Fix panic of DeleteRoute()
Fix #48800
It should be 'addr_pairs', not 'routes'.
2017-07-12 17:28:58 +08:00
Kubernetes Submit Queue
a3430ad0c3 Merge pull request #47232 from gyliu513/remove-mesos-cp
Automatic merge from submit-queue

Removed mesos as cloud provider from Kubernetes.

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47205

**Special notes for your reviewer**:

**Release note**:

```release-note
Move Mesos Cloud Provider out of Kubernetes Repo
```
2017-07-12 00:08:20 -07:00
FengyunPan
703b3761fe Remove dead code for OpenStack provider 2017-07-10 20:59:39 +08:00
FengyunPan
0154bd279d Rackspace for cloud-controller-manager
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.

This is part of #47257 cc @wlan0
2017-07-10 20:43:07 +08:00
Cao Shufeng
0c577c47d5 Use glog.*f when a format string is passed
ref:
https://godoc.org/github.com/golang/glog

I use the following commands to search all the invalid usage:
$ grep "glog.Warning(" * -r | grep %
$ grep "glog.Info(" * -r | grep %
$ grep "glog.Error(" * -r | grep %
$ grep ").Info(" * -r | grep % | grep "glog.V("
2017-07-10 19:04:03 +08:00
Guangya Liu
498b034492 Removed mesos as cloud provider from Kubernetes. 2017-07-09 21:54:57 -04:00
Kubernetes Submit Queue
d4881dd491 Merge pull request #48560 from nicksardo/gce-network-project
Automatic merge from submit-queue (batch tested with PRs 48497, 48604, 48599, 48560, 48546)

GCE: Use network project id for firewall/route mgmt and zone listing

- Introduces a new environment variable for plumbing the network project id which will be used for firewall and route management. fixes #48515
- onXPN is determined by metadata if config is not specified
- Split `if` conditions: fixes #48521
- Remove `getNetworkNameViaAPICall` which was used as a last resort for the `networkURL` (if empty) which was previously filled with the metadata network project & name.

**Release note**:
```release-note
NONE
```
2017-07-08 07:09:36 -07:00
Kubernetes Submit Queue
9fcb8b847e Merge pull request #48336 from FengyunPan/fix-delete-empty-monitors
Automatic merge from submit-queue

Fix deleting empty monitors

Fix #48094
When create-monitor of cloud-config is false, pool has not monitor
and can not delete empty monitor.

**Release note**:
```release-note
NONE
```
2017-07-08 06:02:45 -07:00
Kubernetes Submit Queue
954c356dc5 Merge pull request #48348 from FengyunPan/check-openstack-Opts
Automatic merge from submit-queue (batch tested with PRs 47234, 48410, 48514, 48529, 48348)

Check opts of cloud config file

Fix #48347
Check opts when register OpenStack CloudProvider rather than
returning error when use opts to create/use cloud resource.

**Release note**:
```release-note
NONE
```
2017-07-07 23:53:40 -07:00
Derek Carr
b6fabe5b9e Warn if aws has no cluster id provided 2017-07-07 11:57:20 -04:00
FengyunPan
d2ebb60438 Check opts of cloud config file
Fix #48347
Check opts when register OpenStack CloudProvider rather than
returning error when use opts to create/use cloud resource.
2017-07-07 17:05:21 +08:00
Nick Sardo
62d13f1379 Use API that utilizes networkProjectId 2017-07-06 18:13:02 -07:00
Nick Sardo
06e328627c Use network project id for firewall/route mgmt and zone listing 2017-07-06 16:58:27 -07:00
Brendan Burns
7644c6afc6 Add initial support for the Azure instance metadata service. 2017-07-06 06:56:39 -07:00
Davanum Srinivas
927a4a0a68 Volunteer to help with OpenStack provider reviews
I'd like to help with keeping the OpenStack cloud provider up-to-date
2017-07-06 08:43:43 -04:00
Kubernetes Submit Queue
9dd6a935fc Merge pull request #48501 from FengyunPan/enable-ServiceAffinity
Automatic merge from submit-queue

Enable Service Affinity for OpenStack cloudprovider

Fix issue: #48500
Kubernetes's OpenStack cloudprovider can't set persistence to "SOURCE_IP"

**Release note**:
```release-note
NONE
```
2017-07-05 20:45:26 -07:00
FengyunPan
6ee05783c2 Enable Service Affinity for OpenStack cloudprovider.
Fix issue: #48500
Kubernetes's OpenStack cloudprovider can't set LB's persistence
to "SOURCE_IP".
2017-07-06 09:25:31 +08:00
Kubernetes Submit Queue
d816555e44 Merge pull request #48121 from sakshamsharma/add-kms-dep
Automatic merge from submit-queue (batch tested with PRs 48292, 48121)

Add Google cloudkms dependency, add cloudkms service to GCE cloud provider

Required to introduce a Google KMS based envelope encryption, which shall allow encrypting secrets at rest using KEK-DEK scheme.

The above requires KMS API to create/delete KeyRings and CryptoKeys, and Encrypt/Decrypt data.

Should target release 1.8

@jcbsmpsn 

Update: It appears that Godep only allows dependencies which are in use. We may have to modify this PR to include some Google KMS code.

Progresses #48522
2017-07-05 17:41:40 -07:00
Cosmin Cojocar
afafb3f231 Use the azure certificate password when decoding the certificate 2017-07-04 08:56:40 +02:00
Kubernetes Submit Queue
8f9c57ca53 Merge pull request #47919 from rrati/aws-handle-logs-with-return-keys
Automatic merge from submit-queue

Use %q formatter for error messages from the AWS SDK. #47789

Error messages from the AWS SDK can have return keys in them, so use %q formatter for those messages.
2017-07-03 09:41:50 -07:00
Kubernetes Submit Queue
c0337c92cc Merge pull request #47881 from cadmuxe/endpoint
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)

Add ApiEndpoint support to GCE config.

**What this PR does / why we need it**:
Add the ability to change ApiEndpoint  for GCE.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
None
```
2017-06-30 18:42:40 -07:00
FengyunPan
643afd3ffc Fix deleting empty monitors
Fix #48094
When create-monitor of cloud-config is false, pool has not monitor
and can not delete empty monitor.
2017-06-30 23:46:36 +08:00
Koonwah Chen
0db5b37165 testing fixed
hack/verify-gofmt.sh and hack/verify-flags-underscore.py
2017-06-29 10:42:29 -07:00
Kubernetes Submit Queue
db46e4f8e6 Merge pull request #47286 from cosmincojocar/client_cert_azure_cloud_provider
Automatic merge from submit-queue (batch tested with PRs 47286, 47729)

Add client certificate authentication to Azure cloud provider

This adds support for client cert authentication in Azure cloud provider. The certificate can be provided in PKCS #12 format with password protection. Not that this authentication will be active only when no client secret is configured.

cc @brendandburns @colemickens
2017-06-28 23:14:29 -07:00
Saksham Sharma
57e8461662 Add Google cloudkms service to gce cloud provider 2017-06-28 16:56:01 -07:00
Kubernetes Submit Queue
a7f16b553b Merge pull request #48003 from MrHohn/gce-xlb-cleanup
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)

Pipe clusterID into gce_loadbalancer_external.go

**What this PR does / why we need it**: Small cleanup for GCE ELB codes.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #48002

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-06-27 14:08:21 -07:00
Kubernetes Submit Queue
7dfa61a2d9 Merge pull request #47947 from zouyee/opa
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)

fix comment mistake

fix comment mistake


**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-24 04:15:55 -07:00
Kubernetes Submit Queue
e22215d38e Merge pull request #47942 from zouyee/op
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)

update openstack metadata-service url

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-06-24 04:15:54 -07:00
Zihong Zheng
baca8a1490 Pipe clusterID into gce_loadbalancer_external.go 2017-06-23 15:54:04 -07:00
Kubernetes Submit Queue
72cb080c87 Merge pull request #46181 from FengyunPan/ignore-LBnotfound
Automatic merge from submit-queue

Ignore ErrNotFound when delete LB resources

IsNotFound error is fine since that means the object is
deleted already, so let's check it before return error.
2017-06-23 09:35:11 -07:00
Robert Rati
d6a5175c05 Use %q formatter for error messages from the AWS SDK. #47789 2017-06-23 10:02:21 -04:00
Cosmin Cojocar
0235cb9e3c Fix dependencies order after rebase 2017-06-23 13:20:10 +02:00
Kubernetes Submit Queue
aaa5b2b642 Merge pull request #47575 from justinsb/fix_36902
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)

AWS: Fix suspicious loop comparing permissions

Because we only ever call it with a single UserId/GroupId, this would
not have been a problem in practice, but this fixes the code.

Fix #36902 

```release-note
NONE
```
2017-06-23 04:06:25 -07:00
Cosmin Cojocar
fcdceb2e50 Add the pcks12 package to the build of Azure cloud provider 2017-06-23 12:19:56 +02:00
Cosmin Cojocar
2c8ec115db Fix tests after rebasing 2017-06-23 12:17:17 +02:00
Cosmin Cojocar
5462d06ce3 Add client cert authentication for Azure cloud provider 2017-06-23 12:17:17 +02:00
Federico Gimenez
37951c336b OpenStack for cloud-controller-manager 2017-06-23 08:53:19 +02:00
zouyee
5e56e5294a fix comment mistake 2017-06-23 14:06:46 +08:00
Kubernetes Submit Queue
be0b045072 Merge pull request #47401 from justinsb/fix_39374
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)

AWS: Set CredentialsChainVerboseErrors

This avoids a rather confusing error message.

Fix #39374

```release-note
NONE
```
2017-06-22 21:33:34 -07:00
zouyee
39552417fe update openstack metadata-service url 2017-06-23 10:50:20 +08:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Koonwah Chen
65b2f71ee7 Add ApiEndpoint support to GCE config. 2017-06-21 15:27:10 -07:00
Kubernetes Submit Queue
0f0e017ade Merge pull request #45473 from karataliu/AzureInternalLoadBalancerE2E
Automatic merge from submit-queue

Add E2E tests for Azure internal loadbalancer support, fix an issue for public IP resource deletion.

**What this PR does / why we need it**:

- Add E2E tests for Azure internal loadbalancer support: https://github.com/kubernetes/kubernetes/pull/43510
- Fix an issue that public IP resource not get deleted when switching from external loadbalancer to internal static loadbalancer.

**Special notes for your reviewer**:

1.  Add new Azure resource tag to Public IP resources to indicate kubernetes managed resources.
   Currently we determine whether the public IP resource should be deleted by looking at LoadBalancerIp property on spec. In the scenario 'Switching from external loadbalancer to internal loadbalancer with static IP', that value might have been updated for internal loadbalancer. So here we're to add an explicit tag for kubernetes managed resources.

2. Merge cleanupPublicIP logic into cleanupLoadBalancer

**Release note**:
NONE

CC @brendandburns @colemickens
2017-06-21 11:41:22 -07:00
Kubernetes Submit Queue
1499b6bddc Merge pull request #45268 from redbaron/aws-elb-attach-sgs
Automatic merge from submit-queue

New annotation to add existing Security Groups to ELBs created by AWS cloudprovider

**What this PR does / why we need it**:
When K8S cluster is deployed in existing VPC there might be a need to attach extra SecurityGroups to ELB created by AWS cloudprovider. Example of it can be cases, where such Security Groups are maintained by another team.

**Special notes for your reviewer**:
For tests to pass depends on https://github.com/kubernetes/kubernetes/pull/45168  and therefore includes it

**Release note**:
```release-note
New 'service.beta.kubernetes.io/aws-load-balancer-extra-security-groups' Service annotation to specify extra Security Groups to be added to ELB created by AWS cloudprovider
```
2017-06-20 18:06:29 -07:00
Kubernetes Submit Queue
5780cd06d1 Merge pull request #47572 from justinsb/fix_47524
Automatic merge from submit-queue

AWS: Remove blackhole routes in our managed range

Blackhole routes otherwise acccumulate unboundedly.  We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.

Fix #47524

```release-note
AWS: clean up blackhole routes when using kubenet
```
2017-06-20 17:00:30 -07:00
Kubernetes Submit Queue
7831a5426f Merge pull request #47605 from brendandburns/container
Automatic merge from submit-queue (batch tested with PRs 47562, 47605)

Change Container permissions to Private for provisioned Azure Volumes

@rootfs @philips #47611
2017-06-15 21:54:30 -07:00
Brendan Burns
f07ac3efc6 Change Container permissions to Private. 2017-06-16 01:40:10 +00:00
Justin Santa Barbara
737607ba6b AWS: Fix suspicious loop comparing permissions
Because we only ever call it with a single UserId/GroupId, this would
not have been a problem in practice, but this fixes the code.

Fix #36902
2017-06-15 09:20:41 -04:00
Justin Santa Barbara
3d2b71b78f AWS: Maintain a cache of all instances for ELB
We maintain a cache of all instances, and we invalidate the cache
whenever we see a new instance.  For ELBs that should be sufficient,
because our usage is limited to instance ids and security groups, which
should not change.

Fix #45050
2017-06-14 23:39:18 -04:00
Kubernetes Submit Queue
8e4ec18adf Merge pull request #47516 from gnufied/fix-filter-limit-aws
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537)

Batch AWS getInstancesByNodeNames calls with FilterNodeLimit

We are going to limit the getInstancesByNodeNames call with a batch
size of 150.

Fixes - #47271

```release-note
AWS: Batch DescribeInstance calls with nodeNames to 150 limit, to stay within AWS filter limits.
```
2017-06-14 20:32:45 -07:00
Justin Santa Barbara
11f8886f12 AWS: Remove blackhole routes in our managed range
Blackhole routes otherwise acccumulate unboundedly.  We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.

Fix #47524
2017-06-14 23:02:55 -04:00
Dong Liu
f8ae27db57 Add E2E tests for Azure internal loadbalancer support, fix an issue for public IP resource deletion. 2017-06-15 10:52:18 +08:00
Kubernetes Submit Queue
b361814e8e Merge pull request #47411 from justinsb/fix_47409
Automatic merge from submit-queue (batch tested with PRs 47470, 47260, 47411, 46852, 46135)

AWS: Remove getInstancesByRegex (dead code)

Fix #47409

```release-note
NONE
```
2017-06-14 12:52:21 -07:00
Kubernetes Submit Queue
6c38d009ce Merge pull request #47406 from justinsb/fix_47404
Automatic merge from submit-queue

AWS: Process disk attachments even with duplicate NodeNames

Fix #47404


```release-note
AWS: Process disk attachments even with duplicate NodeNames
```
2017-06-14 10:21:20 -07:00
Hemant Kumar
ffa622f9c7 Batch AWS getInstancesByNodeNames calls with FilterNodeLimit
We are going to limit the getInstancesByNodeNames call with a batch
size of 150
2017-06-14 10:46:46 -04:00
Kubernetes Submit Queue
f2ccb3594f Merge pull request #47459 from nicksardo/gce-ilb-fixes
Automatic merge from submit-queue (batch tested with PRs 47302, 47389, 47402, 47468, 47459)

[GCE] Fix ILB sharing and GC 

Fixes #47092 

- Users must opt-in for sharing backend services (alpha feature - may be removed in future release)
- Shared backend services use a hash for determining similarity via settings (so far, only sessionaffinity) (again, this may be removed)
- Move resource cleanup to after the ILB setup.

/assign @bowei 

**Release note**:
```release-note
NONE
```
2017-06-13 23:37:54 -07:00
Nick Sardo
efc2989dde Final fixes 2017-06-13 15:39:41 -07:00
Nick Sardo
3ea26e7436 Annotation for opting into backend sharing; Use hash suffix for sharing; Fix resource GC 2017-06-13 13:22:12 -07:00
Kubernetes Submit Queue
48bea51d04 Merge pull request #47399 from justinsb/fix_41904
Automatic merge from submit-queue (batch tested with PRs 46929, 47391, 47399, 47428, 47274)

AWS: Richer log message when metadata fails

Not a resolution, but should at least help determine the issue.

Issue #41904

```release-note
NONE
```
2017-06-13 10:52:11 -07:00
Kubernetes Submit Queue
d216cfc41a Merge pull request #47391 from justinsb/fix_47067
Automatic merge from submit-queue (batch tested with PRs 46929, 47391, 47399, 47428, 47274)

AWS: Perform ELB listener comparison in case-insensitive manner

Fix #47067

```release-note
AWS: Avoid spurious ELB listener recreation - ignore case when matching protocol
```
2017-06-13 10:52:08 -07:00
Justin Santa Barbara
b87c4398c7 AWS: Remove getInstancesByRegex (dead code)
Fix #47409
2017-06-13 12:37:45 -04:00
Justin Santa Barbara
bd526b0bc0 AWS: Process disk attachments even with duplicate NodeNames
Fix #47404
2017-06-13 03:09:43 -04:00
Justin Santa Barbara
9803840b5f AWS: Perform ELB listener comparison in case-insensitive manner
Fix #47067
2017-06-13 02:22:38 -04:00
Justin Santa Barbara
bad277e98b AWS: Set CredentialsChainVerboseErrors
This avoids a rather confusing error message.

Fix #39374
2017-06-13 01:56:10 -04:00
Justin Santa Barbara
9d8a721bb9 AWS: Richer log message when metadata fails
Not a resolution, but should at least help determine the issue.

Issue #41904
2017-06-13 01:46:09 -04:00
Justin Santa Barbara
30ecfbc7ee aws: remove redundant tests 2017-06-13 01:19:23 -04:00
Justin Santa Barbara
0a174089cd Use awsInstanceID to query instances
Also reuse existing mapping code, rather than reimplementing.

Issue #47394
2017-06-13 01:19:23 -04:00
Justin Santa Barbara
8aad321d69 Create strong typed awsInstanceID 2017-06-13 01:19:19 -04:00
Justin Santa Barbara
f10c9eed69 Follow our go code style: error -> err
Issue #47394
2017-06-13 01:07:07 -04:00
Kubernetes Submit Queue
ea3a896f2c Merge pull request #47215 from ublubu/aws-addresses
Automatic merge from submit-queue

AWS for cloud-controller-manager

fixes #47214

This implements the NodeAddressesByProviderID and InstanceTypeByProviderID methods used by the cloud-controller-manager for the AWS provider.

NodeAddressesByProvider uses DescribeInstances (for normal addresses) and DescribeAddresses (for Elastic IP addresses).

InstanceTypeByProviderID uses DescribeInstances.

```release-note
NONE
```
2017-06-11 17:33:51 -07:00
Kubernetes Submit Queue
67730881a6 Merge pull request #46940 from realfake/azure-cloud-controller-manager
Automatic merge from submit-queue

Azure for cloud-controller-manager

**What this PR does / why we need it**:
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID methods used by the cloud-controller-manager to the Azure provider.

**Release note**:

```release-note
NONE
```
Addresses #47257
2017-06-10 17:28:44 -07:00
Maxim Ivanov
2e5773b45d New Service annotation to specify ELB SGs
Service objects can be annotated with
`service.beta.kubernetes.io/aws-load-balancer-extra-security-groups`

to specify existing security groups to be added to ELB
created by AWS cloudprovider
2017-06-09 12:10:33 +01:00
ublubu
c261f98a60 bugfix for ProviderID parsing & corresponding unit test 2017-06-08 23:12:28 -04:00
ublubu
bc9d2e8832 use aws://[instance-id] as the ProviderID 2017-06-08 22:09:08 -04:00
ublubu
baa85c830a InstanceTypeByProviderID 2017-06-07 23:47:59 -04:00
Kynan Rilee
17783afc94 NodeAddressesByProviderID for AWS cloudprovider 2017-06-07 23:47:59 -04:00
Kubernetes Submit Queue
3adb9b428b Merge pull request #46660 from jackfrancis/azure-cloudprovider-backoff
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)

Azure cloudprovider retry using flowcontrol

An initial attempt at engaging exponential backoff for API error responses.

Addresses #47048

Uses k8s.io/client-go/util/flowcontrol; implementation inspired by GCE
cloudprovider backoff.



**What this PR does / why we need it**:

The existing azure cloudprovider implementation has no guard rails in place to adapt to unexpected underlying operational conditions (i.e., clogs in resource plumbing between k8s runtime and the cloud API). The purpose of these changes is to support exponential backoff wrapping around API calls; and to support targeted rate limiting. Both of these options are configurable via `--cloud-config`.

Implementation inspired by the GCE's use of `k8s.io/client-go/util/flowcontrol` and `k8s.io/apimachinery/pkg/util/wait`, this PR likewise uses `flowcontrol` for rate limiting; and `wait` to thinly wrap backoff retry attempts to the API.

**Special notes for your reviewer**:


Pay especial note to the declaration of retry-able conditions from an unsuccessful HTTP request:
- all `4xx` and `5xx` HTTP responses
- non-nil error responses

And the declaration of retry success conditions:
- `2xx` HTTP responses

Tests updated to include additions to `Config`.

Those may be incomplete, or in other ways non-representative.

**Release note**:

```release-note
Added exponential backoff to Azure cloudprovider
```
2017-06-07 13:30:58 -07:00
Jack Francis
acb65170f3 preferring float32 for rate limit QPS param 2017-06-06 22:21:14 -07:00
Jack Francis
2accbbd618 go vet errata 2017-06-06 22:12:49 -07:00
Jack Francis
6d73a09dcc rate limiting everywhere
not waiting to rate limit until we get an error response from the API, doing so on initial request for all API requests
2017-06-06 22:09:57 -07:00
Jack Francis
148e923f65 az.getVirtualMachine already rate-limited
we don’t need to rate limit the calls _to_ it
2017-06-06 14:55:07 -07:00
Jack Francis
ac931aa1e0 rate limiting on all azure sdk GET requests 2017-06-06 11:19:29 -07:00
Jack Francis
af5ce2fcc5 test coverage
We want to ensure that backoff and rate limit configuration is opt-in
2017-06-06 09:50:28 -07:00
Christoph Blecker
1bdc7a29ae
Update docs/ URLs to point to proper locations 2017-06-05 22:13:54 -07:00
Jack Francis
3f3aa279b9 configurable backoff
- leveraging Config struct (—cloud-config) to store backoff and rate limit on/off and performance configuration
- added add’l error logging
- enabled backoff for vm GET requests
2017-06-05 16:06:50 -07:00
realfake
7bc205fc59 Implement *ByProviderID methods 2017-06-05 22:56:09 +02:00
realfake
fc748662ef Add splitProviderID for azure 2017-06-05 22:56:09 +02:00
Nick Sardo
025f178b7e Use new kubelet apis pkg for labels 2017-06-04 10:26:33 -07:00
Nick Sardo
7248c61ea5 Update test utilities & build file 2017-06-04 10:25:05 -07:00
Nick Sardo
05aaef3edc Hook external & internal lb together 2017-06-04 10:25:05 -07:00
Nick Sardo
660452dee1 Add internal LB logic 2017-06-04 10:25:05 -07:00
Nick Sardo
1283d65538 Modify external LB logic 2017-06-04 10:25:05 -07:00
Nick Sardo
2cdaf1f32b Refactor compute API calls 2017-06-04 10:25:05 -07:00
Nick Sardo
b631061f05 Rename gce_staticip.go to gce_addresses.go 2017-06-04 10:25:05 -07:00
Nick Sardo
66773fea4b Rename gce_loadbalancer.go to gce_loadbalancer_external.go 2017-06-04 10:25:05 -07:00
Kubernetes Submit Queue
4220b7303e Merge pull request #45500 from nbutton23/nbutton-aws-elb-security-group
Automatic merge from submit-queue (batch tested with PRs 36721, 46483, 45500, 46724, 46036)

AWS: Allow configuration of a single security group for ELBs

**What this PR does / why we need it**:
AWS has a hard limit on the number of Security Groups (500).  Right now every time an ELB is created Kubernetes is creating a new Security Group.  This allows for specifying a Security Group to use for all ELBS

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
For some reason the Diff tool makes this look like it was way more changes than it really was. 
**Release note**:

```release-note
```
2017-06-03 08:08:40 -07:00
Kubernetes Submit Queue
348bf1e032 Merge pull request #46627 from deads2k/api-12-labels
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)

move labels to components which own the APIs

During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery.  They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet.  This pull moves the labels into their owning components and out of API machinery.

@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers 
@derekwaynecarr  since most of these are related to the kubelet
2017-06-02 23:37:38 -07:00
Kubernetes Submit Queue
0d4fda7746 Merge pull request #46462 from vmware/vsphere-storage-metrics
Automatic merge from submit-queue (batch tested with PRs 41563, 45251, 46265, 46462, 46721)

Add metric collections for vSphere cloud provider operations

**What this PR does / why we need it**:
This PR adds metric collections for vSphere Cloud Provider Operations.

**Which issue this PR fixes** 
fixes #

**Special notes for your reviewer**:
Verified Prometheus pod is able to scrape vSphere metrics from Kubernetes Controller’s URL.

`providers/vsphere/vsphere.go` file is intentionally kept not formatted with gofmt, to keep diff review-able.

After review is complete, I will apply the formatting. 

**Release note**:

```release-note
None
```

@BaluDontu @tusharnt

@gnufied Verified with executing various operations on the Kubernetes Cluster deployed using this change.

```
$ curl -s 10.160.18.128:10252/metrics | grep "cloudprovider_vsphere"
# HELP cloudprovider_vsphere_api_request_duration_seconds Latency of vsphere api call
# TYPE cloudprovider_vsphere_api_request_duration_seconds histogram
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.005"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.01"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.025"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.05"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.1"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.25"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="0.5"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="1"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="2.5"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="5"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="10"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="AttachVolume",le="+Inf"} 3
cloudprovider_vsphere_api_request_duration_seconds_sum{request="AttachVolume"} 3.9742241939999996
cloudprovider_vsphere_api_request_duration_seconds_count{request="AttachVolume"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.005"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.01"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.025"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.05"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.1"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.25"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="0.5"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="1"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="2.5"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="5"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="10"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="CreateVolume",le="+Inf"} 1
cloudprovider_vsphere_api_request_duration_seconds_sum{request="CreateVolume"} 0.920856776
cloudprovider_vsphere_api_request_duration_seconds_count{request="CreateVolume"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.005"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.01"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.025"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.05"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.1"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.25"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="0.5"} 2
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="1"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="2.5"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="5"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="10"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DeleteVolume",le="+Inf"} 3
cloudprovider_vsphere_api_request_duration_seconds_sum{request="DeleteVolume"} 1.3301585450000002
cloudprovider_vsphere_api_request_duration_seconds_count{request="DeleteVolume"} 3
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.005"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.01"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.025"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.05"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.1"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.25"} 0
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="0.5"} 1
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="1"} 4
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="2.5"} 6
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="5"} 6
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="10"} 6
cloudprovider_vsphere_api_request_duration_seconds_bucket{request="DetachVolume",le="+Inf"} 6
cloudprovider_vsphere_api_request_duration_seconds_sum{request="DetachVolume"} 5.350829375
cloudprovider_vsphere_api_request_duration_seconds_count{request="DetachVolume"} 6
# HELP cloudprovider_vsphere_api_request_errors vsphere Api errors
# TYPE cloudprovider_vsphere_api_request_errors counter
cloudprovider_vsphere_api_request_errors{request="DeleteVolume"} 4
# HELP cloudprovider_vsphere_operation_duration_seconds Latency of vsphere operation call
# TYPE cloudprovider_vsphere_operation_duration_seconds histogram
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="0.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="1"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="2.5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="10"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="AttachVolumeOperation",le="+Inf"} 3
cloudprovider_vsphere_operation_duration_seconds_sum{operation="AttachVolumeOperation"} 4.732579923
cloudprovider_vsphere_operation_duration_seconds_count{operation="AttachVolumeOperation"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="0.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="2.5"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="5"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="10"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeOperation",le="+Inf"} 1
cloudprovider_vsphere_operation_duration_seconds_sum{operation="CreateVolumeOperation"} 1.2753096990000001
cloudprovider_vsphere_operation_duration_seconds_count{operation="CreateVolumeOperation"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="0.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="2.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="10"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithPolicyOperation",le="+Inf"} 1
cloudprovider_vsphere_operation_duration_seconds_sum{operation="CreateVolumeWithPolicyOperation"} 15.066558008
cloudprovider_vsphere_operation_duration_seconds_count{operation="CreateVolumeWithPolicyOperation"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="0.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="2.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="10"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="CreateVolumeWithRawVSANPolicyOperation",le="+Inf"} 2
cloudprovider_vsphere_operation_duration_seconds_sum{operation="CreateVolumeWithRawVSANPolicyOperation"} 21.805354686
cloudprovider_vsphere_operation_duration_seconds_count{operation="CreateVolumeWithRawVSANPolicyOperation"} 2
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="0.5"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="1"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="2.5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="10"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DeleteVolumeOperation",le="+Inf"} 3
cloudprovider_vsphere_operation_duration_seconds_sum{operation="DeleteVolumeOperation"} 1.4869503179999999
cloudprovider_vsphere_operation_duration_seconds_count{operation="DeleteVolumeOperation"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.25"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="0.5"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="1"} 2
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="2.5"} 6
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="5"} 6
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="10"} 6
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DetachVolumeOperation",le="+Inf"} 6
cloudprovider_vsphere_operation_duration_seconds_sum{operation="DetachVolumeOperation"} 7.15601343
cloudprovider_vsphere_operation_duration_seconds_count{operation="DetachVolumeOperation"} 6
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.25"} 1
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="0.5"} 2
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="1"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="2.5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="5"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="10"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DiskIsAttachedOperation",le="+Inf"} 3
cloudprovider_vsphere_operation_duration_seconds_sum{operation="DiskIsAttachedOperation"} 1.0603705730000001
cloudprovider_vsphere_operation_duration_seconds_count{operation="DiskIsAttachedOperation"} 3
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.005"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.01"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.025"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.05"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.1"} 0
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.25"} 4
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="0.5"} 12
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="1"} 12
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="2.5"} 12
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="5"} 12
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="10"} 12
cloudprovider_vsphere_operation_duration_seconds_bucket{operation="DisksAreAttachedOperation",le="+Inf"} 12
cloudprovider_vsphere_operation_duration_seconds_sum{operation="DisksAreAttachedOperation"} 3.282661207
cloudprovider_vsphere_operation_duration_seconds_count{operation="DisksAreAttachedOperation"} 12
# HELP cloudprovider_vsphere_operation_errors vsphere operation errors
# TYPE cloudprovider_vsphere_operation_errors counter
cloudprovider_vsphere_operation_errors{operation="DeleteVolumeOperation"} 4
```
2017-06-02 19:53:42 -07:00
Jack Francis
7e6c689e58 backoff logging, error handling, wait.ConditionFunc
- added info and error logs for appropriate backoff conditions/states
- rationalized log idioms across all resource requests that are backoff-enabled
- processRetryResponse as a wait.ConditionFunc needs to supress errors if it wants the caller to continue backing off
2017-06-02 15:35:20 -07:00
Jack Francis
c5dd95fc22 update-bazel.sh mods 2017-06-02 09:59:07 -07:00
Jack Francis
17f8dc53af two optimizations
- removed unnecessary return statements
- optimized HTTP response code evaluations as numeric comparisons
2017-06-01 13:58:11 -07:00
Kubernetes Submit Queue
5c048ac258 Merge pull request #45168 from redbaron/fix-aws-tagging
Automatic merge from submit-queue (batch tested with PRs 43505, 45168, 46439, 46677, 46623)

fix AWS tagging to add missing tags only

It seems that intention of original code was to build map of missing
tags and call AWS API to add just them, but due to typo full
set of tags was always (re)added

```release-note
NONE
```
2017-06-01 05:43:39 -07:00
Kubernetes Submit Queue
43ac38e29e Merge pull request #45049 from wongma7/volumeinuse
Automatic merge from submit-queue (batch tested with PRs 46686, 45049, 46323, 45708, 46487)

Log an EBS vol's instance when attaching fails because VolumeInUse

Messages now look something like this:
E0427 15:44:37.617134   16932 attacher.go:73] Error attaching volume "vol-00095ddceae1a96ed": Error attaching EBS volume "vol-00095ddceae1a96ed" to instance "i-245203b7": VolumeInUse: vol-00095ddceae1a96ed is already attached to an instance
        status code: 400, request id: f510c439-64fe-43ea-b3ef-f496a5cd0577. The volume is currently attached to instance "i-072d9328131bcd9cd"
weird that AWS doesn't bother to put that information in there for us (it does when you try to delete a vol that's in use)
```release-note
NONE
```
2017-06-01 03:42:05 -07:00
Jack Francis
c95af06154 errata
arg cruft in CreateOrUpdateSGWithRetry function declaration
2017-05-31 12:03:22 -07:00
Jack Francis
c6c6cc790e errata, wait.ExponentialBackoff, regex HTTP codes
- corrected Copyright copy/paste
- now actually implementing exponential backoff instead of regular interval retries
- using more general HTTP response code success/failure determination (e.g., 5xx for retry)
- net/http constants ftw
2017-05-31 11:53:02 -07:00
deads2k
954eb3ceb9 move labels to components which own the APIs 2017-05-31 10:32:06 -04:00
Kubernetes Submit Queue
0ff75d74d3 Merge pull request #46436 from rootfs/openstack-client
Automatic merge from submit-queue (batch tested with PRs 46394, 46650, 46436, 46673, 46212)

refactor and export openstack service clients

**What this PR does / why we need it**:
Refactor and export openstack service client.
Exporting OpenStack client so other projects can use the them to call functions that are not implemented in openstack cloud providers yet.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-05-31 00:14:07 -07:00
Huamin Chen
4d4bdf11de refactor and export openstack service clients
Signed-off-by: Huamin Chen <hchen@redhat.com>
2017-05-31 00:36:33 +00:00
divyenpatel
85dcf6d52c Adding vsphere Storage API Latency and Error Metrics support
fix bazel failure
2017-05-30 16:54:30 -07:00
Jack Francis
f200f9a1e8 Azure cloudprovider retry using flowcontrol
An initial attempt at engaging exponential backoff for API error responses.

Uses k8s.io/client-go/util/flowcontrol; implementation inspired by GCE
cloudprovider backoff.
2017-05-30 14:50:31 -07:00
Kubernetes Submit Queue
222d247489 Merge pull request #46463 from wongma7/getinstances
Automatic merge from submit-queue (batch tested with PRs 46489, 46281, 46463, 46114, 43946)

AWS: consider instances of all states in DisksAreAttached, not just "running"

Require callers of `getInstancesByNodeNames(Cached)` to specify the states they want to filter instances by, if any. DisksAreAttached, cannot only get "running" instances because of the following attach/detach bug we discovered:

1. Node A stops (or reboots) and stays down for x amount of time
2. Kube reschedules all pods to different nodes; the ones using ebs volumes cannot run because their volumes are still attached to node A
3. Verify volumes are attached check happens while node A is down
4. Since aws ebs bulk verify filters by running nodes, it assumes the volumes attached to node A are detached and removes them all from ASW
5. Node A comes back; its volumes are still attached to it but the attach detach controller has removed them all from asw and so will never detach them even though they are no longer desired on this node and in fact desired elsewhere
6. Pods cannot run because their volumes are still attached to node A

So the idea here is to remove the wrong assumption that callers of `getInstancesByNodeNames(Cached)` only want "running" nodes.

I hope this isn't too confusing, open to alternative ways of fixing the bug + making the code nice.

ping @gnufied @kubernetes/sig-storage-bugs

```release-note
Fix AWS EBS volumes not getting detached from node if routine to verify volumes are attached runs while the node is down
```
2017-05-30 11:59:04 -07:00
Kubernetes Submit Queue
aee0ced31f Merge pull request #43585 from foolusion/add-health-check-node-port-to-aws-loadbalancer
Automatic merge from submit-queue

AWS: support node port health check

**What this PR does / why we need it**:
if a custom health check is set from the beta annotation on a service it
should be used for the ELB health check. This patch adds support for
that.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
Let me know if any tests need to be added.
**Release note**:

```release-note
```
2017-05-29 15:29:51 -07:00
Nick Sardo
9063526dfb GCE: Refactor firewalls/backendservices api; other small changes 2017-05-27 10:25:03 -07:00
FengyunPan
f5f75f3879 Ignore ErrNotFound when delete LB resources
IsNotFound error is fine since that means the object is
deleted already, so let's check it before return error.
2017-05-27 18:07:38 +08:00
Kubernetes Submit Queue
daee6d4826 Merge pull request #45524 from MrHohn/l4-lb-healthcheck
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)

Make GCE load-balancers create health checks for nodes

From #14661. Proposal on kubernetes/community#552. Fixes #46313.

Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew: 
   - Don't create nodes health check if any nodes has version < 1.7.0.
   - Don't backfill nodes health check on existing LBs unless users explicitly trigger it.

**Release note**:

```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
2017-05-26 19:47:57 -07:00
Kubernetes Submit Queue
58e98cfc25 Merge pull request #46545 from nicksardo/gce-reviewers
Automatic merge from submit-queue

Add reviewers for GCE cloud provider

**Release note**:
```release-note
NONE
```
2017-05-26 17:43:11 -07:00
Nick Sardo
5b00c38fd9 Add approvers for GCE cloud provider 2017-05-26 16:42:20 -07:00
Zihong Zheng
897da549bc Autogenerated files 2017-05-26 13:19:14 -07:00
Zihong Zheng
b4633b0600 Create nodes health checks for non-OnlyLocal services 2017-05-26 13:18:50 -07:00
Kubernetes Submit Queue
f8cfeef174 Merge pull request #44884 from verult/master
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)

Created unit tests for GCE cloud provider storage interface.

- Currently covers CreateDisk and DeleteDisk, GetAutoLabelsForPD
- Created ServiceManager interface in gce.go to facilitate mocking in tests.



**What this PR does / why we need it**:
Increasing test coverage for GCE Persistent Disk.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #44573 

**Release note**:

```release-note
NONE
```
2017-05-26 12:58:05 -07:00
Matthew Wong
319c608fdd Get instances of all states in DisksAreAttached, not just "running" 2017-05-25 17:08:30 -04:00
Matthew Wong
9afbb356de Log an EBS vol's instance when attaching fails because VolumeInUse 2017-05-25 15:07:12 -04:00
Kubernetes Submit Queue
29b3bb44ba Merge pull request #45932 from lpabon/elbtag_pr
Automatic merge from submit-queue (batch tested with PRs 45518, 46127, 46146, 45932, 45003)

aws: Support for ELB tagging by users

This PR provides support for tagging AWS ELBs using information in an
annotation and provided as a list of comma separated key-value pairs.

Closes https://github.com/kubernetes/community/pull/404
2017-05-25 11:46:06 -07:00
Kubernetes Submit Queue
9c1480bb61 Merge pull request #46366 from nicksardo/gce-subnetwork-url
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)

GCE - Retrieve subnetwork name/url from gce.conf 

**What this PR does / why we need it**:
Features like ILB require specifying the subnetwork if the network is type manual.

**Notes:**
The network URL can be [constructed](68e7e18698/pkg/cloudprovider/providers/gce/gce.go (L211-L217)) by fetching instance metadata; however, the subnetwork is not provided through this feature. Users must specify the subnetwork name/url through the gce.conf.

Although multiple subnets can exist in the same region for a network, the cloud provider will only use one subnet url for creating LBs. 


**Release note**:
```release-note
NONE
```
2017-05-25 03:14:05 -07:00
Cheng Xing
2141b0fb80 Created unit tests for GCE cloud provider storage interface.
- Currently covers CreateDisk and DeleteDisk, GetAutoLabelsForPD
- Created ServiceManager interface in gce.go to facilitate mocking in tests.
2017-05-24 15:50:22 -07:00
Kubernetes Submit Queue
89a76b8c8b Merge pull request #46128 from jagosan/master
Automatic merge from submit-queue

Added deprecation notice and guidance for cloud providers.

**What this PR does / why we need it**:
Adding context/background and general guidance for incoming cloud providers. 

**Which issue this PR fixes** 

**Special notes for your reviewer**:
Generalized message per discussion with @bgrant0607
2017-05-24 14:19:01 -07:00
Nick Sardo
435303c647 Add subnetworkURL to GCE provider 2017-05-24 09:35:51 -07:00
Kubernetes Submit Queue
6f7eac63c2 Merge pull request #46315 from wongma7/gcepdalready
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)

Fix provisioned GCE PD not being reused if already exists

@jsafrane PTAL 

This is another attempt at https://github.com/kubernetes/kubernetes/pull/38702 . We have observed that `gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do()` instantly gets an error response of alreadyExists, so we must check for it.

I am not sure if we still need to check for the error after `waitForZoneOp`; I think that if there is an alreadyExists error, the `Do()` above will always respond with it instantly. But because I'm not sure, and to be safe, I will leave it.
2017-05-24 06:47:03 -07:00
Kubernetes Submit Queue
70dd10cc50 Merge pull request #41785 from jamiehannaford/cinder-performance
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)

Only retrieve relevant volumes

**What this PR does / why we need it**:

Improves performance for Cinder volume attach/detach calls. 

Currently when Cinder volumes are attached or detached, functions try to retrieve details about the volume from the Nova API. Because some only have the volume name not its UUID, they use the list function in gophercloud to iterate over all volumes to find a match. This incurs severe performance problems on OpenStack projects with lots of volumes (sometimes thousands) since it needs to send a new request when the current page does not contain a match. A better way of doing this is use the `?name=XXX` query parameter to refine the results.

**Which issue this PR fixes**:

https://github.com/kubernetes/kubernetes/issues/26404

**Special notes for your reviewer**:

There were 2 ways of addressing this problem:

1. Use the `name` query parameter
2. Instead of using the list function, switch to using volume UUIDs and use the GET function instead. You'd need to change the signature of a few functions though, such as [`DeleteVolume`](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/cinder/cinder.go#L49), so I'm not sure how backwards compatible that is.

Since #1 does effectively the same as #2, I went with it because it ensures BC.

One assumption that is made is that the `volumeName` being retrieved matches exactly the name of the volume in Cinder. I'm not sure how accurate that is, but I see no reason why cloud providers would want to append/prefix things arbitrarily. 

**Release note**:
```release-note
Improves performance of Cinder volume attach/detach operations
```
2017-05-24 06:46:59 -07:00
Jamie Hannaford
4bd71a3b77 Refactor to use Volume IDs and remove ambiguity 2017-05-24 12:59:16 +02:00
pospispa
9eb912e62f Admin Can Specify in Which AWS Availability Zone(s) a PV Shall Be Created
An admin wants to specify in which AWS availability zone(s) users may create persistent volumes using dynamic provisioning.

That's why the admin can now configure in StorageClass object a comma separated list of zones. Dynamically created PVs for PVCs that use the StorageClass are created in one of the configured zones.
2017-05-24 10:48:11 +02:00
Kubernetes Submit Queue
c1c7365e7c Merge pull request #46147 from nicksardo/gce-cluster-id
Automatic merge from submit-queue (batch tested with PRs 45891, 46147)

Watching ClusterId from within GCE cloud provider

**What this PR does / why we need it**:
Adds the ability for the GCE cloud provider to watch a config map for `clusterId` and `providerId`.

WIP - still needs more testing

cc @MrHohn @csbell @madhusudancs @thockin @bowei @nikhiljindal 

**Release note**:
```release-note
NONE
```
2017-05-24 00:42:58 -07:00
Matthew Wong
11cb36e9dc Fix provisioned GCE PD not being reused if already exists 2017-05-23 18:30:37 -04:00
Nick Sardo
729303f0de Watching ClusterId from within GCE cloud provider 2017-05-23 14:11:24 -07:00
Kubernetes Submit Queue
455e9fff09 Merge pull request #46176 from vmware/vSphereStoragePolicySupport
Automatic merge from submit-queue

vSphere storage policy support for dynamic volume provisioning

Till now, vSphere cloud provider provides support to configure persistent volume with VSAN storage capabilities - kubernetes#42974. Right now this only works with VSAN.

Also there might be other use cases:

- The user might need a way to configure a policy on other datastores like VMFS, NFS etc.
- Use Storage IO control, VMCrypt policies for a persistent disk.

We can achieve about 2 use cases by using existing storage policies which are already created on vCenter using the Storage Policy Based Management service. The user will specify the SPBM policy ID as part of dynamic provisioning 

- resultant persistent volume will have the policy configured with it. 
- The persistent volume will be created on the compatible datastore that satisfies the storage policy requirements. 
- If there are multiple compatible datastores, the datastore with the max free space would be chosen by default.
- If the user specifies the datastore along with the storage policy ID, the volume will created on this datastore if its compatible. In case if the user specified datastore is incompatible, it would error out the reasons for incompatibility to the user.
- Also, the user will be able to see the associations of persistent volume object with the policy on the vCenter once the volume is attached to the node.

For instance in the below example, the volume will created on a compatible datastore with max free space that satisfies the "Gold" storage policy requirements.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
```

For instance in the below example, the vSphere CP checks if "VSANDatastore" is compatible with "Gold" storage policy requirements. If yes, volume will be provisioned on "VSANDatastore" else it will error that "VSANDatastore" is not compatible with the exact reason for failure.

```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
       name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
      diskformat: zeroedthick
      storagepolicyName: Gold
      datastore: VSANDatastore
```

As a part of this change, 4 commits have been added to this PR.

1. Vendor changes for vmware/govmomi
2. Changes to the VsphereVirtualDiskVolumeSource in the Kubernetes API. Added 2 additional fields StoragePolicyName, StoragePolicyID
3. Swagger and Open spec API changes.
4. vSphere Cloud Provider changes to implement the storage policy support.

**Release note**:


```release-note
vSphere cloud provider: vSphere Storage policy Support for dynamic volume provisioning
```
2017-05-22 23:41:10 -07:00
Kubernetes Submit Queue
3bfae793f0 Merge pull request #46008 from NickrenREN/openstack-add-metric
Automatic merge from submit-queue

Recording openstack metrics

add openstack operation metrics


**Release note**:
```release-note
Add support for emitting metrics from openstack cloudprovider about storage operations.
```

/assign @gnufied
2017-05-22 21:54:02 -07:00
Balu Dontu
eb3cf509e5 SPBM policy ID support in vsphere cloud provider 2017-05-22 19:45:17 -07:00
Jago Macleod
185e85b632 Rename readme.md to README.md 2017-05-21 23:06:46 -07:00
NickrenREN
18852c58c1 Recording openstack metrics
add openstack operation metrics
Add support for emitting metrics from openstack cloudprovider about storage operations.
2017-05-22 10:47:08 +08:00
Kubernetes Submit Queue
39308b8980 Merge pull request #38959 from Gradiant/master
Automatic merge from submit-queue

Adapt loadbalancer deleting/updating when using cloudprovider openstack in openstack/liberty

**What this PR does / why we need it**:
Make an extra verification on the returned listeners and pools because gophercloud query doesn't filter the results by loadbalancerID / listenerID respectively when using **openstack/librerty**.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#33759 
**Special notes for your reviewer**:
#33759 it's supposed to have a pull request which fixes this problem but in the release  1.5 loadbalancers doesn't use that patched code.
**Release note**:

NONE
```release-note
```
2017-05-21 03:22:02 -07:00
Luis Pabón
67d269749b aws: Support for ELB tagging by users
This PR provides support for tagging AWS ELBs using information in an
annotation and provided as a list of comma separated key-value pairs.

Closes https://github.com/kubernetes/community/pull/404
2017-05-19 16:34:50 -04:00
Jago Macleod
0cab203fbd Added deprecation notice and guidance for cloud providers. 2017-05-19 12:17:09 -07:00
realfake
250b229912 Implement providerID node functions for gce
*Add splitProviderID helper function
*Add getInstanceFromProjectInZoneByName function
*Implement gce InstanceTypeByProviderID
*Implement gce NodeAddressesByProviderID
2017-05-19 08:41:54 +02:00
Kubernetes Submit Queue
ead8c98cdb Merge pull request #45987 from nicksardo/cloud-init-kubeclient
Automatic merge from submit-queue

Initialize cloud providers with a K8s clientBuilder

**What this PR does / why we need it**:
This PR provides each cloud provider the ability to generate kubernetes clients. Either the full access or service account client builder is passed from the controller manager. Cloud providers could need to retrieve information from the cluster that isn't provided through defined interfaces, and this seems more preferable to adding parameters.

Please leave your thoughts/comments.

**Release note**:
```release-note
NONE
```
2017-05-18 20:51:24 -07:00
Kubernetes Submit Queue
be71ec717b Merge pull request #45201 from vmware/network_id
Automatic merge from submit-queue

Same internal and external ip for vSphere Cloud Provider

Currently, vSphere Cloud Provider reports internal ip as container ip addresses. This PR modifies vSphere Cloud Provider to report same ip address as both internal and external that is provided by vmware infrastructure. 
cc @pdhamdhere @tusharnt @BaluDontu @divyenpatel @luomiao
2017-05-18 13:31:02 -07:00
Kubernetes Submit Queue
f231576f29 Merge pull request #45443 from abrarshivani/owners_cloud_providers
Automatic merge from submit-queue

Add approvers to vsphere cloudprovider

This PR adds approvers for vSphere Cloud provider.
cc @pdhamdhere @tusharnt @BaluDontu @divyenpatel @luomiao
2017-05-18 11:36:25 -07:00
Kubernetes Submit Queue
f760d5a592 Merge pull request #46001 from bowei/alpha-to-beta
Automatic merge from submit-queue

Use beta GCP API instead of alpha in CloudCIDR controller

The feature we are using has been promoted to beta.

```release-note
NONE
```
2017-05-18 11:36:19 -07:00
NickrenREN
9370808a35 Add myself to openstack review pool 2017-05-18 13:37:48 +08:00
Bowei Du
c77ffb2685 Use beta GCP API instead of alpha in CloudCIDR controller
The feature we are using has been promoted to beta.
2017-05-17 16:18:29 -07:00
Nick Sardo
87a5edd2cd Initialize cloud providers with a K8s clientBuilder 2017-05-17 14:38:25 -07:00
Kubernetes Submit Queue
bcf5837c94 Merge pull request #45912 from nicksardo/gce-src-ip-dedupe
Automatic merge from submit-queue (batch tested with PRs 45884, 45879, 45912, 45444, 45874)

[GCE] Removed duplicate CIDR

**What this PR does / why we need it**:
Removes a duplicate CIDR in the list of LB source CIDRs.
https://cloud.google.com/compute/docs/load-balancing/network/ and https://cloud.google.com/compute/docs/load-balancing/http/ both list `35.191.0.0/16`.  Only one is needed.

**Release note**:

```release-note
NONE
```
2017-05-16 22:18:54 -07:00
Kubernetes Submit Queue
f171683242 Merge pull request #44537 from FengyunPan/fix-volume-bug
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)

Fix attach volume to instance repeatedly

1.When volume's status is 'attaching', controllermanager will attach
    it again and return err. So it is necessary to check volume's
    status before attach/detach volume.

   2. When volume's status is 'attaching', its attachments will be None,
    controllermanager can't get device path and make some failed event.
    But it is normal, so don't return err when attachments is None

Fix bug: #44536
2017-05-16 18:10:55 -07:00
Nick Sardo
908bcc3b24 Removed duplicate CIDR 2017-05-16 14:24:57 -07:00
Abrar Shivani
c7a22a588f Made internal-and-external-ip-same 2017-05-12 18:04:15 -07:00
Kubernetes Submit Queue
35eba22cc7 Merge pull request #41162 from MrHohn/esipp-ga
Automatic merge from submit-queue (batch tested with PRs 45623, 45241, 45460, 41162)

Promotes Source IP preservation for Virtual IPs from Beta to GA

Fixes #33625. Feature issue: kubernetes/features#27.

Bullet points:
- Declare 2 fields (ExternalTraffic and HealthCheckNodePort) that mirror the ESIPP annotations.
- ESIPP alpha annotations will be ignored.
- Existing ESIPP beta annotations will still be fully supported.
- Allow promoting beta annotations to first class fields or reversely.
- Disallow setting invalid ExternalTraffic and HealthCheckNodePort on services. Default ExternalTraffic field for nodePort or loadBalancer type service to "Global" if not set.

**Release note**:

```release-note
Promotes Source IP preservation for Virtual IPs to GA.

Two api fields are defined correspondingly:
- Service.Spec.ExternalTrafficPolicy <- 'service.beta.kubernetes.io/external-traffic' annotation.
- Service.Spec.HealthCheckNodePort <- 'service.beta.kubernetes.io/healthcheck-nodeport' annotation.
```
2017-05-12 15:00:46 -07:00