Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure public IP removed after service deleted
**What this PR does / why we need it**:
When creating many LoadBalancer services, some services may exceed Azure basic LB's FrontendIPConfiguations quota (default is 10). Public IPs are created for all services, but it is not removed after deleting the kubernetes services.
This PR fixes the problem.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59255
**Special notes for your reviewer**:
Should cherry-pick to v1.9.
**Release note**:
```release-note
Ensure Azure public IP removed after service deleted
```
Automatic merge from submit-queue (batch tested with PRs 59165, 59386). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Refactor and add some tests.
Improve test coverage of the Azure cloud provider.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Use beta instead of alpha GCE Compute API to add an alias range to an instance.
… instance.
**What this PR does / why we need it**:
We use beta instead of alpha GCE Compute API to add an alias range to an instance. Without this change, such an API is reserved for GCP projects which has whitelisted for this feature.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
N/A
**Special notes for your reviewer**:
**Release note**:
```release-note
"NONE".
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add PV size grow feature for azure file
**What this PR does / why we need it**:
According to kubernetes/features#284, add size grow feature for azure file
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56462
**Special notes for your reviewer**:
Since azure file is using SMB 3.0 protocal, there is no necessary to resize filesystem on agent side, the agent node will detect the changed size automatically.
**Release note**:
```
add size grow feature for azure file
```
/sig azure
@gnufied @rootfs @brendandburns
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add UT test to openstack and two para in configFromEnv
**What this PR does / why we need it**:
configFromEnv fun miss some para that the type define and add ut to TestToAuthOptions fun
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
/assign @dims
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix non-interface type ErrResourceNotFound on left
Related to #58145
The gophercloud.ErrResourceNotFound is not a interface, so should
use reflect to get its type then do a check.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Correct the URL of openstack and make test case more detail
**What this PR does / why we need it**:
correct the url of openstack doc and make the test case more detail,thanks!
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#54044
**Special notes for your reviewer**:
/assign @dims
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59097, 57076, 59295). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: sort firewall parameters
**What this PR does / why we need it**:
Make the firewall arguments deterministic.
Fixes#59294
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix a typo in pkg/cloudprovider/providers/azure/azure_loadbalancer.go
**What this PR does / why we need it**:
fix typo
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add UT test TestCheckOpenStackOptsfunc
**What this PR does / why we need it**:
checkOpenStackOpts func has three case that the test case not Covered,so add it,thanks
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure IP is set for Azure internal loadbalancer
**What this PR does / why we need it**:
Internal Load Balancer created and associated with availability set but no target network ip configurations on Azure. And kube-controller-manager would panic because of nil pointer dereference.
This PR ensures it is set correctly.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#59046
**Special notes for your reviewer**:
Should cherry-pick to v1.9
**Release note**:
```release-note
Ensure IP is set for Azure internal load balancer.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
cloudprovider/openstack: fix bug that tries to use octavia client to query flip
**What this PR does / why we need it**:
This fixes a bug that [potentially] tries to use an Octavia client to query a floating ip. Neutron should always handle those.
**Release note**:
```release-note
cloudprovider/openstack: fix bug the tries to use octavia client to query flip
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
use glog.info instead of glog.infof when no format
**What this PR does / why we need it**:
use glog.info instead of glog.infof when no format
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 59012, 58871). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support GetLabelsForVolume In OpenStack
**What this PR does / why we need it**:
Since PersistentVolumeLabelController will invoke ```GetLabelsForVolume``` interface
in Cloud-Controller-Manager, OpenStack Provider should support it.
https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/cloud.go#L213
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58870
**Special notes for your reviewer**:
**Release note**:
```release-note
Support GetLabelsForVolume in OpenStack Provider
```
Automatic merge from submit-queue (batch tested with PRs 58777, 58978, 58977, 58775). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix url parsing for staging/dev endpoint
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up unused functions and consts
**What this PR does / why we need it**:
Clean up unused functions and consts.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58661, 58764, 58368, 58739, 58773). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[GCE cloud provider] Ensure hosts are updated in EnsureLoadBalancer()
**What this PR does / why we need it**:
From https://github.com/kubernetes/kubernetes/issues/56527, the `EnsureLoadBalancer()` implementation in GCE external LB doesn't always update the hosts (nodes). This PR makes it to do so.
Previously, the only situation where `ensureExternalLoadBalancer()` will not update hosts is when hosts are updated but there is no other changes that trigger target pool update (for which we delete&recreate target pool and hence updates the hosts). So the main change here is detecting that condition and call `updateTargetPool()`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56527
**Special notes for your reviewer**:
Turned out it could be a small change, so I gave it a try.
/assign @nicksardo @bowei
**Release note**:
```release-note
NONE
```
Reuse the
service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags
annotation for these tags.
We think that this is a simpler and less intrusive approach than adding
a new annotation, and most users will want the same set of tags applied.
Signed-off-by: Brett Kochendorfer <brett.kochendorfer@gmail.com>
Automatic merge from submit-queue (batch tested with PRs 58697, 58658, 58676, 58674). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Expose the generate stub for compute API
This allows clients such as Ingress to begin migration to the newly
generated stubs.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58697, 58658, 58676, 58674). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix possible panic when getting Azure primary IPConfig
**What this PR does / why we need it**:
kube-controller-manager panic when removing a lot of nodes from kubernetes cluster (see #58675).
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58675
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58590, 58667). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix logs message formating
**What this PR does / why we need it**:
Fix logs message formating.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57867, 58490, 58502, 58134). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add additional unit tests for Azure cloud provider.
@feiskyer @andyzhangx @khenidak
Automatic merge from submit-queue (batch tested with PRs 57867, 58490, 58502, 58134). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Openstack: register metadata.hostname as node name
**What this PR does / why we need it**:
Currently Openstack can boot up instances with the name like `xyz/abc`, which is not a valid kubelet node name. While `hostname` retrieved from `meta_data.json` has already been sanitized
by Openstack to valid DNS-1123 format string. It's safe to register this `metadata.hostname` as valid kubelet node name.
/kind bug
/sig openstack
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57765
**Special notes for your reviewer**:
/assign @dims @FengyunPan
**Release note**:
```release-note
Openstack: register metadata.hostname as node name
```
Automatic merge from submit-queue (batch tested with PRs 57867, 58490, 58502, 58134). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: neg to use generated code
GCE: neg to use generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58412, 56132, 58506, 58542, 58394). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update Instances to use generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58547, 57228, 58528, 58499, 58618). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove getOldSecurityGroupName() from OpenStack cloud provider
Related to #53764
The getOldSecurityGroupName() is used to get the old security
group name, we can remove it now.
**What this PR does / why we need it**:
#53764
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58300, 58530, 57942, 58543). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ability to specify OS_* variables for OpenStack configuration
**What this PR does / why we need it**:
When we convert the OpenStack cloud provider to run in an external
process, we should be able to use kubernetes Secrets capability to
inject the OS_* variables. This way we can specify the cloud
configuration as a configmap, specify secrets for the userid/password
information. The configmap can be mounted as a file. the secrets can
be made available as environment variables. the external controller
itself can run as a pod/daemonset.
For backward compat, we preload all the OS_* variables, if anything
is in the config file, then that overrides the environment variables.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Authentication information for OpenStack cloud provider can now be specified as environment variables
```
Automatic merge from submit-queue (batch tested with PRs 56948, 58365, 58501). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update Zones to use generated code
Update Zones to use generated code
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Change routes to use the generated code
GCE: Change routes to use the generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57908, 58436). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Updates UrlMap, BackendService, Healthcheck, Certs, InstanceGroup to use the generated code
Updates UrlMap, BackendService, Healthcheck, Certs, InstanceGroup to use the generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58104, 58492, 58491). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: forwarding rules to use generated code
GCE: forwarding rules to use generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58104, 58492, 58491). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: addresses to use generated code
GCE: addresses to use generated code
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55918, 57258). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add multi-vc configuration for e2e tests
**What this PR does / why we need it**:
Currently, we accept configuration for only single VC in e2e tests. This PR adds support for multiple VC configuration for e2e tests.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/vmware/kubernetes/issues/412
**Special notes for your reviewer**:
Internally reviewed here: https://github.com/vmware/kubernetes/pull/418
**Release note**:
```release-note
NONE
```
// cc @divyenpatel @shaominchen
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add vSphere Cloud Provider simulator based tests
**What this PR does / why we need it**:
Initial set of vSphere Cloud Provider functional tests against the vCenter simulator, provides test coverage without having to run against a real vCenter instance.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
The vsphere simulator recently moved from vmware/vic to govmomi, I had discussed the idea of introducing it for testing with vSphere Cloud Provider maintainers. These tests provide 90%+ coverage for vclib/datacenter.go, but we can expand further of course.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58422, 58229, 58421, 58435, 58475). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add handling for method that use Pages() to retrieve results
- Add handling for method that use Pages() to retrieve results
- Make functions take in *Key rather than value type.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 58422, 58229, 58421, 58435, 58475). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update gce call to use wrapper in gce_loadbalancer_external
**What this PR does / why we need it**:
Ack https://github.com/kubernetes/kubernetes/pull/58368#discussion_r162139441, replacing some direct compute api calls to use wrapper.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @nicksardo
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
openstack: remove orphaned routes from terminated instances
**What this PR does / why we need it**:
At the moment the openstack cloudprovider only returns routes where the `NextHop` address points to an existing openstack instance. This is a problem when an instance is terminated before the corresponding node is removed from k8s. The existing route is not returned by the cloudprovider anymore and therefore never considered for deletion by the route controller. When the route's `DestinationCIDR` is reassigned to a new node the router ends up with two routes pointing to a different `NextHop` leading to broken networking.
This PR removes skipping routes pointing to unknown next hops when listing routes. This should cause [this conditional](93dc3763b0/pkg/controller/route/route_controller.go (L208)) in the route controller to succeed and have the route removed if the route controller [feels responsible](93dc3763b0/pkg/controller/route/route_controller.go (L206)).
```release-note
OpenStack cloudprovider: Ensure orphaned routes are removed.
```
When we convert the OpenStack cloud provider to run in an external
process, we should be able to use kubernetes Secrets capability to
inject the OS_* variables. This way we can specify the cloud
configuration as a configmap, specify secrets for the userid/password
information. The configmap can be mounted as a file. the secrets can
be made available as environment variables. the external controller
itself can run as a pod/daemonset.
For backward compat, we preload all the OS_* variables, if anything
is in the config file, then that overrides the environment variables.
Automatic merge from submit-queue (batch tested with PRs 53631, 56960). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove unused code in UT files in pkg/
**What this PR does / why we need it**:
Remove unused code in UT files in pkg/ .
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Convert ComputerName to lower case
**What this PR does / why we need it**:
When instances count is greater than 10, azure will generate instance hostname with upper cases. But kubelet always converts hostname to lower case.
So azure cloud provider should also covert OsProfile.ComputerName to lower case.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#58274
**Special notes for your reviewer**:
This PR also adds more unit tests for scale sets.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Log message at a better level
**What this PR does / why we need it**:
We don't really need to log this meessage at level 1.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix typeos in cloud-controller-manager
**What this PR does / why we need it**:
fix typeos
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
NONE
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Instrument the Azure API calls for Prometheus monitoring
**What this PR does / why we need it**:
Instruments the Azure API calls for Prometheus monitoring.
**Special notes for your reviewer**:
This is version 2 based on the wrapped clients.
**Release note**:
```release-note
Instrument the Azure cloud provider for Prometheus monitoring.
```
cc @feiskyer @andyzhangx @jdumars @khenidak
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
The lbaas.opts.SubnetId should be set by subnet id.
Fix#58145
The getSubnetIDForLB() should return subnet id rather than net id.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add `cloud` for the generated GCE interfaces, support structs
Note: this does not wire the generated code.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Wrap azure client calls
**What this PR does / why we need it**:
This is a clean up for azure client calls. It adds wrappers over azure clients and moves verbose logs and rate limiter inside.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/cc @cosmincojocar @andyzhangx
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix cinder detach problems
**What this PR does / why we need it**: We have currently huge problems in cinder volume detach. This PR tries to fix these issues.
**Which issue(s) this PR fixes**:
Fixes#50004Fixes#57497
**Special notes for your reviewer**:
**Release note**:
```release-note
openstack cinder detach problem is fixed if nova is shutdowned
```
Automatic merge from submit-queue (batch tested with PRs 54230, 58100, 57861, 54752). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Use existing subnetwork of ILB forwarding rule
Fixes#57860
**Test Cases**:
Clusters using auto network with existence of a manual subnet in same region.
- [x] Upgrade 1.7 cluster with existing ILBs to latest. Confirm existing ILBs still are synced.
Version 1.7 does not attempt to fill in the subnetwork, so the forwarding rule was created with the correct subnetwork.
- [x] Upgrade 1.8 cluster with existing ILBs to latest. Confirm existing ILBs (using wrong subnet) still are synced.
- [x] Latest version creates ILBs using the correct subnet.
Clusters with manual subnets have always and will continue to use the subnet specified in gce.conf.
- [x] Upgrade 1.8 cluster with existing ILBs to latest. Confirm existing ILBs (using manual subnet) still are synced.
Clusters with legacy networks have always and will continue to use an empty subnet.
- [x] Upgrade 1.8 cluster with existing ILBs to latest. Confirm existing ILBs (using legacy network) still are synced.
**Release note**:
```release-note
GCE: Allows existing internal load balancers to continue using an outdated subnetwork
```
Automatic merge from submit-queue (batch tested with PRs 57511, 57978). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Renews cached NodeInfo with new vSphere connection
**What this PR does / why we need it**:
This PR modifies two public functions of nodemanager.go- GetNodeInfo and GetNodeDetails. For both these functions NodeInfo object is renewed with new GoVmomiClient and new vclib VirtualMachine and Datacenter.
**Which issue(s) this PR fixes** :
Fixes vmware#404
**Special notes for your reviewer**:
Code has been structured to minimize impact on existing 1.9 release code and any side-effects due to NodeInfo modification. This is a quick solution for vSphere connection renewal problem. A more enhanced solution is target for upcoming major release.
Testing:
- [x] Successfully tried out pod creation, deletion with dynamic volume.
- [x] Successfully ran e2e tests.
**Release note**:
```release-note
Fixes authentication problem faced during various vSphere operations.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix NLB icmp permission duplication
**What this PR does / why we need it**:
Fixes an issue with the ICMP rule for MTU during the creation of a NLB
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56703
Automatic merge from submit-queue (batch tested with PRs 57991, 57789). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix exists status for azure GetLoadBalancer
**What this PR does / why we need it**:
We see a lot of log indicating load balancer not found in azure:
```
E0109 07:00:31.126306 1 service_controller.go:776] Failed to process service kube-system/heapster. Retrying in 5m0s: error getting LB for service kube-system/heapster: Service(kube-system/heapster) - Loadbalancer not found
I0109 07:00:31.126384 1 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"kube-system", Name:"heapster", UID:"400266e7-f507-11e7-bbc2-000d3af86f66", APIVersion:"v1", ResourceVersion:"450", FieldPath:""}): type: 'Warning' reason: 'CreatingLoadBalancerFailed' Error creating load balancer (will retry): error getting LB for service kube-system/heapster: Service(kube-system/heapster) - Loadbalancer not found
I0109 07:00:31.158858 1 azure_backoff.go:177] LoadBalancerClient.List(name) - backoff: success
E0109 07:00:31.158930 1 service_controller.go:776] Failed to process service kube-system/kubernetes-dashboard. Retrying in 5m0s: error getting LB for service kube-system/kubernetes-dashboard: Service(kube-system/kubernetes-dashboard) - Loadbalancer not found
I0109 07:00:31.158988 1 event.go:218] Event(v1.ObjectReference{Kind:"Service", Namespace:"kube-system", Name:"kubernetes-dashboard", UID:"4052f12b-f507-11e7-bbc2-000d3af86f66", APIVersion:"v1", ResourceVersion:"498", FieldPath:""}): type: 'Warning' reason: 'CreatingLoadBalancerFailed' Error creating load balancer (will retry): error getting LB for service kube-system/kubernetes-dashboard: Service(kube-system/kubernetes-dashboard) - Loadbalancer not found
```
It's interesting that those service does not need loadbalancer, and caller is just checking whether one loadbalancer exists.
009701f181/pkg/controller/service/service_controller.go (L287)
And in we can see when err is not nil, it will not check exists value. Thus we should not return error when exists=false.
This was changed in:
edfb2ad552 (diff-c901394068476b4ccb003a6c6efad57cR63)
The PR removes the error when exists=false.
**Which issue(s) this PR fixes**
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Do not set BaseURI twice
**What this PR does / why we need it**:
Do not set BaseURI again. BaseURI has been set by NewAccountsClientWithBaseURI and NewDisksClientWithBaseURI method.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57951
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @karataliu
add getNodeNameByID and use volume.AttachedDevice as devicepath
use uppercase functionname
do not delete automatically nodes if node is shutdowned in openstack
do not delete node
fix gofmt
fix cinder detach if instance is not in active state
fix gofmt
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure Azure LB removable when VMSS is enabled
**What this PR does / why we need it**:
When VMSS enabled, Azure LB not removed after all LoadBalancer services deleted.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57826
**Special notes for your reviewer**:
This PR upgrades Azure GO SDK to latest release and adds a workaround to fix the problem.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add generic interface for azure clients
**What this PR does / why we need it**:
Continue of #43287. Moving remaining clients to generic interfaces.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Continue of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Minor commenting fixes for Azure Disk Controllers from CR
**What this PR does / why we need it**:
Minor commenting fixes for Azure Disk Controllers from code review.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Uniform Azure VM api calls
**What this PR does / why we need it**:
There is still a call to 'VirtualMachinesClient.Get' directly in azure_backoff, which does not go through the cache approach.
This PR uniforms all calls for getting azure vm to use 'getVirtualMachine'. Also refine unused 'exists' return value.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Related #57031
Follow-up #57432
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57696, 57821, 56317). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move DefaultMaxEBSVolumes constant into scheduler
**What this PR does / why we need it**:
A constant only used by the scheduler lives in the aws cloudprovider package. Moving the constant into the only package where it is used reduces import bloat. Testing with the dockerized build environment, the kube-scheduler binary went from 61748499 bytes to 47339144 bytes on amd64 with this change.
**Release note**:
```release-note
NONE
```
A constant only used by the scheduler lives in the aws cloudprovider
package. Moving the constant into the only package where it is used
reduces import bloat.
Automatic merge from submit-queue (batch tested with PRs 49856, 56257, 57027, 57695, 57432). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add cache for VirtualMachinesClient.Get in azure cloud provider
**What this PR does / why we need it**:
Add a timed cache for 'VirtualMachinesClient.Get'
Currently cloud provider will send several get calls to same URL in short period, which is not necessary.
**Which issue(s) this PR fixes**:
Fixes#57031
**Special notes for your reviewer**:
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 49856, 56257, 57027, 57695, 57432). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix vmss listing for Azure cloud provider
**What this PR does / why we need it**:
Fix a stupid bug of vmss listing: if there is only one instance, listScaleSetsWithRetry and listScaleSetVMsWithRetry will return empty list.
This PR also adds more verbose logs.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Related of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixing vSphere Cloud Provider to use "vsphere-cloud-provider" to create ClientBuilder
**What this PR does / why we need it**:
vSphere cloud Provider is not using lower case naming while creating clientBuilder.
With this fix, ClientBuilder is created using lowercase naming.
With mixed upper-lower case name, controller manager is crashing.
**Which issue(s) this PR fixes**
Fixes # https://github.com/kubernetes/kubernetes/issues/57279
**Special notes for your reviewer**:
None
**Release note**:
```release-note
This fixes controller manager crash in certain vSphere cloud provider environment.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove useInstanceMetadata param from Azure cloud provider
**What this PR does / why we need it**:
With out-of-tree Azure cloud provider (#50752), metadata won't work any more (kubelet won't call those metadata functions any more).
This PR removes the parameter useInstanceMetadata from Azure cloud provider.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57646.
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove useInstanceMetadata parameter from Azure cloud provider.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reduce VirtualMachineScaleSetsClient#List calls for Azure cloud provider
**What this PR does / why we need it**:
For master nodes not managed by VMSS, current cloud provider would updateCaches each time when finding master nodes info. This could result in call limits of `VirtualMachineScaleSetsClient#List`.
This PR adds a caches to those nodes which reduces the cache updating significantly.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Continue of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57502, 57543). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support multiple scale sets in Azure cloud provider
**What this PR does / why we need it**:
This PR adds multiple scale sets support in Azure cloud provider.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Continue of #43287.
**Special notes for your reviewer**:
- Adds a local cache of basic scale sets information
- Update the cache when new nodes are not found or periodically
- Since azure doesn't support getting the scale set which contains the node, the cache is updated via listing all scale sets and their virtual machines
**Release note**:
```release-note
Support multiple scale sets in Azure cloud provider.
```
/assign @brendandburns @andyzhangx
Automatic merge from submit-queue (batch tested with PRs 57351, 55654). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Get automatically created subnetwork if none is specified for auto network
Fixes#57350
**Release note**:
```release-note
GCE: Fixes ILB creation on automatic networks with manually created subnetworks.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add Dong Liu as approver and add OWNERS in credentialprovider
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57540
**Special notes for your reviewer**:
**Release note**:
```
none
```
/sig azure
/assign @brendandburns
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow use resource ID to specify public IP address in azure_loadbalancer
**What this PR does / why we need it**: Currently the Azure load balancer assumes that a Public IP address is in the same resource group as the cluster. This is not necessarily true in all environments, in addition to accepting a Public IP, we should allow an annotation to the `Service` object that indicates what resource group the IP is present in.
**Which issue this PR fixes**: fixes#53274#52129
**Special notes for your reviewer**: *first time golang user, please forgive the amateurness*
Release note
```release-note
Allow use resource ID to specify public IP address in azure_loadbalancer
```
Automatic merge from submit-queue (batch tested with PRs 57282, 57484). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix a bug in validating node existence.
**What this PR does / why we need it**:
Fixes an bug where if an error was returned that was not an `autorest.DetailedError` we would return `"not found", nil` which would result in Nodes going `NotReady`
**Which issue(s) this PR fixes**
Fixes#57483
**Release note**:
```release-note
Fixes an bug where if an error was returned that was not an `autorest.DetailedError` we would return `"not found", nil` which caused nodes to go to `NotReady` state.
```
@feiskyer @khendiak
Automatic merge from submit-queue (batch tested with PRs 57282, 57484). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove dead code in cloudprovider
**What this PR does / why we need it**:
Remove dead code in `cloudprovider`
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55751, 57337, 56406, 56864, 57347). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
should reuse code rather than rewrite it
**What this PR does / why we need it**:
should reuse `dc.GetDatastoreByName()`, instead of rewrite it
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56375, 56872, 57053, 57165, 57218). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Compare correct file names for volume detach operation
**What this PR does / why we need it**:
Current volume detach code compares volume path with disk path, as it is. This PR removes '.vmdk' extension from both paths and then compares them. This makes sure that correct comparison is done irrespective of a missing '.vmdk' extension from one of the paths.
**Which issue(s) this PR fixes**:
Fixes https://github.com/vmware/kubernetes/issues/392
**Special notes for your reviewer**:
Deployed cluster on vSphere and provisioned a static volume. Verified that a statically provisioned volume gets detached even when volume path didn't contain any .vmdk extension and disk path had .vmdk extension.
**Release note**:
```vSphere cloud provider: Fix detach operation for volumes, when .vmdk extension is not specified in volume path.```
Automatic merge from submit-queue (batch tested with PRs 57122, 57142, 57016, 56927, 56678). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
should not ignore return messages from wait function
**What this PR does / why we need it**:
It should not ignore return messages for `wait*` function. When it go wrong, need `return` at once.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57148, 57123, 57091, 57141, 57131). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support LoadBalancer for Azure Virtual Machine Scale Sets
**What this PR does / why we need it**:
Continue of #43287, this PR adds LoadBalancer support for Azure Virtual Machine Scale Sets. To achieve this, this PR also
- Add a general VMSet interfaces for VMSS and VMAS, so that we won't add much if-else blocks for different logics
- Add scale sets implementation and availability sets implementation to VMSet
- Add vmSet property to Azure cloud provider and call vmSet instead of direct azure clients
- Add LoadBalancer support based vmSet
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
Support LoadBalancer for Azure Virtual Machine Scale Sets
```
/assign @brendandburns
Automatic merge from submit-queue (batch tested with PRs 56997, 57008, 56984, 56975, 56955). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove unused ScrubDNS interface from cloudprovider
**What this PR does / why we need it**:
DNS scrubber from kubelet has been removed in #36785 and cloudprovider's `ScrubDNS()` interface is not used anywhere.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56953.
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove ScrubDNS interface from cloudprovider.
```
Automatic merge from submit-queue (batch tested with PRs 56997, 57008, 56984, 56975, 56955). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Sort default cidrs for reproducible builds
**What this PR does / why we need it**:
In different distros or environments, we may end up with a different
order of the default string printed during help and man page generation,
So we should sort so the string we print is the same everytime.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#52269
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56639, 56746, 56715, 56673, 56726). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix issue #390
**What this PR does / why we need it**:
When VM node is removed from vSphere Inventory, the corresponding Kubernetes node is unregistered and removed from registeredNodes cache in nodemanager. However, it is not removed from the other node info cache in nodemanager. The fix is to update the other cache accordingly.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/vmware/kubernetes/issues/390
**Special notes for your reviewer**:
Internally review PR here: https://github.com/vmware/kubernetes/pull/402
**Release note**:
```
NONE
```
Testing Done:
1. Removed the node VM from vSphere inventory.
2. Create storageclass and pvc to provision volume dynamically
Automatic merge from submit-queue (batch tested with PRs 56480, 56675, 56624, 56648, 56658). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix session out in vSphere Cloud Provider
**What this PR does / why we need it**:
When the disk is attached error is returned in case of VM migration but the disk is attached successfully. When pvc is created for provisioning volume dynamically the volume is not provisioned since the vc session was expired and not renewed. This PR fixes both the issues.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/vmware/kubernetes/issues/393https://github.com/vmware/kubernetes/issues/391
**Special notes for your reviewer**:
Internally review PR here: https://github.com/vmware/kubernetes/pull/396
**Release note**:
```release-note
NONE
```
Test Done:
Test for fix https://github.com/vmware/kubernetes/issues/391 (Prints error on attach disk)
- Create storageclass and pvc to provision volume dynamically.
- Migrated VM to different VC.
- Created Pod with volume provisioned at step 1.
- Executed command kubectl describe pod.
After Fix: Didn't find the error message.
Test for fix https://github.com/vmware/kubernetes/pull/396 (Session out)
Tests which reported this issue: go run hack/e2e.go --check-version-skew=false --v --test '--test_args=--ginkgo.focus=Selector-Label\sVolume\sBinding:vsphere'
After Fix: This tests didn't report any errors.
Automatic merge from submit-queue (batch tested with PRs 56480, 56675, 56624, 56648, 56658). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove unnecessary condition
**What this PR does / why we need it**:
Now that we have judgement `loadbalancer == nil` in `L1286`, the condition is uncessary.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56337, 56546, 56550, 56633, 56635). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix incorrect error info when creating an azure file PVC failed
**What this PR does / why we need it**:
when creating an azure file PVC failed, it will always return following error which is totally incorrect:
```
Failed to provision volume with StorageClass "azurefile-premium": failed to find a matching storage account
```
The incorrect error info would mislead customer a lot, I would suggest return error directly if create first file share failed.
By this PR, the error info would be like following, which would provide user detailed and **correct** info:
```
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 13s persistentvolume-controller Failed to provision volume with StorageClass "azurefile-premium": failed to create share andy-k8s182-dynamic-pvc-cd66f4bd-d4c4-11e7-9f09-000d3a019e90 in account 00mqk6lqaouexy6agnt0: failed to create file share, err: Put https://00mqk6lqaouexy6agnt0.file.core.windows.net/andy-k8s182-dynamic-pvc-cd66f4bd-d4c4-11e7-9f09-000d3a019e90?restype=share: dial tcp: lookup 00mqk6lqaouexy6agnt0.file.core.windows.net on 168.63.129.16:53: no such host
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56548
**Special notes for your reviewer**:
**Release note**:
```
none
```
/sig azure
/assign @rootfs
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix managed identity issue: use ListByResourceGroup instead of List()
**What this PR does / why we need it**:
fix managed identity issue: use ListByResourceGroup instead of List(), use `StorageAccountClient.List()` func would get all storage accounts from current subscription which is not necessary, k8s cluster would only need storage accounts in the same resource group
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55837
**Special notes for your reviewer**:
**Release note**:
```
none
```
/sig azure
/assign @rootfs
@karataliu
Automatic merge from submit-queue (batch tested with PRs 55557, 55504, 56269, 55604, 56202). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change wording in OpenStack Provider
**What this PR does / why we need it**:
Change working from "dealy" into "delay" in OpenStack Provider.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54410, 56184, 56199, 56191, 56231). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Delete unused code in gce/ensureInternalBackendService
existingIGLinks is not used after initialization, because instance groups are compared in backendSvcEqual()
**Release note**:
```release-note
NONE
```
/area cloudprovider
/area platform/gce
/sig gcp
Automatic merge from submit-queue (batch tested with PRs 56599, 56824, 56918, 56967, 56959). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Check both name and ports for azure health probes
**What this PR does / why we need it**:
Check both name and ports for azure health probes, so that probe ports could follow nodePorts changes.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56898
**Special notes for your reviewer**:
Should be cherry-picked in 1.7, 1.8, 1.9.
**Release note**:
```release-note
BUG FIX: Check both name and ports for azure health probes
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove time waiting after create storage account (save 25s)
**What this PR does / why we need it**:
I found azure cloud provider will always sleep 25 seconds after creating a new azure storage account:
https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_blobDiskController.go#L531
Actually it's not necessary now, since it's already using sync way to create a storage account:
https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_blobDiskController.go#L531
Above code will wait until the storage account is created in azure.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56674
**Special notes for your reviewer**:
Below are logs without this PR:
```
I1201 06:41:22.486663 1 azure_blobDiskController.go:522] azureDisk - Creating storage account pvc3329812692002 type Standard_LRS
I1201 06:41:22.486810 1 azure_blobDiskController.go:531] azureDisk - Creating storage account pvc3329812692002 type Standard_LRS begin to wait
I1201 06:41:40.440005 1 azure_blobDiskController.go:533] azureDisk - Creating storage account pvc3329812692002 type Standard_LRS end wait
I1201 06:41:40.440030 1 azure_blobDiskController.go:551] azureDisk - storage account pvc3329812692002 was just created, allowing time before polling status
I1201 06:42:05.440176 1 azure_blobDiskController.go:553] azureDisk - storage account pvc3329812692002 was just created, allowing time before polling status, end wait
```
Below are logs with this PR, it could save 25s now:
```
I1201 07:36:07.755540 1 azure_blobDiskController.go:523] azureDisk - Creating storage account pvc33298126923895004820 type Standard_LRS
I1201 07:36:07.755652 1 azure_blobDiskController.go:532] azureDisk - Creating storage account pvc33298126923895004820 type Standard_LRS begin to wait
I1201 07:36:25.722540 1 azure_blobDiskController.go:534] azureDisk - Creating storage account pvc33298126923895004820 type Standard_LRS end wait
I1201 07:36:25.722557 1 azure_blobDiskController.go:552] azureDisk - storage account pvc33298126923895004820 was just created, allowing time before polling status
I1201 07:36:25.722562 1 azure_blobDiskController.go:554] azureDisk - storage account pvc33298126923895004820 was just created, allowing time before polling status, end wait
I1201 07:36:26.011157 1 azure_blobDiskController.go:436] azureDisk - storage account:pvc33298126923895004820 had no default container(3329812692) and it was created
I1201 07:36:26.011201 1 azure_blobDiskController.go:182] azureDisk - creating page blob andy-mgwin1710-dynamic-pvc-88c50c37-d668-11e7-94dc-000d3a041274.vhd in container 3329812692 account pvc33298126923895004820
```
**Release note**:
```
none
```
/sig azure
/assign @khenidak
In different distros or environments, we may end up with a different
order of the default string printed during help and man page generation,
So we should sort so the string we print is the same everytime.
Most attach/detach operations on AWS finish within 1-4seconds.
By using a shorter time interval and higher exponetial
factor we can shorten time taken for attach and detach to complete.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prevent deadlock on azure zone fetch in presence of failure
**What this PR does / why we need it**:
This fixes a bug in the Zone get function for the Azure cloud provider.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Fix deadlock in azure cloud provider zone fetching
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix static IP issue for Azure internal LB
**What this PR does / why we need it**:
Fix regression for Azure internal LB with static IP support
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56686
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52013, 56719). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support autoprobing floating-network-id for openstack cloud provider
Currently if user doesn't specify floatingnetwork-id and loadbalancer.openstack.org/floating-network-id annotation, openstack cloud provider can't create a external LoadBalancer service.
Actually we can get floatingnetwork-id automatically.
If we get multiple floatingnetwork-ids, then ask user to specify one, or we use the floatingnetwork-id to create floatingip for external LoadBalancer service.
This is a part of #50726
**Special notes for your reviewer**:
/assign @dims
**Release note**:
```release-note
Support autoprobing floating-network-id for openstack cloud provider
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
AWS: Support for mounting nvme volumes
Supports mounting nvme volumes
Fixes#56155
```release-note
AWS: Detect EBS volumes mounted via NVME and mount them
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix CreateVolume func: use search mode instead
**What this PR does / why we need it**:
This is a little fall back for CreateVolume func: use search mode for Dedicated kind as @rootfs suggested.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52396
**Special notes for your reviewer**:
I reference the implmentation of v1.6 in the same CreateVolume func
https://github.com/kubernetes/kubernetes/blob/release-1.6/pkg/cloudprovider/providers/azure/azure_storage.go#L213-L247
**Release note**:
```
fix azure storage account exhausting issue by using azure disk mount
```
/sig azure
@rootfs @feiskyer @karataliu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add andyzhangx as azure reviewer
**What this PR does / why we need it**:
add andyzhangx as azure reviewer
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```
none
```
/sig azure
/assign @jdumars @brendandburns
Automatic merge from submit-queue (batch tested with PRs 56520, 53764). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add service.UID into security group name
Related to: #53714
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56446, 56437). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix unit tests that need openstack
**What this PR does / why we need it**:
Currently the unit tests that depend that they be on running inside an openstack vm fail as no one seem to have run them for a while.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #56437
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 56094, 52910, 55953, 56405, 56415). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support VolumeV3 for OpenStack cloud Provider
Currently OpenStack supports Cinder v3 API, let Kubernetes support
it too.
Fix#52877
**Release note**:
```release-note
OpenStack cloud provider supports Cinder v3 API.
```
Automatic merge from submit-queue (batch tested with PRs 55545, 55548, 55815, 56136, 56185). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure GCE AlphaFeatureGate initialized
If no config file is specified for the controller-manager,
the GCE CloudConfig.AlphaFeatureGate property is not initialized.
This can cause a panic when checking for alpha features in the GCE
provider.
```release-note
NONE
```
Closes#55544
Automatic merge from submit-queue (batch tested with PRs 56211, 56024). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
allow ELB Healthcheck configuration via Service annotations
**What this PR does / why we need it**:
The default settings which are set on the ELB HC work well but there are cases when it would be better to tweak its parameters -- for example, faster detection of unhealthy backends. This PR makes it possible to override any of the healthcheck's parameters via annotations on the Service, with the exception of the Target setting which continues to be inferred from the Service's spec.
**Release note**:
```release-note
It is now possible to override the healthcheck parameters for AWS ELBs via annotations on the corresponding service. The new annotations are `healthy-threshold`, `unhealthy-threshold`, `timeout`, `interval` (all prefixed with `service.beta.kubernetes.io/aws-load-balancer-healthcheck-`)
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implement volume resize for cinder
**What this PR does / why we need it**:
resize for cinder
xref: [resize proposal](https://github.com/kubernetes/community/pull/657)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref https://github.com/kubernetes/community/pull/657
Follow up: #49727
**Special notes for your reviewer**:
**Release note**:
```release-note
Implement volume resize for cinder
```
wip, assign to myself first
/assign @NickrenREN
Automatic merge from submit-queue (batch tested with PRs 55812, 55752, 55447, 55848, 50984). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Azure Load Balancer reconciliation should consider all Kubernetes-controlled properties of a LB NSG
**What this PR does / why we need it**:
This PR refers to issue #55733
With this PR, Kubernetes will update Azure nsg rules based on not just name, but also based on other properties such as destination port range and destination ip address.
We need it because right now Kubernetes will detect the difference and update only if there is difference in Name of nsg rule. It's been working fine for changing destination port range and source IP address because these two are part of the Name. (which external users should not assume) Basically right now, Kubernetes won't detect the difference if I go ahead and change any part of nsg rule using port UI.
This PR will let Kubernetes detect the difference and always try to reconcile nsg rules with service definition.
**Which issue(s) this PR fixes** :
Fixes#55733
**Special notes for your reviewer**: None
**Release note**:
```release-note
Kubernetes update Azure nsg rules based on not just difference in Name, but also in Protocol, SourcePortRange, DestinationPortRange, SourceAddressPrefix, DestinationAddressPrefix, Access, and Direction.
```
Automatic merge from submit-queue (batch tested with PRs 56128, 56004, 56083, 55833, 56042). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add initial Virtual Machine Scale Sets (VMSS) support in Azure
**What this PR does / why we need it**:
This is the first step of adding Virtual Machine Scale Sets (VMSS) support in Azure, it
- Adds vmType params to support both vmss and standard in Azure
- Adds initial InstanceID/InstanceType/IP/Routes support for vmss instances
- Master nodes may not belong to any scale sets, so it falls back to VirtualMachinesClient for such instances
Have validated that nodes could be registered and pods could be scheduled and run correctly.
Still more work to do to fully support Azure VMSS. And next steps are tracking at #43287.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54316, 53400, 55933, 55786, 55794). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add Amazon NLB support
**What this PR does / why we need it**:
This adds support for AWS's NLB for `LoadBalancer` services.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#52173
**Special notes for your reviewer**:
This is NOT yet ready for merge, but I'd love any feedback before it is.
This requires at least `v1.10.40` of the [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go), which is not yet included in Kubernetes. Per @justinsb, I'm waiting on possibly #48314 to update to `v1.10.40` or some other PR.
I tried to make the change as easy to review as possible, so some LoadBalancer logic is duplicated in the `if isNLB(annotations)` blocks. I can refactor that and sprinkle more `isNLB()` switches around, but it might be harder to view the diff.
**Other Notes:**
* NLB's subnets cannot be modified after creation (maybe look for public subnets in all AZ's?). Currently, I'm just using `c.findELBSubnets()`
* Health check uses TCP with all the NLB default values. I was thinking HTTP health checks via annotation could be added later. Should that go into this PR?
* ~~`externalTrafficPolicy`/`healthCheckNodePort` are ignored. Should those be implemented for this PR?~~
* `externalTrafficPolicy` and subsequent `healthCheckNodePort` are handled properly. This may come with uneven load balancing, as NLB doesn't support weighted backends.
* With classic ELB, you have a security group the ELB is inside of to associate Instance (k8s node) SG rules with a LoadBalancer (k8s Service), but NLB's don't have a security group. Instead, I use the `Description` field on [`ec2.IpRange`](https://docs.aws.amazon.com/sdk-for-go/api/service/ec2/#IpRange) with the following annotations. Is this ok? I couldn't think of another way to associate SG rule to the NLB
* Node SG gets an rule added for VPC cidr on NodePort for Health Check with annotation in description `kubernetes.io/rule/nlb/health=<loadBalancerName>`
* Node SG gets an rule added for `loadBalancerSourceRanges` to NodePort for client traffic with annotation in description `kubernetes.io/rule/nlb/client=<loadBalancerName>`
* **Note: if `loadBalancerSourceRanges` is unspecified, this opens instance security groups to traffic from `0.0.0.0/0` on the service's nodePorts**
* Respects internal annotation
* Creates a TargetGroup per frontend port: simplifies updates when you have same backend port for multiple front end ports.
* Does not (yet) verify that we're under the NLB limits in terms of # of listeners
* `UpdateLoadBalancer()` basically just calls `EnsureLoadBalancer` for NLB's. Is this ok?
**Areas for future improvement or optimization**:
* A new annotation indicating a new security group should be created for NLB traffic and instances would be placed in this new SG. (Could bump up against the default limit of 5 SG's per instance)
* Only create a client health check security group rule when the VPC cidr is not a subset of `spec.loadBalancerSourceRanges`
* Consolidate TargetGroups if a service has 2+ frontend ports and the same nodePort.
* A new annotation for specifying TargetGroup Health Check options.
**Release note**:
```release-notes
Add Amazon NLB support - Fixes#52173
```
ping @justinsb @bchav
Automatic merge from submit-queue (batch tested with PRs 56021, 55843, 55088, 56117, 55859). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix panic when AlphaFeatureGate isn't configured for gcp.
**What this PR does / why we need it**:
When AlphaFeatureGate isn't configured, the pointer will be nil. This PR fixes it.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#56009
**Special notes for your reviewer**:
cc @jsiebens
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Check if SleepDelay of AWS request is nil before sign.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55309
**Special notes for your reviewer**:
/cc @justinsb
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
using Regexp Match
**What this PR does / why we need it**:
using regexp match achieve find efficiently
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes issue where PVCs using `standard` StorageClass create PDs in disks in wrong zone in multi-zone GKE clusters
Fixes#50115
Changed GetAllZones to only get zones with nodes that are currently running (renamed to GetAllCurrentZones). Added E2E test to confirm this behavior.
Automatic merge from submit-queue (batch tested with PRs 55112, 56029, 55740, 56095, 55845). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Updating vsphere cloud provider to support k8s cluster spread across multiple vCenters
**What this PR does / why we need it**:
vSphere cloud provider in Kubernetes 1.8 was designed to work only if all the nodes of the cluster are in one single datacenter folder. This is a hard restriction that makes the cluster not span across different folders/datacenter/vCenters. Users have use-cases to span the cluster across datacenters/vCenters.
**Which issue(s) this PR fixes**
Fixes # https://github.com/vmware/kubernetes/issues/255
**Special notes for your reviewer**:
This is a change purely in vsphere cloud provider and no changes in kubernetes core are needed.
**Release note**:
```release-note
With this change
- User should be able to create k8s cluster which spans across multiple ESXi clusters, datacenters or even vCenters.
- vSphere cloud provider (VCP) uses OS hostname and not vSphere Inventory VM Name.
That means, now VCP can handle cases where user changes VM inventory name.
- VCP can handle cases where VM migrates to other ESXi cluster or datacenter or vCenter.
The only requirement is the shared storage. VCP needs shared storage on all Node VMs.
```
Internally tested and reviewed the code.
@tthole, @shaominchen, @abrarshivani
Consider the migration from the old security group name to the new
security group name, we need delete the old security group.
At V1.10, we can assume everyone is using the new security group
names and remove this code.
running (renamed to GetAllCurrentZones). Added E2E test to confirm this
behavior.
Added node informer to cloud-provider controller to keep track of zones
with k8s nodes in them.
Automatic merge from submit-queue (batch tested with PRs 55217, 54260). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Unit tests for Azure service session affinity
**What this PR does / why we need it**: We added session affinity support in the Azure load balancer in commit 8b50b83067. This PR adds unit tests for this behaviour.
**Which issue this PR fixes**: None
**Special notes for your reviewer**: None
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55233, 55927, 55903, 54867, 55940). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix azure disk storage account init issue
**What this PR does / why we need it**:
There are two issues for the original azure disk storage account initialaztion code:
1) wrong controller-master detection, see issue #54570, #55776
2) should not initialize two storage account even if it's not necessary, see issue #50883
This PR would fix the above two issues:
For 1: remove the controller-master process binding
For 2: remove the storage account initialization process, just create on demand
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#54570Fixes#55776Fixes#50883
**Special notes for your reviewer**:
@rootfs @karataliu
**Release note**:
```
fix azure disk storage account init issue
```
/sig azure
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ensure new tags are created on existing ELBs
**What this PR does / why we need it**:
When editing an existing service of type LoadBalancer in an AWS environment and adding the `service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags` annotation, you would expect the new tags to be set on the load balancer, however this doesn't happen currently. The annotation only takes effect if specified when the service is created.
This PR adds an AddTags method to the ELB interface and uses this to ensure tags set in the annotation are present on the ELB. If the tag key is already present, the value will be updated.
This PR does not remove tags that have been removed from the annotation, it only add/updates tags.
**Which issue(s) this PR fixes**:
Fixes#54642
**Special notes for your reviewer**:
The change requires that the IAM policy of the master instance(s) has the `elasticloadbalancing:AddTags` permission.
**Release note**:
```release-note
Ensure additional resource tags are set/updated AWS load balancers
```
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Apply taint when a volume is stuck in attaching state
When a volume is stuck in attaching state for too long on a node, it is best to make node unschedulable so as any other pod may not be scheduled on it.
Fixes https://github.com/kubernetes/kubernetes/issues/55502
```release-note
AWS: Apply taint to a node if volumes being attached to it are stuck in attaching state
```
Automatic merge from submit-queue (batch tested with PRs 55642, 55897, 55835, 55496, 55313). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
OpenStack: fetch volume path from metadata service
**What this PR does / why we need it**:
Updates the OpenStack cloud provider to use the Nova metadata service as a fallback when retrieving mounted PV disk paths. Note that the Nova instance device metadata will contain the disk address and bus, which allows finding its path.
This is needed as the *standard* mechanism of retrieving disk paths is not available when running k8s under OpenStack Hyper-V hosts.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55312
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
- vsphere.conf (cloud-config) is now needed only on master node
- VCP uses OS hostname and not vSphere inventory name
- VCP is now resilient to VM inventory name change and VM migration
Automatic merge from submit-queue (batch tested with PRs 54134, 54507). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added service annotation for AWS ELB SSL policy
**What this PR does / why we need it**:
This work adds a new supported service annotation for AWS clusters, `service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy`, which lets users specify which [predefined AWS SSL policy](http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-security-policy-table.html) they would like to use.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#43744
**Special notes for your reviewer**:
While this PR doesn't allow users to define their own cipher policy in an annotation, a user could (out of band) create their own policy on an ELB with the naming convention `k8s-SSLNegotiationPolicy-<my-policy-name>` and specify it with the above annotation.
This is my second k8s PR, and I don't have experience with an e2e test, would that be required for this change? I did run this in a kubeadm cluster and it worked like a charm. I was able to choose different predefined policies, and revert to the default policy when I removed the annotation.
**Release note**:
```release-note
Added service annotation for AWS ELB SSL policy
```
Automatic merge from submit-queue (batch tested with PRs 55392, 55491, 51914, 55831, 55836). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix dangling attach errors
Detach volumes from shutdown nodes and ensure that
dangling volumes are handled correctly in AWS
Fixes https://github.com/kubernetes/kubernetes/issues/52573
```release-note
Implement correction mechanism for dangling volumes attached for deleted pods
```
If no config file is specified for the controller-manager,
the GCE CloudConfig.AlphaFeatureGate property is not initialized.
This can cause a panic when checking for alpha features in the GCE
provider.
Closes#55544
Automatic merge from submit-queue (batch tested with PRs 55594, 47849, 54692, 55478, 54133). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added service annotation to set Azure DNS label for public IP
**What this PR does / why we need it**: Added a feature to set the DNS label for public IPs in the Azure cloud.
For example:
```
apiVersion: v1
kind: Service
metadata:
annotations:
service.alpha.kubernetes.io/label-name: myservice
...
```
Will resolve myservice.westus.cloudapp.azure.com to the service's IP.
**Which issue this PR fixes**: fixes#44775
**Special notes for your reviewer**: Note that this is defining a new annotation, so feel free to point out if there is a preferred convention or anything else that needs to be done.
**Release note**:
```release-note
New service annotation "service.beta.kubernetes.io/azure-dns-label-name" to set Azure DNS label name for public IP
```
The OpenStack cloud provider retrieves mounted Cinder volume paths
by looking in /dev/disk/by-id, expecting the disk serial IDs (e.g.
SCSI ID) to include the volume ID.
The issue is that not all hypervisors are able to expose this.
For example, Hyper-V will just preserve the original Cinder volume
lun SCSI ID (without setting the volume id). For this reason,
disk path lookups will fail.
In order to be able to leverage Hyper-V based OpenStack providers,
as a fallback, we're querying the metadata service, searching for
disk device metadata. Note that starting with Nova Queens, the Hyper-V
driver always provides disk address information through the instance
metadata.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Restrict Azure NSG rules to allow external access only to load balancer IP
**What this PR does / why we need it**: On Azure, we create NSG (Network Security Group) rules on the vnet to allow external clients to access services exposed as type LoadBalancer. At the moment, these rules have a destination of `Any`, which means that they will permit requests on the opened port to any IP within the vnet. This PR restricts the security rules so that they admit external access only to the load balancer IP.
**Which issue this PR fixes**: None in upstream - reported as https://github.com/Azure/acs-engine/issues/1619
**Special notes for your reviewer**: None
**Release note**:
```release-note
Azure NSG rules for services exposed via external load balancer
now limit the destination IP address to the relevant front end load
balancer IP.
```
Automatic merge from submit-queue (batch tested with PRs 55093, 54966, 55047, 54971, 54786). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Upgrade Azure SDK to v11.1.1
**What this PR does / why we need it**: This fixes various Azure SDK bugs per the Azure SDK for Go changelogs:
* Fixed bug in which blob types were unmarshaled incorrectly
* Fixed various package names
* Miscellaneous unspecified storage bug fixes
This is also a prerequisite for a bug fix for running out of firewall rules when exposing large numbers of services from an Azure cluster.
**Which issue(s) this PR fixes**: None
**Special notes for your reviewer**:
1. I inadvertently committed a compatibility fix along with the dependency upgrade (which the guidelines say should have been two separate commits). The offending file is `pkg/cloudprovider/providers/azure.go`.
2. We require an urgent bug fix for the firewall rules limit so it would be great if we could get this agreed quickly. I have struggled with the dependency upgrade process a bit so if it looks wrong, please let me know as soon as you can! Thanks!
**Release note**:
```release-note
Upgraded Azure SDK to v11.1.1.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove Google Cloud KMS's in-tree integration
Removes the following introduced by #48574 and others:
* `kms.go` which contained the cloudkms-specific code for Google Cloud KMS service.
* Registering the Google Cloud KMS in the KMS plugin registry.
* Google's `cloudkms` API package from `vendor` folder.
The following changes are upcoming:
* Removal of KMSPluginRegistry. This would not be needed anymore, since KMS providers will be out-of-tree from now on (so no need of registering them, an address of the process would be enough).
* A service which allows encrypt/decrypt functionality (satisfies `envelope.Service` interface) if initialized with an IP/Port of an out-of-tree process serving KMS requests. Will tentatively use gRPC requests to talk to this external service.
Reference: https://github.com/kubernetes/kubernetes/pull/54439#issuecomment-340062801 and https://github.com/kubernetes/kubernetes/issues/51965#issuecomment-339333937.
```release-note
Google KMS integration was removed from in-tree in favor of a out-of-process extension point that will be used for all KMS providers.
```
We should check for available volume before performing
attach or delete of EBS volume. This will make sure that
we do not blow up API quota of mutable operations in AWS and stay a
good citizen.
Automatic merge from submit-queue (batch tested with PRs 51409, 54616). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Implement InstanceExistsByProviderID() for cloud providers
Fix#51406
If cloud providers(like aws, gce etc...) implement ExternalID()
and support getting instance by ProviderID , they also implement
InstanceExistsByProviderID().
/assign wlan0
/assign @luxas
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52717, 54568, 54452, 53997, 54237). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[OpenStack]Remove the LbaasV1 of OpenStack cloud provider
The Neutron LbaasV1 has been declared obsolete, LbaasV2 is a
better choice.
So let's remove the codes of LbaasV1, only support LbaasV2.
xref: #52609
Reference OpenStack doc:
https://docs.openstack.org/mitaka/networking-guide/config-lbaas.html
**Special notes for your reviewer**:
/assign @dims
/assign @anguslees
**Release note**:
```release-note
Remove the LbaasV1 of OpenStack cloud provider, currently only support LbaasV2.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixing usage of clustered datastore name to be absolute datastore name
**What this PR does / why we need it**:
This PR adds the step to convert the datastore name in form of a folder/cluster to absolute datastore name.
**Which issue this PR fixes**: fixes vmware#312
**Special notes for your reviewer**:
Testing. All the clustered datastore tests pass now
```
root@k8s-dev-vm-01:~/shahzeb/k8s/kubernetes# export CLUSTER_DATASTORE=dscl1/sharedVmfs-0
root@k8s-dev-vm-01:~/shahzeb/k8s/kubernetes# export VSPHERE_SPBM_POLICY_DS_CLUSTER=gold_cluster
root@k8s-dev-vm-01:~/shahzeb/k8s/kubernetes# go run hack/e2e.go --check-version-skew=false --v --test --test_args='--ginkgo.focus=Volume\sProvisioning\sOn\sClustered\sDatastore'
flag provided but not defined: -check-version-skew
Usage of /tmp/go-build641208048/command-line-arguments/_obj/exe/e2e:
-get
go get -u kubetest if old or not installed (default true)
-old duration
Consider kubetest old if it exceeds this (default 24h0m0s)
2017/10/18 17:29:39 e2e.go:55: NOTICE: go run hack/e2e.go is now a shim for test-infra/kubetest
2017/10/18 17:29:39 e2e.go:56: Usage: go run hack/e2e.go [--get=true] [--old=24h0m0s] -- [KUBETEST_ARGS]
2017/10/18 17:29:39 e2e.go:57: The separator is required to use --get or --old flags
2017/10/18 17:29:39 e2e.go:58: The -- flag separator also suppresses this message
2017/10/18 17:29:39 e2e.go:77: Calling kubetest --check-version-skew=false --v --test --test_args=--ginkgo.focus=Volume\sProvisioning\sOn\sClustered\sDatastore...
2017/10/18 17:29:39 util.go:154: Running: ./cluster/kubectl.sh --match-server-version=false version
2017/10/18 17:29:39 util.go:156: Step './cluster/kubectl.sh --match-server-version=false version' finished in 286.302987ms
2017/10/18 17:29:39 util.go:154: Running: ./hack/e2e-internal/e2e-status.sh
Skeleton Provider: prepare-e2e not implemented
Client Version: version.Info{Major:"1", Minor:"6+", GitVersion:"v1.6.0-alpha.0.17675+2bfcf4a7d1d6d4-dirty", GitCommit:"2bfcf4a7d1d6d4f257fe7ee96dcb47484b8e3a5f", GitTreeState:"dirty", BuildDate:"2017-10-18T23:37:58Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6+", GitVersion:"v1.6.0-alpha.0.17663+ba93892625a3e8", GitCommit:"ba93892625a3e8a246f3bc660d56a2635e7e7f9f", GitTreeState:"clean", BuildDate:"2017-10-18T20:57:49Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}
2017/10/18 17:29:40 util.go:156: Step './hack/e2e-internal/e2e-status.sh' finished in 296.318449ms
2017/10/18 17:29:40 util.go:154: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=Volume\sProvisioning\sOn\sClustered\sDatastore
Conformance test: not doing test setup.
Oct 18 17:29:41.634: INFO: Overriding default scale value of zero to 1
Oct 18 17:29:41.634: INFO: Overriding default milliseconds value of zero to 5000
I1018 17:29:41.829967 32645 e2e.go:383] Starting e2e run "98fbc41f-b464-11e7-8e40-0050569c27f6" on Ginkgo node 1
Running Suite: Kubernetes e2e suite
===================================
Random Seed: 1508372981 - Will randomize all specs
Will run 3 of 714 specs
Oct 18 17:29:41.889: INFO: >>> kubeConfig: /tmp/kube171.json
Oct 18 17:29:41.894: INFO: Waiting up to 4h0m0s for all (but 0) nodes to be schedulable
Oct 18 17:29:41.928: INFO: Waiting up to 10m0s for all pods (need at least 0) in namespace 'kube-system' to be running and ready
Oct 18 17:29:42.057: INFO: 13 / 13 pods in namespace 'kube-system' are running and ready (0 seconds elapsed)
Oct 18 17:29:42.057: INFO: expected 4 pod replicas in namespace 'kube-system', 4 are Running and Ready.
Oct 18 17:29:42.063: INFO: Waiting for pods to enter Success, but no pods in "kube-system" match label map[name:e2e-image-puller]
Oct 18 17:29:42.063: INFO: Dumping network health container logs from all nodes...
Oct 18 17:29:42.076: INFO: Client version: v1.6.0-alpha.0.17675+2bfcf4a7d1d6d4-dirty
Oct 18 17:29:42.079: INFO: Server version: v1.6.0-alpha.0.17663+ba93892625a3e8
SSSS
------------------------------
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
verify static provisioning on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:70
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:133
STEP: Creating a kubernetes client
Oct 18 17:29:42.079: INFO: >>> kubeConfig: /tmp/kube171.json
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:50
[It] verify static provisioning on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:70
STEP: creating a test vsphere volume
W1018 17:29:43.087373 32645 datacenter.go:156] QueryVirtualDiskUuid failed for diskPath: "[sharedVmfs-0] kubevols/kube-dummyDisk.vmdk". err: ServerFaultCode: File [sharedVmfs-0] kubevols/kube-dummyDisk.vmdk was not found
STEP: Creating pod
STEP: Waiting for pod to be ready
STEP: Verifying volume is attached
STEP: Deleting pod
Oct 18 17:30:07.268: INFO: Deleting pod "vsphere-e2e-vtzbg" in namespace "e2e-tests-volume-provision-bchh4"
Oct 18 17:30:07.301: INFO: Wait up to 5m0s for pod "vsphere-e2e-vtzbg" to be fully deleted
STEP: Waiting for volumes to be detached from the node
Oct 18 17:30:49.429: INFO: Volume "[dscl1/sharedVmfs-0] kubevols/e2e-vmdk-e2e-tests-volume-provision-bchh4.vmdk" appears to have successfully detached from "kubernetes-node4".
STEP: Deleting the vsphere volume
[AfterEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:134
Oct 18 17:30:49.774: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "e2e-tests-volume-provision-bchh4" for this suite.
Oct 18 17:30:58.203: INFO: namespace: e2e-tests-volume-provision-bchh4, resource: bindings, ignored listing per whitelist
Oct 18 17:30:58.225: INFO: namespace e2e-tests-volume-provision-bchh4 deletion completed in 8.444766463s
• [SLOW TEST:76.146 seconds]
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/framework.go:22
verify static provisioning on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:70
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
verify dynamic provision with spbm policy on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:131
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:133
STEP: Creating a kubernetes client
Oct 18 17:30:58.231: INFO: >>> kubeConfig: /tmp/kube171.json
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:50
[It] verify dynamic provision with spbm policy on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:131
STEP: Creating Storage Class With storage policy params
STEP: Creating PVC using the Storage Class
STEP: Waiting for claim to be in bound phase
Oct 18 17:30:58.402: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-6dwrm to have phase Bound
Oct 18 17:30:58.410: INFO: PersistentVolumeClaim pvc-6dwrm found but phase is Pending instead of Bound.
Oct 18 17:31:00.416: INFO: PersistentVolumeClaim pvc-6dwrm found but phase is Pending instead of Bound.
Oct 18 17:31:02.430: INFO: PersistentVolumeClaim pvc-6dwrm found and phase=Bound (4.027818895s)
STEP: Creating pod to attach PV to the node
STEP: Verify the volume is accessible and available in the pod
Oct 18 17:31:26.666: INFO: Running '/root/shahzeb/k8s/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --server=https://10.160.141.5 --kubeconfig=/tmp/kube171.json exec pvc-tester-hgcj6 --namespace=e2e-tests-volume-provision-sfbcf -- /bin/touch /mnt/volume1/emptyFile.txt'
Oct 18 17:31:27.171: INFO: stderr: ""
Oct 18 17:31:27.171: INFO: stdout: ""
STEP: Deleting pod
Oct 18 17:31:27.171: INFO: Deleting pod "pvc-tester-hgcj6" in namespace "e2e-tests-volume-provision-sfbcf"
Oct 18 17:31:27.202: INFO: Wait up to 5m0s for pod "pvc-tester-hgcj6" to be fully deleted
STEP: Waiting for volumes to be detached from the node
Oct 18 17:32:11.324: INFO: Volume "[sharedVmfs-0] kubevols/kubernetes-dynamic-pvc-c6d8e174-b464-11e7-8b86-005056b7d4e9.vmdk" appears to have successfully detached from "kubernetes-node2".
Oct 18 17:32:11.324: INFO: Deleting PersistentVolumeClaim "pvc-6dwrm"
[AfterEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:134
Oct 18 17:32:11.393: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "e2e-tests-volume-provision-sfbcf" for this suite.
Oct 18 17:32:19.867: INFO: namespace: e2e-tests-volume-provision-sfbcf, resource: bindings, ignored listing per whitelist
Oct 18 17:32:19.876: INFO: namespace e2e-tests-volume-provision-sfbcf deletion completed in 8.460287539s
• [SLOW TEST:81.645 seconds]
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/framework.go:22
verify dynamic provision with spbm policy on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:131
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
verify dynamic provision with default parameter on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:121
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:133
STEP: Creating a kubernetes client
Oct 18 17:32:19.878: INFO: >>> kubeConfig: /tmp/kube171.json
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:50
[It] verify dynamic provision with default parameter on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:121
STEP: Creating Storage Class With storage policy params
STEP: Creating PVC using the Storage Class
STEP: Waiting for claim to be in bound phase
Oct 18 17:32:20.024: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-zrcmm to have phase Bound
Oct 18 17:32:20.034: INFO: PersistentVolumeClaim pvc-zrcmm found but phase is Pending instead of Bound.
Oct 18 17:32:22.040: INFO: PersistentVolumeClaim pvc-zrcmm found and phase=Bound (2.016093024s)
STEP: Creating pod to attach PV to the node
STEP: Verify the volume is accessible and available in the pod
Oct 18 17:32:30.266: INFO: Running '/root/shahzeb/k8s/kubernetes/_output/dockerized/bin/linux/amd64/kubectl --server=https://10.160.141.5 --kubeconfig=/tmp/kube171.json exec pvc-tester-2hns4 --namespace=e2e-tests-volume-provision-llvbq -- /bin/touch /mnt/volume1/emptyFile.txt'
Oct 18 17:32:30.774: INFO: stderr: ""
Oct 18 17:32:30.774: INFO: stdout: ""
STEP: Deleting pod
Oct 18 17:32:30.774: INFO: Deleting pod "pvc-tester-2hns4" in namespace "e2e-tests-volume-provision-llvbq"
Oct 18 17:32:30.806: INFO: Wait up to 5m0s for pod "pvc-tester-2hns4" to be fully deleted
STEP: Waiting for volumes to be detached from the node
Oct 18 17:33:20.972: INFO: Volume "[dscl1/sharedVmfs-0] kubevols/kubernetes-dynamic-pvc-f77fba36-b464-11e7-8b86-005056b7d4e9.vmdk" appears to have successfully detached from "kubernetes-node4".
Oct 18 17:33:20.972: INFO: Deleting PersistentVolumeClaim "pvc-zrcmm"
[AfterEach] [sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:134
Oct 18 17:33:21.100: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready
STEP: Destroying namespace "e2e-tests-volume-provision-llvbq" for this suite.
Oct 18 17:33:29.490: INFO: namespace: e2e-tests-volume-provision-llvbq, resource: bindings, ignored listing per whitelist
Oct 18 17:33:29.554: INFO: namespace e2e-tests-volume-provision-llvbq deletion completed in 8.444585266s
• [SLOW TEST:69.676 seconds]
[sig-storage] Volume Provisioning On Clustered Datastore [Feature:vsphere]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/framework.go:22
verify dynamic provision with default parameter on clustered datastore
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/storage/vsphere_volume_cluster_ds.go:121
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSOct 18 17:33:29.559: INFO: Running AfterSuite actions on all node
Oct 18 17:33:29.559: INFO: Running AfterSuite actions on node 1
Ran 3 of 714 Specs in 227.670 seconds
SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 711 Skipped PASS
Ginkgo ran 1 suite in 3m48.395491704s
Test Suite Passed
```
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 54366, 54500). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Service controller / gce loadbalancer logging cleanup
**What this PR does / why we need it**:
From https://github.com/kubernetes/kubernetes/issues/52495, while looking into the potential scalability issue service controller may have, I found it really hard to debug as the log sometimes doesn't reference which LB it is for, or sometimes it just doesn't log at all. It get even harder that in correctness CIs we reduce verbose level to v1:
67bc30718f/jobs/env/ci-kubernetes-e2e-gce-scale-correctness.env (L16).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE
**Special notes for your reviewer**:
/assign @nicksardo @bowei
cc @shyamjvs
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Make OpenStack LBaaS v2 Provider configurable
Add option 'lb-provider' to the Loadbalancer section of the OpenStack
cloudprovider configuration to allow using a different LBaaS v2
provider than the default.
**What this PR does / why we need it**:
This PR allows to use a different OpenStack LBaaS v2 provider than the default of the OpenStack cloud.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Added option lb-provider to OpenStack cloud provider config
```
Add option 'lb-provider' to the Loadbalancer section of the OpenStack
cloudprovider configuration to allow using a different LBaaS v2
provider than the default.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update the Google API clients used by the GCE provider to identify Kubernetes as the origin of GCP API calls.
**What this PR does / why we need it**:
This modifies the Google API clients used for the GCE provider to identify GCP API calls as originating from Kubernetes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39391
**Special notes for your reviewer**:
**Release note**:
```release-note
`NONE`
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Let the caller log error message
Talk to @anguslees offline and in resize pr, we should let the caller log the error message and in those functions, just return err.
Also error string should not be capitalized or end with punctuation
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @anguslees
Fix#51406
If cloud providers(like aws, gce etc...) implement ExternalID()
and support getting instance by ProviderID , they also implement
InstanceExistsByProviderID().
Automatic merge from submit-queue (batch tested with PRs 53694, 53919). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix controller manager crash issue on a manually created k8s cluster
**What this PR does / why we need it**:
fix controller manager crash issue on a manually created k8s cluster, it's due to availability set nil issue in azure loadbalancer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
In the testing of a manually created k8s cluster, I found controller manager on master would crash in current scenario:
1. Use acs-engine to set up k8s 1.7.7 cluster (it's with an availability set)
2. Manually add a node to the k8s cluster (without an availibity set in this VM)
3. Set up a service and schedule the pod onto this newly added node
4. controller manager would crash on master because although this k8s cluster has an availability set, the newly added node's `machine.AvailabilitySet` is nil which would cause controller manager crash
**Special notes for your reviewer**:
@brendanburns @karataliu @JiangtianLi
**Release note**:
```
fix controller manager crash issue on a manually created k8s cluster
```
/sig azure
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support autoprobing node-security-group for openstack cloud provider
1. Support autoprobing node-security-group
2. Support multiple Security Groups for cluster's nodes
3. Fix recreating Security Group for cluster's nodes
This is a part of #50726
**Special notes for your reviewer**:
/assign @anguslees
/assign @dims
**Release note**:
```release-note
Support autoprobing node-security-group for openstack cloud provider, Support multiple Security Groups for cluster's nodes.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix format specifiers in Azure cloud provider
**What this PR does / why we need it**: Fixes invalid/mismatched format specifiers in Azure cloud provider logging statements (`glog...Infof(...)`) that would cause information to be lost in logging output, as flagged by `go vet`.
**Which issue this PR fixes**: None
**Special notes for your reviewer**: None
**Release note**:
```release-note
NONE
```
Currently the service's name is not unique, and the Securty Group
name is not unique too. openstack cloud provider will delete the
Securty Group of other loadbalancer service when do a deletion.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix minor typo
**What this PR does / why we need it**:
Typo error
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 53444, 52067, 53571, 53182). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Retry when checking Azure storage account readiness
**What this PR does / why we need it**: When the Azure cloud provider ensures that a default storage container exists, if the storage account exists but is still provisioning, it exits without retrying. This is a bug as the code is wrapped in a backoff policy but never signals the policy to retry. This PR fixes this behaviour by returning values which allow the backoff policy to operate.
**Which issue this PR fixes**: fixes#53052
**Special notes for your reviewer**: Not sure how to test this - I have done a deployment using acs-engine and it seems to work but I am not sure of the best way to exercise the failure path.
**Release note**:
```release-note
NONE
```
1. Support autoprobing node-security-group
2. Support multiple Security Groups for cluster's nodes
3. Fix recreating Security Group for cluster's nodes
This is a part of #50726
Automatic merge from submit-queue (batch tested with PRs 52662, 53547, 53588, 53573, 53599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Return err when delete volume failed
Return err when delete volume failed
**Release note**:
```release-note
NONE
```
/kind bug
/sig openstack
Automatic merge from submit-queue (batch tested with PRs 53567, 53197, 52944, 49593). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[OpenStack]Add codes to check the count of nodes(members)
After merging this PR(#53146), if there is no available nodes for
the loadbalancer service, UpdateLoadBalancer() will run panic.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add possibility to ignore volume label in dynamic provisioning
**What this PR does / why we need it**: this is needed if openstack cinder zone name does not match to compute zone names. For instance if there is only one cinder zone and many compute zones.
**Which issue this PR fixes**: fixes#53488
**Special notes for your reviewer**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52768, 51898, 53510, 53097, 53058). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Ability to run the openstack tests against DevStack
**What this PR does / why we need it**:
Some of the environment variables have changed as devstack defaults
have changed. So look for the older env variables first and try the
newer ones later.
At a minimum you need the following for v3 authentication which is
the default with latest devstack. If you miss the Tenant information
then the token issued will be a unscoped token (and will not have any
service catalog information).
OS_AUTH_URL=http://192.168.0.42/identity
OS_REGION_NAME=RegionOne
OS_USERNAME=demo
OS_PASSWORD=supersecret
OS_TENANT_NAME=demo
OS_USER_DOMAIN_ID=default
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add cheftako to CP reviewers and wlan0 to approvers.
**What this PR does / why we need it**: wlan0 is helping to lead the separate cloud providers effort and so should be an approver. I am helping to do the gce effort and should probably be a reviewer.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: owners
**Special notes for your reviewer**:
**Release note**:
```release-note NONE
```
Automatic merge from submit-queue (batch tested with PRs 53418, 53366, 53115, 53402, 53130). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix the version detection of OpenStack Cinder
**What this PR does / why we need it**:
When running Kubernetes against an installation of DevStack which
deploys the Cinder service at a path rather than a port (ex:
http://foo.bar/volume rather than http://foo.bar:xxx), the version
detection fails. It is better to use the OpenStack service catalog.
OTOH, when initialize cinder client, kubernetes will check the
endpoint from the OpenStack service catalog, so we can do this
version detection by it.
There are two case should be fixed in other PR:
1. revisit the version detection after supporting Cinder V3 API.
2. add codes to support MicroVersion after gophercloud supports MicroVersion.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50461
**Special notes for your reviewer**:
/assign @dims
/assign @xsgordon
**Release note**:
```release-note
Using OpenStack service catalog to do version detection
```
Some of the environment variables have changed as devstack defaults
have changed. So look for the older env variables first and try the
newer ones later.
At a minimum you need the following for v3 authentication which is
the default with latest devstack. If you miss the Tenant information
then the token issued will be a unscoped token (and will not have any
service catalog information).
OS_AUTH_URL=http://192.168.0.42/identity
OS_REGION_NAME=RegionOne
OS_USERNAME=demo
OS_PASSWORD=supersecret
OS_TENANT_NAME=demo
OS_USER_DOMAIN_ID=default
Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Handle missing subnet for legacy networks and auto networks with unique subnet names
Fixes#53409
/assign @bowei
Tested on three GKE clusters with automatic, manual, and legacy networks.
**Release note**:
```release-note
GCE: Fixes ILB sync on legacy networks and auto networks with unique subnet names
```
Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add http request timeout for OpenStack cloud provider
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53191
**Special notes for your reviewer**:
/assign @NickrenREN @dims @FengyunPan
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
volunteer to help with external cloud providers
**What this PR does / why we need it**:
Looks like we have a single approver in Mike. Throwing my hat in
to help with approvals etc.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51754, 53261, 53450). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Ignore notFound when deleting firewall
**What this PR does / why we need it**:
Ignores a not found error when deleting a firewall on line 220.
**Which issue this PR fixes**:
Fixes#53411
**Special notes for your reviewer**:
/assign @MrHohn
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix broken cloud provider info urls
kubernetes/community repo's commit 3034683c5997474d9f59ef722c8ee9c1f1e58f07
started a re-org of the design-proposals directory to have hierarchical
structure and subdirectories. This in turn broke the urls in the
kubernetes/kubernetes/pkg/cloud-provider/README.md file. This patch adds
the appropriate subdirectories into the urls in the readme.
Signed-off-by: Tim Pepper <tpepper@vmware.com>
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move AWS Fake implementations out of test
The AWS fake implementations are in a test file and can't be imported into any other tests. This makes integration testing difficult. This PR moves the fake implementations such that they can be used by other entities.
@kubernetes/sig-aws-misc @justinsb
Automatic merge from submit-queue (batch tested with PRs 53234, 53252, 53267, 53276, 53107). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add get alpha backend service into cloud provider
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53101, 53158, 52165). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[OpenStack] Service LoadBalancer defaults to external
**What this PR does / why we need it**:
Let "service.beta.kubernetes.io/openstack-internal-load-balancer" default to false.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes#53078
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51311, 52575, 53169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Unable to detach the vSphere volume from Powered off node
With the existing implementation when a vSphere node is powered off, the node is not deleted by the node controller and is in "NotReady" state. Following the approach similar to GCE as mentioned here - https://github.com/kubernetes/kubernetes/issues/46442.
I observe the following issues:
- The pods on the powered off node are not **instantaneously** created on the other available node. Only after 5 minutes timeout, the pods will be created on other available nodes with the volume attached to it. This means an application downtime of around 5 minutes which is not good at all.
- The volume on the powered off node are not detached at all when the pod with the volume is already moved to other available node. Hence any attempt to restart the powered off node will fail as the same volume is attached to other node which is present on this powered off node. (Please note that the volumes are not automatically detached from powered off in vSphere as opposed to GCE, AWS where volume is automatically detached from when node is powered off).
So inorder to resolve this problem, we have decided to back with the approach where the powered off node will be removed by the Node controller. So the above 2 problems will be resolved as follows:
- Since the node is deleted, the pod on the powered off node becomes instantaneously available on other available nodes with the volume attached to the new nodes. Hence there is no application downtime at all.
- After a period of 6 minutes (timeout period), the volumes are automatically detached from the powered off node. Hence any restarts after 6 minutes on the powered off node would work and not cause any problems as volumes are already detached.
For now, we would want to go ahead with deleting the node from node controller when a node is powered off in vCenter until we have a better approach. I think the best possible solution would be to introduce power handler in volume controller to see if the node is powered off before we can take any appropriate for attach/detach operations.
```release-note
None
```
@jingxu97 @saad-ali @divyenpatel @luomiao @rohitjogvmw
Automatic merge from submit-queue (batch tested with PRs 50280, 52529, 53093, 53108, 53168). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Mark volume as detached when node does not exist for photon
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
**Special notes for your reviewer**:
/assign @jingxu97
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53157, 52628). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added openstack instance metadata search order
**What this PR does / why we need it**: This PR adds a search order for the instance metadata retrieval on openstack. More information and discussion can be found on #52378
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52378
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
kubernetes/community repo's commit 3034683c5997474d9f59ef722c8ee9c1f1e58f07
started a re-org of the design-proposals directory to have hierarchical
structure and subdirectories. This in turn broke the urls in the
kubernetes/kubernetes/pkg/cloud-provider/README.md file. This patch adds
the appropriate subdirectories into the urls in the readme.
While the kubernetes/kubernetes/pkg/cloud-provider/cloud-provider
directory represents an area that's deprecated now, this patch isn't
introducing anything new, but rather fixes the broken links to
information on the deprecation and info on the evolving forward
path for the cloud providers.
Signed-off-by: Tim Pepper <tpepper@vmware.com>
When running Kubernetes against an installation of DevStack which
deploys the Cinder service at a path rather than a port (ex:
http://foo.bar/volume rather than http://foo.bar:xxx), the version
detection fails. It is better to use the OpenStack service catalog.
OTOH, when initialize cinder client, kubernetes will check the
endpoint from the OpenStack service catalog, so we can do this
version detection by it.
Automatic merge from submit-queue (batch tested with PRs 52751, 52898, 52633, 52611, 52609). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Only register floatingIP for external loadbalancer service
If the user has provided the floating-ip options, then it's safe
to assume they want (only) the floating-ip to be the ingress IP;
if they have not provided floating-ip options, then the LB IP is
the only relevant value.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fix#52566
**Release note**:
```release-note
Only register floatingIP into Loadbalancer ingress field for external loadbalancer service
```
Automatic merge from submit-queue (batch tested with PRs 52751, 52898, 52633, 52611, 52609). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix missing floatingip when calling GetLoadBalancer()
If user specify floating-network-id, a floatingip and a vip will
be assigned to LoadBalancer service, So its status contains a
floatingip and a vip, but GetLoadBalancer() only return vip.
**Release note**:
```release-note
GetLoadBalancer() only return floatingip when user specify floating-network-id, or return LB vip.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix GCE LB resource cleanup for service e2e tests.
**What this PR does / why we need it**: Fix GCE LB resource cleanup logic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52347
**Special notes for your reviewer**:
/assign @shyamjvs @nicksardo
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52880, 52855, 52761, 52885, 52929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Remove cloud provider rackspace
**What this PR does / why we need it**:
For now, we have to implement functions in both `rackspace` and `openstack` packages if we want to add function for cinder, for example [resize for cinder](https://github.com/kubernetes/kubernetes/pull/51498). Since openstack has implemented all the functions rackspace has, and rackspace is considered deprecated for a long time, [rackspace deprecated](https://github.com/rackspace/gophercloud/issues/592) ,
after talking with @mikedanese and @jamiehannaford offline , i sent this PR to remove `rackspace` in favor of `openstack`
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52854
**Special notes for your reviewer**:
**Release note**:
```release-note
The Rackspace cloud provider has been removed after a long deprecation period. It was deprecated because it duplicates a lot of the OpenStack logic and can no longer be maintained. Please use the OpenStack cloud provider instead.
```
Automatic merge from submit-queue (batch tested with PRs 50068, 52406, 52394, 48551, 52131). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Implement bulk polling of volumes for vSphere
This PR implements bulk polling of volumes - BulkVerifyVolumes() API for vSphere.
With the existing implementation, vSphere makes multiple calls to VC to check if the volume is attached to a node. If there are "N" volumes attached on "M" nodes, vSphere makes "N" VCenter calls to check if the volumes are attached to VC for all "N" volumes. Also, by default Kubernetes queries if the volumes are attached to nodes every 1 minute. This will substantially increase the number of calls made by vSphere cloud provider to vCenter.
Inorder to prevent this, vSphere cloud provider implements the BulkVerifyVolumes() API in which only a single call is made to vCenter to check if all the volumes are attached to the respective nodes. Irrespective of the number of volumes attached to nodes, the number of vCenter calls will always be 1 on a query to BulkVerifyVolumes() API by kubernetes.
@rohitjogvmw @divyenpatel @luomiao
```release-note
BulkVerifyVolumes() implementation for vSphere
```
Automatic merge from submit-queue (batch tested with PRs 52355, 52537, 52551, 52403, 50673). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Azure - Use cloud environment to instantiate storage client
**What this PR does / why we need it**:
Since 1.7 and managed disk for azure, blob storage on Azure cloud other than the default public one is broken, because kubernetes expect blob ressources URI to end with `.blob.core.windows.net ` (ignoring storageEndpointSuffix).
This include the chinese Cloud, for which storageEndpointSuffix is `blob.core.chinacloudapi.cn` for example.
See : https://github.com/Azure/azure-storage-go/blob/master/client.go#L194
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50294, 50422, 51757, 52379, 52014). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Azure cloud provider: expose services on non-default subnets
**What this PR does / why we need it**: The Azure cloud provider allows users to specify that a service should be exposed on an internal load balancer instead of the default external load balancer. However, in a VNet environment, such services are currently always exposed on the master subnet. Where there are multiple subnets in the VNet, it's desirable to be able to expose an internal service on any subnet. This PR allows this via a new annotation, `service.beta.kubernetes.io/azure-load-balancer-internal-subnet`.
**Which issue this PR fixes**: fixes https://github.com/Azure/acs-engine/issues/1296 (no corresponding issue has been raised in the k8s core repo)
**Special notes for your reviewer**: None
**Release note**:
```release-note
A new service annotation has been added for services of type LoadBalancer on Azure,
to specify the subnet on which the service's front end IP should be provisioned. The
annotation is service.beta.kubernetes.io/azure-load-balancer-internal-subnet and its
value is the subnet name (not the subnet ARM ID). If omitted, the default is the
master subnet. It is ignored if the service is not on Azure, if the type is not
LoadBalancer, or if the load balancer is not internal.
```
Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Avoid printing node list for LoadBalancer in log file
**What this PR does / why we need it**: Production log files get saturated with EnsureLoadBalancer messages, this is problematic for sysadmins.
This patch avoids printing the node list on the AWS logs so the log file is more readable.
Automatic merge from submit-queue (batch tested with PRs 43016, 50503, 51281, 51518, 51582). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Refactor: Moving disk-related cloud provider operations to gce_disks.go
**What this PR does / why we need it**: The main GCE cloud provider code (pkg/cloudprovider/providers/gce/gce.go) should not contain disk-related operations. Moved them to gce_disks.go
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51280
**Release note**:
```release-note
NONE
```
/release-note-none
/sig storage
/assign @msau42 @bowei
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Implement the `cloudprovider.Instances` interface for CloudStack
This PR adds code to support the `cloudprovider.Instances` interface, for the CloudStack provider
Closes#47303
If user specify floating-network-id, a floatingip be assigned to
LoadBalancer service, So its status contains a floatingip, but
GetLoadBalancer() only return vip.
If the user has provided the floating-ip options, then it's safe
to assume they want (only) the floating-ip to be the ingress IP;
if they have not provided floating-ip options, then the LB IP is
the only relevant value.
Fix#52566
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)
Remove links to GCE/AWS cloud providers from PersistentVolumeCo…
…ntroller
**What this PR does / why we need it**:
We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now, we should have allow PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces etc. Since
it is too late to do all that for 1.8, we just move cloud specific code
to where they belong and we check for PVLabeler method and use it where
needed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#51629
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix splitProviderID for Azure
**What this PR does / why we need it**:
#46940 add 'splitProviderID' for Azure to get node name from provider, but it captures the resource id instead of node name.
Functions such as NodeAddresses are accepting node names:
84d9778f22/pkg/cloudprovider/providers/azure/azure_instances.go (L32)
With current implementation, it takes in a resource ID, and will result in following error
```
E0830 04:15:09.877143 10427 azure_instances.go:63] error: az.NodeAddresses, az.getIPForMachine(/subscriptions/{id}/resourceGroups/{id}/providers/Microsoft.Compute/virtualMachines/k8s-master-0), err=instance not found
```
This fix makes is return node names instead.
**Which issue this PR fixes**
**Special notes for your reviewer**:
**Release note**:
`NONE`
@brendandburns @realfake @wlan0
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)
implementation of GetZoneByProviderID and GetZoneByNodeName for azure
This is part of the #50926 effort
cc @luxas
**Release note**:
```release-note
None
```
We should be able to build a cloud-controller-manager without having to
pull in code specific to GCE and AWS clouds. Note that this is a tactical
fix for now, we should have allow PVLabeler to be passed into the
PersistentVolumeController, maybe come up with better interfaces etc. Since
it is too late to do all that for 1.8, we just move cloud specific code
to where they belong and we check for PVLabeler method and use it where
needed.
Fixes#51629
Automatic merge from submit-queue (batch tested with PRs 51984, 51351, 51873, 51795, 51634)
Bug Fix - Adding an allowed address pair wipes port security groups
**What this PR does / why we need it**:
Fix for cloud routes enabled instances will have their security groups
removed when the allowed address pair is added to the instance's port.
Upstream bug report is in:
https://github.com/gophercloud/gophercloud/issues/509
Upstream bug fix is in:
https://github.com/gophercloud/gophercloud/pull/510
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#51755
**Special notes for your reviewer**:
Just an fix in vendored code. minimal changes needed in OpenStack cloud provider
**Release note**:
```release-note
NONE
```
Modifies the VolumeZonePredicate to handle a PV that belongs to more
then one zone or region. This is indicated by the zone or region label
value containing a comma separated list.
Automatic merge from submit-queue (batch tested with PRs 50602, 51561, 51703, 51748, 49142)
Implement GetZoneByProviderID & GetZoneByNodeName
Adding an implementation of GetZoneByProviderID & GetZoneByNodeName for
GCE.
This is related to ticket 50926.
This was tested as part of the ongoing separate GCE cloud provider work.
**What this PR does / why we need it**: It implements GCE methods needed by the cloud provider work.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50926
**Special notes for your reviewer**: Tested with pull/50811
**Release note**:
<!-- Steps to write your release note:
```release-note NONE
```
Automatic merge from submit-queue (batch tested with PRs 51301, 50497, 50112, 48184, 50993)
Replace the deprecated function with the suggest function in aws module
**What this PR does / why we need it**:
There are some deprecated function and I replace the deprecated function with the suggest function in aws module.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51301, 50497, 50112, 48184, 50993)
AWS: handle multiple IPs when using more than 1 network interface per ec2 instance
**What this PR does / why we need it**:
Adds support for kubelets running with the AWS cloud provider on ec2 instances with multiple network interfaces. If the active interface is not eth0, the AWS cloud provider currently reports the wrong node IP.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44686
**Special notes for your reviewer**:
There is also some work necessary for handling multiple DNS names and such but I didn't fix them in this PR.
**Release note**:
```release-note
Fixed bug in AWS provider to handle multiple IPs when using more than 1 network interface per ec2 instance.
```
Automatic merge from submit-queue
GCE: Add Alpha feature "Network Tiers" for external L4 load balancers
**Special notes for your reviewer**:
The PR has been manually tested in a GCE e2e cluster for the following conditions:
1. When `network-tier` is not enabled in gce.conf, network tier annotations are completely ignored by the controller.
2. When `network-tier` is enabled in gce.conf:
* Service w/ Standard tier: create a standard-tier LB.
* Update Service to use a different tier: tear down the existing forwarding rule and release the IP before creating a new LB.
* Service w/ an invalid tier value: `ensureExternalLoadBalancer()` returns an error, and controller emits an event.
* Service w/ a user-owned static IP: check if the tier matches, if not, returns an error and emits an event.
I uploaded an e2e test #51483. You're welcome to review that one too.
**Release note**:
```release-note
GCE: Service object now supports "Network Tiers" as an Alpha feature via annotations.
```
Automatic merge from submit-queue
Fix InstanceTypeByProviderID for Azure
**What this PR does / why we need it**:
Fix change in #46940, should return InstanceType in function InstanceTypeByProviderID
Otherwise:
```
I0830 05:01:08.497989 15347 node_controller.go:328] Adding node label from cloud provider: beta.kubernetes.io/instance-type=/subscriptions/{id}/resourceGroups/{id}/providers/Microsoft.Compute/virtualMachines/k8s-agentpool1
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
@brendandburns @realfake
Adding an implementation of GetZoneByProviderID & GetZoneByNodeName for
GCE.
This is related to ticket 50926.
This was tested as part of the ongoing separate GCE cloud provider work.
Added unit test.
Fix for wojtek-t (borrowed from FengyunPan)
Automatic merge from submit-queue (batch tested with PRs 51632, 51055, 51676, 51560, 50007)
GCE: Reserve address for ILBs during sync
**What this PR does / why we need it**:
This PR adds the ability for the service controller to hold the ILB's IP during sync which may delete/recreate the forwarding rule.
Fixes: #47531
**Release note**:
```release-note
GCE: Internal load balancer IPs are now reserved during service sync to prevent losing the address to another service.
```
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
fix typo about volumes
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
implementation of GetZoneByProviderID and GetZoneByNodeName for AWS
This a part of the #50926 effort
cc @luxas
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
AWS: check validity of KSM key before creating a new encrypted disk.
AWS CreateVolume call does not check if referenced encryption key actually exists and returns a valid new AWS EBS volume even though an invalid key was specified. Later on it removes the EBS silently when its encryption fails.
To work around this buggy behavior we manually check that the key exists before calling CreateVolume.
Fixes#48438
/sig aws
Please review carefully. Can we safely assume that Kubernetes controller-manager can read encryption keys?
```release-note
aws: Kubernetes now checks existence of provided KSM (Key Management Service) key before creating an encrypted AWS EBS.
```
Automatic merge from submit-queue
e2e: Add tests for network tiers in GCE
This test depends on #51301, which adds the new feature. Only the `e2e: Add tests for network tiers in GCE` commit is new.
#51301 should pass this new test.
Automatic merge from submit-queue
Add Google cloud KMS service for envelope encryption transformer
This adds the required pieces which will allow addition of KMS based encryption providers (envelope transformer).
For now, we will be implementing it using Google Cloud KMS, but the code should make it easy to add support for any other such provider which can expose Decrypt and Encrypt calls.
Writing tests for Google Cloud KMS Service may cause a significant overhead to the testing framework. It has been tested locally and on GKE though.
Upcoming after this PR:
* Complete implementation of the envelope transformer, which uses LRU cache to maintain decrypted DEKs in memory.
* Track key version to assist in data re-encryption after a KEK rotation.
Development branch containing the changes described above: https://github.com/sakshamsharma/kubernetes/pull/4
Envelope transformer used by this PR was merged in #49350
Concerns #48522
Planned configuration:
```
kind: EncryptionConfig
apiVersion: v1
resources:
- resources:
- secrets
providers:
- kms:
cachesize: 100
configfile: gcp-cloudkms.conf
name: gcp-cloudkms
- identity: {}
```
gcp-cloudkms.conf:
```
[GoogleCloudKMS]
kms-location: global
kms-keyring: google-container-engine
kms-cryptokey: example-key
```
Automatic merge from submit-queue (batch tested with PRs 51298, 51510, 51511)
GCE: Add a fake forwarding rule service
Also add more methods to the address service. These
will be used for testing soon.
Automatic merge from submit-queue (batch tested with PRs 50919, 51410, 50099, 51300, 50296)
GCE: Read networkProjectID param
Fixes#48515
/assign bowei
The first commit is the original PR cherrypicked. The master's kubelet isn't provided a cloud config path, so the project is retrieved via instance metadata. In the GKE case, this project cannot be retrieved by the master and caused an error.
**Release note**:
```release-note
NONE
```
AWS CreateVolume call does not check if referenced encryption key actually
exists and returns a valid new AWS EBS volume even though an invalid key
was specified. Later on it removes the EBS silently when its encryption fails.
To work around this buggy behavior we manually check that the key exists
before calling CreateVolume.
Automatic merge from submit-queue
Implement GetZoneByProviderID and GetZoneByNodeName for openstack
This is part of #50926
cc @wlan0
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51174, 51363, 51087, 51382, 51388)
Add InstanceExistsByProviderID to cloud provider interface for CCM
**What this PR does / why we need it**:
Currently, [`MonitorNode()`](02b520f0a4/pkg/controller/cloud/nodecontroller.go (L240)) in the node controller checks with the CCM if a node still exists by calling `ExternalID(nodeName)`. `ExternalID` is supposed to return the provider id of a node which is not supported on every cloud. This means that any clouds who cannot infer the provider id by the node name from a remote location will never remove nodes that no longer exist.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50985
**Special notes for your reviewer**:
We'll want to create a subsequent issue to track the implementation of these two new methods in the cloud providers.
**Release note**:
```release-note
Adds `InstanceExists` and `InstanceExistsByProviderID` to cloud provider interface for the cloud controller manager
```
/cc @wlan0 @thockin @andrewsykim @luxas @jhorwit2
/area cloudprovider
/sig cluster-lifecycle
Automatic merge from submit-queue (batch tested with PRs 51235, 50819, 51274, 50972, 50504)
Support for specifying external LoadBalancerIP on openstack
1. Support ServiceAnnotationLoadBalancerFloatingNetworkId for LB v1
2. Support for specifying external LoadBalancerIP on openstack
Add ServiceAnnotationLoadBalancerInternal annotation to distinguish
between internal LoadBalancerIP and external LoadBalancerIP.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fix#50851
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51114, 51233, 51024, 51053, 51197)
Add AddAliasToInstance() to gce cloud provider
- Adds AddAliasToInstance() to the GCE cloud provider.
- Adds field "secondary-range-name" to the gce.conf configuration file.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51244, 50559, 49770, 51194, 50901)
Fix the matching rule of instance ProviderID
Url.Parse() can't parse ProviderID which contains ':///'.
This PR use regexp to match ProviderID.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fix#49769
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
Change the FakeCloudAddressService to store Alpha objects internally
The change assumes the compute Alpha object is the superset of the v1
object. By storing the Alpha objects internally in the fake, we can
convert them to Beta and v1 to test different functions.
The change assumes the compute Alpha object is the superset of the v1
object. By storing the Alpha objects internally in the fake, we can
convert them to Beta and v1 to test different functions.
- Adds AddAliasToInstance() to the GCE cloud provider.
- Adds field "secondary-range-name" to the gce.conf configuration file.
```release-note
NONE
```
Automatic merge from submit-queue
cloudprovider/openstack bug fix: don't try to append pool id if pool doesn't exist
**What this PR does / why we need it**:
This fixes a bug in the OpenStack cloud provider that could cause a panic.
Consider what will happen in the current `LbaasV2.EnsureLoadBalancerDeleted` code if `nil, ErrNotFound` is returned by `getPoolByListenerID`.
Automatic merge from submit-queue (batch tested with PRs 38947, 50239, 51115, 51094, 51116)
Mark the volumes as detached when node does not exist
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and do not return err.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#50200
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
teach gce cloud to handle alpha/beta operations v2
Alternative to #50704
This one feels cleaner. BUT, type assertion problems cannot be exposed at compile time.
Please let me know what you think. This will set the precedence for consuming GCE alpha/beta API.
cc: @thockin @yujuhong @saad-ali @MrHohn
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)
gce external LB: add a function to verify the requested IP address
Factor out the logic for verifying the user-requested IP for better
readability and testing. Also rename a few variables for clarity.
If user specify floating-network-id by annotation rather than cloud
provider file, openstack cloud provider don't delete floatingip when
deleting LoadBalancer service.
Automatic merge from submit-queue (batch tested with PRs 50255, 50885)
AWS: Arbitrarily choose first (lexicographically) subnet in AZ
When there is more than one subnet for an AZ on AWS choose arbitrarily
chose the first one lexicographically for consistency.
**What this PR does / why we need it**:
If two subnets were to be used appear in the same aws az which one is chosen is currently not consistent. This could lead to difficulty in diagnosing issues.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45983
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50303, 50856)
Make route-controller list only relevant routes instead of all of them
Ref https://github.com/kubernetes/kubernetes/issues/50854 (somewhat related issue)
IIUC from the code, route-controller memory is mainly being used in storing routes and nodes (also CIDRs, but that's not much).
This should help reduce that memory usage (particularly when running in a project with large no. of routes), by moving filtering to server-side.
For e.g in kubernetes-scale project we have ~5000 routes (each about 600B) => 3 MB of routes
This doesn't help with reducing time to list the routes as filtering is also linear.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Mark volume as detached when node does not exist for vsphere
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
**Special notes for your reviewer**:
/assign @jingxu97
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46317, 48922, 50651, 50230, 47599)
Log name if Azure file share cannot be created
**What this PR does / why we need it**: If the Azure storage provider fails to create a file share, it logs and error message 'failed to create share in account _foo_: _error-msg_'. A user on the Slack azure-sig channel reported an error of "The specified resource name length is not within the permissible limits". This PR adds logging of the name so that this error can be diagnosed in future.
**Which issue this PR fixes**: This was raised on Slack and has not been created as a GitHub issue.
**Special notes for your reviewer**: None
**Release note**:
```release-note
Changed the error log format when creating an Azure file share to include the name of the share.
```
Automatic merge from submit-queue (batch tested with PRs 50061, 48580, 50779, 50722)
Fix for Policy based volume provisioning failure due to long VM Name in vSphere cloud provider
Dummy VM is used for SPBM policy based provisioning feature of vSphere cloud provider.
Dummy VM name is generated based on kubernetes cluster name and pv name. It can easily go beyond
vSphere's limitation of 80 characters for vmName.
To solve the long VM name failure hash is used instead of vSphere-k8s-clusterName-PvName
**Which issue this PR fixes**
https://github.com/vmware/kubernetes/issues/176
**Release note:**
```release-note
None
```
@BaluDontu @divyenpatel @luomiao @tusharnt
Currently if user doesn't specify subnet-id or specify a unsafe
subnet-id, openstack cloud provider can't create a correct LoadBalancer
service.
Actually we can get it automatically. This patch do a improvement.
This is a part of #50726
vSphere has limitation of 80 characters for vmName.
with vsphere-k8s prefix and "vmdisk.volumeOptions.Name" vmName can become easily bigger than 80 chars.
Used hash funciton just of the "vmdisk.volumeOptions.Name" part as cleanup dummyVm logic depends on prefix "vsphere-k8s"
If node doesn't exist, OpenStack Nova will assume the volumes
are not attached to it. So mark the volumes as detached and
return false without error.
Fix: #50200
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
Automatic merge from submit-queue (batch tested with PRs 47724, 49984, 49785, 49803, 49618)
Fix conflict about getPortByIp
**What this PR does / why we need it**:
Currently getPortByIp() get port of instance only based on IP.
If there are two instances in diffent network and the CIDR of
their subnet are same, getPortByIp() will be conflict.
My PR gets port based on IP and Name of instance.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fix#43909
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Added jdumars to OWNERS file for Azure cloud provider
**What this PR does / why we need it**:
This PR adds GitHub user jdumars as an approver to pkg/cloudprovider/providers/azure
Jaice Singer DuMars (me) is the program manager at Microsoft tasked with shepherding all upstream contributions from Microsoft into Kubernetes. With the volume of work, and the impending breakout of cloud provider code, this helps distribute the review and approval load more evenly.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
N/A
**Special notes for your reviewer**:
This was discussed with Brendan Burns prior to submitting the pre-approval.
**Release note**:
none
Automatic merge from submit-queue
GCE: Fix lowercase value and alpha-missing annotation for ILB
**What this PR does / why we need it**:
Fixes#50426
Also explicitly sets an annotation as 'alpha'.
/assign @freehan @bowei
**Release note**:
```release-note
NONE
```
Currently getPortByIp() get port of instance only based on IP.
If there are two instances in diffent network and the CIDR of
their subnet are same, getPortByIp() will be conflict.
My PR gets port based on IP and Name of instance.
Automatic merge from submit-queue
Ignore the available volume when calling DetachDisk
Fix#50207
If user detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically and become available. So the "available" is fine since that means the
volume is detached from instance already.
**Release note**:
```release-note
NONE
```
If node does not exist, node's volumes will be detached
automatically and become available. So mark them detached and
return false without error.
Fix#50266
Automatic merge from submit-queue (batch tested with PRs 49524, 46760, 50206, 50166, 49603)
[OpenStack] Add more detail error message
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Currently NewBlockStorageV2() return err when failed to get cinder endpoint, but there is no code to output the message of err.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Azure: Allow VNet to be in a separate Resource Group
**What this PR does / why we need it**:
This PR allows Kubernetes in an Azure context to use a VNet which is not in the same Resource Group as Kubernetes.
We need this because currently Azure Cloud Provider driver assumes that it should have a VNet for himself but if there is one thing that should be shared amongst Azure resources it's a VNet cause, well, things might want to talk to each other in a private network, don't you think ?
I guess this should we backported down to 1.6 branch.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes#49577
**Release note**:
```release-note
NONE
```
@kubernetes/sig-azure
@kubernetes/sig-azure-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50418, 49830, 49206, 49061, 49912)
add LocalZone into gce.conf and refactor gce cloud provider configura…
The main goal of this PR is to make gce cloud provider able to run locally.
1. added a LocalZone parameter into gce.conf.
2. refactor `newGCECloud` to avoid contacting metadata server if configuration is already available.
```release-note
None
```
Automatic merge from submit-queue
VSphere cloud provider code refactoring
The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.
- VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
- vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
- Modifications to e2e tests to accomodate the latest design changes.
@divyenpatel @rohitjogvmw @luomiao
```release-note
vSphere cloud provider: vSphere cloud provider code refactoring
```
Automatic merge from submit-queue (batch tested with PRs 50087, 39587, 50042, 50241, 49914)
AttachDisk should not call detach inside of Cinder volume provider
If use detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically. So the "available" is fine since that means the
volume is detached from instance already.
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Automatic merge from submit-queue (batch tested with PRs 49805, 50052)
We never want to modify the globally defined SG for ELBs
**What this PR does / why we need it**:
Fixes a bug where creating or updating an ELB will modify a globally defined security group
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50105
**Special notes for your reviewer**:
**Release note**:
```release-note
fixes a bug around using the Global config ElbSecurityGroup where Kuberentes would modify the passed in Security Group.
```
Automatic merge from submit-queue (batch tested with PRs 47416, 47408, 49697, 49860, 50162)
add possibility to use multiple floatingip pools in openstack loadbalancer
**What this PR does / why we need it**: Currently only one floating pool is supported in kubernetes openstack cloud provider. It is quite big issue for us, because we want run only single kubernetes cluster, but we want that external and internal services can be used. It means that we need possibility to create services with internal and external pools.
**Which issue this PR fixes**: fixes#49147
**Special notes for your reviewer**: service labels is not maybe correct place to define this floatingpool id. However, I did not find any better place easily. I do not want start modifying service api structure.
**Release note**:
```release-note
Add possibility to use multiple floatingip pools in openstack loadbalancer
```
Example how it works:
```
cat /etc/kubernetes/cloud-config
[Global]
auth-url=https://xxxx
username=xxxx
password=xxxx
region=yyy
tenant-id=b23efb65b1d44b5abd561511f40c565d
domain-name=foobar
[LoadBalancer]
lb-version=v2
subnet-id=aed26269-cd01-4d4e-b0d8-9ec726c4c2ba
lb-method=ROUND_ROBIN
floating-network-id=56e523e7-76cb-477f-80e4-2dc8cf32e3b4
create-monitor=yes
monitor-delay=10s
monitor-timeout=2000s
monitor-max-retries=3
```
```
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 1
template:
metadata:
labels:
run: web
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-ext
name: web-ext
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-int
floatingPool: a2a84887-4915-42bf-aaff-2b76688a4ec7
name: web-int
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
```
```
% kubectl create -f example.yaml
deployment "nginx-deployment" created
service "web-ext" created
service "web-int" created
% kubectl get svc -o wide
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes 10.254.0.1 <none> 443/TCP 2m <none>
web-ext 10.254.23.153 192.168.1.57,193.xx.xxx.xxx 80:30151/TCP 52s run=web
web-int 10.254.128.141 192.168.1.58,10.222.130.80 80:32431/TCP 52s run=web
```
cc @anguslees @k8s-sig-openstack-feature-requests @dims
Automatic merge from submit-queue (batch tested with PRs 49916, 50050)
GCE: Fix bug by correctly cast port to string
Code is incorrectly casting a port to a string, causing the diff-expression to always return true.
**What this PR does / why we need it**:
Fixes#50049
**Special notes for your reviewer**:
/assign @MrHohn
**Release note**:
```release-note
NONE
```
if not needed here
load network ids from gophercloud api
fix to getnetworkbyname
update godeps, add networks library
fix gofmt and boilerplate
gofmt
use annotations
fix
remove enableflag
add comment to annotationvalue
Automatic merge from submit-queue (batch tested with PRs 49870, 49416, 49872, 49892, 49908)
fix alpha/beta endpoint when api endpoint is specified
fix a bug in alpha/beta compute API endpoint bootstraping when api-endpiont is specified.
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Adding unit test for ensureStaticIP
**What this PR does / why we need it**:
Entry into unit testing GCE loadbalancer code by testing `ensureStaticIP` which had a bug in 1.7.0.
@bowei @freehan @MrHohn @dnardo @thockin, any thoughts and comments on how we could unit test LB code moving forward? I think there are many areas we can split functions into smaller ones for easier testing - firewallNeedsUpdate being an example of that. However, it seems to me that we still need to mock our GCP calls for some functions that heavily revolve around API calls. A dream goal would be to have a unit test that can call EnsureLoadBalancer. Now that we have shared resources between different services and ingresses (firewalls, instance groups, [future features]), being able to setup different scenarios without depending on E2E tests would be awesome. However, I'm not sure how reachable that goal would be.
Most importantly, let's not make things worse. If you have advice on anti-patterns to avoid, please speak up.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Update vendor of gcfg and filter config parsing errors
**What this PR does / why we need it**:
To utilize new function `FatalOnly` which filters "programmer errors"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#49660
**Special notes for your reviewer**:
/assign @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
GCE: Remove resource Get function calls from Create functions
**What this PR does / why we need it**:
Consistency. This PR removes the GetXXX from the CreateXXX functions of the GCE cloudprovider. Consumers (specifically the ingress controller) will need to call the Get resource funcs separately when updating their vendored versions.
**Release note**:
```release-note
NONE
```
/assign @bowei
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
Better message if we dont find appropriate BlockStorage API
**What this PR does / why we need it**:
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Define a new config VnetResourceGroup in order to be able to use a VNet
which is not in the same resource group as kubernetes.
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
Automatic merge from submit-queue
Warn if aws has no cluster id provided
**What this PR does / why we need it**:
we info log a message when no cluster id is provided that should be a warning given its impact.
fixes https://github.com/kubernetes/kubernetes/issues/49568
**Release note**:
```release-note
NONE
```
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Avoid looking up instance id until we need it
**What this PR does / why we need it**:
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
So let's try to find the instance-id only when we need it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump up gce minNodesHealthCheckVersion due to known issues
**What this PR does / why we need it**: There are some known issues in previous 1.7 versions causing kube-proxy not correctly responding healthz traffic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @nicksardo @freehan
cc @bowei @thockin
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service will have health checks for nodes by default if all nodes have version >= v1.7.2.
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Added logging to AWS api calls. #46969
Additionally logging of when AWS API calls start and end to help diagnose problems with kubelet on cloud provider nodes not reporting node status periodically. There's some inconsistency in logging around this PR we should discuss.
IMO, the API logging should be at a higher level than most other types of logging as you would probably only want it in limited instances. For most cases that is easy enough to do, but there are some calls which have some logging around them already, namely in the instance groups. My preference would be to keep the existing logging as it and just add the new API logs around the API call.
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
Automatic merge from submit-queue (batch tested with PRs 49276, 49235)
Don't fail fast if LoadBalancer section is missing
**What this PR does / why we need it**:
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
Automatic merge from submit-queue
Tolerate Flavor information for computing instance type
**What this PR does / why we need it**:
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix on-premises term in error string and comments for aws provider
**What this PR does / why we need it**: fix for correct terminology of "on-premises" over "on-premise"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: Updated error string while doing a scrub for the incorrect term in the docs (kubernetes/kubernetes.github.io#4413).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49083, 45540, 46862)
Add extra logging to azure API get calls
**What this PR does / why we need it**:
This PR adds extra logging for external calls to the Azure API, specifically get calls.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
This will help troubleshoot problems arising from the usage of this cloudprovider. For example, it looks like #43516 is caused by a call to the cloudprovider taking too much time.
Automatic merge from submit-queue (batch tested with PRs 49218, 48253, 48967, 48460, 49230)
additional backoff in azure cloudprovider
Fixes#48971
**What this PR does / why we need it**:
We want to be able to opt in to backoff retry logic for kubelet-originating request behavior: node IP address resolution and node load balancer pool membership enforcement.
**Special notes for your reviewer**:
The use-case for this is azure cloudprovider clusters with large node counts, especially during cluster installation, or other scenarios when lots of nodes come online at once and attempt to register all resources with the backend API. To allow clusters at scale more control over the API request rate in-cluster, backoff config has the ability to meaningful slow down this rate, when appropriate.
**Release note**:
```additional backoff in azure cloudprovider
```
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)
Rackspace for cloud-controller-manager
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.
This is part of #47257 cc @wlan0
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48890, 46893, 48872, 48896)
Fix the order of deletion
1. EnsureLoadBalancer can't delete pool without deleting members,
just let EnsureLoadBalancerDeleted do it.
2. Add some friendly error message
**Release note**:
```release-note
NONE
```
EnsureHostInPool() submits a GET to azure API for VM info. We’re seeing this on agent node kubelets and would like to enable configurable backoff engagement for 4xx responses to be able to slow down the rate of reconciliation, when appropriate.
Automatic merge from submit-queue (batch tested with PRs 47066, 48892, 48933, 48854, 48894)
azure: msi: add managed identity field, logic
**What this PR does / why we need it**: Enables managed service identity support for the Azure cloudprovider. "Managed Service Identity" allows us to ask the Azure Compute infra to provision an identity for the VM. Users can then retrieve the identity and assign it RBAC permissions to talk to Azure ARM APIs for the purpose of the cloudprovider needs.
Per the commit text:
```
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: none
**Release note**:
```release-note
azure: support retrieving access tokens via managed identity extension
```
cc: @brendandburns @jdumars @anhowe
Automatic merge from submit-queue
remove some people from OWNERS so they don't get reviews anymore
These are googlers who don't work on the project anymore but are still
getting reviews assigned to them:
- @bprashanth
- @rjnagal
- @vmarmol
Automatic merge from submit-queue
Azure PD (Managed/Blob)
This is exactly the same code as this [PR](https://github.com/kubernetes/kubernetes/pull/41950). It has a clean set of generated items. We created a separate PR to accelerate the accept/merge the PR
CC @colemickens
CC @brendandburns
**What this PR does / why we need it**:
1. Adds K8S support for Azure Managed Disks.
2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account).
3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.
2. Addresses the current issues with Blob Disks:
..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted.
..* Adds support to move disks between nodes.
..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.
The existing AzureDisk is used as is. Additional "kind" property was added allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).
Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
Automatic merge from submit-queue (batch tested with PRs 48555, 48849)
GCE: Fix panic when service loadbalancer has static IP address
Fixes#48848
```release-note
Fix service controller crash loop when Service with GCP LoadBalancer uses static IP (#48848, @nicksardo)
```
Automatic merge from submit-queue (batch tested with PRs 48781, 48817, 48830, 48829, 48053)
vSphere for cloud-controller-manager
**What this PR does / why we need it**:
This is to implement the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods for cloud-controller-manager for vSphere cloud provider.
Currently vSphere cloud provider only supports VMs in the same folder.
Thus `NodeAddressesByProviderID` is similar to `NodeAddresses` with a simple ProviderID to NodeName translation.
`InstanceTypeByProviderID` returns nil as same as `InstanceType`.
**Which issue this PR fixes**
Part of Issue https://github.com/kubernetes/kubernetes/issues/47257
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Add initial support for the Azure instance metadata service.
Part of fixing #46632
@colemickens @rootfs @jdumars @kris-nova
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Fix panic of DeleteRoute()
Fix#48800
It should be 'addr_pairs', not 'routes'.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47948, 48631, 48693, 48549, 47593)
OpenStack for cloud-controller-manager
**What this PR does / why we need it**:
This implements the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods used by the cloud-controller-manager to the OpenStack provider. The instance type returned is the flavor name, for consistency `InstanceType` has been implemented too returning the same value.
```release-note
NONE
```
This is part of #47257 cc @wlan0
Automatic merge from submit-queue
Removed mesos as cloud provider from Kubernetes.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47205
**Special notes for your reviewer**:
**Release note**:
```release-note
Move Mesos Cloud Provider out of Kubernetes Repo
```
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.
This is part of #47257 cc @wlan0
Automatic merge from submit-queue (batch tested with PRs 48497, 48604, 48599, 48560, 48546)
GCE: Use network project id for firewall/route mgmt and zone listing
- Introduces a new environment variable for plumbing the network project id which will be used for firewall and route management. fixes#48515
- onXPN is determined by metadata if config is not specified
- Split `if` conditions: fixes#48521
- Remove `getNetworkNameViaAPICall` which was used as a last resort for the `networkURL` (if empty) which was previously filled with the metadata network project & name.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix deleting empty monitors
Fix#48094
When create-monitor of cloud-config is false, pool has not monitor
and can not delete empty monitor.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47234, 48410, 48514, 48529, 48348)
Check opts of cloud config file
Fix#48347
Check opts when register OpenStack CloudProvider rather than
returning error when use opts to create/use cloud resource.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48292, 48121)
Add Google cloudkms dependency, add cloudkms service to GCE cloud provider
Required to introduce a Google KMS based envelope encryption, which shall allow encrypting secrets at rest using KEK-DEK scheme.
The above requires KMS API to create/delete KeyRings and CryptoKeys, and Encrypt/Decrypt data.
Should target release 1.8
@jcbsmpsn
Update: It appears that Godep only allows dependencies which are in use. We may have to modify this PR to include some Google KMS code.
Progresses #48522
Automatic merge from submit-queue
Use %q formatter for error messages from the AWS SDK. #47789
Error messages from the AWS SDK can have return keys in them, so use %q formatter for those messages.
Automatic merge from submit-queue (batch tested with PRs 47918, 47964, 48151, 47881, 48299)
Add ApiEndpoint support to GCE config.
**What this PR does / why we need it**:
Add the ability to change ApiEndpoint for GCE.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 47286, 47729)
Add client certificate authentication to Azure cloud provider
This adds support for client cert authentication in Azure cloud provider. The certificate can be provided in PKCS #12 format with password protection. Not that this authentication will be active only when no client secret is configured.
cc @brendandburns @colemickens
Automatic merge from submit-queue (batch tested with PRs 48139, 48042, 47645, 48054, 48003)
Pipe clusterID into gce_loadbalancer_external.go
**What this PR does / why we need it**: Small cleanup for GCE ELB codes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48002
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)
fix comment mistake
fix comment mistake
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 47776, 46220, 46878, 47942, 47947)
update openstack metadata-service url
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue
Ignore ErrNotFound when delete LB resources
IsNotFound error is fine since that means the object is
deleted already, so let's check it before return error.
Automatic merge from submit-queue (batch tested with PRs 47915, 47856, 44086, 47575, 47475)
AWS: Fix suspicious loop comparing permissions
Because we only ever call it with a single UserId/GroupId, this would
not have been a problem in practice, but this fixes the code.
Fix#36902
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47922, 47195, 47241, 47095, 47401)
AWS: Set CredentialsChainVerboseErrors
This avoids a rather confusing error message.
Fix#39374
```release-note
NONE
```
Automatic merge from submit-queue
Add E2E tests for Azure internal loadbalancer support, fix an issue for public IP resource deletion.
**What this PR does / why we need it**:
- Add E2E tests for Azure internal loadbalancer support: https://github.com/kubernetes/kubernetes/pull/43510
- Fix an issue that public IP resource not get deleted when switching from external loadbalancer to internal static loadbalancer.
**Special notes for your reviewer**:
1. Add new Azure resource tag to Public IP resources to indicate kubernetes managed resources.
Currently we determine whether the public IP resource should be deleted by looking at LoadBalancerIp property on spec. In the scenario 'Switching from external loadbalancer to internal loadbalancer with static IP', that value might have been updated for internal loadbalancer. So here we're to add an explicit tag for kubernetes managed resources.
2. Merge cleanupPublicIP logic into cleanupLoadBalancer
**Release note**:
NONE
CC @brendandburns @colemickens
Automatic merge from submit-queue
New annotation to add existing Security Groups to ELBs created by AWS cloudprovider
**What this PR does / why we need it**:
When K8S cluster is deployed in existing VPC there might be a need to attach extra SecurityGroups to ELB created by AWS cloudprovider. Example of it can be cases, where such Security Groups are maintained by another team.
**Special notes for your reviewer**:
For tests to pass depends on https://github.com/kubernetes/kubernetes/pull/45168 and therefore includes it
**Release note**:
```release-note
New 'service.beta.kubernetes.io/aws-load-balancer-extra-security-groups' Service annotation to specify extra Security Groups to be added to ELB created by AWS cloudprovider
```
Automatic merge from submit-queue
AWS: Remove blackhole routes in our managed range
Blackhole routes otherwise acccumulate unboundedly. We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.
Fix#47524
```release-note
AWS: clean up blackhole routes when using kubenet
```
We maintain a cache of all instances, and we invalidate the cache
whenever we see a new instance. For ELBs that should be sufficient,
because our usage is limited to instance ids and security groups, which
should not change.
Fix#45050
Automatic merge from submit-queue (batch tested with PRs 47510, 47516, 47482, 47521, 47537)
Batch AWS getInstancesByNodeNames calls with FilterNodeLimit
We are going to limit the getInstancesByNodeNames call with a batch
size of 150.
Fixes - #47271
```release-note
AWS: Batch DescribeInstance calls with nodeNames to 150 limit, to stay within AWS filter limits.
```
Blackhole routes otherwise acccumulate unboundedly. We also are careful
to ensure that we do so only within the managed range, which requires
enlisting the help of the routecontroller.
Fix#47524
Automatic merge from submit-queue
AWS: Process disk attachments even with duplicate NodeNames
Fix#47404
```release-note
AWS: Process disk attachments even with duplicate NodeNames
```
Automatic merge from submit-queue (batch tested with PRs 47302, 47389, 47402, 47468, 47459)
[GCE] Fix ILB sharing and GC
Fixes#47092
- Users must opt-in for sharing backend services (alpha feature - may be removed in future release)
- Shared backend services use a hash for determining similarity via settings (so far, only sessionaffinity) (again, this may be removed)
- Move resource cleanup to after the ILB setup.
/assign @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46929, 47391, 47399, 47428, 47274)
AWS: Richer log message when metadata fails
Not a resolution, but should at least help determine the issue.
Issue #41904
```release-note
NONE
```
Automatic merge from submit-queue
AWS for cloud-controller-manager
fixes#47214
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID methods used by the cloud-controller-manager for the AWS provider.
NodeAddressesByProvider uses DescribeInstances (for normal addresses) and DescribeAddresses (for Elastic IP addresses).
InstanceTypeByProviderID uses DescribeInstances.
```release-note
NONE
```
Automatic merge from submit-queue
Azure for cloud-controller-manager
**What this PR does / why we need it**:
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID methods used by the cloud-controller-manager to the Azure provider.
**Release note**:
```release-note
NONE
```
Addresses #47257
Service objects can be annotated with
`service.beta.kubernetes.io/aws-load-balancer-extra-security-groups`
to specify existing security groups to be added to ELB
created by AWS cloudprovider
Automatic merge from submit-queue (batch tested with PRs 43005, 46660, 46385, 46991, 47103)
Azure cloudprovider retry using flowcontrol
An initial attempt at engaging exponential backoff for API error responses.
Addresses #47048
Uses k8s.io/client-go/util/flowcontrol; implementation inspired by GCE
cloudprovider backoff.
**What this PR does / why we need it**:
The existing azure cloudprovider implementation has no guard rails in place to adapt to unexpected underlying operational conditions (i.e., clogs in resource plumbing between k8s runtime and the cloud API). The purpose of these changes is to support exponential backoff wrapping around API calls; and to support targeted rate limiting. Both of these options are configurable via `--cloud-config`.
Implementation inspired by the GCE's use of `k8s.io/client-go/util/flowcontrol` and `k8s.io/apimachinery/pkg/util/wait`, this PR likewise uses `flowcontrol` for rate limiting; and `wait` to thinly wrap backoff retry attempts to the API.
**Special notes for your reviewer**:
Pay especial note to the declaration of retry-able conditions from an unsuccessful HTTP request:
- all `4xx` and `5xx` HTTP responses
- non-nil error responses
And the declaration of retry success conditions:
- `2xx` HTTP responses
Tests updated to include additions to `Config`.
Those may be incomplete, or in other ways non-representative.
**Release note**:
```release-note
Added exponential backoff to Azure cloudprovider
```
- leveraging Config struct (—cloud-config) to store backoff and rate limit on/off and performance configuration
- added add’l error logging
- enabled backoff for vm GET requests
Automatic merge from submit-queue (batch tested with PRs 36721, 46483, 45500, 46724, 46036)
AWS: Allow configuration of a single security group for ELBs
**What this PR does / why we need it**:
AWS has a hard limit on the number of Security Groups (500). Right now every time an ELB is created Kubernetes is creating a new Security Group. This allows for specifying a Security Group to use for all ELBS
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
For some reason the Diff tool makes this look like it was way more changes than it really was.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
- added info and error logs for appropriate backoff conditions/states
- rationalized log idioms across all resource requests that are backoff-enabled
- processRetryResponse as a wait.ConditionFunc needs to supress errors if it wants the caller to continue backing off
Automatic merge from submit-queue (batch tested with PRs 43505, 45168, 46439, 46677, 46623)
fix AWS tagging to add missing tags only
It seems that intention of original code was to build map of missing
tags and call AWS API to add just them, but due to typo full
set of tags was always (re)added
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46686, 45049, 46323, 45708, 46487)
Log an EBS vol's instance when attaching fails because VolumeInUse
Messages now look something like this:
E0427 15:44:37.617134 16932 attacher.go:73] Error attaching volume "vol-00095ddceae1a96ed": Error attaching EBS volume "vol-00095ddceae1a96ed" to instance "i-245203b7": VolumeInUse: vol-00095ddceae1a96ed is already attached to an instance
status code: 400, request id: f510c439-64fe-43ea-b3ef-f496a5cd0577. The volume is currently attached to instance "i-072d9328131bcd9cd"
weird that AWS doesn't bother to put that information in there for us (it does when you try to delete a vol that's in use)
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46394, 46650, 46436, 46673, 46212)
refactor and export openstack service clients
**What this PR does / why we need it**:
Refactor and export openstack service client.
Exporting OpenStack client so other projects can use the them to call functions that are not implemented in openstack cloud providers yet.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
An initial attempt at engaging exponential backoff for API error responses.
Uses k8s.io/client-go/util/flowcontrol; implementation inspired by GCE
cloudprovider backoff.
Automatic merge from submit-queue (batch tested with PRs 46489, 46281, 46463, 46114, 43946)
AWS: consider instances of all states in DisksAreAttached, not just "running"
Require callers of `getInstancesByNodeNames(Cached)` to specify the states they want to filter instances by, if any. DisksAreAttached, cannot only get "running" instances because of the following attach/detach bug we discovered:
1. Node A stops (or reboots) and stays down for x amount of time
2. Kube reschedules all pods to different nodes; the ones using ebs volumes cannot run because their volumes are still attached to node A
3. Verify volumes are attached check happens while node A is down
4. Since aws ebs bulk verify filters by running nodes, it assumes the volumes attached to node A are detached and removes them all from ASW
5. Node A comes back; its volumes are still attached to it but the attach detach controller has removed them all from asw and so will never detach them even though they are no longer desired on this node and in fact desired elsewhere
6. Pods cannot run because their volumes are still attached to node A
So the idea here is to remove the wrong assumption that callers of `getInstancesByNodeNames(Cached)` only want "running" nodes.
I hope this isn't too confusing, open to alternative ways of fixing the bug + making the code nice.
ping @gnufied @kubernetes/sig-storage-bugs
```release-note
Fix AWS EBS volumes not getting detached from node if routine to verify volumes are attached runs while the node is down
```
Automatic merge from submit-queue
AWS: support node port health check
**What this PR does / why we need it**:
if a custom health check is set from the beta annotation on a service it
should be used for the ELB health check. This patch adds support for
that.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Let me know if any tests need to be added.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Make GCE load-balancers create health checks for nodes
From #14661. Proposal on kubernetes/community#552. Fixes#46313.
Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew:
- Don't create nodes health check if any nodes has version < 1.7.0.
- Don't backfill nodes health check on existing LBs unless users explicitly trigger it.
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)
Created unit tests for GCE cloud provider storage interface.
- Currently covers CreateDisk and DeleteDisk, GetAutoLabelsForPD
- Created ServiceManager interface in gce.go to facilitate mocking in tests.
**What this PR does / why we need it**:
Increasing test coverage for GCE Persistent Disk.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44573
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45518, 46127, 46146, 45932, 45003)
aws: Support for ELB tagging by users
This PR provides support for tagging AWS ELBs using information in an
annotation and provided as a list of comma separated key-value pairs.
Closes https://github.com/kubernetes/community/pull/404
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)
GCE - Retrieve subnetwork name/url from gce.conf
**What this PR does / why we need it**:
Features like ILB require specifying the subnetwork if the network is type manual.
**Notes:**
The network URL can be [constructed](68e7e18698/pkg/cloudprovider/providers/gce/gce.go (L211-L217)) by fetching instance metadata; however, the subnetwork is not provided through this feature. Users must specify the subnetwork name/url through the gce.conf.
Although multiple subnets can exist in the same region for a network, the cloud provider will only use one subnet url for creating LBs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Added deprecation notice and guidance for cloud providers.
**What this PR does / why we need it**:
Adding context/background and general guidance for incoming cloud providers.
**Which issue this PR fixes**
**Special notes for your reviewer**:
Generalized message per discussion with @bgrant0607
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)
Fix provisioned GCE PD not being reused if already exists
@jsafrane PTAL
This is another attempt at https://github.com/kubernetes/kubernetes/pull/38702 . We have observed that `gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do()` instantly gets an error response of alreadyExists, so we must check for it.
I am not sure if we still need to check for the error after `waitForZoneOp`; I think that if there is an alreadyExists error, the `Do()` above will always respond with it instantly. But because I'm not sure, and to be safe, I will leave it.
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)
Only retrieve relevant volumes
**What this PR does / why we need it**:
Improves performance for Cinder volume attach/detach calls.
Currently when Cinder volumes are attached or detached, functions try to retrieve details about the volume from the Nova API. Because some only have the volume name not its UUID, they use the list function in gophercloud to iterate over all volumes to find a match. This incurs severe performance problems on OpenStack projects with lots of volumes (sometimes thousands) since it needs to send a new request when the current page does not contain a match. A better way of doing this is use the `?name=XXX` query parameter to refine the results.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubernetes/issues/26404
**Special notes for your reviewer**:
There were 2 ways of addressing this problem:
1. Use the `name` query parameter
2. Instead of using the list function, switch to using volume UUIDs and use the GET function instead. You'd need to change the signature of a few functions though, such as [`DeleteVolume`](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/cinder/cinder.go#L49), so I'm not sure how backwards compatible that is.
Since #1 does effectively the same as #2, I went with it because it ensures BC.
One assumption that is made is that the `volumeName` being retrieved matches exactly the name of the volume in Cinder. I'm not sure how accurate that is, but I see no reason why cloud providers would want to append/prefix things arbitrarily.
**Release note**:
```release-note
Improves performance of Cinder volume attach/detach operations
```
An admin wants to specify in which AWS availability zone(s) users may create persistent volumes using dynamic provisioning.
That's why the admin can now configure in StorageClass object a comma separated list of zones. Dynamically created PVs for PVCs that use the StorageClass are created in one of the configured zones.
Automatic merge from submit-queue (batch tested with PRs 45891, 46147)
Watching ClusterId from within GCE cloud provider
**What this PR does / why we need it**:
Adds the ability for the GCE cloud provider to watch a config map for `clusterId` and `providerId`.
WIP - still needs more testing
cc @MrHohn @csbell @madhusudancs @thockin @bowei @nikhiljindal
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
vSphere storage policy support for dynamic volume provisioning
Till now, vSphere cloud provider provides support to configure persistent volume with VSAN storage capabilities - kubernetes#42974. Right now this only works with VSAN.
Also there might be other use cases:
- The user might need a way to configure a policy on other datastores like VMFS, NFS etc.
- Use Storage IO control, VMCrypt policies for a persistent disk.
We can achieve about 2 use cases by using existing storage policies which are already created on vCenter using the Storage Policy Based Management service. The user will specify the SPBM policy ID as part of dynamic provisioning
- resultant persistent volume will have the policy configured with it.
- The persistent volume will be created on the compatible datastore that satisfies the storage policy requirements.
- If there are multiple compatible datastores, the datastore with the max free space would be chosen by default.
- If the user specifies the datastore along with the storage policy ID, the volume will created on this datastore if its compatible. In case if the user specified datastore is incompatible, it would error out the reasons for incompatibility to the user.
- Also, the user will be able to see the associations of persistent volume object with the policy on the vCenter once the volume is attached to the node.
For instance in the below example, the volume will created on a compatible datastore with max free space that satisfies the "Gold" storage policy requirements.
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: zeroedthick
storagepolicyName: Gold
```
For instance in the below example, the vSphere CP checks if "VSANDatastore" is compatible with "Gold" storage policy requirements. If yes, volume will be provisioned on "VSANDatastore" else it will error that "VSANDatastore" is not compatible with the exact reason for failure.
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: fast
provisioner: kubernetes.io/vsphere-volume
parameters:
diskformat: zeroedthick
storagepolicyName: Gold
datastore: VSANDatastore
```
As a part of this change, 4 commits have been added to this PR.
1. Vendor changes for vmware/govmomi
2. Changes to the VsphereVirtualDiskVolumeSource in the Kubernetes API. Added 2 additional fields StoragePolicyName, StoragePolicyID
3. Swagger and Open spec API changes.
4. vSphere Cloud Provider changes to implement the storage policy support.
**Release note**:
```release-note
vSphere cloud provider: vSphere Storage policy Support for dynamic volume provisioning
```
Automatic merge from submit-queue
Adapt loadbalancer deleting/updating when using cloudprovider openstack in openstack/liberty
**What this PR does / why we need it**:
Make an extra verification on the returned listeners and pools because gophercloud query doesn't filter the results by loadbalancerID / listenerID respectively when using **openstack/librerty**.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#33759
**Special notes for your reviewer**:
#33759 it's supposed to have a pull request which fixes this problem but in the release 1.5 loadbalancers doesn't use that patched code.
**Release note**:
NONE
```release-note
```
This PR provides support for tagging AWS ELBs using information in an
annotation and provided as a list of comma separated key-value pairs.
Closes https://github.com/kubernetes/community/pull/404
Automatic merge from submit-queue
Initialize cloud providers with a K8s clientBuilder
**What this PR does / why we need it**:
This PR provides each cloud provider the ability to generate kubernetes clients. Either the full access or service account client builder is passed from the controller manager. Cloud providers could need to retrieve information from the cluster that isn't provided through defined interfaces, and this seems more preferable to adding parameters.
Please leave your thoughts/comments.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Same internal and external ip for vSphere Cloud Provider
Currently, vSphere Cloud Provider reports internal ip as container ip addresses. This PR modifies vSphere Cloud Provider to report same ip address as both internal and external that is provided by vmware infrastructure.
cc @pdhamdhere @tusharnt @BaluDontu @divyenpatel @luomiao
Automatic merge from submit-queue
Add approvers to vsphere cloudprovider
This PR adds approvers for vSphere Cloud provider.
cc @pdhamdhere @tusharnt @BaluDontu @divyenpatel @luomiao
Automatic merge from submit-queue
Use beta GCP API instead of alpha in CloudCIDR controller
The feature we are using has been promoted to beta.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45374, 44537, 45739, 44474, 45888)
Fix attach volume to instance repeatedly
1.When volume's status is 'attaching', controllermanager will attach
it again and return err. So it is necessary to check volume's
status before attach/detach volume.
2. When volume's status is 'attaching', its attachments will be None,
controllermanager can't get device path and make some failed event.
But it is normal, so don't return err when attachments is None
Fix bug: #44536
Automatic merge from submit-queue (batch tested with PRs 45623, 45241, 45460, 41162)
Promotes Source IP preservation for Virtual IPs from Beta to GA
Fixes#33625. Feature issue: kubernetes/features#27.
Bullet points:
- Declare 2 fields (ExternalTraffic and HealthCheckNodePort) that mirror the ESIPP annotations.
- ESIPP alpha annotations will be ignored.
- Existing ESIPP beta annotations will still be fully supported.
- Allow promoting beta annotations to first class fields or reversely.
- Disallow setting invalid ExternalTraffic and HealthCheckNodePort on services. Default ExternalTraffic field for nodePort or loadBalancer type service to "Global" if not set.
**Release note**:
```release-note
Promotes Source IP preservation for Virtual IPs to GA.
Two api fields are defined correspondingly:
- Service.Spec.ExternalTrafficPolicy <- 'service.beta.kubernetes.io/external-traffic' annotation.
- Service.Spec.HealthCheckNodePort <- 'service.beta.kubernetes.io/healthcheck-nodeport' annotation.
```
When volume's status is 'attaching', its attachments will be None,
controllermanager can't get device path and make some failed event.
But it is normal, let's fix it.
Automatic merge from submit-queue (batch tested with PRs 45569, 45602, 45604, 45478, 45550)
Fixing VolumesAreAttached and DisksAreAttached functions in vSphere
**What this PR does / why we need it**:
In the vSphere HA, when node fail over happens, node VM momentarily goes in to “not connected” state. During this time, if kubernetes calls VolumesAreAttached function, we are returning incorrect map, with status for volume set to false - detached state.
Volumes attached to previous nodes, requires to be detached before they can attach to the new node. Kubernetes attempt to check volume attachment. When node VM is not accessible or for any reason we cannot determine disk is attached, we were returning a Map of volumepath and its attachment status set to false. This was misinterpreted as disks are already detached from the node and Kubernetes was marking volumes as detached after orphaned pod is cleaned up. This causes volumes to remain attached to previous node, and pod creation always remains in the “containercreating” state. Since both the node are powered on, volumes can not be attached to new node.
**Logs before fix**
```
{"log":"E0508 21:31:20.902501 1 vsphere.go:1053] disk uuid not found for [vsanDatastore] kubevols/kubernetes-dynamic-pvc-8b75170e-342d-11e7-bab5-0050568aeb0a.vmdk. err: No disk UUID fou
nd\n","stream":"stderr","time":"2017-05-08T21:31:20.902792337Z"}
{"log":"E0508 21:31:20.902552 1 vsphere.go:1041] Failed to check whether disk is attached. err: No disk UUID found\n","stream":"stderr","time":"2017-05-08T21:31:20.902842673Z"}
{"log":"I0508 21:31:20.902575 1 attacher.go:114] VolumesAreAttached: check volume \"[vsanDatastore] kubevols/kubernetes-dynamic-pvc-8b75170e-342d-11e7-bab5-0050568aeb0a.vmdk\" (specName
: \"pvc-8b75170e-342d-11e7-bab5-0050568aeb0a\") is no longer attached\n","stream":"stderr","time":"2017-05-08T21:31:20.902849717Z"}
{"log":"I0508 21:31:20.902596 1 operation_generator.go:166] VerifyVolumesAreAttached determined volume \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-8b7
5170e-342d-11e7-bab5-0050568aeb0a.vmdk\" (spec.Name: \"pvc-8b75170e-342d-11e7-bab5-0050568aeb0a\") is no longer attached to node \"node3\", therefore it was marked as detached.\n","stream":"s
tderr","time":"2017-05-08T21:31:20.902863097Z"}
```
In this change, we are making sure correct volume attachment map is returned, and in case of any error occurred while checking disk’s status, we return nil map.
**Logs after fix**
```
{"log":"E0509 20:25:37.982152 1 vsphere.go:1067] Failed to check whether disk is attached. err: No disk UUID found\n","stream":"stderr","time":"2017-05-09T20:25:37.982516134Z"}
{"log":"E0509 20:25:37.982190 1 attacher.go:104] Error checking if volumes ([[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1.vmdk [vsanDatastore] kubevols/kubernetes-dynamic-pvc-c268f141-34f2-11e7-9303-0050568a3ac1.vmdk [vsanDatastore] kubevols/kubernetes-dynamic-pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1.vmdk]) are attached to current node (\"node3\"). err=No disk UUID found\n","stream":"stderr","time":"2017-05-09T20:25:37.982521101Z"}
{"log":"E0509 20:25:37.982220 1 operation_generator.go:158] VolumesAreAttached failed for checking on node \"node3\" with: No disk UUID found\n","stream":"stderr","time":"2017-05-09T20:25:37.982526285Z"}
{"log":"I0509 20:25:39.157279 1 attacher.go:115] VolumesAreAttached: volume \"[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c268f141-34f2-11e7-9303-0050568a3ac1.vmdk\" (specName: \"pvc-c268f141-34f2-11e7-9303-0050568a3ac1\") is attached\n","stream":"stderr","time":"2017-05-09T20:25:39.157724393Z"}
{"log":"I0509 20:25:39.157329 1 attacher.go:115] VolumesAreAttached: volume \"[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1.vmdk\" (specName: \"pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1\") is attached\n","stream":"stderr","time":"2017-05-09T20:25:39.157787946Z"}
{"log":"I0509 20:25:39.157367 1 attacher.go:115] VolumesAreAttached: volume \"[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1.vmdk\" (specName: \"pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1\") is attached\n","stream":"stderr","time":"2017-05-09T20:25:39.157794586Z"}
```
```
{"log":"I0509 20:25:41.267425 1 reconciler.go:173] Started DetachVolume for volume \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1.vmdk\" from node \"node3\"\n","stream":"stderr","time":"2017-05-09T20:25:41.267883567Z"}
{"log":"I0509 20:25:41.271836 1 operation_generator.go:694] Verified volume is safe to detach for volume \"pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1\" (UniqueName: \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1.vmdk\") on node \"node3\" \n","stream":"stderr","time":"2017-05-09T20:25:41.272703255Z"}
{"log":"I0509 20:25:47.928021 1 operation_generator.go:341] DetachVolume.Detach succeeded for volume \"pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1\" (UniqueName: \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c26fcae8-34f2-11e7-9303-0050568a3ac1.vmdk\") on node \"node3\" \n","stream":"stderr","time":"2017-05-09T20:25:47.928348553Z"}
{"log":"I0509 20:26:12.535962 1 operation_generator.go:694] Verified volume is safe to detach for volume \"pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1\" (UniqueName: \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1.vmdk\") on node \"node3\" \n","stream":"stderr","time":"2017-05-09T20:26:12.536055214Z"}
{"log":"I0509 20:26:14.188580 1 operation_generator.go:341] DetachVolume.Detach succeeded for volume \"pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1\" (UniqueName: \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c25d08d3-34f2-11e7-9303-0050568a3ac1.vmdk\") on node \"node3\" \n","stream":"stderr","time":"2017-05-09T20:26:14.188792677Z"}
{"log":"I0509 20:26:40.355656 1 reconciler.go:173] Started DetachVolume for volume \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c268f141-34f2-11e7-9303-0050568a3ac1.vmdk\" from node \"node3\"\n","stream":"stderr","time":"2017-05-09T20:26:40.355922165Z"}
{"log":"I0509 20:26:40.357988 1 operation_generator.go:694] Verified volume is safe to detach for volume \"pvc-c268f141-34f2-11e7-9303-0050568a3ac1\" (UniqueName: \"kubernetes.io/vsphere-volume/[vsanDatastore] kubevols/kubernetes-dynamic-pvc-c268f141-34f2-11e7-9303-0050568a3ac1.vmdk\") on node \"node3\" \n","stream":"stderr","time":"2017-05-09T20:26:40.358177953Z"}
```
**Which issue this PR fixes**
fixes#45464, https://github.com/vmware/kubernetes/issues/116
**Special notes for your reviewer**:
Verified this change on locally built hyperkube image - v1.7.0-alpha.3.147+3c0526cb64bdf5-dirty
**performed many fail over with large volumes (30GB) attached to the pod.**
$ kubectl describe pod
Name: wordpress-mysql-2789807967-3xcvc
Node: node3/172.1.87.0
Status: Running
Powered Off node3's host. pod failed over to node2. Verified all 3 disks detached from node3 and attached to node2.
$ kubectl describe pod
Name: wordpress-mysql-2789807967-qx0b0
Node: node2/172.1.9.0
Status: Running
Powered Off node2's host. pod failed over to node3. Verified all 3 disks detached from node2 and attached to node3.
$ kubectl describe pod
Name: wordpress-mysql-2789807967-7849s
Node: node3/172.1.87.0
Status: Running
Powered Off node3's host. pod failed over to node1. Verified all 3 disks detached from node3 and attached to node1.
$ kubectl describe pod
Name: wordpress-mysql-2789807967-26lp1
Node: node1/172.1.98.0
Status: Running
Powered off node1's host. pod failed over to node3. Verified all 3 disks detached from node1 and attached to node3.
$ kubectl describe pods
Name: wordpress-mysql-2789807967-4pdtl
Node: node3/172.1.87.0
Status: Running
Powered off node3's host. pod failed over to node1. Verified all 3 disks detached from node3 and attached to node1.
$ kubectl describe pod
Name: wordpress-mysql-2789807967-t375f
Node: node1/172.1.98.0
Status: Running
Powered off node1's host. pod failed over to node3. Verified all 3 disks detached from node1 and attached to node3.
$ kubectl describe pods
Name: wordpress-mysql-2789807967-pn6ps
Node: node3/172.1.87.0
Status: Running
powered off node3's host. pod failed over to node1. Verified all 3 disks detached from node3 and attached to node1
$ kubectl describe pods
Name: wordpress-mysql-2789807967-0wqc1
Node: node1/172.1.98.0
Status: Running
powered off node1's host. pod failed over to node3. Verified all 3 disks detached from node1 and attached to node3.
$ kubectl describe pods
Name: wordpress-mysql-2789807967-821nc
Node: node3/172.1.87.0
Status: Running
**Release note**:
```release-note
NONE
```
CC: @BaluDontu @abrarshivani @luomiao @tusharnt @pdhamdhere
Automatic merge from submit-queue (batch tested with PRs 43067, 45586, 45590, 38636, 45599)
AWS: Remove check that forces loadBalancerSourceRanges to be 0.0.0.0/0.
fixes#38633
Remove check that forces loadBalancerSourceRanges to be 0.0.0.0/0. Also, remove check that forces service.beta.kubernetes.io/aws-load-balancer-internal annotation to be 0.0.0.0/0. Ideally, it should be a boolean, but for backward compatibility, leaving it to be a non-empty value
Automatic merge from submit-queue (batch tested with PRs 45382, 45384, 44781, 45333, 45543)
azure: improve user agent string
**What this PR does / why we need it**: the UA string doesn't actually contain "kubernetes" in it
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: none
**Release note**:
```release-note
NONE
```
cc: @brendandburns
Automatic merge from submit-queue
azure: load balancer: support UDP, fix multiple loadBalancerSourceRanges support, respect sessionAffinity
**What this PR does / why we need it**:
1. Adds support for UDP ports
2. Fixes support for multiple `loadBalancerSourceRanges`
3. Adds support the Service spec's `sessionAffinity`
4. Removes dead code from the Instances file
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#43683
**Special notes for your reviewer**: n/a
**Release note**:
```release-note
azure: add support for UDP ports
azure: fix support for multiple `loadBalancerSourceRanges`
azure: support the Service spec's `sessionAffinity`
```
Automatic merge from submit-queue
Filter out IPV6 addresses from NodeAddresses() returned by vSphere
The vSphere CP returns both IPV6 and IPV4 addresses for a Node as part of NodeAddresses() implementation. However, Kubelet fails due to duplicate api.NodeAddress value when the node has an IPV6 address associated with it. This issue is tracked in #42690. The following are observed:
- when we enabled the logs and checked the addresses sent by vSphere CP to Kubelet, we don't see any duplicate addresses at all.
- Also, kubelet_node_status doesn’t receive any duplicate address from cloud provider.
However, when we filter out the IPV6 addresses and only return IPV4 addresses to the Kubelet, it works perfectly fine.
Even though the Kubelet receives the non-duplicate node-addresses, it still errors out with duplicate node addresses. It might be an issue when kubelet propagates these addresses to API server (or) API server is enable to handle IPV6 addresses.
@divyenpatel @abrarshivani @pdhamdhere @tusharnt
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Statefulsets for cinder: allow multi-AZ deployments, spread pods across zones
**What this PR does / why we need it**: Currently if we do not specify availability zone in cinder storageclass, the cinder is provisioned to zone called nova. However, like mentioned in issue, we have situation that we want spread statefulset across 3 different zones. Currently this is not possible with statefulsets and cinder storageclass. In this new solution, if we leave it empty the algorithm will choose the zone for the cinder drive similar style like in aws and gce storageclass solutions.
**Which issue this PR fixes** fixes#44735
**Special notes for your reviewer**:
example:
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: all
provisioner: kubernetes.io/cinder
---
apiVersion: v1
kind: Service
metadata:
annotations:
service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
name: galera
labels:
app: mysql
spec:
ports:
- port: 3306
name: mysql
clusterIP: None
selector:
app: mysql
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: "galera"
replicas: 3
template:
metadata:
labels:
app: mysql
annotations:
pod.alpha.kubernetes.io/initialized: "true"
spec:
containers:
- name: mysql
image: adfinissygroup/k8s-mariadb-galera-centos:v002
imagePullPolicy: Always
ports:
- containerPort: 3306
name: mysql
- containerPort: 4444
name: sst
- containerPort: 4567
name: replication
- containerPort: 4568
name: ist
volumeMounts:
- name: storage
mountPath: /data
readinessProbe:
exec:
command:
- /usr/share/container-scripts/mysql/readiness-probe.sh
initialDelaySeconds: 15
timeoutSeconds: 5
env:
- name: POD_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
volumeClaimTemplates:
- metadata:
name: storage
annotations:
volume.beta.kubernetes.io/storage-class: all
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 12Gi
```
If this example is deployed it will automatically create one replica per AZ. This helps us a lot making HA databases.
Current storageclass for cinder is not perfect in case of statefulsets. Lets assume that cinder storageclass is defined to be in zone called nova, but because labels are not added to pv - pods can be started in any zone. The problem is that at least in our openstack it is not possible to use cinder drive located in zone x from zone y. However, should we have possibility to choose between cross-zone cinder mounts or not? Imo it is not good way of doing things that they mount volume from another zone where the pod is located(means more network traffic between zones)? What you think? Current new solution does not allow that anymore (should we have possibility to allow it? it means removing the labels from pv).
There might be some things that needs to be fixed still in this release and I need help for that. Some parts of the code is not perfect.
Issues what i am thinking about (I need some help for these):
1) Can everybody see in openstack what AZ their servers are? Can there be like access policy that do not show that? If AZ is not found from server specs, I have no idea how the code behaves.
2) In GetAllZones() function, is it really needed to make new serviceclient using openstack.NewComputeV2 or could I somehow use existing one
3) This fetches all servers from some openstack tenant(project). However, in some cases kubernetes is maybe deployed only to specific zone. If kube servers are located for instance in zone 1, and then there are another servers in same tenant in zone 2. There might be usecase that cinder drive is provisioned to zone-2 but it cannot start pod, because kubernetes does not have any nodes in zone-2. Could we have better way to fetch kubernetes nodes zones? Currently that information is not added to kubernetes node labels automatically in openstack (which should I think). I have added those labels manually to nodes. If that zone information is not added to nodes, the new solution does not start stateful pods at all, because it cannot target pods.
cc @rootfs @anguslees @jsafrane
```release-note
Default behaviour in cinder storageclass is changed. If availability is not specified, the zone is chosen by algorithm. It makes possible to spread stateful pods across many zones.
```
Fixes support for multiple instances of loadBalancerSourceRanges.
Previously, the names of the rules for each address range conflicted
causing only one to be applied. Now each gets a unique name.
Automatic merge from submit-queue (batch tested with PRs 45018, 45330)
Add exponential backoff to openstack loadbalancer functions
Using exponential backoff to lower openstack load and reduce API call throttling
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)
cloud initialize node in external cloud controller
@thockin This PR adds support in the `cloud-controller-manager` to initialize nodes (instead of kubelet, which did it previously)
This also adds support in the kubelet to skip node cloud initialization when `--cloud-provider=external`
Specifically,
Kubelet
1. The kubelet has a new flag called `--provider-id` which uniquely identifies a node in an external DB
2. The kubelet sets a node taint - called "ExternalCloudProvider=true:NoSchedule" if cloudprovider == "external"
Cloud-Controller-Manager
1. The cloud-controller-manager listens on "AddNode" events, and then processes nodes that starts with that above taint. It performs the cloud node initialization steps that were previously being done by the kubelet.
2. On addition of node, it figures out the zone, region, instance-type, removes the above taint and updates the node.
3. Then periodically queries the cloudprovider for node addresses (which was previously done by the kubelet) and updates the node if there are new addresses
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41903, 45311, 45474, 45472, 45501)
Fetch VM UUID from - /sys/class/dmi/id/product_serial
**What this PR does / why we need it**:
Current code fetch VM uuid using uuid reported at `'/sys/devices/virtual/dmi/id/product_uuid'.` This doesn't work with all the distros like Ubuntu 16.04 and Fedora.
updating code to fetch VM uuid from `/sys/class/dmi/id/product_serial`
**Which issue this PR fixes**
fixes #
**Special notes for your reviewer**:
Verified UUID is matching with VM UUID on ubuntu 16.04, Cent OS 7.3 , and Photon OS
@BaluDontu @tusharnt
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43006, 45305, 45390, 45412, 45392)
[GCE] Collect latency metric on get/list calls
**What this PR does / why we need it**:
Collects latency & count measurements on GET and LIST operations to GCE cloud.
**Release note**:
```release-note
NONE
```
It seems that intention of original code was to build map of missing
tags and call AWS API to add just them, but due to typo full
set of tags was always (re)added
Automatic merge from submit-queue
Start recording cloud provider metrics for AWS
**What this PR does / why we need it**:
This PR implements support for emitting metrics from AWS about storage operations.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/features/issues/182
**Release note**:
```
Add support for emitting metrics from AWS cloudprovider about storage operations.
```
Automatic merge from submit-queue (batch tested with PRs 44124, 44510)
Add metrics to all major gce operations (latency, errors)
```release-note
Add metrics to all major gce operations {latency, errors}
The new metrics are:
cloudprovider_gce_api_request_duration_seconds{request, region, zone}
cloudprovider_gce_api_request_errors{request, region, zone}
`request` is the specific function that is used.
`region` is the target region (Will be "<n/a>" if not applicable)
`zone` is the target zone (Will be "<n/a>" if not applicable)
Note: this fixes some issues with the previous implementation of
metrics for disks:
- Time duration tracked was of the initial API call, not the entire
operation.
- Metrics label tuple would have resulted in many independent
histograms stored, one for each disk. (Did not aggregate well).
```
The new metrics is:
cloudprovider_gce_api_request_duration_seconds{request, region, zone}
cloudprovider_gce_api_request_errors{request, region, zone}
`request` is the specific function that is used.
`region` is the target region (Will be "<n/a>" if not applicable)
`zone` is the target zone (Will be "<n/a>" if not applicable)
Note: this fixes some issues with the previous implementation of
metrics for disks:
- Time duration tracked was of the initial API call, not the entire
operation.
- Metrics label tuple would have resulted in many independent
histograms stored, one for each disk. (Did not aggregate well).
Automatic merge from submit-queue
Use provided VipPortID for OpenStack LB
**What this PR does / why we need it**:
When creating an OpenStack LoadBalancer, Kubernetes will search through the tenant trying to match the LB's VIP with a port. This is problematic because multiple ports may have the same fixed IP, therefore leading to routing inconsistencies. We should use the port ID provided by the LB's response body instead.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubernetes/issues/43909
**Special notes for your reviewer**:
Since this involves non-deterministic testing, it'd be best if we can run this in a staging environment for a few days before merging (say until early next week).
**Release note**:
```release-note
Fixes issue during LB creation where ports where incorrectly assigned to a floating IP
```
Automatic merge from submit-queue
cinder: Add support for the KVM virtio-scsi driver
**What this PR does / why we need it**:
The VirtIO SCSI driver for KVM changes the way disks appear in /dev/disk/by-id.
This adds support for the new format.
Without this, volume attaching on an openstack cluster using this kvm driver doesn't work
**Special notes for your reviewer**:
Does this need e2e tests? I couldn't find anywhere to add another openstack configuration used in the e2e tests.
Wiki page about this: https://wiki.openstack.org/wiki/Virtio-scsi-for-bdm
**Release note**:
```release-note
cinder: Add support for the KVM virtio-scsi driver
```
Automatic merge from submit-queue (batch tested with PRs 44594, 44651)
remove strings.compare(), use string native operation
I notice we use strings.Compare() in some code, we can remove it and use native operation.
Automatic merge from submit-queue (batch tested with PRs 44555, 44238)
openstack: remove field flavor_to_resource
I believe there is no usage about `flavor_to_resource`, and I think there is no need to build that information, too.
cc @anguslees
**Release note:**
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 44687, 44689, 44661)
Fix panic when using `kubeadm init` with vsphere cloud-provider
**What this PR does / why we need it**:
Check if the reference is nil when finding machine reference by UUID.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44603
**Special notes for your reviewer**:
This is just a quick fix for the panic.
**Release note**:
```release-note
NONE
```