Automatic merge from submit-queue (batch tested with PRs 56128, 56004, 56083, 55833, 56042). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add initial Virtual Machine Scale Sets (VMSS) support in Azure
**What this PR does / why we need it**:
This is the first step toward adding Virtual Machine Scale Sets (VMSS) support in Azure. It:
- Adds vmType params to support both vmss and standard in Azure
- Adds initial InstanceID/InstanceType/IP/Routes support for vmss instances
- Master nodes may not belong to any scale set, so the provider falls back to VirtualMachinesClient for such instances
Validated that nodes can be registered and that pods are scheduled and run correctly.
More work is still needed to fully support Azure VMSS. Next steps are tracked in #43287.
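The fallback described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `instanceLookup`, `errNotInScaleSet`, and `resolveInstanceID` are hypothetical names, and the real provider talks to Azure clients rather than plain functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInScaleSet is a hypothetical sentinel for nodes outside any scale set.
var errNotInScaleSet = errors.New("instance not found in any scale set")

// instanceLookup is a hypothetical stand-in for a client call that maps a
// node name to an instance ID.
type instanceLookup func(nodeName string) (string, error)

// resolveInstanceID uses the scale-set lookup when vmType is "vmss" and falls
// back to the standard VM lookup for nodes (such as masters) that are not
// part of a scale set. Any other vmType goes straight to the standard lookup.
func resolveInstanceID(vmType, nodeName string, vmss, standard instanceLookup) (string, error) {
	if vmType != "vmss" {
		return standard(nodeName)
	}
	id, err := vmss(nodeName)
	if errors.Is(err, errNotInScaleSet) {
		return standard(nodeName)
	}
	return id, err
}

func main() {
	vmss := func(name string) (string, error) {
		if name == "agent-0" {
			return "vmss/agent-0", nil
		}
		return "", errNotInScaleSet
	}
	standard := func(name string) (string, error) { return "vm/" + name, nil }

	for _, n := range []string{"agent-0", "master-0"} {
		id, err := resolveInstanceID("vmss", n, vmss, standard)
		fmt.Println(n, id, err)
	}
}
```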
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Part of #43287.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54316, 53400, 55933, 55786, 55794).
Add Amazon NLB support
**What this PR does / why we need it**:
This adds support for AWS's NLB for `LoadBalancer` services.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Fixes #52173
**Special notes for your reviewer**:
This is NOT yet ready for merge, but I'd love any feedback before it is.
This requires at least `v1.10.40` of [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go), which is not yet included in Kubernetes. Per @justinsb, I'm waiting on #48314 (or possibly some other PR) to update to `v1.10.40`.
I tried to make the change as easy to review as possible, so some LoadBalancer logic is duplicated in the `if isNLB(annotations)` blocks. I can refactor that and sprinkle more `isNLB()` switches around, but it might be harder to view the diff.
**Other Notes:**
* NLB's subnets cannot be modified after creation (maybe look for public subnets in all AZ's?). Currently, I'm just using `c.findELBSubnets()`
* Health check uses TCP with all the NLB default values. I was thinking HTTP health checks via annotation could be added later. Should that go into this PR?
* ~~`externalTrafficPolicy`/`healthCheckNodePort` are ignored. Should those be implemented for this PR?~~
* `externalTrafficPolicy` and subsequent `healthCheckNodePort` are handled properly. This may come with uneven load balancing, as NLB doesn't support weighted backends.
* With classic ELB, the ELB sits inside a security group that can be used to associate Instance (k8s node) SG rules with a LoadBalancer (k8s Service), but NLBs don't have a security group. Instead, I use the `Description` field on [`ec2.IpRange`](https://docs.aws.amazon.com/sdk-for-go/api/service/ec2/#IpRange) with the following annotations. Is this ok? I couldn't think of another way to associate an SG rule with the NLB.
  * Node SG gets a rule added for the VPC cidr on NodePort for Health Check, with annotation in description `kubernetes.io/rule/nlb/health=<loadBalancerName>`
  * Node SG gets a rule added for `loadBalancerSourceRanges` to NodePort for client traffic, with annotation in description `kubernetes.io/rule/nlb/client=<loadBalancerName>`
  * **Note: if `loadBalancerSourceRanges` is unspecified, this opens instance security groups to traffic from `0.0.0.0/0` on the service's nodePorts**
* Respects internal annotation
* Creates a TargetGroup per frontend port: simplifies updates when you have same backend port for multiple front end ports.
* Does not (yet) verify that we're under the NLB limits in terms of # of listeners
* `UpdateLoadBalancer()` basically just calls `EnsureLoadBalancer` for NLB's. Is this ok?
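For reviewers, here is a rough sketch of how the description-based tagging of node SG rules could look. It is a simplified illustration under assumptions: `sgRule` and `nlbNodeRules` are hypothetical names, and the real code works with the AWS SDK's EC2 types rather than this struct. Only the two description prefixes and the `0.0.0.0/0` default come from the notes above.

```go
package main

import "fmt"

const (
	healthRulePrefix = "kubernetes.io/rule/nlb/health"
	clientRulePrefix = "kubernetes.io/rule/nlb/client"
)

// sgRule is a hypothetical, simplified stand-in for a security group ingress rule.
type sgRule struct {
	CIDR        string
	Port        int64
	Description string
}

// nlbNodeRules builds the node SG rules for one NodePort: one health check
// rule for the VPC CIDR and one client rule per source range. When no source
// ranges are given, client traffic is opened to 0.0.0.0/0.
func nlbNodeRules(lbName, vpcCIDR string, sourceRanges []string, nodePort int64) []sgRule {
	if len(sourceRanges) == 0 {
		sourceRanges = []string{"0.0.0.0/0"}
	}
	rules := []sgRule{{
		CIDR:        vpcCIDR,
		Port:        nodePort,
		Description: healthRulePrefix + "=" + lbName,
	}}
	for _, r := range sourceRanges {
		rules = append(rules, sgRule{
			CIDR:        r,
			Port:        nodePort,
			Description: clientRulePrefix + "=" + lbName,
		})
	}
	return rules
}

func main() {
	for _, r := range nlbNodeRules("my-nlb", "10.0.0.0/16", nil, 30080) {
		fmt.Printf("%s port %d: %s\n", r.CIDR, r.Port, r.Description)
	}
}
```

Because the load balancer name is embedded in the description, rules belonging to one NLB can be found again later by matching on that string.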
**Areas for future improvement or optimization**:
* A new annotation indicating a new security group should be created for NLB traffic and instances would be placed in this new SG. (Could bump up against the default limit of 5 SG's per instance)
* Only create a client health check security group rule when the VPC cidr is not a subset of `spec.loadBalancerSourceRanges`
* Consolidate TargetGroups if a service has 2+ frontend ports and the same nodePort.
* A new annotation for specifying TargetGroup Health Check options.
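The subset check mentioned in the second item could be sketched with the standard `net` package as below. `cidrWithin` and `needsHealthRule` are hypothetical helper names. The sketch only detects the case where a single source range covers the whole VPC CIDR; it does not try to detect coverage by a union of several smaller ranges.

```go
package main

import (
	"fmt"
	"net"
)

// cidrWithin reports whether inner is fully contained in outer.
func cidrWithin(inner, outer *net.IPNet) bool {
	innerOnes, innerBits := inner.Mask.Size()
	outerOnes, outerBits := outer.Mask.Size()
	return innerBits == outerBits && innerOnes >= outerOnes && outer.Contains(inner.IP)
}

// needsHealthRule returns false when some source range already covers the
// whole VPC CIDR, in which case a separate health check rule is redundant.
func needsHealthRule(vpcCIDR string, sourceRanges []string) (bool, error) {
	_, vpc, err := net.ParseCIDR(vpcCIDR)
	if err != nil {
		return false, err
	}
	for _, r := range sourceRanges {
		_, src, err := net.ParseCIDR(r)
		if err != nil {
			return false, err
		}
		if cidrWithin(vpc, src) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	need, err := needsHealthRule("10.0.0.0/16", []string{"10.0.0.0/8"})
	fmt.Println(need, err)
}
```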
**Release note**:
```release-notes
Add Amazon NLB support - Fixes #52173
```
ping @justinsb @bchav
Automatic merge from submit-queue (batch tested with PRs 56021, 55843, 55088, 56117, 55859).
Fix panic when AlphaFeatureGate isn't configured for gcp.
**What this PR does / why we need it**:
When AlphaFeatureGate isn't configured, the pointer is nil, and using it causes a panic. This PR fixes that.
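One common way to guard against this kind of panic in Go is a nil-safe method on the pointer receiver. The sketch below only illustrates that pattern and is not necessarily how this PR fixes it: `featureGate` and its fields are hypothetical stand-ins for the provider's type.

```go
package main

import "fmt"

// featureGate is a hypothetical stand-in for the provider's alpha feature gate type.
type featureGate struct {
	features map[string]bool
}

// Enabled is safe to call on a nil *featureGate: an unconfigured gate simply
// reports every feature as disabled instead of dereferencing a nil pointer.
func (g *featureGate) Enabled(name string) bool {
	if g == nil {
		return false
	}
	return g.features[name]
}

func main() {
	var unconfigured *featureGate
	fmt.Println(unconfigured.Enabled("SomeAlphaFeature"))

	configured := &featureGate{features: map[string]bool{"SomeAlphaFeature": true}}
	fmt.Println(configured.Enabled("SomeAlphaFeature"))
}
```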
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #56009
**Special notes for your reviewer**:
cc @jsiebens
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue.
Check whether the SleepDelay of an AWS request is nil before signing.
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55309
**Special notes for your reviewer**:
/cc @justinsb
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue.
Use regexp Match
**What this PR does / why we need it**:
Use regexp matching to perform the lookup more efficiently.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue.
Fix issue where PVCs using the `standard` StorageClass create PDs in the wrong zone in multi-zone GKE clusters
Fixes #50115
Changed GetAllZones to only get zones with nodes that are currently running (renamed to GetAllCurrentZones). Added E2E test to confirm this behavior.
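The idea behind the rename can be sketched as follows. This is a simplified illustration, not the provider code: `node` and `currentZones` are hypothetical, and the real implementation queries the cloud API and node objects rather than a plain slice.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical, simplified view of a cluster node.
type node struct {
	Name    string
	Zone    string
	Running bool
}

// currentZones returns the sorted set of zones that contain at least one
// running node, so disks are never placed in a zone without nodes.
func currentZones(nodes []node) []string {
	seen := map[string]bool{}
	for _, n := range nodes {
		if n.Running && n.Zone != "" {
			seen[n.Zone] = true
		}
	}
	zones := make([]string, 0, len(seen))
	for z := range seen {
		zones = append(zones, z)
	}
	sort.Strings(zones)
	return zones
}

func main() {
	nodes := []node{
		{Name: "n1", Zone: "us-central1-a", Running: true},
		{Name: "n2", Zone: "us-central1-b", Running: false},
		{Name: "n3", Zone: "us-central1-c", Running: true},
	}
	fmt.Println(currentZones(nodes))
}
```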
Automatic merge from submit-queue (batch tested with PRs 55112, 56029, 55740, 56095, 55845).
Updating vsphere cloud provider to support k8s cluster spread across multiple vCenters
**What this PR does / why we need it**:
The vSphere cloud provider in Kubernetes 1.8 was designed to work only if all the nodes of the cluster are in a single datacenter folder. This hard restriction prevents a cluster from spanning different folders, datacenters, or vCenters, and users have use cases that require spanning a cluster across datacenters and vCenters.
**Which issue(s) this PR fixes**
Fixes https://github.com/vmware/kubernetes/issues/255
**Special notes for your reviewer**:
This change is purely in the vSphere cloud provider; no changes in Kubernetes core are needed.
**Release note**:
```release-note
With this change
- Users should be able to create a k8s cluster that spans multiple ESXi clusters, datacenters, or even vCenters.
- vSphere cloud provider (VCP) uses the OS hostname and not the vSphere inventory VM name.
  This means VCP can now handle cases where a user changes the VM inventory name.
- VCP can handle cases where a VM migrates to another ESXi cluster, datacenter, or vCenter.
The only requirement is shared storage: VCP needs shared storage on all node VMs.
```
Internally tested and reviewed the code.
@tthole, @shaominchen, @abrarshivani
To handle the migration from the old security group name to the new
security group name, we need to delete the old security group.
As of v1.10, we can assume everyone is using the new security group
names and remove this code.
Changed GetAllZones to only get zones with nodes that are currently
running (renamed to GetAllCurrentZones). Added E2E test to confirm this
behavior.
Added node informer to cloud-provider controller to keep track of zones
with k8s nodes in them.
Automatic merge from submit-queue (batch tested with PRs 55217, 54260).
Unit tests for Azure service session affinity
**What this PR does / why we need it**: We added session affinity support in the Azure load balancer in commit 8b50b83067. This PR adds unit tests for this behaviour.
**Which issue this PR fixes**: None
**Special notes for your reviewer**: None
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55233, 55927, 55903, 54867, 55940).
fix azure disk storage account init issue
**What this PR does / why we need it**:
There are two issues with the original Azure disk storage account initialization code:
1) Wrong controller-master detection, see issues #54570 and #55776
2) Two storage accounts are initialized even when they are not needed, see issue #50883
This PR fixes both issues:
For 1: remove the controller-master process binding
For 2: remove the storage account initialization process and create accounts on demand
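The "create on demand" approach for the second issue can be sketched as a lazily filled cache. This is an illustration only, under assumptions: `accountCache`, its fields, and the `create` callback are hypothetical and do not mirror the actual Azure provider code.

```go
package main

import (
	"fmt"
	"sync"
)

// accountCache is a hypothetical cache that creates a storage account the
// first time a given account type is requested, instead of at startup.
type accountCache struct {
	mu       sync.Mutex
	accounts map[string]string // account type -> account name
	create   func(accountType string) (string, error)
}

// get returns the cached account for accountType, creating it on first use.
func (c *accountCache) get(accountType string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if name, ok := c.accounts[accountType]; ok {
		return name, nil
	}
	name, err := c.create(accountType)
	if err != nil {
		return "", err
	}
	if c.accounts == nil {
		c.accounts = map[string]string{}
	}
	c.accounts[accountType] = name
	return name, nil
}

func main() {
	created := 0
	cache := &accountCache{create: func(t string) (string, error) {
		created++
		return "acct-" + t, nil
	}}
	first, _ := cache.get("Standard_LRS")
	second, _ := cache.get("Standard_LRS")
	fmt.Println(first, second, created)
}
```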
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #54570, fixes #55776, fixes #50883
**Special notes for your reviewer**:
@rootfs @karataliu
**Release note**:
```
fix azure disk storage account init issue
```
/sig azure
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842).
Ensure new tags are created on existing ELBs
**What this PR does / why we need it**:
When editing an existing service of type LoadBalancer in an AWS environment and adding the `service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags` annotation, you would expect the new tags to be set on the load balancer. However, this doesn't currently happen: the annotation only takes effect if it is specified when the service is created.
This PR adds an AddTags method to the ELB interface and uses this to ensure tags set in the annotation are present on the ELB. If the tag key is already present, the value will be updated.
This PR does not remove tags that have been removed from the annotation; it only adds or updates tags.
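The add-or-update semantics can be sketched as a small diff over tag maps. `tagsToAdd` is a hypothetical helper for illustration and not the method added by this PR; the real code passes the result to the ELB API.

```go
package main

import "fmt"

// tagsToAdd returns the tags from desired that are missing on the load
// balancer or carry a different value. Tags that exist only on the load
// balancer are left alone, so nothing is ever removed.
func tagsToAdd(existing, desired map[string]string) map[string]string {
	out := map[string]string{}
	for k, v := range desired {
		if cur, ok := existing[k]; !ok || cur != v {
			out[k] = v
		}
	}
	return out
}

func main() {
	existing := map[string]string{"team": "infra", "env": "dev"}
	desired := map[string]string{"env": "prod", "owner": "alice"}
	fmt.Println(tagsToAdd(existing, desired))
}
```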
**Which issue(s) this PR fixes**:
Fixes #54642
**Special notes for your reviewer**:
The change requires that the IAM policy of the master instance(s) has the `elasticloadbalancing:AddTags` permission.
**Release note**:
```release-note
Ensure additional resource tags are set/updated on AWS load balancers
```
Automatic merge from submit-queue (batch tested with PRs 50457, 55558, 53483, 55731, 52842).
Apply taint when a volume is stuck in attaching state
When a volume is stuck in the attaching state for too long on a node, it is best to make the node unschedulable so that no other pods are scheduled on it.
Fixes https://github.com/kubernetes/kubernetes/issues/55502
```release-note
AWS: Apply taint to a node if volumes being attached to it are stuck in attaching state
```
Automatic merge from submit-queue (batch tested with PRs 55642, 55897, 55835, 55496, 55313).
OpenStack: fetch volume path from metadata service
**What this PR does / why we need it**:
Updates the OpenStack cloud provider to use the Nova metadata service as a fallback when retrieving mounted PV disk paths. Note that the Nova instance device metadata will contain the disk address and bus, which allows finding its path.
This is needed as the *standard* mechanism of retrieving disk paths is not available when running k8s under OpenStack Hyper-V hosts.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55312
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
- vsphere.conf (cloud-config) is now needed only on master node
- VCP uses OS hostname and not vSphere inventory name
- VCP is now resilient to VM inventory name change and VM migration