Automatic merge from submit-queue
Azure: Allow VNet to be in a separate Resource Group
**What this PR does / why we need it**:
This PR allows Kubernetes in an Azure context to use a VNet which is not in the same Resource Group as Kubernetes.
We need this because currently Azure Cloud Provider driver assumes that it should have a VNet for himself but if there is one thing that should be shared amongst Azure resources it's a VNet cause, well, things might want to talk to each other in a private network, don't you think ?
I guess this should we backported down to 1.6 branch.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes#49577
**Release note**:
```release-note
NONE
```
@kubernetes/sig-azure
@kubernetes/sig-azure-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50418, 49830, 49206, 49061, 49912)
add LocalZone into gce.conf and refactor gce cloud provider configura…
The main goal of this PR is to make gce cloud provider able to run locally.
1. added a LocalZone parameter into gce.conf.
2. refactor `newGCECloud` to avoid contacting metadata server if configuration is already available.
```release-note
None
```
Automatic merge from submit-queue
VSphere cloud provider code refactoring
The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.
- VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
- vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
- Modifications to e2e tests to accomodate the latest design changes.
@divyenpatel @rohitjogvmw @luomiao
```release-note
vSphere cloud provider: vSphere cloud provider code refactoring
```
Automatic merge from submit-queue
Ignore the available volume when calling DetachDisk
Fix#50207
If user detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically and become available. So the "available" is fine since that means the
volume is detached from instance already.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49524, 46760, 50206, 50166, 49603)
[OpenStack] Add more detail error message
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Currently NewBlockStorageV2() return err when failed to get cinder endpoint, but there is no code to output the message of err.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50087, 39587, 50042, 50241, 49914)
AttachDisk should not call detach inside of Cinder volume provider
If use detachs the volume by nova in openstack env, volume becomes
available. If nova instance is been deleted, nova will detach it
automatically. So the "available" is fine since that means the
volume is detached from instance already.
I get same simple error messages "Unable to initialize cinder client
for region: RegionOne" from controller-manager, but I can not find the
reason. We should add more detail message "err" into glog.Errorf.
Automatic merge from submit-queue (batch tested with PRs 49805, 50052)
We never want to modify the globally defined SG for ELBs
**What this PR does / why we need it**:
Fixes a bug where creating or updating an ELB will modify a globally defined security group
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50105
**Special notes for your reviewer**:
**Release note**:
```release-note
fixes a bug around using the Global config ElbSecurityGroup where Kuberentes would modify the passed in Security Group.
```
Automatic merge from submit-queue (batch tested with PRs 47416, 47408, 49697, 49860, 50162)
add possibility to use multiple floatingip pools in openstack loadbalancer
**What this PR does / why we need it**: Currently only one floating pool is supported in kubernetes openstack cloud provider. It is quite big issue for us, because we want run only single kubernetes cluster, but we want that external and internal services can be used. It means that we need possibility to create services with internal and external pools.
**Which issue this PR fixes**: fixes#49147
**Special notes for your reviewer**: service labels is not maybe correct place to define this floatingpool id. However, I did not find any better place easily. I do not want start modifying service api structure.
**Release note**:
```release-note
Add possibility to use multiple floatingip pools in openstack loadbalancer
```
Example how it works:
```
cat /etc/kubernetes/cloud-config
[Global]
auth-url=https://xxxx
username=xxxx
password=xxxx
region=yyy
tenant-id=b23efb65b1d44b5abd561511f40c565d
domain-name=foobar
[LoadBalancer]
lb-version=v2
subnet-id=aed26269-cd01-4d4e-b0d8-9ec726c4c2ba
lb-method=ROUND_ROBIN
floating-network-id=56e523e7-76cb-477f-80e4-2dc8cf32e3b4
create-monitor=yes
monitor-delay=10s
monitor-timeout=2000s
monitor-max-retries=3
```
```
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 1
template:
metadata:
labels:
run: web
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-ext
name: web-ext
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
labels:
run: web-int
floatingPool: a2a84887-4915-42bf-aaff-2b76688a4ec7
name: web-int
namespace: default
spec:
selector:
run: web
ports:
- port: 80
name: https
protocol: TCP
targetPort: 80
type: LoadBalancer
```
```
% kubectl create -f example.yaml
deployment "nginx-deployment" created
service "web-ext" created
service "web-int" created
% kubectl get svc -o wide
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes 10.254.0.1 <none> 443/TCP 2m <none>
web-ext 10.254.23.153 192.168.1.57,193.xx.xxx.xxx 80:30151/TCP 52s run=web
web-int 10.254.128.141 192.168.1.58,10.222.130.80 80:32431/TCP 52s run=web
```
cc @anguslees @k8s-sig-openstack-feature-requests @dims
Automatic merge from submit-queue (batch tested with PRs 49916, 50050)
GCE: Fix bug by correctly cast port to string
Code is incorrectly casting a port to a string, causing the diff-expression to always return true.
**What this PR does / why we need it**:
Fixes#50049
**Special notes for your reviewer**:
/assign @MrHohn
**Release note**:
```release-note
NONE
```
if not needed here
load network ids from gophercloud api
fix to getnetworkbyname
update godeps, add networks library
fix gofmt and boilerplate
gofmt
use annotations
fix
remove enableflag
add comment to annotationvalue
Automatic merge from submit-queue (batch tested with PRs 49870, 49416, 49872, 49892, 49908)
fix alpha/beta endpoint when api endpoint is specified
fix a bug in alpha/beta compute API endpoint bootstraping when api-endpiont is specified.
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Adding unit test for ensureStaticIP
**What this PR does / why we need it**:
Entry into unit testing GCE loadbalancer code by testing `ensureStaticIP` which had a bug in 1.7.0.
@bowei @freehan @MrHohn @dnardo @thockin, any thoughts and comments on how we could unit test LB code moving forward? I think there are many areas we can split functions into smaller ones for easier testing - firewallNeedsUpdate being an example of that. However, it seems to me that we still need to mock our GCP calls for some functions that heavily revolve around API calls. A dream goal would be to have a unit test that can call EnsureLoadBalancer. Now that we have shared resources between different services and ingresses (firewalls, instance groups, [future features]), being able to setup different scenarios without depending on E2E tests would be awesome. However, I'm not sure how reachable that goal would be.
Most importantly, let's not make things worse. If you have advice on anti-patterns to avoid, please speak up.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
GCE: Update vendor of gcfg and filter config parsing errors
**What this PR does / why we need it**:
To utilize new function `FatalOnly` which filters "programmer errors"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#49660
**Special notes for your reviewer**:
/assign @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
GCE: Remove resource Get function calls from Create functions
**What this PR does / why we need it**:
Consistency. This PR removes the GetXXX from the CreateXXX functions of the GCE cloudprovider. Consumers (specifically the ingress controller) will need to call the Get resource funcs separately when updating their vendored versions.
**Release note**:
```release-note
NONE
```
/assign @bowei
Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
Better message if we dont find appropriate BlockStorage API
**What this PR does / why we need it**:
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Define a new config VnetResourceGroup in order to be able to use a VNet
which is not in the same resource group as kubernetes.
Signed-off-by: Sylvain Rabot <s.rabot@lectra.com>
Automatic merge from submit-queue
Warn if aws has no cluster id provided
**What this PR does / why we need it**:
we info log a message when no cluster id is provided that should be a warning given its impact.
fixes https://github.com/kubernetes/kubernetes/issues/49568
**Release note**:
```release-note
NONE
```
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Avoid looking up instance id until we need it
**What this PR does / why we need it**:
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
So let's try to find the instance-id only when we need it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump up gce minNodesHealthCheckVersion due to known issues
**What this PR does / why we need it**: There are some known issues in previous 1.7 versions causing kube-proxy not correctly responding healthz traffic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @nicksardo @freehan
cc @bowei @thockin
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service will have health checks for nodes by default if all nodes have version >= v1.7.2.
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Added logging to AWS api calls. #46969
Additionally logging of when AWS API calls start and end to help diagnose problems with kubelet on cloud provider nodes not reporting node status periodically. There's some inconsistency in logging around this PR we should discuss.
IMO, the API logging should be at a higher level than most other types of logging as you would probably only want it in limited instances. For most cases that is easy enough to do, but there are some calls which have some logging around them already, namely in the instance groups. My preference would be to keep the existing logging as it and just add the new API logs around the API call.
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
Automatic merge from submit-queue (batch tested with PRs 49276, 49235)
Don't fail fast if LoadBalancer section is missing
**What this PR does / why we need it**:
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer