Automatic merge from submit-queue (batch tested with PRs 46252, 45524, 46236, 46277, 46522)
Make GCE load-balancers create health checks for nodes
From #14661. Proposal on kubernetes/community#552. Fixes#46313.
Bullet points:
- Create nodes health check and firewall (for health checking) for non-OnlyLocal service.
- Create local traffic health check and firewall (for health checking) for OnlyLocal service.
- Version skew:
- Don't create nodes health check if any nodes has version < 1.7.0.
- Don't backfill nodes health check on existing LBs unless users explicitly trigger it.
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service now have health checks for nodes by default.
An existing LoadBalancer will have health check attached to it when:
- Change Service.Spec.Type from LoadBalancer to others and flip it back.
- Any effective change on Service.Spec.ExternalTrafficPolicy.
```
Automatic merge from submit-queue (batch tested with PRs 46383, 45645, 45923, 44884, 46294)
Created unit tests for GCE cloud provider storage interface.
- Currently covers CreateDisk and DeleteDisk, GetAutoLabelsForPD
- Created ServiceManager interface in gce.go to facilitate mocking in tests.
**What this PR does / why we need it**:
Increasing test coverage for GCE Persistent Disk.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#44573
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45573, 46354, 46376, 46162, 46366)
GCE - Retrieve subnetwork name/url from gce.conf
**What this PR does / why we need it**:
Features like ILB require specifying the subnetwork if the network is type manual.
**Notes:**
The network URL can be [constructed](68e7e18698/pkg/cloudprovider/providers/gce/gce.go (L211-L217)) by fetching instance metadata; however, the subnetwork is not provided through this feature. Users must specify the subnetwork name/url through the gce.conf.
Although multiple subnets can exist in the same region for a network, the cloud provider will only use one subnet url for creating LBs.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 38505, 41785, 46315)
Fix provisioned GCE PD not being reused if already exists
@jsafrane PTAL
This is another attempt at https://github.com/kubernetes/kubernetes/pull/38702 . We have observed that `gce.service.Disks.Insert(gce.projectID, zone, diskToCreate).Do()` instantly gets an error response of alreadyExists, so we must check for it.
I am not sure if we still need to check for the error after `waitForZoneOp`; I think that if there is an alreadyExists error, the `Do()` above will always respond with it instantly. But because I'm not sure, and to be safe, I will leave it.
Automatic merge from submit-queue
Initialize cloud providers with a K8s clientBuilder
**What this PR does / why we need it**:
This PR provides each cloud provider the ability to generate kubernetes clients. Either the full access or service account client builder is passed from the controller manager. Cloud providers could need to retrieve information from the cluster that isn't provided through defined interfaces, and this seems more preferable to adding parameters.
Please leave your thoughts/comments.
**Release note**:
```release-note
NONE
```
The new metrics is:
cloudprovider_gce_api_request_duration_seconds{request, region, zone}
cloudprovider_gce_api_request_errors{request, region, zone}
`request` is the specific function that is used.
`region` is the target region (Will be "<n/a>" if not applicable)
`zone` is the target zone (Will be "<n/a>" if not applicable)
Note: this fixes some issues with the previous implementation of
metrics for disks:
- Time duration tracked was of the initial API call, not the entire
operation.
- Metrics label tuple would have resulted in many independent
histograms stored, one for each disk. (Did not aggregate well).
Automatic merge from submit-queue
Adding load balancer src cidrs to GCE cloudprovider
**What this PR does / why we need it**:
As of January 31st, 2018, GCP will be sending health checks and l7 traffic from two CIDRs and legacy health checks from three CIDS. This PR moves them into the cloudprovider package and provides a flag for override.
Another PR will need to be address firewall rule creation for external L4 network loadbalancing #40778
**Which issue this PR fixes**
Step one of #40778
Step one of https://github.com/kubernetes/ingress/issues/197
**Release note**:
```release-note
Add flags to GCE cloud provider to override known L4/L7 proxy & health check source cidrs
```
If CIDRAllocatorType is set to `CloudCIDRAllocator`, then allocation
of CIDR allocation instead is done by the external cloud provider and
the node controller is only responsible for reflecting the allocation
into the node spec.
- Splits off the rangeAllocator from the cidr_allocator.go file.
- Adds cloudCIDRAllocator, which is used when the cloud provider allocates
the CIDR ranges externally. (GCE support only)
- Updates RBAC permission for node controller to include PATCH
Automatic merge from submit-queue
Implement API usage metrics for gce storage
**What this PR does / why we need it**:
This PR implements support for emitting metrics from GCE about storage operations.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/features/issues/182
**Release note**:
```
Add support for emitting metrics from GCE cloudprovider about storage operations.
```
Automatic merge from submit-queue (batch tested with PRs 42617, 43247, 43509, 43644, 43820)
[GCE] Support legacy-https and generic health checks
**What this PR does / why we need it**:
- Adds CRUD functions to manage `compute.HttpsHealthChecks`
The legacy HTTPS healthchecks will be used by the GLBC (GCE Load balancer Controller)
- Adds CRUD functions to manage `compute.HealthChecks`
These are required for the internal load balancer
- Removes the logic that disregards NotFound errors on DeleteHttpHealthChecks as this is useful information for callers. Here are the three known invocations within kubernetes:
[gce/gce_loadbalancer.go#L457](bc6e77d42f/pkg/cloudprovider/providers/gce/gce_loadbalancer.go (L457)): Only prints warning that HC wasn't deleted -> acceptable
[gce/gce_loadbalancer.go#L465](bc6e77d42f/pkg/cloudprovider/providers/gce/gce_loadbalancer.go (L465)): Err is ignored if not nil -> acceptable
[e2e/framework/ingress_utils.go#L530](bc6e77d42f/test/e2e/framework/ingress_utils.go (L530)): Already checks if is NotFound error -> acceptable
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Step one of https://github.com/kubernetes/ingress/issues/494
Step one of #33483
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
The cloudprovider is being refactored out of kubernetes core. This is being
done by moving all the cloud-specific calls from kube-apiserver, kubelet and
kube-controller-manager into a separately maintained binary(by vendors) called
cloud-controller-manager. The Kubelet relies on the cloudprovider to detect information
about the node that it is running on. Some of the cloudproviders worked by
querying local information to obtain this information. In the new world of things,
local information cannot be relied on, since cloud-controller-manager will not
run on every node. Only one active instance of it will be run in the cluster.
Today, all calls to the cloudprovider are based on the nodename. Nodenames are
unqiue within the kubernetes cluster, but generally not unique within the cloud.
This model of addressing nodes by nodename will not work in the future because
local services cannot be queried to uniquely identify a node in the cloud. Therefore,
I propose that we perform all cloudprovider calls based on ProviderID. This ID is
a unique identifier for identifying a node on an external database (such as
the instanceID in aws cloud).