Automatic merge from submit-queue
Implement LRU for AWS device allocator
On failure to attach do not use device from pool
In AWS environment when attach fails on the node
lets not use device from the pool. This makes sure
that a bigger pool of devices is available.
Automatic merge from submit-queue (batch tested with PRs 43925, 42512)
AWS: add KubernetesClusterID as additional option when VPC is set
This is a small enhancement after the PRs https://github.com/kubernetes/kubernetes/pull/41695 and https://github.com/kubernetes/kubernetes/pull/39996
## Release Notes
```release-note
AWS cloud provider: allow to set KubernetesClusterID or KubernetesClusterTag in combination with VPC.
```
The cloudprovider is being refactored out of kubernetes core. This is being
done by moving all the cloud-specific calls from kube-apiserver, kubelet and
kube-controller-manager into a separately maintained binary(by vendors) called
cloud-controller-manager. The Kubelet relies on the cloudprovider to detect information
about the node that it is running on. Some of the cloudproviders worked by
querying local information to obtain this information. In the new world of things,
local information cannot be relied on, since cloud-controller-manager will not
run on every node. Only one active instance of it will be run in the cluster.
Today, all calls to the cloudprovider are based on the nodename. Nodenames are
unqiue within the kubernetes cluster, but generally not unique within the cloud.
This model of addressing nodes by nodename will not work in the future because
local services cannot be queried to uniquely identify a node in the cloud. Therefore,
I propose that we perform all cloudprovider calls based on ProviderID. This ID is
a unique identifier for identifying a node on an external database (such as
the instanceID in aws cloud).
Automatic merge from submit-queue
Add approvers to the aws OWNERS file
Without this it was picking up reviewers from a much higher directory.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41306, 42187, 41666, 42275, 42266)
Implement bulk polling of volumes
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
cc @justinsb
This implements Bulk volume polling using ideas presented by
justin in https://github.com/kubernetes/kubernetes/pull/39564
But it changes the implementation to use an interface
and doesn't affect other implementations.
Set the vpcID when dummy is created (+1 squashed commit)
Squashed commits:
[0b1ac6e83e] Use the VPC flag and KubernetesClusterTag as identifier (+1 squashed commit)
Squashed commits:
[962bc56e38] Remove again availabilityZone and fix naming (+1 squashed commit)
Squashed commits:
[e3d1b41807] Use the VCID flag as identifier (+1 squashed commit)
Squashed commits:
[5b99fe6243] Add flag for external master
Automatic merge from submit-queue (batch tested with PRs 41921, 41695, 42139, 42090, 41949)
AWS: Support shared tag `kubernetes.io/cluster/<clusterid>`
We recognize an additional cluster tag:
kubernetes.io/cluster/<clusterid>
This now allows us to share resources, in particular subnets.
In addition, the value is used to track ownership/lifecycle. When we
create objects, we record the value as "owned".
We also refactor out tags into its own file & class, as we are touching
most of these functions anyway.
```release-note
AWS: Support shared tag `kubernetes.io/cluster/<clusterid>`
```
Automatic merge from submit-queue (batch tested with PRs 38676, 41765, 42103, 41833, 41702)
AWS: Skip instances that are taggged as a master
We recognize a few AWS tags, and skip over masters when finding zones
for dynamic volumes. This will fix#34583.
This is not perfect, in that really the scheduler is the only component
that can correctly choose the zone, but should address the common
problem.
```release-note
AWS: Do not consider master instance zones for dynamic volume creation
```
We recognize an additional cluster tag:
kubernetes.io/cluster/<clusterid>
This now allows us to share resources, in particular subnets.
In addition, the value is used to track ownership/lifecycle. When we
create objects, we record the value as "owned".
We also refactor out tags into its own file & class, as we are touching
most of these functions anyway.
Automatic merge from submit-queue (batch tested with PRs 41756, 36344, 34259, 40843, 41526)
add InternalDNS/ExternalDNS node address types
This PR adds internal/external DNS names to the types of NodeAddresses that can be reported by the kubelet.
will spawn follow up issues for cloud provider owners to include these when possible
```release-note
Nodes can now report two additional address types in their status: InternalDNS and ExternalDNS. The apiserver can use `--kubelet-preferred-address-types` to give priority to the type of address it uses to reach nodes.
```
We recognize a few AWS tags, and skip over masters when finding zones
for dynamic volumes. This will fix#34583.
This is not perfect, in that really the scheduler is the only component
that can correctly choose the zone, but should address the common
problem.
Automatic merge from submit-queue
AWS: trust region if found from AWS metadata
```release-note
AWS: trust region if found from AWS metadata
```
Means we can run in newly announced regions without a code change.
We don't register the ECR provider in new regions, so we will still need
a code change for now.
Fix#35014
Automatic merge from submit-queue
Curating Owners: pkg/cloudprovider
cc @runseb @justinsb @kerneltime @mikedanese @svanharmelen @anguslees @brendandburns @abrarshivani @imkin @luomiao @colemickens @ngtuna @dagnello @abithap
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names asre sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue (batch tested with PRs 39625, 39842)
AWS: Remove duplicate calls to DescribeInstance during volume operations
This change removes all duplicate calls to describeInstance
from aws volume code path.
**What this PR does / why we need it**:
This PR removes the duplicate calls present in disk check code paths in AWS. I can confirm that `getAWSInstance` actually returns all instance information already and hence there is no need of making separate `describeInstance` call.
Related to - https://github.com/kubernetes/kubernetes/issues/39526
cc @justinsb @jsafrane
Means we can run in newly announced regions without a code change.
We don't register the ECR provider in new regions, so we will still need
a code change for now.
This also means we do trust config / instance metadata, and don't reject
incorrectly configured zones.
Fix#35014
Automatic merge from submit-queue
AWS: Add exponential backoff to waitForAttachmentStatus() and createTags()
We should use exponential backoff while waiting for a volume to get attached/detached to/from a node. This will lower AWS load and reduce API call throttling.
This partly fixes#33088
@justinsb, can you please take a look?
On AWS, we should not reuse device names as long as possible, see
https://aws.amazon.com/premiumsupport/knowledge-center/ebs-stuck-attaching/
"If you specify a device name that is not in use by EC2, but is being used by
the block device driver within the EC2 instance, the attachment of the EBS
volume does not succeed and the EBS volume is stuck in the attaching state."
This patch adds a device name allocator that tries to find a name that's next
to the last used device name instead of using the first available one.
This way we will loop through all device names ("xvdba" .. "xvdzz") before
a device name is reused.
We should use exponential backoff while waiting for a volume to get attached/
detached to/from a node. This will lower AWS load and reduce its API call
throttling.
This method has been unused by k8s for some time, and yet is the last
piece of the cloud provider API that encourages provider names to be
human-friendly strings (this method applies a regex to instance names).
Actually removing this deprecated method is part of a long effort to
migrate from instance names to instance IDs in at least the OpenStack
provider plugin.