Automatic merge from submit-queue
rename variables to make sure that they conform to golang variable name conventions
**What this PR does / why we need it**:
Many package-level unexported variables in package `cmd` do not conform to Go variable naming conventions, for example `version_example`. This PR renames all of them so that they conform.
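As an illustration of the rename only, the sketch below converts a snake_case name such as `version_example` to the mixedCaps form. The helper `toMixedCaps` is hypothetical and not part of this PR (the actual renames were made by hand), and it assumes ASCII identifiers.

```go
package main

import (
	"fmt"
	"strings"
)

// toMixedCaps converts a snake_case identifier to mixedCaps.
// Hypothetical helper for illustration; assumes ASCII identifiers.
// The first word keeps its case, so unexported names stay unexported.
func toMixedCaps(name string) string {
	var b strings.Builder
	for _, part := range strings.Split(name, "_") {
		if part == "" {
			continue // skip empty segments from leading or doubled underscores
		}
		if b.Len() == 0 {
			b.WriteString(part)
			continue
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(part[1:])
	}
	return b.String()
}

func main() {
	fmt.Println(toMixedCaps("version_example"))
}
```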
Several of these loops overlap, and when they are the reason a failure
is happening it is difficult to sort them out. Slightly misalign these
loops to make their impact obvious.
Automatic merge from submit-queue
Start recording cloud provider metrics for AWS
**What this PR does / why we need it**:
This PR implements support for emitting metrics from AWS about storage operations.
**Which issue this PR fixes**
Fixes https://github.com/kubernetes/features/issues/182
**Release note**:
```
Add support for emitting metrics from AWS cloudprovider about storage operations.
```
Automatic merge from submit-queue
Log node name when error attaching volume
Helps with debugging to know immediately which node the volume failed to attach to. Went through all plugins, added this to 3. @gnufied
```release-note
NONE
```
Automatic merge from submit-queue
don't HandleError on container start failure
Failing to start containers is a common error case if there is something wrong with the container image or environment, like missing mounts/configs/permissions/etc. Not only is it common; it is recurring, as backoff happens and new attempts to start the container are made. `HandleError` is too verbose for this very common situation.
Replace `HandleError` with `glog.V(3).Infof`
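To show the effect of moving the message behind a verbosity level, here is a minimal stdlib-only sketch. The `verbosity` variable and the `vInfof` helper are hypothetical stand-ins for glog's `-v` flag and `glog.V(level).Infof`; they are not the kubelet code.

```go
package main

import (
	"fmt"
	"os"
)

// verbosity stands in for glog's -v flag (hypothetical global for this sketch).
var verbosity = 2

// vInfof writes the message only when the configured verbosity is at least
// level, and reports whether it logged anything.
func vInfof(level int, format string, args ...interface{}) bool {
	if verbosity < level {
		return false
	}
	fmt.Fprintf(os.Stderr, format+"\n", args...)
	return true
}

func main() {
	err := fmt.Errorf("image pull failed")
	// Suppressed at verbosity 2; visible only when verbosity is 3 or higher.
	vInfof(3, "container start failed: %v", err)
}
```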
xref https://github.com/openshift/origin/issues/13889
@smarterclayton @derekwaynecarr @eparis
Automatic merge from submit-queue
Prepare for move zz_generated_deepcopy.go to k8s.io/api
This is in preparation for moving the deep copies, together with the types, to the types repo (see https://github.com/kubernetes/gengo/pull/47#issuecomment-296855818). The init() function refers to the `SchemeBuilder` defined in register.go in the same package, so we need to reverse the dependency.
This PR depends on https://github.com/kubernetes/gengo/pull/49, otherwise verification will fail.
Automatic merge from submit-queue (batch tested with PRs 45052, 44983, 41254)
Non-controversial part of #44523
For easier review of #44523, I extracted the non-controversial part into this PR.
Automatic merge from submit-queue (batch tested with PRs 44124, 44510)
Add metrics to all major gce operations (latency, errors)
```release-note
Add metrics to all major gce operations {latency, errors}
The new metrics are:
cloudprovider_gce_api_request_duration_seconds{request, region, zone}
cloudprovider_gce_api_request_errors{request, region, zone}
`request` is the specific function that is used.
`region` is the target region (Will be "<n/a>" if not applicable)
`zone` is the target zone (Will be "<n/a>" if not applicable)
Note: this fixes some issues with the previous implementation of
metrics for disks:
- Time duration tracked was of the initial API call, not the entire
operation.
- Metrics label tuple would have resulted in many independent
histograms stored, one for each disk. (Did not aggregate well).
```
Automatic merge from submit-queue (batch tested with PRs 44124, 44510)
Optimize the time taken to create Persistent volumes with VSAN storage capabilities at scale and handle VPXD crashes
Currently, creating persistent volumes with VSAN storage capabilities at scale takes a very long time. We tested at a scale of 500-600 PVCs, and it takes a long time for all the PVC requests to go from the Pending state to the Bound state.
- In our current design we use a single system VM, "kubernetes-helper-vm", to create a persistent volume with the VSAN policy configured.
- Since all the operations are on a single system VM, all requests at scale get queued and executed serially on this system VM. Because of this, creating a large number of PVCs takes a very long time.
- Since it is a single system VM, parallel PVC requests most of the time tend to take the same SCSI adapter on the system VM and also the same unit number on that SCSI adapter. Therefore the error rate is high.
In order to overcome these issues and to reduce the time taken to create persistent volumes with VSAN storage capabilities at scale, we have slightly modified the design as described below:
- In this model, we create a VM on the fly for every persistent volume that is being created. Since the reconfigure operations that create a disk with the VSAN policy configured each run on their own VM, all of these PVC requests execute in parallel, independently of one another.
- With this new design, the errors caused by requests colliding on the same SCSI adapter and unit number no longer occur.
Also, we have overcome the problem of vpxd crashes and other intermediate problems by checking the type of each error.
Fixes https://github.com/vmware/kubernetes/issues/122, https://github.com/vmware/kubernetes/issues/124
@kerneltime @tusharnt @divyenpatel @pdhamdhere
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 42740, 44980, 45039, 41627, 45044)
Improved code coverage for /pkg/kubelet/types
**What this PR does / why we need it**:
The test coverage for /pkg/kubelet/types was increased from 50% to 87.5%
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 41106, 44346, 44929, 44979, 45027)
Add PATCH to supported list of proxy subresource verbs
Follow up to #41421 for the proxy subresources
```release-note
The proxy subresource APIs for nodes, services, and pods now support the HTTP PATCH method.
```
Automatic merge from submit-queue
Don't check in zz_generated.openapi.go.
`zz_generated.openapi.go` is the file that causes the most merge conflicts of all. In #33440, @thockin updated the makefile to support generating these files on demand, but that didn't play well with bazel/gazel.
In this PR, I add a new build macro that will generate this file with a `go_genrule`. I added support for keeping the BUILD file up to date in mikedanese/gazel#34.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Update token controller test to test async retry
Fixes #44819
https://github.com/kubernetes/kubernetes/pull/44625 changed the token controller to queue a retry if the live service account's resourceVersion did not match our cache.
This updates the unit test that was testing that condition to test async queue behavior (which this condition now drives)
Automatic merge from submit-queue (batch tested with PRs 44970, 43618)
CRI: Fix StopContainer timeout
Fixes https://github.com/kubernetes/kubernetes/issues/44956.
I verified this PR with the example provided in https://github.com/kubernetes/kubernetes/issues/44956, and pod deletion now respects the grace period timeout:
```
NAME READY STATUS RESTARTS AGE
gracefully-terminating-pod 1/1 Terminating 0 6m
```
@dchen1107 @yujuhong @feiskyer /cc @kubernetes/sig-node-bugs
Automatic merge from submit-queue
Allow Partial Success for ImageGC
Fixes #44951. When the eviction manager is under disk pressure, it first attempts to reclaim disk space by deleting images. However, if there are any errors during the image deletion process, the eviction manager treats that as a failed attempt to delete images, even if some were successfully deleted.
This change makes the eviction manager ignore errors during image garbage collection and instead rely solely on the quantity of resources reclaimed. If image deletion fails completely, this still works, because 0 bytes freed is returned. This allows for partial success: any resources freed are counted, even if some images fail to be deleted.
This does not require any changes to the image manager, as the current behavior is already to return the disk space freed along with any errors.
```release-note
Fixes a bug where pods were evicted even after images were successfully deleted.
```
cc @dchen1107 @vishh @kubernetes/kubernetes-release-managers
note to reviewers: this is mostly whitespace changes, so it will make more sense in reviewable
Automatic merge from submit-queue (batch tested with PRs 44940, 44974, 44935)
Remove import of internal api package in generated external-versioned listers
Follow up of https://github.com/kubernetes/kubernetes/pull/44523
There is a one-line change in cmd/libs/go2idl/lister-gen/generators/lister.go and simple changes in pkg/apis/autoscaling/v2alpha1/register.go; all other changes are generated.
The internal api package will be eliminated from client-go, so these imports should be removed. Also, it's more correct to report the versioned resource in the error.
Automatic merge from submit-queue
OpenAPI support for kubectl
Support for openapi spec in kubectl.
Includes:
- downloading and caching openapi spec to a local file
- parsing openapi spec into binary serializable data structures (10x faster load times, 600ms -> 40ms)
- caching parsed openapi spec in memory for each command
```release-note
NONE
```
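To illustrate why a binary-serializable form speeds up loading, the sketch below round-trips an already-parsed structure through `encoding/gob`, so a later run can decode it without re-parsing the spec. The `parsedSpec` type and its field are hypothetical, and the choice of gob here is an assumption made for the example.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// parsedSpec is a hypothetical, simplified result of parsing an openapi spec:
// it maps a kind name to the names of its fields.
type parsedSpec struct {
	Kinds map[string][]string
}

// encodeSpec serializes the parsed spec into bytes suitable for a cache file.
func encodeSpec(s parsedSpec) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeSpec restores a parsed spec from cached bytes, skipping the parse step.
func decodeSpec(data []byte) (parsedSpec, error) {
	var s parsedSpec
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&s)
	return s, err
}

func main() {
	spec := parsedSpec{Kinds: map[string][]string{"Pod": {"metadata", "spec", "status"}}}
	data, err := encodeSpec(spec)
	if err != nil {
		panic(err)
	}
	cached, err := decodeSpec(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(cached.Kinds["Pod"])
}
```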
Automatic merge from submit-queue
Add redirect support to SpdyRoundTripper
Add support for following redirects to the SpdyRoundTripper. This is
necessary for clients using it directly (e.g. the apiserver talking
directly to the kubelet) because the CRI streaming server issues a
redirect for streaming requests.
We need this in OpenShift because we have code that executes inside our apiserver that talks directly to the node to perform an attach request, and we need to be able to follow that redirect.
This code was adapted from the upgrade-aware proxy handler.
cc @smarterclayton @sttts @liggitt @timstclair @kubernetes/sig-api-machinery-pr-reviews
Also extract common logic for following redirects.