Automatic merge from submit-queue (batch tested with PRs 52768, 51898, 53510, 53097, 53058). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
migration of federation test
**What this PR does / why we need it**:
Migrate federation(multicluster) e2e test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
Ref issue https://github.com/kubernetes/kubernetes/issues/50735
- Move `ubernetes_lite.go` to new created directory named **multicluster**.
**Special notes for your reviewer**:
**Release note**:
none
/cc @quinton-hoole
There is a networking e2e test with the It() description:
```
"should provide unchanging, static URL paths for kubernetes api services"
```
This test performs GETs from the Kubernetes API using various paths,
including "/logs". This test for a GET using path "/logs" should be
skipped for provider type "skeleton", since this path is unsupported.
This change adds "skeleton" to the list of providers for which
this test case should be skipped.
fixes#53529
Automatic merge from submit-queue (batch tested with PRs 53278, 53184). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added integration test for TaintNodeByCondition.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #42001
**Release note**:
```release-note
Added integration test for TaintNodeByCondition.
```
Automatic merge from submit-queue (batch tested with PRs 53278, 53184). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add API version apps/v1, and bump DaemonSet to apps/v1
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: kubernetes/features#484
**Special notes for your reviewer**: This PR targets `master`, as a backup if #53223 (targeting features branch) falls through
@kubernetes/sig-apps-api-reviews
**Release note**:
```release-note
Add API version apps/v1, and bump DaemonSet to apps/v1
```
Automatic merge from submit-queue (batch tested with PRs 53227, 53120). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
E2E test to verify clean up of stale dummy VM for vSphere dynamic provisioning
Verify if the dummy stale VM's created during dynamic provisioning are deleted by the clean up routine in vSphere cloud Provider.
**Testing Done:**
- Create a storage class with invalid policy on a VSAN datastore.
- Create a PVC using the above storage class
- Verify if the PVC is not bound.
- Delete the PVC.
- Sleep for 6 minutes so that vSphere Cloud Provider clean up routine can delete the stale dummy VM's.
- Verify if the VM is not present. Otherwise fail the test.
@rohitjogvmw @divyenpatel
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 51750, 53195, 53384, 53410). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add ping6 option for e2e ext connectivity test for IPv6-only clusters
e2e tests provide only an (IPv4) ping test for external connectivity.
We need a way to conditionally run a ping6 external connectivity check,
and disable the (IPv4) ping-based external connectivity check,
for end-to-end testing on IPv6-only clusters.
This feature will be needed for creating gating IPv6 CI tests.
fixes#53383
**What this PR does / why we need it**:
This adds an IPv6 (ping6) version of the external connectivity ping check to the e2e test suite,
and adds "Feature:" flags for selecting whether the IPv4 or IPv6 (or both) versions
of the connectivity test should be run. We need this change to be able to use the
e2e test suite in upstream gating IPv6 CI tests on IPv6-only clusters (at least until
dual-stack operation is fully supported in Kubernetes).
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53383
**Special notes for your reviewer**:
Please let me know if there are better tags to use for selecting IPv4 vs IPv6 testing.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53345, 53389). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add IPv6 option for e2e iPerf test
The e2e iPerf test case currently only runs in IPv4 mode.
This change adds an option to run an iPerf test in IPv6 mode (i.e. by running
iPerf with a "-V" command line flag), so that the test can be run on
IPv6-only clusters.
**What this PR does / why we need it**:
This change adds an option to run an iPerf test in IPv6 mode (i.e. by running
iPerf with a "-V" command line flag), so that the test can be run on
IPv6-only clusters. It also adds a Feature tag to the current IPv4 iPerf test
so that it can be disabled when running e2e tests on an IPv6-only cluster.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53388
**Special notes for your reviewer**:
Please let me know if there are better "Feature:" tags to use for selecting whether to run the IPv4 vs IPv6 test case.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 53228, 53232, 53353). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes a regression introduced by PR 52290 that extended resource
capacity may temporarily drop to zero after kubelet restarts and PODs restarted during
that time window could fail to be scheduled.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53342
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes test/e2e_node/gpu_device_plugin.go test failure.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes https://github.com/kubernetes/kubernetes/issues/53354
**Special notes for your reviewer**:
**Release note**:
```release-note
```
e2e tests provide only an (IPv4) ping test for external connectivity.
We need a way to conditionally run a ping6 external connectivity check,
and disable the (IPv4) ping-based external connectivity check,
for end-to-end testing on IPv6-only clusters.
This feature will be needed for creating gating IPv6 CI tests.
fixes#53383
Automatic merge from submit-queue (batch tested with PRs 49826, 53404). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix typo in health check url
fixes the url used to scrape bootstrapping details in a test failure case
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remove conformance tag for internet connectivity
**What this PR does / why we need it**:
ICMP ping is not available in many environments, so we should avoid
using a e2e test based the premise as a conformance test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
Part of Fix for #52098, Please see #51287 for more discussion
**Release note**:
```release-note
NONE
```
The e2e iPerf test case currently only runs in IPv4 mode.
This change add an option to run an iPerf test in IPv6 mode (i.e. by running
iPerf with a "-V" command line flag), so that the test can be run on
IPv6-only clusters.
Automatic merge from submit-queue (batch tested with PRs 52723, 53271). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update file location in e2e test comment
**What this PR does / why we need it**: The location provided, "docs/design/expansion.md" leads to something saying the file has moved with a link. The link goes to a 404 error. The file was moved out of tree to https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/expansion.md and the comment here should be changed
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53270
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Disable autoscaling before removing autoscaled node pool
This is to prevent flakes due to API calls failing in AfterEach during master restart, which is triggered by deleting an autoscaled node pool. Adding disable call before deleting node pool should prevent this as we'll wait for master restart in disableAutoscaler function.
While it may be faster to wait after deletion of autoscaled node pools, this is less complex and will be easier to remove in the future when changing autoscaling setttings no longer triggers master restart.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix skip condition for autoscaling test of scale to zero
This fixes test running in wrong setup (on single MIG vs multiple MIGs as was intended.)
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Change RBAC storage version to v1 for 1.9
v1 was introduced in 1.8, but storage version remained at v1beta1 to accommodate HA rolling upgrades. in 1.9, we can change the persisted and preferred version to v1
```release-note
RBAC objects are now stored in etcd in v1 format. After completing an upgrade to 1.9, RBAC objects (Roles, RoleBindings, ClusterRoles, ClusterRoleBindings) should be migrated to ensure all persisted objects are written in `v1` format, prior to `v1alpha1` support being removed in a future release.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Remake cluster size autoscaling scale to zero test
This PR affects only cluster size autoscaling test suite. Changes:
* check whether autoscaling for is enabled by looking for a node group with a given max number of nodes instead of min as the field is omitted if value is 0
* split scale to zero test into GKE & GCE versions, add GKE-specific setup and verification
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
enable to specific unconfined AppArmor profile
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52370
**Special notes for your reviewer**:
/assign @tallclair @liggitt
**Release note**:
```release-note
enable to specific unconfined AppArmor profile
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
openapi: Validate unregistered type, if they can be found
**What this PR does / why we need it**:
Types that are not registered/hard-coded in kubectl won't be validated, even if they could because they are defined in openapi. If they are neither registered nor in openapi, then skip validation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes nothing
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Migrate sig-ui e2e test
**What this PR does / why we need it**:
Migrate sig-ui e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
none
Automatic merge from submit-queue (batch tested with PRs 53234, 53252, 53267, 53276, 53107). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prepull images after disk eviction tests
Example failure: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-flaky/2855
Disk eviction tests trigger image garbage collection. It can remove images required for subsequent tests.
This results in the error during pod creation:
`timed out waiting for the condition`
You can see in the events after the test:
`I0929 15:47:05.884] I0929 15:17:09.376591 2309 util.go:4734] Event(v1.ObjectReference{Kind:"Pod", Namespace:"e2e-tests-localstorage-eviction-test-mn5v4", Name:"container-disk-hog-pod", UID:"8dba851c-a528-11e7-a9a6-42010a800fd7", APIVersion:"v1", ResourceVersion:"116", FieldPath:"spec.containers{container-disk-hog-container}"}): type: 'Warning' reason: 'ErrImageNeverPull' Container image "busybox" is not present with pull policy of Never`
/assign @Random-Liu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes a flakiness in GPUDevicePlugin e2e test.
Waits till nvidia gpu disappears from all nodes after deleting the
device plug DaemonSet to make sure its pods are deleted from all nodes.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53281
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 53263, 52967, 53262, 52654, 53187). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix sed command to not try shell redirection
RunCmd uses Go's os/exec library to run commands directly. Since these
are not run through a shell, we can't use shell syntax for piping or
file redirection. The proper way to do that is to create a Command
object and set the Std{in,out,err} pipes appropriately. Luckily sed
can handle the behavior we need without having to manually set this up.
fixes https://github.com/kubernetes/kubeadm/issues/472
RunCmd uses Go's os/exec library to run commands directly. Since these
are not run through a shell, we can't use shell syntax for piping for
file redirection. The proper way to do that is to create a Command
object and set the Std{in,out,err} pipes appropriately. Luckily sed
can handle the behavior we need without having to manually set this up.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
In cluster autoscaling tests, improve error logging on enable autoscaling failure
This adds logging command and request output in addition to error.
Automatic merge from submit-queue (batch tested with PRs 51311, 52575, 53169). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix a scheduler flaky e2e test
**What this PR does / why we need it**:
Makes a scheduler e2e test that verifies the resource limit predicate more robust.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53066
**Release note**:
```release-note
NONE
```
@kubernetes/sig-scheduling-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50280, 52529, 53093, 53108, 53168). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve PVC ref volume metric test robustness
This test has been flaking. The current working theory is that
volume stats collection didn't run in time to grab the metrics
from the newly created pod.
Made the following changes:
- Added more logs to help debug future failures
- Poll metrics a few additional times before failing the test
fixes#53150
This test has been flaking. The current working theory is that
volume stats collection didn't run in time to grab the metrics
from the newly created pod.
Made the following changes:
- Added more logs to help debug future failures
- Poll metrics a few additional times before failing the test
Automatic merge from submit-queue (batch tested with PRs 52630, 53110, 53136, 53075). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix host network flake tests
**What this PR does / why we need it**:
Fix flaky test "Security Context when creating a pod in the host network namespace should listen on same port in the host network containers".
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#53091
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
federation: simplify deepcopy calls
We have static DeepCopy now without the possibility of errors. Makes the code much simpler.
Automatic merge from submit-queue (batch tested with PRs 50988, 50509, 52660, 52663, 52250). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Added device plugin e2e kubelet failure test
Signed-off-by: Renaud Gaubert <renaud.gaubert@gmail.com>
**What this PR does / why we need it**:
This is part of issue #52859 (fixes#52859)
This PR adds a e2e_node test for the device plugin.
Specifically it implements testing of failure handling by the device plugin components in case Kubelet restart / crashes.
I might try to refactor the GPU tests in a later PR.
**Special notes for your reviewer**:
@jiayingz @vishh
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52990, 53064, 52686, 52221, 53069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow kubelet metrics tests to run on gke
**What this PR does / why we need it**:
On GKE, you can still access kubelet metrics, so allow the kubelet metrics test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
NONE
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Improve HT detection
**What this PR does / why we need it**:
Fix Cpu Manager e2e node tests that fail due to hard-coded `Thread(s) per core` char position.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52988
Automatic merge from submit-queue (batch tested with PRs 52721, 53057, 52493, 52998, 52896). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Move deployment collision avoidance e2e test to integration
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #52113
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
In autoscaling tests, add waiting for new pool to become ready
This adds missing timeout when adding a node pool in GKE scale to 0 test and improves logging error when enabling autoscaling.
Automatic merge from submit-queue (batch tested with PRs 51648, 53030, 53009). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixed intermitant e2e aggregator test on GKE.
**What this PR does / why we need it**: Issue was caused by another test cleaning up its namespace.
This caused the namespace controller to try to clean up that namespace.
This involves deleting all flunders under that namespace.
However the sample-apiserver was not honoring the namespace filter.
So the flunders for the test would randomly disappear.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50945
**Special notes for your reviewer**: Requires we fix the container image to contain this fix to work.
**Release note**:
```release-note NONE
```
Fixes issues/50945.
Issue was caused by another test cleaning up its namespace.
This caused the namespace controller to try to clean up that namespace.
This involves deleting all flunders under that namespace.
However the sample-apiserver was not honoring the namespace filter.
So the flunders for the test would randomly disappear.
Fixed image path to pick up newly built fixes from this PR.
Automatic merge from submit-queue (batch tested with PRs 51759, 53001, 52806). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix broken statefulset e2e test
**What this PR does / why we need it**:
Fixes the CockroachDB statefulset e2e test.
This was broken back in #43637 when the logic in
`(*StatefulSetTester).CreateStatefulSet` switched from using
`generated.ReadOrDie` to read the entire service.yaml file and pass it
to kubectl to using `manifest.SvcFromManifest`, which assumes that the
file contains only a single service.
To fix the test, just remove the second service, which isn't needed to test the Statefulset functionality.
**Which issue this PR fixes**:
Fixes#52750
**Special notes for your reviewer**:
N/A
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51067, 52319, 52803, 52961, 51972). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add support for skeleton in GetSigner
Adding support for skeleton to GetSigner to be able to run
e2e tests against a bare metal multinode cluster.
Closes#35613
Automatic merge from submit-queue (batch tested with PRs 51067, 52319, 52803, 52961, 51972). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Move prometheus metrics for docker operations into dockershim
Automatic merge from submit-queue (batch tested with PRs 52960, 52373). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Refactor eviction tests
fixes: #52203
We have a bunch of eviction tests, which each break independently, and take a large amount of time to fix.
This refactors these tests to share the core eviction testing logic. Each tests needs only to set kubelet flags, and specify which pods to run.
I decided to omit the memory eviction tests because they work. Best not to disturb them.
A large portion of the code changes are the renaming of inode_eviction_test.go -> eviction_test.go
This should probably wait until after https://github.com/kubernetes/kubernetes/pull/50392
/assign @mtaufen @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 52905, 52766). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Refactor parsing cluster autoscaler status, add logging error
Minor improvements to autoscaling test suite and e2e framework.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix GCE LB resource cleanup for service e2e tests.
**What this PR does / why we need it**: Fix GCE LB resource cleanup logic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52347
**Special notes for your reviewer**:
/assign @shyamjvs @nicksardo
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52880, 52855, 52761, 52885, 52929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Don't need to check useAnnotation in dns e2e test
**What this PR does / why we need it**:
hostname/subdomain annotations were removed in #44137. This PR removes the check.
Also, `var dnsServiceLabelSelector` is not used anymore.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
ref: https://github.com/kubernetes/kubernetes/pull/44137
**Special notes for your reviewer**:
/cc @bowei @MrHohn
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52880, 52855, 52761, 52885, 52929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Remove cloud provider rackspace
**What this PR does / why we need it**:
For now, we have to implement functions in both `rackspace` and `openstack` packages if we want to add function for cinder, for example [resize for cinder](https://github.com/kubernetes/kubernetes/pull/51498). Since openstack has implemented all the functions rackspace has, and rackspace is considered deprecated for a long time, [rackspace deprecated](https://github.com/rackspace/gophercloud/issues/592) ,
after talking with @mikedanese and @jamiehannaford offline , i sent this PR to remove `rackspace` in favor of `openstack`
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52854
**Special notes for your reviewer**:
**Release note**:
```release-note
The Rackspace cloud provider has been removed after a long deprecation period. It was deprecated because it duplicates a lot of the OpenStack logic and can no longer be maintained. Please use the OpenStack cloud provider instead.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Allow dns e2e test case for ExternalName to run on aws
**What this PR does / why we need it**:
#52840 uses allocated clusterIP instead of hard-coded one. So we don't need to care about the clusterIP range of the CI job config. Let it run on pull-kubernetes-e2e-kops-aws
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47224
**Special notes for your reviewer**:
ref: https://github.com/kubernetes/test-infra/pull/4462
/cc @bowei @MrHohn @justinsb
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
bazel: build/test almost everything
**What this PR does / why we need it**: Miscellaneous cleanups and bug fixes. The main motivating idea here was to make `bazel build //...` and `bazel test //...` mostly work. (There's a few reasons these still don't work, but we're a lot closer.)
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @BenTheElder @mikedanese @spxtr
Automatic merge from submit-queue (batch tested with PRs 52469, 52574, 52330, 52689, 52829). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixing E2E Test - After restarting kubelet test expects node's status to be NotReady
**What this PR does / why we need it**:
This PR is fixing the e2e tests involves restarting the kubelets. After the kubelet is restarted, test expect the desired state to be NotReady.
After restarting the kubelet we should wait for some time and then check nodes status to be Ready.
Node should not be checked for NotReady state, after restarting kubelet.
**Which issue this PR fixes**
fixes # https://github.com/vmware/kubernetes/issues/285
**Special notes for your reviewer**:
@BaluDontu @rohitjogvmw @tusharnt
Test logs before fix
-----
STEP: Restarting kubelet
Sep 15 11:26:32.768: INFO: Attempting sudo systemctl restart kubelet
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: command: sudo systemctl restart kubelet
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: stdout: ""
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: stderr: ""
Sep 15 11:26:33.001: INFO: ssh root@10.162.22.205:22: exit code: 0
Sep 15 11:26:33.002: INFO: Waiting up to 1m0s for node kubernetes-node2 condition Ready to be false
Sep 15 11:26:33.012: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:35.023: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:37.032: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:39.041: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:41.051: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:43.061: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:45.070: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:47.080: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:49.093: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:51.105: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:53.117: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:55.128: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:57.140: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:26:59.151: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:01.158: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:03.167: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:05.180: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:07.188: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:09.210: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:11.221: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:13.231: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:15.240: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:17.249: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:19.263: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:21.272: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:23.283: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:25.309: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:27.317: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:29.327: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:31.342: INFO: Condition Ready of node kubernetes-node2 is true instead of false. Reason: KubeletReady, message: kubelet is posting ready status
Sep 15 11:27:33.343: INFO: Node kubernetes-node2 didn't reach desired Ready condition status (false) within 1m0s
Sep 15 11:27:33.343: INFO: Node kubernetes-node2 failed to enter NotReady state
[AfterEach] [sig-storage] PersistentVolumes:vsphere
Test logs after fix
-----
STEP: Restarting kubelet
Sep 18 15:40:49.066: INFO: Checking if sudo command is present
Sep 18 15:40:49.342: INFO: Checking if systemctl command is present
Sep 18 15:40:49.573: INFO: Attempting `sudo systemctl status kubelet | grep 'Main PID'`
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: command: sudo systemctl status kubelet | grep 'Main PID'
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: stdout: " Main PID: 19715 (docker)\n"
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:49.733: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:49.733: INFO: Attempting `sudo systemctl restart kubelet`
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: command: sudo systemctl restart kubelet
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: stdout: ""
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:49.986: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:49.988: INFO: Attempting `sudo systemctl status kubelet | grep 'Main PID'`
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: command: sudo systemctl status kubelet | grep 'Main PID'
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: stdout: " Main PID: 25021 (docker)\n"
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: stderr: ""
Sep 18 15:40:50.158: INFO: ssh root@10.162.16.97:22: exit code: 0
Sep 18 15:40:50.158: INFO: Noticed that kubelet PID is changed. Waiting for 30 Seconds for Kubelet to come back
Sep 18 15:41:20.159: INFO: Waiting up to 1m0s for node kubernetes-node4 condition Ready to be true
STEP: Testing that written file is accessible.
Sep 18 15:41:20.191: INFO: Running '/Users/divyenp/github/vmware/kubernetes/_output/dockerized/bin/darwin/amd64/kubectl --server=https://10.162.0.45 --kubeconfig=/Users/divyenp/.kube/config exec --namespace=e2e-tests-pv-9j8j0 pvc-tester-3t9ds -- /bin/sh -c cat /mnt/_SUCCESS'
Sep 18 15:41:20.855: INFO: stderr: ""
Sep 18 15:41:20.855: INFO:
Sep 18 15:41:20.855: INFO: Volume mount detected on pod pvc-tester-3t9ds and written file /mnt/_SUCCESS is readable post-restart.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51902, 52718, 52687, 52137, 52697). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Multi-arch allowPrivilegeEscalation tests
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52698
**Special notes for your reviewer**:
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 52485, 52443, 52597, 52450, 51971). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Removing PrometheusPushGateway --prom-push-gateway flag from e2e tests.
**What this PR does / why we need it**: Removing obsolete PrometheusPushGateway --prom-push-gateway flag from e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#45947
**Special notes for your reviewer**:
**Release note**:
```release-note
Removing `--prom-push-gateway` flag from e2e tests
```
Automatic merge from submit-queue (batch tested with PRs 50392, 52108, 52083, 52134, 51526). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
e2e: minor changes to network/service testing utils
Add more logging to help debug. Also refactor several functions to improve
reusability.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
inode eviction tests fill a constant number of inodes
Issue: #52203
inode eviction tests pass often on some OS distributions, and almost never on others. See [these testgrid tests](https://k8s-testgrid.appspot.com/sig-node#kubelet-flaky-gce-e2e&include-filter-by-regex=Inode)
These differences are most likely because different images have fewer or greater inode capacity, and thus percentage based rules (e.g. inodesFree<50%) make the test more stressful for some OS distributions than others.
This changes the test to require that a constant number of inodes are consumed, regardless of the number of inodes in the filesystem, by setting the new threshold to:
nodefs.inodesFree<(current_inodes_free - 200k)
so that after pods consume 200k inodes, they will be evicted. It requires querying the summary API until we successfully determine the current number of free Inodes.
Automatic merge from submit-queue (batch tested with PRs 51929, 52015, 51906, 52069, 51542). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
move specialDefaultResourcePrefixes out of vendor/k8s.io/apiserver
just a clean-up, fixes TODO: move out of this package, it is not generic
@sttts PTAL
/assign @sttts
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
should use time.Since instead of time.Now().Sub
**What this PR does / why we need it**:
should use time.Since instead of time.Now().Sub
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
```release-note
```
NONE
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Debug for issues #50945
Aggregator e2e test is intermittantly failing on GKE but not GCE.
Adding the following debugging for help trace issue.
Make sure we always use the same rest client.
Randomly generate the flunder resource name to detect parallel tests.
Print endpoints for sample-system in case multiple instances.
Print original and new pods in case the pod has been restarted.
**What this PR does / why we need it**: Adds debugging for aggregator e2e test to track down GKE flakiness.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50945
**Special notes for your reviewer**: This is primarily additional debugging information.
**Release note**:
```release-note NONE
```
Aggregator e2e test is intermittantly failing on GKE but not GCE.
Adding the following debugging for help trace issue.
Make sure we always use the same rest client.
Randomly generate the flunder resource name to detect parallel tests.
Print endpoints for sample-system in case multiple instances.
Print original and new pods in case the pod has been restarted.
Fixed import list.
Remove rand seed.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Don't specify clusterIP in dns e2e test
**What this PR does / why we need it**:
Different upgrade tests may configure different service clusterIP ranges. If we specify the clusterIP in dns e2e test, it will succeed in one upgrade test but fail in another. This PR doesn't specify clusterIP. It just uses the allocated clusterIP.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50274
**Special notes for your reviewer**:
Hope this can really fixes that issue.
/cc @thockin @MrHohn
**Release note**:
```release-note
NONE
```
This was broken back in #43637 when the logic in
`(*StatefulSetTester).CreateStatefulSet` switched from using
`generated.ReadOrDie` to read the entire service.yaml file and pass it
to kubectl to using `manifest.SvcFromManifest`, which assumes that the
file contains only a single service.
Fixes#52750
Automatic merge from submit-queue (batch tested with PRs 52843, 52710, 52821, 52844). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
improve retrying logic when checking CA status
This should reduce the flake rate in cluster size autoscaling test suite.
Automatic merge from submit-queue (batch tested with PRs 48406, 52819). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fixed nil dereference in dynamic provisioning e2e tests
**What this PR does / why we need it**: Fixed nil dereference in dynamic provisioning e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52815
**Release note**:
```release-note-none
NONE
```
/sig storage
/assign @saad-ali
/cc @wongma7
/release-note-none
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Retry if possible while creating latency pods in density test
Saw the [last run](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/37) of density test on 5k-node fail due to it:
```
Expected error:
<*errors.StatusError | 0xc44f2fd7a0>: {
ErrStatus: {
TypeMeta: {Kind: "", APIVersion: ""},
ListMeta: {SelfLink: "", ResourceVersion: "", Continue: ""},
Status: "Failure",
Message: "timeout",
Reason: "",
Details: nil,
Code: 500,
},
}
timeout
not to have occurred
```
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Support kubernetes-anywhere provider
**What this PR does / why we need it**:
Implements a new `kubernetes-anywhere` provider to allow upgrade testing in the e2e binary. This is the final step to allow https://github.com/kubernetes/test-infra/pull/4495 and https://github.com/kubernetes/kubernetes-anywhere/pull/450.
**Which issue this PR fixes**:
https://github.com/kubernetes/kubeadm/issues/311
**Special notes for your reviewer**:
Some questions I had
- Does the `--provider` flag specified [here](dbbf6261e0/jobs/config.json (L8587)) get sent to the flag defined [here](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/test_context.go#L219)? Or should I add another `--provider` flag inside `--upgrade_args` like this: `--upgrade_args=... --provider=kubernetes-anywhere`?
- Is it necessary to add waiting logic after the `make` command, or will it implicitly handle that by itself?
Some other points:
- I chose `sed` to manipulate the current kubernetes-anywhere `.config` rather than duplicating another [`anywhere.go`](https://github.com/kubernetes/test-infra/blob/master/kubetest/anywhere.go). One suggestion was to use `jq` but since the config on disk is not serialized to JSON yet, I'm not sure how that'd work.
- Since I don't have a GCE/GKE account or vCenter, I can't actually verify the e2e binary works. I've managed to build it, but if somebody could quickly run a smoke test, I'd appreciate it. This is my first poke around test-infra and e2e, so there might be some plumbing missing
/cc @jessicaochen @luxas @pipejakob @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 52500, 52533). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add mount options e2e test
**What this PR does / why we need it**: A test for newly added StorageClass.mountOptions and PV.mountOptions: provision a pv using a class with its storageclass.mountoptions set, and the end result should be that the mount options can be seen from the mounter.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: Fixes#52138
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In autoscaling integration test, use allocatable instead of capacity for node memory
This makes the remaining cluster autoscaling test (integration test of HPA and CA working together to scale up the cluster) use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. Follow up to #52650 as that one is already merging.
cc @wasylkowski
Automatic merge from submit-queue (batch tested with PRs 51337, 47080, 52646, 52635, 52666). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Fix: update system spec to support Docker 17.03
Docker 17.03 is 1.13 with bug fixes so they are of the same minor version release. We've validated them both in https://github.com/kubernetes/kubernetes/issues/42926. This PR changes the system spec to support Docker 17.03.
**This should be in 1.8.**
**Release note**:
```
Kubernetes 1.8 supports docker version 17.03.x.
```
/assign @Random-Liu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
In cluster size autoscaling tests, use allocatable instead of capacity for node memory
This makes cluster size autoscaling e2e tests use node allocatable resources when computing how much memory we need to consume in order to trigger scale up/prevent scale down. It should fix failing tests in GKE.
Automatic merge from submit-queue (batch tested with PRs 52350, 52659). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Add e2e test for storageclass.reclaimpolicy
**What this PR does / why we need it**: Adds another dynamic provisioning test where the storageclass.reclaimpolicy == retain. Have to manually delete the PV at the end of the test.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: https://github.com/kubernetes/kubernetes/issues/52138
**Special notes for your reviewer**: I have not tested it but it's ready for review, I will comment and edit this when i've verified it actually works.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51796, 52223)
Add bsalamat to sig-scheduling-maintainers
**What this PR does / why we need it**:
Adds bsalamat to sig-scheduling-maintainers.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes # N/A
**Release note**:
```release-note
NONE
```
@kubernetes/sig-scheduling-pr-reviews @davidopp @timothysc @k82cn @wojtek-t
Automatic merge from submit-queue
Bugfix: Fix e2e Flaky Apps/Job BackoffLimit test
This fix is linked to the PR #51153 that introduce the `JobSpec.BackoffLimit`.
Previously the Timeout used in the test was too aggressive and generates flaky test execution. Now it used the default `framework.JobTimeout` used in others tests.
**What this PR does / why we need it**:
This PR should fix flaky "[sig-apps] Job should exceed backoffLimit" test, due to a too short timeout duration.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#51153
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)
Improve apiserver metrics reporting
Normalize "WATCHLIST" to "WATCH", add "scope" to the other metrics (listing 50k pods is != listing pods in a namespace), and add a new scope "resource" to cover individual resource calls.
This roughly aligns metrics with our ACL model (technically resource scope is GET, but POST to a subresource and POST to a namespace are not the same thing).
```release-note
WATCHLIST calls are now reported as WATCH verbs in prometheus for the apiserver_request_* series. A new "scope" label is added to all apiserver_request_* values that is either 'cluster', 'resource', or 'namespace' depending on which level the query is performed at.
```
Automatic merge from submit-queue (batch tested with PRs 52442, 52247, 46542, 52363, 51781)
Add more tests for pod preemption
**What this PR does / why we need it**:
Adds more e2e and integration tests for pod preemption.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
This PR is based on #50949. Only the last commit is new.
**Release note**:
```release-note
NONE
```
ref/ #47604
@kubernetes/sig-scheduling-pr-reviews @davidopp
Automatic merge from submit-queue
[fluentd-gcp addon] Remove some e2e tests out of blocking suites
Fixes https://github.com/kubernetes/kubernetes/issues/52433
Some Stackdriver Logging e2e tests are broken in release-blocking suites:
- Due to the change in Docker 1.13, on some systems logs are automatically split by 16K chunks. This PR removes an e2e test that assumes otherwise
- In large clusters, it's not possible to ingest system logs from all nodes
Since it's not a Kubernetes problem per se, mitigating this by removing these tests from blocking suites.
Automatic merge from submit-queue
Fix failing autoscaling test in GKE
This should fix `[sig-autoscaling] Cluster size autoscaling [Slow] should increase cluster size if pending pods are small and there is another node pool that is not autoscaled [Feature:ClusterSizeAutoscalingScaleUp]` by getting a list of nodes from GKE nodepool in a different way (filtering nodes by labels.) Currently, gcloud command used for it is failing, as we only have GKE node pool name in the test and not the actual MIG name.
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)
Remove the conversion of client config
It was needed because the clientset code in client-go was a copy of the clientset code in Kubernetes.. client-go is authoritative now, so we can remove the nasty copy.
This fix is linked to the PR #51153 that introduce the
JobSpec.BackoffLimit.
Previously the Timeout used in the test was too agressive and generates
flaky test execution. Now it used the default framework.JobTimeout used
in others tests.
Automatic merge from submit-queue
Make CPU constraint for l7-lb-controller in density test scale with #nodes
Just noticed that we changed the memory last time, but didn't change cpu. From the last run:
```
Sep 13 04:25:03.360: INFO: Unexpected error occurred: Container l7-lb-controller-v0.9.6-gce-scale-cluster-master/l7-lb-controller is using 0.642709233/0.15 CPU
```
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)
Extends GPUDevicePlugin e2e test to exercise device plugin restarts.
**What this PR does / why we need it**:
This is part of issue #52189 but does not fix it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)
[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations
Stackdriver doesn't support log entries bigger than 100KB, so by default fluentd plugin just drops such entries. To avoid that and increase the visibility of this problem it's suggested to trim long lines instead.
/cc @igorpeshansky
```release-note
[fluentd-gcp addon] Fluentd will trim lines exceeding 100KB instead of dropping them.
```
Automatic merge from submit-queue
Version gates the ephemeral storage e2e test
Version gates the ephemeral storage e2e test.
**Release note**:
```
NONE
```
@kubernetes/sig-testing-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)
StatefulSet: Deflake e2e RunHostCmd more.
It turns out that at some points while the Node is recovering from a reboot, we get a different kind of error ("unable to upgrade connection"). Since we can't distinguish these transient errors from an error encountered after successfully executing the remote command, let's just retry all errors for 5min. If this doesn't work, I'm gonna blame it on sig-node.
ref #48031
Automatic merge from submit-queue (batch tested with PRs 48226, 52046, 52231, 52344, 52352)
Port Guestbook tests to mutiarch
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52232
**Special notes for your reviewer**:
**Release note**:
```NONE
NONE
```
Automatic merge from submit-queue
Node e2e tests for the CPU Manager.
**What this PR does / why we need it**:
- Adds node e2e tests for the CPU Manager implementation in https://github.com/kubernetes/kubernetes/pull/49186.
**Special notes for your reviewer**:
- Previous PR in this series: #51180
- Only `test/e2e_node/cpu_manager_test.go` must be reviewed as a part of this PR (i.e., the last commit). Rest of the comments belong in #51357 and #51180.
- The tests have been on run on `n1-standard-n4` and `n1-standard-n2` instances on GCE.
To run this node e2e test, use the following command:
```sh
make test-e2e-node TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS="CPU Manager" SKIP="" PARALLELISM=1
```
CC @ConnorDoyle @sjenning
It turns out that at some points while the Node is recovering from a
reboot, we get a different kind of error ("unable to upgrade
connection"). Since we can't distinguish these transient errors from an
error encountered after successfully executing the remote command,
let's just retry all errors for 5min. If this doesn't work, I'm gonna
blame it on sig-node.
Automatic merge from submit-queue (batch tested with PRs 52007, 52196, 52169, 52263, 52291)
Summary tests should expect rss usage now
**What this PR does / why we need it**:
Fixes summary test to expect rss usage now.
Previously, cAdvisor reported rss and not total_rss, but that has now been fixed in most recent version of cAdvisor now in the project.
See: https://github.com/kubernetes/kubernetes/pull/43399#issuecomment-287858599
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50289, 52106)
Fix AppArmor test at scale
**What this PR does / why we need it**:
The AppArmor test only runs on a single node, but previously was loading the necessary profiles to every node. This caused unnecessary churn in very large clusters, so this PR updates the test to only load the profiles to a single node, and ensure the test pod is run on that node (using pod affinity).
**Which issue this PR fixes**: fixes#51791
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 52227, 52120)
Fix discovery restmapper finding resources in non-preferred versions
Fixes: #52219
Also reverts behavioral changes to tests that version-qualified cronjobs to work around this issue.
The discovery rest mapper was only populating the priority rest mapper's search list with preferred groupversions.
That meant that if a resource existed in multiple non-preferred versions, AND did not exist in the preferred version (like cronjob, which only exists in v1beta2.batch and v2alpha1.batch, but not v1.batch), the priority restmapper would not find it in its group/version priority list, and would return an error.
```release-note
Fixed an issue looking up cronjobs when they existed in more than one API version
```
Automatic merge from submit-queue
Extend nvidia-gpus e2e test to include a device plugin based test
**What this PR does / why we need it**:
This is needed to verify device plugin feature.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/features/issues/368
**Special notes for your reviewer**:
Related test_infra PR: https://github.com/kubernetes/test-infra/pull/4265
**Release note**:
Add an e2e test for nvidia gpu device plugin
Automatic merge from submit-queue (batch tested with PRs 52047, 52063, 51528)
Improve dynamic kubelet config e2e node test and fix bugs
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.
Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.
This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.
Related issue: #50217
I had been sitting on this until the cAdvisor fix merged in #51751, as these tests fail without that fix.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add pod preemption to the scheduler
**What this PR does / why we need it**:
This is the last of a series of PRs to add priority-based preemption to the scheduler. This PR connects the preemption logic to the scheduler workflow.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48646
**Special notes for your reviewer**:
This PR includes other PRs which are under review (#50805, #50405, #50190). All the new code is located in 43627afdf9.
**Release note**:
```release-note
Add priority-based preemption to the scheduler.
```
ref/ #47604
/assign @davidopp
@kubernetes/sig-scheduling-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 51900, 51782, 52030)
apiservers: stratify versioned informer construction
The versioned share informer factory has been part of the GenericApiServer config,
but its construction depended on other fields of that config (e.g. the loopback
client config). Hence, the order of changes to the config mattered.
This PR stratifies this by moving the SharedInformerFactory from the generic Config
to the CompleteConfig struct. Hence, it is only filled during completion when it is
guaranteed that the loopback client config is set.
While doing this, the CompletedConfig construction is made more type-safe again,
i.e. the use of SkipCompletion() is considereably reduced. This is archieved by
splitting the derived apiserver Configs into the GenericConfig and the ExtraConfig
part. Then the completion is structural again because CompleteConfig is again
of the same structure: generic CompletedConfig and local completed ExtraConfig.
Fixes#50661.
Automatic merge from submit-queue
fix format of forbidden messages
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51813
**Special notes for your reviewer**:
/assign @deads2k @liggitt
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
Multiarch support for pets images
**What this PR does / why we need it**:
This PR is for multiarch support for pets image
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#52133
**Special notes for your reviewer**:
Copied over the `contrib/pets/peer-finder` as this one is heavily used in many docker images under `test/images`. After this PR I'll submit the PR in contrib project to remove it.
**Release note**:
```NONE
```
Automatic merge from submit-queue
Pipe in upgrade image target for kube-proxy migration tests
**What this PR does / why we need it**:
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds&width=20
and
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds&width=20
are still failing.
Reproduced it locally and found node image is being default to debian during upgrade (it was gci before upgrade) because we don't pass in `gci` via `--upgrade--target`. And for some reasons (haven't figured out yet), the upgraded node uses debian image with gci startupscripts...
This PR pipes in `--upgrade-target` for kube-proxy migration tests, hopefully in conjunction with https://github.com/kubernetes/test-infra/pull/4447 it will bring the tests back to normal.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE
**Special notes for your reviewer**:
Sorry for bothering again.
/assign @krousey
**Release note**:
```release-note
NONE
```
Rather than just changing the config once to see if dynamic kubelet
config at-least-sort-of-works, this extends the test to check that the
Kubelet reports the expected Node condition and the expected configuration
values after several possible state transitions.
Additionally, this adds a stress test that changes the configuration 100
times. It is possible for resource leaks across Kubelet restarts to
eventually prevent the Kubelet from restarting. For example, this test
revealed that cAdvisor's leaking journalctl processes (see:
https://github.com/google/cadvisor/issues/1725) could break dynamic
kubelet config. This test will help reveal these problems earlier.
This commit also makes better use of const strings and fixes a few bugs
that the new testing turned up.
Related issue: #50217
Automatic merge from submit-queue (batch tested with PRs 52097, 52054)
Move paused deployment e2e tests to integration
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52113
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
StatefulSet: Deflake e2e RunHostCmd.
The initial retry up to 20s was giving up too soon. I'm seeing this test flake because the Node rebooted and it takes ~2min to recover. Now StatefulSet RunHostCmd calls will use the same 5min timeout as with other Pod state checks.
ref #48031
Automatic merge from submit-queue (batch tested with PRs 51728, 49202)
Enable CRI-O stats from cAdvisor
**What this PR does / why we need it**:
cAdvisor may support multiple container runtimes (docker, rkt, cri-o, systemd, etc.)
As long as the kubelet continues to run cAdvisor, runtimes with native cAdvisor support may not want to run multiple monitoring agents to avoid performance regression in production. Pending kubelet running a more light-weight monitoring solution, this PR allows remote runtimes to have their stats pulled from cAdvisor when cAdvisor is registered stats provider by introspection of the runtime endpoint.
See issue https://github.com/kubernetes/kubernetes/issues/51798
**Special notes for your reviewer**:
cAdvisor will be bumped to pick up https://github.com/google/cadvisor/pull/1741
At that time, CRI-O will support fetching stats from cAdvisor.
**Release note**:
```release-note
NONE
```
The initial retry up to 20s was giving up too soon.
I'm seeing this test flake because the Node rebooted and it takes ~2min
to recover.
Now StatefulSet RunHostCmd calls will use the same 5min timeout as with
other Pod state checks.
Automatic merge from submit-queue (batch tested with PRs 51956, 50708)
Move autoscaling/v2 from alpha1 to beta1
This graduates autoscaling/v2alpha1 to autoscaling/v2beta1. The move is more-or-less just a straightforward rename.
Part of kubernetes/features#117
```release-note
v2 of the autoscaling API group, including improvements to the HorizontalPodAutoscaler, has moved from alpha1 to beta1.
```
Automatic merge from submit-queue (batch tested with PRs 51839, 51987)
Disable rbac/v1alpha1, settings/v1alpha1, and scheduling/v1alpha1 by default
**What this PR does / why we need it**: Disables alpha features which were previously enabled by default. Also changes tests which relied on these alpha features being enabled by default.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47691
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixed a bug where some alpha features were enabled by default.
The feature is still Alpha and at times, the IP address previously used
by the load balancer in the test will not completely freed even after
the load balancer is long gone. In this case, the test URL with the IP
would return a 404 response. Tolerate this error and retry until the new
load balancer is fully established.
Charge object count when object is created, no matter if the object is
initialized or not.
Charge the remaining quota when the object is initialized.
Also, checking initializer.Pending and initializer.Result when
determining if an object is initialized. We didn't need to check them
because before 51082, having 0 pending initializer and nil
initializers.Result is invalid.
Automatic merge from submit-queue (batch tested with PRs 51733, 51838)
Decouple kube-proxy upgrade/downgrade tests from upgradeTests
**What this PR does / why we need it**:
Fixes the failing kube-proxy migration CI jobs:
- https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds
- https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51729
**Special notes for your reviewer**:
/assign @krousey @nicksardo
Could you please take a look post code-freeze (I believe it is fixing things)? Thanks!
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51733, 51838)
Relax update validation of uninitialized pod
Split from https://github.com/kubernetes/kubernetes/pull/50344
Fix https://github.com/kubernetes/kubernetes/issues/47837
* Let the podStrategy to only call `validation.ValidatePod()` if the old pod is not initialized, so fields are mutable.
* Let the podStatusStrategy refuse updates if the old pod is not initialized.
cc @smarterclayton
```release-note
Pod spec is mutable when the pod is uninitialized. The apiserver requires the pod spec to be valid even if it's uninitialized. Updating the status field of uninitialized pods is invalid.
```
Automatic merge from submit-queue (batch tested with PRs 51984, 51351, 51873, 51795, 51634)
Revert to using isolated PID namespaces in Docker
**What this PR does / why we need it**: Reverts to the previous docker default of using isolated PID namespaces for containers in a pod. There exist container images that expect always to be PID 1 which we want to support unmodified in 1.8.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48937
**Special notes for your reviewer**:
**Release note**:
```release-note
Sharing a PID namespace between containers in a pod is disabled by default in 1.8. To enable for a node, use the --docker-disable-shared-pid=false kubelet flag. Note that PID namespace sharing requires docker >= 1.13.1.
```
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)
Enabling aggregator functionality on kubemark, gce
Enabling full functionality aggregator functionality in kubemark tests.
This includes configuring it to work in gce (we seem to assume gce in our kubemark tests)
It also includes setting up the relevant security and auth config.
**What this PR does / why we need it**: Configure aggregator properly on kubemark tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48428
**Special notes for your reviewer**:
**Release note**:
```release-note NONE
```
Enabling full functionality aggregator functionality in kubemark tests.
This includes configuring it to work in gce (we seem to assume gce in our kubemark tests)
It also includes setting up the relevant security and auth config.
Removing unneeded reference to CA key for MHBauer.
Fixed to pull the "parsed" values for the certs.
Fix from shyamjvs.
Automatic merge from submit-queue (batch tested with PRs 51915, 51294, 51562, 51911)
Remove OutOfDisk from controllers
This is one of the working items for #48843 for 1.8.
This changes the scheduler and daemonset controllers to no longer respect the OutOfDisk condition. The kubelet has not published OutOfDisk=True since 1.5.
This still preserves the Toleration for the OutOfDisk condition, as (I think?) this is required for backwards compatibility. I added TODOs to remove this in 1.10.
Automatic merge from submit-queue (batch tested with PRs 51833, 51936)
Changed volume IO e2e test to verify file hash instead of content.
**What this PR does / why we need it**: The existing way of verifying file content takes too much memory, causing processes to be OOM killed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/51717
**Release note**:
```release-note
NONE
```
/sig storage
/release-note-none
/assign @jeffvance @rootfs
/cc @msau42
Automatic merge from submit-queue (batch tested with PRs 51845, 51868, 51864)
Update sys spec to support docker 1.11-1.13 and overlay2.
Fixes https://github.com/kubernetes/kubernetes/issues/32536.
Update docker spec to:
1) Support overlay2;
2) Support docker version 1.11-1.13.
@dchen1107 @yguo0905 @luxas
/cc @kubernetes/sig-node-pr-reviews
```release-note
Kubernetes 1.8 supports docker version 1.11.x, 1.12.x and 1.13.x. And also supports overlay2.
```
Automatic merge from submit-queue
Enable batch/v1beta1.CronJobs by default
This PR re-applies the cronjobs->beta back (https://github.com/kubernetes/kubernetes/pull/51720) with the fix from @shyamjvs.
Fixes#51692
@apelisse @dchen1107 @smarterclayton ptal
@janetkuo @erictune fyi
Automatic merge from submit-queue
Remove deprecated init-container in annotations
fixes#50655fixes#51816closes#41004fixes#51816
Builds on #50654 and drops the initContainer annotations on conversion to prevent bypassing API server validation/security and targeting version-skewed kubelets that still honor the annotations
```release-note
The deprecated alpha and beta initContainer annotations are no longer supported. Init containers must be specified using the initContainers field in the pod spec.
```
Automatic merge from submit-queue
Job failure policy controller support
**What this PR does / why we need it**:
Start implementing the support of the "Backoff policy and failed pod limit" in the ```JobController``` defined in https://github.com/kubernetes/community/pull/583.
This PR depends on a previous PR #48075 that updates the K8s API types.
TODO:
* [X] Implement ```JobSpec.BackoffLimit``` support
* [x] Rebase when #48075 has been merged.
* [X] Implement end2end tests
implements https://github.com/kubernetes/community/pull/583
**Special notes for your reviewer**:
**Release note**:
```release-note
Add backoff policy and failed pod limit for a job
```
Automatic merge from submit-queue (batch tested with PRs 51805, 51725, 50925, 51474, 51638)
Flexvolume dynamic plugin discovery: Prober unit tests and basic e2e test.
**What this PR does / why we need it**: Tests for changes introduced in PR #50031 .
As part of the prober unit test, I mocked filesystem, filesystem watch, and Flexvolume plugin initialization.
Moved the filesystem event goroutine to watcher implementation.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51147
**Special notes for your reviewer**:
First commit contains added functionality of the mock filesystem.
Second commit is the refactor for moving mock filesystem into a common util directory.
Third commit is the unit and e2e tests.
**Release note**:
```release-note
NONE
```
/release-note-none
/sig storage
/assign @saad-ali @liggitt
/cc @mtaufen @chakri-nelluri @wongma7
Automatic merge from submit-queue
bump QEMU version to v2.9.1
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
xref #38067
**Special notes for your reviewer**:
/assign @luxas
**Release note**:
```release-note
update QEMU version to v2.9.1
```
Job failure policy integration in JobController. From the
JobSpec.BackoffLimit the JobController will define the backoff
duration between Job retry.
It use the ```workqueue.RateLimitingInterface``` to store the number of
"retry" as "requeue" and the default Job backoff initial duration is set
during the initialization of the ```workqueue.RateLimiter.
Since the number of retry for each job is store in a local structure
"JobController.queue" if the JobController restarts the number of retries
will be lost and the backoff duration will be reset to 0.
Add e2e test for Job backoff failure policy
Automatic merge from submit-queue (batch tested with PRs 50602, 51561, 51703, 51748, 49142)
Use arm32v7|arm64v8 images instead of the deprecated armhf|aarch64 image organizations
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50601
**Special notes for your reviewer**:
/assign @ixdy @jbeda @zmerlynn
**Release note**:
```release-note
Use arm32v7|arm64v8 images instead of the deprecated armhf|aarch64 image organizations
```
Automatic merge from submit-queue (batch tested with PRs 51583, 51283, 51374, 51690, 51716)
Unify initializer name validation
Unify the validation rules on initializer names. Fix https://github.com/kubernetes/kubernetes/issues/51843.
```release-note
Action required: validation rule on metadata.initializers.pending[x].name is tightened. The initializer name needs to contain at least three segments separated by dots. If you create objects with pending initializers, (i.e., not relying on apiserver adding pending initializers according to initializerconfiguration), you need to update the initializer name in existing objects and in configuration files to comply to the new validation rule.
```
Automatic merge from submit-queue (batch tested with PRs 50832, 51119, 51636, 48921, 51712)
add reconcile command to kubectl auth
This pull exposes the RBAC reconcile commands through `kubectl auth reconcile -f FILE`. When passed a file which contains RBAC roles, rolebindings, clusterroles, or clusterrolebindings, it will compute covers and add the missing rules.
The logic required to properly "apply" rbac permissions is more complicated that a json merge since you have to compute logical covers operations between rule sets. This means that we cannot use `kubectl apply` to update rbac roles without risking breaking old clients (like controllers).
To solve this problem, RBAC created reconcile functions to use during startup for "stock" roles. We want to offer this power to users who are running their own controllers and extension servers.
This is an intersection between @kubernetes/sig-auth-misc and @kubernetes/sig-cli-misc
Automatic merge from submit-queue (batch tested with PRs 51335, 51364, 51130, 48075, 50920)
[API] Feature/job failure policy
**What this PR does / why we need it**: Implements the Backoff policy and failed pod limit defined in https://github.com/kubernetes/community/pull/583
**Which issue this PR fixes**:
fixes#27997, fixes#30243
**Special notes for your reviewer**:
This is a WIP PR, I updated the api batchv1.JobSpec in order to prepare the backoff policy implementation in the JobController.
**Release note**:
```release-note
Add backoff policy and failed pod limit for a job
```
Automatic merge from submit-queue
Use the right image for the right platform in the e2e tests
**What this PR does / why we need it**:
This PR is for enabling kubernetes tests for multi architecture platform
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#38067
**Special notes for your reviewer**:
This will enable conformance tests for all the supported architectures.
**Release note**:
```release-note
Make all e2e tests lookup image to use from a centralized place. In that centralized place, add support for multiple platforms.
```
x-ref #38067
Automatic merge from submit-queue (batch tested with PRs 45724, 48051, 46444, 51056, 51605)
Add selfsubjectrulesreview in authorization
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#47834#31292
**Special notes for your reviewer**:
**Release note**:
```release-note
Add selfsubjectrulesreview API for allowing users to query which permissions they have in a given namespace.
```
/cc @deads2k @liggitt
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)
Address TestEtcdStoragePath flakes
- Wait for the master to be healthy
- Wait longer for the master to start
- Fail gracefully if starting the master panics
Signed-off-by: Monis Khan <mkhan@redhat.com>
```release-note
NONE
```
Fixes#49423
@kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 50381, 51307, 49645, 50995, 51523)
Remove deprecated and experimental fields from KubeletConfiguration
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP (maybe not v1 in 1.8, given how close we are, but definitely in 1.9).
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
fixes: #51657
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51632, 51055, 51676, 51560, 50007)
test/e2e/auth: fix audit log test format parsing
Fixes https://github.com/kubernetes/kubernetes/issues/51556
```release-note
NONE
```
cc @CaoShuFeng
Still need to figure out how to run this test locally.
Automatic merge from submit-queue (batch tested with PRs 51628, 51637, 51490, 51279, 51302)
Fix pod local ephemeral storage usage calculation
We use podDiskUsage to calculate pod local ephemeral storage which is not correct, because podDiskUsage also contains HostPath volume which is considered as persistent storage
This pr fixes it
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#51489
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @jingxu97 @vishh
cc @ddysher
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
Add PVCRef to VolumeStats
**What this PR does / why we need it**:
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
**Which issue this PR fixes** : [#363](https://github.com/kubernetes/features/issues/363)
**Special notes for your reviewer**:
**Release note**:
```
`VolumeStats` reported by the kubelet stats summary API
(http://<node>:10255/stats/summary) now include a PVCRef
field describing the PVC referenced by the volume (if any).
```
Automatic merge from submit-queue (batch tested with PRs 51513, 51515, 50570, 51482, 51448)
Removes redundant prefix in cluster-lifecycle e2e test names
**What this PR does / why we need it**:
Removes redundant prefix in cluster-lifecycle e2e test names
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
xref: #50054
**Special notes for your reviewer**:
/cc @jbeda
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50719, 51216, 50212, 51408, 51381)
Make selector immutable for v1beta2 deployment, replicaset and daemonset prior update
**What this PR does / why we need it**:
This PR ensures controller selector is immutable for deployment and replicaset prior update by ignoring any change to `Spec`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50808
**Special notes for your reviewer**:
This will be a breaking change.
**Release note**:
```release-note
For Deployment, ReplicaSet, and DaemonSet, selectors are now immutable when updating via the new `apps/v1beta2` API. For backward compatibility, selectors can still be changed when updating via `apps/v1beta1` or `extensions/v1beta1`.
```
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
update GC controller to wait until controllers have been initialized …
fixes#51013
Alternative to https://github.com/kubernetes/kubernetes/pull/51492 which keeps those few controllers (only one) from starting the informers early.
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
Change SizeLimit to a pointer
This PR fixes issue #50121
```release-note
The `emptyDir.sizeLimit` field is now correctly omitted from API requests and responses when unset.
```
Automatic merge from submit-queue (batch tested with PRs 51707, 51662, 51723, 50163, 51633)
Adding vishh to test/ reviewers and approvers
Rationale: Reviewing/Shepherding lots of features/PRs around node and resource management.
Automatic merge from submit-queue (batch tested with PRs 50775, 51397, 51168, 51465, 51536)
Enable batch/v1beta1.CronJobs by default
This PR moves to CronJobs beta entirely, enabling `batch/v1beta1` by default.
Related issue: #41039
@erictune @janetkuo ptal
```release-note
Promote CronJobs to batch/v1beta1.
```
Automatic merge from submit-queue (batch tested with PRs 50775, 51397, 51168, 51465, 51536)
Allow bearer requests to be proxied by kubectl proxy
Use a fake transport to capture changes to the request and then surface
them back to the end user.
Fixes#50466
@liggitt no tests yet, but works locally
As we work towards providing a stable (v1) kubeletconfig API,
we cannot afford to have deprecated or "experimental" (alpha) fields
living in the KubeletConfiguration struct. This removes all existing
experimental or deprecated fields, and places them in KubeletFlags
instead.
I'm going to send another PR after this one that organizes the remaining
fields into substructures for readability. Then, we should try to move
to v1 ASAP.
It makes far more sense to focus on a clean API in kubeletconfig v2,
than to try and further clean up the existing "API" that everyone
already depends on.
Automatic merge from submit-queue
e2e: Add tests for network tiers in GCE
This test depends on #51301, which adds the new feature. Only the `e2e: Add tests for network tiers in GCE` commit is new.
#51301 should pass this new test.
Automatic merge from submit-queue (batch tested with PRs 51439, 51361, 51140, 51539, 51585)
Enable alpha GCE disk API
This PR builds on top of #50467 to allow the GCE disk API to use either the alpha or stable APIs.
CC @freehan
Automatic merge from submit-queue (batch tested with PRs 47054, 50398, 51541, 51535, 51545)
Switch away from gcloud deprecated flags in compute resource listings
**What is fixed**
Remove deprecated `gcloud compute` flags, see linked issue.
**Which issue this PR fixes**:
fixes#49673
**Special notes for your reviewer**:
The change in `gcloudComputeResourceList` in `test/e2e/framework/ingress_utils.go` isn't strictly needed as currently no affected resources are called on within that file, however the function has the _potential_ to access affected resources so I covered it as well. Happy to change if deemed unnecessary.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51228, 50185, 50940, 51544, 51543)
Add upgrades tests for kube-proxy daemonset migration path
**What this PR does / why we need it**:
From #23225, this is a part of setting up CIs to validate the kube-proxy migration path (static pods -> daemonset and reverse).
The other part of the works (adding real CIs that run these tests) will be in a separate PR against [kubernetes/test-infra](https://github.com/kubernetes/test-infra).
Though this is currently blocked by #50705.
**Special notes for your reviewer**:
cc @roberthbailey @pwittrock
**Release note**:
```release-note
NONE
```
For pod volumes that reference a PVC, add a PVCRef to the corresponding
volume stat. This allows metrics to be indexed/queried by PVC name
which is more user-friendly than Pod reference
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)
Adding e2e SELinux test for local storage
Adding e2e test for SELinux enabled local storage
/sig storage
Closes#45054
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)
Use the pre-built docker binaries on Ubuntu for benchmark tests
- Tested manually.
- The `ubuntu-init-docker.yaml` is copied from `cos-init-docker.yaml` with the following changes needed by Ubuntu. This change is temporary -- we will remove the script and the tests once we know the performance of using the pre-built Docker 1.12 on Ubuntu.
```
71,72c71,72
< mount --bind "${install_location}"/docker-containerd /usr/bin/docker-containerd
< mount --bind "${install_location}"/docker-containerd-shim /usr/bin/docker-containerd-shim
---
> mount --bind "${install_location}"/docker-containerd /usr/bin/containerd
> mount --bind "${install_location}"/docker-containerd-shim /usr/bin/containerd-shim
75c75
< mount --bind "${install_location}"/docker-runc /usr/bin/docker-runc
---
> mount --bind "${install_location}"/docker-runc /usr/sbin/runc
88c88
< local requested_version="$(get_metadata "gci-docker-version")"
---
> local requested_version="$(get_metadata "ubuntu-docker-version")"
93,98d92
< # Check if we have the requested version installed.
< if check_installed /usr/bin/docker "${requested_version}"; then
< echo "Requested version already installed. Exiting."
< exit 0
< fi
<
100c94
< /usr/bin/systemctl stop docker
---
> systemctl stop docker
106c100
< /usr/bin/systemctl start docker && exit $rc
---
> systemctl start docker && exit $rc
```
- Updated all tests to use the latest Ubuntu image.
**Release note**:
```
None
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 49961, 50005, 50738, 51045, 49927)
Add cluster e2es to verify scheduler local storage support
Add cluster e2es to verify scheduler local storage support and remove some unused private functions
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
part of #50818
**Release note**:
```release-note
Add cluster e2es to verify scheduler local ephemeral storage support
```
/assign @jingxu97
/cc @ddysher
Automatic merge from submit-queue (batch tested with PRs 44719, 48454)
check job ActiveDeadlineSeconds
**What this PR does / why we need it**:
enqueue a sync task after ActiveDeadlineSeconds
**Which issue this PR fixes** *:
fixes#32149
**Special notes for your reviewer**:
**Release note**:
```release-note
enqueue a sync task to wake up jobcontroller to check job ActiveDeadlineSeconds in time
```
Automatic merge from submit-queue
Added an end-to-end test ensuring that Cluster Autoscaler does not scale up when all pending pods are unschedulable
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50919, 51410, 50099, 51300, 50296)
GCE: Read networkProjectID param
Fixes#48515
/assign bowei
The first commit is the original PR cherrypicked. The master's kubelet isn't provided a cloud config path, so the project is retrieved via instance metadata. In the GKE case, this project cannot be retrieved by the master and caused an error.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51471, 50561, 50435, 51473, 51436)
Feature gate initializers field
The metadata.initializers field should be feature gated and disabled by default while in alpha, especially since enforcement of initializer permission that keeps users from submitting objects with their own initializers specified is done via an admission plugin most clusters do not enable yet.
Not gating the field and tests caused tests added in https://github.com/kubernetes/kubernetes/issues/51429 to fail on clusters that don't enable the admission plugin.
This PR:
* adds an `Initializers` feature gate, auto-enables the feature gate if the admission plugin is enabled
* clears the `metadata.initializers` field of objects on create/update if the feature gate is not set
* marks the e2e tests as feature-dependent (will follow up with PR to test-infra to enable the feature and opt in for GCE e2e tests)
```release-note
Use of the alpha initializers feature now requires enabling the `Initializers` feature gate. This feature gate is auto-enabled if the `Initialzers` admission plugin is enabled.
```
Automatic merge from submit-queue
Moved node condition filter into a predicates.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50360
**Release note**:
```release-note
A new predicates, named 'CheckNodeCondition', was added to replace node condition filter. 'NetworkUnavailable', 'OutOfDisk' and 'NotReady' maybe reported as a reason when failed to schedule pods.
```
Automatic merge from submit-queue (batch tested with PRs 50953, 51082)
Fix mergekey of initializers; Repair invalid update of initializers
Fix https://github.com/kubernetes/kubernetes/issues/51131
The PR did two things to make parallel patching `metadata.initializers.pending` possible:
* Add mergekey to initializers.pending
* Let the initializer admission plugin set the `metadata.intializers` to nil if an update makes the `pending` and the `result` both nil, instead of returning a validation error. Otherwise if multiple initializer controllers sending the patch removing themselves from `pending` at the same time, one of them will get a validation error.
```release-note
The patch to remove the last initializer from metadata.initializer.pending will result in metadata.initializer to be set to nil (assuming metadata.initializer.result is also nil), instead of resulting in an validation error.
```
Automatic merge from submit-queue
Fix forbidden message format
Before this change:
$ kubectl get pods --as=tom
Error from server (Forbidden): pods "" is forbidden: User "tom" cannot list pods in the namespace "default".
After this change:
$ kubectl get pods --as=tom
Error from server (Forbidden): pods is forbidden: User "tom" cannot list pods in the namespace "default".
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```
Fix forbidden message format, remove extra ""
```
Automatic merge from submit-queue
Let the quota evaluator handle mutating specs of pod & pvc
### Background
The final goal is to address https://github.com/kubernetes/kubernetes/issues/47837, which aims to allow more mutation for uninitialized objects.
To do that, we [decided](https://github.com/kubernetes/kubernetes/issues/47837#issuecomment-321462433) to let the admission controllers to handle mutation of uninitialized objects.
### Issue
#50399 attempted to fix all admission controllers so that can handle mutating uninitialized objects. It was incomplete. I didn't realize although the resourcequota admission plugin handles the update operation, the underlying evaluator didn't. This PR updated the evaluators to handle updates of uninitialized pods/pvc.
### TODO
We still miss another piece. The [quota replenish controller](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/resourcequota/replenishment_controller.go) uses the sharedinformer, which doesn't observe the deletion of uninitialized pods at the moment. So there is a quota leak if a pod is deleted before it's initialized. It will be addressed with https://github.com/kubernetes/kubernetes/issues/48893.
Automatic merge from submit-queue
Make coreos test images sshd not allow password login.
This will prevent security scanners from triggering.
Configuration is verbatim from:
https://coreos.com/os/docs/latest/customizing-sshd.html
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51054, 51101, 50031, 51296, 51173)
Dynamic Flexvolume plugin discovery, probing with filesystem watch.
**What this PR does / why we need it**: Enables dynamic Flexvolume plugin discovery. This model uses a filesystem watch (fsnotify library), which notifies the system that a probe is necessary only if something changes in the Flexvolume plugin directory.
This PR uses the dependency injection model in https://github.com/kubernetes/kubernetes/pull/49668.
**Release Note**:
```release-note
Dynamic Flexvolume plugin discovery. Flexvolume plugins can now be discovered on the fly rather than only at system initialization time.
```
/sig-storage
/assign @jsafrane @saad-ali
/cc @bassam @chakri-nelluri @kokhang @liggitt @thockin
Automatic merge from submit-queue (batch tested with PRs 50889, 51347, 50582, 51297, 51264)
Change eviction manager to manage one single local storage resource
**What this PR does / why we need it**:
We decided to manage one single resource name, eviction policy should be modified too.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50818
**Special notes for your reviewer**:
**Release note**:
```release-note
Change eviction manager to manage one single local ephemeral storage resource
```
/assign @jingxu97
Before this change:
# kubectl get pods --as=tom
Error from server (Forbidden): pods "" is forbidden: User "tom" cannot list pods in the namespace "default".
After this change:
# kubectl get pods --as=tom
Error from server (Forbidden): pods is forbidden: User "tom" cannot list pods in the namespace "default".
Automatic merge from submit-queue
Fixed gke auth update wait condition.
Lookup whoami on gke using gcloud auth list.
Make sure we do not run the test on any cluster older than 1.7.
**What this PR does / why we need it**: Fixes issue with aggregator e2e test on GKE
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50945
**Special notes for your reviewer**: There is a TODO, follow up will be provided when the immediate problem is resolved.
**Release note**: ```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51134, 51122, 50562, 50971, 51327)
Made the tests ensure that Cluster Autoscaler is on before running.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Configuration is based on:
https://coreos.com/os/docs/latest/customizing-sshd.html
The specific SSHD config is:
# Use most defaults for sshd configuration.
UsePrivilegeSeparation sandbox
Subsystem sftp internal-sftp
ClientAliveInterval 180
UseDNS no
UsePAM yes
PrintLastLog no # handled by PAM
PrintMotd no # handled by PAM
AuthenticationMethods publickey
This will prevent security scanners from triggering.
Automatic merge from submit-queue
AllowedNotReadyNodes allowed to be not ready for absolutely *any* reason
It's as good as we allow those many nodes to be not part of the cluster at all, ever.
Btw - currently our 5k-node correctness test fails if "kubelet stopped posting node status" or "route not created", etc (ref: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-correctness/3/build-log.txt)
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue (batch tested with PRs 51244, 50559, 49770, 51194, 50901)
Distribute pods efficiently in CA scalability tests
**What this PR does / why we need it**:
Instead of using runReplicatedPodOnEachNode method
which is suited to a small number of nodes,
distribute pods on the nodes with desired load
using RCs that eat up all the space we want to be
empty after distribution.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50213, 50707, 49502, 51230, 50848)
StatefulSet: Deflake e2e `kubectl exec` commands.
This may help with another source of flakiness found while investigating #48031.
We seem to get a lot of flakes due to "connection refused" while running `kubectl exec`. I can't find any reason this would be caused by the test flow, so I'm adding retries to see if that helps.
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
Enable overlay2 on cos-m60 in node e2e tests
Ref: https://github.com/kubernetes/kubernetes/issues/42926
- Restart docker with `-s overlay2` in cloud-init before running all node e2e tests. I have to copy the systemd unit file to `/etc/systemd/system` because the `/usr/lib/systemd/system/` is read only.
- Updated node e2e tests to use the new cos-m60 image.
- The name of the cloud init file (`cos-init-live-restore.yaml`) does not indicate overlay2 will be enabled, but I can't just change the name in this PR, since it's referenced in test-infra.
**Release note**:
```
None
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)
StatefulSet: Deflake e2e "restart" phase.
This addresses another source of flakiness found while investigating #48031.
The test used to scale the StatefulSet down to 0, wait for ListPods to return 0 matching Pods, and then scale the StatefulSet back up.
This was prone to a race in which StatefulSet was told to scale back up before it had observed its own deletion of the last Pod, as evidenced by logs showing the creation of Pod ss-1 prior to the creation of the replacement Pod ss-0.
Instead, we now wait for the controller to observe all deletions before scaling it back up. This should fix flakes of the form:
```
Too many pods scheduled, expected 1 got 2
```
We seem to get a lot of flakes due to "connection refused" while running
`kubectl exec`. I can't find any reason this would be caused by the test
flow, so I'm adding retries to see if that helps.
Instead of using runReplicatedPodOnEachNode method
which is suited to a small number of nodes,
distribute pods on the nodes with desired load
using RCs that eat up all the space we want to be
empty after distribution.
Automatic merge from submit-queue (batch tested with PRs 51193, 51154, 42689, 51189, 51200)
Re-enable OIR e2e tests.
Re-enabling test skeleton for opaque integer resources originally submitted as part of #41870. The e2e was disabled since it was flaky. This is the first step toward re-enabling them. Currently all cases are skipped, so this exercises only the BeforeEach behavior and the deferred removal of OIRs from a node.
cc @timothysc
Automatic merge from submit-queue (batch tested with PRs 51108, 51035, 50539, 51160, 50947)
Auto-calculate CLUSTER_IP_RANGE based on cluster size
In preparation for eliminating CLUSTER_IP_RANGE env var from job configs, making it less error prone while folks try to start their own large cluster tests (https://github.com/kubernetes/kubernetes/issues/50907).
/cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)
Add statefulset upgrade tests to cluster_upgrade
**What this PR does / why we need it**:
Adds already created statefulset upgrade tests to cluster_upgrade.go. With further test infra changes, this will allow them to be continuously run, giving better signals.
Detect and prevent issues like https://github.com/kubernetes/kubernetes/issues/48327
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)
implement proposal 34058: hostPath volume type
**What this PR does / why we need it**:
implement proposal #34058
**Which issue this PR fixes** : fixes#46549
**Special notes for your reviewer**:
cc @thockin @luxas @euank PTAL
Automatic merge from submit-queue (batch tested with PRs 50489, 51070, 51011, 51022, 51141)
update to rbac v1 in yaml file
**What this PR does / why we need it**:
ref to https://github.com/kubernetes/kubernetes/pull/49642
ref https://github.com/kubernetes/features/issues/2
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
cc @liggitt
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Skip "Simple pod should support exec through kubectl proxy" test
As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because it uses a bearer token and the feature only works with client certs.
As the feature that is broken in GKE is new and didn't work before, it
is safe to juste ignore the test and consider the feature as "still not
working" in GKE.
**What this PR does / why we need it**: Fixes the broken test in https://k8s-testgrid.appspot.com/release-master-blocking#gke
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: works-around #50466
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
The test used to scale the StatefulSet down to 0, wait for ListPods to
return 0 matching Pods, and then scale the StatefulSet back up.
This was prone to a race in which StatefulSet was told to scale back up
before it had observed its own deletion of the last Pod, as evidenced by
logs showing the creation of Pod ss-1 prior to the creation of the
replacement Pod ss-0.
We now wait for the controller to observe all deletions before
scaling it back up. This should fix flakes of the form:
```
Too many pods scheduled, expected 1 got 2
```
Automatic merge from submit-queue (batch tested with PRs 50257, 50247, 50665, 50554, 51077)
Replace hard-code "cpu" and "memory" to consts
**What this PR does / why we need it**:
There are many places using hard coded "cpu" and "memory" as resource name. This PR replace them to consts.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
/kind cleanup
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50980, 46902, 51051, 51062, 51020)
Remove seemingly obsolete binaries
It's hard to tell if these are safe to remove. Let CI tell me.
Automatic merge from submit-queue (batch tested with PRs 51039, 50512, 50546, 50965, 50467)
Kubectl: Plumb openapi validation (disabled by default)
**What this PR does / why we need it**: Creates a new flag '--openapi' and plumb in the validation code so that it can be used by default to validate objects against the openapi schema.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: partially https://github.com/kubernetes/kubectl/issues/49
**Special notes for your reviewer**:
This is not complete, the name of the variable must change for example.
**Release note**:
```release-note
Kubectl uses openapi for validation. If OpenAPI is not available on the server, it defaults back to the old Swagger.
```
Automatic merge from submit-queue
StatefulSet: Deflake e2e "Saturate" phase.
This should reduce one source of flakiness found while investigating #48031.
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup by controlling when each Pod is allowed to report Ready. If a Pod unexepectedly goes down during the test, the replacement Pod
created by the controller will forget if it was already allowed to report Ready.
After this change, the signal that allows each Pod to report Ready is persisted in the Pod's PVC. Thus, the replacement Pod will remember that it was already told to proceed to a Ready state.
Automatic merge from submit-queue (batch tested with PRs 51102, 50712, 51037, 51044, 51059)
[sig-network-e2e] Remove redundant sig prefix from tests
**What this PR does / why we need it**:
Remove redundant sig prefix from:
```
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: http [Slow]
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: udp [Slow]
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should conform to Ingress spec
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should create ingress with given static-ip
```
Umbrella issue #49161
**Special notes for your reviewer**:
cc @xiangpengzhao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
Fix GC integration test race
During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.
```release-note
NONE
```
Might fix https://github.com/kubernetes/kubernetes/issues/50943; without the additional test context provided by this PR, it's not entirely possible to assess the root cause of the reported failure (as we don't know whether the original assertion failure was due to there being 0 or >1 pods).
/cc @caesarxuchao
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)
Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"
I'm spinning up a cluster right now to test this fix, but I'm pretty sure this was the problem.
There doesn't seem to be a way to confirm from logs, because AFAICT the logs from the hollow kubelet containers are not collected as part of the kubemark test.
**What this PR does / why we need it**:
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
**Which issue this PR fixes**: fixes#51007
**Release note**:
```release-note
NONE
```
/cc @shyamjvs @wojtek-t
As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because the transport layer doesn't work
with dialing.
As the feature that is broken in GKE is new and didn't work before, it
is safe to juste ignore the test and consider the feature as "still not
working" in GKE.
Automatic merge from submit-queue
Should generate files before scheduler perf
**What this PR does / why we need it**:
For a newly cloned project, generated files are not included. Then scheduler_perf will fail:
```
undefined: openapi.GetOpenAPIDefinitions
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
fixes: #51090
**Special notes for your reviewer**:
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)
Increase latency threshold for list api calls
This is only a short-term solution to make our density test green. In the long-term, we should measure as per our new SLIs.
From @wojtek-t's [doc](https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw) on the new SLIs/SLOs, we have the following SLO for list calls:
```
SLO1: In default Kubernetes installation, 99th percentile of SLI2 per cluster-day:
<= 1s if total number of objects of the same type as resource in the system <= X
<= 5s if total number of objects of the same type as resource in the system <= Y
<= 30s if total number of objects of the same types as resource in the system <= Z
```
I would guess that 170,000 pods would fall into the 2nd bracket (at least) and hence the new value of 5s. WDYT?
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Revert #50362.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50884
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 50693, 50831, 47506, 49119, 50871)
Add instance metadata from flag even when using image config.
Also add instance metadata from flag even when we are using image config.
* Sometimes we need to dynamically generate instance metadata, it's troublesome to put them into image config.
* Sometimes we want to apply instance metadata to all images, it's duplicated to add them to each image in the image config.
/assign @yguo0905 Could you help me review this?
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup
by controlling when each Pod is allowed to report Ready.
If a Pod unexepectedly goes down during the test, the replacement Pod
created by the controller will forget if it was already allowed to
report Ready.
After this change, the signal that allows each Pod to report Ready is
persisted in the Pod's PVC. Thus, the replacement Pod will remember that
it was already told to proceed to a Ready state.
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.
This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
Automatic merge from submit-queue (batch tested with PRs 47896, 50678, 50620, 50631, 51005)
Made the difference between scale-up timeout and cluster set-up timeout explicit.
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.
- Wait for the master to be healthy
- Wait longer for the master to start
- Fail gracefully if starting the master panics
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 46512, 50146)
Make metav1.(Micro)?Time functions take pointers
Is there any reason for those functions not to be on pointers?
Automatic merge from submit-queue (batch tested with PRs 50904, 50691)
Stackdriver Logging e2e: Explicitly check for docker and kubelet logs presence
Check for kubelet and docker logs explicitly in the Stackdriver Logging e2e tests
Automatic merge from submit-queue
Add enj to OWNERS for test/integration/etcd/etcd_storage_path_test.go
@deads2k is the bot smart enough to not spam me with every test change? Perhaps I should create an `OWNERS` file in `test/integration/etcd`?
**Release note**:
```release-note
NONE
```
@kubernetes/sig-api-machinery-pr-reviews
Automatic merge from submit-queue
CollisionCount should have type int32 across controllers that use it for collision avoidance
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50530
**Special notes for your reviewer**:
/cc @liyinan926
/assign @kow3ns @thockin @janetkuo
**Release note**:
```release-note
Change CollisionCount from int64 to int32 across controllers
```
Automatic merge from submit-queue (batch tested with PRs 50277, 50823, 50376, 50867)
Move e2e taints test file to sig-scheduling
**What this PR does / why we need it**:
Move taint test file to e2e scheduling and add sig-scheduling prefix.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
none
Automatic merge from submit-queue
Change API version of statefulset scale subresource e2e test to v1beta2
**What this PR does / why we need it**:
This PR changes API version of statefulset scale subresource e2e test from `v1beta1` to `v1beta2`.
`apps/v1beta2` has been enabled.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #50109
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50281, 50747, 50347, 50834, 50852)
Add e2e aggregator test.
What this PR does / why we need it:
This adds an e2e test for aggregation based on the sample-apiserver.
Currently is uses a sample-apiserver built as of 1.7.
This should ensure that the aggregation system works end-to-end.
It will also help detect if we break "old" extension api servers.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes
fixes#43714
**Special notes for your reviewer**:
**Release note**: NONE
Automatic merge from submit-queue (batch tested with PRs 50563, 50698, 50796)
Add ControllerRevision to apps/v1beta2
**What this PR does / why we need it**:
This PR added `ControllerRevision` currently in `apps/v1beta1` to `apps/v1beta2`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50696.
**Special notes for your reviewer**:
@kow3ns @janetkuo
**Release note**:
```release-note
Add ControllerRevision to apps/v1beta2
```
What this PR does / why we need it:
This adds an e2e test for aggregation based on the sample-apiserver.
Currently is uses a sample-apiserver built as of 1.7.
This should ensure that the aggregation system works end-to-end.
It will also help detect if we break "old" extension api servers.
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes
fixes#43714
Fixed bazel for the change.
Fixed # of args issue from govet.
Added code to test dynamic.Client.
Automatic merge from submit-queue
Migrate sig-apimachinery and sig-servicecatalog e2e tests
**What this PR does / why we need it**:
Migrate sig-apimachinery and sig-servicecatalog e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Ref Umbrella issue #49161
1. Move generated_clientset.go to sig-apimachinary
2. Move podpreset.go to sig-servicecatalog by creating new directory.
**Special notes for your reviewer**:
**Release note**:
none
/cc @liggitt
Automatic merge from submit-queue (batch tested with PRs 50550, 50768)
Don't SSH to master for metrics in case of GKE
cc @kubernetes/sig-scalability-misc @crassirostris
Automatic merge from submit-queue
add some e2e for node authz
**What this PR does / why we need it**:
fix#47174
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47174
**Special notes for your reviewer**:
**Release note**:
```
None
```
Automatic merge from submit-queue
Promote CronJobs to batch/v1beta1 - just the API
This PR promotes CronJobs to beta.
@erictune @kubernetes/sig-apps-api-reviews @kubernetes/api-approvers ptal
This builds on top of #41890 and needs #40932 as well
```release-note
Promote CronJobs to batch/v1beta1.
```
Automatic merge from submit-queue (batch tested with PRs 50711, 50742, 50204)
fix panic in e2e
**What this PR does / why we need it**:
fix#50660
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
no
**Release note**:
```release-note
none
```
Automatic merge from submit-queue (batch tested with PRs 50670, 50332)
e2e test for local storage mount point
**What this PR does / why we need it**:
We discovered that kubernetes can treat local directories and actual mountpoints differently. For example, https://github.com/kubernetes/kubernetes/issues/48331. The current local storage e2e tests use directories.
This PR introduces a test that creates a tmpfs and mounts it, and runs one of the local storage e2e tests.
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubernetes/issues/49126
**Special notes for your reviewer**:
I cherrypicked PR https://github.com/kubernetes/kubernetes/pull/50177, since local storage e2e tests are broken in master on 2017-08-08 due to "no such host" error. This PR replaces NodeExec with SSH commands.
You can run the tests using the following commands:
```
$ NUM_NODES=1 KUBE_FEATURE_GATES="PersistentLocalVolumes=true" go run hack/e2e.go -- -v --up
$ go run hack/e2e.go -- -v --test --test_args="--ginkgo.focus=\[Feature:LocalPersistentVolumes\]"
```
Here are the summary of results from my test run:
```
Ran 9 of 651 Specs in 387.905 seconds
SUCCESS! -- 9 Passed | 0 Failed | 0 Pending | 642 Skipped PASS
Ginkgo ran 1 suite in 6m29.369318483s
Test Suite Passed
2017/08/08 11:54:01 util.go:133: Step './hack/ginkgo-e2e.sh --ginkgo.focus=\[Feature:LocalPersistentVolumes\]' finished in 6m32.077462612s
```
**Release note**:
`NONE`
Automatic merge from submit-queue (batch tested with PRs 50198, 49051, 48432)
move KubeletConfiguration out of componentconfig API group
I'm splitting #44252 into more manageable steps. This step moves the types and updates references.
To reviewers: the most important changes are the removals from pkg/apis/componentconfig and additions to pkg/kubelet/apis/kubeletconfig. Almost everything else is an import or name update.
I have one unanswered question: Should I create a whole new api scheme for Kubelet APIs rather than register e.g. a kubeletconfig group with the default runtime.Scheme instance? This feels like the right thing, as the Kubelet should be exposing its own API, but there's a big fat warning not to do this in `pkg/api/register.go`. Can anyone answer this?
Automatic merge from submit-queue (batch tested with PRs 50198, 49051, 48432)
Add prefix to common networking e2e tests
**What this PR does / why we need it**:
Common networking e2e tests shared by node and cluster suites should also have prefix `[sig-network]`.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
**Special notes for your reviewer**:
/cc @bowei
**Release note**:
```release-note
NONE
```
LocalVolumeType tmpfs added
Added checks to ensure tha volume created during setup contains expected testFileContent
Refactored tests out to avoid code duplication
Two different tests are performed with tmpfs:
-serial write and read in two different pods
-write and read in two different pods mounted at the same time
Fixed local storage test failures by integrating https://github.com/kubernetes/kubernetes/pull/50177
Switched NodeExec to SSH
Automatic merge from submit-queue (batch tested with PRs 49904, 50484, 50214)
Adding support for internal IP for e2e tests
Currently IssueSSHComand in util.go only checks for External IP address
to ssh, this PR adds check for internal IP too.
Closes#50630
Automatic merge from submit-queue (batch tested with PRs 50094, 48966, 49478, 50593, 49140)
Migrate sig-auth e2e tests.
**What this PR does / why we need it:** This PR adds [sig-auth] prefix to
workload e2e tests in accord to requirements of adding a SIG dashboard
to testgrid. Refer PR #48781 for guidelines.
**Release note**:
```release-note
```
**What this PR does / why we need it:** This PR adds [sig-auth] prefix to
workload e2e tests in accord to requirements of adding a SIG dashboard
to testgrid. Refer PR #48781 for guidelines.
Automatic merge from submit-queue
Moved node condition filter into a predicates.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50360
**Release note**:
```release-note
A new predicates, named 'CheckNodeCondition', was added to replace node condition filter. 'NetworkUnavailable', 'OutOfDisk' and 'NotReady' maybe reported as a reason when failed to schedule pods.
```
Automatic merge from submit-queue (batch tested with PRs 49847, 49743, 49853, 50225, 50479)
Add node benchmark tests for cos-m60 with docker 1.12.6
Ref: https://github.com/kubernetes/kubernetes/issues/42926
This PR adds a benchmark tests against cos-m60 with docker 1.12.6 on http://node-perf-dash.k8s.io. This test is useful for docker validation -- we can compare the performance of different dockers on the same OS.
cos-m60 comes with docker 1.13.1 by default, so we need to use cloud-init to downgrade the version to 1.12.6.
**Release note**:
```
None
```
/assign @dchen1107
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
Pass config to external Kubemark cluster in e2e tests
When cluster autoscaler is used in kubemark tests,
pass default kubeconfig as external cluster config.
@shyamjvs @gmarek
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
fix a typo
**What this PR does / why we need it**:
fix a small typo
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
verions->versions
**Special notes for your reviewer**:
**Release note**:
NONE
```release-note
```NONE
Automatic merge from submit-queue (batch tested with PRs 50485, 49951, 50508, 50511, 50506)
Multiarch nonewprivs test image
**What this PR does / why we need it**:
This PR is for converting nonewprivs image which pushed very recently part of https://github.com/kubernetes/kubernetes/pull/47019.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#50498
**Special notes for your reviewer**:
**Release note**:
```NONE```
Automatic merge from submit-queue (batch tested with PRs 50537, 49699, 50160, 49025, 50205)
When not using a CloudProvider, set both InternalIP and ExternalIP on Nodes
#36095 changed all of the cloudproviders to set both InternalIP and ExternalIP on Nodes, but the non-cloudprovider fallback code now only sets InternalIP.
This causes the test "should be able to create a functioning NodePort service" in test/e2e/service.go to fail on cloud-provider-less clusters, because (with LegacyHostIP gone), it now will only try to work with ExternalIPs, and will fail if the node has only an InternalIP.
There isn't much other code that assumes that ExternalIP will always be set (there's something in pkg/master/master.go, but I don't know what it's doing, so maybe it's only useful in the case where InternalIP != ExternalIP anyway). But given that several of the cloudproviders (mesos, ovirt, rackspace) now explicitly set both InternalIP and ExternalIP to the same value always, it seemed right to do that in the fallback case too.
@deads2k FYI
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
code format in master_utils.go
**What this PR does / why we need it**:
code format
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #N/A
**Release note**:
```release-note
None
```
Automatic merge from submit-queue
move logs to kubectl/util
Move `pkg/util/logs` to `pkg/kubectl/util/logs` per https://github.com/kubernetes/kubernetes/issues/48209#issuecomment-311730681
This will make kubeadm, kubefed, gke-certificates-controller and e2e have dependency on kubectl, which should be fine.
partially addresses: kubernetes/community#598
```release-note
NONE
```
/assign @apelisse @monopole
Automatic merge from submit-queue
Remove deprecated ESIPP beta annotations
**What this PR does / why we need it**:
Remove deprecated ESIPP beta annotations.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50187
**Special notes for your reviewer**:
/assign @MrHohn
/sig network
**Release note**:
```release-note
Beta annotations `service.beta.kubernetes.io/external-traffic` and `service.beta.kubernetes.io/healthcheck-nodeport` have been removed. Please use fields `service.spec.externalTrafficPolicy` and `service.spec.healthCheckNodePort` instead.
```
Automatic merge from submit-queue
Migrate to controller references helpers in meta/v1
**What this PR does / why we need it**:
This is a follow up for #48319 that migrates all method usages to new methods in meta/v1.
**Special notes for your reviewer**:
Looking at each commit individually might be easier.
**Release note**:
```release-note
NONE
```
/sig api-machinery
/kind cleanup
Automatic merge from submit-queue
Add Cluster Autoscaler scalability test suite
This suite is intended for manually testing Cluster Autoscaler on large clusters. It isn't supposed to be run automatically (at least for now).
It can be run on Kubemark (with #50440) with the following setup:
- start Kubemark with NUM_NODES=1 (as we require there to be exactly 1 replica per hollow-node replication controller in this setup)
- set kubemark-master machine type manually to appropriate type for the Kubemark cluster size. Maximum Kubemark cluster size reached in test run is defined by maxNodes constant, so for maxNodes=1000, please upgrade to n1-standard-32. Adjust if modifying maxNodes.
- start Cluster Autoscaler pod in the external cluster using image built from version with Kubemark cloud provider (release pending)
- for grabbing metrics from ClusterAutoscaler (with #50382), add "--include-cluster-autoscaler=true" parameter in addition to regular flags for gathering components' metrics/resource usage during e2e tests
cc @bskiba
Automatic merge from submit-queue (batch tested with PRs 45186, 50440)
Add functionality needed by Cluster Autoscaler to Kubemark Provider.
Make adding nodes asynchronous. Add method for getting target
size of node group. Add method for getting node group for node.
Factor out some common code.
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45186, 50440)
Retry fed-svc creation on diff NodePort during e2e tests
**What this PR does / why we need it**:
Currently in federated end2end tests, the creation of services are
done with a randomize NodePort selection take is causing e2e test
flakes if the creation of a federated service failed if the port is
not available.
Now the util.CreateService(...) function is retrying to create the
service on different nodePort in case of error. The method retry until
success or all possible NodePorts have been tested and also failed.
**Which issue this PR fixes**
fixes#44018
Make adding nodes asynchronous. Add method for getting target
size of node group. Add method for getting node group for node.
Factor out some common code.
Automatic merge from submit-queue (batch tested with PRs 50386, 50374, 50444, 50382)
Add grabbing Cluster Autoscaler metrics in e2e tests
This adds:
- collecting metrics from Cluster Autoscaler before & after e2e test run
- --include-cluster-autoscaler opt-in flag
- passing external cluster client to MetricsGrabber (required for Kubemark setup, as Cluster Autoscaler doesn't run on master in this case)
Most types now have valid rest mappings because
NewDefaultRESTMapperFromScheme no longer ignores certain import
paths. Thus we can no longer use the lack of a valid REST mapping
as an indicator for when to use kindWhiteList. Thus kindWhiteList
now serves as a whitelist for all kinds and not just those that
formally had no mapping. This does mean that we could whitelist
kinds due to a name conflict, but that is unlikely as names such as
GetOptions are not appropriate for new objects.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 49725, 50367, 50391, 48857, 50181)
Add e2e test for privileged containers
**What this PR does / why we need it**:
This PR adds node e2e test for privileged containers.
**Which issue this PR fixes**
Part of #44118.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/assign @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 49642, 50335, 50390, 49283, 46582)
Improve GC discovery sync performance
Improve GC discovery sync performance by only syncing when discovered
resource diffs are detected. Before, the GC worker pool was shut down
and monitors resynced unconditionally every sync period, leading to
significant processing delays causing test flakes where otherwise
reasonable GC timeouts were being exceeded.
Related to https://github.com/kubernetes/kubernetes/issues/49966.
/cc @kubernetes/sig-api-machinery-bugs
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49642, 50335, 50390, 49283, 46582)
Add rbac.authorization.k8s.io/v1
xref https://github.com/kubernetes/features/issues/2
Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes
```release-note
The `rbac.authorization.k8s.io/v1beta1` API has been promoted to `rbac.authorization.k8s.io/v1` with no changes.
The `rbac.authorization.k8s.io/v1alpha1` version is deprecated and will be removed in a future release.
```
Automatic merge from submit-queue (batch tested with PRs 50300, 50328, 50368, 50370, 50372)
Reduce hollow-kubelet cpu request
Fixes https://github.com/kubernetes/kubernetes/issues/50366
This should make kubemark-500 fit in 6 nodes again. Checked that it should be enough.
cc @kubernetes/sig-scalability-misc
Automatic merge from submit-queue (batch tested with PRs 50418, 49830, 49206, 49061, 49912)
add LocalZone into gce.conf and refactor gce cloud provider configura…
The main goal of this PR is to make gce cloud provider able to run locally.
1. added a LocalZone parameter into gce.conf.
2. refactor `newGCECloud` to avoid contacting metadata server if configuration is already available.
```release-note
None
```
Automatic merge from submit-queue
remove apps/v1beta2 defaulting codes for obj.Spec.Selector and obj.Labels
**What this PR does / why we need it**:
This PR removes defaulting codes for `obj.Spec.Selector`. Currently, `obj.Spec.Selector.MatchLabels` is set to `obj.Spec.Template.Labels` if `obj.Spec.Template.Labels != nil && obj.Spec.Selector == nil`. We should not perform this defaulting operation as controllers selectors are immutable.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50339
**Special notes for your reviewer**:
This PR removes defaulting codes for `apps/v1beta2` only. The defaulting codes for validation will be removed in another PR.
**Release note**:
```NONE
```
Automatic merge from submit-queue
VSphere cloud provider code refactoring
The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.
- VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
- vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
- Modifications to e2e tests to accomodate the latest design changes.
@divyenpatel @rohitjogvmw @luomiao
```release-note
vSphere cloud provider: vSphere cloud provider code refactoring
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
Alpha Dynamic Kubelet Configuration
Feature: https://github.com/kubernetes/features/issues/281
This proposal contains the alpha implementation of the Dynamic Kubelet Configuration feature proposed in ~#29459~ [community/contributors/design-proposals/dynamic-kubelet-configuration.md](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/dynamic-kubelet-configuration.md).
Please note:
- ~The proposal doc is not yet up to date with this implementation, there are some subtle differences and some more significant ones. I will update the proposal doc to match by tomorrow afternoon.~
- ~This obviously needs more tests. I plan to write several O(soon). Since it's alpha and feature-gated, I'm decoupling this review from the review of the tests.~ I've beefed up the unit tests, though there is still plenty of testing to be done.
- ~I'm temporarily holding off on updating the generated docs, api specs, etc, for the sake of my reviewers 😄~ these files now live in a separate commit; the first commit is the one to review.
/cc @dchen1107 @vishh @bgrant0607 @thockin @derekwaynecarr
```release-note
Adds (alpha feature) the ability to dynamically configure Kubelets by enabling the DynamicKubeletConfig feature gate, posting a ConfigMap to the API server, and setting the spec.configSource field on Node objects. See the proposal at https://github.com/kubernetes/community/blob/master/contributors/design-proposals/dynamic-kubelet-configuration.md for details.
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
Remove scheduledjobs
This is a prerequisite for promoting CronJobs to beta.
**Release note**:
```release-note
Remove deprecated ScheduledJobs endpoints, use CronJobs instead.
```
Automatic merge from submit-queue (batch tested with PRs 50016, 49583, 49930, 46254, 50337)
[Federation] Make the hpa scale time window configurable
This PR is on top of open pr https://github.com/kubernetes/kubernetes/pull/45993.
Please review only the last commit in this PR.
This adds a config param to controller manager, the value of which gets passed to hpa adapter via sync controller.
This is needed to reduce the overall time limit of the hpa scaling window to much lesser (then the default 2 mins) to get e2e tests run faster. Please see the comment on the newly added parameter.
**Special notes for your reviewer**:
@kubernetes/sig-federation-pr-reviews
@quinton-hoole
@marun to please validate the mechanism used to pass a parameter from cmd line to adapter.
**Release note**:
```
federation-controller-manager gets a new flag --hpa-scale-forbidden-window.
This flag is used to configure the duration used by federation hpa controller to determine if it can move max and/or min replicas
around (or not), of a cluster local hpa object, by comparing current time with the last scaled time of that cluster local hpa.
Lower value will result in faster response to scalibility conditions achieved by cluster local hpas on local replicas, but too low
a value can result in thrashing. Higher values will result in slower response to scalibility conditions on local replicas.
```
Pods associated with the test JobTemplate should use a zero
TerminationGracePeriodSeconds to ensure they're deleted immediately.
This should improve test timing assumption consistency.
Automatic merge from submit-queue
Support exec/attach/portforward in `kubectl proxy`
Use the UpgradeAwareProxy shared code in kubectl proxy. Provide a separate transport for those requests that does not have HTTP/2 enabled. Refactor the code to be a bit cleaner in places and to better separate changes.
Fixes#32026
```release-note
`kubectl proxy` will now correctly handle the `exec`, `attach`, and `portforward` commands. You must pass `--disable-filter` to the command in order to allow these endpoints.
```
Improve GC discovery sync performance by only syncing when discovered
resource diffs are detected. Before, the GC worker pool was shut down
and monitors resynced unconditionally every sync period, leading to
significant processing delays causing test flakes where otherwise
reasonable GC timeouts were being exceeded.
Related to https://github.com/kubernetes/kubernetes/issues/49966.
Automatic merge from submit-queue (batch tested with PRs 50173, 50324, 50288, 50263, 50333)
Add blank import for node tests
The node tests weren't being run because the weren't imported in the test/e2e/e2e_test.go file.
Thanks to @abgworrall for sounding the alarm (he noticed [sig-node] wasn't in the test results)!
/assign @yujuhong
/cc @abgworrall
Automatic merge from submit-queue
Fix local storage test failures
**What this PR does / why we need it**:
Fixed a few issues:
- CI environment on GCE cannot resolve node names, need to use IPs. Use a different SSH wrapper that will get the IPs from the node object.
- Use hostdir instead of containerdir now that commands are executed directly on the host, instead of through a container.
- Get the PVC object again after it is bound so that it has the PV name.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50128
**Release note**:
NONE
/release-note-none
/sig storage
Automatic merge from submit-queue
Add waitForFailure for e2e test framework
**What this PR does / why we need it**:
Add waitForFailure for e2e test framework, this could reduce the reliance on logs.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #44118. Refer https://github.com/kubernetes/kubernetes/pull/48858#discussion_r128331726
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Deprecate Deployment .spec.rollbackTo field
~Depends on #48746~ (merged)
xref: #46934, #49135
1. Deprecate Deployment field `.spec.rollbackTo` in `extensions/v1beta1` and `apps/v1beta1`, and remove the same field and `/rollback` endpoint from `apps/v1beta2` Deployment.
1. Add an annotation `deprecated.deployment.rollback.to` in `apps/v1beta2` for conversion to/from other versions.
Note: `apps/v1beta2` is new in 1.8 (and WIP), so it is okay to make breaking changes to it.
```release-note
Deprecate Deployment .spec.rollbackTo field
```
Currently, in federated end2end tests, the creation of services are
done with a randomize NodePort selection. It causing e2e test
flakes if the creation of a federated service failed if the port is
not available.
Now the util.CreateService(...) function is re trying to create the
service on different nodePort in an error case. The method retries until
success or 10 creation retry with other random NodePorts.
If never the service has not been created properly on one of the
federated cluster, a Service shards cleanup is executed before retrying
again the federated service creation.
fixes#44018
Automatic merge from submit-queue
Add a simple cloud provider for e2e tests on kubemark
**What this PR does / why we need it**:
Adds a simplified cloud provider for kubemark. This enables us to add and
remove nodes and operate on nodegroups while running tests on kubemark.
This is needed to run scalability tests for cluster autoscaler on kubemark.
See https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/kubemark_integration.md
**Release note**:
```
NONE
```
Automatic merge from submit-queue
Add e2e test for cronjob chained removal
This is test proving https://github.com/kubernetes/kubernetes/pull/44058 works with cronjobs. This will fail until the aforementioned PR merges.
@caesarxuchao ptal
Automatic merge from submit-queue (batch tested with PRs 50208, 50259, 49702, 50267, 48986)
Move ownership of proxy test to sig-network directory
```release-note
None
```
1. Deprecate `.spec.rollbackTo` field in extensions/v1beta1 and
apps/v1beta1 Deployments
2. Remove the same field from apps/v1beta2 Deployment, and remove
its rollback subresource and endpoint
Automatic merge from submit-queue
StatefulSet scale subresource
**What this PR does / why we need it**: This PR implements scale subresource for StatefulSet.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46005
**Special notes for your reviewer**:
**Release note**:
```release-note
StatefulSet uses scale subresource when scaling in accord with ReplicationController, ReplicaSet, and Deployment implementations.
```
**Feature Checklist**:
- [x] Introduce Registry interface for storage purpose
- [x] Introduce `ScaleREST New(), Get() and Update()` utility functions
- [x] Create a `ScaleREST` object at `NewREST()` and return it
- [x] Enable scale subresource by adding `/scale` field to the storage map
**Testing Checklist**:
- Unit testing
- [x] Modify `newStorage()` to call `NewStorage()`, and change all unit tests accordingly
- [x] Add unit tests for `ScaleREST Get() and Update()` utility functions
- [x] Add missing unit test for `ShortNames`
- Manual testing
- [x] Verify existence of the subresource using `kubectl proxy` command
- [x] Modify the subresource using `curl` via `POST`
- e2e testing
- [x] Add e2e tests using `RESTClient`
Automatic merge from submit-queue
Federated Job controller implementation
Note that job re-balance is not there yet as it's difficult to honor job deadline
requires #35945 and 35943
fixes#34261
@quinton-hoole @nikhiljindal @deepak-vij
**Release note**:
```release-note
Federated Job feature. It is now possible to create a Federated Job
that is automatically deployed to one or more federated clusters
(as Jobs in those clusters). Job parallelism and completions are
spread across clusters according to cluster selection and weighting
preferences. Federated Job status reflects the aggregate status
across all underlying cluster Jobs.
```
Automatic merge from submit-queue (batch tested with PRs 49524, 46760, 50206, 50166, 49603)
Handled taints on node in batch.
**What this PR does / why we need it**:
Enhanced helpers to handled taints on node in batch.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#49522
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 49885, 49751, 49441, 49952, 49945)
Rename e2e sig framework files
**What this PR does / why we need it**:
make files be consistent across all sig e2e tests dir.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50087, 39587, 50042, 50241, 49914)
Add node e2e test for Docker's shared PID namespace
Ref: https://github.com/kubernetes/kubernetes/issues/42926
This PR adds a simple test for the shared PID namespace that's enabled when Docker is 1.13.1+.
/sig node
/area node-e2e
/assign @yujuhong
**Release note**:
```
None
```
This is needed for cluster autoscaler e2e test to
run on kubemark. We need the ability to add and
remove nodes and operate on nodegroups. Kubemark
does not provide this at the moment.
Automatic merge from submit-queue (batch tested with PRs 50091, 50231, 50238, 50236, 50243)
Fix storage tests for multizone test configuration.
**What this PR does / why we need it**:
This PR modifies "[sig-storage] Volumes PD should be mountable with (ext3|ext4)" tests to schedule pods in zone, where PD is created.
This is to make the test work in multizone environment.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 50091, 50231, 50238, 50236, 50243)
Move the sig-instrumentation test to a dedicated folder
Move the last remaining test to sig-instrumentation folder, also move "metrics" package to the "framework" folder
Related issue: https://github.com/kubernetes/kubernetes/issues/49161
/cc @xiangpengzhao @piosz
Automatic merge from submit-queue (batch tested with PRs 50091, 50231, 50238, 50236, 50243)
Modify e2e.go to arbitrarily pick one of zones we have nodes in for multizone tests.
**What this PR does / why we need it**:
When e2e runs in multizone configuration, zone config property can be empty.
This PR, in that case, overrides an empty value with arbitrarily chosen zone that we have nodes in.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 49855, 49915)
Let controllers ignore initialization timeout when creating pods
Partially address https://github.com/kubernetes/kubernetes/issues/48893#issuecomment-318540129.
This only updates the controllers that create pods with `GenerateName`.
The controllers ignore the timeout error when creating the pods, depending on how the initialization progress:
* If the initialization is successful in less than 5 mins, the controller will observe the creation via the informer. All is good.
* If the initialization fails, server will delete the pod, but the controller won't receive any event. The controller will not create new pod until the Creation expectation expires in 5 min.
* If the initialization takes too long (> 5 mins), the Creation expectation expires and the controller will create extra pods.
I'll send follow-up PRs to fix the latter two cases, e.g., by refactoring the sharedInformer.
Automatic merge from submit-queue
Fix typo in test/images/port-forward-tester/Makefile
**What this PR does / why we need it**: the image build fails due to this typo:
```console
$ make WHAT=port-forward-tester
./image-util.sh build port-forward-tester
Building image for port-forward-tester ARCH: ppc64le...
make[1]: Entering directory '[home]/src/k8s.io/kubernetes/test/images/port-forward-tester'
../image-util.sh bin
../image-util.sh: line 22: $2: unbound variable
```
Images already pushed.
**Release note**:
```release-note
NONE
```
/approve no-issue
/assign @mkumatag
Automatic merge from submit-queue (batch tested with PRs 48532, 50054, 50082)
Remove [k8s.io] tag and redundant [sig-storage] tags from volume tests
**What this PR does / why we need it**:
Removes redundant tags from storage e2e test names
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50178
**Release note**:
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 48532, 50054, 50082)
Correcting two spelling mistakes
**What this PR does / why we need it**:
Correcting two spelling mistakes.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
NONE
**Special notes for your reviewer**:
NONE
**Release note**:
NONE
```release-note
```
Automatic merge from submit-queue
Update OWNERS to correct members' handles
**What this PR does / why we need it**:
Fix some typos of members' handles as per https://github.com/kubernetes/kubernetes/issues/50048#issuecomment-319831957.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Associated with: #50048
**Special notes for your reviewer**:
/cc @madhusudancs @sebgoa @liggitt @saad-ali
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Add basic install and mount flexvolumes e2e tests
fixes https://github.com/kubernetes/kubernetes/issues/47010
These two tests install a skeleton "dummy" flex driver, attachable and non-attachable respectively, then test that a pod can successfully use the flex driver. They are labeled disruptive because kubelet and controller-manager get restarted as part of the flex install. IMO it's important to keep this install procedure as part of the test to isolate any bugs with the startup plugin probe code.
There is a bit of an ugly dependency on cluster/gce/config-test.sh because --flex-volume-plugin-dir must be set to a dir that's readable from controller-manager container and writable by the flex e2e test. The default path is not writable on GCE masters with read-only root so I picked a location that looks okay.
In the "dummy" drivers I trick kubelet into thinking there is a mount point by doing "mount -t tmpfs none ${MNTPATH} >/dev/null 2>&1", hope that is okay.
I have only tested on GCE and theoretically they may work on AWS but I don't think there is a need to test on multiple cloudproviders.
-->
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
increase the GC e2e test timeout
Fix https://github.com/kubernetes/kubernetes/issues/50047.
The root cause is #50046. See log analysis in #50047. For now, we just increase the timeout.
Automatic merge from submit-queue
Fix pointer bug in local volume e2e test
**What this PR does / why we need it**:
Fix pointer bug in local volume e2e test
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/50043
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48237, 50084, 50019, 50069, 50090)
Allow for some pods not to get scheduled in CA tests.
This will allow us to ignore long tail node creation or failure
to create some nodes when running scalability tests on kubemark.
**Release note**:
```
NONE
```
Automatic merge from submit-queue
Moves left networking e2e tests to test/e2e/network
**What this PR does / why we need it**:
#48784 forgot to move some networking e2e tests. This PR moves them.
It also move the networking tests from within `test/e2e/common/networking.go` to `test/e2e/network/networking.go`
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Associated PR #48784
Umbrella issue #49161
**Special notes for your reviewer**:
/assign @wojtek-t @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49916, 50050)
Update images used in the node e2e benchmark tests
Ref: https://github.com/kubernetes/kubernetes/issues/42926
- Update the cosbeta image since the new version contains a 'du' command fix that affects Docker performance.
- Add the coreos and ubuntu image that run Docker 1.12.6 so that we will have more data to compare.
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 50000, 49954, 49943, 50018, 49607)
Add [sig-autoscaling] prefix to autoscaling e2e tests
**What this PR does / why we need it**:
Add [sig-autoscaling] prefix to autoscaling e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50000, 49954, 49943, 50018, 49607)
Update the DeleteReplicaSet in rs_util.go to use server side reaper
**What this PR does / why we need it**:
fix#47832
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47832
**Special notes for your reviewer**:
**Release note**:
```
None
```
Automatic merge from submit-queue (batch tested with PRs 50000, 49954, 49943, 50018, 49607)
Add [sig-scalability] prefix to scalability e2e tests
**What this PR does / why we need it**:
Add [sig-scalability] prefix to scalability e2e tests
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Umbrella issue #49161
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add ubuntu to gluster and nfs tests
**What this PR does / why we need it**:
Enable gluster and nfs tests for ubuntu distro
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50039
**Special notes for your reviewer**:
**Release note**:
/release-note-none
/sig storage
Automatic merge from submit-queue (batch tested with PRs 49990, 49997, 44278, 49936, 49891)
Allow mode in e2e-framework to gather metrics only from master
This should enable getting metrics for our 5k-node clusters.
cc @kubernetes/sig-scalability-misc @gmarek
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
remove unused function
**What this PR does / why we need it**:
remove unused function which is not used months ago.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49871, 49422, 49092, 49858, 48999)
Refactor logging e2e tests, add new checks
I split existing code into smaller files to simplify the review and future changes
Also, there are new tests for stackdriver logging:
- ingesting system logs from all nodes
- ingesting logs in json format
- ingesting logs in glog format
Automatic merge from submit-queue (batch tested with PRs 49898, 49897, 49919, 48860, 49491)
Fix usage a make(struct, len()) followed by append()
A couple of places in the code we allocate with make() but then use
append(), instead of copy() or direct assignment. This results in a
slice with len() zero elements at the front followed by the expected
data. The correct form for such usage is `make(struct, 0, len())`.
I found these by running:
```
$ git grep -EI -A7 'make\([^,]*, len\(' | grep 'append(' -B7 | grep -v vendor
```
And then manually looking through the results. I'm sure something better
could exist.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49898, 49897, 49919, 48860, 49491)
Add basic local volume provisioner e2e tests
**What this PR does / why we need it**:
Adds e2e tests to test local volume provisioner.
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubernetes/issues/48832
**Special notes for your reviewer**:
- bring up local volume provisioner using bootstrapper
- have provisioner create a volume by creating a directory under discovery path.
- check persistent volume is created
- make a claim on the PV, write some data then delete the claim. Verify volume is cleaned up.
**Release note**:
```release-note
```
@ianchakeres @msau42
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)
Add support for `no_new_privs` via AllowPrivilegeEscalation
**What this PR does / why we need it**:
Implements kubernetes/community#639
Fixes#38417
Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`.
Adds `AllowPrivilegeEscalation` to container `SecurityContext`.
Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set.
Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity.
**Release note**:
```release-note
Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process
```
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)
improve detectability of deleted pods
**What this PR does / why we need it**:
Adds comment to `waitForPodTerminatedInNamespace` to better explain how it's implemented.
~~It improves pod deletion detection in the e2e framework as follows:~~
~~1. the `waitForPodTerminatedInNamespace` func looks for pod.Status.Phase == _PodFailed_ or _PodSucceeded_ since both values imply that all containers have terminated.~~
~~2. the `waitForPodTerminatedInNamespace` func also ignores the pod's Reason if the passed-in `reason` parm is "". Reason is not really relevant to the pod being deleted or not, but if the caller passes a non-blank `reason` then it will be lower-cased, de-blanked and compared to the pod's Reason (also lower-cased and de-blanked). The idea is to make Reason checking more flexible and to prevent a pod from being considered running when all of its containers have terminated just because of a Reason mis-match.~~
Releated to pr [49597](https://github.com/kubernetes/kubernetes/pull/49597) and issue [49529](https://github.com/kubernetes/kubernetes/issues/49529).
**Release note**:
```release-note
NONE
```
A couple of places in the code we allocate with make() but then use
append(), instead of copy() or direct assignment. This results in a
slice with len() zero elements at the front followed by the expected
data. The correct form for such usage is `make(struct, 0, len())`.
I found these by running:
```
$ git grep -EI -A7 'make\([^,]*, len\(' | grep 'append(' -B7 | grep -v vendor
```
And then manually looking through the results. I'm sure something better
could exist.
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Add ext4 and xfs tests to GCE PD basic mount tests
**What this PR does / why we need it**:
Add ext4 and xfs to basic GCE PD mount tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#49511
**Special notes for your reviewer**:
**Release note**:
/release-note-none
/sig storage
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Enable garbage collection of custom resources
Enhance the garbage collector to periodically refresh the resources it monitors (via discovery) to enable custom resource definition GC (addressing #44507 and reverting #47432).
This is a replacement for #46000.
/cc @lavalamp @deads2k @sttts @caesarxuchao
/ref https://github.com/kubernetes/kubernetes/pull/48065
```release-note
The garbage collector now supports custom APIs added via CustomeResourceDefinition or aggregated apiservers. Note that the garbage collector controller refreshes periodically, so there is a latency between when the API is added and when the garbage collector starts to manage it.
```
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Add a support for GKE regional clusters in e2e tests.
**What this PR does / why we need it**:
Add a support for GKE regional clusters in e2e tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49538, 49708, 47665, 49750, 49528)
Use the core client with version
**What this PR does / why we need it**:
Replace the **deprecated** `clientSet.Core()` with `clientSet.CoreV1()`.
**Which issue this PR fixes**: fixes#49535
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49581, 49652, 49681, 49688, 44655)
Re-enable federated ingress test that was disabled due to a federated service deletion bug.
The details of the bug is described in PR #44626. We believe this bug fixes the flakiness in this test and hence we are re-enabling this test to get some mileage on it. If it turns out to be a problem again we are going to revert this back.
**Release note**:
```release-note
NONE
```
/assign @csbell
cc @kubernetes/sig-federation-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 49581, 49652, 49681, 49688, 44655)
Move the audit e2e test out of the node SIG
It was mistakenly moved to sig-node in https://github.com/kubernetes/kubernetes/pull/48910, but this is an apiserver feature, not a node feature.
/cc @crassirostris
Automatic merge from submit-queue (batch tested with PRs 49581, 49652, 49681, 49688, 44655)
Add sig-testing OWNERS_ALIASES
/sig testing
**What this PR does / why we need it**:
follow the sig-foo-{reviewers,approvers} convention
- rename test-infra-maintainers to sig-testing-approvers
- copy sig-testing-approvers to sig-testing-reviewers
- remove inviduals in test/OWNERS in favor of new aliases
as a result
- rmmh gets test/ approver privileges
- spiffxp gets hack/jenkins/ approver privileges
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#49580
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49712, 49694, 49714, 49670, 49717)
Reduce GC e2e test flakiness
Increase GC wait timeout in a flaky e2e test. The test expects a GC
operation to be performed within 30s, while in practice the operation
often takes longer due to a delay between the enqueueing of the owner's
delete operation and the GC's actual processing of that event. Doubling
the time seems to stabilize the test. The test's assumptions can be
revisited, and the processing delay under load can be investigated in
the future.
Extracted from https://github.com/kubernetes/kubernetes/pull/47665 per https://github.com/kubernetes/kubernetes/pull/47665#issuecomment-318219099.
/cc @sttts @caesarxuchao @deads2k @kubernetes/sig-api-machinery-bugs
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 45813, 49594, 49443, 49167, 47539)
Add node e2e tests for GKE environment
Ref: https://github.com/kubernetes/kubernetes/issues/46891
This PR adds node e2e tests for validating images used on GKE.
- We pass the `SYSTEM_SPEC_NAME` to the node e2e test process via the flag `--system-spec-name` so that we can skip the environment specific tests using `RunIfSystemSpecNameIs()`.
- Also added `SkipIfContainerRuntimeIs()` as the opposite of `RunIfContainerRuntimeIs()`.
**Release note**:
```
None
```
Enhance the garbage collector to periodically refresh the resources it
monitors (via discovery) to enable custom resource definition GC.
This implementation caches Unstructured structs for any kinds not
covered by a shared informer. The existing meta-only codec only supports
compiled types; an improved codec which supports arbitrary types could
be introduced to optimize caching to store only metadata for all
non-informer types.
Automatic merge from submit-queue (batch tested with PRs 49619, 49598, 47267, 49597, 49638)
Remove default binding of system:node role to system:nodes group
part of https://github.com/kubernetes/features/issues/279
deprecation of this automatic binding announced in 1.7 in https://github.com/kubernetes/kubernetes/pull/46076
```release-note
RBAC: the `system:node` role is no longer automatically granted to the `system:nodes` group in new clusters. It is recommended that nodes be authorized using the `Node` authorization mode instead. Installations that wish to continue giving all members of the `system:nodes` group the `system:node` role (which grants broad read access, including all secrets and configmaps) must create an installation-specific `ClusterRoleBinding`.
```
Automatic merge from submit-queue (batch tested with PRs 49619, 49598, 47267, 49597, 49638)
improve log for pod deletion poll loop
**What this PR does / why we need it**:
It improves some logging related to waiting for a pod to reach a passed-in condition. Specifically, related to issue [49529](https://github.com/kubernetes/kubernetes/issues/49529) where better logging may help to debug the root cause.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49619, 49598, 47267, 49597, 49638)
Flag support in kubectl plugins
Adds support to flags in `kubectl` plugins. Flags are declared in the plugin descriptor and are passed to plugins through env vars, similar to global flags (which already works).
Fixes https://github.com/kubernetes/kubernetes/issues/49122
**Release note**:
```release-note
Added flag support to kubectl plugins
```
PTAL @monopole @kubernetes/sig-cli-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 47738, 49196, 48907, 48533, 48822)
Move e2e dependent images from kubernetes/kubernetes.github.io repo
**What this PR does / why we need it**:
Move e2e dependent images from kubernetes/kubernetes.github.io repo
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48530
**Special notes for your reviewer**:
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 49238, 49595, 43494, 47897, 48905)
Add apps/v1beta2.ReplicaSet
~Depends on #48746~ (merged)
~Depends on #49357~ (merged)
xref: #49135
```release-note
Add a new API object apps/v1beta2.ReplicaSet
```
Automatic merge from submit-queue (batch tested with PRs 49665, 49689, 49495, 49146, 48934)
Add Integration tests for Inter-Pod Affinity(AntiAffinity)
Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
**What this PR does / why we need it**:
Adds integration tests for inter-pod affinity(anti-affinity)
**Special notes for your reviewer**:
Once after @bsalamat 's #48847 gets merged, changes in this PR will be restructured according to the merged changes.
ref/ #48176#48847
@kubernetes/sig-scheduling-pr-reviews
@davidopp @k82cn
Increase GC wait timeout in a flaky e2e test. The test expects a GC
operation to be performed within 30s, while in practice the operation
often takes longer due to a delay between the enqueueing of the owner's
delete operation and the GC's actual processing of that event. Doubling
the time seems to stabilize the test. The test's assumptions can be
revisited, and the processing delay under load can be investigated in
the future.
Automatic merge from submit-queue (batch tested with PRs 46358, 49408)
[Federation] Updates to enable hpa controllers test in integration and e2e
Enables the apis on api server in both scenario.
Additional logic to enable and run the crud portion of objects in integration, for controllers which implement additional logic in reconcile.
**Special notes for your reviewer**:
This on top of an existing PR https://github.com/kubernetes/kubernetes/pull/45497.
The last 2 commits are reviewable here
@kubernetes/sig-federation-pr-reviews
cc @marun @perotinus
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 46913, 48910, 48858, 47160)
Add e2e test for readOnlyRootFilesystem containers
**What this PR does / why we need it**:
This PR adds node e2e test for readOnlyRootFilesystem containers.
**Which issue this PR fixes**
Part of #44118.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46913, 48910, 48858, 47160)
move sig-node related e2e tests to node subdir
I need help making sure I picked the right ones and/or didn't miss anything.
Potential additions include: `logging_soak.go`, `ssh.go`, `kubelet_perf.go`.
/cc @dchen1107 @vishh @tallclair @yujuhong @Random-Liu @abgworrall @dashpole @yguo0905
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes#40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48224, 45431, 45946, 48775, 49396)
Update cos-dev image in benchmark tests to cos-dev-61-9759-0-0
Ref: https://github.com/kubernetes/kubernetes/issues/42926
`cos-dev-61-9759-0-0` contains a fix in Linux utility `du` that would affect the measurement of docker performance in kubelet. I'd like to update the benchmark to use the new image.
**Release note**:
```
None
```
/assign @tallclair
/cc @kewu1992 @abgworrall
follow the sig-foo-{reviewers,approvers} convention
- rename test-infra-maintainers to sig-testing-approvers
- copy sig-testing-approvers to sig-testing-reviewers
- remove inviduals in test/OWNERS in favor of new aliases
as a result
- rmmh gets test/ approver privileges
- spiffxp gets hack/jenkins/ approver privileges
Automatic merge from submit-queue (batch tested with PRs 49286, 49550)
Remove myself from a bunch of places
I am assigned in reviews which I never get to do. I prefer drive-bys whenever I can do them rather than the bot choosing myself in random, ends up being mere spam.
@smarterclayton please approve.
Automatic merge from submit-queue (batch tested with PRs 48846, 49483, 49341)
Add [sig-network] prefix to network e2e tests
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Associated PR #48784
Umbrella issue #49161
**Special notes for your reviewer**:
/assign @@wojtek-t @freehan
/cc @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency
issue: #48843
This removes two flags replaced by the eviction manager. These have been depreciated for two releases, which I believe correctly follows the kubernetes depreciation guidelines.
```release-note
Remove depreciated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```
cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr
/sig node
Automatic merge from submit-queue (batch tested with PRs 49358, 49253)
Remove hostname label condition in SchedulerPredicates
**What this PR does / why we need it**:
```
validates that NodeSelector is respected if matching [Conformance]
validates that required NodeAffinity setting is respected if matching
```
The two tests above make the assumption that the node names are equal to the `kubernetes.io/hostname` labels. Unfortunately, this is not necessarily true all the time. For instance, when using the AWS Cloud Provider + Container Linux:
- The node name is set using the AWS SDK's `ec2.Instance.PrivateDnsName` and has the form `ip-10-0-35-57.ca-central-1.compute.internal` [[1](https://github.com/kubernetes/kubernetes/blob/v1.7.1/pkg/cloudprovider/providers/aws/aws.go#L3343-L3346)] [[2](https://raw.githubusercontent.com/aws/aws-sdk-go/master/service/ec2/api.go)]
- The node's hostname, however, is a simple call to `os.Hostname()`, itself reading `/proc/sys/kernel/hostname`, which contains what the AWS DHCP assigned to the instance, typically the hostname short-form: `ip-10-0-16-137`. [[1](https://github.com/kubernetes/kubernetes/blob/v1.7.1/pkg/util/node/node.go#L43-L54)]
Consequently, we are trying to assign a pod to a node having the following label: `kubernetes.io/hostname=ip-10-0-35-57.ca-central-1.compute.internal` (in addition to the randomly generated label), whereas the actual label on the node is `kubernetes.io/hostname=ip-10-0-35-57`.
Furthermore, this inaccurate `kubernetes.io/hostname=<nodename>` condition is actually useless given we already match over a random label, that was assigned to that node. Later, the test ensures that the scheduled pod was scheduled to the right node by comparing the pod's node name and the node name we expected the pod to be on:
```
framework.ExpectNoError(framework.WaitForPodNotPending(cs, ns, labelPodName))
labelPod, err := cs.Core().Pods(ns).Get(labelPodName, metav1.GetOptions{})
framework.ExpectNoError(err)
Expect(labelPod.Spec.NodeName).To(Equal(nodeName))
```
The `k8s.io/apimachinery/pkg/types/nodename` data structure actually [warns](55bee3ad21/staging/src/k8s.io/apimachinery/pkg/types/nodename.go (L40-L43)) about the fact that the node name might be different than the hostname on AWS.
**Release note**:
```release-note
NONE
```