Automatic merge from submit-queue (batch tested with PRs 41984, 41682, 41924, 41928)
Move node problem detector test into node e2e.
Move current NPD e2e test into node e2e.
In fact, current NPD e2e test is only a functionality test for NPD. It creates test NPD pod, sets test configuration, generates test logs and verifies test result.
It doesn't actually test the NPD really deployed in the cluster.
So it doesn't actually need to run in cluster e2e. Running it in node e2e will:
1) Make it easier to run the test.
2) Make it more light weight to introduce this as a pre/post submit test in NPD repo in the future.
Except this, I'm working on a cluster e2e to run some basic functionality test and benchmark test against the real NPD deployed in the cluster. Will send the PR later.
/cc @dchen1107 @kubernetes/node-problem-detector-reviewers
Automatic merge from submit-queue (batch tested with PRs 42128, 42064, 42253, 42309, 42322)
Add storage.k8s.io/v1 API
This is combined version of reverted #40088 (first 4 commits) and #41646. The difference is that all controllers and tests use old `storage.k8s.io/v1beta1` API so in theory all tests can pass on GKE.
Release note:
```release-note
StorageClassName attribute has been added to PersistentVolume and PersistentVolumeClaim objects and should be used instead of annotation `volume.beta.kubernetes.io/storage-class`. The beta annotation is still working in this release, however it will be removed in a future release.
```
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
Adjust parameters of GCL cluster logging load tests
This PR increases the amount of logs produced in load tests to match the number of nodes and provide the predictable load of 100 KB/sec on each node.
Also this PR reduces in half amount of time, given for ingesting logs.
Automatic merge from submit-queue (batch tested with PRs 41980, 42192, 42223, 41822, 42048)
Take into account number of restarts in cluster logging tests
Before, in cluster logging tests, we only measured e2e number of lines delivered to the backend.
Also, befure https://github.com/kubernetes/kubernetes/pull/41795 was merged, from the k8s perspective, fluentd was always working properly, even if it's crashlooping inside.
Now we can detect whether fluentd is truly working properly, experiencing no, or almost no OOMs duing its operation.
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)
Admission Controller: Add Pod Preset
Based off the proposal in https://github.com/kubernetes/community/pull/254
cc @pmorie @pwittrock
TODO:
- [ ] tests
**What this PR does / why we need it**: Implements the Pod Injection Policy admission controller
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Added new Api `PodPreset` to enable defining cross-cutting injection of Volumes and Environment into Pods.
```
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)
Ingress-glbc upgrade tests
Basically #41676 but with some fixes and added comments. @bprashanth has been away this week and it's desirable to have this in before code freeze.
export functions from pkg/api/validation
add settings API
add settings to pkg/registry
add settings api to pkg/master/master.go
add admission control plugin for pod preset
add new admission control plugin to kube-apiserver
add settings to import_known_versions.go
add settings to codegen
add validation tests
add settings to client generation
add protobufs generation for settings api
update linted packages
add settings to testapi
add settings install to clientset
add start of e2e
add pod preset plugin to config-test.sh
Signed-off-by: Jess Frazelle <acidburn@google.com>
Automatic merge from submit-queue
Extend experimental support to multiple Nvidia GPUs
Extended from #28216
```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with support for multiple Nvidia GPUs.
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```
1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```
TODO:
- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)
Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Automatic merge from submit-queue (batch tested with PRs 41234, 42186, 41615, 42028, 41788)
Additional upgrade e2e tests
**What this PR does / why we need it**: Add basic upgrade tests for DaemonSet and Job, and add "during upgrade" testing to ConfigMap test. Add a simple harness for testing upgrade tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**: continuation of #41296 @krousey please review, thanks
**Release note**: `NONE`
Automatic merge from submit-queue (batch tested with PRs 41205, 42196, 42068, 41588, 41271)
Implements an upgrade test for Job
**What this PR does / why we need it**:
This PR implements a cluster upgrade test for Job. Some functionality for Job testing has been moved from the e2e package to the framework package to facilitate code reuse between the e2e package and the upgrade package without introducing cyclic dependencies.
We need this PR to help automate the testing of cluster upgrades between versions.
**Release note**
```release-note
NONE
```
Updates the dnsmasq cache/mux layer to be managed by dnsmasq-nanny.
dnsmasq-nanny manages dnsmasq based on values from the
kube-system:kube-dns configmap:
"stubDomains": {
"acme.local": ["1.2.3.4"]
},
is a map of domain to list of nameservers for the domain. This is used
to inject private DNS domains into the kube-dns namespace. In the above
example, any DNS requests for *.acme.local will be served by the
nameserver 1.2.3.4.
"upstreamNameservers": ["8.8.8.8", "8.8.4.4"]
is a list of upstreamNameservers to use, overriding the configuration
specified in /etc/resolv.conf.
Automatic merge from submit-queue (batch tested with PRs 41116, 41804, 42104, 42111, 42120)
make kubectl taint command respect effect NoExecute
**What this PR does / why we need it**:
Part of feature forgiveness implementation, make kubectl taint command respect effect NoExecute.
**Which issue this PR fixes**:
Related Issue: #1574
Related PR: #39469
**Special notes for your reviewer**:
**Release note**:
```release-note
make kubectl taint command respect effect NoExecute
```
Automatic merge from submit-queue (batch tested with PRs 41962, 42055, 42062, 42019, 42054)
PV upgrade test
**What this PR does / why we need it**:
This PR adds a PV upgrade test to the new upgrade test framework. Before, this test had to be done manually. Currently the upgrade test framework only works on the GCE environment, so I plan to add support for other providers later. In order to write the test, I had to modify and refactor some volume test util libraries. I reran the impacted tests to make sure they still passed.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
It's probably easier to review the two commits separately. I split it up into the refactor changes, and the upgrade test changes.
**Release note**:
NONE
cc @saad-ali @krousey
Automatic merge from submit-queue (batch tested with PRs 42044, 41694, 41927, 42050, 41987)
Use existing nginx image in the node
**What this PR does / why we need it**:
Fixes a test flake: `[k8s.io] Garbage collector should orphan pods created by rc if delete options say so`
**Which issue this PR fixes**
fixes#35771
**Special notes for your reviewer**:
**Release note**:
```release-note NONE
```
Automatic merge from submit-queue (batch tested with PRs 35408, 41915, 41992, 41964, 41925)
e2e/upgrade: add sysctls
Add sysctl upgrade tests.
How can these be effectively tested?
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)
Guaranteed admission for Critical Pods
This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.
cc: @vishh
Automatic merge from submit-queue (batch tested with PRs 40932, 41896, 41815, 41309, 41628)
Modify CronJob API to add job history limits, cleanup jobs in controller
**What this PR does / why we need it**:
As discussed in #34710: this adds two limits to `CronJobSpec`, to limit the number of finished jobs created by a CronJob to keep.
**Which issue this PR fixes**: fixes#34710
**Special notes for your reviewer**:
cc @soltysh, please have a look and let me know what you think -- I'll then add end to end testing and update the doc in a separate commit. What is the timeline to get this into 1.6?
The plan:
- [x] API changes
- [x] Changing versioned APIs
- [x] `types.go`
- [x] `defaults.go` (nothing to do)
- [x] `conversion.go` (nothing to do?)
- [x] `conversion_test.go` (nothing to do?)
- [x] Changing the internal structure
- [x] `types.go`
- [x] `validation.go`
- [x] `validation_test.go`
- [x] Edit version conversions
- [x] Edit (nothing to do?)
- [x] Run `hack/update-codegen.sh`
- [x] Generate protobuf objects
- [x] Run `hack/update-generated-protobuf.sh`
- [x] Generate json (un)marshaling code
- [x] Run `hack/update-codecgen.sh`
- [x] Update fuzzer
- [x] Actual logic
- [x] Unit tests
- [x] End to end tests
- [x] Documentation changes and API specs update in separate commit
**Release note**:
```release-note
Add configurable limits to CronJob resource to specify how many successful and failed jobs are preserved.
```
Automatic merge from submit-queue (batch tested with PRs 42106, 42094, 42069, 42098, 41852)
Pod deletion observation is flaking, increase timeout and debug more
We can afford to wait longer than 30 seconds, and we should be printing
more error and output information about the cause of the failure.
Fixes / triages #41902
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)
Add storage.k8s.io/v1 API
v1 API is direct copy of v1beta1 API. This v1 API gets installed and exposed in this PR, I tested that kubectl can create both v1beta1 and v1 StorageClass.
~~Rest of Kubernetes (controllers, examples,. tests, ...) still use v1beta1 API, I will update it when this PR gets merged as these changes would get lost among generated code.~~ Most parts use v1 API now, it would not compile / run tests without it.
**Release note**:
```
Kubernetes API storage.k8s.io for storage objects is now fully supported and is available as storage.k8s.io/v1. Beta version of the API storage.k8s.io/v1beta1 is still available in this release, however it will be removed in a future Kubernetes release.
Together with the API endpoint, StorageClass annotation "storageclass.beta.kubernetes.io/is-default-class" is deprecated and "storageclass.kubernetes.io/is-default-class" should be used instead to mark a default storage class. The beta annotation is still working in this release, however it won't be supported in the next one.
```
@kubernetes/sig-storage-misc
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
Fix bash command's execution in MakePod().
Add isPriviledged as a parameter to MakePod().
Move PD utils to pv_util.go
Ran all the tests in pd.go, persistent_volumes.go,
persistent_volumes-disruptive.go.
These changes are needed for the PV upgrade test I am working on.
Automatic merge from submit-queue (batch tested with PRs 41146, 41486, 41482, 41538, 41784)
Switch statefulset controller to shared informers
Originally part of #40097
I *think* the controller currently makes a deep copy of a StatefulSet before it mutates it, but I'm not 100% sure. For those who are most familiar with this code, could you please confirm?
@beeps @smarterclayton @ingvagabund @sttts @liggitt @deads2k @kubernetes/sig-apps-pr-reviews @kubernetes/sig-scalability-pr-reviews @timothysc @gmarek @wojtek-t
Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373)
Move pvutil.go from e2e package to framework package
**What this PR does / why we need it**:
This PR moves pvutil.go to the e2e/framework package.
I am working on a PV upgrade test, and would like to use some of the wrapper functions in pvutil.go. However, the upgrade test is in the upgrade package, and not the e2e package, and it cannot import the e2e package because it would create a circular dependency. So pvutil.go needs to be moved out of e2e in order to break the circular dependency. This is a refactoring name change, no logic has been modified.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 41844, 41803, 39116, 41129, 41240)
NPD: Update NPD test.
For https://github.com/kubernetes/node-problem-detector/issues/58.
Update NPD e2e test based on the new behavior.
Note that before merging this PR, we need to merge all pending PRs in npd, and release the v0.3.0-alpha.1 version of NPD.
/cc @dchen1107 @kubernetes/node-problem-detector-reviewers
Automatic merge from submit-queue (batch tested with PRs 41844, 41803, 39116, 41129, 41240)
Allow for not-ready pods in large clusters
This is to workaround issues with non-starting pods in large clusters in roughly 1/3rd of runs.
Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782)
Debug what is hapening in large clusters
What I'm seeing in large clusters is:
```
I0219 19:34:29.994] [90m/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/common/secrets.go:44[0m
I0219 19:34:29.994] [90m------------------------------[0m
I0219 21:27:11.421] Dumping master and node logs to /workspace/_artifacts
I0219 21:27:11.422] Master SSH not supported for gke
```
i have no idea what is happening during those 2 hours, and would like to understand this.
Automatic merge from submit-queue (batch tested with PRs 41709, 41685, 41754, 41759, 37237)
Projected volume plugin
This is a WIP volume driver implementation as noted in the commit for https://github.com/kubernetes/kubernetes/pull/35313.
Automatic merge from submit-queue (batch tested with PRs 41421, 41440, 36765, 41722)
Use watch param instead of deprecated /watch/ prefix
Switches clients to use watch param instead of /watch/ prefix
```release-note
Clients now use the `?watch=true` parameter to make watch API calls, instead of the `/watch/` path prefix
```
Automatic merge from submit-queue
Adds kube-public to the whitelist to not be deleted for e2e tests
We added the `kube-public` namespace but didn't add it to a whitelist of namespaces to not delete as part of e2e cleanup.
```release-note
```
Automatic merge from submit-queue
Add cluster logging load test for GCL
Changed code around logging tests a little to avoid duplication. Added GCL go library, because command line tool cannot handle required number of logs effectively.
Related to https://github.com/kubernetes/kubernetes/issues/34310
@piosz
Automatic merge from submit-queue (batch tested with PRs 41548, 41221)
StatefulSet Upgrade Test
Adds StatefulSet upgrade tests and moves common functionality into the framework package. This removes the potential for cyclic dependencies while allowing for code reuse.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41517, 41494, 41163)
Deployment: filter out old RSes that are deleted or with non-zero replicas before cleanup
Fixes#36379
cc @zmerlynn @yujuhong @kargakis @kubernetes/sig-apps-bugs
This is the first step in a two step change. First is to move the file.
The second step is to change the package and callers. It needs to be
two steps in order to correctly transfer commit history.
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)
Revert "Modify load test to not create too may services/endpoints"
Reverts kubernetes/kubernetes#40798
We seem to be ready to go for it now.
Automatic merge from submit-queue
Ingress e2e typos
**What this PR does / why we need it**: fix typos in e2e test
**Special notes for your reviewer**: none
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove alpha provisioning
This is the first part of https://github.com/kubernetes/features/issues/36
@kubernetes/sig-storage-misc
**Release note**:
```release-note
Alpha version of dynamic volume provisioning is removed in this release. Annotation
"volume.alpha.kubernetes.io/storage-class" does not have any special meaning. A default storage class
and DefaultStorageClass admission plugin can be used to preserve similar behavior of Kubernetes cluster,
see https://kubernetes.io/docs/user-guide/persistent-volumes/#class-1 for details.
```
Automatic merge from submit-queue (batch tested with PRs 37137, 41506, 41239, 41511, 37953)
Add field to control service account token automounting
Fixes https://github.com/kubernetes/kubernetes/issues/16779
* adds an `automountServiceAccountToken *bool` field to `ServiceAccount` and `PodSpec`
* if set in both the service account and pod, the pod wins
* if unset in both the service account and pod, we automount for backwards compatibility
```release-note
An `automountServiceAccountToken *bool` field was added to ServiceAccount and PodSpec objects. If set to `false` on a pod spec, no service account token is automounted in the pod. If set to `false` on a service account, no service account token is automounted for that service account unless explicitly overridden in the pod spec.
```
Automatic merge from submit-queue (batch tested with PRs 37137, 41506, 41239, 41511, 37953)
Fix typos in e2e
**What this PR does / why we need it**: fix typos in e2e test
**Special notes for your reviewer**: none
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 37137, 41506, 41239, 41511, 37953)
e2e test for storage class diskformat verification for vsphere cloud provider
**What this PR does / why we need it**:
This PR adds a new e2e test for vsphere cloud provider.
Test is to verify diskformat specified in storage-class is being honored while volume creation.
Steps:
1. Create StorageClass with diskformat set to valid type (supported options are `eagerzeroedthick`, `zeroedthick` and `thin`)
2. Create PVC which uses the StorageClass created in step 1.
3. Wait for PV to be provisioned.
4. Wait for PVC's status to become Bound
5. Create POD using PVC on specific node.
6. Wait for Disk to be attached to the node.
7. Get node VM's devices and find PV's Volume Disk.
8. Get Backing Info of the Volume Disk and obtain Property of `VirtualDiskFlatVer2BackingInfo` - `EagerlyScrub` and `ThinProvisioned`
9. Based on the value of `EagerlyScrub` and `ThinProvisioned`, verify if diskformat is correct.
10. Delete POD and Wait for Volume Disk to be detached from the Node.
11. Delete PVC, PV and Storage Class
**Which issue this PR fixes** *
fixes #
**Special notes for your reviewer**:
Test is executed against v1.6.0-alpha.1
Test is failing on v1.4.8
**Release Note**
```release-note
NONE
```
@kerneltime @BaluDontu @abrarshivani please review this PR
Automatic merge from submit-queue (batch tested with PRs 41134, 41410, 40177, 41049, 41313)
PV E2E: dynamically provisioned volume should not create in unmanaged zone
**What this PR does / why we need it**:
Adds e2e test to that attempts to provision a volume in an unmanaged zone and fails on success. This is to catch regressions of #31948.
cc @jeffvance
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41134, 41410, 40177, 41049, 41313)
Isolate recycler behavior in PV E2E
**What this PR does / why we need it**:
Sets the default `reclaimPolicy` for PV E2E to `Retain` and isolates `Recycle` tests to their own context. The purpose of this is to future proof the PV test suite against the possible deprecation of the `Recycle` behavior. This is done by consolidating recycling test code into a single Context block that can be removed en masse without affecting the test suite.
Secondly, adds a liveliness check for the NFS server pod prior to each test to avoid maxing out timeouts if the NFS server becomes unavailable.
cc @saad-ali @jeffvance
Automatic merge from submit-queue (batch tested with PRs 41216, 41362, 41275, 41277, 41412)
Fix statefulset e2e test
...by removing the liveness/readiness probes from the cockroachdb
manifests, as explained in
https://github.com/kubernetes/test-infra/issues/1740#issuecomment-279555187
@kow3ns @spxtr
Added nfs-server verification between tests; prototyped recycler test
recycler test in working state, needs clean up/comments
comments added; gets mount path from pod instead of hard code
typos
removed NFS server checker, will be handled in separate PR
catch err; inaccurate comment
gofmt; typo
Automatic merge from submit-queue (batch tested with PRs 41382, 41407, 41409, 41296, 39636)
Update to use proxy subresource consistently
Proxy subresources have been in place since 1.2.0 and improve the ability to put policy in place around proxy access.
This PR updates the last few clients to use proxy subresources rather than the root proxy
Automatic merge from submit-queue (batch tested with PRs 41382, 41407, 41409, 41296, 39636)
implement configmap upgrade test
**What this PR does / why we need it**: Add an automated ConfigMap upgrade test to the e2e upgrade test suite. The test creates a ConfigMap and verifies it can be consumed by pods before/after upgrade. See PR #39953 and #40747 for context.
**Special notes for your reviewer**:
@krousey please review for consistency with new upgrade test interface. I copied heavily from secrets test for this. I will implement similar tests for DaemonSets and Jobs next. Note that I have not run this test locally.
**Release note**:
`NONE`
...by removing the liveness/readiness probes from the cockroachdb
manifests, as explained in
github.com/kubernetes/test-infra/issues/1740#issuecomment-279555187
Automatic merge from submit-queue (batch tested with PRs 41342, 41257)
Move two flaky e2e tests to the flaky suite.
cc @kargakis @davidopp
We should have moved these a long time ago.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41357, 41178, 41280, 41184, 41278)
fix flaky host cleanup test
**What this PR does / why we need it**:
Fixes 2 flakes in the "HostCleanup tests in e2e/_kubelet.go_
Also does some very minor refactoring.
**Which issue this PR fixes**
This is an improved fix for issue [31272](https://github.com/kubernetes/kubernetes/issues/31272)
**Special notes for your reviewer**:
```release-note
NONE
```
Automatic merge from submit-queue
Add e2e test for external provisioners
this is a retry of https://github.com/kubernetes/kubernetes/pull/39545
This time around:
* take advantage of the system:persistent-volume-provisioner bootstrap cluster role to grant the external provisioner pod serviceaccount permissions
* add storageclass suffix so that the first and third test don't conflict when one creates the storageclass with the same name before the other
also tested more thoroughly myself on gce :)
@jsafrane if you would like to re-review
Automatic merge from submit-queue (batch tested with PRs 39418, 41175, 40355, 41114, 32325)
TaintController
```release-note
This PR adds a manager to NodeController that is responsible for removing Pods from Nodes tainted with NoExecute Taints. This feature is beta (as the rest of taints) and enabled by default. It's gated by controller-manager enable-taint-manager flag.
```
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)
StatefulSet hardening
**What this PR does / why we need it**:
This PR contains the following changes to StatefulSet. Only one change effects the semantics of how the controller operates (This is described in #38418), and this change only brings the controller into conformance with its documented behavior.
1. pcb and pcb controller are removed and their functionality is encapsulated in StatefulPodControlInterface. This class modules the design contoller.PodControlInterface and provides an abstraction to clientset.Interface which is useful for testing purposes.
2. IdentityMappers has been removed to clarify what properties of a Pod are mutated by the controller. All mutations are performed in the UpdateStatefulPod method of the StatefulPodControlInterface.
3. The statefulSetIterator and petQueue classes are removed. These classes sorted Pods by CreationTimestamp. This is brittle and not resilient to clock skew. The current control loop, which implements the same logic, is in stateful_set_control.go. The Pods are now sorted and considered by their ordinal indices, as is outlined in the documentation.
4. StatefulSetController now checks to see if the Pods matching a StatefulSet's Selector also match the Name of the StatefulSet. This will make the controller resilient to overlapping, and will be enhanced by the addition of ControllerRefs.
5. The total lines of production code have been reduced, and the total number of unit tests has been increased. All new code has 100% unit coverage giving the module 83% coverage. Tests for StatefulSetController have been added, but it is not practical to achieve greater coverage in unit testing for this code (the e2e tests for StatefulSet cover these areas).
6. Issue #38418 is fixed in that StaefulSet will ensure that all Pods that are predecessors of another Pod are Running and Ready prior to launching a new Pod. This removes the potential for deadlock when a Pod needs to be rescheduled while its predecessor is hung in Pending or Initializing.
7. All reference to pet have been removed from the code and comments.
**Which issue this PR fixes**
fixes #38418,#36859
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue #38418 which, under circumstance, could cause StatefulSet to deadlock.
Mediates issue #36859. StatefulSet only acts on Pods whose identity matches the StatefulSet, providing a partial mediation for overlapping controllers.
```
Automatic merge from submit-queue (batch tested with PRs 41074, 41147, 40854, 41167, 40045)
Fix some funky funcs.
This is code cleanup. Fix function declarations and remove stale comment.
Automatic merge from submit-queue (batch tested with PRs 41074, 41147, 40854, 41167, 40045)
Upgrade test for deployments
Upgrade test for Deployments. Should prevent issues like https://github.com/kubernetes/kubernetes/issues/40415 in the future.
@krousey @janetkuo @soltysh
Haven't managed to run it locally...
```
$ go run hack/e2e.go --up --test --test_args="--ginkgo.focus=\[Feature:MasterUpgrade\] --upgrade-target=ci/latest --upgrade-image=gci"
2017/02/02 11:43:22 e2e.go:946: Running: ./hack/e2e-internal/e2e-down.sh
2017/02/02 11:43:22 e2e.go:948: Step './hack/e2e-internal/e2e-down.sh' finished in 7.278236ms
2017/02/02 11:43:22 e2e.go:946: Running: ./hack/e2e-internal/e2e-up.sh
2017/02/02 11:43:22 e2e.go:948: Step './hack/e2e-internal/e2e-up.sh' finished in 5.286328ms
2017/02/02 11:43:22 e2e.go:946: Running: ./cluster/kubectl.sh version --match-server-version=false
2017/02/02 11:43:22 e2e.go:948: Step './cluster/kubectl.sh version --match-server-version=false' finished in 213.847259ms
2017/02/02 11:43:22 e2e.go:946: Running: ./hack/e2e-internal/e2e-status.sh
2017/02/02 11:43:22 e2e.go:948: Step './hack/e2e-internal/e2e-status.sh' finished in 103.253183ms
2017/02/02 11:43:22 e2e.go:230: Something went wrong: encountered 2 errors: [exit status 1 exit status 1]
exit status 1
```
@krousey any eta for when the upgrade framework will be integrated in the pr builder?
Automatic merge from submit-queue (batch tested with PRs 41145, 38771, 41003, 41089, 40365)
Use privileged containers for statefulset e2e tests
Test containers need to run as spc_t in order to interact with the host
filesystem under /tmp, as the tests for StatefulSet are doing. Docker
will transition the container into this domain when running the container
as privileged.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
**Release note**:
```release-note
NONE
```
/cc @ncdc @soltysh @pmorie
1. pcb and pcb controller are removed and their functionality is
encapsulated in StatefulPodControlInterface.
2. IdentityMappers has been removed to clarify what properties of a Pod are
mutated by the controller. All mutations are performed in the
UpdateStatefulPod method of the StatefulPodControlInterface.
3. The statefulSetIterator and petQueue classes are removed. These classes
sorted Pods by CreationTimestamp. This is brittle and not resilient to
clock skew. The current control loop, which implements the same logic,
is in stateful_set_control.go. The Pods are now sorted and considered by
their ordinal indices, as is outlined in the documentation.
4. StatefulSetController now checks to see if the Pods matching a
StatefulSet's Selector also match the Name of the StatefulSet. This will
make the controller resilient to overlapping, and will be enhanced by
the addition of ControllerRefs.
Automatic merge from submit-queue (batch tested with PRs 38796, 40823, 40756, 41083, 41105)
e2e tests for vSphere cloud provider
**What this PR does / why we need it**:
This PR contains changes for existing e2e volume provisioning test cases for running on vsphere cloud provider.
**Following is the summary of changes made in existing e2e test cases**
**Added test/e2e/persistent_volumes-vsphere.go**
- This test verifies deleting a PVC before the pod does not cause pod deletion to fail on PD detach and deleting the PV before the pod does not cause pod deletion to fail on PD detach.
**test/e2e/volume_provisioning.go**
- This test creates a StorageClass and claim with dynamic provisioning and alpha dynamic provisioning annotations and verifies that required volumes are getting created. Test also verifies that created volume is readable and retaining data.
- Added vsphere as supported cloud provider. Also set pluginName to "kubernetes.io/vsphere-volume" for vsphere cloud provider.
**test/e2e/volumes.go**
- Added test spec for vsphere
- This test creates requested volume, mount it on the pod, write some random content at /opt/0/index.html and verifies file contents are perfect to make sure we don't see the content from previous test runs.
- This test also passes "1234" as fsGroup to mount volume and verifies fsGroup is set correctly.
**added test/e2e/vsphere_utils.go**
- Added function verifyVSphereDiskAttached - Verify the persistent disk attached to the node.
- Added function waitForVSphereDiskToDetach - Wait until vsphere vmdk is deteched from the given node or time out after 5 minutes
- Added getVSpherePersistentVolumeSpec - create vsphere volume spec with given VMDK volume path, Reclaim Policy and labels
- Added getVSpherePersistentVolumeClaimSpec - get vsphere persistent volume spec with given selector labels
- createVSphereVolume - function to create vmdk volume
**Following is the summary of new e2e tests added with this PR**
**test/e2e/vsphere_volume_placement.go**
- contains volume placement tests using node label selector
- Test Back-to-back pod creation/deletion with the same volume source on the same worker node
- Test Back-to-back pod creation/deletion with the same volume source attach/detach to different worker nodes
**test/e2e/pv_reclaimpolicy.go**
- contains tests for PV/PVC - Reclaiming Policy
- Test verifies persistent volume should be deleted when reclaimPolicy on the PV is set to delete and associated claim is deleted
- Test also verified that persistent volume should be retained when reclaimPolicy on the PV is set to retain and associated claim is deleted
**test/e2e/pvc_label_selector.go**
- This is function test for Selector-Label Volume Binding Feature.
- Verify volume with the matching label is bounded with the PVC.
Other changes
Updated pkg/cloudprovider/providers/vsphere/BUILD and test/e2e/BUILD
**Which issue this PR fixes** *
fixes # 41087
**Special notes for your reviewer**:
Updated tests were executed on kubernetes v1.4.8 release on vsphere.
Test steps are provided in comments
@kerneltime @BaluDontu
Test containers need to run as spc_t in order to interact with the host
filesystem under /tmp, as the tests for StatefulSet are doing. Docker
will transition the container into this domain when running the container
as privileged.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Automatic merge from submit-queue (batch tested with PRs 40345, 38183, 40236, 40861, 40900)
Remove checks for pods responding in deployment e2e tests
Fixes#39879
Remove it because it caused deployment e2e tests sometimes timed out waiting for pods responding, and pods responding isn't related to deployment controller and is not a prerequisite of deployment e2e tests.
@kargakis
addressed review comments
Addressed review comment for pv_reclaimpolicy.go to verify content of the volume
addressed 2nd round of review comments
addressed 3rd round of review comments from jeffvance
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)
Promote SubjectAccessReview to v1
We have multiple features that depend on this API:
SubjectAccessReview
- [webhook authorization](https://kubernetes.io/docs/admin/authorization/#webhook-mode)
- [kubelet delegated authorization](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authorization)
- add-on API server delegated authorization
The API has been in use since 1.3 in beta status (v1beta1) with negligible changes:
- Added a status field for reporting errors evaluating access
- A typo was discovered in the SubjectAccessReviewSpec Groups field name
This PR promotes the existing v1beta1 API to v1, with the only change being the typo fix to the groups field. (fixes https://github.com/kubernetes/kubernetes/issues/32709)
Because the API does not persist data (it is a query/response-style API), there are no data migration concerns.
This positions us to promote the features that depend on this API to stable in 1.7
cc @kubernetes/sig-auth-api-reviews @kubernetes/sig-auth-misc
```release-note
The authorization.k8s.io API group was promoted to v1
```
Remove it because it caused deployment e2e tests sometimes timed out
waiting for pods responding, and pods responding isn't related to
deployment controller and is not a prerequisite of deployment e2e tests.
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)
PV E2E: provide each spec with a fresh nfs host
**What this PR does / why we need it**:
PersistentVolume e2e currently reuses an NFS host pod created at the start of the suite and accessed by each test. This is far less favorable than using a fresh volume per test. Additionally, this guards against the volume host pod or it's kubelet being disrupted, which has led to flakes.
```release-note-none
```
Automatic merge from submit-queue (batch tested with PRs 40906, 40924, 40938, 40902, 40911)
Add [Flaky] tag to persistent volumes tests
**What this PR does / why we need it**:
Persistent Volume tests continue to flake in CI.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
```release-note
NONE
```
Automatic merge from submit-queue
Add an upgrade test for secrets.
**What this PR does / why we need it**: This PR adds an upgrade test for secrets. It creates a secret and makes sure that pods can consume it before an after an upgrade.
Automatic merge from submit-queue (batch tested with PRs 40696, 39914, 40374)
Forgiveness library changes
**What this PR does / why we need it**:
Splited from #34825, contains library changes that are needed to implement forgiveness:
1. ~~make taints-tolerations matching respect timestamps, so that one toleration can just tolerate a taint for only a period of time.~~ As TaintManager is caching taints and observing taint changes, time-based checking is now outside the library (in TaintManager). see #40355.
2. make tolerations respect wildcard key.
3. add/refresh some related functions to wrap taints-tolerations operation.
**Which issue this PR fixes**:
Related issue: #1574
Related PR: #34825, #39469
~~Please note that the first 2 commits in this PR come from #39469 .~~
**Special notes for your reviewer**:
~~Since currently we have `pkg/api/helpers.go` and `pkg/api/v1/helpers.go`, there are some duplicated periods of code laying in these two files.~~
~~Ideally we should move taints-tolerations related functions into a separate package (pkg/util/taints), and make it a unified set of implementations. But I'd just suggest to do it in a follow-up PR after Forgiveness ones done, in case of feature Forgiveness getting blocked to long.~~
**Release note**:
```release-note
make tolerations respect wildcard key
```
Automatic merge from submit-queue (batch tested with PRs 40864, 40666, 38382, 40874)
Promote init containers to GA
This is proposed for 1.6
PR moves beta proved concept for init containers to stable. Specification of init containers can be now stated under initContainers field in PodSpec/PodTemplateSpec. Specifying init-containers in annotation is still possible, but will be removed in future version.
```release-note
Init containers have graduated to GA and now appear as a field. The beta annotation value will still be respected and overrides the field value.
```
Automatic merge from submit-queue (batch tested with PRs 40864, 40666, 38382, 40874)
test: reduce deployment progress deadline, ensure its rs is up
Fixes https://github.com/kubernetes/kubernetes/issues/39785 by reducing the deadline of the expected progress and making sure the new replica set is up before checking the deployment condition.
@kubernetes/sig-apps-misc
Automatic merge from submit-queue (batch tested with PRs 40884, 40809, 40845, 40866, 40875)
Remove many job e2e tests.
These tests have equivalent unit test coverage as far as I can tell, in [pkg/controller/job/jobcontroller_test.go](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/job/jobcontroller_test.go). See #40839 for context.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40812, 39903, 40525, 40729)
test/node_e2e: wire-in cri-enabled local testing
This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.
Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate argument to
the kubelet.
Automatic merge from submit-queue
test: move deployment helper in testing framework
Wanted to get this out of the way before submitting an upgrade test for Deployments and I need the helper in the framework utility
@janetkuo @soltysh
Automatic merge from submit-queue (batch tested with PRs 35782, 35831, 39279, 40853, 40867)
Test GCE PD unmounts and detaches when the namespace of the pvc&pod is deleted.
Addition to Persistent Volume E2E testing. On a GCE cluster, create a pv, pvc, and client pod. Delete the namespace and check that the disk detaches successfully.
@jeffvance
~~DEPENDENT ON~~ #34353 merged. No dependencies.
Automatic merge from submit-queue (batch tested with PRs 39169, 40719, 38954, 40808, 40689)
Add StatefulSets checks at Service level
Hi!
Please let me propose some very small e2e testsuite enhancement.
This PR removed a `TODO` about checking governing service at unit test level (which is hard) and adds this to e2e testsuite.
Thanks
Sebastian
Automatic merge from submit-queue
Add websocket support for port forwarding
#32880
**Release note**:
```release-note
Port forwarding can forward over websockets or SPDY.
```
This commit wires-in the pre-existing `--container-runtime` flag for
local node_e2e testing.
This is needed in order to further skip docker specific testing
and validation.
Local CRI node_e2e can now be performed via
`make test-e2e-node RUNTIME=remote REMOTE=false`
which will also take care of passing the appropriate arguments to
the kubelet.
- split out port forwarding into its own package
Allow multiple port forwarding ports
- Make it easy to determine which port is tied to which channel
- odd channels are for data
- even channels are for errors
- allow comma separated ports to specify multiple ports
Add portfowardtester 1.2 to whitelist
Automatic merge from submit-queue (batch tested with PRs 40527, 40738, 39366, 40609, 40748)
Removed Flaky tag from PV e2e, added [Volume] to disruptive PV e2e
**What this PR does / why we need it**:
Removes `[Flaky]` from PV e2e testing. Flakes were due to interference from an external test disrupting a cluster node. The test has been [removed](9f36c032de) and the flakes have [cleared](https://k8s-testgrid.appspot.com/google-gce#gce-flaky).
Secondly, added `[Volume]` tag to PV disruptive tests.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#39119
**Release note**:
NONE
```release-note
```
Automatic merge from submit-queue
refactor pv e2e code to improve readability
**What this PR does / why we need it**:
Moved the helper functions out of _persistent_volumes.go_ to a new file, _pvutil.go_, in order to improve readability and make it easier to add new tests.
Also, all pod delete code now calls the same helper function `deletePodWithWait`.
**Release note**:
```
NONE
```
Automatic merge from submit-queue
e2e test should fail if err==timeout
If err==timeout, it means the replicaset (the owner) is not deleted after 30s, which indicates a bug, so the e2e test should fail.
Automatic merge from submit-queue (batch tested with PRs 40584, 40319)
ssh support for local
**What this PR does / why we need it**: adds local deployment support for e2e tests. Useful for non-cloud, simple testing.
**Special notes for your reviewer**: Formerly this pr was part of #38214
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40046, 40073, 40547, 40534, 40249)
Fix e2e: validates that InterPodAntiAffinity is respected if matching 2
**What this PR does / why we need it**:
Fixed e2e: validates that InterPodAntiAffinity is respected if matching 2
```
• Failure [120.255 seconds]
[k8s.io] SchedulerPredicates [Serial]
/root/code/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:656
validates that InterPodAntiAffinity is respected if matching 2 [It]
/root/code/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/scheduler_predicates.go:561
Not scheduled Pods: []v1.Pod(nil)
Expected
<int>: 0
to equal
<int>: 1
/root/code/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/scheduler_predicates.go:933
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/30142
**Special notes for your reviewer**:
xref: https://bugzilla.redhat.com/show_bug.cgi?id=1413748
While looking into the above bug, I found that the e2e was failing.
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 39538, 40188, 40357, 38214, 40195)
test for host cleanup in unfavorable pod deletes
addresses issue #31272 with a new e2e test in _kubelet.go_
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 40215, 40340, 39523)
Retry resource quota lookup until count stabilizes
On contended servers the service account controller can slow down,
leading to the count changing during a run. Wait up to 5s for the count
to stabilize, assuming that updates come at a consistent rate, and are
not held indefinitely.
Upstream of openshift/origin#12605 (we create more secrets and flake more often)
Automatic merge from submit-queue
Fix federation component logging when e2e test case fails
When a federation e2e test case fails, federation component logs (esp. controller-manager) were very useful in debugging the failure cause. Due to recent updates in framework, the logs were not captured. This PR will fix those issues.
cc @kubernetes/sig-federation-misc @nikhiljindal @madhusudancs
Automatic merge from submit-queue (batch tested with PRs 40335, 40320, 40324, 39103, 40315)
add some resillency to the new volume-server func
**What this PR does / why we need it**: server-pod `Create` won't fail the test if the server pod already exists.
**Special notes for your reviewer**: Formerly this pr was part of #38214
Automatic merge from submit-queue
Move b.gcr.io/k8s_authenticated_test to gcr.io/k8s-authenticated-test
**What this PR does / why we need it**:
Per https://cloud.google.com/container-registry/docs/support/deprecation-notices, b.gcr.io access will be deprecated soon.
I've already mirrored the repo to the location specified in this PR.
Automatic merge from submit-queue (batch tested with PRs 39260, 40216, 40213, 40325, 40333)
Adding framework to allow multiple upgrade tests
**What this PR does / why we need it**: This adds a framework for multiple tests to run during an upgrade. This also moves the existing services test to that framework.
On contended servers the service account controller can slow down,
leading to the count changing during a run. Wait up to 5s for the count
to stabilize, assuming that updates come at a consistent rate, and are
not held indefinitely.
Automatic merge from submit-queue
Optional configmaps and secrets
Allow configmaps and secrets for environment variables and volume sources to be optional
Implements approved proposal c9f881b7bb
Release note:
```release-note
Volumes and environment variables populated from ConfigMap and Secret objects can now tolerate the named source object or specific keys being missing, by adding `optional: true` to the volume or environment variable source specifications.
```
Automatic merge from submit-queue
move pkg/fields to apimachinery
Purely mechanical move of `pkg/fields` to apimachinery.
Discussed with @lavalamp on slack. Moving this an `labels` to apimachinery.
@liggitt any concerns? I think the idea of field selection should become generic and this ends up shared between client and server, so this is a more logical location.
Automatic merge from submit-queue
make client-go more authoritative
Builds on https://github.com/kubernetes/kubernetes/pull/40103
This moves a few more support package to client-go for origination.
1. restclient/watch - nodep
1. util/flowcontrol - used interface
1. util/integer, util/clock - used in controllers and in support of util/flowcontrol
Automatic merge from submit-queue (batch tested with PRs 39625, 39842)
Add RBAC v1beta1
Add `rbac.authorization.k8s.io/v1beta1`. This scrubs `v1alpha1` to remove cruft, then add `v1beta1`. We'll update other bits of infrastructure to code to `v1beta1` as a separate step.
```release-note
The `attributeRestrictions` field has been removed from the PolicyRule type in the rbac.authorization.k8s.io/v1alpha1 API. The field was not used by the RBAC authorizer.
```
@kubernetes/sig-auth-misc @liggitt @erictune
Automatic merge from submit-queue (batch tested with PRs 34763, 38706, 39939, 40020)
Use Statefulset instead in e2e and controller
Quick fix ref: #35534
We should finish the issue to meet v1.6 milestone.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
Add optional per-request context to restclient
**What this PR does / why we need it**: It adds per-request contexts to restclient's API, and uses them to add timeouts to all proxy calls in the e2e tests. An entire e2e shouldn't hang for hours on a single API call.
**Which issue this PR fixes**: #38305
**Special notes for your reviewer**:
This adds a feature to the low-level rest client request feature that is entirely optional. It doesn't affect any requests that don't use it. The api of the generated clients does not change, and they currently don't take advantage of this.
I intend to patch this in to 1.5 as a mostly test only change since it's not going to affect any controller, generated client, or user of the generated client.
cc @kubernetes/sig-api-machinery
cc @saad-ali
Automatic merge from submit-queue
Fix examples e2e permission check
Ref #39382
Follow-up from #39896
Permission check should be done within the e2e test namespace, not cluster-wide
Also improved RBAC audit logging to make the scope of the permission check clearer
Automatic merge from submit-queue
Remove sleep from DynamicProvisioner test.
The comment says that the sleep is there because of 10 minute PV controller
sync. The controller sync is now 15 seconds and it should be quick enough
to hide this in subsequent `WaitForPersistentVolumeDeleted(.. , 20*time.Minute)`
Automatic merge from submit-queue (batch tested with PRs 38427, 39896, 39889, 39871, 39895)
Grant permissions to e2e examples test service account
ref #39382
Automatic merge from submit-queue
Updated unit tests
@janetkuo updated the flaky unit test to have the same structure with regard to uncasting as the rest of the tests. ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Made cache.Controller to be interface.
**What this PR does / why we need it**:
#37504
Automatic merge from submit-queue
run staging client-go update
Chasing to see what real problems we have in staging-client-go.
@sttts you get similar results?
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts
Automatic merge from submit-queue (batch tested with PRs 39803, 39698, 39537, 39478)
[scheduling] Moved pod affinity and anti-affinity from annotations to api fields #25319
Converted pod affinity and anti-affinity from annotations to api fields
Related: #25319
Related: #34508
**Release note**:
```Pod affinity and anti-affinity has moved from annotations to api fields in the pod spec. Pod affinity or anti-affinity that is defined in the annotations will be ignored.```
Automatic merge from submit-queue
Add [Volume] tag to all the volume-related E2E tests.
**What this PR does / why we need it**:
Tags all the volume/storage related e2e tests to make it easier to run a volume test suite.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
fixes#35542
**Special notes for your reviewer**:
Please let me know if there are tests that should/should not be included.
**Release note**:
NONE
```release-note
```
Automatic merge from submit-queue
[CRI] Don't include user data in CRI streaming redirect URLs
Fixes: https://github.com/kubernetes/kubernetes/issues/36187
Avoid userdata in the redirect URLs by caching the {Exec,Attach,PortForward}Requests with a unique token. When the redirect URL is created, the token is substituted for the request params. When the streaming server receives the token request, the token is used to fetch the actual request parameters out of the cache.
For additional security, the token is generated using the secure random function, is single use (i.e. the first request with the token consumes it), and has a short expiration time.
/cc @kubernetes/sig-node
Automatic merge from submit-queue (batch tested with PRs 39475, 38666, 39327, 38396, 39613)
e2e tests: new portforwardertester with another three tests for case …
PR include:
- add new e2e test cases for BIND_ADDRESS='0.0.0.0'
- add to portforwardertester.go os.Getenv("BIND_ADDRESS") and if not set, it should be localhost for backward compability with existing tests
- for existing tests pass explicity BIND_ADDRESS='localhost'
- rename existing tests
It was mention in the issue: #32128
cc @mzylowski @pskrzyns
Automatic merge from submit-queue (batch tested with PRs 39495, 39547)
Tag persistent volume PersistentVolume E2E [Volume][Serial][Flaky]
**What this PR does / why we need it**:
When run parallel with other tests that use PV(C)s, cross-test binding causes flakes. Add `[Serial]` tag.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: f
Partly addresses #39119
**Special notes for your reviewer**:
cc @saad-ali @jsafrane @jeffvance
Automatic merge from submit-queue (batch tested with PRs 39684, 39577, 38989, 39534, 39702)
Set PodStatus QOSClass field
This PR continues the work for https://github.com/kubernetes/kubernetes/pull/37968
It converts all local usage of the `qos` package class types to the new API level types (first commit) and sets the pod status QOSClass field in the at pod creation time on the API server in `PrepareForCreate` and in the kubelet in the pod status update path (second commit). This way the pod QOS class is set even if the pod isn't scheduled yet.
Fixes#33255
@ConnorDoyle @derekwaynecarr @vishh
Automatic merge from submit-queue (batch tested with PRs 38212, 38792, 39641, 36390, 39005)
Updating federated service controller to support cascading deletion
Ref https://github.com/kubernetes/kubernetes/issues/33612
Service controller is special than other federation controllers because it does not use federatedinformer and updater to sync services (it was written before we had those frameworks).
Updating service controller code to instantiate these frameworks and then use deletion helper to perform cascading deletion.
Note that, I havent changed the queuing logic in this PR so we still dont use federated informer to manage the queue. Will do that in the next PR.
cc @kubernetes/sig-federation-misc @mwielgus @quinton-hoole
```release-note
federation: Adding support for DeleteOptions.OrphanDependents for federated services. Setting it to false while deleting a federated service also deletes the corresponding services from all registered clusters.
```
Automatic merge from submit-queue (batch tested with PRs 39695, 37054, 39627, 39546, 39615)
Add configs that run more advanced density and load tests
Wojtek is on vacation this week - @timothysc can you please take a look? It's rather terrible, but I don't have a better idea on how to make parametric tests.
cc @wojtek-t
Added [Volume] tag per issue #35542; added [Flaky] to GCE tests until confirmed fixed. Added [Serial] to NFS to address possible cross test contamination.
Automatic merge from submit-queue (batch tested with PRs 39394, 38270, 39473, 39516, 36243)
Fix wrong skipf parameter
**How to reproduce**
When run e2e test, it reports `%!!(MISSING)d(MISSING)`:
```
STEP: Checking for multi-zone cluster. Zone count = 1
Dec 6 14:16:43.272: INFO: Zone count is %!!(MISSING)d(MISSING), only run for multi-zone clusters, skipping test
[AfterEach] [k8s.io] Multi-AZ Clusters
```
We need to pass a string parameter to `SkipUnlessAtLeast`
The comment says that the sleep is there because of 10 minute PV controller
sync. The controller sync is now 15 seconds and it should be quick enough
to hide this in subsequent WaitForPersistentVolumeDeleted(.. , 20*time.Minute)
Automatic merge from submit-queue (batch tested with PRs 39493, 39496)
Use privileged containers for host path e2e tests
Test containers need to run as spc_t in order to interact with the host
filesystem under /tmp, as the tests for HostPath are doing. Docker will
transition the container into this domain when running the container as
privileged.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Currently, this test fails with AVC denials like:
```
time->Thu Jan 5 10:17:51 2017
type=SYSCALL msg=audit(1483629471.846:6623): arch=c000003e syscall=257 success=no exit=-13 a0=ffffffffffffff9c a1=c820010120 a2=80241 a3=1a4 items=0 ppid=4112 pid=4130 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="mt" exe="/mt" subj=system_u:system_r:svirt_lxc_net_t:s0:c123,c328 key=(null)
type=AVC msg=audit(1483629471.846:6623): avc: denied { write } for pid=4130 comm="mt" name="sub-path" dev="xvda2" ino=118491348 scontext=system_u:system_r:svirt_lxc_net_t:s0:c123,c328 tcontext=system_u:object_r:container_runtime_tmp_t:s0 tclass=dir
```
```release-note
NONE
```
/cc @ncdc @pmorie
Test containers need to run as spc_t in order to interact with the host
filesystem under /tmp, as the tests for HostPath are doing. Docker will
transition the container into this domain when running the container as
privileged.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Automatic merge from submit-queue
Remove jobs that do not exist from active list of CronJob
**What this PR does / why we need it**: This PR modifies the controller for CronJob to remove from the active job list any job that does not exist anymore, to avoid staying blocked in active state forever. See #37957.
**Which issue this PR fixes**: fixes#37957
**Special notes for your reviewer**:
**Release note**:
```
```
Automatic merge from submit-queue (batch tested with PRs 38433, 36245)
Allow pods to define multiple environment variables from a whole ConfigMap
Allow environment variables to be populated from ConfigMaps
- ConfigMaps represent an entire set of EnvVars
- EnvVars can override ConfigMaps
fixes#26299
Automatic merge from submit-queue (batch tested with PRs 39092, 39126, 37380, 37093, 39237)
Endpoints with TolerateUnready annotation, should list Pods in state terminating
**What this PR does / why we need it**:
We are using preStop lifecycle hooks to gracefully remove a node from a cluster. This hook is potentially long running and after the preStop hook is fired, the DNS resolution of the soon to be stopped Pod is failing, which causes a failure there.
**Special notes for your reviewer**:
Would be great to backport that to 1.4, 1.3
**Release note**:
```release-note
Endpoints, that tolerate unready Pods, are now listing Pods in state Terminating as well
```
@bprashanth
Automatic merge from submit-queue
Add Persistent Volume E2E in the context of a disrupted kubelet
This PR adds a test suite for persistent volumes affected by a disrupted kubelet. Two cases are presented:
1. A volume mounted via PVC remains accessible after a kubelet restart.
2. When a pod is deleted while the kubelet is down, the mounted volume is unmounted successfully.
Automatic merge from submit-queue
Remove system:anonymous check from kubectl test
This verbiage doesn't appear when the cluster is `AlwaysAllow` (and just makes the check more brittle).
Follow-on to #39263, this is the last (consistent) failure on [kops-aws](https://k8s-testgrid.appspot.com/google-aws#kops-aws&sort-by-failures=)
Automatic merge from submit-queue
Avoid unnecessary memory allocations
Low-hanging fruits in saving memory allocations. During our 5000-node kubemark runs I've see this:
ControllerManager:
- 40.17% k8s.io/kubernetes/pkg/util/system.IsMasterNode
- 19.04% k8s.io/kubernetes/pkg/controller.(*PodControllerRefManager).Classify
Scheduler:
- 42.74% k8s.io/kubernetes/plugin/pkg/scheduler/algrorithm/predicates.(*MaxPDVolumeCountChecker).filterVolumes
This PR is eliminating all of those.
Automatic merge from submit-queue
CreateNodeSelectorPods should respect parameter
Fix (1): `CreateNodeSelectorPods` should respect parameter `id`.
The existing e2e does not break because it happened use "node-selector" as id, which is the same as the hard coded value.
Fix (2): The current `CreateNodeSelectorPods` does not use `nodeSelector` parameter, it hard coded a label instead.
The reason current e2e does not influenced because we happened use the same label: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/cluster_size_autoscaling.go#L177
Found these bugs during testing #36238
Automatic merge from submit-queue (batch tested with PRs 39059, 39175, 35676, 38655)
ReplicaSet has onwer ref of the Deployment that created it
**What this PR does / why we need it**:
This enabled garbage collection for ReplicaSets and ensures they are owned by their respective Deployment objects.
fixes https://github.com/kubernetes/kubernetes/issues/33845
This is an initial PR to get feedback. Will update this quickly with unit tests if this seems like in the right direction
Automatic merge from submit-queue
In-cluster configs must take flag overrides into account
**What this PR does / why we need it**: Some flags must override in-cluster configs if provided to `kubectl` inside a cluster.
**Which issue this PR fixes**: Fixes https://github.com/kubernetes/kubernetes/issues/38834
**Release note**:
```release-note
Fixed a bug where the --server, --token, and --certificate-authority flags were not overriding the related in-cluster configs when provided in a `kubectl` call inside a cluster.
```
Automatic merge from submit-queue
Add test to detach a pd whose node was deleted
**What this PR does / why we need it**:
A test for the following issue :
If a node with a GCE PD attached is deleted (before the volume is detached), subsequent attempts by the attach/detach controller to detach it should not fail.
**Bonus** :Added additional code to ensure that the pd can still be attached to a different node.
Edit : Removed it as it was making the test much slower.
https://github.com/kubernetes/kubernetes/issues/29358
Automatic merge from submit-queue (batch tested with PRs 38426, 38917, 38891, 38935)
Support different image during GCE node upgrade
**What this PR does / why we need it**: It lets GCE upgrade tests upgrade to a GCI node image.
**Which issue this PR fixes**: fixes#37855
Automatic merge from submit-queue (batch tested with PRs 38942, 38958)
Added MULTIZONE flag to e2e remove master script.
Added MULTIZONE flag to e2e remove master script. The script is used by HA tests which set-up multizone cluster.
Automatic merge from submit-queue (batch tested with PRs 34353, 33837, 38878)
Add e2e test for configmap volume
There are two patches:
- refactor e2e volume tests to allow multiple volumes mounted into single pod
- add a test for ConfigMap volume mounted twice to test #28502
Automatic merge from submit-queue (batch tested with PRs 34353, 33837, 38878)
Gce persistentvolume testing
Add E2E PersistentVolume test for a GCE environment. Tests that deleting a PV or PVC before the referencing pod does not fail on unmount and detach during pod deletion.
cc @jeffvance
Automatic merge from submit-queue (batch tested with PRs 37468, 36546, 38713, 38902, 38614)
Remove extensions/v1beta1 Job
Fixes https://github.com/kubernetes/kubernetes/issues/32763. This endpoint was deprecated in 1.5 and was planned to be removed in 1.6.
**Release note**:
```release-note
Remove extensions/v1beta1 Jobs resource, and job/v1beta1 generator.
```
Automatic merge from submit-queue (batch tested with PRs 37468, 36546, 38713, 38902, 38614)
Adds e2e firewall tests for LoadBalancer service, ingress, and e2e cluster
Fixes#25488 and fixes#31827.
This PR adds e2e firewall test for LoadBalancer type service, ingress and e2e cluster.
Test details for LoadBalancer type service as below:
- Verifies corresponding firewall rule has correct `sourceRanges`, `ports and protocols` and `target tags`.
- Verifies requests can reach all expected instances.
- Verifies requests can not reach instances that are not included.
Overview of the test procedure:
- Creates a LoadBalancer type service.
- Validates the corresponding firewall rule.
- Creates netexec pods as service backends.
- Sends requests from outside of the cluster and examine hitting all instances in range.
- Removes tags from one of the instances in order to get it out of firewall rule's range.
- Sends requests from outside of the cluster and examine not hitting this instance.
- Recovers tags for this instances and verifies its traffic is back.
@bprashanth @bowei @thockin
For LoadBalancer type service:
- Verifies corresponding firewall rule has correct sourceRanges, ports
& protocols, target tags.
- Verifies requests can reach all expected instances.
- Verifies requests can not reach instances that are not included.
For Ingress resrouce:
- Verifies the ingress firewall rule has correct sourceRanges, target
tags and tcp ports.
For general e2e cluster:
- Verifies all required firewall rules has correct sourceRange, ports
& protocols, source tags and target tags.
- Verifies well know ports on master and nodes are not
exposed externally
Automatic merge from submit-queue
Don't check nodeport for nginx ingress
Services behind a standard nginx ingress don't need nodeport, so don't check that.
Extracted delete operations into functions
wait on pv/pvc bind
removed redundant verification, minor refactors
GCEPD: fixed typo
name verifyDiskAttached to verifyGCEDiskAttached
fix empty log msg
Updated test owners
removed unnecessary api calls
Check for apierr IsNotFound for pod,pv,pvc but ignore result
Disable dynamic provisioning in test PVCs
gofmt'd
Automatic merge from submit-queue
Fix Recreate for Deployments and stop using events in e2e tests
Fixes https://github.com/kubernetes/kubernetes/issues/36453 by removing events from the deployment tests. The test about events during a Rolling deployment is redundant so I just removed it (we already have another test specifically for Rolling deployments).
Closes https://github.com/kubernetes/kubernetes/issues/32567 (preferred to use pod LISTs instead of a new status API field for replica sets that would add many more writes to replica sets).
@kubernetes/deployment
Automatic merge from submit-queue (batch tested with PRs 38830, 38750)
[Federation] Stop cleaning federation namespace in e2e tests
when --clean-start=true flag is provided to e2e tests it would cleanup all the leftover namespaces except `default` and `kube-system` and because of this when we run e2e tests in federation soak test job, the federation control plane is destroyed before it runs the tests and all tests start to fail.
So adding federation-system to the list of namespace to be left intact and also changed the default federation namespace name from `federation` to `federation-system` to be consistent with the newer method of deploying federation using kubefed.
@madhusudancs @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 38830, 38750)
Remove the ReadyReplica version guard
**What this PR does / why we need it**: Removes outlived version guards.
**Which issue this PR fixes**: fixes#37310
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)
Rename "release_1_5" clientset to just "clientset"
We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset", clientset development will be done in this directory.
@kubernetes/sig-api-machinery @deads2k
```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
Automatic merge from submit-queue
Add a package for handling version numbers (including non-"Semantic" versions)
As noted in #32401, we are using Semantic Version-parsing libraries to parse version numbers that aren't necessarily "Semantic". Although, contrary to what I'd said there, it turns out that this wasn't actually currently a problem for the iptables code, because the regexp used to extract the version number out of the "iptables --version" output only pulled out three components, so given "iptables v1.4.19.1", it would have extracted just "1.4.19". Still, it could be a problem if they later release "1.5" rather than "1.5.0", or if we eventually need to _compare_ against a 4-digit version number.
Also, as noted in #23854, we were also using two different semver libraries in different parts of the code (plus a wrapper around one of them in pkg/version).
This PR adds pkg/util/version, with code to parse and compare both semver and non-semver version strings, and then updates kubernetes to use it everywhere (including getting rid of a bunch of code duplication in kubelet by making utilversion.Version implement the kubecontainer.Version interface directly).
Ironically, this does not actually allow us to get rid of either of the vendored semver libraries, because we still have other dependencies that depend on each of them. (cadvisor uses blang/semver and etcd uses coreos/go-semver)
fixes#32401, #23854
Automatic merge from submit-queue
Add an option to run Job in Density/Load config
cc @timothysc @jeremyeder
@erictune @soltysh - I run this test and it seems to me that Job has noticeably worse performance than Deployment. I'll create an issue for this, but this PR is for easy repro.
Automatic merge from submit-queue (batch tested with PRs 36419, 38330, 37718, 38244, 38375)
Guarantees drop packets commands succeed in reboot test
Fixes the main case in #33405 and #36230.
Previous attempted fix in #38057.
During the reboot test, the iptables command that was supposed to take the node offline failed to exec.
Turned out the xtables lock was holding by other processes led to this failure. Logs as below:
```
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: stdout:
"+ sleep 10
+ sudo iptables -I INPUT 1 -s 127.0.0.1 -j ACCEPT
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?"
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: stderr: ""
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: exit code: 0
```
This reboot test won't pass if any one of these iptables commands fails. This PR put "reboot" commands into while loops to guarantee it retries until succeed.
`sudo iptables -t filter -nL` is removed since it is clear now that the `FILTER` rules won't be clobbered.
(Tests passed on local cluster.)
@bprashanth
Automatic merge from submit-queue (batch tested with PRs 36071, 32752, 37998, 38350, 38401)
Pass addressable values to DeepCopy
Extracted from https://github.com/kubernetes/kubernetes/pull/35728
These are the places we are currently calling DeepCopy incorrectly, and we need to fix, even if we don't pick up the changes to DeepCopy in #35728:
* creating a new cloner means we have no generated functions registered
* passing non-addressable values doesn't pick up generated deep copy functions, and forces us into reflective mode
Automatic merge from submit-queue (batch tested with PRs 37092, 37850)
Turns on dns horizontal scaling tests for GKE
Seems like the dns-autoscaler is already enabled in [this recent gke build](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gke/769/).
Turning on the corresponding e2e tests to increase test coverage.
Probably better to wait for this fix#37261 to go in first.
@bowei @bprashanth
cc @maisem @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)
Cleanup firewalls, add nginx ingress to presubmit
Make the firewall cleanup code follow the same pattern as the other cleanup functions, and add the nginx ingress e2e to presubmit.
Planning to watch the test for a bit, and if it works alright, I'll add the other Ingress e2e to post-submit merge blocker.
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)
Fix running e2e with 'Completed' kube-system pods
As of now, e2e runner keeps waiting for pods in `kube-system` namespace to be "Running and Ready" if there are any pods in `Completed` state in that namespace.
This for example happens after following [Kubernetes Hosted Installation](http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/#kubernetes-hosted-installation) instructions for Calico, making it impossible to run conformance tests against the cluster. It's also to possible to reproduce the problem like that:
```
$ cat testjob.yaml
apiVersion: batch/v1
kind: Job
metadata:
name: tst
namespace: kube-system
spec:
template:
metadata:
name: tst
spec:
containers:
- name: tst
image: busybox
command: ["echo", "test"]
restartPolicy: Never
$ kubectl create -f testjob.yaml
$ go run hack/e2e.go -v --test --test_args='--ginkgo.focus=existing\s+RC'
```
Automatic merge from submit-queue
Delete regional static-ip instead of global for type=lb
Global vs region is the difference between
```
$ gcloud compute addresses delete foo --global
$ gcloud compute addresses delete foo --region us-central1
```
Type=LoadBalancer users the second type and were were doing the first.
Also adds some logging.
Automatic merge from submit-queue (batch tested with PRs 38294, 37009, 36778, 38130, 37835)
fix permissions when using fsGroup
Currently, when an fsGroup is specified, the permissions of the defaultMode are not respected and all files created by the atomic writer have mode 777. This is because in `SetVolumeOwnership()` the `filepath.Walk` includes the symlinks created by the atomic writer. The symlinks have mode 777 when read from `info.Mode()`. However, when the are chmod'ed later, the chmod applies to the file the symlink points to, not the symlink itself, resulting in the wrong mode for the underlying file.
This PR skips chmod/chown for symlinks in the walk since those operations are carried out on the underlying file which will be included elsewhere in the walk.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1384458
@derekwaynecarr @pmorie
Automatic merge from submit-queue (batch tested with PRs 38173, 38151, 38197, 38221)
test: wait for ready replica set before adopting
Reworked version of https://github.com/kubernetes/kubernetes/pull/36439 which was reverted in https://github.com/kubernetes/kubernetes/pull/38049. This PR doesn't use any of the new status API added in replica sets so it should cause no trouble with upgrade tests.
@kubernetes/deployment @smarterclayton
Automatic merge from submit-queue (batch tested with PRs 37032, 38119, 38186, 38200, 38139)
New ns param for NewClusterVerification
**What this PR does / why we need it**: Allows the test to specify alternate namespaces to when waiting for pods to be in a specific state.
**Which issue this PR fixes**: fixes#38138
**Special notes for your reviewer**: Minor fix
**Release note**: None
Automatic merge from submit-queue
[Federation] Separate the cleanup phases of service and service shards so that service shards can be cleaned up even after the service is deleted elsewhere.
Fixes Federated Service e2e test.
This separation is necessary because "Federated Service DNS should be
able to discover a federated service" e2e test recently added a case
where it deletes the service from federation but not the shards from
the underlying clusters.
Because of the way cleanup was implemented in the AfterEach block
currently, we did not cleanup any of the underlying shards. However,
separating the two phases of the cleanup needs this separation.
cc @kubernetes/sig-cluster-federation @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 37328, 38102, 37261, 31321, 38146)
Fixes flake: wait for dns pods terminating after test completed
From #37194. Based on #36600. Please only look at the second commit.
As mentioned in [comment](https://github.com/kubernetes/kubernetes/issues/37194#issuecomment-262007174), "DNS horizontal autoscaling" test does not wait for the additional pods to be terminated and this may lead to the failure of later tests.
This fix adds a wait loop at the end of the serial test to ensure the cluster recovers to the original state. In the non-serial test it does not wait for the additional pods terminating because it will not affect other tests, given they are able to be run simultaneously. Plus wait for pods terminating will take certain amount of time.
Note this only fixes certain case of #37194. I noticed there are other failures irrelevant to dns autoscaler. LIke [this one](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-serial/34/).
@bprashanth @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 37328, 38102, 37261, 31321, 38146)
Make thirdparty codec able to decode DeleteOptions
Fix#37278.
Without this PR, the gvk sent to the delegated codec will be the thirdparty one, which is not recognized by the delegated codec (usually api.Codecs).
Automatic merge from submit-queue (batch tested with PRs 38076, 38137, 36882, 37634, 37558)
Make logging for gcl e2e test more verbose
To help debug https://github.com/kubernetes/kubernetes/issues/37241
CC @piosz
Automatic merge from submit-queue (batch tested with PRs 36352, 36538, 37976, 36374)
test: update deployment helper to return better error messages
@kubernetes/deployment the problem with https://github.com/kubernetes/kubernetes/issues/36270 is that the selector key is never added in the deployment but this change would make it clearer.
This separation is necessary because "Federated Service DNS should be
able to discover a federated service" e2e test recently added a case
where it deletes the service from federation but not the shards from
the underlying clusters.
Because of the way cleanup was implemented in the AfterEach block
currently, we did not cleanup any of the underlying shards. However,
separating the two phases of the cleanup needs this separation.
Automatic merge from submit-queue (batch tested with PRs 38049, 37823, 38000, 36646)
Revert "test: update rollover test to wait for available rs before adopting"
This reverts commit 5b7bf78f3f from pr #36439 which appears to have mostly broken the gci-gke test.
Automatic merge from submit-queue (batch tested with PRs 37094, 37663, 37442, 37808, 37826)
Moved gobindata, refactored ReadOrDie refs
**What this PR does / why we need it**: Having gobindata inside of test/e2e/framework prevents external projects from importing the framework. Moving it out and managing refs fixes this problem.
**Which issue this PR fixes**: fixes#37007
Automatic merge from submit-queue (batch tested with PRs 37997, 37939, 37990, 36700, 37258)
Add cluster-level AppArmor E2E test
My goal is to reuse this test for an automated cluster upgrade test.
Automatic merge from submit-queue
test: update rollover test to wait for available rs before adopting
Scenario that happened in https://github.com/kubernetes/kubernetes/issues/35355#issuecomment-257808460
-- Replica set that is about to be adopted has 2 out of 4 ready replicas
-- Deployment is created with 4 replicas, adopts pre-existing replica set, creates a new one, and starts rolling replicas over to the new replica set.
```
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled down replica set test-rollover-controller to 3
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 1
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747: {replicaset-controller } SuccessfulCreate: Created pod: test-rollover-deployment-2505289747-iuiei
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747-iuiei: {default-scheduler } Scheduled: Successfully assigned test-rollover-deployment-2505289747-iuiei to gke-jenkins-e2e-default-pool-33c0400e-6q5m
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:05 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 2
```
At this point there is no minimum availability for the Deployment (maxUnavailable is 1 meaning desired minimum available is 3 but we only have 2), and the new replica set uses a non-existent image. New replica set is scaled up to 1 (maxSurge is 1), then old replica set is scaled down by one, because cleanupUnhealthyReplicas observes that it has 2 unhealthy replicas - it can only scale down one though because the [maximum replicas it can cleanup is one](d87dfa2723/pkg/controller/deployment/rolling.go (L125)) (4+1-3-1). New replica set is scaled to 2. Available replicas are still 2 (third replica from the old replica set has yet to come up).
-- Deployment is rolled over with a new update. Test reaches for the WaitForDeploymentStatus check but there are only 2 availableReplicas (maxUnavailable is still violated).
This change makes the test wait for a healthy replica set before proceeding thus it should never hit the scenario described above.
@kubernetes/deployment
- Remaining spaghetti untangled
- Missed bazel update and a few hardcoded refs
- New instance of framework.ReadOrDie reference removed post rebase
- Resolve new clientset rebase
- Fixed e2e/generated BUILD dep
- A space
- Missed gobindata ref in golang.sh
Automatic merge from submit-queue
Build vendored copy of go-bindata and use that in go generate step
**What this PR does / why we need it**: as the title says, uses the vendored version of `go-bindata` rather than expecting developers to `go get` it (when building outside docker).
**Which issue this PR fixes**: fixes#34067, partially addresses #36655
**Special notes for your reviewer**: we still call `go generate` far too many times:
```console
~/.../src/k8s.io/kubernetes $ which go-bindata
~/.../src/k8s.io/kubernetes $ make
+++ [1116 17:35:28] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:29] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:30] Building go targets for linux/amd64:
cmd/libs/go2idl/deepcopy-gen
+++ [1116 17:35:35] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:35] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:36] Building go targets for linux/amd64:
cmd/libs/go2idl/defaulter-gen
+++ [1116 17:35:41] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:41] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:42] Building go targets for linux/amd64:
cmd/libs/go2idl/conversion-gen
+++ [1116 17:35:47] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:47] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:48] Building go targets for linux/amd64:
cmd/libs/go2idl/openapi-gen
+++ [1116 17:35:56] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:56] Generating bindata:
test/e2e/framework/gobindata_util.go
```
Fixing that is a separate effort, though.
cc @sebgoa @ZhangBanger
Automatic merge from submit-queue
Cleanup old cloud resources after 48 hours
With this pr the ingress e2e purges old leaked resources (>48h), so even if tests fail due to leaks, the entire queue won't close till someone bumps up quota through a manual request.
Automatic merge from submit-queue
Adds termination hook in reboot test for debugging
From #33405 and #36230.
Logs the SSH command issued for dropping inbound / outbound traffic to file and dump it out when test ends.
The first `sudo iptables -t filter -nL` is called to confirm the rules are injected. The second `sudo iptables -t filter -nL` is to check whether the rules get clobbered. Adds `date` in between to check time frame.
@bprashanth @freehan
Automatic merge from submit-queue
Update Stateful Set example files for 1.5
1. Remove initialized annotation from statefulset examples
2. Update storage class annotation to beta in statefulset examples
3. Remove alpha limitation on PetSet in cassandra example
cc @erictune @foxish @kow3ns @enisoc @chrislovecnm @kubernetes/sig-apps
```release-note
NONE
```
Automatic merge from submit-queue
Fix package aliases to follow golang convention
Some package aliases are not not align with golang convention https://blog.golang.org/package-names. This PR fixes them. Also adds a verify script and presubmit checks.
Fixes#35070.
cc/ @timstclair @Random-Liu
Automatic merge from submit-queue
Revision handling in federated deployment controller
Deployment controller in regular kubernetes automatically adds an annotation in deployment. This causes a bit of confusion in controller and tests. This PR skips revision annotation in checks. In the next K8S release we will need to have better support for deployment revisions.
Helps with #36588
cc: @nikhiljindal @madhusudancs
Automatic merge from submit-queue
Stop deleting underlying services when federation service is deleted
Fixes https://github.com/kubernetes/kubernetes/issues/36799
Fixing federation service controller to not delete services from underlying clusters when federated service is deleted.
None of the federation controller should do this unless explicitly asked by the user using DeleteOptions. This is the only federation controller that does that.
cc @kubernetes/sig-cluster-federation @madhusudancs
```release-note
federation service controller: stop deleting services from underlying clusters when federated service is deleted.
```
Automatic merge from submit-queue
Skip rather than fail networking tests on single node
**What this PR does / why we need it**:
Needed for the general e2e tidying we need to do for flakey slow tests, imo pre 1.5, see #31402 and so on.
**Which issue this PR fixes** *
Dont fail multinode tests if on a single node cluster, skip instead.
Automatic merge from submit-queue
Fix nil pointer dereference in test framework
Checking the `result.Code` prior to `err` in the if statement causes a panic if result is `nil`. It turns out the formatting of the error is already in `IssueSSHCommandWithResult`, so removing redundant code is enough to fix the issue. Logging the SSH result was also redundant, so I removed that as well.
Automatic merge from submit-queue
Fixes dns autoscaling test flakes
Fixes #36457 and fixes#36569.
#36457 is flake due to the 10 minutes timeout for scaling down cluster. Changes to use `scaleDownTimeout` from [test/e2e/cluster_size_autoscaling.go](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/cluster_size_autoscaling.go), which is 15 minutes.
The failure in #36569 is because we get the schedulable nodes number at the beginning of the test and assume it will not change unless we manually change the cluster size. But below logs indicate there may be nodes become ready after the test has begun.
```
[BeforeEach] [k8s.io] DNS horizontal autoscaling
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/dns_autoscaling.go:71
Nov 10 00:36:26.951: INFO: Condition Ready of node jenkins-e2e-minion-group-x6w1 is false instead of true. Reason: KubeletNotReady, message: Kubenet does not have netConfig. This is most likely due to lack of PodCIDR
STEP: Replace the dns autoscaling parameters with testing parameters
Nov 10 00:36:26.961: INFO: DNS autoscaling ConfigMap updated.
STEP: Wait for kube-dns scaled to expected number
Nov 10 00:36:26.961: INFO: Waiting up to 5m0s for kube-dns reach 8 replicas
...
Expected error:
<*errors.errorString | 0xc420b17ef0>: {
s: "err waiting for DNS replicas to satisfy 8, got 9: timed out waiting for the condition",
}
err waiting for DNS replicas to satisfy 8, got 9: timed out waiting for the condition
not to have occurred
```
This fix puts the logic of counting schedulable nodes into the polling loop. By doing so, the test will have the correct expected replicas count even if schedulable nodes change in between.
@bowei @bprashanth
---
Updates: all `ExpectNoError(err)` are changed to `Expect(err).NotTo(HaveOccurred())`
Automatic merge from submit-queue
federation service e2e: Creating configmap for kube-dns
Ref #37105 and #37143.
Creating kube-dns config map to pass federations flag to kube-dns.
This is required since we moved to the new add on manager. With the old add on manager, we were using kube-dns rc that included the federations flag.
cc @kubernetes/sig-cluster-federation @madhusudancs @bowei @MrHohn
Verified that the tests pass with this change.
Checking the result.Code prior to err in the if statement causes a panic
if result is nil. It turns out the formatting of the error is already in
IssueSSHCommandWithResult, so removing redundant code is enough to fix
the issue. Logging the SSH result was also redundant, so I removed that
as well.
Automatic merge from submit-queue
Modify GCI mounter to enable NFSv3
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.
After this change, need to make push the image and update the sha number in Changelog.
Automatic merge from submit-queue
Guard the ready replica checking by server version
I fixed replica readiness checking for 1.4->1.5 upgrades by using a field that only exists in versions >=1.4.0 in #36924
This fixed a lot of issues in 1.4->1.5 upgrade testing, but did not fix 1.3->1.5 upgrade tests. I've disabled replica checking for 1.3 masters as the old logic was broken anyway.
This will not affect the 1.3 CI tests. Just 1.3 -> {1.4, 1.5} upgrade tests.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gke-container_vm-1.3-container_vm-1.5-upgrade-cluster-new/330?log
is an example of this breakage. This is the tell-tale logs:
```console
Nov 22 09:40:50.469: INFO: 11 / 11 pods in namespace 'kube-system' are running and ready (506 seconds elapsed)
Nov 22 09:40:50.469: INFO: expected 5 pod replicas in namespace 'kube-system', 0 are Running and Ready.
Nov 22 09:40:50.469: INFO: POD NODE PHASE GRACE CONDITIONS
```
Automatic merge from submit-queue
Use netexec container in http lifecycle hook test.
Fixes https://github.com/kubernetes/kubernetes/issues/33636.
The original test is using `"echo -e \"HTTP/1.1 200 OK\n\" | nc -l -p 1234` as a simple http server.
However, it seems that this is not very reliable, which may response before golang thinks it should.
So we get the error:
```
I1106 06:14:13.325397 2096 logs.go:41] Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 200 OK\n\n"; err=<nil>
```
This PR changes the test to use the `netexec` container which is a simple http server written by golang and used in many of our networking e2e test. It should be more reliable.
Mark 1.5 since this is fixing a 1.5 release blocking issue. Mark P0 to match the original issue.
@dchen1107
This disables ready replica checking for 1.3 masters, but only from 1.4
or 1.5 clients. The old logic was broken anyway due to overlapping
labels with replica sets.
Automatic merge from submit-queue
Fixed e2e tests for HA master.
Set of fixes that allows HA master e2e tests to pass for removal/addition master replicas.
The summary of changes:
- fixed host name in etcd certs,
- added cluster validation after kube-down,
- fixed the number of master replicas in cluster validation,
- made MULTIZONE=true required for HA master deployments, ensured we correctly handle MULTIZONE=true when user wants to create HA master but not kubelets in multiple zones,
- extended verification of master replicas in HA master e2e tests.
Automatic merge from submit-queue
Test flake: ignore dig failure; wait until timeout
The intent of this test was to continue trying until the timeout has
been reached. The extra assertion makes the test fail early when it
should not.
https://github.com/kubernetes/kubernetes/issues/37144
fix 37114
Automatic merge from submit-queue
Update the timeout in CLOSE_WAIT e2e test
I see some test flakes due to the timeout being too strict,
updating to a larger value.
Adding tail -n 1, it looks like there may be leftover state for other
runs. We really only care about one of the CLOSE_WAIT entries.
Automatic merge from submit-queue
Test case for scalling down before scale up finished
Verifies that current pet (one which was created last by scale up action)
will enter running before scale down will take place
related: #30082
In order to make NFSv3 work, mounter needs to start rpcbind daemon. This
change modify mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.
After this change, need to make push the image and update the sha number in Changelog.
I see some test flakes due to the timeout being too strict,
updating to a larger value.
Adding tail -n 1, it looks like there may be leftover state for other
runs. We really only care about one of the CLOSE_WAIT entries.
Automatic merge from submit-queue
Validate pet set scale up/down order
This change implements test cases to validate that:
- scale up will be done in order
- scale down in reverse order
- if any of the pets will be unhealthy, scale up/down will halt
related: https://github.com/kubernetes/kubernetes/issues/30082
Verifies that current pet (one which was created last by scale up action)
will enter running before scale down will take place
Change-Id: Ib47644c22b3b097bc466f68c5cac46c78dd44552
This change implements test cases to validate that:
scale up will be done in order
scale down in reverse order
if any of the pets will be unhealthy, scale up/down will halt
Change-Id: I4487a7c9823d94a2995bbb06accdfd8f3001354c
Automatic merge from submit-queue
Retry job update after failure to prevent modification conflict
This fixes#34585 flake.
@janetkuo || @kubernetes/sig-apps ptal
I've been getting too many emails recently wrt to that issue, so I wanted to "clean" my inbox a bit 😉
Automatic merge from submit-queue
Add limited config-map support to kube-dns
This is an integration bugfix for https://github.com/kubernetes/kubernetes/issues/36194
```release-note
kube-dns
Added --config-map and --config-map-namespace command line options.
If --config-map is set, kube-dns will load dynamic configuration from the config map
referenced by --config-map-namespace, --config-map. The config-map supports
the following properties: "federations".
--federations flag is now deprecated. Prefer to set federations via the config-map.
Federations can be configured by settings the "federations" field to the value currently
set in the command line.
Example:
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-dns
namespace: kube-system
data:
federations: abc=def
```
- Adds command line flags --config-map, --config-map-ns.
- Fixes 36194 (https://github.com/kubernetes/kubernetes/issues/36194)
- Update kube-dns yamls
- Update bazel (hack/update-bazel.sh)
- Update known command line flags
- Temporarily reference new kube-dns image (this will be fixed with
a separate commit when the DNS image is created)
Automatic merge from submit-queue
Add e2e test for statefulset updates
Verify that one can (manually) update statefulset template
cc @erictune @foxish @kow3ns @kubernetes/sig-apps
Automatic merge from submit-queue
Wait for all Nodes to be schedulable before running e2e tests
This should fix the problem we're seeing when running tests on large clusters.
cc @dchen1107
Automatic merge from submit-queue
Delete taint annotation when removing last taint
It messes with debugging of tests failures.
cc @davidopp @kevin-wangzefeng
Automatic merge from submit-queue
Add more test cases to k8s e2e upgrade tests
**Special notes for your reviewer**:
Added guestbook, secrets, daemonsets, configmaps, jobs to e2e upgrade tests according to the discussions in #35078
Still need to run these test cases in real setup, raised a PR here for initial comments
@quinton-hoole
Automatic merge from submit-queue
Add clarity/retries to proxy url test
Improve one segment of the kube-proxy networking test by:
1. Retrying for 30s
2. Bucketing into 2 failure modes
3. Adding some clarity by describing the exec pod on failure
Althought 1 shouldn't be necessary, I don't think we lose anything if the kube-proxy convenience endpoint doesn't respond immediately, and if it fails for 30s straight it is indicative of something that requires attention probably within 1.5.
Fixes https://github.com/kubernetes/kubernetes/issues/32436
Automatic merge from submit-queue
Enable NFSv4 and GlusterFS tests on cluster e2e tests
Enable NFSv4 and GlusterFS tests on cluster e2e tests for GCI images
only.
Automatic merge from submit-queue
Add e2e test for CockroachDB statefulset
Refactor the code of statefulset e2e test for clustered applications, and add a test for CockroachDB.
The yaml file is copied from examples/cockroachdb/
cc @erictune @foxish @kow3ns @kubernetes/sig-apps
Automatic merge from submit-queue
e2e pod cleanup test: restrict pods to be assigned to nodes observed …
The test checks the individual kubelet /runningPods endpoint based on the
initial list of nodes it observes. It is important that all pods are
scheduled only onto those nodes. Apply node labels to ensure no stray pods on
other nodes.
This fixes#35197
kubectl-in-pod.json has a hardcoded path for
kubectl in its hostPath ('/usr/bin/kubectl').
When the test actually runs on gke, kubectl
is available at '/workspace/kubernetes/platforms/linux/amd64/kubectl'
and NOT at '/usr/bin/kubectl'. So we need to
fix the hostPath for the test to work properly.
Fixes#36586
Automatic merge from submit-queue
Cleanup pod in MatchContainerOutput
MatchContainerOutput always creates a pod and does not cleanup. We need
to fix this to be better at re-trying the scenarios.
When there is an error say in the first attempt of ExpectNoErrorWithRetries
(for example in "Pods should contain environment variables for services" test)
the retries logic calls MatchContainerOutput another time and the
podClient.create fails correctly since the pod was not cleaned up the
first time MatchContainerOutput was called.
Fixes#35089
MatchContainerOutput always creates a pod and does not cleanup. We need
to fix this to be better at re-trying the scenarios.
When there is an error say in the first attempt of ExpectNoErrorWithRetries
(for example in "Pods should contain environment variables for services" test)
the retries logic calls MatchContainerOutput another time and the
podClient.create fails correctly since the pod was not cleaned up the
first time MatchContainerOutput was called.
Fixes#35089
Automatic merge from submit-queue
[kubelet] rename --cgroups-per-qos to --experimental-cgroups-per-qos
This reflects the true nature of "cgroups per qos" feature.
```release-note
* Rename `--cgroups-per-qos` to `--experimental-cgroups-per-qos` in Kubelet
```
Automatic merge from submit-queue
tests: update memory resource limits
```release-note
NONE
```
On ubuntu, the `RestartNever` test OOMs its cgroup limit fairly
frequently.
This bumps the number up to something suitably large since the test
isn't testing anything related to this anyways.
Fixes#36159
Fix based on https://github.com/kubernetes/kubernetes/issues/36159#issuecomment-258992255
cc @yujuhong @saad-ali
The test checks the individual kubelet /runningPods endpoint based on the
initial list of nodes it observes. It is important that all pods are
scheduled only onto those nodes. Apply node labels to ensure no stray pods on
other nodes.
On ubuntu, the `RestartNever` test OOMs its cgroup limit fairly
frequently.
This bumps the number up to something suitably large since the test
isn't testing anything related to this anyways.
Fixes#36159
Automatic merge from submit-queue
Use generous limits in the resource usage tracking tests
These tests are mainly used to catch resource leaks in the soak cluster. Using
higher limits to reduce noise.
This should fix#32942 and #32214.
Automatic merge from submit-queue
Add back e2e tests for disruption budget
The tests were temporarily removed due to problems with api versions in client-go. The test is almost exactly the same as it used to be before api version change.
cc: @caesarxuchao @davidopp
Automatic merge from submit-queue
Restore event messages for replica sets in the deployment controller
Needed to unblock release upgrade tests (see https://github.com/kubernetes/kubernetes/issues/36453)
@kubernetes/deployment ptal
Automatic merge from submit-queue
Node Conformance & E2E: Get node name from node object.
This PR changes the node e2e test framework to get node name from apiserver instead of test flags.
When a user tried out the node conformance test, he found that node conformance test will not work properly if kubelet is started with `hostname-override`.
The reason is that node conformance test is using [the default node name - `os.Hostname`](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L124), which may be different from `hostname-override`. This will cause test pods not scheduled, and eventually test timeout.
We can expose a flag from node conformance test, and let user set node name themselves if they are using `hostname-override` on kubelet. However, let the framework automatically detect it from apiserver is more user friendly.
/cc @kubernetes/sig-node
This PR 1) only changes node e2e test framework; 2) fixes a problem in node conformance test which is a 1.5 feature. @saad-ali Can we have this in 1.5?
Automatic merge from submit-queue
Add e2e node test for log path
fixes#34661
A node e2e test to check if container logs files are properly created with right content.
Since the log files under `/var/log/containers` are actually symbolic of docker containers log files, we can not use a pod to mount them in and do check (symbolic doesn't supported by docker volume).
cc @Random-Liu
Automatic merge from submit-queue
Migrates addons from RCs to Deployments
Fixes#33698.
Below addons are being migrated:
- kube-dns
- GLBC default backend
- Dashboard UI
- Kibana
For the new deployments, the version suffixes are removed from their names. Version related labels are also removed because they are confusing and not needed any more with regard to how Deployment and the new Addon Manager works.
The `replica` field in `kube-dns` Deployment manifest is removed for the incoming DNS horizontal autoscaling feature #33239.
The `replica` field in `Dashboard` Deployment manifest is also removed because the rescheduler e2e test is manually scaling it.
Some resource limit related fields in `heapster-controller.yaml` are removed, as they will be set up by the `addon resizer` containers. Detailed reasons in #34513.
Three e2e tests are modified:
- `rescheduler.go`: Changed to resize Dashboard UI Deployment instead of ReplicationController.
- `addon_update.go`: Some namespace related changes in order to make it compatible with the new Addon Manager.
- `dns_autoscaling.go`: Changed to examine kube-dns Deployment instead of ReplicationController.
Both of above two tests passed on my own cluster. The upgrade process --- from old Addons with RCs to new Addons with Deployments --- was also tested and worked as expected.
The last commit upgrades Addon Manager to v6.0. It is still a work in process and currently waiting for #35220 to be finished. (The Addon Manager image in used comes from a non-official registry but it mostly works except some corner cases.)
@piosz @gmarek could you please review the heapster part and the rescheduler test?
@mikedanese @thockin
cc @kubernetes/sig-cluster-lifecycle
---
Notes:
- Kube-dns manifest still uses *-rc.yaml for the new Deployment. The stale file names are preserved here for receiving faster review. May send out PR to re-organize kube-dns's file names after this.
- Heapster Deployment's name remains in the old fashion(with `-v1.2.0` suffix) for avoiding describe this upgrade transition explicitly. In this way we don't need to attach fake apply labels to the old Deployments.
Automatic merge from submit-queue
Enable NFS and GlusterFS tests in both node and cluster e2e tests
This PR is to enable NFS and GlusterFS tests on both node and cluster
e2e tests.
It also change the code to use ExecCommandInPod instead of kubectl since
node does not have kubectl available
This PR is to enable NFS and GlusterFS tests on both node and cluster
e2e tests for gci and containervm distro.
It also change the code to use ExecCommandInPod instead of kubectl since
node does not have kubectl available
Automatic merge from submit-queue
Extend test timeout for LB creation in large clusters
This will most probably be necessary to test 3000-node clusters.
Automatic merge from submit-queue
add e2e test for kubectl in a Pod
Add a e2e test to make sure kubectl can talk to the api server when it is mounted in a pod.
Fixes: #33138
Automatic merge from submit-queue
Make GCI nodes mount non tmpfs, ext* & bind mounts using an external mounter
This PR downloads the stage1 & gci-mounter ACIs as part of cluster bring up instead of downloading them dynamically from gcr.io, which was the cause for #36206.
I have also optimized the containerized mounter to pre-load the mounter image once to avoid fetch latency while using it.
Original PR which got reverted: https://github.com/kubernetes/kubernetes/pull/35821
```release-note
GCI nodes use an external mounter script to mount NFS & GlusterFS storage volumes
```
@mtaufen Node e2e is not re-enabled in this PR.
cc @jingxu97