Commit Graph

7086 Commits

Author SHA1 Message Date
Jon Cope
29d4c1c1ce Extracted GCEPD pv tests 2017-03-28 09:38:36 -05:00
Jon Cope
60c235d573 Volume Provisioning E2E: test PVC delete causes PV delete
Removed wait for PVC phase Pending.

iterate test 100 times to increase chance of regression

Moved claim obj assignment out of loop.

add wait loop check for PVs

loop until no PVs detected

refactor per git comments

replace api calls with framework wrappers

add default suffix
2017-03-28 09:23:33 -05:00
Kubernetes Submit Queue
c9356c6af6 Merge pull request #43741 from wojtek-t/ignore_image_puller_errors
Automatic merge from submit-queue

Fix problems of not-starting image pullers

In e2e.go there are the following lines:
https://github.com/kubernetes/kubernetes/blob/master/test/e2e/e2e.go#L150
```
	if err := framework.WaitForPodsSuccess(c, metav1.NamespaceSystem, framework.ImagePullerLabels, imagePrePullingTimeout); err != nil {
		// There is no guarantee that the image pulling will succeed in 3 minutes
		// and we don't even run the image puller on all platforms (including GKE).
		// We wait for it so we get an indication of failures in the logs, and to
		// maximize benefit of image pre-pulling.
		framework.Logf("WARNING: Image pulling pods failed to enter success in %v: %v", imagePrePullingTimeout, err)
	}
```

However, few lines above:
https://github.com/kubernetes/kubernetes/blob/master/test/e2e/e2e.go#L143

we were waiting for all image pullers to actually enter Success state. It's pretty clear that the latter wasn't expected.

This PR is fixing this problem.

Ref #43728

@anhowe @davidopp
2017-03-28 05:40:35 -07:00
Kubernetes Submit Queue
7d040dd128 Merge pull request #43742 from crassirostris/cluster-logging-separate-dir
Automatic merge from submit-queue

Move cluster logging tests to a separate folder

Since there are several e2e tests for cluster logging and the infrastructure for them got complicated, it makes sense to move those tests to a separate folder.

Also, adding myself and Piotr to OWNERS of this directory as owners of the tests.
2017-03-28 04:54:43 -07:00
Mik Vyatskov
ecbf98531c Move cluster logging tests to a separate folder 2017-03-28 11:18:05 +02:00
Wojciech Tyczynski
2e3cd93fc8 Fix problems of not-starting image pullers 2017-03-28 11:05:57 +02:00
Kubernetes Submit Queue
f655c06b68 Merge pull request #43731 from bowei/flake-dns
Automatic merge from submit-queue

Move DNS configmap tests to slow, serial suites

These tests take a long time due to the ConfigMap update interval
and may briefly disrupt DNS resolution in the cluster.
2017-03-28 00:54:40 -07:00
Federico Gimenez
3e595d2321 replicaset adopt and release e2e tests 2017-03-28 09:12:01 +02:00
Kubernetes Submit Queue
d1bff958c2 Merge pull request #43727 from bowei/ping-instead-of-wget
Automatic merge from submit-queue

Use ping to ip instead of wget google.com in net connectivity check

This is a flakey test and this commit reduces the number of dependent
systems involved with the flake.
2017-03-27 19:05:39 -07:00
Bowei Du
1fe0a32007 Move DNS configmap tests to slow, serial suites
These tests take a long time due to the ConfigMap update interval
and may briefly disrupt DNS resolution in the cluster.
2017-03-27 17:39:11 -07:00
Bowei Du
d3e4959104 Use ping to ip instead of wget google.com in net connectivity check
This is a flakey test and this commit reduces the number of dependent
systems involved with the flake.
2017-03-27 17:18:55 -07:00
Kubernetes Submit Queue
4159cb57b6 Merge pull request #42835 from deads2k/server-01-remove-insecure
Automatic merge from submit-queue (batch tested with PRs 42835, 42974)

remove legacy insecure port options from genericapiserver

The insecure port has been a source of problems and it will prevent proper aggregation into a cluster, so the genericapiserver has no need for it.  In addition, there's no reason for it to be in the main kube-apiserver flow either.  This pull removes it from genericapiserver and removes it from the shared kube-apiserver code.  It's still wired up in the command, but its no longer possible for someone to mess up and start using in mainline code.

@kubernetes/sig-api-machinery-misc @ncdc
2017-03-27 17:00:21 -07:00
Kubernetes Submit Queue
dfbbb115dd Merge pull request #43383 from deads2k/server-10-safe-proxy
Automatic merge from submit-queue

proxy to IP instead of name, but still use host verification

I think I found a setting that lets us proxy to an IP and still do hostname verification on the certificate.  

@liggitt @sttts  Can you see if you agree that this knob does what I think it does?  Last commit only, still needs tests.
2017-03-27 16:01:06 -07:00
Kubernetes Submit Queue
c81e99d98d Merge pull request #41728 from vmware/e2eTestsUpdate-v3
Automatic merge from submit-queue (batch tested with PRs 41728, 42231)

Adding new tests to e2e/vsphere_volume_placement.go

**What this PR does / why we need it**:
Adding new tests to e2e/vsphere_volume_placement.go

Below is the tests description and test steps.

**Test Back-to-back pod creation/deletion with different volume sources on the same worker node**

1. Create volumes - vmdk2, vmdk1 is created in the test setup.
2. Create pod Spec - pod-SpecA with volume path of vmdk1 and NodeSelector set to label assigned to node1.
3. Create pod Spec - pod-SpecB with volume path of vmdk2 and NodeSelector set to label assigned to node1.
4. Create pod-A using pod-SpecA and wait for pod to become ready.
5. Create pod-B using pod-SpecB and wait for POD to become ready.
6. Verify volumes are attached to the node.
7. Create empty file on the volume to make sure volume is accessible. (Perform this step on pod-A and pod-B)
8. Verify file created in step 5 is present on the volume. (perform this step on pod-A and pod-B)
9. Delete pod-A and pod-B
10. Repeatedly (5 times) perform step 4 to 9 and verify associated volume's content is matching.
11. Wait for vmdk1 and vmdk2 to be detached from node.
12. Delete vmdk1 and vmdk2

**Test multiple volumes from different datastore within the same pod**

1. Create volumes - vmdk2 on non default shared datastore.
2. Create pod Spec with volume path of vmdk1 (vmdk1 is created in test setup on default datastore) and vmdk2.
3. Create pod using spec created in step-2 and wait for pod to become ready.
4. Verify both volumes are attached to the node on which pod are created. Write some data to make sure volume are accessible.
5. Delete pod.
6. Wait for vmdk1 and vmdk2 to be detached from node.
7. Create pod using spec created in step-2 and wait for pod to become ready.
8. Verify both volumes are attached to the node on which PODs are created. Verify volume contents are matching with the content written in step 4.
9. Delete POD.
10. Wait for vmdk1 and vmdk2 to be detached from node.
11. Delete vmdk1 and vmdk2

**Test multiple volumes from same datastore within the same pod**

1. Create volumes - vmdk2, vmdk1 is created in testsetup
2. Create pod Spec with volume path of vmdk1 (vmdk1 is created in test setup) and vmdk2.
3. Create pod using spec created in step-2 and wait for pod to become ready.
4. Verify both volumes are attached to the node on which pod are created. Write some data to make sure volume are accessible.
5. Delete pod.
6. Wait for vmdk1 and vmdk2 to be detached from node.
7. Create pod using spec created in step-2 and wait for pod to become ready.
8. Verify both volumes are attached to the node on which PODs are created. Verify volume contents are matching with the content written in step 4.
9. Delete POD.
10. Wait for vmdk1 and vmdk2 to be detached from node.
11. Delete vmdk1 and vmdk2


**Which issue this PR fixes** 
fixes #

**Special notes for your reviewer**:
Executed tests against K8S v1.5.3 release 

**Release note**:

```release-note
NONE
```

cc: @kerneltime @abrarshivani @BaluDontu @tusharnt @pdhamdhere
2017-03-27 11:54:25 -07:00
deads2k
cd29754680 move legacy insecure options out of the main flow 2017-03-27 14:07:54 -04:00
Kubernetes Submit Queue
5c2a8c252c Merge pull request #43641 from krousey/upgrades
Automatic merge from submit-queue

Better logging for PVC flakes

Addresses some logging ambiguities in https://github.com/kubernetes/kubernetes/issues/43610, but does not fix it.
2017-03-27 10:43:42 -07:00
deads2k
3414231672 proxy to IP instead of name, but still use host verification 2017-03-27 12:33:03 -04:00
Kubernetes Submit Queue
b705835bae Merge pull request #42911 from deads2k/server-04-combined
Automatic merge from submit-queue (batch tested with PRs 43694, 41262, 42911)

combine kube-apiserver and kube-aggregator

This combines several pulls currently in progress and wires them together.  The aggregator sits in front of the normal kube-apiserver and allows local fallthrough instead of proxying.

@kubernetes/sig-api-machinery-misc 
@DirectXMan12 since you seem invested, your life will get easier
@luxas FYI since you've started trying to wire something together.  



Dependent Pulls LGTM:
- [x] https://github.com/kubernetes/kubernetes/pull/42801
- [x] https://github.com/kubernetes/kubernetes/pull/42886
- [x] https://github.com/kubernetes/kubernetes/pull/42900
- [x] https://github.com/kubernetes/kubernetes/pull/42732
- [x] https://github.com/kubernetes/kubernetes/pull/42672
- [x] https://github.com/kubernetes/kubernetes/pull/43141
- [x] https://github.com/kubernetes/kubernetes/pull/43076
- [x] https://github.com/kubernetes/kubernetes/pull/43149
- [x] https://github.com/kubernetes/kubernetes/pull/43226
- [x] https://github.com/kubernetes/kubernetes/pull/43144
2017-03-27 09:30:24 -07:00
Kubernetes Submit Queue
8ce3e4ffef Merge pull request #43694 from crassirostris/cluster-logging-load-test-fix
Automatic merge from submit-queue

Fix Stackdriver cluster logging load test

Had only one node locally, didn't catch this bug
2017-03-27 09:17:45 -07:00
Kubernetes Submit Queue
b9f975de83 Merge pull request #43621 from MaciekPytel/ca_unflake_e2e
Automatic merge from submit-queue (batch tested with PRs 42900, 43044, 42896, 43308, 43621)

Fix flaky cluster-autoscaler e2e

This should fix 2 cluster-autoscaler flaky e2e. 

**Release note**:

```release-note
```
2017-03-27 08:32:31 -07:00
deads2k
8e26fa25da wire in aggregation 2017-03-27 09:44:10 -04:00
Maciej Pytel
b285e27ffa Fix flaky cluster-autoscaler e2e 2017-03-27 14:22:06 +02:00
deads2k
087a030221 require codecfactory 2017-03-27 08:19:08 -04:00
Mik Vyatskov
3c88704d48 Fix Stackdriver cluster logging load test 2017-03-27 11:45:43 +02:00
Timothy St. Clair
368cd0dd7c Sig-scalability test segregation to allow for assignment and cleaning, and
easier refactoring
2017-03-26 11:01:09 -05:00
Kubernetes Submit Queue
8d7ba2bea2 Merge pull request #42507 from crassirostris/cluster-logging-tests-uniform-load-distribution
Automatic merge from submit-queue (batch tested with PRs 43149, 41399, 43154, 43569, 42507)

Distribute load in cluster load tests uniformly

This PR makes cluster logging load tests distribute logging uniformly, to avoid situation, where 80% of pods are allocated on one node and overall results are worse then it could be.
2017-03-26 00:55:25 -07:00
Kubernetes Submit Queue
33012c03e0 Merge pull request #43421 from bizhao/master
Automatic merge from submit-queue (batch tested with PRs 43429, 43416, 43312, 43141, 43421)

Make e2e-dns test more stable by increasing wait time

**What this PR does / why we need it**:
In many cases, 60 seconds are not enough for generating all dns results.
Since the probeCmd takes up to 600 seconds, use 600 here too.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 
fixes #43264, #43094, #43100 

**Special notes for your reviewer**:

**Release note**:

NONE
2017-03-25 22:24:28 -07:00
Jonathan MacMillan
4517a73e98 [Federation] Fix a typo in the ReplicaSet E2E test: s/extrace/extract/ 2017-03-25 21:31:32 -07:00
Kubernetes Submit Queue
0afb06b600 Merge pull request #43083 from alejandroEsc/ae/osx/e2e_node_problem_detector
Automatic merge from submit-queue (batch tested with PRs 43378, 43216, 43384, 43083, 43428)

Darwin won't build: syscall.Sysinfo issue.

**What this PR does / why we need it**: On darwin had problems building and testing because of syscall.Sysinfo_t etc which is a linux specific command. 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

**Special notes for your reviewer**: Definitely would like another set of eyes on the bootTime function, it will have to be inaccurate but open to suggestions about improving this for darwin.


**Release note**:
```
NONE
```
2017-03-25 21:22:26 -07:00
Kubernetes Submit Queue
eb57e10684 Merge pull request #43361 from timothysc/remove_mesos_e2e
Automatic merge from submit-queue (batch tested with PRs 43144, 42671, 43226, 43314, 43361)

Removal of unused mesos e2e test.

**What this PR does / why we need it**:
Remove mesos e2e test which is not used. 

```
NONE
```

/cc @sttts @k82cn
2017-03-25 19:10:28 -07:00
Kubernetes Submit Queue
d7332315b0 Merge pull request #42959 from smarterclayton/test_clean
Automatic merge from submit-queue (batch tested with PRs 42998, 42902, 42959, 43020, 42948)

Delete host exec pods rapidly
2017-03-25 17:17:25 -07:00
Kubernetes Submit Queue
a9c8d97709 Merge pull request #42801 from deads2k/agg-25-local
Automatic merge from submit-queue

add local option to APIService

APIServices need an option to avoid proxying in cases where the groupversion is handled later in the chain.  This will allow a coherent and complete set of APIServices, but won't require extra connections.

@kubernetes/sig-api-machinery-misc @ncdc @cheftako
2017-03-25 15:12:19 -07:00
Kubernetes Submit Queue
4f606b9c8d Merge pull request #42820 from MrHohn/addon-kubemark-v6.4-beta.1
Automatic merge from submit-queue (batch tested with PRs 42672, 42770, 42818, 42820, 40849)

kubemark test: Bump addon-manager to v6.4-beta.1

Follow up PR of #42760. This PR bumps addon-manager to v6.4-beta.1 for kubemark test.

**Release note**:

```release-note
NONE
```
2017-03-25 14:27:27 -07:00
Kubernetes Submit Queue
2c5d0cb2c9 Merge pull request #43649 from csbell/ingress-timeout
Automatic merge from submit-queue (batch tested with PRs 43048, 43624, 43649)

[Federation][e2e] Ingress delays and service DNS issues

Ingress has been seen to take >10 minutes to allocate an IP in some circumstances (even more so in parallel testing). Also, due to issues with Services and DNS, disable those tests so we can get a green grid (see #43646)
2017-03-25 13:29:24 -07:00
Kubernetes Submit Queue
2a61cfb555 Merge pull request #43654 from perotinus/disablerebalancetest
Automatic merge from submit-queue (batch tested with PRs 43653, 43654)

[Federation] Disable the E2E test for federated replica set rebalancing

We are able to reproduce the flaky failure locally, and can debug without running this on the CI.
2017-03-24 21:44:20 -07:00
Jonathan MacMillan
3efa3d32d2 [Federation] Create a unique label and label selector for each replica set created by the replica sets E2E test. 2017-03-24 19:28:56 -07:00
Jonathan MacMillan
3412f5b8eb [Federation] Disable the E2E test for federated replica set rebalancing 2017-03-24 19:28:37 -07:00
Kubernetes Submit Queue
ae0faca7af Merge pull request #43170 from madhusudancs/fed-system-ns
Automatic merge from submit-queue (batch tested with PRs 43642, 43170, 41813, 42170, 41581)

Add the ability to customize federation system namespace in e2e turn up scripts.

**Release note**:

```release-note
NONE
```
2017-03-24 19:04:24 -07:00
Ryan Hitchman
9675c6935e Remove an accidentally duplicated line when adding a defer in #42157. 2017-03-24 17:04:07 -07:00
Nick Sardo
baab99b823 Adding load balancer src ranges; support flag overrides 2017-03-24 16:36:19 -07:00
Christian Bell
69ec314565 [Federation][e2e] Ingress delays and service DNS issues
Ingress has been seen to take >10 minutes to allocate an IP in
some circumstances (even more so in parallel testing). Also, due
to issues with Services and DNS, disable those tests so we can
get a green grid.
2017-03-24 16:23:27 -07:00
Kubernetes Submit Queue
57fbbc6104 Merge pull request #42545 from shiywang/e2e
Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)

add e2e test for `apply set/view last-applied` command

@pwittrock  @kubernetes/sig-cli-pr-reviews
2017-03-24 15:10:29 -07:00
Madhusudan.C.S
bec6b64db3 Add the ability to customize federation system namespace in e2e turn up scripts. 2017-03-24 14:42:50 -07:00
Kris
d9a84e624c Better logging for PVC flakes 2017-03-24 14:25:25 -07:00
Kubernetes Submit Queue
a34da7644f Merge pull request #42551 from shashidharatd/e2e-tests
Automatic merge from submit-queue (batch tested with PRs 42237, 42297, 42279, 42436, 42551)

Cleanup federation_util.go in e2e/framework

The only function GetValidDNSSubdomainName in  test/e2e/framework/federation_util.go is no longer used for some time now. so cleaning it up.

cc @kubernetes/sig-federation-pr-reviews @madhusudancs
2017-03-24 14:16:28 -07:00
Kubernetes Submit Queue
694088b3ad Merge pull request #42279 from copejon/fix-pvc-poll-error-message
Automatic merge from submit-queue (batch tested with PRs 42237, 42297, 42279, 42436, 42551)

Reword PVC polling message to log a more readable message.

**What this PR does / why we need it**:
Previous message used to report an error is misleading and poorly written.  This PR changes the log to be more readable.

```release-note
NONE
```
2017-03-24 14:16:25 -07:00
Kubernetes Submit Queue
fb537762fc Merge pull request #42297 from YuPengZTE/devErrorf
Automatic merge from submit-queue (batch tested with PRs 42237, 42297, 42279, 42436, 42551)

should replace errors.New(fmt.Sprintf(...)) with fmt.Errorf(...)

Signed-off-by: yupengzte <yu.peng36@zte.com.cn>



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-03-24 14:16:23 -07:00
Kubernetes Submit Queue
5136922fb2 Merge pull request #42157 from jsafrane/fix-zone-test
Automatic merge from submit-queue

Fix test for provisioning in unmanaged zone.

defer evaluates arguments of the deferred function immediately, so it actually
deleted a storage class and a claim before the test could do anything useful.

The test passed just accidentally, as the test is expected to time out. It
timed out from wrong reasons though.

@copejon @kubernetes/sig-storage-pr-reviews 

```release-note
NONE
```
2017-03-24 13:19:28 -07:00
Kubernetes Submit Queue
e0817ef832 Merge pull request #42381 from timchenxiaoyu/createdtypo
Automatic merge from submit-queue

fix created typo
2017-03-24 10:25:48 -07:00
Kubernetes Submit Queue
ff353231ec Merge pull request #42102 from timchenxiaoyu/kubltworderror
Automatic merge from submit-queue

kubelet word mistake
2017-03-24 10:25:06 -07:00
Jan Safranek
a569b14ee3 Fix test for provisioning in unmanaged zone.
defer evaluates arguments of the deferred function immediately, so it actually
deleted a storage class and a claim before the test could do anything useful.

The test passed just accidentally, as the test is expected to time out. It
timed out from wrong reasons though.
2017-03-24 10:06:37 +01:00
krousey
638044a1ec Revert "Fix the ETCD env vars for downgrade" 2017-03-23 15:29:13 -07:00
Kubernetes Submit Queue
6ed3bce7f4 Merge pull request #43546 from calebamiles/wip-bump-cni-ref
Automatic merge from submit-queue

Bump CNI consumers to v0.5.1

**What this PR does / why we need it**:
- vendored CNI plugins properly handle `DEL` on missing resources
- update CNI version refs

**Which issue this PR fixes**

fixes #43488

**Release note**:

`bumps CNI to version v0.5.1 where plugins properly handle DEL on non existent resources`
2017-03-23 14:13:05 -07:00
Dan Gillespie
2ed83e118f Disable Horizontal Pod Autoscaler upgrade tests until they can be debugged. They were causing upgrades to timeout 2017-03-23 10:10:58 -07:00
Kubernetes Submit Queue
68a858f7da Merge pull request #43574 from crassirostris/cluster-logging-gcl-reduce-load
Automatic merge from submit-queue

Increase delays between calling Stackdriver Logging API in e2e tests

Fix https://github.com/kubernetes/kubernetes/issues/43442

This is a temporary hack, proper solution will be implemented soon
2017-03-23 09:33:50 -07:00
Mik Vyatskov
e034987a71 Increase delays between calling Stackdriver Logging API in e2e tests 2017-03-23 16:29:35 +01:00
Mik Vyatskov
bf0f070f4c Check fluentd deployment befure running cluster logging e2e tests 2017-03-23 11:17:06 +01:00
caleb miles
f4d9bbc7d8
Bump CNI consumers to latest version
- vendored CNI plugins properly handle `DEL` on missing resources
- [based on v0.5.1](https://github.com/kubernetes/kubernetes/issues/43488#issuecomment-288525151)
2017-03-22 16:03:13 -07:00
Kubernetes Submit Queue
3705358c59 Merge pull request #43533 from krousey/downgrades
Automatic merge from submit-queue

Fix the ETCD env vars for downgrade
2017-03-22 14:24:03 -07:00
Kubernetes Submit Queue
5f39ef817e Merge pull request #43521 from jszczepkowski/hpa-e2e-retrans
Automatic merge from submit-queue (batch tested with PRs 43465, 43529, 43474, 43521)

Added retransmissions in service call by e2e resource consumer library.

Added retransmissions in service call by e2e resource consumer library.
Fixes #43187.

```release-note
NONE
```
2017-03-22 12:35:13 -07:00
Kris
da74b86b99 Fix the ETCD env vars for downgrade 2017-03-22 11:25:42 -07:00
Jerzy Szczepkowski
fd6b982bfb Added retransmissions in service call by e2e resource consumer library.
Added retransmissions in service call by e2e resource consumer library.
Fixes #43187.
2017-03-22 15:34:33 +01:00
Maciej Pytel
53df30f4c6 Fix Cluster-Autoscaler e2e failing on some node configs 2017-03-22 14:13:54 +01:00
Kubernetes Submit Queue
9dae6a734a Merge pull request #42930 from KarolKraskiewicz/influxdb-clientv2
Automatic merge from submit-queue

update influxdb dependency to v1.1.1 and change client to v2

**What this PR does / why we need it**:
1. it updates version of influxdb libraries used by tests to v1.1.1 to match version used by grafana
2. it switches influxdb client to v2 to address the fact that [v1 is being depricated](https://github.com/influxdata/influxdb/tree/v1.1.1/client#description)

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
cc @piosz 
1. [vendor/BUILD](https://github.com/KarolKraskiewicz/kubernetes/blob/master/vendor/BUILD)  didn't get regenerated after executing `./hack/godep-save.sh` so I left previous version.
Not sure how to trigger regeneration of this file.
2. `tests/e2e/monitoring.go` seem to be passing without changes, even after changing version of the client. 

**Release note**:

```release-note
```
2017-03-22 02:41:43 -07:00
Kubernetes Submit Queue
5c262ab82b Merge pull request #43419 from janetkuo/ds-e2e-node-selector-updates
Automatic merge from submit-queue (batch tested with PRs 43481, 43419, 42741, 43480)

Add e2e test for DaemonSet node selector updates

@kargakis @lukaszo @kubernetes/sig-apps-bugs
2017-03-21 19:29:25 -07:00
Kris
723d301b47 Add ETCD env vars for downgrade 2017-03-21 15:13:50 -07:00
Janet Kuo
791a10f37f Add e2e test for DaemonSet node selector updates 2017-03-21 14:02:17 -07:00
Mikhail Vyatskov
6122ff88fa Distribute load in cluster load tests uniformly 2017-03-21 15:58:36 +01:00
Kubernetes Submit Queue
681ba3a6d3 Merge pull request #43309 from MaciekPytel/ca_drain_e2e
Automatic merge from submit-queue

e2e test for cluster-autoscaler draining node

**What this PR does / why we need it**:
Adds an e2e test for Cluster-Autoscaler removing a node with a pod running (by rescheduling the pod).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
@mwielgus can you take a look?

**Release note**:

```release-note
```
2017-03-21 05:48:38 -07:00
Binbin Zhao
3b98c80891 Increase wait time of result check to 600 seconds
In many cases, 60 seconds are not enough for generating all dns results.
Since the probeCmd takes up to 600 seconds, use 600 here too.
2017-03-20 20:02:19 -07:00
Kubernetes Submit Queue
e3f6f14bf0 Merge pull request #43411 from csbell/test-timeouts-cleanup
Automatic merge from submit-queue

Unify test timeouts under a common name.

Some timeouts were too aggressive and since we've slowly been moving every controller to 5 minutes, consolidate everyone under ``federatedDefaultTestTimeout``. To aid in debugging some service-related issues, if a service cannot be deleted, we issue a kubectl describe on it prior to failing.
2017-03-20 19:07:26 -07:00
Kubernetes Submit Queue
e4536e85b8 Merge pull request #43399 from derekwaynecarr/fix_node_e2e
Automatic merge from submit-queue (batch tested with PRs 42452, 43399)

Fix faulty assumptions in summary API testing

**What this PR does / why we need it**:
1. on systemd, launch kubelet in dedicated part of cgroup hierarchy
1. bump allowable memory usage for busy box containers as my own local testing often showed values > 1mb which were valid per the memory limit settings we impose
1. there is a logic flaw today in how we report node.memory.stats that needs to be fixed in follow-on.

for the last issue, we look at `/sys/fs/cgroup/memory.stat[rss]` value which if you have global accounting enabled on systemd machines (as expected) will report 0 because nothing runs local to the root cgroup.  we really want to be showing the total_rss value for non-leaf cgroups so we get the full hierarchy of usage.
2017-03-20 15:23:35 -07:00
Christian Bell
273eb6b9b5 Unify test timeouts under a common name. 2017-03-20 14:53:40 -07:00
Derek Carr
5c8b957779 Fix faulty assumptions in summary API testing 2017-03-20 14:56:11 -04:00
Maciej Pytel
7f9b3b6358 e2e test for cluster-autoscaler draining node 2017-03-20 16:46:43 +01:00
Alejandro Escobar
ee741f23af added a space
bazel update

added new files to reflect that only one method has changed between arch types.

forgot to add changes to a commit.

changes made and gfmt run.

changed node_problem_detector to node_problem_detector_linux and made it linux only.

updated bazel
2017-03-20 06:40:27 -07:00
Kubernetes Submit Queue
38055983e0 Merge pull request #43294 from crassirostris/cluster-logging-test-simplifying
Automatic merge from submit-queue

Loosen requirements of cluster logging e2e tests, make them more stable

There should be an e2e test for cloud logging in the main test suite, because this is the important part of functionality and it can be broken by different components.

However, existing cluster logging e2e tests were too strict for the current solution, which may loose some log entries, which results in flakes. There's no way to fix this problem in 1.6, so this PR makes basic cluster logging e2e tests less strict.
2017-03-20 06:04:59 -07:00
Timothy St. Clair
8ec182066b Removal of unused mesos e2e test. 2017-03-19 19:21:21 -05:00
divyenpatel
57754d6c41 updated test/e2e/pvutil.go to use c.CoreV1() 2017-03-19 11:53:40 -07:00
divyenpatel
608638cb52 fixed common funcs to add support for new tests
added new tests to vsphere_volume_placement.go

addressed jeffvance's review comments
2017-03-19 11:53:30 -07:00
Kubernetes Submit Queue
ae6a5d2bf3 Merge pull request #42827 from madhusudancs/fed-rs-e2e-update-withname
Automatic merge from submit-queue (batch tested with PRs 43355, 42827)

[Federation] Rewrite ReplicaSet CRUD and Preferences tests.

I think `should create replicasets and rebalance them` test is still flaky. I still don't know the source of this flakiness. I will continue hunting. But it is a lot less flaky than before (or perhaps it even never passed before?). This PR could be merged now and flake hunting can happen in parallel.

```release-note
NONE
```
2017-03-19 10:49:46 -07:00
Madhusudan.C.S
cf0c84bc45 [Federation] Major rewrite of replicaset e2e tests. 2017-03-18 20:05:51 -07:00
Christian Bell
fb645a11a3 Increase timeout on e2e deployment 2017-03-17 22:05:09 -07:00
saadali
82d5244cd4 Mark failing PersistentVolumes:GCEPD tests flaky 2017-03-17 17:16:07 -07:00
Kubernetes Submit Queue
7b2a86ceb3 Merge pull request #43296 from jsafrane/v1-2-tests
Automatic merge from submit-queue

Use storage.k8s.io/v1 in tests instead of v1beta1

This is trimmed version of #42477 and contains only tests of the new storage API. Together with #43285 it passes all dynamic provisioning tests on my GCE.

I did not change vsphere_utils.go and vsphere_volume_diskformat.go as @divyenpatel runs master vsphere tests with Kubernetes 1.5 - @divyenpatel, did I get it right?

@kubernetes/sig-storage-pr-reviews, @msau42, @ethernetdan
```release-note
NONE
```
2017-03-17 16:28:05 -07:00
Kubernetes Submit Queue
edbed83790 Merge pull request #43313 from janetkuo/ds-e2e-no-update
Automatic merge from submit-queue (batch tested with PRs 43313, 43257, 43271, 43307)

In DaemonSet e2e tests, use Patch instead of Update to avoid conflict

Fixes #43310

@marun @kargakis @lukaszo @kubernetes/sig-apps-bugs
2017-03-17 15:12:29 -07:00
Kubernetes Submit Queue
f37cffcf4e Merge pull request #43239 from enisoc/kubectl-controller-ref
Automatic merge from submit-queue

kubectl: Use v1.5-compatible ownership logic when listing dependents.

**What this PR does / why we need it**:

This restores compatibility between kubectl 1.6 and clusters running Kubernetes 1.5.x. It introduces transitional ownership logic in which the client considers ControllerRef when it exists, but does not require it to exist.

If we were to ignore ControllerRef altogether (pre-1.6 client behavior), we would introduce a new failure mode in v1.6 because controllers that used to get stuck due to selector overlap will now make progress. For example, that means when reaping ReplicaSets of an overlapping Deployment, we would risk deleting ReplicaSets belonging to a different Deployment that we aren't about to delete.

This transitional logic avoids such surprises in 1.6 clusters, and does no worse than kubectl 1.5 did in 1.5 clusters. To prevent this when kubectl 1.5 is used against 1.6 clusters, we can cherrypick this change.

**Which issue this PR fixes**:

Fixes #43159

**Special notes for your reviewer**:

**Release note**:
```release-note
```
2017-03-17 14:25:38 -07:00
Yu-Ju Hong
d3a5eaa4c0 e2e: bump the memory limit for kubelet
The test is mainly for monitoring and tracking resource leaks. Bump the
limit to account for variations in different settings.
2017-03-17 11:44:32 -07:00
Janet Kuo
e42bac7cf8 In DaemonSet e2e tests, use Patch instead of Update to avoid conflict 2017-03-17 11:20:53 -07:00
Kubernetes Submit Queue
66483995b9 Merge pull request #43285 from jsafrane/fix-default-storage-class-test
Automatic merge from submit-queue (batch tested with PRs 42869, 43298, 43285)

Fix default storage class tests

Name of the default storage class is not "default", it must be discovered dynamically.

```release-note
NONE
```

This fixes flake `storageclasses.storage.k8s.io "default" not found` in #43261
2017-03-17 11:12:25 -07:00
Kubernetes Submit Queue
2739d6ca30 Merge pull request #43298 from piosz/heapster-bump
Automatic merge from submit-queue (batch tested with PRs 42869, 43298, 43285)

Bumped Heapster to v1.3.0

``` release-note
Bumped Heapster to v1.3.0.
More details about the release https://github.com/kubernetes/heapster/releases/tag/v1.3.0
```
2017-03-17 11:12:23 -07:00
Kubernetes Submit Queue
44146189e8 Merge pull request #43220 from MaciekPytel/montoring_e2e_timeout
Automatic merge from submit-queue

Add retry to monitoring e2e

**What this PR does / why we need it**:
Add retry to monitoring e2e to prevent it from failing because heapster have not yet been started after cluster creation.
@piosz @jszczepkowski 

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43024

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-03-17 10:09:40 -07:00
Mik Vyatskov
c4660222e6 Loosen requirements of cluster logging e2e tests, make them more stable 2017-03-17 17:13:10 +01:00
Piotr Szczesniak
69fd7aafd0 Bumped Heapster to v1.3.0 2017-03-17 15:45:52 +01:00
Jan Safranek
e0ab1dcfe2 Use storage.k8s.io/v1 in tests instead of v1beta1 2017-03-17 14:08:39 +01:00
Maciej Pytel
5b5a914918 Retry getting RCs in monitoring e2e 2017-03-17 11:54:37 +01:00
Jan Safranek
3b1d1c40b4 Fix default storage class tests
Name of the default storage class is not "default", it must be discovered
dynamically.
2017-03-17 10:41:11 +01:00
Kubernetes Submit Queue
18582a611a Merge pull request #43254 from krousey/upgrades
Automatic merge from submit-queue (batch tested with PRs 43254, 43255, 43184, 42509)

Fix nil-map panic in AppArmor skip
2017-03-16 19:02:19 -07:00
Kris
7ccca44137 Fix nil-map panic in AppArmor skip 2017-03-16 15:57:10 -07:00
Yu-Ju Hong
056e343e03 node e2e: improve the validate OOM score test for infra containers
The test blindly checked all "pause" processes on the node, assuming
they were all infra containers. This change takes a snapshot of all
existing "pause" processes on the node, and exclude them in the
validation. The test still relies on the fact that it runs exclusively
on the node. If that assumption changes, we will need other methods to
locate the PIDs of the infra containers.
2017-03-16 15:39:03 -07:00
Kubernetes Submit Queue
5e0f0047dd Merge pull request #43242 from timstclair/summary-test
Automatic merge from submit-queue

Relax 'misc' container memory constraints

Fixes https://github.com/kubernetes/kubernetes/issues/40607

/cc @dchen1107
2017-03-16 15:25:23 -07:00
Kubernetes Submit Queue
81ac61488f Merge pull request #42773 from MrHohn/dns-autoscaler-testfix
Automatic merge from submit-queue

dns_autoscaling test: Don't expect configMap to be re-created with previous params

Fix #42766.

In dns_autoscaling e2e test, there is a test case that we delete the kube-dns-autoscaler configMap first and wait for it to be recreated with previous params. This is not necessarily true, because we will preserve the kube-dns-autoscaler configMap during upgrade but we may update the default params in kube-dns-autoscale binary, which means a configMap with new params will be created in the upgrade case.

One of the test failures could be found on: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gke-gci-1.5-gci-1.6-upgrade-cluster-new/18#k8sio-dns-horizontal-autoscaling-kube-dns-autoscaler-should-scale-kube-dns-pods-in-both-nonfaulty-and-faulty-scenarios

This PR removes the codes that make above assertion. 

@bowei @skriss

```release-note
NONE
```
2017-03-16 14:34:32 -07:00
Anthony Yeh
fa23729a6d kubectl: Use v1.5-compatible ownership logic when listing dependents.
In particular, we should not assume ControllerRefs are necessarily set.
However, we can still use ControllerRefs that do exist to avoid
interfering with controllers that do use it.
2017-03-16 12:28:38 -07:00
Tim St. Clair
827dd340d4
Relax 'misc' container memory constraints 2017-03-16 12:08:22 -07:00
Kubernetes Submit Queue
6656ffc300 Merge pull request #43165 from Random-Liu/update-npd
Automatic merge from submit-queue

Update npd to the official v0.3.0 release.

Update npd to the official release v0.3.0.

This also fixes a npd bug https://github.com/kubernetes/node-problem-detector/pull/98.

@dchen1107 @kubernetes/node-problem-detector-reviewers
2017-03-16 11:23:43 -07:00
Kubernetes Submit Queue
67a6c68d77 Merge pull request #43089 from krousey/upgrades
Automatic merge from submit-queue

Add guards for StatefulSet and AppArmor upgrade testing

This PR adds automated upgrade infrastructure to allow test suites to know what versions and node images are going to be testing and whether or not they should be skipped. It also adds a guard to prevent StatefulSets from being tested with versions prior to 1.5.0, and a guard to prevent AppArmor from running on distros other than gci and ubuntu.
2017-03-16 09:26:56 -07:00
Kubernetes Submit Queue
5139da2d95 Merge pull request #42928 from bsalamat/e2e_flake_predicates
Automatic merge from submit-queue (batch tested with PRs 43180, 42928)

Fix waitForScheduler in scheduer predicates e2e tests

**What this PR does / why we need it**: Fixes waitForScheduler in e2e to resolve flaky tests in scheduler_predicates.go

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42691

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-03-15 19:58:22 -07:00
Kubernetes Submit Queue
ae828c9c6c Merge pull request #43157 from msau42/default-sc-test
Automatic merge from submit-queue (batch tested with PRs 43162, 43157)

Use beta default class annotation for default storageclass tests.

**What this PR does / why we need it**:
The default storageclasses are still installed with the beta annotation, so the test should explicitly use the beta annotation.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43150

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-03-15 18:24:21 -07:00
Kubernetes Submit Queue
5bd730645c Merge pull request #43162 from fejta/own
Automatic merge from submit-queue

Add saad-ali and marun to test/OWNERS

/assign @saad-ali @marun 

Also ensure that approvers are in the reviewer list, and sort both lists.
2017-03-15 17:38:11 -07:00
Bobby Salamat
2775a52e7a Fix waitForScheduler in scheduer predicates e2e tests 2017-03-15 17:21:35 -07:00
Kris
3d05080982 Skip AppArmor tests on unsupported distros 2017-03-15 17:04:06 -07:00
Kris
c0ecd93801 Change the skipping mechanism to be more generic
And convert StatefulSet's version skipping to the new API.
2017-03-15 17:04:06 -07:00
Kubernetes Submit Queue
7e0b68239f Merge pull request #43115 from timstclair/summary-test
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)

Add process debug information to summary test

Print out the processes in each system cgroup when the Summary API test fails, to help debug https://github.com/kubernetes/kubernetes/issues/40607

/cc @yujuhong @Random-Liu
2017-03-15 16:08:29 -07:00
Kubernetes Submit Queue
e4796bcf6b Merge pull request #43091 from gnufied/fix-dswp-flake
Automatic merge from submit-queue (batch tested with PRs 40964, 42967, 43091, 43115)

fixes dswp flake

Sometimes a pod may not appear in desired state
of world immediately, we poll before failing.

It only adds additional 30s to tests in worst case.

Fixes #42990 

cc @jingxu97
2017-03-15 16:08:27 -07:00
Random-Liu
c4b3fd4e63 Update npd to the official v0.3.0 release. 2017-03-15 14:26:12 -07:00
Erick Fejta
cd91502df0 Add saad-ali and marun to test/OWNERS 2017-03-15 13:25:11 -07:00
Michelle Au
3fd61fbfe1 Use beta default class annotation for default storageclass tests. 2017-03-15 11:44:14 -07:00
Kubernetes Submit Queue
355f576c0b Merge pull request #43107 from intelsdi-x/guarantee-watch-before-action
Automatic merge from submit-queue

Guarantee watch before action in e2e event observer helper function.

**What this PR does / why we need it**:

Adds a missing synchronization barrier to an e2e event observation helper function.

- This change should guarantee that in observeEventAfterAction,
  the action is only executed after the informer begins watching
  the event stream.

**Release note**:

```release-note
NONE
```

cc @kubernetes/sig-scheduling-pr-reviews @bsalamat
2017-03-15 11:09:20 -07:00
Kubernetes Submit Queue
fb243a4b57 Merge pull request #43117 from crassirostris/fix-es-cluster-logging-tests
Automatic merge from submit-queue (batch tested with PRs 40404, 43134, 43117)

Fix ES cluster logging test

Fix #37324

Test was broken because fluentd-gcp now parses golang and fluentd-es doesn't
2017-03-15 08:27:28 -07:00
Hemant Kumar
1a6c36da53 Fix dswp flake - from pod not showing in dswp
Sometimes a pod may not appear in desired state
of world immediately, we poll before failing.

It only adds additional 30s to tests in worst case.
2017-03-15 10:12:25 -04:00
Kubernetes Submit Queue
0afdcfcaf6 Merge pull request #43114 from enisoc/deployment-upgrade-test
Automatic merge from submit-queue

Fix Deployment upgrade test.

**What this PR does / why we need it**:

When the upgrade test operates on Deployments in a pre-1.6 cluster (i.e. during the Setup phase), it needs to use the v1.5 deployment/util logic. In particular, the v1.5 logic does not filter children to only those with a matching ControllerRef.

**Which issue this PR fixes**:

Fixes #42738

**Special notes for your reviewer**:

**Release note**:
```release-note
```
cc @kubernetes/sig-apps-pr-reviews
2017-03-15 07:03:19 -07:00
Kubernetes Submit Queue
df97434e8a Merge pull request #43110 from caesarxuchao/fix-42952
Automatic merge from submit-queue (batch tested with PRs 43106, 43110)

Wait for garbagecollector to be synced in test

Fix #42952

Without the `cache.WaitForCacheSync` in the test, it's possible for the GC to get a merged event of RC's creation and its update (update to deletionTimestamp != 0), before GC gets the creation event of the pod, so it's possible the GC will handle the foreground deletion of the RC before it adds the Pod to the dependency graph, thus the race.

With the `cache.WaitForCacheSync` in the test, because GC runs a single thread to process graph changes, it's guaranteed the Pod will be added to the dependency graph before GC handles the foreground deletion of the RC.

Note that this pull fixes the race in the test. The race described in the first point of #26120 still exists.
2017-03-15 06:14:21 -07:00
Kubernetes Submit Queue
8993397a1e Merge pull request #42846 from msau42/pd-flake
Automatic merge from submit-queue

Retry calls to ReadFileViaContainer in PD tests

**What this PR does / why we need it**:
kubectl exec occasionally fails to return a valid output string.  It seems to be an issue with docker #34256.  This PR retries the 'kubectl exec' call to workaround the issue.  This should fix the flaky PD test issues.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #28283

**Release note**:

NONE
2017-03-14 23:55:03 -07:00
Kubernetes Submit Queue
586fd3374f Merge pull request #43090 from foxish/fix-network-partition-flake
Automatic merge from submit-queue (batch tested with PRs 42854, 43105, 43090)

Add a timeout to allow replacement pod to become ready

Hopefully fixes https://github.com/kubernetes/kubernetes/issues/37259

```
I0314 04:26:02.562] Mar 14 04:26:02.562: INFO: Pod my-hostname-net-1bgrj still exists
I0314 04:26:22.491] Mar 14 04:26:22.491: INFO: Waiting for pod my-hostname-net-1bgrj to disappear
I0314 04:26:22.496] Mar 14 04:26:22.495: INFO: Pod my-hostname-net-1bgrj no longer exists
I0314 04:26:22.496] STEP: verifying whether the pod from the unreachable node is recreated
I0314 04:26:22.498] Mar 14 04:26:22.498: INFO: Pod name my-hostname-net: Found 3 pods out of 3
I0314 04:26:22.499] STEP: ensuring each pod is running
I0314 04:26:22.499] STEP: trying to dial each unique pod
I0314 04:26:22.579] Mar 14 04:26:22.579: INFO: Controller my-hostname-net: Got expected result from replica 1 [my-hostname-net-5jrdb]: "my-hostname-net-5jrdb", 1 of 3 required successes so far
I0314 04:26:22.642] Mar 14 04:26:22.642: INFO: Controller my-hostname-net: Got expected result from replica 2 [my-hostname-net-mjf3c]: "my-hostname-net-mjf3c", 2 of 3 required successes so far
I0314 04:31:22.645] Mar 14 04:31:22.644: INFO: Controller my-hostname-net: Failed to Get from replica 3 [my-hostname-net-rf46s]: Get https://35.184.87.178/api/v1/namespaces/e2e-tests-network-partition-s5gqt/pods/my-hostname-net-rf46s/proxy/: context deadline exceeded
```

The issue appears to be that we have a race between the pod being "running + ready" and being accessible via the APIServer proxy.


cc @kow3ns @bowei @davidopp
2017-03-14 18:44:22 -07:00
Kubernetes Submit Queue
8f9cba87a9 Merge pull request #43105 from intelsdi-x/reuse-sched-event-predicates
Automatic merge from submit-queue (batch tested with PRs 42854, 43105, 43090)

Move e2e sched event predicates to new file.

**What this PR does / why we need it**:

Small e2e test refactor for scheduler. Moves scheduler event predicates out of opaque_resource.go for reuse elsewhere.

**Release note**:

```release-note
NONE
```

cc @kubernetes/sig-scheduling-pr-reviews @timothysc @bsalamat
2017-03-14 18:44:20 -07:00
Mikhail Vyatskov
720b4c5c99 Fix ES cluster logging test 2017-03-14 17:48:58 -07:00
Tim St. Clair
9a1236ae20
Add process debug information to summary test 2017-03-14 17:45:12 -07:00
Anthony Yeh
0c015927c4 Fix Deployment upgrade test.
When the upgrade test operates on Deployments in a pre-1.6 cluster
(i.e. during the Setup phase), it needs to use the v1.5 deployment/util
logic. In particular, the v1.5 logic does not filter children to only
those with a matching ControllerRef.
2017-03-14 17:39:29 -07:00
Anirudh
7698196fa5 Add a timeout to allow replacement pod to become ready 2017-03-14 17:09:39 -07:00
Chao Xu
0605ba7a6d wait for garbagecollector to be synced in test 2017-03-14 16:19:33 -07:00
Connor Doyle
05696cfea1 Add sync barrier to event obs helper.
- This change should guarantee that in observeEventAfterAction,
  the action is only executed after the informer begins watching
  the event stream.
2017-03-14 16:01:59 -07:00
Kubernetes Submit Queue
1221779b18 Merge pull request #43018 from nicksardo/ingress-upgrade-cleanup-flake
Automatic merge from submit-queue (batch tested with PRs 43018, 42713)

Log instead of fail on GLBCs tendency to leak resources

**What this PR does / why we need it**:
Stops upgrade tests from flaking because the GLBC does not cleanup all resources due to a race condition.

**Which issue this PR fixes**: fixes #38569

**Special notes for your reviewer**:
To be reviewed by @mml 

```release-note
NONE
```
2017-03-14 15:59:18 -07:00
Connor Doyle
4f847cb440 Move e2e sched event predicates to new file. 2017-03-14 15:20:27 -07:00
Madhusudan.C.S
ce81111609 [Federation] Add a new helper function to create replicasets with name for updates in e2es. 2017-03-14 14:59:21 -07:00
Kubernetes Submit Queue
442e920085 Merge pull request #43029 from janetkuo/deployment-controllerRef-test
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)

Add e2e test for Deployment controllerRef orphaning and adoption

Follow up #42908 

@enisoc @kubernetes/sig-apps-bugs @kargakis
2017-03-14 13:52:46 -07:00
Kubernetes Submit Queue
42cdb052b6 Merge pull request #42968 from timothysc/sched_e2e_breakout
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)

Initial breakout of scheduling e2es to help assist in assignment and refactoring

**What this PR does / why we need it**:
This PR segregates the scheduling specific e2es to isolate the library which will assist both in refactoring but also auto-assignment of issues.  

**Which issue this PR fixes** 
xref: https://github.com/kubernetes/kubernetes/issues/42691#issuecomment-285563265

**Special notes for your reviewer**:
All this change does is shuffle code around and quarantine.  Behavioral, and other cleanup changes, will be in follow on PRs.  As of today, the e2es are a monolith and there is massive symbol pollution, this 1st step allows us to segregate the e2es and tease apart the dependency mess. 

**Release note**:

```
NONE
```

/cc @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-testing-pr-reviews @marun @skriss 

/cc @gmarek - same trick for load + density, etc.
2017-03-14 13:52:43 -07:00
Kris
111d9a3daf Skip StatefulSet tests for versions less than 1.5.0
PetSets were alpha in 1.4.0. They were renamed to StatefulSets and
graduated to beta starting in 1.5.0.
2017-03-14 12:12:49 -07:00
Kris
161d0f3ffd Add mechanism for upgrade tests to be skipped because of versions 2017-03-14 12:12:49 -07:00
Karol Kraśkiewicz
b6edff5cb7 bazel build files generated for new inflixdb library 2017-03-14 19:55:33 +01:00
Karol Kraśkiewicz
2a270e8f18 update influxdb to v1.1.1 and change client to v2 2017-03-14 19:50:43 +01:00
Kubernetes Submit Queue
0ea3e9a2c1 Merge pull request #43066 from foxish/fix-statefulset-apps
Automatic merge from submit-queue (batch tested with PRs 43034, 43066)

Fix StatefulSet apps e2e tests

Fixes https://github.com/kubernetes/kubernetes/issues/42490

```release-note
NONE
```

cc @kubernetes/sig-apps-bugs
2017-03-14 11:44:39 -07:00
Kubernetes Submit Queue
dc2b0ee2cf Merge pull request #43034 from enisoc/statefulset-patch
Automatic merge from submit-queue (batch tested with PRs 43034, 43066)

Allow StatefulSet controller to PATCH Pods.

**What this PR does / why we need it**:

StatefulSet now needs the PATCH permission on Pods since it calls into ControllerRefManager to adopt and release. This adds the permission and the missing e2e test that should have caught this.

**Which issue this PR fixes**:

**Special notes for your reviewer**:

This is based on #42925.

**Release note**:
```release-note
```
cc @kubernetes/sig-apps-pr-reviews
2017-03-14 11:44:37 -07:00
Kubernetes Submit Queue
f53ba5581b Merge pull request #43080 from foxish/foxish-patch-2
Automatic merge from submit-queue

Add rest of workloads team to test/OWNERS

```release-note
NONE
```

cc @kubernetes/sig-apps-misc
2017-03-14 10:19:39 -07:00
Kubernetes Submit Queue
6de28fab7d Merge pull request #42942 from vishh/gpu-cont-fix
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)

[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs

Fixes #42412

**Background**
Support for multiple GPUs is an experimental feature in v1.6. 
Container restarts were handled incorrectly which resulted in stranding of GPUs
Kubelet is incorrectly using runtime cache to track running pods which can result in race conditions (as it did in other parts of kubelet). This can result in same GPU being assigned to multiple pods.

**What does this PR do**
This PR tracks assignment of GPUs to containers and returns pre-allocated GPUs instead of (incorrectly) allocating new GPUs.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.

**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.

**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever`  in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.

cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews 

Replaces #42560
2017-03-14 10:19:17 -07:00
Anthony Yeh
53a6f4402f Allow StatefulSet controller to PATCH Pods.
Also add an e2e test that should have caught this.
2017-03-14 09:27:33 -07:00
Anirudh Ramanathan
5267f05be7 Add people to test/OWNERS 2017-03-14 08:52:08 -07:00
Anirudh
bcc73dbe1a Fix StatefulSet apps flakes 2017-03-14 02:44:55 -07:00
Timothy St. Clair
6cc40678b6 Initial breakout of scheduling e2es to help assist in both assignment
and refactoring.
2017-03-13 22:34:57 -05:00
Janet Kuo
c97935533a Add e2e test for Deployment controllerRef orphaning and adoption 2017-03-13 18:43:09 -07:00
Nick Sardo
3e85c0f758 Log instead of fail on GLBCs tendency to leak resources 2017-03-13 15:31:03 -07:00
Kubernetes Submit Queue
5913c5a453 Merge pull request #42925 from janetkuo/ds-adopt-e2e
Automatic merge from submit-queue

Allow DaemonSet controller to PATCH pods, and add more steps and logs in DaemonSet pods adoption e2e test

DaemonSet pods adoption failed because DS controller aren't allowed to patch pods when claiming pods. 

[Edit] This PR fixes #42908 by modifying RBAC to allow DaemonSet controllers to patch pods, as well as adding more logs and steps to the original e2e test to make debugging easier. 

Tested locally with a local cluster and GCE cluster. 
@kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
2017-03-13 14:06:03 -07:00
Kubernetes Submit Queue
19574a10f2 Merge pull request #42906 from intelsdi-x/reuse-observer-helpers
Automatic merge from submit-queue (batch tested with PRs 42940, 42906, 42970, 42848)

Move node and event observer helpers to e2e/common

**What this PR does / why we need it**:

Moves existing test helper functions in OIR e2e tests to `test/e2e/common`. These functions wrap informers to help test writers to observe events instead of long-polling for status updates.

For usage examples, see `test/e2e/opaque_resource.go`.

cc @kubernetes/sig-scheduling-misc

**Release note**:
```release-note
NONE
```
2017-03-13 13:22:12 -07:00
Kubernetes Submit Queue
d60d965f33 Merge pull request #42940 from caesarxuchao/fix-gc-orphan-rs
Automatic merge from submit-queue (batch tested with PRs 42940, 42906, 42970, 42848)

Increase timeout for the orphan e2e test

Fix #42086.

Analysis of test logs are in https://github.com/kubernetes/kubernetes/issues/42086#issuecomment-285770868 and the following comments.

@deads2k PTAL, thanks!
2017-03-13 13:22:10 -07:00
Janet Kuo
287b962860 Add more steps and logs in DaemonSet pods adoption e2e test 2017-03-13 11:37:17 -07:00
Vishnu Kannan
8ed9bff073 handle container restarts for GPUs
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Kubernetes Submit Queue
ab9b299c30 Merge pull request #42915 from kubernetes/fabianofranz-test-approver
Automatic merge from submit-queue

Add fabianofranz as approver for test/e2e/kubectl.go

Adding myself as approver for `kubectl` end-to-end tests.

```release-note
NONE
```
2017-03-13 07:39:29 -07:00
deads2k
b27de102cb add local option to APIService 2017-03-13 10:10:50 -04:00
Connor Doyle
ba9410621f Move node and event observer helpers to e2e/common 2017-03-12 19:35:26 -07:00
Clayton Coleman
e3ca9570a5
Delete host exec pods rapidly
For some tests this makes namespace deletion faster (we poll for
resources at 1/2 their innate termination grace period), thus speeding
up the test.
2017-03-12 00:43:23 -05:00
Michail Kargakis
aec9f3151d Add/remove kargakis from a couple of places 2017-03-11 16:05:57 +01:00
Kubernetes Submit Queue
81ba4741f3 Merge pull request #42901 from fabianofranz/issues_42697
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)

Fixes kubectl skew test failure when using kubectl.sh

Fixes leftovers from https://github.com/kubernetes/kubernetes/pull/42737.

**Release note**:

```release-note
NONE
```
2017-03-10 22:02:20 -08:00
Kubernetes Submit Queue
8cb14a4f7f Merge pull request #42755 from aveshagarwal/master-fix-default-toleration-seconds
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)

Fix DefaultTolerationSeconds admission plugin

DefaultTolerationSeconds is not working as expected. It is supposed to add default tolerations (for unreachable and notready conditions). but no pod was getting these toleration. And api server was throwing this error:

```
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.769212   32070 admission.go:71] expected pod but got Pod
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.789055   32070 admission.go:71] expected pod but got Pod
Mar 08 13:44:02 fedora25 hyperkube[32070]: E0308 13:44:02.006784   32070 admission.go:71] expected pod but got Pod
Mar 08 13:45:39 fedora25 hyperkube[32070]: E0308 13:45:39.754669   32070 admission.go:71] expected pod but got Pod
Mar 08 14:48:16 fedora25 hyperkube[32070]: E0308 14:48:16.673181   32070 admission.go:71] expected pod but got Pod
```

The reason for this error is that the input to admission plugins is internal api objects not versioned objects so expecting versioned object is incorrect. Due to this, no pod got desired tolerations and it always showed:

```
Tolerations: <none>
```

After this fix, the correct  tolerations are being assigned to pods as follows:

```
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
```

@davidopp @kevin-wangzefeng @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scheduling-bugs @derekwaynecarr 

Fixes https://github.com/kubernetes/kubernetes/issues/42716
2017-03-10 22:02:18 -08:00
Kubernetes Submit Queue
ca09352dd9 Merge pull request #42349 from timstclair/aa-upgrade
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)

AppArmor cluster upgrade test

Add a cluster upgrade test for AppArmor. I still need to test this (having some trouble with the cluster-upgrade tests), but wanted to start the review process.

/cc @dchen1107 @roberthbailey
2017-03-10 22:02:16 -08:00
Kubernetes Submit Queue
328e555f72 Merge pull request #41794 from shashidharatd/federation-upgrade-tests-1
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)

[Federation][e2e] Add framework for upgrade test in federation

Adding framework for federation upgrade tests. please refer to #41791

cc @madhusudancs @nikhiljindal @kubernetes/sig-federation-pr-reviews
2017-03-10 22:02:15 -08:00
Chao Xu
a3f4053cb3 increase timeout for orphan e2e test 2017-03-10 18:13:48 -08:00
Christian Bell
9a37fe6dff [Federation] Deployments unaware of ReadyReplicas
The Deployment controller was not propagating ReadyReplicas to underlying clusters causing these errors:
```
Error syncing cluster controller: Deployment.apps "federation-deployment" is invalid: status.availableReplicas: Invalid value: 5: cannot be greater than readyReplicas
```

This was caught in e2e testing and is a 1.6 regression for support that was added in #37959. Without this fix, users will be unable to scale up their deployments.
2017-03-10 15:00:02 -08:00
Fabiano Franz
224ee822d4 Add fabianofranz as approver for test/e2e/kubectl.go 2017-03-10 18:13:43 -03:00
Kubernetes Submit Queue
e2218290cf Merge pull request #42444 from jingxu97/Mar/deleteVolume
Automatic merge from submit-queue (batch tested with PRs 42608, 42444)

Return nil when deleting non-exist GCE PD

When gce cloud tries to delete a disk, if the disk could not be found
from the zones, the function should return nil error. This modified behavior is also consistent with AWS
2017-03-10 12:50:24 -08:00
shashidharatd
a14f8dc346 auto generated bazel build files 2017-03-11 01:39:56 +05:30
shashidharatd
662f0ef531 Add framework for federation upgrade tests 2017-03-11 01:39:56 +05:30
shashidharatd
4443a1b40d Move few reusable functions to upgrade_utils.go 2017-03-11 01:39:56 +05:30
Fabiano Franz
adea540a5b Fixes kubectl skew test failure when using kubectl.sh 2017-03-10 15:25:38 -03:00
Kubernetes Submit Queue
f71492a9ac Merge pull request #42719 from gmarek/taint-test
Automatic merge from submit-queue (batch tested with PRs 36704, 42719)

Extend timeouts in taints test to account for slow Pod deletions

Fix #42685

Before merging this we need a consensus on what to do with slow Pod deletions.
2017-03-10 09:06:23 -08:00
Kubernetes Submit Queue
4ff0af821a Merge pull request #42879 from jsafrane/test-pod-logs
Automatic merge from submit-queue

e2e test: Log container output on TestContainerOutput error

When a pod started with TestContainerOutput or TestContainerOutputRegexp
fails from unknown reason, we should log all output of all its containers
so we can analyze what went wrong.

This would help us to see what wrong in https://github.com/kubernetes/kubernetes/issues/40811 - a container is running there for 3 minutes and dies and we want to see what it did for these 3 minutes.

```release-note
NONE
```
2017-03-10 06:13:44 -08:00
gmarek
4e5b4e7ee0 Extend timeouts in taints test to account for slow Pod deletions 2017-03-10 14:23:47 +01:00
Jan Safranek
bc06c636d1 e2e test: Log container output on TestContainerOutput error
When a pod started with TestContainerOutput or TestContainerOutputRegexp
fails from unknown reason, we should log all output of all its containers
so we can analyze what went wrong.
2017-03-10 10:08:57 +01:00
Avesh Agarwal
9f533de80d Fix DefaultTolerationSeconds admission plugin. It was using
versioned object whereas admission plugins operate on internal objects.
2017-03-09 20:24:43 -05:00
Random-Liu
f81460e35d Change the junit file name format to junit_image-name_id.xml,
and make the gci image name shorter.
2017-03-09 16:47:48 -08:00
Kubernetes Submit Queue
4540674b04 Merge pull request #42758 from krousey/downgrades
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Implement automated downgrade testing.

Node version cannot be higher than the master version, so we must
switch the node version first. Also, we must use the upgrade script
from the appropriate version for GCE.
2017-03-09 15:06:56 -08:00
Kubernetes Submit Queue
7c08e817a5 Merge pull request #42734 from dashpole/deletion_timeout
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Create DefaultPodDeletionTimeout for e2e tests

In our e2e and e2e_node tests, we had a number of different timeouts for deletion.
Recent changes to the way deletion works (#41644, #41456) have resulted in some timeouts in e2e tests.  #42661 was the most recent fix for this.
Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished.
For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.

cc @vishh @Random-Liu
2017-03-09 15:06:53 -08:00
Michelle Au
e5189a6dd3 Retry calls to ReadFileViaContainer in PD tests 2017-03-09 14:47:05 -08:00
Kubernetes Submit Queue
a22fac00dd Merge pull request #42833 from caesarxuchao/pod-deletion
Automatic merge from submit-queue

Don't wait for the final deletion of pod

The final deletion of the pod depends on kubelet and other components operating correctly. The purpose of this e2e test is verifying the clientset can handle deleteOptions correctly, so waiting for the deletionTimestamp and deletionGraceperiod get set is good enough.

In the long run, we should move this set of e2e tests to integration tests.

Fix #42724 #42646

cc @marun
2017-03-09 13:21:53 -08:00
Kris
cc84e0895a Implement automated downgrade testing.
Node version cannot be higher than the master version, so we must
switch the node version first. Also, we must use the upgrade script
from the appropriate version for GCE.
2017-03-09 12:45:20 -08:00
Chao Xu
130437b94e wait for the deletionTimestamp set instead of waiting for the final deletion 2017-03-09 11:35:51 -08:00
Zihong Zheng
34b8d008ec kubemark test: Bump addon-manager to v6.4-beta.1 2017-03-09 10:13:07 -08:00
Kubernetes Submit Queue
7b4bec038c Merge pull request #42805 from deads2k/client-01-flake-debug
Automatic merge from submit-queue

add debugging to the client watch test

Adds debugging information for https://github.com/kubernetes/kubernetes/issues/42724.  I suspect that the watch is closing early, but I'd like proof before I consider things like retrying the list and doing another watch to observe the delete.  I'm not even sure that would satisfy the test

It seems like a flaky way to build the test.  Why wouldn't we delete non-gracefully?

@kubernetes/sig-api-machinery-misc @caesarxuchao 
@wojtek-t saw you just hit this if you wanted to take a quick look at the debugging I added.
2017-03-09 08:20:45 -08:00
deads2k
ceb3e27fff add debugging to the client watch test 2017-03-09 09:27:41 -05:00
Kubernetes Submit Queue
cf732613e3 Merge pull request #42278 from marun/fed-api-fixture
Automatic merge from submit-queue (batch tested with PRs 42728, 42278)

[Federation] Create integration test fixture for api

This PR factors a reusable fixture for the federation api server out of the existing integration test.

Targets #40705

cc: @kubernetes/sig-federation-pr-reviews
2017-03-09 05:45:32 -08:00
Kubernetes Submit Queue
eefa2ef1bb Merge pull request #42425 from apprenda/kubeadm_189_docker_version
Automatic merge from submit-queue (batch tested with PRs 42762, 42739, 42425, 42778)

kubeadm: update docker version for CE and EE

**What this PR does / why we need it**: Update regex for docker version to also capture new CE and EE versions. 

**Which issue this PR fixes**: fixes #https://github.com/kubernetes/kubeadm/issues/189

**Special notes for your reviewer**: /cc @jbeda @luxas

**Release note**:
```release-note
NONE
```
2017-03-09 02:51:40 -08:00
Kubernetes Submit Queue
1bfb8e89b4 Merge pull request #42762 from csbell/e2e-dns-name
Automatic merge from submit-queue (batch tested with PRs 42762, 42739, 42425, 42778)

[Federation][e2e] Use correct default dns name in e2e-testing

After some kubefed changes, the environment variable did not get propagated and we defaulted back to 'federation' instead of 'e2e-federation'. This fixes ongoing service test issues in e2e.
2017-03-09 02:51:36 -08:00
Kubernetes Submit Queue
2828db8f89 Merge pull request #42357 from msau42/disable_storage_class
Automatic merge from submit-queue

Add default storageclass tests

**What this PR does / why we need it**:
Adds test cases for using and disabling the default storageclass.

**Release note**:

NONE
2017-03-09 01:39:42 -08:00
Kubernetes Submit Queue
24d84fa3fb Merge pull request #42754 from madhusudancs/fed-e2e-nodeport-random
Automatic merge from submit-queue (batch tested with PRs 42211, 38691, 42737, 42757, 42754)

[Federation] Generate a random nodePort for each service object in e2e tests.

We now run e2e tests in parallel in the CI environment and nodeports are a single available range of numbers for all the service objects, so they have to be unique for each service object.

```release-note
NONE
```
2017-03-08 18:52:32 -08:00
Kubernetes Submit Queue
6b36b3aa20 Merge pull request #42737 from fabianofranz/issues_42697
Automatic merge from submit-queue (batch tested with PRs 42211, 38691, 42737, 42757, 42754)

Fix failing kubectl skew tests

Fixes https://github.com/kubernetes/kubernetes/issues/42697

Skew kubectl tests [are broken](https://k8s-testgrid.appspot.com/release-1.6-upgrade-skew#gce-1.6-master-cvm-kubectl-skew&width=80) in "Simple pod should handle in-cluster config" for trying to copy the `kubectl.sh` script instead of the actual `kubectl` binary.


**Release note**:

```release-note
NONE
```
2017-03-08 18:52:28 -08:00
Kubernetes Submit Queue
3f5c13f305 Merge pull request #42211 from janetkuo/ds-e2e-template-generation
Automatic merge from submit-queue (batch tested with PRs 42211, 38691, 42737, 42757, 42754)

Add more e2e tests for DaemonSet templateGeneration and pod adoption

Depends on #42173

@erictune @kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
2017-03-08 18:52:21 -08:00
Zihong Zheng
a98992060c dns_autoscaling e2e: Dont expect configMap to be re-created with previous params 2017-03-08 18:15:06 -08:00
timchenxiaoyu
767719ea9c fix across typo 2017-03-09 09:07:21 +08:00
Christian Bell
3fc13f0be6 [Federation][e2e] Use correct default dns name in e2e-testing
After some kubefed changes, the environment variable did not get propagated
and we defaulted back to 'federation' instead of
'e2e-federation'. This fixes ongoing service test issues in e2e.
2017-03-08 15:27:47 -08:00
David Ashpole
3806d386df use default timeout for deletion 2017-03-08 14:40:19 -08:00
Derek McQuay
35f07095d8
kubeadm: validators pass warnings and errors
This change allows validators to pass warnings as well as errors. This
was needed because of how support for docker 1.13+ and the new EE and CE
versions is currently being handled.
2017-03-08 14:35:26 -08:00
Madhusudan.C.S
a6a541e1eb [Federation] Generate a random nodePort for each service object in e2e tests.
We now run e2e tests in parallel in the CI environment and nodeports are
a single available range of numbers for all the service objects, so they
have to be unique for each service object.
2017-03-08 14:25:00 -08:00