Commit Graph

279 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
4b44926f90 Merge pull request #37325 from ivan4th/fix-e2e-with-complete-pods-in-kube-system-ns
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)

Fix running e2e with 'Completed' kube-system pods

As of now, the e2e runner keeps waiting for pods in the `kube-system` namespace to be "Running and Ready" if there are any pods in the `Completed` state in that namespace.
This happens, for example, after following the [Kubernetes Hosted Installation](http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/#kubernetes-hosted-installation) instructions for Calico, making it impossible to run conformance tests against the cluster. It's also possible to reproduce the problem like this:
```
$ cat testjob.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: tst
  namespace: kube-system
spec:
  template:
    metadata:
      name: tst
    spec:
      containers:
      - name: tst
        image: busybox
        command: ["echo",  "test"]
      restartPolicy: Never
$ kubectl create -f testjob.yaml
$ go run hack/e2e.go -v --test --test_args='--ginkgo.focus=existing\s+RC'
```
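A minimal sketch of the predicate change this implies (function name and the ready-check helper are assumptions, not the PR's actual code): pods in a terminal phase should satisfy the wait instead of blocking it forever.
```
// Sketch only: a wait predicate that no longer blocks on terminal pods.
func podRunningAndReadyOrCompleted(pod *v1.Pod) (bool, error) {
	switch pod.Status.Phase {
	case v1.PodSucceeded, v1.PodFailed:
		// Completed Job pods will never become Ready; count them as done.
		return true, nil
	case v1.PodRunning:
		return v1.IsPodReady(pod), nil // ready-check helper as assumed here
	}
	return false, nil
}
```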
2016-12-07 17:14:14 -08:00
Kubernetes Submit Queue
66f5d07e05 Merge pull request #38255 from bprashanth/svc_cleanup
Automatic merge from submit-queue

Delete regional static-ip instead of global for type=lb

Global vs. regional is the difference between
```
$ gcloud compute addresses delete foo --global
$ gcloud compute addresses delete foo --region us-central1
```

Type=LoadBalancer uses the second type, but we were doing the first.
Also adds some logging.
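In the GCE Go client the same split shows up as two services; a hedged sketch (project and region values are placeholders):
```
import compute "google.golang.org/api/compute/v1"

// Sketch: type=LoadBalancer IPs are regional, so deletion must go through
// the regional Addresses service, not GlobalAddresses.
func deleteLBStaticIP(svc *compute.Service, name string) error {
	_, err := svc.Addresses.Delete("my-project", "us-central1", name).Do()
	return err
	// The old code effectively did the global variant instead:
	// svc.GlobalAddresses.Delete("my-project", name).Do()
}
```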
2016-12-07 13:38:50 -08:00
bprashanth
e1daee5b37 Delete regional static-ip instead of global for type=lb 2016-12-07 11:33:04 -08:00
Marcin Wielgus
eca87cfea8 Fix skip logic in e2e framework 2016-12-07 12:07:27 +01:00
Kubernetes Submit Queue
544ccd2e35 Merge pull request #38197 from kargakis/wait-for-readyrs-before-adopting
Automatic merge from submit-queue (batch tested with PRs 38173, 38151, 38197, 38221)

test: wait for ready replica set before adopting

Reworked version of https://github.com/kubernetes/kubernetes/pull/36439 which was reverted in https://github.com/kubernetes/kubernetes/pull/38049. This PR doesn't use any of the new status API added in replica sets so it should cause no trouble with upgrade tests.

@kubernetes/deployment @smarterclayton
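A hedged sketch of the wait; the ready-count callback stands in for whatever pod-level check the test uses, since the new replica set status API is deliberately avoided here:
```
import (
	"time"

	"k8s.io/kubernetes/pkg/util/wait"
)

// waitForReadyReplicaSet polls until all desired replicas report ready.
func waitForReadyReplicaSet(readyReplicas func() (int, error), desired int, timeout time.Duration) error {
	return wait.Poll(time.Second, timeout, func() (bool, error) {
		ready, err := readyReplicas()
		if err != nil {
			return false, err
		}
		return ready == desired, nil
	})
}
```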
2016-12-06 21:14:33 -08:00
Michail Kargakis
884b0a6f20 test: wait for ready replica set before adopting 2016-12-06 17:37:37 +01:00
Kubernetes Submit Queue
c3a2cc5370 Merge pull request #38185 from kargakis/restore-poll-for-test-util
Automatic merge from submit-queue

test: restore polling for stabilizing deployment tests

Discussed in 886052c225 (commitcomment-20081416)

@rmmh @janetkuo @wojtek-t ptal
2016-12-06 06:31:01 -08:00
gmarek
070f0979c2 Make it possible to run Load test using Deployments or ReplicaSets 2016-12-06 12:22:58 +01:00
Michail Kargakis
4949d61b39 test: restore polling for stabilizing deployment tests 2016-12-06 11:57:32 +01:00
Kubernetes Submit Queue
216a03749f Merge pull request #37261 from MrHohn/dnsautoscale-wait-pods
Automatic merge from submit-queue (batch tested with PRs 37328, 38102, 37261, 31321, 38146)

Fixes flake: wait for dns pods terminating after test completed

From #37194. Based on #36600. Please only look at the second commit.

As mentioned in [comment](https://github.com/kubernetes/kubernetes/issues/37194#issuecomment-262007174), "DNS horizontal autoscaling" test does not wait for the additional pods to be terminated and this may lead to the failure of later tests.

This fix adds a wait loop at the end of the serial test to ensure the cluster recovers to its original state. The non-serial test does not wait for the additional pods to terminate because they will not affect other tests, given that those are able to run simultaneously. Besides, waiting for pods to terminate would take a certain amount of time.

Note this only fixes certain cases of #37194. I noticed there are other failures unrelated to the dns autoscaler, like [this one](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-serial/34/).

@bprashanth @Random-Liu
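The shape of that wait loop, sketched with a hypothetical pod-count callback:
```
import (
	"time"

	"k8s.io/kubernetes/pkg/util/wait"
)

// Block until the kube-dns pod count returns to its pre-test value, so
// later tests see a recovered cluster (serial suite only, per the above).
func waitForDNSReplicasRestored(countDNSPods func() (int, error), original int, timeout time.Duration) error {
	return wait.PollImmediate(2*time.Second, timeout, func() (bool, error) {
		n, err := countDNSPods()
		if err != nil {
			return false, err
		}
		return n == original, nil
	})
}
```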
2016-12-05 20:16:51 -08:00
Kubernetes Submit Queue
4fed3269dd Merge pull request #36352 from kargakis/fix-deployment-helper
Automatic merge from submit-queue (batch tested with PRs 36352, 36538, 37976, 36374)

test: update deployment helper to return better error messages

@kubernetes/deployment the problem with https://github.com/kubernetes/kubernetes/issues/36270 is that the selector key is never added in the deployment, but this change would make it clearer.
2016-12-05 11:08:42 -08:00
Dr. Stefan Schimanski
458d2b2fe4 Add verb support for discovery client 2016-12-05 12:36:05 +01:00
Clayton Coleman
3454a8d52c refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman
5df8cc39c9 refactor: generated 2016-12-03 19:10:46 -05:00
Clayton Coleman
7aacb61fe1 Revert "test: update rollover test to wait for available rs before adopting"
This reverts commit 5b7bf78f3f.
2016-12-03 14:42:50 -05:00
Kubernetes Submit Queue
8bf1ae8313 Merge pull request #37094 from sjug/reshuffle_gobindata_dep
Automatic merge from submit-queue (batch tested with PRs 37094, 37663, 37442, 37808, 37826)

Moved gobindata, refactored ReadOrDie refs

**What this PR does / why we need it**: Having gobindata inside test/e2e/framework prevents external projects from importing the framework. Moving it out and managing the references fixes this problem.

**Which issue this PR fixes**: fixes #37007
2016-12-03 04:27:46 -08:00
Kubernetes Submit Queue
0ca2b4cbf1 Merge pull request #37258 from timstclair/apparmor-test
Automatic merge from submit-queue (batch tested with PRs 37997, 37939, 37990, 36700, 37258)

Add cluster-level AppArmor E2E test

My goal is to reuse this test for an automated cluster upgrade test.
2016-12-02 19:26:52 -08:00
Kubernetes Submit Queue
d71154b910 Merge pull request #36439 from kargakis/update-rollover-test
Automatic merge from submit-queue

test: update rollover test to wait for available rs before adopting

Scenario that happened in https://github.com/kubernetes/kubernetes/issues/35355#issuecomment-257808460

-- Replica set that is about to be adopted has 2 out of 4 ready replicas
-- Deployment is created with 4 replicas, adopts pre-existing replica set, creates a new one, and starts rolling replicas over to the new replica set.
```
Nov  2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled down replica set test-rollover-controller to 3
Nov  2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 1
Nov  2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747: {replicaset-controller } SuccessfulCreate: Created pod: test-rollover-deployment-2505289747-iuiei
Nov  2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747-iuiei: {default-scheduler } Scheduled: Successfully assigned test-rollover-deployment-2505289747-iuiei to gke-jenkins-e2e-default-pool-33c0400e-6q5m
Nov  2 01:38:17.088: INFO: At 2016-11-02 01:38:05 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 2
```
At this point there is no minimum availability for the Deployment (maxUnavailable is 1, meaning the desired minimum available is 3, but we only have 2), and the new replica set uses a non-existent image. The new replica set is scaled up to 1 (maxSurge is 1), then the old replica set is scaled down by one, because cleanupUnhealthyReplicas observes that it has 2 unhealthy replicas. It can only scale down one, though, because the [maximum replicas it can cleanup is one](d87dfa2723/pkg/controller/deployment/rolling.go (L125)) (4+1-3-1; see the sketch below). The new replica set is scaled to 2. Available replicas are still 2 (the third replica from the old replica set has yet to come up).
-- Deployment is rolled over with a new update. The test reaches the WaitForDeploymentStatus check, but there are only 2 availableReplicas (maxUnavailable is still violated).

This change makes the test wait for a healthy replica set before proceeding, so it should never hit the scenario described above.
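The scale-down budget from the parenthetical above, written out (variable names are illustrative):
```
package main

import "fmt"

func main() {
	desired := 4          // deployment replicas
	maxSurge := 1         // rolling-update surge allowance
	minAvailable := 3     // desired - maxUnavailable = 4 - 1
	newRSUnavailable := 1 // new replica set pod not yet available

	// Only this many unhealthy old replicas may be cleaned up per pass.
	fmt.Println(desired + maxSurge - minAvailable - newRSUnavailable) // 1
}
```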

@kubernetes/deployment
2016-12-02 12:48:44 -08:00
Tim St. Clair
9ea7e0af26 Add cluster-level AppArmor E2E test 2016-12-02 10:37:28 -08:00
Sebastian Jug
79202656bc - Moved gobindata, refactored ReadOrDie refs
- Remaining spaghetti untangled
- Missed bazel update and a few hardcoded refs
- New instance of framework.ReadOrDie reference removed post rebase
- Resolve new clientset rebase
- Fixed e2e/generated BUILD dep
- A space
- Missed gobindata ref in golang.sh
2016-12-02 12:57:03 -05:00
Marcin Wielgus
cf92f1cdba Skip some disruption e2e test in big clusters 2016-12-01 14:26:38 +01:00
Zihong Zheng
c5e68b5451 Fixes flake: wait for dns pods terminating after test completed 2016-11-30 21:04:04 -08:00
Kubernetes Submit Queue
737edd02a4 Merge pull request #35258 from feiskyer/package-aliase
Automatic merge from submit-queue

Fix package aliases to follow golang convention

Some package aliases do not align with the golang convention https://blog.golang.org/package-names. This PR fixes them, and also adds a verify script and presubmit checks.

Fixes #35070.

cc/ @timstclair @Random-Liu
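For illustration, the kind of rename involved (the alias pair is a hedged example, not taken from the diff):
```
package example

import (
	apierrors "k8s.io/kubernetes/pkg/api/errors" // good: short, lower-case, no underscore
	// api_errors "k8s.io/kubernetes/pkg/api/errors" // bad: underscore breaks the convention
)

var _ = apierrors.IsConflict // reference the import so the example compiles
```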
2016-11-30 16:39:46 -08:00
Pengfei Ni
f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
Michael Taufen
f789ecdcaf Fix nil pointer dereference in test framework
Checking the result.Code prior to err in the if statement causes a panic
if result is nil. It turns out the formatting of the error is already in
IssueSSHCommandWithResult, so removing redundant code is enough to fix
the issue. Logging the SSH result was also redundant, so I removed that
as well.
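The bug class in miniature (types and names hypothetical):
```
package example

import "fmt"

type sshResult struct{ Code int }

func check(result *sshResult, err error) error {
	// Buggy order: `if result.Code != 0 || err != nil` evaluates
	// result.Code first and panics when result is nil, which is
	// exactly the case where err is non-nil.

	// Fixed order: short-circuit on err, so result is only
	// dereferenced once we know the command actually ran.
	if err != nil {
		return err
	}
	if result.Code != 0 {
		return fmt.Errorf("command exited with code %d", result.Code)
	}
	return nil
}
```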
2016-11-28 14:44:51 -08:00
Michail Kargakis
5b7bf78f3f test: update rollover test to wait for available rs before adopting 2016-11-28 14:00:59 +01:00
Ivan Shvedunov
29fd58ad0e Fix running e2e with 'Completed' kube-system pods 2016-11-24 19:10:43 +03:00
Clayton Coleman
35a6bfbcee generated: refactor 2016-11-23 22:30:47 -06:00
Chao Xu
a55c71db4d test/e2e 2016-11-23 15:53:09 -08:00
Kris
1614b97725 Guard the ready replica checking by server version
This disables ready replica checking for 1.3 masters, but only from 1.4
or 1.5 clients. The old logic was broken anyway due to overlapping
labels with replica sets.
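A sketch of the gate (version parsing deliberately simplified; the real change compares against the version the apiserver reports):
```
package example

import (
	"strconv"
	"strings"
)

// supportsReadyReplicas reports whether a master at the given version can
// be trusted for ready-replica checks (1.4+ per the message above).
func supportsReadyReplicas(gitVersion string) bool {
	parts := strings.SplitN(strings.TrimPrefix(gitVersion, "v"), ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, errMaj := strconv.Atoi(parts[0])
	minor, errMin := strconv.Atoi(parts[1])
	if errMaj != nil || errMin != nil {
		return false
	}
	return major > 1 || (major == 1 && minor >= 4)
}
```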
2016-11-22 11:16:15 -08:00
Jerzy Szczepkowski
69dc4b0fc5 E2E tests cleanup: moved generation of regexp for master replica to a separate function.
2016-11-22 14:52:33 +01:00
Jerzy Szczepkowski
d01998f5fa Fixed e2e tests for HA master.
Set of fixes that allows HA master e2e tests to pass for removal/addition of master replicas.
2016-11-22 12:03:28 +01:00
Kubernetes Submit Queue
2b86ed3f90 Merge pull request #37077 from soltysh/issue34585
Automatic merge from submit-queue

Retry job update after failure to prevent modification conflict

This fixes #34585 flake.

@janetkuo || @kubernetes/sig-apps  ptal

I've been getting too many emails recently wrt to that issue, so I wanted to "clean" my inbox a bit 😉
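The retry shape, sketched with hypothetical accessors around the job client:
```
package example

import (
	"fmt"

	"k8s.io/kubernetes/pkg/api/errors"
	"k8s.io/kubernetes/pkg/apis/batch"
)

// updateJobWithRetries re-fetches and re-applies the mutation whenever the
// update hits a 409 Conflict, instead of failing the test on the first try.
func updateJobWithRetries(get func() (*batch.Job, error), update func(*batch.Job) error, mutate func(*batch.Job)) error {
	for attempt := 0; attempt < 3; attempt++ {
		job, err := get()
		if err != nil {
			return err
		}
		mutate(job)
		if err = update(job); err == nil || !errors.IsConflict(err) {
			return err
		}
		// Conflict: someone else modified the job; retry with a fresh copy.
	}
	return fmt.Errorf("gave up updating job after repeated conflicts")
}
```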
2016-11-19 12:43:14 -08:00
Maciej Szulik
b253c20d80 Retry job update after failure to prevent modification conflict 2016-11-19 19:40:08 +01:00
Wojciech Tyczynski
d549ece3b2 Extend logging for unschedulable nodes 2016-11-18 14:49:32 +01:00
Kubernetes Submit Queue
08204bea62 Merge pull request #36849 from janetkuo/e2e-statefulset-update
Automatic merge from submit-queue

Add e2e test for statefulset updates

Verify that one can (manually) update statefulset template 

cc @erictune @foxish @kow3ns @kubernetes/sig-apps
2016-11-17 10:12:21 -08:00
gmarek
eecc840074 Delete taint annotation when removing last taint 2016-11-17 16:02:34 +01:00
Kris
5a87385342 Replace controller presence checking logic 2016-11-16 16:12:26 -08:00
Janet Kuo
45de9fbe34 Add e2e test for statefulset updates 2016-11-15 14:55:08 -08:00
Davanum Srinivas
d4a912208d Cleanup pod in MatchContainerOutput
MatchContainerOutput always creates a pod and does not clean it up. We need
to fix this to be better at retrying the scenarios.

When there is an error, say in the first attempt of ExpectNoErrorWithRetries
(for example, in the "Pods should contain environment variables for services" test),
the retry logic calls MatchContainerOutput another time and
podClient.create correctly fails, since the pod was not cleaned up the
first time MatchContainerOutput was called.

Fixes #35089
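A sketch of the intended shape (callbacks hypothetical): the pod the matcher creates must be destroyed on every path so a retry can re-create it.
```
package example

// matchContainerOutput creates the pod, always cleans it up, then matches.
func matchContainerOutput(create func() error, destroy func(), match func() error) error {
	if err := create(); err != nil {
		return err
	}
	defer destroy() // runs on the failure path too, unblocking retries
	return match()
}
```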
2016-11-14 20:36:13 -05:00
Kubernetes Submit Queue
1bc5b822cd Merge pull request #36479 from Random-Liu/node-e2e-node-name
Automatic merge from submit-queue

Node Conformance & E2E: Get node name from node object.

This PR changes the node e2e test framework to get node name from apiserver instead of test flags.

When a user tried out the node conformance test, they found that it will not work properly if kubelet is started with `hostname-override`.

The reason is that the node conformance test uses [the default node name - `os.Hostname`](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L124), which may differ from `hostname-override`. This causes test pods to never be scheduled, and eventually the test times out.

We could expose a flag from the node conformance test and let users set the node name themselves when they use `hostname-override` on kubelet. However, letting the framework automatically detect the node name from the apiserver is more user-friendly.

/cc @kubernetes/sig-node 
This PR 1) only changes the node e2e test framework; 2) fixes a problem in node conformance test, which is a 1.5 feature. @saad-ali Can we have this in 1.5?
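A sketch of the detection (the node lister is a hypothetical stand-in for the apiserver call; node e2e runs against a single node):
```
package example

import "fmt"

// getNodeName asks the apiserver for the node instead of trusting
// os.Hostname, which diverges when kubelet runs with hostname-override.
func getNodeName(listNodeNames func() ([]string, error)) (string, error) {
	names, err := listNodeNames()
	if err != nil {
		return "", err
	}
	if len(names) != 1 {
		return "", fmt.Errorf("expected exactly one node, got %d", len(names))
	}
	return names[0], nil
}
```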
2016-11-10 18:56:53 -08:00
Random-Liu
1c70c899f7 Get node name from node object. 2016-11-10 14:25:50 -08:00
Zihong Zheng
68f7a739c0 Modifies Rescheduler e2e test for the new dashboard addon 2016-11-09 09:17:05 -08:00
Wojciech Tyczynski
3ed6ea96c0 Increase initialization timeout for podStore 2016-11-09 16:32:58 +01:00
Kubernetes Submit Queue
a2bf827e18 Merge pull request #36154 from m1093782566/m109-fix-del-ns
Automatic merge from submit-queue

fix e2e delete namespace bug

**What:**

Fix `CreateNamespace()` bug in `test/e2e/framework/framework.go`.

**Why:**

If we successfully create a namespace but fail to create a ServiceAccount in it, we should delete the namespace after test.

The current implementation of `CreateNamespace()` in the e2e test forgets to delete the namespace when it fails to create a ServiceAccount in it (even though the namespace itself has been successfully created!).
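The fix's shape, sketched with hypothetical helpers:
```
package example

// createTestNamespace records the namespace for deletion *before* creating
// the ServiceAccount, so a failure below still leads to cleanup after the test.
func createTestNamespace(createNS func() (string, error), createSA func(string) error, trackForDeletion func(string)) (string, error) {
	ns, err := createNS()
	if err != nil {
		return "", err
	}
	trackForDeletion(ns)
	if err := createSA(ns); err != nil {
		return "", err
	}
	return ns, nil
}
```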
2016-11-08 07:18:24 -08:00
Michail Kargakis
886052c225 test: update deployment helper to return better error messages 2016-11-07 11:43:28 +01:00
Kubernetes Submit Queue
a811515d34 Merge pull request #35691 from kargakis/controller-changes-for-perma-failed
Automatic merge from submit-queue

Controller changes for perma failed deployments

This PR adds support for reporting failed deployments based on a timeout
parameter defined in the spec. If there is no progress for the amount
of time defined as progressDeadlineSeconds, the deployment will be
marked as failed via a Progressing condition with a ProgressDeadlineExceeded
reason.

Follow-up to https://github.com/kubernetes/kubernetes/pull/19343

Docs at kubernetes/kubernetes.github.io#1337

Fixes https://github.com/kubernetes/kubernetes/issues/14519

@kubernetes/deployment @smarterclayton
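For reference, opting a deployment into this behavior only needs the one spec field (the field name is per the description above; the surrounding setup is illustrative):
```
package example

import "k8s.io/kubernetes/pkg/apis/extensions"

// setProgressDeadline makes the controller mark the deployment failed
// (Progressing condition, reason ProgressDeadlineExceeded) if it sees
// no progress for the given number of seconds.
func setProgressDeadline(d *extensions.Deployment, seconds int32) {
	d.Spec.ProgressDeadlineSeconds = &seconds
}
```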
2016-11-04 14:49:43 -07:00
Kubernetes Submit Queue
3adc580278 Merge pull request #36208 from bprashanth/curl_timeout
Automatic merge from submit-queue

Stricter timeouts for nodePort curling

If the timeouts are indeed because of https://github.com/kubernetes/kubernetes/issues/34665#issuecomment-258021964, stricter timeouts will probably surface as a more isolated failure.
2016-11-04 10:29:21 -07:00
Michail Kargakis
de8214ad4d test: e2e tests for perma-failed deployments 2016-11-04 13:36:46 +01:00
Kubernetes Submit Queue
929d3f74e8 Merge pull request #34645 from kargakis/rs-conditions-controller-changes
Automatic merge from submit-queue

Replica set conditions controller changes

Follow-up to https://github.com/kubernetes/kubernetes/pull/33905, partially addresses https://github.com/kubernetes/kubernetes/issues/32863.

@smarterclayton @soltysh @bgrant0607 @mfojtik I just need to add e2e tests
2016-11-04 04:21:10 -07:00