Commit Graph

1199 Commits

Author SHA1 Message Date
tanshanshan
65b59474dc fix-todo 2017-09-25 15:42:21 +08:00
Serguei Bezverkhi
6201727935 Add support for skeleton in GetSigner
Adding support for skeleton to GetSigner to be able to run
e2e tests against a bare metal multinode cluster.
2017-09-24 20:26:28 -04:00
Kubernetes Submit Queue
8c29b6540b Merge pull request #52751 from MrHohn/e2e-service-cleanup-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix GCE LB resource cleanup for service e2e tests.

**What this PR does / why we need it**: Fix GCE LB resource cleanup logic.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #52347

**Special notes for your reviewer**:
/assign @shyamjvs @nicksardo 

**Release note**:

```release-note
NONE
```
2017-09-24 05:21:16 -07:00
Kubernetes Submit Queue
2e7efd3af3 Merge pull request #52485 from flix-tech/sig-test-45947-remove-flag
Automatic merge from submit-queue (batch tested with PRs 52485, 52443, 52597, 52450, 51971). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Removing PrometheusPushGateway --prom-push-gateway flag from e2e tests.

**What this PR does / why we need it**: Removing obsolete PrometheusPushGateway --prom-push-gateway flag from e2e tests.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45947

**Special notes for your reviewer**:

**Release note**:

```release-note
Removing `--prom-push-gateway` flag from e2e tests
```
2017-09-23 18:48:50 -07:00
Kubernetes Submit Queue
7e7bcabe17 Merge pull request #52355 from davidz627/e2e_nil
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

E2E test to make sure controller does not crash because of nil volume spec

Fixes #49521

Tests fix of issue referenced in #49418
2017-09-23 15:25:07 -07:00
Kubernetes Submit Queue
044e79c714 Merge pull request #52134 from yujuhong/minor-test-fixes
Automatic merge from submit-queue (batch tested with PRs 50392, 52108, 52083, 52134, 51526). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

e2e: minor changes to network/service testing utils

Add more logging to help debug. Also refactor several functions to improve
reusability.
2017-09-23 07:14:05 -07:00
Hemant Kumar
381e334d87 Fix volume metric flake
Make sure we only run this test in environments
that support it.
2017-09-22 16:30:11 -04:00
Jiaying Zhang
ba40bee5c1 Modified test/e2e_node/gpu-device-plugin.go to make sure it passes. 2017-09-22 20:21:26 +02:00
Aleksandra Malinowska
88da2c1c70 refactor parsing cluster autoscaler status 2017-09-22 12:26:50 +02:00
Renaud Gaubert
6993612cec Added device plugin e2e kubelet failure test
Signed-off-by: Renaud Gaubert <renaud.gaubert@gmail.com>
2017-09-22 01:24:01 +02:00
Kubernetes Submit Queue
542486186f Merge pull request #52732 from shyamjvs/fix-metrics-perf-tests
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Increase api latency threshold for cluster-scoped list calls

A recent change from @smarterclayton (https://github.com/kubernetes/kubernetes/pull/52237) added scope to apiserver metrics. As a result, our current threshold for list calls is no longer sufficient for all-namespace calls, which are now being measured separately from namespaced lists. For example (from our [last 5k run](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/37)):

```
WARNING Top latency metric: {Resource:pods Subresource: Verb:LIST Scope:cluster Latency:{Perc50:4.498374s Perc90:7.548079s Perc99:8.169389s Perc100:0s} Count:1400}
```

cc @kubernetes/sig-scalability-misc @kubernetes/sig-api-machinery-misc @wojtek-t
2017-09-21 10:49:54 -07:00
Shyam Jeedigunta
f373645865 Increase api latency threshold for cluster-scoped list calls 2017-09-21 13:33:22 +02:00
Kubernetes Submit Queue
654c522e4c Merge pull request #52477 from jamiehannaford/kubernetes-anywhere
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Support kubernetes-anywhere provider

**What this PR does / why we need it**:

Implements a new `kubernetes-anywhere` provider to allow upgrade testing in the e2e binary. This is the final step to allow https://github.com/kubernetes/test-infra/pull/4495 and https://github.com/kubernetes/kubernetes-anywhere/pull/450.

**Which issue this PR fixes**:

https://github.com/kubernetes/kubeadm/issues/311

**Special notes for your reviewer**:

Some questions I had

- Does the `--provider` flag specified [here](dbbf6261e0/jobs/config.json (L8587)) get sent to the flag defined [here](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/test_context.go#L219)? Or should I add another `--provider` flag inside `--upgrade_args` like this: `--upgrade_args=... --provider=kubernetes-anywhere`?
- Is it necessary to add waiting logic after the `make` command, or will it implicitly handle that by itself?

Some other points:

- I chose `sed` to manipulate the current kubernetes-anywhere `.config` rather than duplicating another [`anywhere.go`](https://github.com/kubernetes/test-infra/blob/master/kubetest/anywhere.go). One suggestion was to use `jq` but since the config on disk is not serialized to JSON yet, I'm not sure how that'd work.
- Since I don't have a GCE/GKE account or vCenter, I can't actually verify the e2e binary works. I've managed to build it, but if somebody could quickly run a smoke test, I'd appreciate it. This is my first poke around test-infra and e2e, so there might be some plumbing missing

/cc @jessicaochen @luxas @pipejakob @roberthbailey
2017-09-20 15:20:47 -07:00
Zihong Zheng
5532e24280 Fix GCE LB resource cleanup for service e2e tests. 2017-09-19 15:42:41 -07:00
Jamie Hannaford
69f5feb295 Support kubernetes-anywhere provider 2017-09-15 11:13:08 +02:00
Kubernetes Submit Queue
93ddb7be5f Merge pull request #52237 from smarterclayton/watch_metric
Automatic merge from submit-queue (batch tested with PRs 51824, 50476, 52451, 52009, 52237)

Improve apiserver metrics reporting

Normalize "WATCHLIST" to "WATCH", add "scope" to the other metrics (listing 50k pods is != listing pods in a namespace), and add a new scope "resource" to cover individual resource calls.

This roughly aligns metrics with our ACL model (technically resource scope is GET, but POST to a subresource and POST to a namespace are not the same thing).

```release-note
WATCHLIST calls are now reported as WATCH verbs in prometheus for the apiserver_request_* series.  A new "scope" label is added to all apiserver_request_* values that is either 'cluster', 'resource', or 'namespace' depending on which level the query is performed at.
```
2017-09-15 01:08:11 -07:00
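
The labelling described in the entry above (93ddb7be5f) can be illustrated with a minimal sketch. It only mirrors the wording of the commit message: `WATCHLIST` is reported as `WATCH`, and each request gets a scope of `cluster`, `namespace`, or `resource`. The function names and the precedence rule (a named object counts as `resource` before the namespace is considered) are assumptions made for illustration, not the apiserver's actual code.

```python
def normalize_verb(verb):
    """Report WATCHLIST as WATCH; other verbs are only upper-cased."""
    upper = verb.upper()
    return "WATCH" if upper == "WATCHLIST" else upper


def request_scope(namespace="", name=""):
    """Pick a scope label for a request (hypothetical precedence rule)."""
    if name:
        # A call against one named object.
        return "resource"
    if namespace:
        # A call limited to a single namespace.
        return "namespace"
    # Anything else spans the whole cluster, e.g. listing all pods.
    return "cluster"
```

Under these assumptions, listing pods across all namespaces is labelled `cluster`, while listing pods in one namespace is labelled `namespace`, so the two can be measured separately.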
Kubernetes Submit Queue
5d995e3f7b Merge pull request #52372 from caesarxuchao/remove-config-copy
Automatic merge from submit-queue (batch tested with PRs 52376, 52439, 52382, 52358, 52372)

Remove the conversion of client config

It was needed because the clientset code in client-go was a copy of the clientset code in Kubernetes. client-go is authoritative now, so we can remove the nasty copy.
2017-09-14 15:27:17 -07:00
Niels-Ole Kühl
56247c4e83 Removing PrometheusPushGateway --prom-push-gateway flag from e2e tests. 2017-09-14 14:13:31 +02:00
David Zhu
7e10741f94 E2E test to make sure controller does not crash because of nil volume spec. 2017-09-13 17:01:24 -07:00
Aleksandra Malinowska
c173296632 log gcloud command error 2017-09-13 11:56:55 +02:00
Chao Xu
6c5a8d5db9 Remove the conversion of client config, because client-go is authoritative now 2017-09-12 16:02:17 -07:00
Anthony Yeh
bff5f7e6b0 StatefulSet: Deflake e2e RunHostCmd more.
It turns out that at some points while the Node is recovering from a
reboot, we get a different kind of error ("unable to upgrade
connection"). Since we can't distinguish these transient errors from an
error encountered after successfully executing the remote command,
let's just retry all errors for 5min. If this doesn't work, I'm gonna
blame it on sig-node.
2017-09-12 10:12:46 -07:00
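
The approach described in the entry above (bff5f7e6b0), retrying every error until a fixed timeout, can be sketched roughly as follows. This is an illustration in Python rather than the e2e framework's Go code; `retry_until`, its parameters, and the one-second interval are hypothetical names and values chosen for the sketch. Only the 5-minute timeout comes from the commit message.

```python
import time


def retry_until(fn, timeout_s=300.0, interval_s=1.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call fn until it succeeds, retrying on any exception.

    Once another wait would pass the deadline, the last error is re-raised.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return fn()
        except Exception:
            # Transient and permanent errors are not distinguished here,
            # which matches the "retry all errors" choice in the commit.
            if clock() + interval_s > deadline:
                raise
            sleep(interval_s)
```

The trade-off is the one the commit message names: a command that fails for a real reason also gets retried for the full timeout before the failure surfaces.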
Clayton Coleman
30a92a8f0a Report scope in e2e test metrics 2017-09-11 22:13:55 -04:00
Kubernetes Submit Queue
ad0d36f0f0 Merge pull request #52111 from MrHohn/kube-proxy-upgrade-image
Automatic merge from submit-queue

Pipe in upgrade image target for kube-proxy migration tests

**What this PR does / why we need it**:
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-upgrade-kube-proxy-ds&width=20
and
https://k8s-testgrid.appspot.com/sig-network#gci-gce-latest-downgrade-kube-proxy-ds&width=20
are still failing.

Reproduced it locally and found that the node image defaults to debian during upgrade (it was gci before the upgrade) because we don't pass in `gci` via `--upgrade-target`. And for some reason (haven't figured it out yet), the upgraded node uses the debian image with gci startup scripts...

This PR pipes in `--upgrade-target` for kube-proxy migration tests, hopefully in conjunction with https://github.com/kubernetes/test-infra/pull/4447 it will bring the tests back to normal.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #NONE 

**Special notes for your reviewer**:
Sorry for bothering again.
/assign @krousey 

**Release note**:

```release-note
NONE
```
2017-09-07 20:46:04 -07:00
Yu-Ju Hong
0b38495c42 e2e: minor changes to network/service testing utils
Add more logging to help debug. Also refactor several functions to improve
reusability.
2017-09-07 18:43:47 -07:00
Kubernetes Submit Queue
f4f21b3f06 Merge pull request #52054 from janetkuo/pause-dep-integra
Automatic merge from submit-queue (batch tested with PRs 52097, 52054)

Move paused deployment e2e tests to integration

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: xref #52113

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-09-07 15:28:25 -07:00
Zihong Zheng
0cb6471f35 Pipe in upgrade image target to kube-proxy migration tests 2017-09-07 13:39:27 -07:00
Kubernetes Submit Queue
507af4b9c2 Merge pull request #52057 from enisoc/sts-deflake
Automatic merge from submit-queue

StatefulSet: Deflake e2e RunHostCmd.

The initial retry up to 20s was giving up too soon. I'm seeing this test flake because the Node rebooted and it takes ~2min to recover. Now StatefulSet RunHostCmd calls will use the same 5min timeout as with other Pod state checks.

ref #48031
2017-09-07 11:42:32 -07:00
Janet Kuo
124344a1a4 Move paused deployment e2e tests to integration 2017-09-06 18:12:28 -07:00
Anthony Yeh
b4f639f57a StatefulSet: Deflake e2e RunHostCmd.
The initial retry up to 20s was giving up too soon.
I'm seeing this test flake because the Node rebooted and it takes ~2min
to recover.
Now StatefulSet RunHostCmd calls will use the same 5min timeout as with
other Pod state checks.
2017-09-06 17:51:11 -07:00
Yu-Ju Hong
bb50086b8f e2e: network tiers should retry on 404 errors
The feature is still Alpha and at times, the IP address previously used
by the load balancer in the test will not be completely freed even after
the load balancer is long gone. In this case, the test URL with the IP
would return a 404 response. Tolerate this error and retry until the new
load balancer is fully established.
2017-09-06 13:16:28 -07:00
Jordan Liggitt
f61ac93a0d Fix dynamic discovery error in e2e 2017-09-05 23:01:54 -04:00
Kubernetes Submit Queue
1732a8b9bd Merge pull request #51562 from nicksardo/gce-attempt-firewall
Automatic merge from submit-queue (batch tested with PRs 51915, 51294, 51562, 51911)

GCE: Gracefully handle permission errors when attempting to create firewall rules

Purpose of this PR is to raise events from the GCE cloud provider if the GCE service account does not have the permissions necessary to create/update/delete firewall rules. 

Fixes #51812

**Release note**:
```release-note
NONE
```

Example Events:

```
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason                          Message
  ---------     --------        -----   ----                    -------------   --------        ------                          -------
  2m            2m              1       service-controller                      Normal          EnsuringLoadBalancer            Ensuring load balancer
  2m            2m              1       gce-cloudprovider                       Normal          LoadBalancerManualChange        Firewall change required by network admin: `gcloud compute firewall-rules create aa8a1dd628ddb11e78ce042010a80000 --network https://www.googleapis.com/compute/v1/projects/playground/global/networks/e2e-test-nicksardo --description "{\"kubernetes.io/service-name\":\"default/myechosvc1\", \"kubernetes.io/service-ip\":\"\"}" --allow tcp:9000 --source-ranges 0.0.0.0/0 --target-tags e2e-test-nicksardo-minion --project playground`
  2m            2m              1       gce-cloudprovider                       Normal          LoadBalancerManualChange        Firewall change required by network admin: `gcloud compute firewall-rules create k8s-1aee5045e658d174-node-hc --network https://www.googleapis.com/compute/v1/projects/playground/global/networks/e2e-test-nicksardo --description "" --allow tcp:10256 --source-ranges 130.211.0.0/22,35.191.0.0/16,209.85.152.0/22,209.85.204.0/22 --target-tags e2e-test-nicksardo-minion --project playground`
  1m            1m              1       service-controller                      Normal          EnsuredLoadBalancer             Ensured load balancer
```
2017-09-05 08:47:28 -07:00
Jordan Liggitt
5acd5b52f4 Tolerate group discovery errors in e2e ns cleanup 2017-09-04 17:31:17 -04:00
Nick Sardo
676b95e097 Gracefully handle permission errors when attempting to create firewall rules 2017-09-04 09:00:49 -07:00
cedric lamoriniere
1dbef2f113 Job failure policy support in JobController
Job failure policy integration in JobController. From the
JobSpec.BackoffLimit the JobController will define the backoff
duration between Job retries.

It uses the `workqueue.RateLimitingInterface` to store the number of
retries as requeues, and the default Job backoff initial duration is set
during the initialization of the `workqueue.RateLimiter`.

Since the number of retries for each job is stored in a local structure
"JobController.queue", if the JobController restarts the number of retries
will be lost and the backoff duration will be reset to 0.

Add e2e test for Job backoff failure policy
2017-09-03 12:07:12 +02:00
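
The entry above (1dbef2f113) keeps per-job retry counts inside an in-memory rate limiter. The sketch below models that idea only, assuming the delay doubles with each failure up to a cap. The class `ItemBackoff`, its method names, and the 10-second and 360-second values are hypothetical; the commit message does not state the growth rule or the constants.

```python
class ItemBackoff:
    """Per-key backoff with in-memory failure counts (illustrative only)."""

    def __init__(self, base_s, max_s):
        self.base_s = base_s
        self.max_s = max_s
        self.failures = {}  # key -> number of requeues so far

    def when(self, key):
        """Return the delay before the next retry and record the requeue."""
        n = self.failures.get(key, 0)
        self.failures[key] = n + 1
        return min(self.base_s * (2 ** n), self.max_s)

    def num_requeues(self, key):
        return self.failures.get(key, 0)

    def forget(self, key):
        """Drop the count, e.g. after the job succeeds."""
        self.failures.pop(key, None)


limiter = ItemBackoff(base_s=10, max_s=360)
delays = [limiter.when("default/my-job") for _ in range(3)]

# A controller restart is modelled by building a fresh limiter:
# the counts live only in memory, so they start again from zero.
restarted = ItemBackoff(base_s=10, max_s=360)
```

Because `failures` is an ordinary dictionary, nothing survives a restart, which is the limitation the commit message points out.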
Manjunath A Kumatagi
ee4d54c70c Port e2e tests for multi architecture 2017-09-01 05:40:52 +05:30
Manjunath A Kumatagi
22c3a590d1 Fix bazel 2017-09-01 05:39:00 +05:30
Kubernetes Submit Queue
022919d1a4 Merge pull request #51483 from yujuhong/e2e-net-tiers
Automatic merge from submit-queue

e2e: Add tests for network tiers in GCE

This test depends on #51301, which adds the new feature. Only the `e2e: Add tests for network tiers in GCE` commit is new.
#51301 should pass this new test.
2017-08-30 06:55:35 -07:00
Kubernetes Submit Queue
04bc4ec716 Merge pull request #50398 from pci/gcloud-compute-list
Automatic merge from submit-queue (batch tested with PRs 47054, 50398, 51541, 51535, 51545)

Switch away from gcloud deprecated flags in compute resource listings

**What is fixed**

Remove deprecated `gcloud compute` flags, see linked issue.

**Which issue this PR fixes**:

fixes #49673 

**Special notes for your reviewer**:

The change in `gcloudComputeResourceList` in `test/e2e/framework/ingress_utils.go` isn't strictly needed as currently no affected resources are called on within that file, however the function has the _potential_ to access affected resources so I covered it as well. Happy to change if deemed unnecessary.

**Release note**:

```release-note
NONE
```
2017-08-30 01:51:29 -07:00
Kubernetes Submit Queue
b4d08cb9b5 Merge pull request #50940 from MrHohn/kube-proxy-ds-upgrade-tests
Automatic merge from submit-queue (batch tested with PRs 51228, 50185, 50940, 51544, 51543)

Add upgrades tests for kube-proxy daemonset migration path

**What this PR does / why we need it**:
From #23225, this is a part of setting up CIs to validate the kube-proxy migration path (static pods -> daemonset and reverse).
The other part of the work (adding real CIs that run these tests) will be in a separate PR against [kubernetes/test-infra](https://github.com/kubernetes/test-infra).

Though this is currently blocked by #50705.

**Special notes for your reviewer**:
cc @roberthbailey  @pwittrock 

**Release note**:

```release-note
NONE
```
2017-08-29 23:54:30 -07:00
Kubernetes Submit Queue
01e961b380 Merge pull request #49749 from sbezverk/e2e_selinux_local_starage_test
Automatic merge from submit-queue (batch tested with PRs 51377, 46580, 50998, 51466, 49749)

Adding e2e SELinux test for local storage

Adding e2e test for SELinux enabled local storage
/sig storage
Closes #45054
2017-08-29 22:57:11 -07:00
Philip Ingrey
697f92a5d2 Switch away from gcloud deprecated flags in compute resource listings 2017-08-30 06:41:09 +01:00
Shyam JVS
36910232ab Merge pull request #51343 from shyamjvs/correct-cluster-ip-range
Correct default cluster-ip-range subnet
2017-08-30 01:31:50 +02:00
Shyam Jeedigunta
2df4698473 Correct default cluster-ip-range subnet 2017-08-29 23:15:23 +02:00
Zihong Zheng
5dc0845e36 Add upgrades tests for kube-proxy daemonset migration path 2017-08-29 10:16:37 -07:00
Kubernetes Submit Queue
25da6e64e2 Merge pull request #48454 from weiwei04/check-job-activeDeadlineSeconds
Automatic merge from submit-queue (batch tested with PRs 44719, 48454)

check job ActiveDeadlineSeconds

**What this PR does / why we need it**:

enqueue a sync task after ActiveDeadlineSeconds

**Which issue this PR fixes** *: 

fixes #32149

**Special notes for your reviewer**:

**Release note**:

```release-note
enqueue a sync task to wake up jobcontroller to check job ActiveDeadlineSeconds in time
```
2017-08-29 08:25:06 -07:00
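
The entry above (25da6e64e2) enqueues a sync so the job controller wakes up when `ActiveDeadlineSeconds` expires. A minimal sketch of the delay calculation, assuming the sync should fire at the job's start time plus its deadline; the function name and the use of plain seconds are hypothetical choices for illustration, not the controller's code.

```python
def seconds_until_deadline(start_time_s, active_deadline_s, now_s):
    """Delay after which a job should be re-synced to check its deadline.

    Returns 0.0 when the deadline has already passed, so the sync runs at once.
    """
    return max(0.0, start_time_s + active_deadline_s - now_s)
```

For a job started at t=100 with a 30-second deadline, a controller looking at it at t=110 would schedule the check 20 seconds later.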
Wei Wei
46239ea30b check job ActiveDeadlineSeconds 2017-08-29 20:15:11 +08:00
Andrzej Wasylkowski
0c1ab5597e Renamed ClusterSize and WaitForClusterSize to NumberOfReadyNodes and WaitForReadyNodes, respectively. 2017-08-29 11:53:17 +02:00
Andrzej Wasylkowski
9b0f4c9f7c Added an end-to-end test ensuring that Cluster Autoscaler does not scale up when all pending pods are unschedulable. 2017-08-29 11:52:26 +02:00