Commit Graph

369 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
22fe385c13 Merge pull request #31653 from euank/kill-the-updates-for-the-nth-time
Automatic merge from submit-queue

test/node-e2e: Update CoreOS update disabling

Previously in this saga... #25004

This disables update-engine and locksmithd with ignition instead of
cloud-init so that they're really totally 100% disabled. Our ignition guy promises.

Pretty much every way of disabling them with cloud-init is mildly racy.

Fixes #31633 

I think @vishh can say "I told you so" after the comment on https://github.com/kubernetes/kubernetes/pull/30023#discussion-diff-73431324 .. he was right, but it turns out "stop" there doesn't really work either because of the mess that is cloud-init. Fortunately, converting our cloud-init to json and calling it "ignition" works quite well 😄 

Testing done: I ssh'd in and verified that yes, they're disabled. I didn't wait on the e2e tests to pass, so we'll let this PR check that.
2016-08-30 00:08:45 -07:00
Phillip Wittrock
956501b1f0 Merge pull request #31555 from mtaufen/eviction-pod-name
Memory eviction test podName must be lowercase
2016-08-29 17:42:10 -07:00
Euan Kemp
e58f3f61f8 test/node-e2e: Update CoreOS update disabling
This disables update-engine and locksmithd with ignition instead of
cloud-init so that they're really totally 100% disabled.

Pretty much every way of disabling them with cloud-init is mildly racy.

Fixes #31633
2016-08-29 16:11:40 -07:00
Kubernetes Submit Queue
e6df2db5c3 Merge pull request #31477 from freehan/cnibump
Automatic merge from submit-queue

bump cni to 9d5e6e6

fixes: #31348
2016-08-28 14:46:20 -07:00
Kubernetes Submit Queue
3c23d68b66 Merge pull request #31471 from timstclair/aa-beta
Automatic merge from submit-queue

[AppArmor] Promote AppArmor annotations to beta

Justification for promoting AppArmor to beta:

1. We will provide an upgrade path to GA
2. We don't anticipate any major changes to the design, and will continue to invest in this feature
3. We will thoroughly test it. If any serious issues are uncovered we can reevaluate, and we're committed to fixing them.
4. We plan to provide beta-level support for the feature anyway (responding quickly to issues).

Note that this does not include the yet-to-be-merged status annotation (https://github.com/kubernetes/kubernetes/pull/31382). I'd like to propose keeping that one alpha for now because I'm not sure the PodStatus is the right long-term home for it (I think a separate monitoring channel, e.g. cAdvisor, would be a better solution).

/cc @thockin @matchstick @erictune
2016-08-28 12:19:56 -07:00
Michael Taufen
d0f84dd4e7 podName must be lowercase 2016-08-26 15:12:34 -07:00
Minhan Xia
69e540e634 bump cni to 9d5e6e6 2016-08-26 13:13:24 -07:00
Michael Taufen
aa1d273584 Wait for the memory pressure condition to be absent before finishing the memory eviction test 2016-08-26 10:15:28 -07:00
Dawn Chen
24e81af7b3 Revert "Avoid disk eviction node e2e test using up all the disk space" 2016-08-26 08:59:42 -07:00
Kubernetes Submit Queue
cce68024e4 Merge pull request #31391 from ronnielai/container-gc
Automatic merge from submit-queue

Avoid disk eviction node e2e test using up all the disk space
2016-08-26 05:25:53 -07:00
Tim St. Clair
a5b7212453
Promote AppArmor annotations to beta 2016-08-25 15:40:32 -07:00
Kubernetes Submit Queue
863dd10ae4 Merge pull request #30540 from Random-Liu/refactor-node-e2e-framework
Automatic merge from submit-queue

Node Conformance Test: Refactor node e2e framework

For #30122, #30174.
Based on #30348.

**Please only review the last 3 commits.**

This PR is part of our roadmap to package node conformance test.
The 1st commit is from #30348, it removed unnecessary dependencies in the node e2e test framework, because we've statically linked these dependencies.

The PR refactored the node e2e framework. Moving different utilities into different packages under `pkg/`.

We need to do this because:
1) Files like e2e_remote.go and e2e_build.go should only be used by runner, but they were compiled into the test suite because they were placed in the same package. The worst thing is that it will introduce some never used flags in the test suite binary.
2) Make the directory structure more clear. Only test should be placed in `test/e2e_node`, other utilities should be placed in different packages in `pkg/`.

@dchen1107 @vishh 
/cc @kubernetes/sig-node @kubernetes/sig-testing
2016-08-25 14:06:56 -07:00
Kubernetes Submit Queue
a9a81219ef Merge pull request #31185 from coufon/log_throughput_benchmark
Automatic merge from submit-queue

add throughput in perf data and disable --cgroups-per-qos

This PR adds throughput data to printed perf data for benchmark. It also disables --cgrous-per-qos in jenkinds-benchmark.properties.
2016-08-25 04:05:20 -07:00
bindata-mockuser
d0577e7c74 Avoid disk eviction node e2e test using up all the disk space 2016-08-24 22:07:58 -07:00
Random-Liu
afb780d4ee Move utilities into different packages. Add local and remove runner. 2016-08-24 20:18:45 -07:00
Kubernetes Submit Queue
f0462c4043 Merge pull request #31200 from ronnielai/test1
Automatic merge from submit-queue

Skip disk eviction test on non-supported images.
2016-08-24 20:06:07 -07:00
Kubernetes Submit Queue
1952986a34 Merge pull request #30348 from Random-Liu/remove-unnecessary-binary-copy
Automatic merge from submit-queue

Node Conformance Test: Remove unnecessary binary copy

For #30122, #30174.

This PR removed unnecessary dependencies in the node e2e test framework, because we've statically linked these dependencies.

@dchen1107 @vishh 
/cc @kubernetes/sig-node @kubernetes/sig-testing
2016-08-24 14:35:34 -07:00
Michael Taufen
9fdb3f291a Stop fd leak in e2e_service.go
Previously this code used http.Get and failed to read/close resp.Body, which
prevented network connection reuse, leaking fds. Now we use http.Head
instead, because its response always has a nil Body, so we don't have to
worry about read/close.
2016-08-24 09:15:25 -07:00
Michael Taufen
2e989a3c38 Revert "Merge pull request #31297 from mikedanese/revert-kubelet"
This reverts the revert of #30090 and #31282.
2016-08-24 09:06:12 -07:00
Kubernetes Submit Queue
9b0d57ff95 Merge pull request #31164 from coufon/skip_benchmark_in_jenkins_serial
Automatic merge from submit-queue

skip benchmark in jenkins serial test

This PR changes jenkins-serial.properties to skip benchmark tests (with tag [Benchmark]) in jenkins serial tests. It also add more comments in run_e2e.go.
2016-08-24 03:32:51 -07:00
Tim St. Clair
a29ad353a6
Increase the AppArmor pod stop timeout to match the start timeout 2016-08-23 17:03:38 -07:00
Michael Taufen
e780bb5fbd Enable dynamic kubelet configuration during tests 2016-08-23 07:42:44 -07:00
Michael Taufen
2413ec4494 Restart Kubelet if it exits during e2e tests 2016-08-23 07:42:44 -07:00
Michael Taufen
085df61204 Node e2e test for Dynamic Kubelet Configuration 2016-08-22 22:45:23 -07:00
Random-Liu
e646dc6b9e Remove unnecessary code after e2e services are statically linked. 2016-08-22 20:51:24 -07:00
bindata-mockuser
7b70c23998 Make disk eviction test run correctly on all images. 2016-08-22 18:40:54 -07:00
Zhou Fang
ac379b038e add throughput in perf data and disable --cgroups-per-qos 2016-08-22 16:04:32 -07:00
Zhou Fang
c2d1a32597 skip benchmark in jenkins serial test 2016-08-22 14:19:40 -07:00
Random-Liu
eca3dc6cd5 Extend serial test suite timeout to 3 hours. 2016-08-21 23:40:05 -07:00
Kubernetes Submit Queue
5645ca749b Merge pull request #30941 from Random-Liu/remove-fatal-in-e2e-suite
Automatic merge from submit-queue

Node E2E: Remove fatal error in e2e_node_suite_test.go

Addresses https://github.com/kubernetes/kubernetes/issues/30779#issuecomment-240532190.

Currently we run node e2e test in parallel, and ginkgo makes sure that we only initialize test framework in the first test node.
However, because we throw out some fatal error during the initialization. Once there is an fatal error, the first test node will die immediately without reporting any error, and the other nodes will exit because the first node is gone with meaningless error.

If kubelet start fails, we'll get something like:
```
------------------------------
Failure [132.485 seconds]
[BeforeSuite] BeforeSuite 
/usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138

  BeforeSuite on Node 1 failed

  /usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138
------------------------------
......
------------------------------
Failure [132.465 seconds]
[BeforeSuite] BeforeSuite 
/usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138

  BeforeSuite on Node 1 failed

  /usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138
```

This PR replaces these fatal errors with gomega assertion, with this PR, we'll get:
```
Failure [132.482 seconds]
[BeforeSuite] BeforeSuite 
/usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138

  should be able to start node services.
  Expected success, but got an error:
      <*errors.errorString | 0xc8203351b0>: {
          s: "failed to run server start command \"/tmp/ginkgo869068712/e2e_node.test --run-services-mode --server-start-timeout 2m0s --report-dir  --node-name lantaol0.mtv.corp.google.com --disable-kubenet=true --cgroups-per-qos=false --manifest-path /tmp/node-e2e-pod221291440 --eviction-hard memory.available<250Mi\": exit status 255",
      }
      failed to run server start command "/tmp/ginkgo869068712/e2e_node.test --run-services-mode --server-start-timeout 2m0s --report-dir  --node-name lantaol0.mtv.corp.google.com --disable-kubenet=true --cgroups-per-qos=false --manifest-path /tmp/node-e2e-pod221291440 --eviction-hard memory.available<250Mi": exit status 255

  /usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:117
------------------------------
Failure [132.485 seconds]
[BeforeSuite] BeforeSuite 
/usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138

  BeforeSuite on Node 1 failed

  /usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138
------------------------------
......
------------------------------
Failure [132.465 seconds]
[BeforeSuite] BeforeSuite 
/usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138

  BeforeSuite on Node 1 failed

  /usr/local/google/home/lantaol/workspace/src/k8s.io/kubernetes/test/e2e_node/e2e_node_suite_test.go:138
```

This is much more informative.

/cc @kubernetes/sig-node
2016-08-21 18:21:22 -07:00
Kubernetes Submit Queue
08b3c6829e Merge pull request #30718 from Random-Liu/wait-node-ready-before-start-test
Automatic merge from submit-queue

Node E2E: Wait for node ready before the node e2e test started.

Fixes https://github.com/kubernetes/kubernetes/issues/30252.

This PR makes node e2e test wait for exactly one node ready before running other test.

@ronnielai @mtaufen
2016-08-21 12:42:03 -07:00
Kubernetes Submit Queue
37f8559c22 Merge pull request #31039 from coufon/add_benchmark_to_jenkins
Automatic merge from submit-queue

Add benchmark to jenkins

This PR contains the following changes:

1. Add more tests in density benchmark test;
2. Add the peak value (100%) in latency and CPU usage statistic data;
3. Move the Ginkgo focus flag from e2e_remote.go to run_e2e.go;
4. Support running benchmark in run_e2e.go. The benchmark configuration file is an extension of image configuration. Each item requires additional GCE machine type (e.g. n1-standard-1, default value will be used if empty) and test names (Ginkgo focus regex strings). A test item is regarded as benchmark if the tests field is non-empty.
2016-08-21 08:32:39 -07:00
Random-Liu
cb760a6ed4 Wait for node ready before the node e2e test started. 2016-08-20 17:54:23 -07:00
Random-Liu
dd6584a606 Statically link apiserver to node e2e. 2016-08-20 17:41:34 -07:00
Kubernetes Submit Queue
98c4029275 Merge pull request #30200 from Random-Liu/move-namespace-controller-to-services
Automatic merge from submit-queue

Node Conformance Test: Move namespace controller to services

For #30122, #30174.
Based on #30116, #30198.

**Please only review the 3rd PR.**

This PR is part of our roadmap to package node conformance test.
The 1st commit is from #30116, which started e2e services in a separate process.
The 2nd commit is from #30198, it statically linked etcd into the node e2e framework.

The 3rd commit is new, it moved namespace controller into e2e services.

@dchen1107 @vishh 
/cc @kubernetes/sig-node @kubernetes/sig-testing
2016-08-20 14:19:40 -07:00
Zhou Fang
b8ab0c50c5 add benchmark to jenkins 2016-08-20 09:43:42 -07:00
bindata-mockuser
1c47d9ddd0 Adding disk eviciton test to node e2e tests 2016-08-19 21:28:02 -07:00
Kubernetes Submit Queue
e9815020eb Merge pull request #30475 from derekwaynecarr/pod-cgroup
Automatic merge from submit-queue

Unblock iterative development on pod-level cgroups

In order to allow forward progress on this feature, it takes the commits from #28017 #29049 and then it globally disables the flag that allows these features to be exercised in the kubelet.  The flag can be re-added to the kubelet when its actually ready.

/cc @vishh @dubstack @kubernetes/rh-cluster-infra
2016-08-19 21:06:48 -07:00
Kubernetes Submit Queue
237db0363a Merge pull request #31035 from ixdy/e2e-service-account
Automatic merge from submit-queue

When running inside docker, activate service account ASAP

Also switching to just use `GOOGLE_APPLICATION_CREDENTIALS`, rather than both.

x-ref https://github.com/kubernetes/test-infra/issues/318
2016-08-19 17:22:34 -07:00
Kubernetes Submit Queue
8c430c2fd7 Merge pull request #30476 from mtaufen/eviction_fix
Automatic merge from submit-queue

Wait for memory to be reclaimed after node_e2e MemoryEviction test

This helps prevent interference with other tests that run immediately after the MemoryEviction test.

/cc @Random-Liu @coufon
2016-08-19 17:22:05 -07:00
Jeff Grafton
8278d1c2ef When running inside docker, activate service account ASAP
Additionally, remove activation code everywhere else, since we do that
already in Jenkins.
2016-08-19 15:41:33 -07:00
Kubernetes Submit Queue
b29023aa91 Merge pull request #30810 from mnshaw/gubernator-bugs
Automatic merge from submit-queue

Gubernator bug fixes: mv and GCS bucket permissions

Fixed issue where results file was not moved correctly, and also the permissions issue with the GCS bucket.

Will rebase after #30414 is merged

@timstclair
2016-08-19 14:34:56 -07:00
Zhou Fang
30eb6882f4 add peak (100%) lantecy and CPU usage in perf data 2016-08-19 14:33:38 -07:00
Zhou Fang
95eb9efb11 add more density benchmark tests 2016-08-19 14:03:21 -07:00
Michael Taufen
a227ce42f2 Wait for memory to be reclaimed after node_e2e MemoryEviction test
This helps prevent interference with other tests that run immediately
after the MemoryEviction test.
2016-08-19 13:58:03 -07:00
Kubernetes Submit Queue
5f7875a9bc Merge pull request #30786 from coufon/add_time_series
Automatic merge from submit-queue

Add logging time series to benchmark test

This PR adds a new file benchmark_util.go which contains tool functions for benchmark (we can migrate benchmark related functions into it). 

The PR logs time series data for density benchmark test.
2016-08-19 13:41:29 -07:00
derekwaynecarr
fde285cd8f Disable cgroups-per-qos flag until implementation is stabilized 2016-08-19 11:08:59 -04:00
Kubernetes Submit Queue
29e16d0174 Merge pull request #30913 from Random-Liu/fix-readiness-check
Automatic merge from submit-queue

Node E2E: Make readiness check handling process exits with 0 exit code.

As is mentioned by @mtaufen:
 "there is a problem with the way service `start` is currently implemented in test/e2e_node/e2e_service.go. If the Kubelet exits with status 0 before the health check completes, cmdErrorChan will be closed and, as a result, nil will be read from that channel, and you will return a nil error from `start`."

This PR changes the logic to:
1) If the err channel returns an error, return the error
2) If the err channel returns a nil, ignore it and continue checking readiness.
3) If the err channel is closed before readiness check succeeds, replace it with `blockCh` and continue checking readiness.

@mtaufen 
/cc @kubernetes/sig-node
2016-08-18 21:54:50 -07:00
Zhou Fang
58495b5214 add labels to perf dataset 2016-08-18 17:15:43 -07:00
Random-Liu
35aad1593f Remove fatal error in e2e_node_suite_test.go 2016-08-18 16:03:02 -07:00