Commit Graph

1790 Commits

Author SHA1 Message Date
Random-Liu
fcfe4264fe Change the disk eviction test to pull images again after the test. 2016-09-21 15:54:03 -07:00
Dawn Chen
f1f16fe03a Update the containervm image to the latest one (container-v1-3-v20160604). 2016-09-21 10:24:22 -07:00
Klaus Ma
10e880684f Removed unused var. 2016-09-21 23:28:15 +08:00
Paul Morie
3539993ee0 Make node E2E tests more transparent 2016-09-20 21:55:41 -04:00
Kubernetes Submit Queue
0986a01f4f Merge pull request #33131 from Random-Liu/fix-node-e2e-for-cri
Automatic merge from submit-queue

Fix the properties file for node e2e cri validation.

I fixed this locally before, but accidentally missed in the PR. Sorry about that.

This time, I've tried myself, it should work.

@yujuhong
2016-09-20 17:09:30 -07:00
Kubernetes Submit Queue
6fd94968e1 Merge pull request #32738 from Amey-D/gci-version-v1.4
Automatic merge from submit-queue

Bump up GCI version.

```release-note
   Upgrading Container-VM base image for k8s on GCE. Brief changelog as follows:
    - Fixed performance regression in veth device driver
    - Docker and related binaries are statically linked
    - Fixed the issue of systemd being oom-killable
```

Fixes #32596

This needs a cherrypick into v1.4 release branch because it is fixing v1.4 release blocking issues. This patch is easy and safe to rollback in case of emergencies.

@vishh can you please review?

Fixes #32596 and many other issues.
cc/ @kubernetes/goog-image  FYI
2016-09-20 16:30:01 -07:00
Random-Liu
87d62d50ee Fix the properties file for node e2e cri validation. 2016-09-20 15:04:55 -07:00
Amey Deshpande
5da8486758 Bump up GCI version.
Brief changelog compared to gci-dev-54-8743-3-0:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
- Updated built-in kubelet version to 1.3.7
- add ethtool and ebtables binaries expected by kubelet

Fixes #32596
2016-09-20 13:59:31 -07:00
Random-Liu
ae031634e4 Add CRI Validation test. The test run non-flaky, non-serial test against
Kubernetes HEAD and docker v1.11.2 with CRI enabled.
2016-09-20 12:18:07 -07:00
Kubernetes Submit Queue
c21fdc71a3 Merge pull request #32986 from Random-Liu/add-image-white-list
Automatic merge from submit-queue

Node E2E: Add image white list

This is part of #29081. Fixes #29155.

As is discussed with @yujuhong in #29155, it is difficult to maintain the prepull image list if it is not enforced. 

This PR added an image white list in the test framework, only images in the white list could be used in the test. If the image is not in the white list, the test will fail with reason:
```
Image "XXX" is not in the white list, consider adding it to CommonImageWhiteList in test/e2e/common/util.go or NodeImageWhiteList in test/e2e_node/image_list.go
```

Notice that if image pull policy is `PullAlways`, the image is not necessary to be in the white list or prepulled, because the test expects the image to be pulled during the test.

Currently, the image white list is only enabled in node e2e, because the image puller in e2e test is not integrated with the image white list yet.

/cc @kubernetes/sig-node
2016-09-20 07:28:58 -07:00
Random-Liu
08d74f33f6 Add client version. 2016-09-19 21:27:00 -07:00
Random-Liu
ed411c9042 Add image white list, images in white list will be prepulled, and
only images in white list could be used in the test. Currently only
enabled in node e2e test.
2016-09-19 14:39:23 -07:00
Random-Liu
dfcbdae178 Add image pull retry in image pulling test. 2016-09-19 14:18:37 -07:00
Kubernetes Submit Queue
3aa72fa480 Merge pull request #32926 from kubernetes/revert-32841-revert-32251-fix-oom-policy
Automatic merge from submit-queue

[kubelet] Fix oom-score-adj policy in kubelet

Fixes #32238 

We have been having this regression since v1.3. It is critical for GKE/GCE deployments of k8s because docker daemon has a high likelihood of being OOM killed which will end up nuking all containers. 
The reason for moving from mnt to pid is that docker daemon moves itself into a new mnt namespace with systemd based deployments.
2016-09-17 13:00:20 -07:00
Paul Morie
88acffcda1 Fix error message around gcloud calls in node e2e and gubernator 2016-09-17 01:05:20 -04:00
Vish Kannan
a1fe3adbc7 Revert "Revert "[kubelet] Fix oom-score-adj policy in kubelet"" 2016-09-16 16:32:58 -07:00
Kubernetes Submit Queue
d69cdce704 Merge pull request #32820 from coufon/change_collector_log
Automatic merge from submit-queue

change the error log for empty resource usage

This PR changes the error log for empty resource usage buffer for a container to be more clear. It happens when the container name is wrong, or cAdvisor somehow does not response.
2016-09-15 23:54:34 -07:00
Vish Kannan
492ca3bc9c Revert "[kubelet] Fix oom-score-adj policy in kubelet" 2016-09-15 19:28:59 -07:00
Kubernetes Submit Queue
fcc97f37ee Merge pull request #32718 from mikedanese/mv-informer
Automatic merge from submit-queue

move informer and controller to pkg/client/cache

@kubernetes/sig-api-machinery
2016-09-15 16:44:30 -07:00
Zhou Fang
3e16eb5082 change the error log for empty resource usage 2016-09-15 14:13:25 -07:00
Mike Danese
a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Vishnu kannan
e4acad7afb Fix oom-score-adj policy in kubelet.
Docker daemon and kubelet needs to be protected by setting oom-score-adj to -999.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-14 11:56:10 -07:00
Kubernetes Submit Queue
312acd9e30 Merge pull request #32342 from coufon/get_image_machine_info_from_apiserver
Automatic merge from submit-queue

Get image and machine info from apiserver in node e2e test

This PR changes node e2e test to get image and machine information from API server instead of pass them from Jenkins test framework. The original format to pass image and machine info is naming the test node as "machine-image-uuid", which is hard to parse because "-" occurs a lot in both machine and image names.

Now we add two labels "image" and "machine" into performance data. The machine type has the format "cpu:1core,memory:3.6GB".

This PR is based on #32250.
2016-09-14 03:34:45 -07:00
Zhou Fang
b47e22d013 endable DynamicKubeletConfig in benchmark test properties 2016-09-13 13:59:19 -07:00
Zhou Fang
a683eb0418 get image and machine info from api server instead of passing from test
# Please enter the commit message for your changes. Lines starting
2016-09-13 08:41:29 -07:00
Kubernetes Submit Queue
0ca6506850 Merge pull request #32250 from coufon/increase_qps
Automatic merge from submit-queue

Add node e2e density test using 60 QPS for benchmark

This PR adds a new benchmark node e2e density test which sets Kubelet API QPS limit from default 5 to 60, through ConfigMap. 

The latency caused by API QPS limit is as large as ~30% when creating a large batch of pods (e.g. 105). It makes the pod startup latency, as well creation throughput underestimated. This test helps us to know the real performance of Kubelet core.
2016-09-12 20:27:11 -07:00
Michael Taufen
28db03869b Fix memory eviction test parameters. Those parameters should NOT have come through in b9f0bd95 2016-09-12 12:01:53 -07:00
Zhou Fang
a6500cc74a change benchmark configration file to add QPS60 tests 2016-09-12 11:46:38 -07:00
Kubernetes Submit Queue
af325ee7bf Merge pull request #31797 from aveshagarwal/master-dapi-volume-tests-image-update
Automatic merge from submit-queue

Update container image version for downward api volume tests

Some tests were using 0.7, and some were using 0.6, so updating all to 0.7.
@kubernetes/rh-cluster-infra
2016-09-12 01:22:27 -07:00
Kubernetes Submit Queue
469698a803 Merge pull request #32169 from ixdy/node-e2e-flake
Automatic merge from submit-queue

Make error more useful when failing to list node e2e images

To help investigate https://github.com/kubernetes/kubernetes/issues/31694 if it happens again.
2016-09-11 05:07:00 -07:00
Kubernetes Submit Queue
51d996e5d7 Merge pull request #32003 from Random-Liu/change-docker-validation-config-file
Automatic merge from submit-queue

Automated Docker Validation: Change wrong name in perf config.

The config key `containervm-density*` is improper, remove it.

/cc @coufon
2016-09-10 17:58:23 -07:00
Kubernetes Submit Queue
09efe0457d Merge pull request #32163 from mtaufen/more-eviction-logging
Automatic merge from submit-queue

Log pressure condition, memory usage, events in memory eviction test

I want to log this to help us debug some of the latest memory eviction test flakes, where we are seeing burstable "fail" before the besteffort. I saw (in the logs) attempts by the eviction manager to evict besteffort a while before burstable phase changed to "Failed", but the besteffort's phase appeared to remain "Running". I want to see the pressure condition interleaved with the pod phases to get a sense of the eviction manager's knowledge vs. pod phase.
2016-09-09 18:37:55 -07:00
Michael Taufen
b9f0bd959e Log the following items in memory eviction test:
- memory working set
- pressure condition
- events for the default and test namespaces, after the test completes
2016-09-09 13:42:26 -07:00
Kubernetes Submit Queue
e317af87cc Merge pull request #31819 from mtaufen/plumb-feature-gates
Automatic merge from submit-queue

Plumb --feature-gates from TEST_ARGS to components in node e2e tests

This means you can set `TEST_ARGS` on the command line, in a `.properties` config for a Jenkins job, etc, to toggle gated features. For example:

`TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'`

/cc @vishh @jlowdermilk
2016-09-09 12:31:00 -07:00
Zhou Fang
a1d2f43e0e add benchmark test using 60 QPS 2016-09-07 17:51:51 -07:00
Kubernetes Submit Queue
0bd0d5571a Merge pull request #31540 from mtaufen/DockerOrDieRename
Automatic merge from submit-queue

Rename ConnectToDockerOrDie to CreateDockerClientOrDie

This function does not actually attempt to connect to the docker daemon, it just creates a client object that can be used to do so later. The old name was confusing, as it implied that a failure to touch the docker daemon could cause program termination (rather than just a failure to create the client).
2016-09-07 15:27:41 -07:00
Jeff Grafton
1e0cbbf451 Make error more useful when failing to list node e2e images 2016-09-06 17:16:53 -07:00
Minhan Xia
1e88c99e3e bump cni 2016-09-06 10:48:36 -07:00
Kubernetes Submit Queue
2a7d0df30d Merge pull request #30727 from asalkeld/iptables-caps
Automatic merge from submit-queue

Clean up IPTables caps i.e.: sed -i "s/Iptables/IPTables/g"

Fixes #30651
2016-09-06 09:01:27 -07:00
Kubernetes Submit Queue
631fda19a1 Merge pull request #31769 from Random-Liu/wrong-log-permission-bit
Automatic merge from submit-queue

Node E2E: Fix wrong permission bit for log file.

When creating log for logs from journald, we use `0755` which is weird to me.
This PR changes it to `0666`.
2016-09-06 01:24:12 -07:00
Random-Liu
35cabc370e Fix wrong permission bit for log file. 2016-09-02 14:05:18 -07:00
Vishnu kannan
59e14cf55b Increase logging level for e2e node services
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-02 14:05:09 -07:00
Random-Liu
d36f7220da Change wrong name in perf config. 2016-09-02 13:48:10 -07:00
Kubernetes Submit Queue
0cdaf1028e Merge pull request #31912 from mtaufen/eviction-aftereach
Automatic merge from submit-queue

Move wait for pressure to subside to AfterEach 

so we still wait if the test part of the test for eviction order fails.
2016-09-02 13:15:36 -07:00
Zhou Fang
bb0a0c0fe9 increase resource usage limit in node e2e test 2016-09-02 09:12:40 -07:00
Avesh Agarwal
4ba39b4722 Update mounttest container image version to 0.7 everywhere. 2016-09-02 09:03:11 -04:00
Kubernetes Submit Queue
ca98482da9 Merge pull request #31945 from Random-Liu/prepull-common-test-image
Automatic merge from submit-queue

Add images used in common test into prepull image list.

Addresses https://github.com/kubernetes/kubernetes/issues/31774#issuecomment-243800830.

Fixes #31774, #31579.

This PR added all images in common test into the node e2e prepull list.
Mark P2 because this help get rid of image pulling flake.

@yujuhong
2016-09-02 01:36:55 -07:00
Random-Liu
76b35cf387 Add images used in common test into prepull image list. 2016-09-01 21:20:49 -07:00
Random-Liu
5420138378 Add automated docker performance validation. 2016-09-01 19:18:20 -07:00
Michael Taufen
3ad9a1e423 Move wait for pressure to subside to AfterEach so we still wait if the test for eviction order fails 2016-09-01 14:39:43 -07:00