456 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
19a2a10354 Merge pull request #33389 from Random-Liu/lifecycle-hook
Automatic merge from submit-queue

CRI: Fix lifecycle hook and add container lifecycle node e2e test

This PR:
1) Adds pod spec missing handling in kuberuntime. (1st commit)
2) Adds container lifecycle hook node e2e test. (2nd commit)

@yujuhong @feiskyer
2016-09-26 10:48:35 -07:00
Kubernetes Submit Queue
66d67ee41d Merge pull request #33178 from k82cn/remove_unused_var
Automatic merge from submit-queue

Removed unused var.
2016-09-25 21:30:59 -07:00
Kubernetes Submit Queue
1654fd4041 Merge pull request #33105 from k82cn/k8s_33091
Automatic merge from submit-queue

Fixed e2e_node build error on Mac.

fixes #33091 

cc @vishh, any suggestions for avoiding the duplicated code?
2016-09-24 22:02:13 -07:00
Klaus Ma
849400abf9 Fix build error on Mac. 2016-09-25 11:15:42 +08:00
Random-Liu
f501acebab Add the monitorParent option when starting services, and set
monitorParent to false when stop-services=false.
2016-09-24 19:45:19 -07:00
Random-Liu
0d3befd7ea Split services.go into services.go, internal_services.go and server.go. 2016-09-24 19:45:19 -07:00
Random-Liu
5eb41e9acb Add container lifecycle hook test. 2016-09-23 17:13:19 -07:00
Kubernetes Submit Queue
071927a59d Merge pull request #32549 from smarterclayton/gc_non_kube_legacy
Automatic merge from submit-queue

Allow garbage collection to work against different API prefixes

The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.

Allows OpenShift to use the GC
2016-09-23 14:06:35 -07:00
Kubernetes Submit Queue
76d15d193d Merge pull request #33236 from dchen1107/test1
Automatic merge from submit-queue

Fix node performance benchmark by using latest containervm image (docker 1.11.2)

Also add two more tests for resource tracking. 

cc/ @Random-Liu @coufon
2016-09-23 04:50:36 -07:00
Kubernetes Submit Queue
1f7e79afbf Merge pull request #33066 from Random-Liu/set-docker-client-version
Automatic merge from submit-queue

Add docker client version.

Addressed https://github.com/kubernetes/kubernetes/issues/29478#issuecomment-248197665.

This partially reverted #31540, because currently we are really trying to connect to docker daemon when creating the client.

This PR updated docker client with real docker apiversion with `UpdateClientVersion`, so that the version related logic of engine-api can work properly, such as https://github.com/docker/engine-api/pull/174/files.

@yujuhong @feiskyer
2016-09-22 19:09:14 -07:00
Clayton Coleman
97c35fcc67 Allow garbage collection to work against different API prefixes
The GC needs to build clients based only on Resource or Kind. Hoist the
restmapper out of the controller and the clientpool, support a new
ClientForGroupVersionKind and ClientForGroupVersionResource, and use the
appropriate one in both places.
2016-09-22 15:00:58 -04:00
Kubernetes Submit Queue
34c61bdba6 Merge pull request #33201 from Random-Liu/disk-eviction-recover-images
Automatic merge from submit-queue

Node E2E: Change the disk eviction test to pull images again after the test.

Fixes https://github.com/kubernetes/kubernetes/issues/32022#issuecomment-248677706.

This PR changes the disk eviction test to pull test images again in `AfterEach`, because images may be evicted during the test.

@yujuhong 
/cc @kubernetes/sig-node
2016-09-22 10:20:42 -07:00
Dawn Chen
3a5ce7f3cd Add resource tracking with 0 pods and 35 pods to node performance benchmark. 2016-09-22 09:22:56 -07:00
Dawn Chen
33343dc4e2 Node performance benchmark test using the latest containervm image. 2016-09-22 09:22:56 -07:00
Kubernetes Submit Queue
db07433782 Merge pull request #33063 from pmorie/node-e2e
Automatic merge from submit-queue

Make node E2E tests more transparent

Add some logging and minor code reorg to make the node E2E tests a little more transparent and understandable.
2016-09-22 08:22:11 -07:00
Kubernetes Submit Queue
03c698ce44 Merge pull request #33194 from dchen1107/master
Automatic merge from submit-queue

Update the containervm image to the latest one (container-v1-3-v20160…

Node e2e is running with an old containervm image that only has docker 1.9.1. This PR fixes that issue.
2016-09-21 20:40:02 -07:00
Random-Liu
fcfe4264fe Change the disk eviction test to pull images again after the test. 2016-09-21 15:54:03 -07:00
Dawn Chen
f1f16fe03a Update the containervm image to the latest one (container-v1-3-v20160604). 2016-09-21 10:24:22 -07:00
Klaus Ma
10e880684f Removed unused var. 2016-09-21 23:28:15 +08:00
Paul Morie
3539993ee0 Make node E2E tests more transparent 2016-09-20 21:55:41 -04:00
Kubernetes Submit Queue
0986a01f4f Merge pull request #33131 from Random-Liu/fix-node-e2e-for-cri
Automatic merge from submit-queue

Fix the properties file for node e2e cri validation.

I fixed this locally before, but accidentally missed in the PR. Sorry about that.

This time I've tried it myself, and it should work.

@yujuhong
2016-09-20 17:09:30 -07:00
Kubernetes Submit Queue
6fd94968e1 Merge pull request #32738 from Amey-D/gci-version-v1.4
Automatic merge from submit-queue

Bump up GCI version.

```release-note
   Upgrading Container-VM base image for k8s on GCE. Brief changelog as follows:
    - Fixed performance regression in veth device driver
    - Docker and related binaries are statically linked
    - Fixed the issue of systemd being oom-killable
```

Fixes #32596

This needs a cherrypick into the v1.4 release branch because it fixes v1.4 release-blocking issues. This patch is easy and safe to roll back in case of emergencies.

@vishh can you please review?

Fixes #32596 and many other issues.
cc/ @kubernetes/goog-image  FYI
2016-09-20 16:30:01 -07:00
Random-Liu
87d62d50ee Fix the properties file for node e2e cri validation. 2016-09-20 15:04:55 -07:00
Amey Deshpande
5da8486758 Bump up GCI version.
Brief changelog compared to gci-dev-54-8743-3-0:
- Fixed performance regression in veth device driver
- Docker and related binaries are statically linked
- Fixed the issue of systemd being oom-killable
- Updated built-in kubelet version to 1.3.7
- add ethtool and ebtables binaries expected by kubelet

Fixes #32596
2016-09-20 13:59:31 -07:00
Random-Liu
ae031634e4 Add CRI Validation test. The test runs non-flaky, non-serial tests against
Kubernetes HEAD and docker v1.11.2 with CRI enabled.
2016-09-20 12:18:07 -07:00
Kubernetes Submit Queue
c21fdc71a3 Merge pull request #32986 from Random-Liu/add-image-white-list
Automatic merge from submit-queue

Node E2E: Add image white list

This is part of #29081. Fixes #29155.

As discussed with @yujuhong in #29155, it is difficult to maintain the prepull image list if it is not enforced.

This PR adds an image white list to the test framework; only images in the white list can be used in the test. If an image is not in the white list, the test fails with the reason:
```
Image "XXX" is not in the white list, consider adding it to CommonImageWhiteList in test/e2e/common/util.go or NodeImageWhiteList in test/e2e_node/image_list.go
```

Note that if the image pull policy is `PullAlways`, the image does not need to be in the white list or prepulled, because the test expects the image to be pulled during the test.

Currently, the image white list is only enabled in node e2e, because the image puller in e2e test is not integrated with the image white list yet.

/cc @kubernetes/sig-node
2016-09-20 07:28:58 -07:00
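
The white-list rule described in PR #32986 above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `PullPolicy` type, its constants, the `checkImage` helper, and the image names are all hypothetical stand-ins.

```go
package main

import "fmt"

// PullPolicy is a hypothetical stand-in for a container image pull policy.
type PullPolicy string

const (
	PullAlways       PullPolicy = "Always"
	PullIfNotPresent PullPolicy = "IfNotPresent"
)

// checkImage returns an error when the image is missing from the white list.
// Images used with PullAlways are exempt, because the test is expected to
// pull them while it runs.
func checkImage(whiteList map[string]bool, image string, policy PullPolicy) error {
	if policy == PullAlways {
		return nil
	}
	if !whiteList[image] {
		return fmt.Errorf("image %q is not in the white list", image)
	}
	return nil
}

func main() {
	// Hypothetical image names, used only for illustration.
	whiteList := map[string]bool{"example.com/busybox:1.0": true}

	fmt.Println(checkImage(whiteList, "example.com/busybox:1.0", PullIfNotPresent))
	fmt.Println(checkImage(whiteList, "example.com/other:1.0", PullIfNotPresent))
	fmt.Println(checkImage(whiteList, "example.com/other:1.0", PullAlways))
}
```

Keeping the check in one helper means a test fails early with a clear reason instead of timing out on an image that was never prepulled.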
Random-Liu
08d74f33f6 Add client version. 2016-09-19 21:27:00 -07:00
Random-Liu
ed411c9042 Add image white list, images in white list will be prepulled, and
only images in white list could be used in the test. Currently only
enabled in node e2e test.
2016-09-19 14:39:23 -07:00
Random-Liu
dfcbdae178 Add image pull retry in image pulling test. 2016-09-19 14:18:37 -07:00
Kubernetes Submit Queue
3aa72fa480 Merge pull request #32926 from kubernetes/revert-32841-revert-32251-fix-oom-policy
Automatic merge from submit-queue

[kubelet] Fix oom-score-adj policy in kubelet

Fixes #32238 

We have been having this regression since v1.3. It is critical for GKE/GCE deployments of k8s because docker daemon has a high likelihood of being OOM killed which will end up nuking all containers. 
The reason for moving from mnt to pid is that docker daemon moves itself into a new mnt namespace with systemd based deployments.
2016-09-17 13:00:20 -07:00
Paul Morie
88acffcda1 Fix error message around gcloud calls in node e2e and gubernator 2016-09-17 01:05:20 -04:00
Vish Kannan
a1fe3adbc7 Revert "Revert "[kubelet] Fix oom-score-adj policy in kubelet"" 2016-09-16 16:32:58 -07:00
Kubernetes Submit Queue
d69cdce704 Merge pull request #32820 from coufon/change_collector_log
Automatic merge from submit-queue

change the error log for empty resource usage

This PR makes the error log for a container's empty resource usage buffer clearer. The buffer is empty when the container name is wrong or when cAdvisor does not respond.
2016-09-15 23:54:34 -07:00
Vish Kannan
492ca3bc9c Revert "[kubelet] Fix oom-score-adj policy in kubelet" 2016-09-15 19:28:59 -07:00
Kubernetes Submit Queue
fcc97f37ee Merge pull request #32718 from mikedanese/mv-informer
Automatic merge from submit-queue

move informer and controller to pkg/client/cache

@kubernetes/sig-api-machinery
2016-09-15 16:44:30 -07:00
Zhou Fang
3e16eb5082 change the error log for empty resource usage 2016-09-15 14:13:25 -07:00
Mike Danese
a765d59932 move informer and controller to pkg/client/cache
Signed-off-by: Mike Danese <mikedanese@google.com>
2016-09-15 12:50:08 -07:00
Vishnu kannan
e4acad7afb Fix oom-score-adj policy in kubelet.
Docker daemon and kubelet need to be protected by setting oom-score-adj to -999.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-09-14 11:56:10 -07:00
Kubernetes Submit Queue
312acd9e30 Merge pull request #32342 from coufon/get_image_machine_info_from_apiserver
Automatic merge from submit-queue

Get image and machine info from apiserver in node e2e test

This PR changes the node e2e test to get image and machine information from the API server instead of passing it from the Jenkins test framework. The original way to pass image and machine info was to name the test node "machine-image-uuid", which is hard to parse because "-" occurs frequently in both machine and image names.

Now we add two labels "image" and "machine" into performance data. The machine type has the format "cpu:1core,memory:3.6GB".

This PR is based on #32250.
2016-09-14 03:34:45 -07:00
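
As a rough illustration of the machine label format quoted in PR #32342 above ("cpu:1core,memory:3.6GB"), the sketch below builds such a string from a core count and a memory size. The `machineLabel` helper, the use of 1024^3 bytes per GB, and the one-decimal rounding are assumptions made for this example; the PR text does not specify them.

```go
package main

import "fmt"

// machineLabel formats machine information in the "cpu:Ncore,memory:X.YGB"
// style. Dividing by 1024^3 and rounding to one decimal place are assumptions.
func machineLabel(cores int, memoryBytes int64) string {
	gb := float64(memoryBytes) / (1024 * 1024 * 1024)
	return fmt.Sprintf("cpu:%dcore,memory:%.1fGB", cores, gb)
}

func main() {
	// 3865470566 bytes is about 3.6 GB under the assumption above.
	fmt.Println(machineLabel(1, 3865470566)) // prints "cpu:1core,memory:3.6GB"
}
```

Because the label separates fields with "," and ":" rather than "-", it avoids the parsing ambiguity of the "machine-image-uuid" node name.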
Zhou Fang
b47e22d013 enable DynamicKubeletConfig in benchmark test properties 2016-09-13 13:59:19 -07:00
Zhou Fang
a683eb0418 get image and machine info from api server instead of passing from test
2016-09-13 08:41:29 -07:00
Kubernetes Submit Queue
0ca6506850 Merge pull request #32250 from coufon/increase_qps
Automatic merge from submit-queue

Add node e2e density test using 60 QPS for benchmark

This PR adds a new benchmark node e2e density test which raises the Kubelet API QPS limit from the default of 5 to 60 through a ConfigMap.

The latency caused by the API QPS limit is as large as ~30% when creating a large batch of pods (e.g. 105). As a result, the measured pod startup latency and creation throughput understate what the Kubelet can actually do. This test helps us learn the real performance of the Kubelet core.
2016-09-12 20:27:11 -07:00
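
To give a feel for why the QPS limit in PR #32250 above matters, the sketch below computes the minimum time a client needs to issue a batch of requests under a fixed rate limit. It assumes, purely for illustration, one API request per pod and no burst allowance; the real number of requests per pod is not stated in the PR, and `minIssueTime` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// minIssueTime returns the shortest time in which a client limited to qps
// requests per second can issue n requests, ignoring any burst allowance.
func minIssueTime(n int, qps float64) time.Duration {
	return time.Duration(float64(n) / qps * float64(time.Second))
}

func main() {
	// 105 pods, assuming one request per pod (an illustrative simplification).
	fmt.Println(minIssueTime(105, 5))  // prints "21s"
	fmt.Println(minIssueTime(105, 60)) // prints "1.75s"
}
```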
Michael Taufen
28db03869b Fix memory eviction test parameters. Those parameters should NOT have come through in b9f0bd95 2016-09-12 12:01:53 -07:00
Zhou Fang
a6500cc74a change benchmark configuration file to add QPS60 tests 2016-09-12 11:46:38 -07:00
Kubernetes Submit Queue
af325ee7bf Merge pull request #31797 from aveshagarwal/master-dapi-volume-tests-image-update
Automatic merge from submit-queue

Update container image version for downward api volume tests

Some tests were using 0.7, and some were using 0.6, so updating all to 0.7.
@kubernetes/rh-cluster-infra
2016-09-12 01:22:27 -07:00
Kubernetes Submit Queue
469698a803 Merge pull request #32169 from ixdy/node-e2e-flake
Automatic merge from submit-queue

Make error more useful when failing to list node e2e images

To help investigate https://github.com/kubernetes/kubernetes/issues/31694 if it happens again.
2016-09-11 05:07:00 -07:00
Kubernetes Submit Queue
51d996e5d7 Merge pull request #32003 from Random-Liu/change-docker-validation-config-file
Automatic merge from submit-queue

Automated Docker Validation: Change wrong name in perf config.

The config key `containervm-density*` is incorrect, so this PR removes it.

/cc @coufon
2016-09-10 17:58:23 -07:00
Kubernetes Submit Queue
09efe0457d Merge pull request #32163 from mtaufen/more-eviction-logging
Automatic merge from submit-queue

Log pressure condition, memory usage, events in memory eviction test

I want to log this to help us debug some of the latest memory eviction test flakes, where we are seeing burstable "fail" before the besteffort. I saw (in the logs) attempts by the eviction manager to evict besteffort a while before burstable phase changed to "Failed", but the besteffort's phase appeared to remain "Running". I want to see the pressure condition interleaved with the pod phases to get a sense of the eviction manager's knowledge vs. pod phase.
2016-09-09 18:37:55 -07:00
Michael Taufen
b9f0bd959e Log the following items in memory eviction test:
- memory working set
- pressure condition
- events for the default and test namespaces, after the test completes
2016-09-09 13:42:26 -07:00
Kubernetes Submit Queue
e317af87cc Merge pull request #31819 from mtaufen/plumb-feature-gates
Automatic merge from submit-queue

Plumb --feature-gates from TEST_ARGS to components in node e2e tests

This means you can set `TEST_ARGS` on the command line, in a `.properties` config for a Jenkins job, etc, to toggle gated features. For example:

`TEST_ARGS='--feature-gates=DynamicKubeletConfig=true'`

/cc @vishh @jlowdermilk
2016-09-09 12:31:00 -07:00
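
The `TEST_ARGS` example in PR #31819 above passes a `--feature-gates` flag whose value is a `Name=bool` pair. The sketch below shows one way such a string could be parsed into a map. It is not the plumbing from the PR: `parseFeatureGates` is a hypothetical helper, and support for a comma-separated list of gates is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFeatureGates scans a TEST_ARGS-style string for --feature-gates=...
// and returns the gates as a map. Comma-separated gates are assumed.
func parseFeatureGates(testArgs string) (map[string]bool, error) {
	const prefix = "--feature-gates="
	gates := map[string]bool{}
	for _, arg := range strings.Fields(testArgs) {
		if !strings.HasPrefix(arg, prefix) {
			continue // not the flag we are looking for
		}
		for _, kv := range strings.Split(strings.TrimPrefix(arg, prefix), ",") {
			parts := strings.SplitN(kv, "=", 2)
			if len(parts) != 2 {
				return nil, fmt.Errorf("malformed feature gate %q", kv)
			}
			enabled, err := strconv.ParseBool(parts[1])
			if err != nil {
				return nil, fmt.Errorf("invalid value for feature gate %q: %v", parts[0], err)
			}
			gates[parts[0]] = enabled
		}
	}
	return gates, nil
}

func main() {
	gates, err := parseFeatureGates("--feature-gates=DynamicKubeletConfig=true")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(gates["DynamicKubeletConfig"]) // prints "true"
}
```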