Automatic merge from submit-queue (batch tested with PRs 42854, 43105, 43090)
Add a timeout to allow replacement pod to become ready
Hopefully fixes https://github.com/kubernetes/kubernetes/issues/37259
```
I0314 04:26:02.562] Mar 14 04:26:02.562: INFO: Pod my-hostname-net-1bgrj still exists
I0314 04:26:22.491] Mar 14 04:26:22.491: INFO: Waiting for pod my-hostname-net-1bgrj to disappear
I0314 04:26:22.496] Mar 14 04:26:22.495: INFO: Pod my-hostname-net-1bgrj no longer exists
I0314 04:26:22.496] STEP: verifying whether the pod from the unreachable node is recreated
I0314 04:26:22.498] Mar 14 04:26:22.498: INFO: Pod name my-hostname-net: Found 3 pods out of 3
I0314 04:26:22.499] STEP: ensuring each pod is running
I0314 04:26:22.499] STEP: trying to dial each unique pod
I0314 04:26:22.579] Mar 14 04:26:22.579: INFO: Controller my-hostname-net: Got expected result from replica 1 [my-hostname-net-5jrdb]: "my-hostname-net-5jrdb", 1 of 3 required successes so far
I0314 04:26:22.642] Mar 14 04:26:22.642: INFO: Controller my-hostname-net: Got expected result from replica 2 [my-hostname-net-mjf3c]: "my-hostname-net-mjf3c", 2 of 3 required successes so far
I0314 04:31:22.645] Mar 14 04:31:22.644: INFO: Controller my-hostname-net: Failed to Get from replica 3 [my-hostname-net-rf46s]: Get https://35.184.87.178/api/v1/namespaces/e2e-tests-network-partition-s5gqt/pods/my-hostname-net-rf46s/proxy/: context deadline exceeded
```
The issue appears to be that we have a race between the pod being "running + ready" and being accessible via the API server proxy.
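If it helps, here is a minimal sketch of the intended wait (names, durations, and the context-free client-go call style are illustrative, not the actual change):
```go
import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
)

// waitForPodProxyReachable retries the proxied Get until the replacement pod
// answers, instead of failing on the first "context deadline exceeded".
func waitForPodProxyReachable(c clientset.Interface, ns, podName string) error {
	return wait.PollImmediate(5*time.Second, 2*time.Minute, func() (bool, error) {
		_, err := c.CoreV1().RESTClient().Get().
			Namespace(ns).
			Resource("pods").
			SubResource("proxy").
			Name(podName).
			DoRaw()
		// Keep retrying while the pod is ready but not yet reachable via proxy.
		return err == nil, nil
	})
}
```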
cc @kow3ns @bowei @davidopp
Automatic merge from submit-queue (batch tested with PRs 42854, 43105, 43090)
Move e2e sched event predicates to new file.
**What this PR does / why we need it**:
Small e2e test refactor for scheduler. Moves scheduler event predicates out of opaque_resource.go for reuse elsewhere.
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-scheduling-pr-reviews @timothysc @bsalamat
Automatic merge from submit-queue (batch tested with PRs 43018, 42713)
Log instead of fail on the GLBC's tendency to leak resources
**What this PR does / why we need it**:
Stops upgrade tests from flaking because the GLBC does not clean up all resources due to a race condition.
**Which issue this PR fixes**: fixes #38569
**Special notes for your reviewer**:
To be reviewed by @mml
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)
Add e2e test for Deployment controllerRef orphaning and adoption
Follow up #42908
@enisoc @kubernetes/sig-apps-bugs @kargakis
Automatic merge from submit-queue (batch tested with PRs 42775, 42991, 42968, 43029)
Initial breakout of scheduling e2es to help assist in assignment and refactoring
**What this PR does / why we need it**:
This PR segregates the scheduling-specific e2es to isolate the library, which will assist both refactoring and auto-assignment of issues.
**Which issue this PR fixes**
xref: https://github.com/kubernetes/kubernetes/issues/42691#issuecomment-285563265
**Special notes for your reviewer**:
All this change does is shuffle code around and quarantine it; behavioral and other cleanup changes will come in follow-on PRs. As of today, the e2es are a monolith with massive symbol pollution, and this first step allows us to segregate the e2es and tease apart the dependency mess.
**Release note**:
```release-note
NONE
```
/cc @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-testing-pr-reviews @marun @skriss
/cc @gmarek - same trick for load + density, etc.
Automatic merge from submit-queue (batch tested with PRs 43034, 43066)
Allow StatefulSet controller to PATCH Pods.
**What this PR does / why we need it**:
StatefulSet now needs the PATCH permission on Pods since it calls into ControllerRefManager to adopt and release. This adds the permission and the missing e2e test that should have caught this.
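For illustration, the added permission has roughly this shape (a sketch using the rbac/v1 types; the real change lives in the controller's bootstrap policy role):
```go
import rbacv1 "k8s.io/api/rbac/v1"

// Sketch of the StatefulSet controller's pod rule with "patch" added so
// ControllerRefManager can adopt and release pods; verbs are illustrative.
var statefulSetPodRule = rbacv1.PolicyRule{
	APIGroups: []string{""},
	Resources: []string{"pods"},
	Verbs:     []string{"get", "list", "watch", "patch"},
}
```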
**Which issue this PR fixes**:
**Special notes for your reviewer**:
This is based on #42925.
**Release note**:
```release-note
```
cc @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)
[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs
Fixes #42412
**Background**
Support for multiple GPUs is an experimental feature in v1.6.
Container restarts were handled incorrectly, which resulted in stranded GPUs.
Kubelet was incorrectly using the runtime cache to track running pods, which can result in race conditions (as it did in other parts of kubelet). This can lead to the same GPU being assigned to multiple pods.
**What does this PR do**
This PR tracks the assignment of GPUs to containers and, on container restart, returns the pre-allocated GPUs instead of (incorrectly) allocating new ones.
GPU manager is updated to consume a list of active pods derived from apiserver cache instead of runtime cache.
Node e2e has been extended to validate this failure scenario.
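A rough sketch of the bookkeeping this implies (types and names are hypothetical simplifications, not the kubelet code); the list of active pods used to garbage-collect entries would come from the apiserver cache rather than the runtime cache:
```go
import (
	"fmt"
	"sync"
)

// gpuTracker remembers which devices a pod's container already holds so a
// restart returns the same devices instead of stranding them and
// allocating new ones.
type gpuTracker struct {
	mu       sync.Mutex
	free     []string                       // unassigned device IDs
	assigned map[string]map[string][]string // podUID -> container name -> device IDs
}

// Allocate returns the previously assigned GPUs for a restarted container,
// and only dips into the free pool for a genuinely new container.
func (t *gpuTracker) Allocate(podUID, container string, n int) ([]string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if devs, ok := t.assigned[podUID][container]; ok {
		return devs, nil // container restart: reuse, don't re-allocate
	}
	if len(t.free) < n {
		return nil, fmt.Errorf("requested %d GPUs, only %d available", n, len(t.free))
	}
	devs := t.free[:n]
	t.free = t.free[n:]
	if t.assigned == nil {
		t.assigned = map[string]map[string][]string{}
	}
	if t.assigned[podUID] == nil {
		t.assigned[podUID] = map[string][]string{}
	}
	t.assigned[podUID][container] = devs
	return devs, nil
}
```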
**Risk**
Minimal/None since support for GPUs is an experimental feature that is turned off by default. The code is also isolated to GPU manager in kubelet.
**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods.
There is no workaround for the race condition caused by using the runtime cache though.
Hence it is worth including this fix in v1.6.0.
cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews
Replaces #42560
Automatic merge from submit-queue
Allow DaemonSet controller to PATCH pods, and add more steps and logs in DaemonSet pods adoption e2e test
DaemonSet pod adoption failed because the DaemonSet controller isn't allowed to patch pods when claiming them.
[Edit] This PR fixes #42908 by modifying RBAC to allow the DaemonSet controller to patch pods, and adds more logs and steps to the original e2e test to make debugging easier.
Tested locally with a local cluster and GCE cluster.
@kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 42940, 42906, 42970, 42848)
Move node and event observer helpers to e2e/common
**What this PR does / why we need it**:
Moves existing test helper functions in OIR e2e tests to `test/e2e/common`. These functions wrap informers to help test writers observe events instead of long-polling for status updates.
For usage examples, see `test/e2e/opaque_resource.go`.
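For a flavor of the pattern, a hypothetical helper in the same spirit (not the actual API of the moved functions):
```go
import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// observeEventAfterAction starts an informer on events, runs the action,
// then waits for an event matching the predicate instead of long-polling
// object status.
func observeEventAfterAction(c clientset.Interface, ns string, pred func(*v1.Event) bool, action func() error) (bool, error) {
	observed := make(chan struct{}, 1)
	lw := cache.NewListWatchFromClient(c.CoreV1().RESTClient(), "events", ns, fields.Everything())
	_, ctrl := cache.NewInformer(lw, &v1.Event{}, 0, cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if e, ok := obj.(*v1.Event); ok && pred(e) {
				select {
				case observed <- struct{}{}:
				default:
				}
			}
		},
	})
	stop := make(chan struct{})
	defer close(stop)
	go ctrl.Run(stop)

	if err := action(); err != nil {
		return false, err
	}
	select {
	case <-observed:
		return true, nil
	case <-time.After(2 * time.Minute):
		return false, nil
	}
}
```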
cc @kubernetes/sig-scheduling-misc
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add fabianofranz as approver for test/e2e/kubectl.go
Adding myself as approver for `kubectl` end-to-end tests.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
Fixes kubectl skew test failure when using kubectl.sh
Fixes leftovers from https://github.com/kubernetes/kubernetes/pull/42737.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
Fix DefaultTolerationSeconds admission plugin
DefaultTolerationSeconds is not working as expected. It is supposed to add default tolerations (for the unreachable and notReady conditions), but no pod was getting these tolerations, and the API server was throwing this error:
```
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.769212 32070 admission.go:71] expected pod but got Pod
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.789055 32070 admission.go:71] expected pod but got Pod
Mar 08 13:44:02 fedora25 hyperkube[32070]: E0308 13:44:02.006784 32070 admission.go:71] expected pod but got Pod
Mar 08 13:45:39 fedora25 hyperkube[32070]: E0308 13:45:39.754669 32070 admission.go:71] expected pod but got Pod
Mar 08 14:48:16 fedora25 hyperkube[32070]: E0308 14:48:16.673181 32070 admission.go:71] expected pod but got Pod
```
The reason for this error is that the input to admission plugins is internal API objects, not versioned objects, so expecting a versioned object is incorrect. Because of this, no pod got the desired tolerations, and pods always showed:
```
Tolerations: <none>
```
After this fix, the correct tolerations are being assigned to pods as follows:
```
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
```
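Sketched, the shape of the fix (import paths approximate for this era of the tree; the toleration-appending logic is elided):
```go
import (
	"fmt"

	"k8s.io/apiserver/pkg/admission"
	api "k8s.io/kubernetes/pkg/api"
)

// defaultTolerationSeconds stands in for the real plugin type.
type defaultTolerationSeconds struct{}

// Admit must assert the *internal* pod type: the object handed to admission
// plugins is api.Pod, not the versioned v1.Pod, which is why the assertion
// failed with the confusing "expected pod but got Pod" message above.
func (p *defaultTolerationSeconds) Admit(attributes admission.Attributes) error {
	pod, ok := attributes.GetObject().(*api.Pod)
	if !ok {
		return fmt.Errorf("expected *api.Pod but got %T", attributes.GetObject())
	}
	// ... append the default notReady/unreachable tolerations to pod.Spec ...
	_ = pod
	return nil
}
```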
@davidopp @kevin-wangzefeng @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scheduling-bugs @derekwaynecarr
Fixes https://github.com/kubernetes/kubernetes/issues/42716
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
AppArmor cluster upgrade test
Add a cluster upgrade test for AppArmor. I still need to test this (having some trouble with the cluster-upgrade tests), but wanted to start the review process.
/cc @dchen1107 @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)
[Federation][e2e] Add framework for upgrade test in federation
Adding a framework for federation upgrade tests. Please refer to #41791.
cc @madhusudancs @nikhiljindal @kubernetes/sig-federation-pr-reviews
The Deployment controller was not propagating ReadyReplicas to underlying clusters, causing these errors:
```
Error syncing cluster controller: Deployment.apps "federation-deployment" is invalid: status.availableReplicas: Invalid value: 5: cannot be greater than readyReplicas
```
This was caught in e2e testing and is a 1.6 regression for support that was added in #37959. Without this fix, users will be unable to scale up their deployments.
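Illustratively, the status propagation must carry ReadyReplicas along with the rest (a sketch; the import path shown is the modern one):
```go
import extensionsv1beta1 "k8s.io/api/extensions/v1beta1"

// propagatedStatus builds the status written to an underlying cluster.
// Sending AvailableReplicas without ReadyReplicas trips the "cannot be
// greater than readyReplicas" validation shown above.
func propagatedStatus(d *extensionsv1beta1.Deployment) extensionsv1beta1.DeploymentStatus {
	return extensionsv1beta1.DeploymentStatus{
		Replicas:          d.Status.Replicas,
		UpdatedReplicas:   d.Status.UpdatedReplicas,
		AvailableReplicas: d.Status.AvailableReplicas,
		ReadyReplicas:     d.Status.ReadyReplicas, // previously not propagated
	}
}
```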
Automatic merge from submit-queue (batch tested with PRs 42608, 42444)
Return nil when deleting a non-existent GCE PD
When the GCE cloud provider tries to delete a disk and the disk cannot be found in any of the zones, the function should return a nil error. This modified behavior is also consistent with AWS.
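A hypothetical sketch of the behavior (the receiver and helper names here are invented, not the real cloud-provider code):
```go
// DeleteDisk treats "not found" as a successful, idempotent delete.
func (gce *GCECloud) DeleteDisk(diskName string) error {
	err := gce.deleteDiskFromAnyZone(diskName)
	if isNotFoundError(err) {
		// The disk is already gone (or never existed); report success,
		// matching what the AWS provider does.
		return nil
	}
	return err
}
```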
Automatic merge from submit-queue (batch tested with PRs 36704, 42719)
Extend timeouts in taints test to account for slow Pod deletions
Fixes #42685
Before merging this we need a consensus on what to do with slow Pod deletions.
Automatic merge from submit-queue
e2e test: Log container output on TestContainerOutput error
When a pod started with TestContainerOutput or TestContainerOutputRegexp
fails for an unknown reason, we should log all output of all its containers
so we can analyze what went wrong.
This would help us see what went wrong in https://github.com/kubernetes/kubernetes/issues/40811, where a container runs for 3 minutes and then dies, and we want to see what it did during those 3 minutes.
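Roughly, the idea (a sketch; `getPodLogs` is an assumed helper wrapping the pods/log subresource):
```go
import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	clientset "k8s.io/client-go/kubernetes"
)

// dumpAllContainerLogs dumps every container's logs on an unexpected
// failure so flakes like #40811 leave something to analyze.
func dumpAllContainerLogs(c clientset.Interface, pod *v1.Pod) {
	for _, ctr := range pod.Spec.Containers {
		logs, err := getPodLogs(c, pod.Namespace, pod.Name, ctr.Name)
		if err != nil {
			fmt.Printf("error getting logs for container %q: %v\n", ctr.Name, err)
			continue
		}
		fmt.Printf("output of container %q:\n%s\n", ctr.Name, logs)
	}
}
```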
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)
Implement automated downgrade testing.
Node version cannot be higher than the master version, so we must
switch the node version first. Also, we must use the upgrade script
from the appropriate version for GCE.
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)
Create DefaultPodDeletionTimeout for e2e tests
In our e2e and e2e_node tests, we had a number of different timeouts for deletion.
Recent changes to the way deletion works (#41644, #41456) have resulted in some timeouts in e2e tests. #42661 was the most recent fix for this.
Most of these tests are not meant to test pod deletion latency, but rather just to clean up pods after a test is finished.
For this reason, we should change all these tests to use a standard, fairly high timeout for deletion.
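As a sketch of what "standard, fairly high" might look like (the name matches the PR title; the value here is illustrative):
```go
import "time"

// DefaultPodDeletionTimeout is one shared, generous timeout for post-test
// pod cleanup. Tests that aren't measuring deletion latency should use
// this instead of ad-hoc values.
const DefaultPodDeletionTimeout = 3 * time.Minute
```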
cc @vishh @Random-Liu
Automatic merge from submit-queue
Don't wait for the final deletion of pod
The final deletion of the pod depends on the kubelet and other components operating correctly. The purpose of this e2e test is to verify that the clientset can handle DeleteOptions correctly, so waiting for the deletionTimestamp and deletionGracePeriodSeconds to be set is good enough.
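Illustratively, the assertion the test needs is just this (client-go calls in the context-free style of the time):
```go
import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clientset "k8s.io/client-go/kubernetes"
)

// gracefulDeleteRecorded reports whether the API server has recorded the
// graceful deletion. Once these fields are set, the clientset handled the
// DeleteOptions correctly; final removal depends on the kubelet and is out
// of scope for this test.
func gracefulDeleteRecorded(c clientset.Interface, ns, name string) (bool, error) {
	pod, err := c.CoreV1().Pods(ns).Get(name, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	return pod.DeletionTimestamp != nil && pod.DeletionGracePeriodSeconds != nil, nil
}
```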
In the long run, we should move this set of e2e tests to integration tests.
Fixes #42724 #42646
cc @marun
Automatic merge from submit-queue
add debugging to the client watch test
Adds debugging information for https://github.com/kubernetes/kubernetes/issues/42724. I suspect that the watch is closing early, but I'd like proof before I consider things like retrying the list and doing another watch to observe the delete. I'm not even sure that would satisfy the test.
It seems like a flaky way to build the test. Why wouldn't we delete non-gracefully?
@kubernetes/sig-api-machinery-misc @caesarxuchao
@wojtek-t saw you just hit this if you wanted to take a quick look at the debugging I added.
Automatic merge from submit-queue (batch tested with PRs 42728, 42278)
[Federation] Create integration test fixture for api
This PR factors a reusable fixture for the federation API server out of the existing integration test.
Targets #40705
cc: @kubernetes/sig-federation-pr-reviews