Automatic merge from submit-queue
e2e flake: real fix of PodAntiAffinity test
The fix in PR https://github.com/kubernetes/kubernetes/pull/30135 was wrong: it applied an incorrect test condition to an already broken test.
### Summary
The test tries to launch a pod with an anti-affinity annotation, waits 10
seconds and then checks that it is still pending.
But the anti-affinity annotation does not prevent that pod from being launched on
another node that does not have the zone label at all.
This commit changes this behavior by labeling two nodes with the zone label
and then forcing the pod to be launched on one of those two nodes.
**I assume here that a non-existing label is considered as a different label value.**
Fixes #30078
Automatic merge from submit-queue
SchedulerExtender: add failedPredicateMap in Filter() returns
Fix #25797. Modify extender.Filter to add extender information to “failedPredicateMap” in findNodesThatFit.
When every node that passed "predicateFuncs" is then rejected by the extender filter, the failedPredicateMap contains no information from the extenders, so I think it should be added. That way, when the length of “filteredNodes.Items” is 0 (possibly because the extender filter failed), we get the complete failure information.
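A minimal sketch of the idea, not the scheduler's actual code: the map type is simplified to strings, and `filterWithExtender` and the `ExtenderFilterFailed` reason are hypothetical names used only for illustration.

```go
package main

import "fmt"

// FailedPredicateMap maps a node name to the reason it was filtered out.
// This is a simplified stand-in for the scheduler's real type.
type FailedPredicateMap map[string]string

// filterWithExtender keeps the nodes the extender accepts and records every
// rejection in failed, so the caller can explain an empty result.
func filterWithExtender(nodes []string, accept func(node string) bool, failed FailedPredicateMap) []string {
	var kept []string
	for _, n := range nodes {
		if accept(n) {
			kept = append(kept, n)
			continue
		}
		failed[n] = "ExtenderFilterFailed" // hypothetical reason string
	}
	return kept
}

func main() {
	failed := FailedPredicateMap{}
	rejectAll := func(string) bool { return false }
	kept := filterWithExtender([]string{"node-a", "node-b"}, rejectAll, failed)
	fmt.Println("nodes kept:", len(kept), "failures recorded:", len(failed))
}
```

With the rejections recorded, an empty node list can be traced back to the extender rather than looking like an unexplained scheduling failure.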
Automatic merge from submit-queue
Use report-dir in test framework instead.
We already have `report-dir` option in framework test context.
The node e2e framework should use it as well.
/cc @ronnielai
Automatic merge from submit-queue
node_e2e: Use upstream CoreOS image directly
.. and update it to the latest alpha
This will make updating the CoreOS image in the future much simpler since it won't involve project-copying, manual-baking, or so on.
cc @pwittrock @vishh @bboreham @yifan-gu
Automatic merge from submit-queue
E2E & Node E2E: Move configmap, docker_containers, downward_api, expansion and secrets test into common directory.
This is the 3rd part of #29494.
For #29081.
Based on #29092, #29806.
The first commit is squash of all dependent commits. Please only review the second commit.
The second PR added 17 lines.
@vishh @timstclair
Automatic merge from submit-queue
For e2e_node tests tell etcd to listen on ports 2379 and 4001
This is the default for etcd2, but etcd3 only listens on 2379.
Specifying the ports keeps things consistent no matter which version the user has installed.
Fixes #29117
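As a rough illustration, the explicit port list could be turned into etcd arguments as below. `--listen-client-urls` and `--advertise-client-urls` are existing etcd flags; the helper names and the `127.0.0.1` address are assumptions, not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// etcdClientURLs builds the comma-separated client URL list for the given
// host and ports. The helper name and host are illustrative assumptions.
func etcdClientURLs(host string, ports ...int) string {
	urls := make([]string, 0, len(ports))
	for _, p := range ports {
		urls = append(urls, fmt.Sprintf("http://%s:%d", host, p))
	}
	return strings.Join(urls, ",")
}

// etcdArgs pins both client ports explicitly, so etcd2 and etcd3 listen on
// the same set of ports regardless of their built-in defaults.
func etcdArgs() []string {
	urls := etcdClientURLs("127.0.0.1", 2379, 4001)
	return []string{
		"--listen-client-urls", urls,
		"--advertise-client-urls", urls,
	}
}

func main() {
	fmt.Println(strings.Join(etcdArgs(), " "))
}
```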
Automatic merge from submit-queue
Bug fix: Use p.Name instead of pod.Name
For example, if you used `pod.GenerateName`, `pod.Name` might be the empty
string while `p.Name` contains the actual name of your pod. Thus passing
`pod.Name` can result in a `resource name may not be empty` error.
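The difference can be sketched with a stripped-down model. `Pod` and `fakeCreate` here are hypothetical stand-ins for the API object and the create call, chosen only to show why the returned object is the one to use.

```go
package main

import "fmt"

// Pod is a hypothetical, stripped-down stand-in for the API object.
type Pod struct {
	Name         string
	GenerateName string
}

// fakeCreate mimics what happens on create: when Name is empty, a name is
// derived from GenerateName. The returned copy carries the final name; the
// caller's original object is left untouched.
func fakeCreate(pod *Pod, suffix string) *Pod {
	created := *pod
	if created.Name == "" {
		created.Name = created.GenerateName + suffix
	}
	return &created
}

func main() {
	pod := &Pod{GenerateName: "test-pod-"}
	p := fakeCreate(pod, "x7k2p")

	fmt.Printf("pod.Name=%q\n", pod.Name) // still empty
	fmt.Printf("p.Name=%q\n", p.Name)     // the name to use in follow-up calls
}
```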
Automatic merge from submit-queue
Node E2E: Move the node name initialization to first function of SynchronizedBeforeEach
Currently, we start e2e services in the first function of `SynchronizedBeforeEach` to make sure that we start them only once, even when the tests run on parallel test nodes.
However, e2e services require `NodeName`, but we initialize `NodeName` in the second function.
This PR moved the initialization logic into the first function, and shared the node name with all test nodes via the `SharedContext`.
Automatic merge from submit-queue
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node` built for Linux only
This PR adds `+build linux` to density_test.go, resource_usage.go and resource_collector.go on top of the previous PR, #29764.
#29764 fails to build because it depends on cgroup, which cannot be built for any OS other than Linux.
Automatic merge from submit-queue
E2E & NodeE2E: Move host_path, downwardapi_volume and empty_dir into common directory.
This is the second part of #29494.
For #29081.
Based on #29092, #29806.
The first commit is squash of all dependent commits. Please only review the second commit.
The second PR is only 20 lines of change.
@vishh @timstclair
Automatic merge from submit-queue
Node E2E: Change the node e2e junit file name to junit_{image-name}{test-node-number}.xml
Fixes https://github.com/kubernetes/kubernetes/issues/30103.
Reuse the `report-prefix` in e2e test framework. Now the junit file will be like: `junit_{image-name}{test-node-number}.xml`.
Mark P2 to fix the test result.
/cc @rmmh
Automatic merge from submit-queue
federation: Adding secret API
Adding secret API to federation-apiserver and updating the federation client to include secrets
Automatic merge from submit-queue
Added test to density that will run maximum capacity pods on nodes
Added a test to the Density Suite that will load the kubelets with their maximum capacity number of pods
Automatic merge from submit-queue
Install go-bindata in cross-build image
Another follow-up to #25584.
We need `go-bindata` to create `test/e2e/generated`, and downloading it with `go get` at build time is painful for a variety of reasons. We can just include it in the cross-build image and not worry about it, especially as it updates very infrequently.
This fixes `hack/update-generated-protobuf.sh` as well.
cc @jayunit100 @soltysh
Automatic merge from submit-queue
pv e2e refactor and pre-bind test
refactored persistentvolume e2e so that multiple It() tests can be run. Added one test case for pre-binding, but the overall structure of the test should allow additional test cases to be more easily added.
Automatic merge from submit-queue
Resolve docker-daemon cgroup issue for both systemd and non-systemd node for node e2e tests
Fixed https://github.com/kubernetes/kubernetes/issues/29827
cc/ @coufon this should unblock your pr: #29764
I validated both containervm image and coreos image, and works as expected.
This is also required for adding gci image to node e2e test infrastructure.
Automatic merge from submit-queue
Fix deployment e2e test: waitDeploymentStatus should error when entering an invalid state
Follow up #28162
1. We should check that max unavailable and max surge aren't violated at any time in e2e tests (this isn't checked in the deployment scaled rollout yet, but we should wait for it to become valid and then keep checking until it finishes)
2. Fix some minor bugs in e2e tests
@kubernetes/deployment
Automatic merge from submit-queue
Remove myself from test ownership.
These are almost certainly not correct, but probably more likely owners than myself.
@rmmh @dchen1107 @timstclair @erictune @mtaufen @caesarxuchao @fgrzadkowski @krousey @lavalamp
Automatic merge from submit-queue
Fix 29992
Fix #29992.
In #29798 I copied the RC test code into the wrong place in the RS test. I looked at the failure reports and all of the failures were in the RS test, so #29798 itself is correct.
Marked as P2 since it fixes a test flake that will block everyone.
Automatic merge from submit-queue
Revert "Revert "Drop support for --gce-service-account, require activated creds""
Reverts kubernetes/kubernetes#29242
Automatic merge from submit-queue
Limit number of pods spawned in SchedulerPredicates validates resourc…
Fixes https://github.com/kubernetes/kubernetes/issues/29190,
With this patch test should spawn at most 10 pods on the smallest node.
Automatic merge from submit-queue
integration test: Modify PVs/PVCs during binding.
Previous volume binder code was not able to cope with PVs or PVCs getting modified during the binding process. Current one should be resilient to these changes, so let's test it.
It makes the test approximately twice as long as before, from ~2 seconds to ~4-5.
@kubernetes/sig-storage
Marking as 1.3 target, however it does not really matter here, it's just a test.
Add min size of pod and max number of pods for SchedulerPredicates validate resource limits test
Fix typo in patch for SchedulerPredicates validate resource limits test
Moving max number of pods and min pod cpu request to constants
Automatic merge from submit-queue
Add density (batch pods creation latency and resource) and resource performance tests to `test-e2e-node`
This PR contains two new tests (migrated from the e2e tests):
1. Density test: verifies startup latency and resource usage when creating a batch of pods with throughput control. Throughput control is done by sleeping for an interval between firing concurrent pod-creation operations.
It tests both batch creation and sequential (back-to-back) creation and reports the throughput.
2. Verify resource usage of steady state kubelet.
The test creates a new resource controller for `test-node-e2e` (resource_controller.go), which monitors resources through a standalone cAdvisor pod (port 8090) with a 1s housekeeping interval.
Automatic merge from submit-queue
[Garbage Collector] add e2e tests again
#27151 was reverted because GKE didn't start correctly after it was merged (https://github.com/kubernetes/kubernetes/pull/27151#issuecomment-233030686).
The possible problem is the `unbound variable`, which is fixed in the second commit of this PR. However, I cannot verify if the PR will fail the gke suite since I don't have the environment to run that suite.
@wojtek-t @lavalamp
Automatic merge from submit-queue
Update test-owners with new tests, add catch-all assignment to test-infra team.
We will triage any additional failures, since they're more likely to be infra related. If they're not, they can always be reassigned (and the owners list can be updated!)
/cc @kubernetes/test-infra-maintainers
Automatic merge from submit-queue
Node E2E: Add serial jenkins job.
This PR adds a Jenkins job for serial tests. It runs all serial tests one by one.
This will be useful for https://github.com/kubernetes/kubernetes/pull/29809.
@coufon @yujuhong @dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Add support to quota pvc storage requests
Adds support to quota cumulative `PersistentVolumeClaim` storage requests in a namespace.
Per our chat today @markturansky @abhgupta - this is not done (lacks unit testing), but is functional.
This lets quota enforcement for `PersistentVolumeClaim` occur at creation time. Supporting bind-time enforcement would require substantially more work. It's possible this is sufficient for many, so I am opening it up for feedback.
In the future, I suspect we may want to treat local disk in a special manner, but that would have to be a different resource altogether (i.e. `requests.disk`) or something.
Example quota:
```
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
spec:
  hard:
    persistentvolumeclaims: "10"
    requests.storage: "40Gi"
```
/cc @kubernetes/rh-cluster-infra @deads2k
Automatic merge from submit-queue
E2E & Node E2E: Add exec util in framework
For #29081.
Based on #29092 and #29494.
The first commit is a squashed commit of all old commits.
**The last 2 commits are new.**
This PR added exec util in framework, and moved `privileged.go` and `kubelet_etc_hosts` into `common` directory.
@vishh @timstclair
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Node E2E: Make node e2e parallel
For https://github.com/kubernetes/kubernetes/issues/29081.
Fix https://github.com/kubernetes/kubernetes/issues/26215.
Based on https://github.com/kubernetes/kubernetes/pull/28807, https://github.com/kubernetes/kubernetes/pull/29020, will rebase after they are merged.
**Only the last commit is new.**
We are going to move more tests into the node e2e test suite. However, node e2e tests currently run only sequentially, so the test duration will increase quickly as we add more tests.
This PR makes the node e2e tests run in parallel to shorten the test duration, so that we can add more tests to improve the test coverage.
* If you run the test locally with `make test-e2e-node`, it will use the `-p` ginkgo flag, which uses `(cores-1)` parallel test nodes by default.
* If you run the test remotely or in Jenkins, the parallelism is controlled by the environment variable `PARALLELISM`. The default value is `8`, which is reasonable for our test node (n1-standard-1).
Before this PR, it took **833.592s** to run all tests on my desktop.
With this PR, it only takes **234.058s** to run.
The pull request node e2e run with this PR takes **232.327s**.
The pull request node e2e run for other PRs takes **673.810s**.
/cc @kubernetes/sig-node
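The `PARALLELISM` handling described above can be sketched as follows. This is only an illustration in Go; the PR may well implement it in the job scripts instead, and `parallelismFrom` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultParallelism matches the default of 8 mentioned in the description.
const defaultParallelism = 8

// parallelismFrom parses the raw environment value and falls back to the
// default when the variable is unset, not a number, or less than 1.
func parallelismFrom(value string) int {
	n, err := strconv.Atoi(value)
	if err != nil || n < 1 {
		return defaultParallelism
	}
	return n
}

func main() {
	n := parallelismFrom(os.Getenv("PARALLELISM"))
	fmt.Printf("running with %d parallel test nodes\n", n)
}
```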
Automatic merge from submit-queue
Adding GCI to node e2e.
Depends on https://github.com/kubernetes/kubernetes/pull/29486
Adding the dev release as of now since stable and beta run docker v1.9.1
which is incompatible with kubelet.
Automatic merge from submit-queue
Fix 29451
Fix #29451. I've also checked other tests in that file to make sure they don't have similar problems.
The issue is P0 and will block the submit queue, so I marked this PR as P0.
bindata and yaml, Gobindata automation
bindata utils for generating, go generate
match server version
gitignore for dirty, ca, rbase, KUBE_ROOT, buildfix
(rebased jul-25,29)
Automatic merge from submit-queue
Add API for StorageClasses
This is the API objects only required for dynamic provisioning picked apart from the controller logic.
Entire feature is here: https://github.com/kubernetes/kubernetes/pull/29006
Automatic merge from submit-queue
Remove redundant pod deletion in scheduler predicates tests and fix taints-tolerations e2e
~~In scheduler predicates test, some tests won't clean pods they created when exit with failure, which may lead to pod leak. This PR is to fix it.~~
Remove redundant pod deletion in scheduler predicates tests, since framework.AfterEach() already did the cleanup work after every test.
Also fix the test "validates that taints-tolerations is respected if not matching", refer to the change on taint-toleration test in #29003, and https://github.com/kubernetes/kubernetes/pull/24134#discussion_r63794924.
Automatic merge from submit-queue
make the resource prefix in etcd configurable for cohabitation
This looks big, but it's not as bad as it seems.
When you have different resources cohabiting, the resource name used for the etcd directory needs to be configurable. HPA in two different groups worked fine before. Now we're looking at something like RC<->RS. They normally store into two different etcd directories. This code allows them to be configured to store into the same location.
To maintain consistency across all resources, I allowed the `StorageFactory` to indicate which `ResourcePrefix` should be used inside `RESTOptions` which already contains storage information.
@lavalamp affects cohabitation.
@smarterclayton @mfojtik prereq for our rc<->rs and d<->dc story.
Automatic merge from submit-queue
Fix mount collision timeout issue
Short- or medium-term workaround for #29555. The root issue being fixed here is that the recent attach/detach work in the kubelet uses a unique volume name as a key that tracks the work that has to be done for each volume in a pod to attach/mount/umount/detach. However, the non-attachable volume plugins do not report unique names for themselves, which causes collisions when a single secret or configmap is mounted multiple times in a pod.
This is still a WIP -- I need to add a couple E2E tests that ensure that tests break in the future if there is a regression -- but posting for early review.
cc @kubernetes/sig-storage
Ultimately, I would like to refine this a bit further. A couple things I would like to change:
1. `GetUniqueVolumeName` should be a property ONLY of attachable volumes
2. I would like to see the kubelet apparatus for attach/mount/umount/detach handle non-attachable volumes specifically to avoid things like the `WaitForControllerAttach` call that has to be done for those volume types now
Automatic merge from submit-queue
Fix killing child sudo process in e2e_node tests
Fixes #29211.
The context is we are trying to kill a process started as `sudo kube-apiserver`, but `sudo` ignores signals from the same process group. Applying `Setpgid` means the `sudo kill` process won't be in the same process group, so will not fall foul of this nifty feature.
I also took the liberty of removing some code setting `Pdeathsig` because it claims to be doing something in the same area, but actually it doesn't do that at all. The setting is applied to the forked process, i.e. `sudo`, and it means the `sudo` will get killed if we (`e2e_node.test`) die. This (a) isn't what the comment says and (b) doesn't help because sending SIGKILL to the sudo process leaves sudo's child alive.
I didn't use the "hack for linux-only" approach because I think `Setpgid` is available on all platforms that `e2e_node` builds on.
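A minimal sketch of the `Setpgid` approach, assuming a Unix-like platform (the `syscall.SysProcAttr.Setpgid` field does not exist on Windows). `newOwnGroupCommand` is a hypothetical helper, and `sleep` stands in for the `sudo kube-apiserver` invocation.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newOwnGroupCommand builds a command whose child process is placed in a new
// process group, so a signal sent later does not come from the child's own
// process group.
func newOwnGroupCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	return cmd
}

func main() {
	cmd := newOwnGroupCommand("sleep", "30")
	if err := cmd.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	// With Setpgid and no explicit Pgid, the child leads its own group.
	pgid, err := syscall.Getpgid(cmd.Process.Pid)
	fmt.Println("child pid:", cmd.Process.Pid, "pgid:", pgid, "err:", err)

	_ = cmd.Process.Kill()
	_ = cmd.Wait()
}
```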
Automatic merge from submit-queue
Faster test
In attempting to troubleshoot flakes with this test case I actually wanted to understand how it worked.
There are some poor comments that need work.
I added some additional output which may or may not help in debugging the flakes.
I doubt this fixes the flake.
My major concern is the 'refactor' I did of the test case to batch up runs by sub-test-case. As it stood there was a 200ms pause between each sub, so they should not have interfered with each other. Now they are just started as fast as possible, but only 20 run at a time before moving on to the next 20. I am not sure if I am violating the ethos of the original test case.
Runs on my computer are down from 2m40s -> 40s.
Getting rid of the arbitrary client limiting brings it down to ~12 seconds. 11 to fetch the image and <1 to actually run the tests against the proxies. I can add a zero to the number of loops if you want to hit it harder. It would result in 10x as much text output though.
Automatic merge from submit-queue
Add support for kubectl create quota command
Follow-up of https://github.com/kubernetes/kubernetes/pull/19625
```
Create a resourcequota with the specified name, hard limits and optional scopes
Usage:
kubectl create quota NAME [--hard=key1=value1,key2=value2] [--scopes=Scope1,Scope2] [--dry-run=bool] [flags]
Aliases:
quota, q
Examples:
// Create a new resourcequota named my-quota
$ kubectl create quota my-quota --hard=cpu=1,memory=1G,pods=2,services=3,replicationcontrollers=2,resourcequotas=1,secrets=5,persistentvolumeclaims=10
// Create a new resourcequota named best-effort
$ kubectl create quota best-effort --hard=pods=100 --scopes=BestEffort
```
Automatic merge from submit-queue
Rework pod waiting mechanism in e2e tests to accept pod and watch based
This PR re-applies #28212 which was reverted in #29223. The only difference is that the initial PR also contained `PodStartTimeout` shortening (see [here](4b0c0bd924)), which might have caused the problems. Let's give it a 2nd try. I've tested all the flakes and they were passing on my machine.
@smarterclayton @apelisse ptal
- what the test is doing
- how the test is set up
- subsections of the test setup
additional output
- print time spent getting ready to run proxy attempts
- number of test cases
- multiple attempts of each test case
- how many total proxying attempts will be made
- fast path output now has numerical identity of attempt like error output
- error output has time taken and http status like fast path output
batching runs
- run groups of test cases vs starting all 34*20=680 proxy attempts at
the same time.
- don't wait between starting proxy attempts anymore.
proxy e2e changes
- disable the client side rate limiter
- use `By` construct of ginkgo for inline `STEP` logging
- move the waitGroup add outside of the loop
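The batching described in these notes can be sketched roughly as below. `runInBatches` is a hypothetical helper, not the test's actual code; it starts each batch without pauses, calls `Add` once per batch outside the inner loop, and waits for the batch before starting the next one.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runInBatches calls attempt(i) for every i in [0, total), running at most
// batchSize attempts concurrently and finishing one batch before the next.
func runInBatches(total, batchSize int, attempt func(i int)) {
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		var wg sync.WaitGroup
		wg.Add(end - start) // a single Add per batch, outside the inner loop
		for i := start; i < end; i++ {
			go func(i int) {
				defer wg.Done()
				attempt(i)
			}(i)
		}
		wg.Wait()
	}
}

func main() {
	var done int64
	// 34 test cases * 20 attempts each, 20 attempts in flight at a time.
	runInBatches(34*20, 20, func(int) { atomic.AddInt64(&done, 1) })
	fmt.Println("attempts completed:", atomic.LoadInt64(&done))
}
```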
Automatic merge from submit-queue
Syncing image pulling backoff logic
- Syncing the backoff logic in the parallel image puller and the sequential image puller to prepare for merging the two pullers into one.
- Moving image error definitions under kubelet/images
Automatic merge from submit-queue
Change SETUP_NODE to True for node e2e docker validation test.
The continuous node e2e docker validation test is failing because:
```
W0722 00:48:52.163940 1265 image_list.go:85] Could not pre-pull image gcr.io/google_containers/netexec:1.4 exit status 1 output: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
```
This is because jenkins is not added to docker user group.
For other images tested in node e2e, jenkins is added to docker user group when the images are initially created https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/environment/setup_host.sh#L102.
However, in node e2e docker validation test, we are using GCI image which doesn't do that.
So we should use the `SETUP_NODE` option to add user to docker group before test running b6c87904f6/test/e2e_node/e2e_remote.go (L150-L159).
This is only one line change, could you help me review the PR? @wonderfly
Thanks a lot! :)
Automatic merge from submit-queue
test/e2e: plug time.Ticker resource leak.
This commit ensures that `logPodStartupStatus` does not leak
running `time.Ticker` instances. Upon termination of the consuming
routine, we stop the ticker.
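The pattern looks roughly like this. It is a sketch only: `logUntilStopped` is a hypothetical stand-in for `logPodStartupStatus`, whose real signature is not shown here.

```go
package main

import (
	"fmt"
	"time"
)

// logUntilStopped calls log on every tick until stop is closed. The deferred
// Stop releases the ticker as soon as the consuming loop returns.
func logUntilStopped(period time.Duration, stop <-chan struct{}, log func()) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			log()
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		logUntilStopped(10*time.Millisecond, stop, func() {
			fmt.Println("pods still starting...")
		})
	}()
	time.Sleep(35 * time.Millisecond)
	close(stop)
	<-done
}
```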
Automatic merge from submit-queue
use regular client instead of kubectl in scheduler predicate tests when checking/setting/cleaning taints/labels
The existing implementation in the scheduler predicate tests uses kubectl to check/set/clean taints/labels on nodes, which couples the tests tightly to kubectl.
This PR uses the regular client instead.
Automatic merge from submit-queue
Revert "Drop support for --gce-service-account, require activated creds"
Reverts kubernetes/kubernetes#28802
This appears to break the soak tests with "invalid grant" errors -- see the recent batch of errors in #27920.
Automatic merge from submit-queue
Allow for overriding throughput in load test
We seem to be already supporting higher throughput than what the default is.
I'm going to increase the throughput in our tests:
- speed up scalability tests
- ensure that what I'm seeing locally is really the repeatable case
This PR is a short preparation for those experiments.
[Ideally, I would like to have kubemark-500 to be finishing within 30 minutes. And I think this should be doable pretty soon.]
@gmarek
Automatic merge from submit-queue
Change some node e2e test to use the prepull image framework.
Fix https://github.com/kubernetes/kubernetes/issues/28868.
Node e2e test framework pre-pulls all images in [image_list.go](bc2f223f5a/test/e2e_node/image_list.go)
All node e2e tests should use images from the "image_list". If a test needs a new image, we should update the image_list to include it.
/cc @kubernetes/sig-node to remind people to use `image_list` when adding tests. :)
Automatic merge from submit-queue
add tokenreviews endpoint to implement webhook
Wires up an API resource under `apis/authentication.k8s.io/v1beta1` to expose the webhook token authentication API as an API resource. This allows one API server to use another for authentication and uses existing policy engines for the "authoritative" API server to control access to the endpoint.
@cjcullen you wrote the initial type
Automatic merge from submit-queue
Start namespace controller in node e2e
Fix https://github.com/kubernetes/kubernetes/issues/28320.
Based on https://github.com/kubernetes/kubernetes/pull/28807, only the last 2 commits are new.
Before this PR, there was no namespace controller running in the node e2e test infrastructure, so we could not enable the [`delete-namespace`](f2ddd60eb9/test/e2e/framework/test_context.go (L109)) flag in the test framework.
As a result, running pods are left on the test node after the tests finish. This seems acceptable in our test infrastructure because we create a new instance each time.
However, in 1.4 we may want to provide part of the tests as a node conformance test for users, and they definitely don't want the test to leave tons of pods on their nodes afterwards.
Currently, there is no easy way to start only the namespace controller in kube-controller-manager (confirmed with @mikedanese), so in this PR I start an "uncontainerized" one in the test infrastructure.
This PR:
* Started the namespace controller in the node e2e test infrastructure and enable the automatic namespace deletion.
* Change the privileged test to use framework (@yujuhong), so that all node e2e tests are using the framework and test pods will be cleaned up by namespace controller.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Switched watches in tests require ResourceVersion to be passed
For testing, watches alone are not sufficient: they can miss the event of a Pod transitioning from one state to another if it happens before we start watching. To remedy this, I'm proposing to switch to Gets to always read the actual state of a Pod.
@smarterclayton this fixes https://github.com/openshift/origin/issues/9192 and hopefully all `gave up waiting for pod...` flakes
Automatic merge from submit-queue
Don't repeat the program name in healthCheckCommand.String()
The name is in both `Path` and `Args[0]`, so start printing args at 1.
Also refactor to avoid an extra space character in the output.
I pondered whether `healthCheckCommand.String()` should check if the slice is empty, to avoid a panic, but it didn't check for `Cmd==nil` before.
Fixes #29107
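A sketch of the formatting change, using a hypothetical `command` type that mirrors the `Path` and `Args` fields of `exec.Cmd`. Unlike the code discussed above, this version also guards against an empty `Args` slice.

```go
package main

import (
	"fmt"
	"strings"
)

// command is a hypothetical stand-in with the same Path/Args layout as
// exec.Cmd, where Args[0] repeats the program name.
type command struct {
	Path string
	Args []string
}

// String prints the path followed by the arguments starting at index 1, so
// the program name appears once and no trailing space is produced.
func (c command) String() string {
	parts := []string{c.Path}
	if len(c.Args) > 1 {
		parts = append(parts, c.Args[1:]...)
	}
	return strings.Join(parts, " ")
}

func main() {
	c := command{
		Path: "/usr/bin/curl",
		Args: []string{"curl", "-s", "http://127.0.0.1:8080/healthz"},
	}
	fmt.Println(c)
}
```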
Automatic merge from submit-queue
Change the docker validation node e2e test to use gci-canary-test
This PR changed the continuous docker validation node e2e test to use the image config file introduced in https://github.com/kubernetes/kubernetes/pull/28708. @euank
This PR also changed the gci image family from `gci-preview-test` to `gci-canary-test`. @wonderfly
Automatic merge from submit-queue
Return (bool, error) in Authorizer.Authorize()
Before this change, Authorize() method was just returning an error, regardless of whether the user is unauthorized or whether there is some other unrelated error. Returning boolean with information about user authorization and error (which should be unrelated to the authorization) separately will make it easier to debug.
Fixes #27974
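The shape of the new return value can be illustrated as below. `Attributes`, `allowList`, and `describe` are simplified, hypothetical stand-ins, not the real interfaces; the point is only that "denied" and "something went wrong" become distinguishable.

```go
package main

import (
	"errors"
	"fmt"
)

// Attributes is a hypothetical, minimal request description.
type Attributes struct {
	User string
	Verb string
}

// Authorizer reports the decision and any unrelated failure separately.
type Authorizer interface {
	Authorize(a Attributes) (authorized bool, err error)
}

// allowList is a toy authorizer used only for this sketch.
type allowList struct {
	allowed map[string]bool
}

func (l allowList) Authorize(a Attributes) (bool, error) {
	if l.allowed == nil {
		return false, errors.New("authorizer not configured")
	}
	return l.allowed[a.User], nil
}

// describe shows how a caller can tell the three outcomes apart.
func describe(authz Authorizer, a Attributes) string {
	ok, err := authz.Authorize(a)
	switch {
	case err != nil:
		return "error: " + err.Error()
	case ok:
		return "allowed"
	default:
		return "denied"
	}
}

func main() {
	authz := allowList{allowed: map[string]bool{"alice": true}}
	fmt.Println(describe(authz, Attributes{User: "alice", Verb: "get"}))
	fmt.Println(describe(authz, Attributes{User: "bob", Verb: "get"}))
	fmt.Println(describe(allowList{}, Attributes{User: "alice", Verb: "get"}))
}
```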
Automatic merge from submit-queue
Node E2E: Make it possible to share test between e2e and node e2e
This PR is part of the plan to improve node e2e test coverage.
* Now to improve test coverage, we have to copy test from e2e to node e2e.
* When adding a new test, we have to decide its destiny at the very beginning - whether it is a node e2e or e2e.
This PR makes it possible to share test between e2e and node e2e.
By leveraging the mechanism of ginkgo, as long as we can import the test package in the test suite, the corresponding `Describe` will be run to initialize the global variable `_`, and the test will be inserted into the test suite. (See https://github.com/onsi/composition-ginkgo-example)
In the future, we just need to use the framework to write the test, and put the test into `test/e2e/node`, then it will be automatically shared by the 2 test suites.
This PR:
1) Refactored the framework to make it automatically differentiate e2e and node e2e (Mainly refactored the `PodClient` and the apiserver client initialization).
2) Created a new directory `test/e2e/node` and make it shared by e2e and node e2e.
3) Moved `container_probe.go` into `test/e2e/node` to verify the change.
@kubernetes/sig-node
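The registration mechanism can be sketched without ginkgo. The `Describe` below is a hypothetical stand-in that only records the test name; in the real setup the `var _ = Describe(...)` lines live in the shared test package and run when a suite imports it.

```go
package main

import "fmt"

// registry collects the names of all registered tests.
var registry []string

// Describe is a hypothetical stand-in for ginkgo's Describe. It returns a
// value so it can be used in a package-level variable initializer.
func Describe(name string, body func()) bool {
	registry = append(registry, name)
	return true
}

// Package-level initializers run before main, so each test registers itself
// simply because its package is part of the build.
var _ = Describe("container probe", func() {})
var _ = Describe("downward API", func() {})

func main() {
	fmt.Println("registered tests:", registry)
}
```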
Automatic merge from submit-queue
[flake fix] Wait for the podInformer to observe the pod
Fix #29065
The problem is that the rc manager hasn't observed pod1, so it creates another pod and then scales down, during which pod1 might get deleted. To fix it, wait for the podInformer to observe the pod before running the rc manager.
Marked as P0 as it's fixing a P0 flake.
Automatic merge from submit-queue
Drop support for --gce-service-account, require activated creds
Now that `gcloud auth activate-service-account` is in, remove support in the test framework for default service accounts -- testing GCE/GKE now requires prior gcloud activation.
Automatic merge from submit-queue
Fix verify results in MaxPods
As we already have "unschedulable" PodCondition we can stop relying on Events, which should make the tests more reliable.
cc @davidopp
Automatic merge from submit-queue
authorize based on user.Info
Update the `authorization.Attributes` to use the `user.Info` instead of discrete getters for each piece.
@kubernetes/sig-auth
Automatic merge from submit-queue
Fix a bug in mirror pod node e2e test.
Fixed a bug in test/e2e_node/mirror_pod_test.go. The function 'checkMirrorPodDisappear' returns nil even when the pod does not disappear. It should return a non-nil error.
@Random-Liu
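A sketch of the corrected behaviour, with hypothetical names: `get` stands in for the pod lookup and `errNotFound` for the API's not-found error. Returning a non-nil error while the pod still exists lets a surrounding retry loop keep polling.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for the API's not-found error.
var errNotFound = errors.New("pod not found")

// checkDisappear returns nil only when the lookup reports that the pod is
// gone. A successful lookup means the pod is still there, which is an error.
func checkDisappear(get func() error) error {
	err := get()
	if errors.Is(err, errNotFound) {
		return nil
	}
	if err != nil {
		return err // some other failure; surface it to the caller
	}
	return errors.New("pod has not disappeared")
}

func main() {
	stillThere := func() error { return nil }
	gone := func() error { return errNotFound }

	fmt.Println("pod present:", checkDisappear(stillThere))
	fmt.Println("pod gone:   ", checkDisappear(gone))
}
```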
Automatic merge from submit-queue
[GarbageCollector] Let the RC manager set/remove ControllerRef
What's done:
* RC manager sets Controller Ref when creating new pods
* RC manager sets Controller Ref when adopting pods with matching labels but having no controller
* RC manager clears Controller Ref when pod labels change
* RC manager clears pods' Controller Ref when rc's selector changes
* RC manager stops adoption/creating/deleting pods when rc's DeletionTimestamp is set
* RC manager bumps up ObservedGeneration: The [original code](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replication/replication_controller_utils.go#L36) will do this.
* Integration tests:
* verifies that changing RC's selector or Pod's Labels triggers adoption/abandoning
* e2e tests (separated to #27151):
* verifies GC deletes the pods created by RC if DeleteOptions.OrphanDependents=false, and orphans the pods if DeleteOptions.OrphanDependents=true.
TODO:
- [x] we need to be able to select Pods that have a specific ControllerRef. Then each time we sync the RC, we will iterate through all the Pods that have a controllerRef pointing to the RC, even if the labels of the Pod don't match the selector of the RC anymore. This will prevent a Pod from getting stuck with a stale controllerRef, which could be caused by the race between the abandoner (the goroutine that removes controllerRef) and the worker (the goroutine that adds controllerRef to pods).
- [ ] use controllerRef instead of calling `getPodController`. This might be carried out by the control-plane team.
- [ ] according to the controllerRef proposal (#25256): "For debugging purposes we want to add an adoptionTime annotation prefixed with kubernetes.io/ which will keep the time of last controller ownership transfer." This might be carried out by the control-plane team.
cc @lavalamp @gmarek
Automatic merge from submit-queue
[garbage collector] add e2e test
This PR also includes some changes to plumb controller-manager's `--enable_garbage_collector` from the environment variable.
The e2e test will not be run by the core suite because it's marked `[Feature:GarbageCollector]`.
The corresponding jenkins job configuration PR is https://github.com/kubernetes/test-infra/pull/132.
Automatic merge from submit-queue
Support terminal resizing for exec/attach/run
```release-note
Add support for terminal resizing for exec, attach, and run. Note that for Docker, exec sessions
inherit the environment from the primary process, so if the container was created with tty=false,
that means the exec session's TERM variable will default to "dumb". Users can override this by
setting TERM=xterm (or whatever is appropriate) to get the correct "smart" terminal behavior.
```
Fixes #13585
This allows us to start building real dependencies into Makefile.
Leave old hack/* scripts in place but advise to use 'make'. There are a few
rules that call things like 'go run' or 'build/*' that I left as-is for now.
Automatic merge from submit-queue
node_e2e: configure gce images via config file
This file provides the ability to specify image project on a per-image
basis and is more extensible for future changes.
For backwards compatibility and local development convenience, the
existing flags are kept and should work.
The eventual goal is to be able to source some images, such as the CoreOS one (and possibly containervm one) from their upstream projects and do all new configuration changes via a cloud-init key added to the image config.
This PR is a first step there. A following PR will add a config key of `cloud-init` or `user-data` and migrate the CoreOS e2e to use that.
This motivation is driven by the fact that currently the changes needed for the CoreOS image can all be done quickly in cloud-init and this will make it much easier to update the image and ensure that changes are applied consistently.
/cc @timstclair @vishh @yifan-gu @pwittrock
Automatic merge from submit-queue
Node E2E: Prep for continuous Docker validation node e2e test
Based on https://github.com/kubernetes/kubernetes/pull/28516, for https://github.com/kubernetes/kubernetes/issues/25215.
https://github.com/kubernetes/kubernetes/pull/26813 added support to run e2e test on gci preview image and newest docker version.
This PR added the same support to node e2e test.
The main dependencies of node e2e test are `docker`, `kubelet`, `etcd` and `apiserver`.
Currently, node e2e test builds `kubelet` and `apiserver` locally, and copies them into `/tmp` directory in VM instance. GCI also has built-in `docker`. So the only dependency missing is `etcd`.
This PR injected a simple cloud-init script when creating instance to install `etcd` during node startup.
@andyzheng0831 for the cloud init script.
@wonderfly for the gci instance setup.
@pwittrock for the node e2e test change.
/cc @dchen1107
Automatic merge from submit-queue
Deprecate the term "Ubernetes"
Deprecate the term "Ubernetes" in favor of "Cluster Federation" and "Multi-AZ Clusters"
Automatic merge from submit-queue
Fix path for examples - storage/volume directories changed
Added /volume and /storage in a couple of spots.
Fixes #27978
Automatic merge from submit-queue
Return server's representation of pod from framework pod creation functions
Since PodInterface.Create returns the server's representation of the pod, which may differ from the api.Pod object passed to Create, we do the same from the framework's pod creation functions. This is useful if e.g. you create pods using Pod.GenerateName rather than Pod.Name, and you still want to refer to pods by name later on (e.g. for deletion).
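A small sketch of why returning the server's representation matters. The `Pod` struct and `fakeServer` below are simplified stand-ins invented for illustration; they are not the real `api.Pod` or client types, and the generated suffix format is made up:

```go
package main

import "fmt"

// Pod is a simplified stand-in for api.Pod.
type Pod struct {
	Name         string
	GenerateName string
}

// fakeServer imitates server-side name generation for this sketch only.
type fakeServer struct{ counter int }

// Create returns the server's copy of the pod; the input is left untouched.
func (s *fakeServer) Create(p *Pod) *Pod {
	created := *p
	if created.Name == "" && created.GenerateName != "" {
		s.counter++
		created.Name = fmt.Sprintf("%s%05d", created.GenerateName, s.counter)
	}
	return &created
}

// createPod mirrors the idea of the framework helper: hand back what the
// server returned, so callers can refer to the pod by its final name.
func createPod(s *fakeServer, p *Pod) *Pod {
	return s.Create(p)
}

func main() {
	s := &fakeServer{}
	requested := &Pod{GenerateName: "test-pod-"}
	created := createPod(s, requested)
	fmt.Printf("requested name: %q, created name: %q\n", requested.Name, created.Name)
}
```

Only the returned object carries the generated name, so later steps such as deletion should use it rather than the object that was passed in.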
cc @timstclair
Previous volume binder code was not able to cope with PVs or PVCs getting
modified during the binding process. Current one should be resilient to
these changes, so let's test it.
It makes the test approximately twice as long as before, from ~2 seconds to
~4-5.
Automatic merge from submit-queue
Update coreos node e2e image to a version that uses cgroupfs
Temporary fix for #28192. This PR updates coreos node e2e image to a version that uses cgroupfs.
cc @vishh @yifan-gu
Search and replace for references to moved examples
Reverted find and replace paths on auto gen docs
Reverting changes to changelog
Fix bugs in test-cmd.sh
Fixed path in examples README
ran update-all successfully
Updated verify-flags exceptions to include renamed files
Automatic merge from submit-queue
Node E2E: Disable kubenet for local node e2e test.
After https://github.com/kubernetes/kubernetes/pull/28196, we must manually set up cni and nsenter on the local node to run `make test_e2e_node`, which may not be necessary for local development.
I've tried to move the cni downloading logic into `BeforeSuite`, but it is still hard to decide who should install nsenter: every developer manually, the `setup_host.sh` script, or `BeforeSuite`?
This PR:
* Added a flag to disable kubenet and disabled kubenet in local test.
* Cleaned up the CNI installation logic a bit.
/cc @yujuhong @freehan
Automatic merge from submit-queue
E2E: Add UpdatePod function in e2e framework and change the test to use it.
Fix https://github.com/kubernetes/kubernetes/issues/28096.
Some e2e tests need to update a pod, but updating a pod is a bit complex because of potential conflicts. #28096 happened because the test called pod `Update` only once.
This PR moves the pod update logic into a util function `UpdatePod` in the e2e framework, and changes the tests to use it.
Mark P2 because the original issue is P0, but in fact it does not happen very frequently. :)
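The retry-on-conflict idea can be sketched roughly as follows. This is not the framework's `UpdatePod`; the in-memory `fakeStore`, the `pod` struct, and the `updatePod` signature are assumptions made up for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict: the object has been modified")

// pod is a simplified stand-in carrying only what the sketch needs.
type pod struct {
	resourceVersion int
	labels          map[string]string
}

// fakeStore imitates optimistic concurrency; interfere is the number of
// concurrent writes (by someone else) to simulate.
type fakeStore struct {
	current   pod
	interfere int
}

// get returns a deep copy of the stored pod.
func (s *fakeStore) get() pod {
	labels := make(map[string]string, len(s.current.labels))
	for k, v := range s.current.labels {
		labels[k] = v
	}
	return pod{resourceVersion: s.current.resourceVersion, labels: labels}
}

// update rejects writes that are based on a stale resourceVersion.
func (s *fakeStore) update(p pod) error {
	if s.interfere > 0 {
		s.interfere--
		s.current.resourceVersion++ // someone else wrote first
	}
	if p.resourceVersion != s.current.resourceVersion {
		return errConflict
	}
	p.resourceVersion++
	s.current = p
	return nil
}

// updatePod re-reads the pod and re-applies mutate on every conflict,
// instead of calling update only once.
func updatePod(s *fakeStore, maxAttempts int, mutate func(*pod)) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		p := s.get()
		mutate(&p)
		err := s.update(p)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return fmt.Errorf("update still conflicting after %d attempts", maxAttempts)
}

func main() {
	s := &fakeStore{current: pod{labels: map[string]string{}}, interfere: 2}
	err := updatePod(s, 5, func(p *pod) { p.labels["updated"] = "true" })
	fmt.Println("error:", err, "labels:", s.current.labels)
}
```

The important part is that the mutation is applied to a freshly read object on each attempt, so a conflict never leaves the test working from stale state.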
Automatic merge from submit-queue
Add test/test_owners.csv, for automatic assignment of test failures.
This file will be read by the munger -- see kubernetes/contrib#1264
This also includes a simple script to do minor automatic updates to the CSV.
I'd like to get `update_owners.py` into a more usable state -- right now the CSV is based directly on the Google Sheets data. It has 9 outdated tests and is missing 80 new tests.
I can randomly assign new tests to people on kubernetes-maintainers, but are there any caveats to how the assignment should work? Should they be load balanced? Should some people in the group not receive issues? Etc.
Automatic merge from submit-queue
Fix node e2e issues on selinux enabled systems
It fixes the following 3 node e2e tests:
```
[Fail] [k8s.io] Container Runtime Conformance Test container runtime conformance blackbox test when starting a container that exits [It] it should run with the expected status [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/runtime_conformance_test.go:114
[Fail] [k8s.io] Kubelet metrics api when querying /stats/summary [It] it should report resource usage through the stats api
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:158
```
```
[Fail] [k8s.io] Container Runtime Conformance Test container runtime conformance blackbox test when starting a container that exits [It] should report termination message if TerminationMessagePath is set [Conformance]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/runtime_conformance_test.go:150
```
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
e2e: increase timeout when waiting for deployment pods to be deleted
Use the same timeout as the one used for waiting for the deployment
reaper to complete.
Takes a stab at https://github.com/kubernetes/kubernetes/issues/28067
@kubernetes/deployment PTAL
Automatic merge from submit-queue
Reorganize volume controllers and manager
* Move both PV and attach/detach volume controllers to `controllers/volume` (closes #26222)
* Rename `kubelet/volume` to `kubelet/volumemanager`
* Add/update OWNER files
Automatic merge from submit-queue
Add MinReadySeconds to rolling updater
Add MinReadySeconds support to RollingUpdater. It allows specifying the number of seconds to wait after a pod is "ready" (that is, after its readiness probe has passed).
Automatic merge from submit-queue
Fix node conformance test
Fixes https://github.com/kubernetes/kubernetes/issues/28255, https://github.com/kubernetes/kubernetes/issues/28250, https://github.com/kubernetes/kubernetes/issues/28341.
The main cause of the flake is that the failing test expects the `PodPhase` to stay `Pending`. It ran an `Eventually` check and then a `Consistently` check for 5 seconds. However, the default `PodPhase` is `Pending`, so when the check passes, the `PodStatus` could still be in its default state.
After that, the test expects the container status to be `Waiting`, which may not be the case, because the default `ContainerStatuses` is empty and the pod could still be in the default state.
This PR changes the test to verify `ContainerStatuses` first and to check the `PodPhase` after that.
Mark P1 because the test fails relatively frequently and does block some PRs.
@pwittrock
/cc @liangchenye @ncdc
Automatic merge from submit-queue
Use slices of items to clean up after tests
Fixes #27582.
We used to maintain a pointer variable for each process to kill after the
tests finish. @lavalamp suggested using a slice instead, which is a much
cleaner solution. This implements @lavalamp's suggestion and also extends
the idea to tracking directories that need to be removed after the tests finish.
This also means that we should no longer check for nil `killCmd`s inside
`func (k *killCmd) Kill() error {...}` (see #27582 and #27589). If a nil
`killCmd` makes it in there, something is bad elsewhere and we want to see
the nil pointer exception immediately.
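Roughly, the slice-based cleanup looks like the sketch below. The `cleanup` struct, its field names, and the `kill` callback are assumptions for illustration; only the `killCmd` name and the absence of a nil check in `Kill` come from the description above:

```go
package main

import (
	"fmt"
	"os"
)

// killCmd is a simplified stand-in; kill is a hypothetical callback that
// stops the underlying process.
type killCmd struct {
	name string
	kill func() error
}

// Kill deliberately has no nil-receiver check: a nil *killCmd in the slice
// is a bug elsewhere and should surface immediately as a panic.
func (k *killCmd) Kill() error {
	if err := k.kill(); err != nil {
		return fmt.Errorf("killing %s: %w", k.name, err)
	}
	return nil
}

// cleanup tracks everything that must be torn down after the tests.
type cleanup struct {
	killCmds []*killCmd
	dirs     []string
}

// run kills every tracked process, removes every tracked directory, and
// collects all errors instead of stopping at the first one.
func (c *cleanup) run(removeDir func(string) error) []error {
	var errs []error
	for _, k := range c.killCmds {
		if err := k.Kill(); err != nil {
			errs = append(errs, err)
		}
	}
	for _, d := range c.dirs {
		if err := removeDir(d); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	c := &cleanup{}
	dir, err := os.MkdirTemp("", "node-e2e-sketch")
	if err != nil {
		fmt.Println("could not create temp dir:", err)
		return
	}
	c.dirs = append(c.dirs, dir)
	c.killCmds = append(c.killCmds, &killCmd{name: "fake-etcd", kill: func() error { return nil }})
	for _, err := range c.run(os.RemoveAll) {
		fmt.Println("cleanup error:", err)
	}
}
```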
Mentioning @timstclair and @euank wrt the original issue/PR.
Automatic merge from submit-queue
Federated Services e2e: Simplify logic and logging around verificatio…
Simplify logic and logging around verification of underlying services.
Fixes #28269.
Without this PR, service verification in 4 of our e2e tests sometimes fails.
Automatic merge from submit-queue
Remove duplicated nginx image. Use nginx-slim instead
This PR removes the image `gcr.io/google_containers/nginx:1.7.9` and uses `gcr.io/google_containers/nginx-slim:0.7`.
Besides removing the duplication, `1.7.9` is 16 months old.
Automatic merge from submit-queue
Fix federation e2e tests by correctly managing cluster clients
1. The main fix: Correct overall BeforeEach() to create a new set of cluster clients, rather than just append to the set created by all previous tests. This was screwing up a lot of stuff in difficult to diagnose ways.
2. Add lots of debug logging.
3. Be better about cleaning up after each test.
```
SUCCESS! -- 6 Passed | 0 Failed :-)
```
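To illustrate the main fix in item 1, here is a sketch of the difference between appending to a shared slice in every `BeforeEach()` and rebuilding it. The `suite` type and the string "clients" are placeholders, not the real federation test code:

```go
package main

import "fmt"

// suite holds state shared by all tests; the clients are plain strings
// here purely for illustration.
type suite struct {
	clusterClients []string
}

// buggyBeforeEach appends to whatever earlier tests left behind, so the
// set of clients grows with every test.
func (s *suite) buggyBeforeEach(clusters []string) {
	for _, c := range clusters {
		s.clusterClients = append(s.clusterClients, "client-for-"+c)
	}
}

// fixedBeforeEach starts from a fresh slice on every call.
func (s *suite) fixedBeforeEach(clusters []string) {
	s.clusterClients = make([]string, 0, len(clusters))
	for _, c := range clusters {
		s.clusterClients = append(s.clusterClients, "client-for-"+c)
	}
}

func main() {
	clusters := []string{"cluster-a", "cluster-b"}
	buggy, fixed := &suite{}, &suite{}
	for test := 0; test < 3; test++ {
		buggy.buggyBeforeEach(clusters)
		fixed.fixedBeforeEach(clusters)
	}
	fmt.Println("buggy:", len(buggy.clusterClients), "fixed:", len(fixed.clusterClients))
}
```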
cc @nikhiljindal @madhusudancs @mfanjie @colhom FYI
Automatic merge from submit-queue
Add two pd tests with default grace period
Add two tests in pd.go. They are the same as the flaky test, but the pod deletion uses the default grace period.
Automatic merge from submit-queue
Refactored, expanded and fixed federated-services e2e tests.
1. Moved BeforeEach() and AfterEach() to an inner scope, to prevent clashes with Framework's BeforeEach() and AfterEach(). More to come on this, as it's a major bug in our use of Ginkgo, and affects many other tests.
2. Keep track of which clusters we have created namespaces in, so that we don't try to delete namespaces out of clusters that we didn't create them in (e.g. the primary cluster, where the framework already creates and deletes the required namespace).
3. Separate tests for federated service creation and verification that underlying services are created correctly.
4. For DNS resolution tests, create backend pods (and delete on cleanup) where required.
5. For non-local DNS resolution, delete a backend pod in one cluster to test, and in the remainder of clusters on cleanup.
6. Lots of refactoring to make code re-usable across multiple tests.
7. Lots of debugging/fixing to make sure that everything that the tests create is cleaned up properly afterwards, and doesn't clash with the cleanups done by the e2e Framework.
Automatic merge from submit-queue
TLS bootstrap API group (alpha)
This PR only covers the new types and related client/storage code; the vast majority of the line count is codegen. The implementation differs slightly from the current proposal document based on discussions in the design thread (#20439). The controller logic and kubelet support mentioned in the proposal are forthcoming in separate requests.
I submit that #18762 ("Creating a new API group is really hard") is, if anything, understating it. I've tried to structure the commits to illustrate the process.
@mikedanese @erictune @smarterclayton @deads2k
```release-note-experimental
An alpha implementation of the TLS bootstrap API described in docs/proposals/kubelet-tls-bootstrap.md.
```
Automatic merge from submit-queue
Add EndpointReconcilerConfig to master Config
Add EndpointReconcilerConfig to master Config to allow downstream integrators to customize the reconciler and reconciliation interval when starting a customized master
@kubernetes/sig-api-machinery @deads2k @smarterclayton @liggitt @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Skip multi-zone e2e tests unless provider is GCE, GKE or AWS
No need to fail the tests. If label is not present then it means that node is not in any zone.
Related issue: #27372
Automatic merge from submit-queue
Convert service account token controller to use a work queue
Converts the service account token controller to use a work queue. This allows parallelization of token generation (useful when several namespaces or service accounts are being created simultaneously). It also lets us requeue failures to be retried sooner than the next sync period (which can be very long).
Fixes an issue seen when a namespace is created with a quota on secrets, and the token controller tries to create a token secret before the quota status has been initialized. In that case, the secret is rejected at admission, and the token controller wasn't retrying until the resync period.
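A toy version of the pattern, to show the requeue-on-failure behavior. This is not the controller's code and does not use the real workqueue package; `workQueue`, `runWorkers`, `maxAttempts`, and `errQuotaNotReady` are all invented for the sketch:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errQuotaNotReady = errors.New("quota status not initialized yet")

type item struct {
	key      string
	attempts int
}

// workQueue is a minimal mutex-protected FIFO.
type workQueue struct {
	mu    sync.Mutex
	items []item
}

func (q *workQueue) add(it item) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.items = append(q.items, it)
}

func (q *workQueue) get() (item, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return item{}, false
	}
	it := q.items[0]
	q.items = q.items[1:]
	return it, true
}

// runWorkers drains the queue with several goroutines. A failed key is
// requeued right away (up to maxAttempts) instead of waiting for a resync.
// A worker that requeues keeps looping, so no requeued key is stranded.
func runWorkers(q *workQueue, workers, maxAttempts int, syncFn func(key string) error) []string {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		failed []string
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				it, ok := q.get()
				if !ok {
					return
				}
				if err := syncFn(it.key); err != nil {
					it.attempts++
					if it.attempts < maxAttempts {
						q.add(it)
						continue
					}
					mu.Lock()
					failed = append(failed, it.key)
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return failed
}

func main() {
	q := &workQueue{}
	for _, key := range []string{"ns1/default", "ns2/default", "ns3/default"} {
		q.add(item{key: key})
	}
	var mu sync.Mutex
	seen := map[string]int{}
	failed := runWorkers(q, 2, 3, func(key string) error {
		mu.Lock()
		defer mu.Unlock()
		seen[key]++
		if key == "ns2/default" && seen[key] == 1 {
			return errQuotaNotReady // first attempt is rejected, as at admission
		}
		return nil
	})
	fmt.Println("permanently failed:", failed)
}
```

A production queue would normally add backoff between retries; the sketch omits it to keep the requeue path visible.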
Automatic merge from submit-queue
Mark "RW PD, remove it, then schedule" test flaky
Mark test as flaky while it is being investigated. Tracked by https://github.com/kubernetes/kubernetes/issues/27691
Assigning to @jlowdermilk since he's on call
Automatic merge from submit-queue
e2e: Allow skipping tests for specific runtimes, skip a few tests under rkt
The main benefit of this is that it gives a developer more useful output (more signal to noise) for things that are known broken on that runtime.
cc @kubernetes/rktnetes-maintainers , @ixdy
I'll run this PR through our jenkins and make sure things look happy and compare to the e2e results for this PR.
Automatic merge from submit-queue
[Refactor] QOS to have QOS Class type for QoS classes
This PR adds a QOSClass type and initializes QOSclass constants for the three QoS classes.
It would be good to use this in all future QOS related features.
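For readers unfamiliar with the idea, a typed constant set for the three QoS classes might look like this. The class names (Guaranteed, Burstable, BestEffort) are the standard Kubernetes ones, but the exact identifiers, string values, and the `valid` helper are assumptions rather than the code added by this PR:

```go
package main

import "fmt"

// QOSClass names a pod's quality-of-service class.
type QOSClass string

const (
	Guaranteed QOSClass = "Guaranteed"
	Burstable  QOSClass = "Burstable"
	BestEffort QOSClass = "BestEffort"
)

// valid reports whether c is one of the three known classes
// (a hypothetical helper, added only for this sketch).
func (c QOSClass) valid() bool {
	switch c {
	case Guaranteed, Burstable, BestEffort:
		return true
	}
	return false
}

func main() {
	for _, c := range []QOSClass{Guaranteed, Burstable, BestEffort, "Unknown"} {
		fmt.Printf("%s valid=%v\n", c, c.valid())
	}
}
```

Compared with bare strings, a dedicated type lets the compiler catch a plain string being passed where a QoS class is expected.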
This would also be good to have for the [Pod level cgroups isolation proposal](https://github.com/kubernetes/kubernetes/pull/26751) that I am working on.
@vishh PTAL
Signed-off-by: Buddha Prakash <buddhap@google.com>
Automatic merge from submit-queue
e2e.framework.util.StartPods: panic if the number of replicas is zero
The number of pods to start must be non-zero.
Otherwise the function waits for pods forever if ``waitForRunning`` is true.
If the number of replicas is zero, panic so the mistake is heard all over the e2e realm.
Update all callers of StartPods to test for non-zero number of replicas.
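The guard can be sketched as follows. The signature is simplified and hypothetical (the real `StartPods` takes more arguments); only the panic on a zero replica count reflects the change described above:

```go
package main

import "fmt"

// startPods is a simplified stand-in for the framework helper. With zero
// replicas, a wait-for-running loop would never observe a pod and would
// block forever, so the mistake is reported loudly up front.
func startPods(replicas int) int {
	if replicas < 1 {
		panic("startPods: number of replicas must be non-zero")
	}
	started := 0
	for i := 0; i < replicas; i++ {
		started++ // a real helper would create a pod here
	}
	return started
}

func main() {
	fmt.Println("started:", startPods(3))

	// Callers are expected to check the count before calling.
	replicas := 0
	if replicas > 0 {
		startPods(replicas)
	} else {
		fmt.Println("nothing to start, skipping")
	}
}
```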
Automatic merge from submit-queue
Set grace period to 0 when deleting namespaces after the test.
Otherwise, we try to run the next test and the pods are still there.
Automatic merge from submit-queue
Proportionally scale paused and rolling deployments
Enable paused and rolling deployments to be proportionally scaled.
Also have cleanup policy work for paused deployments.
Fixes #20853, fixes #20966, fixes #20754
@bgrant0607 @janetkuo @ironcladlou @nikhiljindal
Automatic merge from submit-queue
e2e: Delete old code
These tests were added commented out over a year ago. Now they don't compile. The port forward test has a whole file devoted to replacing it (`e2e/portforward.go`) and while the exec test doesn't have a perfect replacement, it has several tests that cover for it (exec over a websocket, an e2e_node test, all the kubectl execs). If we want that test, it would be better to write it fresh anyways.
cc @ncdc
Automatic merge from submit-queue
Use gcloud for default node pool and api for other in cluster autoscaler e2e test
cc: @piosz @jszczepkowski @fgrzadkowski
Currently there is a problem with gcloud when non-default pool is used for cluster update. So we temporarily switch to the old ca-enable method for non-default pools until it is fixed.
Automatic merge from submit-queue
A few changes to federated-service e2e test.
Most of the changes that get the test to pass have been made already or
elsewhere. Here we restructure a bit fixing a nesting problem, extend the
timeouts, and start creating distinct backend pods that I'll delete in the
non-local test (coming shortly).
Also some extra debugging info in the DNS code. I made some upstream
changes to skydns in https://github.com/skynetservices/skydns/pull/283
For #27739
Includes a commit from @madhusudancs that I will remove once his merges.
Automatic merge from submit-queue
e2e_node: lower the log verbosity level
The current level is so high that the logs are almost unreadable.
This fixes #27593
Automatic merge from submit-queue
Fixes a node e2e test error
Fixes the following node e2e test error:
[k8s.io] Kubelet metrics api when querying /stats/summary [It] it should report resource usage through the stats api
And the logs show the following error:
```
Jun 21 15:57:13 localhost journal: tee: /test-empty-dir-mnt: Is a directory
```
And the test fails with:
```
------------------------------
• Failure [310.665 seconds]
[k8s.io] Kubelet
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e/framework/framework.go:685
metrics api
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:161
when querying /stats/summary
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:160
it should report resource usage through the stats api [It]
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:159
Timed out after 300.000s.
Expected
<*errors.errorString | 0xc82026b6f0>: {
s: "expected \"volume used\" to not be zero",
}
to be nil
/root/upstream-code/gocode/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:158
------------------------------
```
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
increase addon check interval
Do static pods have a crash loop back off? If so, this test would be much faster if we restarted the kubelet to clear that.
Fixes #26770
Automatic merge from submit-queue
Add integration test for binding PVs using label selectors
Adds an integration test for persistent volume claim 'MatchExpressions' label selector.
Automatic merge from submit-queue
Fix 7 broken example e2e tests
Fixes #27325, Fixes #27727
7 broken example e2e tests:
- [x] Spark
* `namespace` is specified in the example yaml files, which conflicts with the e2e test namespaces; fixed by removing the namespace from the yaml (the yaml files of the [spark example](https://github.com/kubernetes/kubernetes/tree/master/examples/spark) don't need the namespace specified since it's specified in the context) -- cc @k82 who added namespace to Spark example in #23807
* wait for pods to exist before determining if it's running
- [x] Hazelcast
* wait for pods to exist before determining if it's running
- [x] Redis
* image `kubernetes/redis:v2` is not found, changed to `kubernetes/redis:v1` instead
* wait for pods to exist before determining if it's running
- [x] Celery-RabbitMQ
* remove 1 redundant call to `forEachPod`
* wait for pods to exist before determining if it's running
- [x] Cassandra
* fix `kubectl exec` on incorrect pod name
* fix getting endpoint ip addresses before creating pods
* wait for pods to exist before determining if it's running
- [x] Storm
* wait for pods to exist before determining if it's running
- [x] RethinkDB
* wait for pods to exist before determining if it's running
Automatic merge from submit-queue
Reapply ScheduledJob tests (2ab885a53a)
Re-applied the ScheduledJob tests (#25737) which were reverted due to an integration test error in #27184.
The problem was in `TestBatchGroupBackwardCompatibility` which is testing backwards compatibility for storing jobs (`extensions/v1beta1` vs `batch/v1`), which is not needed for `batch/v2alpha1`. I've added a skip to aforementioned test for that group. See `test/integration/master_test.go` for the actual fix.
@caesarxuchao @mikedanese ptal
@piosz @jszczepkowski @erictune fyi
Automatic merge from submit-queue
GCE provider: Limit Filter calls to regexps rather than insane blobs
Filters can't exceed 4k, and GET requests against the GCE API are also limited, so these break down in different ways at different cluster counts. Fix it by introducing an advisory `node-instance-prefix` configuration in the GCE provider that can hint the `EnsureLoadBalancer`/`UpdateLoadBalancer` code (and the firewall creation/update code). If it's not there, or wrong (a hostname that's registered violates it), just ignore it and grab the whole project.
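The fallback rule can be sketched like this. The function name `instanceFilter` and the `name eq <regexp>` filter string are assumptions made for illustration; an empty result stands for "no filter, list the whole project":

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// instanceFilter returns a short regexp-based filter when the advisory
// prefix is usable, and "" when it is missing or when any registered
// hostname violates it.
func instanceFilter(prefix string, hostnames []string) string {
	if prefix == "" {
		return ""
	}
	for _, h := range hostnames {
		if !strings.HasPrefix(h, prefix) {
			return "" // the hint is wrong: ignore it
		}
	}
	// One short pattern instead of a long alternation of every hostname.
	return "name eq " + regexp.QuoteMeta(prefix) + ".*"
}

func main() {
	fmt.Printf("%q\n", instanceFilter("node", []string{"node1", "node2"}))
	fmt.Printf("%q\n", instanceFilter("node", []string{"node1", "other7"}))
	fmt.Printf("%q\n", instanceFilter("", []string{"node1"}))
}
```

Because the prefix is only advisory, a wrong hint costs a larger listing but never hides an instance.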
Fixes #27731
Automatic merge from submit-queue
Migrate most of remaining tests from cmd/integration to test/integration to use framework
Ref #25940
Built on top of https://github.com/kubernetes/kubernetes/pull/27182 - only the last commit is unique
Automatic merge from submit-queue
Add possibility to run integration tests in parallel
- add env. variable with etcd URL to integration tests
- update documentation with an example of how to use it to find flakes
Automatic merge from submit-queue
Add integration test for binding PVs using label selectors
Adds an integration test for persistent volume claim label selector.
Many integration tests delete all keys in etcd as part of their cleanup.
To run these tests in parallel we must run several etcd daemons, each on
different port and pass etcd url to the test suite.
Automatic merge from submit-queue
Node E2E: add termination message test
Based on #23658.
This PR:
1) Cleans up the `ConformanceContainer` a bit
2) Adds termination message test
This test proves #23639; without #23658, the test could not pass.
@liangchenye @kubernetes/sig-node
Automatic merge from submit-queue
add unit and integration tests for rbac authorizer
This PR adds lots of tests for the RBAC authorizer.
The plan over the next couple days is to add a lot more test cases.
Updates #23396
cc @erictune
Automatic merge from submit-queue
WaitForRunningReady also waits for PodsSuccess
Ref. #27095 - fixes the test, doesn't fix the problem.
cc @yujuhong @fejta
Automatic merge from submit-queue
Add integration test for provisioning/deleting many PVs.
The test is configurable by KUBE_INTEGRATION_PV_OBJECTS for load tests, 100 objects are created by default.
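A sketch of how such a knob is typically read, with the default of 100 from the description. The helper name `parseObjectCount` and the rejection of non-positive values are assumptions, not necessarily what the test does:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const defaultObjectCount = 100

// parseObjectCount turns the raw environment value into an object count,
// falling back to the default when the variable is unset or empty.
func parseObjectCount(raw string) (int, error) {
	if raw == "" {
		return defaultObjectCount, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid KUBE_INTEGRATION_PV_OBJECTS %q: %w", raw, err)
	}
	if n < 1 {
		return 0, fmt.Errorf("KUBE_INTEGRATION_PV_OBJECTS must be positive, got %d", n)
	}
	return n, nil
}

func main() {
	n, err := parseObjectCount(os.Getenv("KUBE_INTEGRATION_PV_OBJECTS"))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("objects to create:", n)
}
```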
@kubernetes/sig-storage
Automatic merge from submit-queue
Filter seccomp profile path from malicious .. and /
Without this patch, with `localhost/<some-relative-path>` as the seccomp profile one can load any file on the host, e.g. `localhost/../../../../dev/mem`, which is not healthy for the kubelet.
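One way to neutralize such paths, sketched under assumptions: the function name `profilePath`, the `root` argument, and the error messages are invented here, and the patch itself may filter differently. The sketch cleans the name as if it were rooted at `/`, which discards leading `..` elements, and then joins it under the profile root:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// profilePath maps "localhost/<name>" to a file under root and guarantees
// that the result cannot escape root through ".." or a leading "/".
func profilePath(root, profile string) (string, error) {
	const prefix = "localhost/"
	if !strings.HasPrefix(profile, prefix) {
		return "", fmt.Errorf("unsupported seccomp profile %q", profile)
	}
	name := strings.TrimPrefix(profile, prefix)
	// Cleaning a rooted path drops any ".." elements at its start.
	cleaned := path.Clean("/" + name)
	if cleaned == "/" {
		return "", fmt.Errorf("seccomp profile %q has an empty name", profile)
	}
	return path.Join(root, cleaned), nil
}

func main() {
	for _, p := range []string{
		"localhost/profiles/audit.json",
		"localhost/../../../../dev/mem",
		"localhost//etc/passwd",
	} {
		resolved, err := profilePath("/var/lib/kubelet/seccomp", p)
		fmt.Println(p, "->", resolved, err)
	}
}
```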
/cc @jfrazelle
Unit tests depend on https://github.com/kubernetes/kubernetes/pull/26710.
Automatic merge from submit-queue
Revert revert of downward api node defaults
Reverts the revert of https://github.com/kubernetes/kubernetes/pull/27439
Fixes #27062
@dchen1107 - who at Google can help debug why this caused issues with GKE infrastructure but not GCE merge queue?
/cc @wojtek-t @piosz @fgrzadkowski @eparis @pmorie
Automatic merge from submit-queue
Cleanups following #27587
- Add back the negative assertions, but mark them [Slow].
- Use the current DNS TTL of 180 sec as our timeout for all DNS tests.
- Assorted cleanups and refactoring.
Automatic merge from submit-queue
Extend ingress e2e
Splits the test into a cross platform conformance list, and platform specific bits that exercise features through annotations. Also exercises the features in https://github.com/kubernetes/contrib/pull/1133. Assigning to Girish, simply because I assigned the other pr to Minhan.
- Dropped the regex test and just test for nslookup exiting 0.
- Moved more setup into BeforeEach and used nested Context for non-local
case.
- Poll inside the container using a bash loop.
- Aim for less console noise unless something goes wrong.
- Commented out the tests trying to verify that a DNS name is absent.
Automatic merge from submit-queue
in each pd test, create and delete the pod for every iteration to give new pod name for exec
fix #26141
based on chat with @ncdc
The following is a snapshot of the log. Each iteration now has a new Pod name
```text
[It] should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow] [Flaky]
/srv/dev/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/pd.go:277
STEP: creating PD1
Jun 10 15:55:45.878: INFO: Successfully created a new PD: "rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c".
STEP: creating PD2
Jun 10 15:55:49.794: INFO: Successfully created a new PD: "rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c".
Jun 10 15:55:49.794: INFO: PD Read/Writer Iteration #0
STEP: submitting host0Pod to kubernetes
W0610 15:55:49.860308 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: writing a file in the container
Jun 10 15:56:09.792: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '988876932586416926' > '/testpd1/tracker0''
Jun 10 15:56:12.003: INFO: Wrote value: "988876932586416926" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:12.003: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '8414937992264649637' > '/testpd2/tracker0''
Jun 10 15:56:13.170: INFO: Wrote value: "8414937992264649637" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:13.170: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:14.325: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:14.325: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-cd68f34b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:15.590: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: deleting host0Pod
Jun 10 15:56:15.841: INFO: PD Read/Writer Iteration #1
STEP: submitting host0Pod to kubernetes
W0610 15:56:15.905485 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
Jun 10 15:56:16.832: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:18.132: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:18.132: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:19.354: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: writing a file in the container
Jun 10 15:56:19.354: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '7639503234625274799' > '/testpd1/tracker1''
Jun 10 15:56:20.526: INFO: Wrote value: "7639503234625274799" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:20.526: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '7400445987108171911' > '/testpd2/tracker1''
Jun 10 15:56:21.694: INFO: Wrote value: "7400445987108171911" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:21.694: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:22.904: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:22.905: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:24.080: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: reading a file in the container
Jun 10 15:56:24.081: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:25.290: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:25.290: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-dcef71e1-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:26.491: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: deleting host0Pod
Jun 10 15:56:26.756: INFO: PD Read/Writer Iteration #2
STEP: submitting host0Pod to kubernetes
W0610 15:56:26.821828 17282 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
Jun 10 15:56:27.898: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:29.096: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:29.096: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:30.325: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: reading a file in the container
Jun 10 15:56:30.325: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:31.528: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:31.529: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:32.972: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: writing a file in the container
Jun 10 15:56:32.972: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1846555975530999997' > '/testpd1/tracker2''
Jun 10 15:56:34.157: INFO: Wrote value: "1846555975530999997" to PD1 ("rootfs-e2e-c8b82df9-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
Jun 10 15:56:34.157: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '2775947264799611726' > '/testpd2/tracker2''
Jun 10 15:56:35.661: INFO: Wrote value: "2775947264799611726" to PD2 ("rootfs-e2e-cb135362-2f23-11e6-a5a0-b8ca3a62792c") from pod "pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
Jun 10 15:56:35.662: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
Jun 10 15:56:36.868: INFO: Read file "/testpd1/tracker0" with content: 988876932586416926
STEP: reading a file in the container
Jun 10 15:56:36.868: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
Jun 10 15:56:38.062: INFO: Read file "/testpd2/tracker0" with content: 8414937992264649637
STEP: reading a file in the container
Jun 10 15:56:38.062: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker1'
Jun 10 15:56:39.221: INFO: Read file "/testpd1/tracker1" with content: 7639503234625274799
STEP: reading a file in the container
Jun 10 15:56:39.221: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker1'
Jun 10 15:56:40.397: INFO: Read file "/testpd2/tracker1" with content: 7400445987108171911
STEP: reading a file in the container
Jun 10 15:56:40.397: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker2'
Jun 10 15:56:41.584: INFO: Read file "/testpd1/tracker2" with content: 1846555975530999997
STEP: reading a file in the container
Jun 10 15:56:41.585: INFO: Running '/srv/dev/kubernetes/_output/local/bin/linux/amd64/kubectl exec --namespace=e2e-tests-pod-disks-2tvm2 pd-test-e370dd2b-2f23-11e6-a5a0-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker2'
Jun 10 15:56:42.800: INFO: Read file "/testpd2/tracker2" with content: 2775947264799611726
STEP: deleting host0Pod
```
@saad-ali
Automatic merge from submit-queue
Dumping logs of federation pods (federation-apiserver, federation-controller-manager) on e2e test failure
Ref https://github.com/kubernetes/kubernetes/issues/26762
This should help with debugging failures.
Right now there is no way to access those logs.
@kubernetes/sig-cluster-federation @colhom
Automatic merge from submit-queue
Call NewFramework constructor instead of hand creating framework.
https://github.com/kubernetes/kubernetes/issues/27486, probably because we defined a new clientConfigGetter for node e2es and this test was hand creating the framework.
Automatic merge from submit-queue
Kubelet Volume Attach/Detach/Mount/Unmount Redesign
This PR redesigns the Volume Attach/Detach/Mount/Unmount in Kubelet as proposed in https://github.com/kubernetes/kubernetes/issues/21931
```release-note
A new volume manager was introduced in kubelet that synchronizes volume mount/unmount (and attach/detach, if attach/detach controller is not enabled).
This eliminates the race conditions between the pod creation loop and the orphaned volumes loops. It also removes the unmount/detach from the `syncPod()` path so volume clean up never blocks the `syncPod` loop.
```
Automatic merge from submit-queue
federation: choosing a default federation name in test instead of failing
The tests are failing right now:
http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce-federation/
```
[k8s.io] Service [Feature:Federation] should be able to discover a non-local federated service
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federated-service.go:130 Jun 14 12:40:35.091: FEDERATION_NAME environment variable must be set
[k8s.io] Service [Feature:Federation] should be able to discover a federated service
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federated-service.go:130 Jun 14 12:40:40.802: FEDERATION_NAME environment variable must be set
```
This is to fix them.
cc @kubernetes/sig-cluster-federation @mml
This commit adds a new volume manager in kubelet that synchronizes
volume mount/unmount (and attach/detach, if attach/detach controller
is not enabled).
This eliminates the race conditions between the pod creation loop
and the orphaned volumes loops. It also removes the unmount/detach
from the `syncPod()` path so volume clean up never blocks the
`syncPod` loop.
Automatic merge from submit-queue
Make timeout for starting system pods configurable
Context: in 2000-node clusters (if only one node is big enough to fit heapster, which is our testing configuration), heapster won't be scheduled until that node has a route. However, creating routes is pretty expensive and can currently take as long as 2 hours.
@zmerlynn @gmarek
Automatic merge from submit-queue
Add image pulling node e2e
Fixes #27007.
Based on #27309, will rebase after #27309 gets merged.
This PR added all tests mentioned in #27007:
* Pull an image from invalid registry;
* Pull an invalid image from gcr;
* Pull an image from gcr;
* Pull an image from docker hub;
* Pull an image that needs auth, with and without secrets.
For the imagePullSecrets test, I created a new gcloud project "authenticated-image-pulling", and the service account in the code only has "Storage Object Viewer" permission.
/cc @pwittrock @vishh
Automatic merge from submit-queue
Add description to created node images
Make it a little easier to see who to contact about important node e2e images.
The number of pods to start must be non-zero.
Otherwise the function waits for pods forever if waitForRunning is true.
If the number of replicas is zero, panic so the mistake is heard all over the e2e realm.
Update all callers of StartPods to test for non-zero number of replicas.
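The guard described above could look roughly like the following. This is a minimal sketch, not the actual framework code: `startPods` and its parameters are hypothetical stand-ins for the real helper.

```go
package main

import "fmt"

// startPods is a hypothetical stand-in for the e2e helper described above.
// It rejects a replica count below one up front, because waiting for zero
// pods to become running would otherwise never finish.
func startPods(replicas int, waitForRunning bool) int {
	if replicas < 1 {
		panic("startPods: number of replicas must be non-zero")
	}
	// A real helper would create the pods here and optionally wait for them.
	return replicas
}

func main() {
	// Callers test for a non-zero count instead of relying on the panic.
	replicas := 0
	if replicas > 0 {
		startPods(replicas, true)
	} else {
		fmt.Println("skipping startPods: nothing to start")
	}
}
```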
Automatic merge from submit-queue
Updating federation up scripts to work in non e2e setup
Ref: https://github.com/kubernetes/kubernetes.github.io/pull/656
Updating the federation up scripts so that they work as per steps in https://github.com/kubernetes/kubernetes.github.io/pull/656.
Changes are:
* Updating the default namespace to be "federation" instead of "federation-e2e"
* Updated the kubeconfig context to be named "federation-cluster" instead of "federated-context"
* Fixing federation-up so that FEDERATION_IMAGE_TAG is set even when federation-up is run without running `e2e.go --up`. e2e-up.sh sets it here: 6a388d4a0d/hack/e2e-internal/e2e-up.sh (L44).
* Adding a "missingkey=zero" option to template parser. Without this, the parser adds `"<no value>"` at the place of an env var that is not set. With this change, it instead replaces it with the corresponding zero value (for ex "" for strings). This is required for the FEDERATION_DNS_PROVIDER_CONFIG env var.
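The effect of the "missingkey=zero" option can be seen with Go's `text/template` package. The sketch below assumes the env vars are passed as a `map[string]string`; the template text and the `render` helper are made up for illustration and are not taken from the federation scripts.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render executes a tiny template against env. opt is passed to
// Template.Option when non-empty (for example "missingkey=zero").
func render(opt string, env map[string]string) string {
	t := template.New("cfg")
	if opt != "" {
		t = t.Option(opt)
	}
	t = template.Must(t.Parse(`dns-config="{{.FEDERATION_DNS_PROVIDER_CONFIG}}"`))
	var sb strings.Builder
	if err := t.Execute(&sb, env); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	env := map[string]string{} // FEDERATION_DNS_PROVIDER_CONFIG is not set

	// Default behaviour: the missing key is rendered as "<no value>".
	fmt.Println(render("", env))

	// With missingkey=zero the zero value of the map's element type is used,
	// which is the empty string for map[string]string.
	fmt.Println(render("missingkey=zero", env))
}
```

Note that the zero value depends on the map's element type, so a map of interface values would behave differently from the string map assumed here.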
cc @kubernetes/sig-cluster-federation @colhom @mml
Automatic merge from submit-queue
Implement first set of federated service e2e tests.
These tests are untested and there is no guarantee that they work. The ongoing auth problems are blocking these e2es from being tested and upon @quinton-hoole's request I am submitting them now.
Only the last commit here needs review.
Depends on #26953
cc @nikhiljindal @colhom @mfanjie @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
Fix node e2e coreos kubelet cgroup detection
Fixes #26979 #26431
The root issue, as best I can tell, is that cgroup detection does not work when the kubelet is started under an ssh session and the systemd `*Accounting` variables are set. I added additional logging and noted some differences in the cgroup slice names between those that cadvisor returns and those that the kubelet detects for itself.
This difference does not occur if the kubelet is properly running under a unit. That environment is also a more common and sane environment.
See also discussion in #26903
cc @derekwaynecarr @vishh @pwittrock
Note that these tests are untested and there is no guarantee that they work.
The ongoing auth problems are blocking these e2es from being tested and upon
@quinton-hoole's request I am submitting them now.
Automatic merge from submit-queue
Add pending pod check in cluster autoscaler e2e tests
The tests should wait until all pods are running before declaring a success and resizing the mig.
cc: @fgrzadkowski @piosz @jszczepkowski
Automatic merge from submit-queue
volume integration: wait for PVs before creating PVCs
The test should wait until all volumes are processed by volume controller (i.e. in the controller cache) before creating a PVC.
Without that, the "best" matching PV might not be in the cache and the controller might bind the PVC to a suboptimal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity 50000000000 but got pvc-2 capacity 52000000000".
Fixes #27179 (together with #26894)
Automatic merge from submit-queue
Fix integration pv flakes
There are two fixes in this PR:
- run tests in separate functions and use objects with different names, otherwise events from the beginning of the function are caught later when we watch for events of a different PV/PVC
- don't set PV.Spec.ClaimRef.UID of pre-bound PVs. PVs with UID set are considered as bound and they are deleted/recycled when the appropriate PVC does not exist yet.
Fixes #26730 and probably also ~~#26894~~ #26256
The test should wait until all volumes are processed by volume controller (i.e.
in the controller cache) before creating a PVC.
Without that, the "best" matching PV might not be in the cache and the controller
might bind the PVC to a suboptimal one.
This fixes integration test flake "Bind mismatch! Expected pvc-2 capacity
50000000000 but got pvc-2 capacity 52000000000".
Automatic merge from submit-queue
e2e: actually check error when we fail to GET the scheduled pod
and GET the pod before we try to gracefully delete it.
and add a debug log.
ref #26224
Automatic merge from submit-queue
Considering all nodes for the scheduler cache to allow lookups
Fixes the actual issue that led me to create https://github.com/kubernetes/kubernetes/issues/22554
Currently the nodes in the cache provided to the predicates excludes the unschedulable nodes using field level filtering for the watch results. This results in the above issue as the `ServiceAffinity` predicate uses the cached node list to look up the node metadata for a peer pod (another pod belonging to the same service). Since this peer pod could be currently hosted on a node that is currently unschedulable, the lookup could potentially fail, resulting in the pod failing to be scheduled.
As part of the fix, we are now including all nodes in the watch results and excluding the unschedulable nodes using `NodeCondition`
@derekwaynecarr PTAL
Automatic merge from submit-queue
Port the downward api test to the node e2e suite
Also extend the framework to allow a custom client config loading function, so
that the node e2e suite can reuse the same framework across tests.
This fixes #26609
/cc @timstclair @pwittrock
Automatic merge from submit-queue
Add e2e test for node problem detector
Based on https://github.com/kubernetes/node-problem-detector/pull/12.
This PR adds an e2e test for node problem detector. It starts a node problem detector with a test configuration and tests its functionality.
Question:
* Should this be a node e2e test or e2e test?
* Should we mark this as Serial? The test should not affect other tests, but it will add a `TestConditon` in node conditions and remove in cleanup.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Listing pods only once when getting pods for RS in deployment
Fixes #26834
1. Avoid ranging over RSes and then `List` pods of each RS. Instead, `List` pods of the deployment once, and then filter pods of each RS.
2. Avoid using clientset to `List` pods in deployment controller. Use podStore instead. (TODO in some functions because the unit tests don't have podStore.)
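A rough sketch of the first point: list the pods once, then group them per ReplicaSet in memory. The `pod` and `replicaSet` types and the equality-only selector matching are simplifying assumptions, not the controller's real types.

```go
package main

import "fmt"

// pod and replicaSet are hypothetical, stripped-down stand-ins.
type pod struct {
	name   string
	labels map[string]string
}

type replicaSet struct {
	name     string
	selector map[string]string
}

// matches reports whether labels satisfy every key/value pair in selector.
func matches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// podsByReplicaSet filters one already-listed slice of pods per ReplicaSet,
// instead of issuing one List call for each ReplicaSet.
func podsByReplicaSet(pods []pod, rss []replicaSet) map[string][]pod {
	out := make(map[string][]pod, len(rss))
	for _, rs := range rss {
		for _, p := range pods {
			if matches(rs.selector, p.labels) {
				out[rs.name] = append(out[rs.name], p)
			}
		}
	}
	return out
}

func main() {
	pods := []pod{
		{name: "p1", labels: map[string]string{"app": "web", "hash": "a"}},
		{name: "p2", labels: map[string]string{"app": "web", "hash": "b"}},
		{name: "p3", labels: map[string]string{"app": "web", "hash": "b"}},
	}
	rss := []replicaSet{
		{name: "rs-a", selector: map[string]string{"app": "web", "hash": "a"}},
		{name: "rs-b", selector: map[string]string{"app": "web", "hash": "b"}},
	}
	grouped := podsByReplicaSet(pods, rss)
	fmt.Println(len(grouped["rs-a"]), len(grouped["rs-b"]))
}
```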
@kubernetes/deployment
Automatic merge from submit-queue
support for mounting local-ssds on GCI
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
@vulpecula @roberthbailey @andyzheng0831 @kubernetes/goog-image
This reverts commit 2494c77972.
The previous commit, which launches the kubelet under `systemd-run`,
fixes an error in detecting the kubelet's cgroup stats.
Fixes #26979
Automatic merge from submit-queue
Fix GKE upgrade e2e util.
The containers command group at HEAD no longer accepts --zone. The flag
has to be specified after the subcommand group. Fix #27011
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
Automatic merge from submit-queue
Enable WatchCache in test/integration/ tests
We already run cmd/integration/ with watch cache on. We should also run tests in test/integration/ with watch cache on.
@wojtek-t @lavalamp
Automatic merge from submit-queue
Add a custom main instead of the standard test main, to reduce stack …
Adds a custom test main handler (see: `TestMain` in https://golang.org/pkg/testing/ for details)
Partial fix for https://github.com/kubernetes/kubernetes/issues/25965
This does the standard timeout, but strips non-kubernetes stacks out of the stack trace. For example, it filters out stacks like:
```
goroutine 466 [IO wait, 7 minutes]:
net.runtime_pollWait(0x7fd74c4672c0, 0x72, 0xc821614000)
/usr/local/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8215c21b0, 0x72, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8215c21b0, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc8215c2150, 0xc821614000, 0x1000, 0x1000, 0x0, 0x7fd74c491050, 0xc820014058)
/usr/local/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc820a5a090, 0xc821614000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:172 +0xe4
net/http.noteEOFReader.Read(0x7fd74c465258, 0xc820a5a090, 0xc8215f0068, 0xc821614000, 0x1000, 0x1000, 0x405773, 0x0, 0x0)
/usr/local/go/src/net/http/transport.go:1687 +0x67
net/http.(*noteEOFReader).Read(0xc8215ae1a0, 0xc821614000, 0x1000, 0x1000, 0xc82159ad1d, 0x0, 0x0)
<autogenerated>:284 +0xd0
bufio.(*Reader).fill(0xc8202a2b40)
/usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Peek(0xc8202a2b40, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:132 +0xcc
net/http.(*persistConn).readLoop(0xc8215f0000)
/usr/local/go/src/net/http/transport.go:1073 +0x177
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:857 +0x10a6
```
We may want to get even more aggressive in the future.
@kubernetes/sig-testing
Automatic merge from submit-queue
Move quota usage testing for loadbalancers into unit tests
Fixes https://github.com/kubernetes/kubernetes/issues/26319
* moved testing for node port and load balancer usage in quota to unit tests
* remove node port and node port -> loadbalancer service testing out of e2e
* covered already in replenishment_controller_test scenario
Given the time it takes to even allocate a load balancer, it seems better to test that outside of this test case to avoid unnecessary flakes.
/cc @bprashanth
Automatic merge from submit-queue
Flake 26210: decouple explicit access from port 80
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Reduce huge amount of logs in large cluster tests
When running tests in 2000-node clusters, I got more than 100,000 lines like this:
```
Jun 8 01:03:11.850: INFO: Condition NetworkUnavailable of node gke-gke-large-cluster-default-pool-1-03ee5a12-knrw is true instead of false. Reason: NoRouteCreated, message: Node created without a route
```
that doesn't give much value.
This PR reduces the number of such log lines.
Also includes other improvements:
- Makefile rule to run tests against remote instance using existing host or image
- Makefile will reuse an instance created from an image if it was not torn down
- Runner starts gce instances in parallel with building source
- Runner uses instance ip instead of hostname so that it doesn't need to resolve
- Runner supports cleaning up files and processes on an instance without stopping / deleting it
- Runner runs tests using `ginkgo` binary to support running tests in parallel
Now that GCE routes take an extremely long time to come up and there's
a variance in "Ready" and "Schedulable", start cherry-picking tests
where we really want to have all nodes routable/schedulable for
testing. Adding logging. This will increase test times on large
clusters but should have 0 impact on normal testing.
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Mark runtime conformance tests as flaky and not run them in jenkins CI.
For #26809
@pwittrock As discussed offline, marking runtime tests as flaky for now. I'm not sure if those tests are required. Testing docker in every Kubernetes PR is unnecessary.
These tests can be run periodically in a separate CI. AFAIK, these tests don't seem to exercise any kube features.
When we create a PV, we should create it without Spec.ClaimRef.UID.
In rare cases, when 'PV added' event with UID is processed before 'PVC
added' (created by for loop few lines above), the controller does not know
a PVC with this UID and considers the PV as released. Reclaim policy is
then executed and the PV is deleted and it's never bound.
With UID="", the controller waits for the PVC to get created and binds
it.
Different tests should use different objects and watchers - I noticed
sometimes an event from old tests leaked into subsequent test in the
same function.
And add some logs.
Automatic merge from submit-queue
Update Node e2e Core OS image to run systemd with CPU & Memory accounting enabled by default
cc @derekwaynecarr
For #26289
Automatic merge from submit-queue
volume controller: add configurable integration test to stress the binder
The test tries to bind configured nr. of PVs to the same nr. of PVCs. '100' is used by default, which should take ~1-3 seconds (depends on log level). Periodic sync is needed in rare cases, which may add another 10 seconds. - cache from #25881 will help here and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance. Set KUBE_INTEGRATION_PERSISTENTVOLUME_* env. variables as described in persistent_volume_test.go and run the tests:
```
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PERSISTENTVOLUME_SYNC_PERIOD=10s KUBE_INTEGRATION_PERSISTENTVOLUME_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
```
Log level '2' is useful to get timestamps of various events like 'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs: claims are bound'.
Automatic merge from submit-queue
kubelet e2e: enforce that image prepulling must finish before the test
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The test tries to bind configured nr. of PVs to the same nr. of PVCs.
'100' is used by default, which should take ~1-3 seconds (depends on log level).
Periodic sync is needed in rare cases, which may add another 10 seconds. - cache
from #25881 will help here and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance.
Set KUBE_INTEGRATION_PV_* env. variables as described in
persistent_volume_test.go and run the tests:
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PV_SYNC_PERIOD=10s KUBE_INTEGRATION_PV_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
Log level '2' is useful to get timestamps of various events like
'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs:
claims are bound'.
Automatic merge from submit-queue
Stabilize persistent volume integration tests
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Fixes #26499 (at least I hope so...)
Automatic merge from submit-queue
Don't allow deps with no discernible license
This updates the few deps we had with no LICENSE file to current versions that do have that file. It also disallows new deps without obvious licenses.
Automatic merge from submit-queue
Enable node e2e accounting on systemd
Updated the e2e setup.sh script to enable cpu and memory accounting.
Related to https://github.com/kubernetes/kubernetes/issues/26198
/cc @pwittrock
Automatic merge from submit-queue
Disable PodAffinity SchedulerPredicates test
This feature is disabled, so it's not surprising that tests don't work.
cc @davidopp @kevin-wangzefeng
@david-mcmahon - this disables the second test that causes failures in SchedulerPredicates suite. When this and #26695 are merged it should be passing in serial.
Automatic merge from submit-queue
Revert revert of adding resource constraints for master components in density tests
The problem was the time when resource constraints were generated. It turns out that the provider is not set there. This version should work.
cc @roberthbailey @alex-mohr
Automatic merge from submit-queue
Add direct serializer
Fix #25589. Implemented a direct codec that doesn't do conversion, but sets the group, version and kind before serialization as Clayton suggested [here](https://github.com/kubernetes/kubernetes/issues/25589#issuecomment-219168009).
First commit is cherry-picked from #24826.
@kubernetes/sig-api-machinery
Automatic merge from submit-queue
kubelet e2e: bumping cpu limit
The previous limit was too aggressive and caused kubernetes-e2e-gce-serial build 1404 to fail.
Automatic merge from submit-queue
Change Kubelet test to run also in large clusters
This should fix the recent failure of kubelet test in 2000-node cluster.
@zmerlynn - FYI
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Automatic merge from submit-queue
SSH e2e: Limit to 100 nodes, limit combinatorics
This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing anything
important.
Fixes #26600
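In outline, the change amounts to capping the host list and running the extra variants against one host only. The names below (`limitHosts`, the variant strings) are illustrative assumptions, not the identifiers used in the test.

```go
package main

import "fmt"

// limitHosts caps the list of hosts used for the "for all hosts" check.
func limitHosts(hosts []string, max int) []string {
	if len(hosts) > max {
		return hosts[:max]
	}
	return hosts
}

func main() {
	hosts := make([]string, 0, 250)
	for i := 0; i < 250; i++ {
		hosts = append(hosts, fmt.Sprintf("node-%d", i))
	}
	hosts = limitHosts(hosts, 100)

	// The basic command would run on every remaining host.
	fmt.Println("basic check on", len(hosts), "hosts")

	// The other command variants run on the first host only.
	for _, variant := range []string{"stdout", "stderr", "non-zero exit"} {
		fmt.Println("variant", variant, "on", hosts[0])
	}
}
```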
This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing anything
important.
Fixes #26600
Automatic merge from submit-queue
kubelet e2e: set cpu/memory limits for docker 1.11
Docker 1.11 consumes more memory. Bump the limit to fix the tests. Also add
new limits for the 100-pod resource usage tracking test.
This fixes #26495
Automatic merge from submit-queue
Fix some gce-only tests to run on gke as well
Enable "Services should work after restarting apiserver [Disruptive]" and DaemonRestart tests, except the 2 that require master ssh access.
Move restart/upgrade related test helpers into their own file in framework package.
- Exit non-0 if infrastructure failures happen
- Exit 0 if no infrastructure failures happen regardless of test results
(Jenkins will use junit.xml to determine test results)
Automatic merge from submit-queue
Use ubuntu-slim to reduce size of the iperf:e2e image
from
```
gcr.io/google_containers/iperf e2e 8b3cc7064090 5 weeks ago 737.9 MB
```
to
```
gcr.io/google_containers/iperf e2e 204325491636 33 seconds ago 61.09 MB
```
related to https://github.com/kubernetes/kubernetes/pull/25784#issuecomment-221706886
ping @bprashanth
Automatic merge from submit-queue
Support per-test-environment ginkgo flags for node e2e tests to facilitate skipping misbehaving tests in PR builder
We had an issue today where some node e2e tests were timing out in the pr builder. We want to be able to skip tests in the pr builder and leave them running in the CI if this happens again.
Automatic merge from submit-queue
Make Privileged pods node e2e use the framework
Made the test more readable along the way with more logs. This should help us triage failures/flakes in the future.
#24577
Automatic merge from submit-queue
Use pause image depending on the server's platform when testing
Removed all pause image constant strings, now the pause image is chosen by arch. Part of the effort of making e2e arch-agnostic.
The pause image name and version is also now only in two places, and it's documented to bump both
Also removed "amd64" constants in the code. Such constants should be replaced by `runtime.GOARCH` or by looking up the server platform
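Choosing the image by architecture instead of hard-coding "amd64" could be sketched as follows. The registry name, tag, and the `pauseImageFor` helper are placeholders for illustration, not the values used in the tree.

```go
package main

import (
	"fmt"
	"runtime"
)

// pauseImageFor builds an architecture-specific image name.
// The "example.invalid" registry and the "3.0" tag are placeholders.
func pauseImageFor(arch string) string {
	return fmt.Sprintf("example.invalid/pause-%s:3.0", arch)
}

func main() {
	// runtime.GOARCH is the architecture of the running binary; for a remote
	// cluster the server's platform would have to be looked up instead.
	fmt.Println(pauseImageFor(runtime.GOARCH))
}
```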
Fixes: #22876 and #15140
Makes it easier for: #25730
Related: #17981
This is for `v1.3`
@ixdy @thockin @vishh @kubernetes/sig-testing @andyzheng0831 @pensu
Automatic merge from submit-queue
Attach Detach Controller Business Logic
This PR adds the meat of the attach/detach controller proposed in #20262.
The PR splits the in-memory cache into a desired and actual state of the world.
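As a rough illustration of the desired/actual split (not the controller's real types), a reconciler can diff the two states and derive the attach and detach operations. All names here are hypothetical:

```go
package main

import "fmt"

// attachment identifies one volume attached to one node (hypothetical type).
type attachment struct {
	Volume string
	Node   string
}

// reconcile compares the desired and actual states of the world and returns
// the attachments that still need to be created and the ones to remove.
func reconcile(desired, actual map[attachment]bool) (toAttach, toDetach []attachment) {
	for a := range desired {
		if !actual[a] {
			toAttach = append(toAttach, a)
		}
	}
	for a := range actual {
		if !desired[a] {
			toDetach = append(toDetach, a)
		}
	}
	return toAttach, toDetach
}

func main() {
	desired := map[attachment]bool{
		{Volume: "vol-a", Node: "node-1"}: true,
		{Volume: "vol-b", Node: "node-1"}: true,
	}
	actual := map[attachment]bool{
		{Volume: "vol-b", Node: "node-1"}: true,
		{Volume: "vol-c", Node: "node-2"}: true,
	}
	toAttach, toDetach := reconcile(desired, actual)
	fmt.Println("attach:", toAttach)
	fmt.Println("detach:", toDetach)
}
```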
Automatic merge from submit-queue
Flake 21484: retrieve pod log during e2e error
Print the pod log when an error occurs in
> Proxy version 1 should proxy through a service and a pod [Conformance]
e2e test. This will help to understand flake https://github.com/kubernetes/kubernetes/issues/21484 better.
Many tests expect all kube-system pods to be running and ready. The newly
added image prepull add-on pod can be in the "succeeded" state. This commit fixes
the tests to allow kube-system pods to be succeeded.
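The relaxed check could be sketched as follows. The helper name and the plain-string phases are assumptions made to keep the example free of API imports:

```go
package main

import "fmt"

// podAcceptable reports whether a kube-system pod should pass the pre-test
// health check: either running and ready, or already succeeded.
func podAcceptable(phase string, ready bool) bool {
	switch phase {
	case "Running":
		return ready
	case "Succeeded":
		// A one-shot pod such as an image prepuller ends up here.
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(podAcceptable("Running", true))
	fmt.Println(podAcceptable("Succeeded", false))
	fmt.Println(podAcceptable("Pending", false))
}
```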
Automatic merge from submit-queue
Downward API implementation for resources limits and requests
This is an implementation of Downward API for resources limits and requests, and it works with environment variables and volume plugin.
This is based on proposal https://github.com/kubernetes/kubernetes/pull/24051. This implementation follows API with magic keys approach as discussed in the proposal.
@kubernetes/rh-cluster-infra
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
- Add junit test reporter
- Write etcd.log, kubelet.log and kube-apiserver.log to files instead of stdout
- Scp artifacts to the jenkins WORKSPACE
Fixes #25966
Automatic merge from submit-queue
in e2e test, when kubectl exec fails to find the container to run a command, it should retry
fix #26076
Without retrying upon "container not found" error, `Pod Disks` test failed on the following error:
```console
[k8s.io] Pod Disks
should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:108
STEP: Creating a kubernetes client
May 23 19:18:02.254: INFO: >>> TestContext.KubeConfig: /root/.kube/config
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:69
[It] should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
STEP: creating PD1
May 23 19:18:06.678: INFO: Successfully created a new PD: "rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c".
STEP: creating PD2
May 23 19:18:11.216: INFO: Successfully created a new PD: "rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c".
May 23 19:18:11.216: INFO: PD Read/Writer Iteration #0
STEP: submitting host0Pod to kubernetes
W0523 19:18:11.279910 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: writing a file in the container
May 23 19:18:39.088: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1394466581702052925' > '/testpd1/tracker0''
May 23 19:18:40.250: INFO: Wrote value: "1394466581702052925" to PD1 ("rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
May 23 19:18:40.251: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1740704063962701662' > '/testpd2/tracker0''
May 23 19:18:41.433: INFO: Wrote value: "1740704063962701662" to PD2 ("rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
May 23 19:18:41.433: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:42.585: INFO: Read file "/testpd1/tracker0" with content: 1394466581702052925
STEP: reading a file in the container
May 23 19:18:42.585: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
May 23 19:18:43.779: INFO: Read file "/testpd2/tracker0" with content: 1740704063962701662
STEP: deleting host0Pod
May 23 19:18:44.048: INFO: PD Read/Writer Iteration #1
STEP: submitting host0Pod to kubernetes
W0523 19:18:44.132475 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
May 23 19:18:45.186: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:46.290: INFO: error running kubectl exec to read file: exit status 1
stdout=
stderr=error: error executing remote command: error executing command in container: container not found ("mycontainer")
)
May 23 19:18:46.290: INFO: Error reading file: exit status 1
May 23 19:18:46.290: INFO: Unexpected error occurred: exit status 1
```
Now I've run this fix on e2e pd test 5 times and no longer see any failure
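The retry idea can be sketched roughly like this. `execFunc` is a hypothetical hook standing in for one `kubectl exec` invocation, and folding the stderr text into the returned error is an assumption made to keep the example small:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// errContainerNotFound imitates the failure from the log above.
var errContainerNotFound = errors.New(`container not found ("mycontainer")`)

// execFunc is a hypothetical stand-in for a single `kubectl exec` call.
type execFunc func() (string, error)

// retryOnContainerNotFound calls run until it succeeds, fails with a
// different error, or attempts are used up. attempts should be at least 1.
func retryOnContainerNotFound(run execFunc, attempts int, delay time.Duration) (string, error) {
	var out string
	var err error
	for i := 0; i < attempts; i++ {
		out, err = run()
		if err == nil || !strings.Contains(err.Error(), "container not found") {
			return out, err
		}
		time.Sleep(delay)
	}
	return out, err
}

func main() {
	calls := 0
	out, err := retryOnContainerNotFound(func() (string, error) {
		calls++
		if calls < 3 {
			// The container has not been started yet.
			return "", errContainerNotFound
		}
		return "1394466581702052925", nil
	}, 5, 10*time.Millisecond)
	fmt.Println(out, err, calls)
}
```

Only the "container not found" case is retried, so any other failure still surfaces immediately.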
Automatic merge from submit-queue
Don't dump everything in kubemarks
Don't dump all events etc. in kubemark failures, those are useless anyway with that amount of data.
Automatic merge from submit-queue
Fix panic in auth test failure
I got a spurious failure in the webhook integration test, but couldn't see the error returned because a panic was hit that assumed a body was always returned with the response
Automatic merge from submit-queue
E2e tests for GKE cluster with local SSD.
The test covers creating a node pool with local SSD and scheduling a pod that writes to and reads from it. The pod accesses the local disk via hostPath.
```release-note
E2e tests for GKE cluster with local SSD.
```
Automatic merge from submit-queue
kubelet/cadvisor: Refactor cadvisor disk stat/usage interfaces.
basically
1) cadvisor struct will know what runtime the kubelet is, passed in via additional argument to New()
2) rename the cadvisor wrapper function from DockerImagesFsInfo() to ImagesFsInfo() and have the linux implementation choose a label based on the runtime inside the cadvisor struct
2a) mock/fake/unsupported modified to take the same additional argument in New()
3) kubelet's wrapper for the cadvisor wrapper is renamed in parallel
4) make all tests use new interface
Automatic merge from submit-queue
Cache Webhook Authentication responses
Add a simple LRU cache w/ 2 minute TTL to the webhook authenticator.
Kubectl is a little spammy, w/ >= 4 API requests per command. This also prevents a single unauthenticated user from being able to DOS the remote authenticator.
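For illustration only, an LRU cache with a TTL can be built from `container/list` and a map. This is a sketch under assumed names, not the cache added by the PR, and it is not safe for concurrent use without a lock:

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

// defaultTTL matches the 2 minute TTL mentioned above.
const defaultTTL = 2 * time.Minute

type entry struct {
	token   string
	ok      bool
	expires time.Time
}

// ttlLRU is a small LRU cache whose entries also expire after ttl.
type ttlLRU struct {
	max   int
	ttl   time.Duration
	now   func() time.Time // replaceable clock
	order *list.List       // front is the most recently used entry
	items map[string]*list.Element
}

func newTTLLRU(max int, ttl time.Duration) *ttlLRU {
	return &ttlLRU{max: max, ttl: ttl, now: time.Now, order: list.New(), items: map[string]*list.Element{}}
}

// get returns the cached decision and whether a live entry was found.
func (c *ttlLRU) get(token string) (ok, found bool) {
	el, hit := c.items[token]
	if !hit {
		return false, false
	}
	e := el.Value.(*entry)
	if !c.now().Before(e.expires) {
		// Expired entries are dropped on access.
		c.order.Remove(el)
		delete(c.items, token)
		return false, false
	}
	c.order.MoveToFront(el)
	return e.ok, true
}

// add stores a decision and evicts the least recently used entry when full.
func (c *ttlLRU) add(token string, ok bool) {
	if el, hit := c.items[token]; hit {
		c.order.Remove(el)
		delete(c.items, token)
	}
	c.items[token] = c.order.PushFront(&entry{token: token, ok: ok, expires: c.now().Add(c.ttl)})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).token)
	}
}

func main() {
	c := newTTLLRU(2, defaultTTL)
	c.add("token-a", true)
	ok, found := c.get("token-a")
	fmt.Println(ok, found)
}
```

Caching negative answers as well as positive ones is what keeps repeated bad tokens from reaching the remote authenticator.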
Automatic merge from submit-queue
Extend secrets volumes with path control
As per [1] this PR extends secrets mapped into volume with:
* key-to-path mapping the same way as is for configmap. E.g.
```
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "mypod",
"namespace": "default"
},
"spec": {
"containers": [{
"name": "mypod",
"image": "redis",
"volumeMounts": [{
"name": "foo",
"mountPath": "/etc/foo",
"readOnly": true
}]
}],
"volumes": [{
"name": "foo",
"secret": {
"secretName": "mysecret",
"items": [{
"key": "username",
"path": "my-username"
}]
}
}]
}
}
```
Here the added ``spec.volumes[0].secret.items`` changes the original target ``/etc/foo/username`` to ``/etc/foo/my-username``.
* secondly, refactoring the ``pkg/volumes/secrets/secrets.go`` volume plugin to use ``AtomicWriter`` to project a secret into files.
[1] https://github.com/kubernetes/kubernetes/blob/master/docs/design/configmap.md#changes-to-secret
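The key-to-path mapping can be pictured with a small sketch. The type and function names are hypothetical, and the behaviour for a missing key (returning an error) is an assumption modelled on the ConfigMap design rather than taken from this PR:

```go
package main

import (
	"fmt"
	"sort"
)

// keyToPath mirrors one "key"/"path" pair from the items list.
type keyToPath struct {
	Key  string
	Path string
}

// projectSecret returns a map of relative file path to content. With no
// items, every key is written under its own name. With items, only the
// listed keys are written, each at its mapped path.
func projectSecret(data map[string][]byte, items []keyToPath) (map[string][]byte, error) {
	files := make(map[string][]byte)
	if len(items) == 0 {
		for key, value := range data {
			files[key] = value
		}
		return files, nil
	}
	for _, item := range items {
		value, ok := data[item.Key]
		if !ok {
			return nil, fmt.Errorf("secret has no key %q", item.Key)
		}
		files[item.Path] = value
	}
	return files, nil
}

func main() {
	data := map[string][]byte{"username": []byte("admin"), "password": []byte("hunter2")}
	files, err := projectSecret(data, []keyToPath{{Key: "username", Path: "my-username"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("/etc/foo/%s\n", p)
	}
}
```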
Automatic merge from submit-queue
Check status of framework.CheckPodsRunningReady
Check status of framework.CheckPodsRunningReady and fail test if it's false, instead of silently
ignoring the failure.
This doesn't fix whatever is causing the pod not to start in #17523 but it does fail the test as soon as it detects the pod didn't start, instead of allowing the testing to proceed.
cc @kubernetes/sig-testing @spxtr @ixdy @kubernetes/rh-cluster-infra
I removed the netexec and goproxy pods from the proxy exec test. Instead
it now runs kubectl locally and the proxy is running in-process. Since
Go won't proxy for localhost requests, this test cannot pass if the API
server is local. However it was already disabled for local clusters.
Automatic merge from submit-queue
SchedulerPredicates e2e test: be more verbose about requested resource
When the ``validates resource limits of pods that are allowed to run [Conformance]`` test is run, the logs could give more information about the requested resource and say that it is cpu, expressed in milli units.
cpu is stored in m units here:
```
nodeToCapacityMap[node.Name] = capacity.MilliValue()
```
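To make the unit explicit in a log line, a value in milli units can be printed together with its equivalent in whole cores. The helper below is a hypothetical sketch, not the logging code from the test:

```go
package main

import "fmt"

// describeCPU renders a CPU quantity given in milli units (1000m = 1 core).
func describeCPU(milli int64) string {
	return fmt.Sprintf("%dm (%.3f cores)", milli, float64(milli)/1000)
}

func main() {
	fmt.Println("requested cpu:", describeCPU(500))
}
```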
Automatic merge from submit-queue
Add a timeout to the node e2e Ginkgo test runner
Also add a few debugging statements to indicate progress.
Should help prevent #25639, since we'll timeout tests before Jenkins times out the build.
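One way to bound a test run is `exec.CommandContext` with a deadline. This is a sketch under assumptions: `runWithTimeout` is a hypothetical helper, and `sleep` merely stands in for the real test binary:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// demoTimeout is an arbitrary short deadline used for the demonstration.
const demoTimeout = 100 * time.Millisecond

// runWithTimeout runs the command and kills it once timeout has elapsed.
func runWithTimeout(timeout time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		// Report the timeout explicitly instead of the raw kill signal.
		return out, fmt.Errorf("%s timed out after %v", name, timeout)
	}
	return out, err
}

func main() {
	fmt.Println("starting command")
	_, err := runWithTimeout(demoTimeout, "sleep", "5")
	fmt.Println("finished:", err)
}
```

Choosing a timeout shorter than the Jenkins build timeout lets the runner report a test failure itself instead of being killed from outside.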
Automatic merge from submit-queue
gcr.io/google_containers/mounttest: use Stat instead of Lstat
The current ``mt.go`` implementation uses ``os.Lstat`` instead of ``os.Stat``, and ``os.Lstat`` does not follow symlinks. Because ``AtomicWriter`` relies on symlinks, the updated secret volume implementation that uses ``AtomicWriter`` cannot be tested for secret file permissions. Replacing ``Lstat`` with ``Stat`` follows symlinks and returns the permissions of the target file. The change affects the ``--file_perm`` and ``--file_mode`` options only.
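The difference is easy to see in a small standalone program. This is an illustrative sketch, not the ``mt.go`` code, and ``statDemo`` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// statDemo creates a file with the given permissions and a symlink to it,
// then reports whether Lstat sees a symlink and which permissions Stat sees.
func statDemo(perm os.FileMode) (lstatIsSymlink bool, statPerm os.FileMode, err error) {
	dir, err := os.MkdirTemp("", "stat-demo")
	if err != nil {
		return false, 0, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "target")
	if err := os.WriteFile(target, []byte("data"), perm); err != nil {
		return false, 0, err
	}
	// Chmod explicitly so the process umask cannot narrow the permissions.
	if err := os.Chmod(target, perm); err != nil {
		return false, 0, err
	}
	link := filepath.Join(dir, "link")
	if err := os.Symlink(target, link); err != nil {
		return false, 0, err
	}

	li, err := os.Lstat(link) // describes the link itself
	if err != nil {
		return false, 0, err
	}
	si, err := os.Stat(link) // follows the link to the target
	if err != nil {
		return false, 0, err
	}
	return li.Mode()&os.ModeSymlink != 0, si.Mode().Perm(), nil
}

func main() {
	isLink, perm, err := statDemo(0644)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Lstat reports a symlink:", isLink)
	fmt.Println("Stat reports target permissions:", perm)
}
```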
``mounttest`` image is currently used by:
##### downwardapi_volume.go
- e2e: Downward API volume
- version: 0.6
- args: --file_content, --break_on_expected_content, --retry_time, --file_content_in_loop
##### empty_dir.go
- e2e: EmptyDir volumes
- version: 0.5
- args: --file_perm, --file_perm, ...
##### host_path.go
- e2e: hostPath
- version: 0.6
- args: --file_mode, ...
##### configmap.go
- e2e: ConfigMap
- version: 0.6
- args: --file_content, --break_on_expected_content, --retry_time, --file_content_in_loop
##### service_accounts.go
- e2e: ServiceAccounts
- version: 0.2
- args: --file_content
Some of the e2e tests use at least one of the affected options. Locally, I have updated all versions of the mounttest image to 0.7. All e2e tests pass with the new image.
- create 100 PV, ranging from 0 to 99GB; create 1 PVC to claim 50GB. Verify only one PV is bound and rest are pending
- create 2 PVs with different access modes (RWM, RWO), 1 PVC to claim RWM PV. Verify RWM is bound and RWO is not bound.
Signed-off-by: Huamin Chen <hchen@redhat.com>
Automatic merge from submit-queue
Refactor persistent volume controller
Here is the complete persistent volume controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go
It's feature complete and compatible with current binder/recycler/provisioner. No new features, it *should* be much more stable and predictable.
Testing
--
The unit test framework is quite complicated, but it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested parts are error cases, which are quite hard to test in a reasonable way. Sure, I can inject a VersionConflictError on any object update and check that the error bubbles up to the appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check that it recovers appropriately from the error in the next periodic sync. That's the hard part.
Organization
---
The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around.
[`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage].
I tried to split the PR into smaller patches, let me know what you think.
~~TODO~~
--
* ~~Missing: provisioning, recycling~~.
* ~~Fix integration tests~~
* ~~Fix e2e tests~~
@kubernetes/sig-storage
Fixes #15632
The key-to-path mapping allows a pod to specify a different name (and thus location) for each secret.
At the same time, refactor the volume plugin to use AtomicWriter to project secrets to files in a volume.
Update the e2e Secrets test: the secret file permission has changed from 0444 to 0644.
Remove TestPluginIdempotent, as the AtomicWriter is responsible for secret creation.
Automatic merge from submit-queue
Add init containers to pods
This implements #1589 as per proposal #23666
Incorporates feedback on #1589, creates parallel structure for InitContainers and Containers, adds validation for InitContainers that requires name uniqueness, and comments on a number of implications of init containers.
This is a complete alpha implementation.
Automatic merge from submit-queue
Add pod status/ready/restartCount conformance test
add more test cases to cover containers which will be terminated/running/failed/pending.
Signed-off-by: liang chenye <liangchenye@huawei.com>
Automatic merge from submit-queue
The remaining API changes for PodDisruptionBudget.
It's mostly the boilerplate required for the registry, some extra codegen, and a few tests.
Will squash once we're sure it's good.
Automatic merge from submit-queue
[e2e] kubectl stdin
Problem: Currently kubectl heavily relies on files which have to be (for lack of a better word :):):)) "written" to the file system. This hinders adoption of something like gobindata, by forcing an intermediary generated-assets directory type thing.
Solution: Let's migrate `kubectl.go` testing over to using standard input streams.
cc @kubernetes/sig-testing @timothysc
Automatic merge from submit-queue
Add pod condition PodScheduled to detect situation when scheduler tried to schedule a Pod, but failed
Set the `PodScheduled` condition to `ConditionFalse` in `scheduleOne()` if scheduling failed and to `ConditionTrue` in the `/bind` subresource.
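A rough sketch of how such a condition can be set or updated in place. The struct, the helper and the `Reason` value are simplified stand-ins chosen for illustration, not the API types touched by this PR:

```go
package main

import "fmt"

// podCondition is a simplified stand-in for a pod status condition.
type podCondition struct {
	Type   string
	Status string
	Reason string
}

// setCondition replaces the condition with the same Type or appends it.
// The second result reports whether the list actually changed.
func setCondition(conds []podCondition, c podCondition) ([]podCondition, bool) {
	for i := range conds {
		if conds[i].Type == c.Type {
			if conds[i] == c {
				return conds, false
			}
			conds[i] = c
			return conds, true
		}
	}
	return append(conds, c), true
}

func main() {
	var conds []podCondition
	// Scheduling failed: record why the pod is still pending.
	conds, _ = setCondition(conds, podCondition{Type: "PodScheduled", Status: "False", Reason: "Unschedulable"})
	// Binding succeeded later: flip the same condition to true.
	conds, _ = setCondition(conds, podCondition{Type: "PodScheduled", Status: "True"})
	fmt.Println(conds)
}
```

Returning a "changed" flag lets the caller skip a status update when nothing differs.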
Ref #24404
@mml (as it seems to be related to "why pending" effort)
Automatic merge from submit-queue
Webhook Token Authenticator
Add a webhook token authenticator plugin to allow a remote service to make authentication decisions.
Automatic merge from submit-queue
Move internal types of hpa from pkg/apis/extensions to pkg/apis/autoscaling
ref #21577
@lavalamp could you please review or delegate to someone from CSI team?
@janetkuo could you please take a look into the kubelet changes?
cc @fgrzadkowski @jszczepkowski @mwielgus @kubernetes/autoscaling