Automatic merge from submit-queue (batch tested with PRs 39001, 39104, 35978, 39361, 39273)
refactored admission to avoid internal client references
Refactored admission to avoid internal client references. This required switching to plugin initializers for them. And that required some rewiring of the plugin initializers.
Technically I can decouple this from the other two commits, but I'm optimistic that those will go through easily. This is slightly more invasive, but I'd like to shoot for pre-Christmas to avoid new admission plugins coming through and breaking bits.
@sttts @derekwaynecarr
Automatic merge from submit-queue (batch tested with PRs 39092, 39126, 37380, 37093, 39237)
Endpoints with TolerateUnready annotation, should list Pods in state terminating
**What this PR does / why we need it**:
We are using preStop lifecycle hooks to gracefully remove a node from a cluster. This hook is potentially long running, and after the preStop hook is fired, DNS resolution of the soon-to-be-stopped Pod starts failing, which causes failures there.
**Special notes for your reviewer**:
Would be great to backport that to 1.4, 1.3
**Release note**:
```release-note
Endpoints that tolerate unready Pods now also list Pods in the Terminating state
```
@bprashanth
Automatic merge from submit-queue (batch tested with PRs 39150, 38615)
Add work queues to PV controller
PV controller should not use Controller.Requeue, as it is not available in
shared informers. We need to implement our own work queues instead, where we
can enqueue volumes/claims as we want.
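For illustration, a minimal sketch of the work-queue shape this implies, using the client-go workqueue package; the queue and function names here are illustrative and not the PV controller's actual wiring:
```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// syncClaim stands in for the controller's real sync logic.
func syncClaim(key string) error {
	fmt.Println("syncing claim", key)
	return nil
}

func main() {
	// Rate-limited queue; informer event handlers would enqueue "namespace/name" keys.
	claimQueue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	claimQueue.Add("default/my-claim")

	// Worker loop: pop a key, sync it, requeue with backoff on failure.
	for claimQueue.Len() > 0 {
		key, quit := claimQueue.Get()
		if quit {
			return
		}
		if err := syncClaim(key.(string)); err != nil {
			claimQueue.AddRateLimited(key) // retry later with backoff
		} else {
			claimQueue.Forget(key) // clear the retry counter on success
		}
		claimQueue.Done(key)
	}
}
```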
Automatic merge from submit-queue
Add Persistent Volume E2E in the context of a disrupted kubelet
This PR adds a test suite for persistent volumes affected by a disrupted kubelet. Two cases are presented:
1. A volume mounted via PVC remains accessible after a kubelet restart.
2. When a pod is deleted while the kubelet is down, the mounted volume is unmounted successfully.
Automatic merge from submit-queue
Remove all MAINTAINER statements in the codebase as they are deprecated
**What this PR does / why we need it**:
ref: https://github.com/docker/docker/pull/25466
**Release note**:
```release-note
Remove all MAINTAINER statements in Dockerfiles in the codebase as they are deprecated by docker
```
@ixdy @thockin (who else should be notified?)
Automatic merge from submit-queue
Remove system:anonymous check from kubectl test
This verbiage doesn't appear when the cluster is `AlwaysAllow` (and just makes the check more brittle).
Follow-on to #39263, this is the last (consistent) failure on [kops-aws](https://k8s-testgrid.appspot.com/google-aws#kops-aws&sort-by-failures=)
Automatic merge from submit-queue
Avoid unnecessary memory allocations
Low-hanging fruit in saving memory allocations. During our 5000-node kubemark runs I've seen this:
ControllerManager:
- 40.17% k8s.io/kubernetes/pkg/util/system.IsMasterNode
- 19.04% k8s.io/kubernetes/pkg/controller.(*PodControllerRefManager).Classify
Scheduler:
- 42.74% k8s.io/kubernetes/plugin/pkg/scheduler/algorithm/predicates.(*MaxPDVolumeCountChecker).filterVolumes
This PR eliminates all of those.
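As an illustration of the pattern being eliminated (hedged: the actual changes to these functions may differ, for example by dropping regexps for plain string matching), hoisting a per-call regexp compilation out of the hot path removes the repeated allocations:
```go
package main

import (
	"fmt"
	"regexp"
)

// Compiling the pattern on every call allocates each time; on a hot path
// (e.g. called for every node on every sync) this shows up in allocation
// profiles like the ones above.
func isMasterNodeSlow(name string) bool {
	return regexp.MustCompile(`-master(-\d+)?$`).MatchString(name)
}

// Compiling once and reusing the compiled regexp removes the per-call allocations.
var masterNameRE = regexp.MustCompile(`-master(-\d+)?$`)

func isMasterNodeFast(name string) bool {
	return masterNameRE.MatchString(name)
}

func main() {
	fmt.Println(isMasterNodeSlow("kubemark-master"), isMasterNodeFast("kubemark-master"))
}
```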
Automatic merge from submit-queue
CreateNodeSelectorPods should respect parameter
Fix (1): `CreateNodeSelectorPods` should respect the parameter `id`.
The existing e2e does not break because it happened to use "node-selector" as the id, which is the same as the hard-coded value.
Fix (2): The current `CreateNodeSelectorPods` does not use the `nodeSelector` parameter; it hard-codes a label instead.
The current e2e is not affected because we happened to use the same label: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/cluster_size_autoscaling.go#L177
Found these bugs while testing #36238
Automatic merge from submit-queue
Begin paths for internationalization in kubectl
This is just the first step, purposely simple so we can get the interface correct.
@kubernetes/sig-cli @deads2k
Automatic merge from submit-queue
Support loading UTF16 files if a byte-order-mark is present
Add support in kubectl for loading UTF16 encoded files if they have a correct BOM (Byte-Order-Mark https://en.wikipedia.org/wiki/Byte_order_mark) at the beginning
of the file. Falls back to UTF8 encoding if no understandable BOM is present.
Fixes part of https://github.com/kubernetes/kubernetes/issues/39007
@fabianofranz @deads2k @kubernetes/sig-cli-misc
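A minimal, standard-library-only sketch of the BOM-based detection with a UTF-8 fallback described above; kubectl's actual implementation may differ in details:
```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeMaybeUTF16 decodes data as UTF-16 if it starts with a recognizable BOM,
// otherwise it assumes UTF-8.
func decodeMaybeUTF16(data []byte) string {
	var order binary.ByteOrder
	switch {
	case bytes.HasPrefix(data, []byte{0xFE, 0xFF}):
		order = binary.BigEndian
	case bytes.HasPrefix(data, []byte{0xFF, 0xFE}):
		order = binary.LittleEndian
	default:
		return string(data) // no BOM: assume UTF-8
	}
	payload := data[2:]
	u16 := make([]uint16, 0, len(payload)/2)
	for i := 0; i+1 < len(payload); i += 2 {
		u16 = append(u16, order.Uint16(payload[i:i+2]))
	}
	return string(utf16.Decode(u16))
}

func main() {
	utf16le := []byte{0xFF, 0xFE, 'h', 0, 'i', 0}
	fmt.Println(decodeMaybeUTF16(utf16le))      // "hi"
	fmt.Println(decodeMaybeUTF16([]byte("hi"))) // plain UTF-8 passes through
}
```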
Automatic merge from submit-queue (batch tested with PRs 39059, 39175, 35676, 38655)
ReplicaSet has owner ref of the Deployment that created it
**What this PR does / why we need it**:
This enables garbage collection for ReplicaSets and ensures they are owned by their respective Deployment objects.
fixes https://github.com/kubernetes/kubernetes/issues/33845
This is an initial PR to get feedback. Will update it quickly with unit tests if this seems to be going in the right direction
Automatic merge from submit-queue
In-cluster configs must take flag overrides into account
**What this PR does / why we need it**: Some flags must override in-cluster configs if provided to `kubectl` inside a cluster.
**Which issue this PR fixes**: Fixes https://github.com/kubernetes/kubernetes/issues/38834
**Release note**:
```release-note
Fixed a bug where the --server, --token, and --certificate-authority flags were not overriding the related in-cluster configs when provided in a `kubectl` call inside a cluster.
```
Automatic merge from submit-queue
remove unneeded authenticator dependencies from genericapiserver
Refactors the authenticator options to remove unneeded dependencies.
@sttts
Automatic merge from submit-queue (batch tested with PRs 39146, 39094)
cleanup last e2e authorization failures
Builds on https://github.com/kubernetes/kubernetes/pull/39080. This adds rbac role bindings during e2e tests for test that use SA permissions to loopback to the API server.
Assigned to me until it's ready.
Automatic merge from submit-queue
Node E2E: Set user with `--ssh-user` flag when running remote node e2e.
This PR unblocks https://github.com/kubernetes/test-infra/issues/1348.
In our test environment, we must log in to the test instance as user `jenkins` because of the service account. Node e2e always uses the default user on the host, which has worked fine until now, because it is always run as `jenkins` in our test environment.
However, now that we have moved the test runner into a docker container, the user inside the container is `root` by default, which causes the error:
```
Permission denied (publickey)
```
This PR adds a `--ssh-user` flag to explicitly specify the user used to ssh into the test instance. The dockerized test runner can set the user to `jenkins` with this flag.
@krzyzacy @ixdy
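A small sketch of the flag behavior described above (default to the current user, allow an explicit override); the surrounding runner code is omitted and the helper shown is illustrative:
```go
package main

import (
	"flag"
	"fmt"
	"os/user"
)

func main() {
	// Default to the user running the test runner, as before.
	def := ""
	if cur, err := user.Current(); err == nil {
		def = cur.Username
	}
	// --ssh-user lets the dockerized runner override it, e.g. --ssh-user=jenkins.
	sshUser := flag.String("ssh-user", def, "user to ssh into the test instance as")
	flag.Parse()

	fmt.Printf("would ssh as %s@<test-instance>\n", *sshUser)
}
```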
Automatic merge from submit-queue
register batch/jobs to federation-apiserver
register batch/jobs api objects to federation-apiserver
**Release note**:
```release-note
Federation: Add `batch/jobs` API objects to federation-apiserver
```
@quinton-hoole @nikhiljindal @deepak-vij
#34261
Automatic merge from submit-queue
Added 'hollow'-node-problem-detector to hollow-nodes in kubemark
Added node-problem-detector container in kubemark hollow-nodes, which takes in a 'hollow' (having an empty list of rules and conditions) kernel monitor config.
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 36751, 38968)
Convert * users/groups to system:authenticated group in ABAC
Part of enabling anonymous auth by default in 1.6 is protecting earlier policies that did not intend to grant access to anonymous users.
This modifies ABAC policies that match `user` or `group` `*` to only match authenticated users.
Docs PR to update examples to use `system:authenticated` or `system:unauthenticated` groups explicitly: https://github.com/kubernetes/kubernetes.github.io/pull/1992
```release-note
ABAC policies using "user":"*" or "group":"*" to match all users or groups will only match authenticated requests. To match unauthenticated requests, ABAC policies must explicitly specify "group":"system:unauthenticated"
```
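A hedged sketch of the tightened wildcard semantics, not the actual ABAC authorizer code; it only illustrates that `*` now matches authenticated requests, and that unauthenticated requests need an explicit `system:unauthenticated` group match:
```go
package main

import "fmt"

type policy struct {
	User  string
	Group string
}

type attrs struct {
	User          string
	Groups        []string
	Authenticated bool
}

// subjectMatches is a simplified stand-in for the ABAC subject check.
func subjectMatches(p policy, a attrs) bool {
	if p.User == "*" || p.Group == "*" {
		return a.Authenticated // "*" no longer matches unauthenticated requests
	}
	if p.User != "" && p.User == a.User {
		return true
	}
	for _, g := range a.Groups {
		if p.Group != "" && p.Group == g {
			return true
		}
	}
	return false
}

func main() {
	wildcard := policy{User: "*"}
	fmt.Println(subjectMatches(wildcard, attrs{User: "alice", Authenticated: true})) // true
	fmt.Println(subjectMatches(wildcard, attrs{User: "system:anonymous"}))           // false
	explicit := policy{Group: "system:unauthenticated"}
	fmt.Println(subjectMatches(explicit, attrs{Groups: []string{"system:unauthenticated"}})) // true
}
```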
Automatic merge from submit-queue
Moved kubemark master from Debian to GCI
This PR fixes issue #37484
Kubemark master now runs on GCI instead of Debian, taking it one step closer to a real cluster master.
Primary changes:
1. changing the master VM image/OS in kubemark's config-default.sh to GCI
2. moving kubelet to systemd from supervisord
3. changing directory for cert/key/csv files from /srv/kubernetes to /etc/srv/kubernetes
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue
Add test to detach a pd whose node was deleted
**What this PR does / why we need it**:
A test for the following issue :
If a node with a GCE PD attached is deleted (before the volume is detached), subsequent attempts by the attach/detach controller to detach it should not fail.
**Bonus**: Added additional code to ensure that the pd can still be attached to a different node.
Edit: Removed it as it was making the test much slower.
https://github.com/kubernetes/kubernetes/issues/29358
Automatic merge from submit-queue (batch tested with PRs 38426, 38917, 38891, 38935)
Support different image during GCE node upgrade
**What this PR does / why we need it**: It lets GCE upgrade tests upgrade to a GCI node image.
**Which issue this PR fixes**: fixes #37855
Automatic merge from submit-queue (batch tested with PRs 38942, 38958)
Added MULTIZONE flag to e2e remove master script.
Added MULTIZONE flag to e2e remove master script. The script is used by HA tests which set-up multizone cluster.
Automatic merge from submit-queue (batch tested with PRs 34353, 33837, 38878)
Add e2e test for configmap volume
There are two patches:
- refactor e2e volume tests to allow multiple volumes mounted into single pod
- add a test for ConfigMap volume mounted twice to test #28502
Automatic merge from submit-queue (batch tested with PRs 34353, 33837, 38878)
Gce persistentvolume testing
Add E2E PersistentVolume test for a GCE environment. Tests that deleting a PV or PVC before the referencing pod does not fail on unmount and detach during pod deletion.
cc @jeffvance
Automatic merge from submit-queue (batch tested with PRs 37468, 36546, 38713, 38902, 38614)
Remove extensions/v1beta1 Job
Fixes https://github.com/kubernetes/kubernetes/issues/32763. This endpoint was deprecated in 1.5 and was planned to be removed in 1.6.
**Release note**:
```release-note
Remove extensions/v1beta1 Jobs resource, and job/v1beta1 generator.
```
Automatic merge from submit-queue (batch tested with PRs 37468, 36546, 38713, 38902, 38614)
Adds e2e firewall tests for LoadBalancer service, ingress, and e2e cluster
Fixes #25488 and fixes #31827.
This PR adds e2e firewall test for LoadBalancer type service, ingress and e2e cluster.
Test details for LoadBalancer type service as below:
- Verifies corresponding firewall rule has correct `sourceRanges`, `ports and protocols` and `target tags`.
- Verifies requests can reach all expected instances.
- Verifies requests can not reach instances that are not included.
Overview of the test procedure:
- Creates a LoadBalancer type service.
- Validates the corresponding firewall rule.
- Creates netexec pods as service backends.
- Sends requests from outside of the cluster and examine hitting all instances in range.
- Removes tags from one of the instances in order to get it out of firewall rule's range.
- Sends requests from outside of the cluster and examine not hitting this instance.
- Recovers tags for this instance and verifies its traffic is back.
@bprashanth @bowei @thockin
For LoadBalancer type service:
- Verifies corresponding firewall rule has correct sourceRanges, ports
& protocols, target tags.
- Verifies requests can reach all expected instances.
- Verifies requests can not reach instances that are not included.
For Ingress resource:
- Verifies the ingress firewall rule has correct sourceRanges, target
tags and tcp ports.
For general e2e cluster:
- Verifies all required firewall rules have correct sourceRange, ports
& protocols, source tags and target tags.
- Verifies well-known ports on master and nodes are not
exposed externally
Automatic merge from submit-queue
Don't check nodeport for nginx ingress
Services behind a standard nginx ingress don't need nodeport, so don't check that.
Extracted delete operations into functions
wait on pv/pvc bind
removed redundant verification, minor refactors
GCEPD: fixed typo
renamed verifyDiskAttached to verifyGCEDiskAttached
fix empty log msg
Updated test owners
removed unnecessary api calls
Check for apierr IsNotFound for pod,pv,pvc but ignore result
Disable dynamic provisioning in test PVCs
gofmt'd
Automatic merge from submit-queue
Fix Recreate for Deployments and stop using events in e2e tests
Fixes https://github.com/kubernetes/kubernetes/issues/36453 by removing events from the deployment tests. The test about events during a Rolling deployment is redundant so I just removed it (we already have another test specifically for Rolling deployments).
Closes https://github.com/kubernetes/kubernetes/issues/32567 (preferred to use pod LISTs instead of a new status API field for replica sets that would add many more writes to replica sets).
@kubernetes/deployment
Automatic merge from submit-queue (batch tested with PRs 38830, 38750)
[Federation] Stop cleaning federation namespace in e2e tests
When the --clean-start=true flag is provided to e2e tests, they clean up all leftover namespaces except `default` and `kube-system`. Because of this, when we run e2e tests in the federation soak test job, the federation control plane is destroyed before the tests run and all tests start to fail.
So this adds federation-system to the list of namespaces to be left intact, and also changes the default federation namespace name from `federation` to `federation-system` to be consistent with the newer method of deploying federation using kubefed.
@madhusudancs @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 38830, 38750)
Remove the ReadyReplica version guard
**What this PR does / why we need it**: Removes outlived version guards.
**Which issue this PR fixes**: fixes #37310
Automatic merge from submit-queue
Node Conformance Test: Fix report prefix for node conformance test.
The node conformance CI is running now.
The only problem is that junit files overwrite each other because of the lack of junit prefix. http://gcsweb.k8s.io/gcs/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-conformance/42/artifacts/
This PR fixes this. I've verified in my environment, it works well.
@timstclair
Automatic merge from submit-queue (batch tested with PRs 38788, 38821, 38829)
Node Conformance Test: Fix node conformance test.
The test suite could build on my desktop. However it is failing on jenkins.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-conformance/1
It turns out that `docker save $IMAGE -o $FILE` only works for docker 1.12. (My desktop is 1.12.) For older docker versions, we should use `docker save -o $FILE $IMAGE` instead. (Jenkins is using 1.9.1.)
@timstclair Could you help me review this short PR? :)
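For reference, a hedged illustration of the argument-order difference; the image name and tar path below are made up:
```go
package main

import (
	"fmt"
	"os/exec"
)

// saveImage writes a docker image to a tarball. Putting -o before the image
// name works on both old (1.9.x) and new (1.12) Docker releases, whereas
// "docker save <image> -o <file>" only works on 1.12.
func saveImage(image, file string) error {
	return exec.Command("docker", "save", "-o", file, image).Run()
}

func main() {
	// Image name and output path are illustrative only.
	if err := saveImage("example/node-test:0.2", "/tmp/node-test.tar"); err != nil {
		fmt.Println("docker save failed:", err)
	}
}
```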
Automatic merge from submit-queue (batch tested with PRs 38154, 38502)
Rename "release_1_5" clientset to just "clientset"
We used to keep multiple releases in the main repo. Now that [client-go](https://github.com/kubernetes/client-go) does the versioning, there is no need to keep releases in the main repo. This PR renames the "release_1_5" clientset to just "clientset", clientset development will be done in this directory.
@kubernetes/sig-api-machinery @deads2k
```release-note
The main repository does not keep multiple releases of clientsets anymore. Please find previous releases at https://github.com/kubernetes/client-go
```
Automatic merge from submit-queue
make spellcheck for test/*
**What this PR does / why we need it**: Increase code readability
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: only a slight effort
**Release note**:
```release-note
```
Automatic merge from submit-queue
genericapiserver: unify swagger and openapi in config
- make swagger config customizable
- remove superfluous `Config.Enable*` flags for OpenAPI and Swagger.
This is necessary for downstream projects to tweak the swagger spec.
Automatic merge from submit-queue
Node Conformance: Node Conformance CI
For https://github.com/kubernetes/kubernetes/issues/37252.
The first 2 commits of this PR are from #38150 and #38152. Please review those 2 PRs first, they are both minor cleanup.
This PR:
* Add `TestSuite` interface in `test/e2e_node/remote` to separate test suite logic (packaging, deploy, run test) from VM lifecycle management logic, so that different test suites can share the same VM lifecycle management logic.
* Different test suites such as node e2e, node conformance, node soaking, cri validation etc. should implement different `TestSuite`.
* `test/e2e_node/runner/remote` will initialize and run different test suite based on the subcommand.
* Add `run-kubelet-mode` which only starts and monitors kubelet, similar with `run-services-mode`. The reason we need this:
* Unlike node e2e, node conformance test doesn't start kubelet inside the test suite (in fact, in the future node e2e shouldn't do that either), it assumes kubelet is already running before the test.
* In fact, node e2e should use similar node bootstrap script like cluster e2e, and the bootstrap script should initialize the node with all necessary node software including kubelet. However, it's not the case now.
* The easiest way for now is to reuse the kubelet start logic in the test suite. So in this PR, we added `run-kubelet-mode`, and use the test binary as a kubelet launcher to start kubelet before running the test.
* Implement node e2e `TestSuite`.
* Implement node conformance `TestSuite`. Use `docker save` and `docker load` to create and deploy conformance docker image; Start kubelet by running test binary in `run-kubelet-mode`; Run conformance test with `docker run`.
This PR will make it easy to implement continuous integration node soaking test and cri validation test (https://github.com/kubernetes/kubernetes/pull/35266).
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Inode Eviction Test is Flaky
This Pull Request:
Marks the InodeEviction test as flaky
Increases the timeout for disk pressure because coreos has nearly 2 million inodes.
Decreases the status polling interval so we can see eviction ordering better.
@Random-Liu
Automatic merge from submit-queue
Add a package for handling version numbers (including non-"Semantic" versions)
As noted in #32401, we are using Semantic Version-parsing libraries to parse version numbers that aren't necessarily "Semantic". Although, contrary to what I'd said there, it turns out that this wasn't actually currently a problem for the iptables code, because the regexp used to extract the version number out of the "iptables --version" output only pulled out three components, so given "iptables v1.4.19.1", it would have extracted just "1.4.19". Still, it could be a problem if they later release "1.5" rather than "1.5.0", or if we eventually need to _compare_ against a 4-digit version number.
Also, as noted in #23854, we were also using two different semver libraries in different parts of the code (plus a wrapper around one of them in pkg/version).
This PR adds pkg/util/version, with code to parse and compare both semver and non-semver version strings, and then updates kubernetes to use it everywhere (including getting rid of a bunch of code duplication in kubelet by making utilversion.Version implement the kubecontainer.Version interface directly).
Ironically, this does not actually allow us to get rid of either of the vendored semver libraries, because we still have other dependencies that depend on each of them. (cadvisor uses blang/semver and etcd uses coreos/go-semver)
fixes #32401, #23854
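A self-contained sketch of comparing dotted, non-semantic version strings like the ones described above; the real pkg/util/version API is richer (parsing, validation, semver support) than this:
```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareVersions returns -1, 0, or 1; missing components are treated as 0,
// so "1.5" == "1.5.0" and "1.4.19.1" > "1.4.19".
func compareVersions(a, b string) int {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var ai, bi int
		if i < len(as) {
			ai, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			bi, _ = strconv.Atoi(bs[i])
		}
		if ai != bi {
			if ai < bi {
				return -1
			}
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(compareVersions("1.4.19.1", "1.4.19")) // 1
	fmt.Println(compareVersions("1.5", "1.5.0"))       // 0
}
```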
Automatic merge from submit-queue
Add an option to run Job in Density/Load config
cc @timothysc @jeremyeder
@erictune @soltysh - I run this test and it seems to me that Job has noticeably worse performance than Deployment. I'll create an issue for this, but this PR is for easy repro.
Automatic merge from submit-queue (batch tested with PRs 38609, 38227)
On kubemark master, kubelet now runs as a supervisord process and all master components as pods
This PR fixes issue #37485
On kubemark, previously we had a custom setup that runs master components as top-level processes under supervisord, which is lighter than running kubelet and docker.
This PR makes kubelet run as a process under supervisord and the master components (apiserver, controller-manager, scheduler, etcd) as pods, making testing on kubemark mimic real clusters better.
Also, start-kubemark-master.sh now closely resembles cluster/gce/gci/configure-helper.sh, allowing easy integration in future.
cc @kubernetes/sig-scalability @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 38419, 38457, 38607)
Node E2E: Update CVM version to e2e-node-containervm-v20161208-image.
I built the new node e2e image from e2e-node-containervm-v20161208-image.
@timstclair
/cc @kubernetes/sig-node
Automatic merge from submit-queue
test: cleanup test logs for deployments
@mfojtik @janetkuo this will help with the deployment logs (should make them a bit cleaner) ptal
Automatic merge from submit-queue (batch tested with PRs 34002, 38535, 37330, 38522, 38423)
Node E2E: `make test-e2e-node` runs the same test with pr builder by default.
This PR makes `make test-e2e-node` run non-serial, non-flaky, non-slow test by default.
This will make it easier to use.
/cc @timstclair
Automatic merge from submit-queue (batch tested with PRs 37860, 38429, 38451, 36050, 38463)
[Part 2] Adding s390x cross-compilation support for gcr.io images in this repo
**What this PR does / why we need it**: This PR enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery etc. It also includes the changes needed so that it can be cross-compiled on an x86 host architecture.
**Which issue this PR fixes**: fixes #34328
**Special notes for your reviewer**: In the existing file "build-tools/build-image/cross/Dockerfile", the repository used for installing cross-build toolchains for the supported architectures does not have a toolchain for s390x, hence in my PR I am changing the repository so that it can be cross-compiled for s390x.
**Release note**:
```
Allows cross-compilation of Kubernetes on an x86 host for s390x and enables s390x support for kube-dns, pause, addon-manager, etcd, hyperkube, kube-discovery etc
```
Automatic merge from submit-queue (batch tested with PRs 38354, 38371)
Add GetOptions parameter to Get() calls in client library
Ref #37473
This PR is super mechanical - the non trivial commits are:
- Update client generator
- Register GetOptions in batch/v2alpha1 group
Automatic merge from submit-queue (batch tested with PRs 38278, 37770)
Refactor REST storage to use generic defaults
This removes the repetition in the REST storage builders by moving the logic to `restoptions.ApplyOptions`. `registry.StorageWithCacher`/`generic.StorageDecorator` no longer assume that they can build the `keyFunc` for arbitrary objects. `restoptions.ApplyOptions` uses the `registry.Store`'s `KeyFunc` for its call to `generic.StorageDecorator`.
```release-note
Cluster federation servers have changed the location in etcd where federated services are stored, so existing federated services must be deleted and recreated. Before upgrading, export all federated services from the federation server and delete the services. After upgrading the cluster, recreate the federated services from the exported data.
```
Automatic merge from submit-queue (batch tested with PRs 36419, 38330, 37718, 38244, 38375)
Guarantees drop packets commands succeed in reboot test
Fixes the main case in #33405 and #36230.
Previous attempted fix in #38057.
During the reboot test, the iptables command that was supposed to take the node offline failed to exec.
It turned out that the xtables lock being held by other processes led to this failure. Logs below:
```
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: stdout:
"+ sleep 10
+ sudo iptables -I INPUT 1 -s 127.0.0.1 -j ACCEPT
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?"
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: stderr: ""
I1202 20:00:29.686] Dec 2 20:00:29.685: INFO: ssh jenkins@146.148.111.167:22: exit code: 0
```
This reboot test won't pass if any one of these iptables commands fails. This PR puts the "reboot" commands into while loops to guarantee they are retried until they succeed.
`sudo iptables -t filter -nL` is removed since it is clear now that the `FILTER` rules won't be clobbered.
(Tests passed on local cluster.)
@bprashanth
Automatic merge from submit-queue (batch tested with PRs 36419, 38330, 37718, 38244, 38375)
adjusted timeouts for inode eviction and garbage collection tests
Inode eviction tests appear to run slower on coreos than the other operating systems I tested on.
I adjusted the timeout for the test from 10 to 30 minutes to compensate.
Garbage collection tests also flake occasionally due to timeouts.
I adjusted the timeout for runtime commands from 2 to 3 minutes, and removed an unused constant.
cc: @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 36071, 32752, 37998, 38350, 38401)
Pass addressable values to DeepCopy
Extracted from https://github.com/kubernetes/kubernetes/pull/35728
These are the places where we are currently calling DeepCopy incorrectly and need to fix, even if we don't pick up the changes to DeepCopy in #35728:
* creating a new cloner means we have no generated functions registered
* passing non-addressable values doesn't pick up generated deep copy functions, and forces us into reflective mode
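A small standard-library illustration of the addressability point (not the actual conversion/cloner code): a value passed directly is a non-addressable copy, while passing a pointer yields an addressable value that reflection can modify in place:
```go
package main

import (
	"fmt"
	"reflect"
)

type Pod struct{ Name string }

func main() {
	p := Pod{Name: "x"}

	byValue := reflect.ValueOf(p)           // non-addressable copy
	byPointer := reflect.ValueOf(&p).Elem() // addressable

	fmt.Println(byValue.CanAddr(), byValue.CanSet())     // false false
	fmt.Println(byPointer.CanAddr(), byPointer.CanSet()) // true true
}
```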
Automatic merge from submit-queue (batch tested with PRs 36071, 32752, 37998, 38350, 38401)
Add test for concurrent evictions requests
This is a followup PR after #37668.
Add a test case to make sure concurrent eviction requests can be handled.
@davidopp @lavalamp
Automatic merge from submit-queue (batch tested with PRs 35939, 38381, 37825, 38306, 38110)
Moved start-kubemark-master.sh from test/kubemark/ to test/kubemark/r…
Automatic merge from submit-queue
Fix useless uuid in container log path node e2e
@timstclair pointed out there are nits in the original PR, ref: https://github.com/kubernetes/kubernetes/pull/34877
So this patch:
1. removed the useless uuid
2. changed all those strings to consts
Thanks. 🐱
Automatic merge from submit-queue (batch tested with PRs 36626, 37294, 37463, 37943, 36541)
Use tmpfs for gluster-server container volumes.
Gluster server needs a filesystem that supports extended attributes for its data. Some distros (Debian, Ubuntu) ship Docker that runs containers on aufs, which does not support extended attributes and therefore Gluster server fails there.
We can use tmpfs for Gluster server data volumes. ~~This expects that host's /tmp is tmpfs, which is true for Debian, Ubuntu, RHEL, CentOS and Fedora.~~
I reworked it to mount tmpfs inside the container, the server pod is privileged.
And *after* this PR is merged and new Gluster server container image is pushed we need to bump the image version in https://github.com/kubernetes/kubernetes/blob/master/test/e2e/volumes.go#L407
Edit: I also fixed Gluster server Dockerfile, it was not working at all.
Automatic merge from submit-queue (batch tested with PRs 37092, 37850)
Turns on dns horizontal scaling tests for GKE
Seems like the dns-autoscaler is already enabled in [this recent gke build](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gke/769/).
Turning on the corresponding e2e tests to increase test coverage.
Probably better to wait for this fix #37261 to go in first.
@bowei @bprashanth
cc @maisem @roberthbailey
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)
Decrease expected lower bound for misc CPU
Fixes https://github.com/kubernetes/kubernetes/issues/34990
We started enforcing expectations on the `misc` system container in https://github.com/kubernetes/kubernetes/pull/37856, but its CPU usage tends to be lower than that of the `kubelet` & `runtime` containers (to be expected). For simplicity, I lowered the lower bound for all system containers.
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)
Cleanup firewalls, add nginx ingress to presubmit
Make the firewall cleanup code follow the same pattern as the other cleanup functions, and add the nginx ingress e2e to presubmit.
Planning to watch the test for a bit, and if it works alright, I'll add the other Ingress e2e to post-submit merge blocker.
Automatic merge from submit-queue (batch tested with PRs 37325, 38313, 38141, 38321, 38333)
Fix running e2e with 'Completed' kube-system pods
As of now, e2e runner keeps waiting for pods in `kube-system` namespace to be "Running and Ready" if there are any pods in `Completed` state in that namespace.
This for example happens after following [Kubernetes Hosted Installation](http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/#kubernetes-hosted-installation) instructions for Calico, making it impossible to run conformance tests against the cluster. It's also possible to reproduce the problem like this:
```
$ cat testjob.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: tst
  namespace: kube-system
spec:
  template:
    metadata:
      name: tst
    spec:
      containers:
      - name: tst
        image: busybox
        command: ["echo", "test"]
      restartPolicy: Never
$ kubectl create -f testjob.yaml
$ go run hack/e2e.go -v --test --test_args='--ginkgo.focus=existing\s+RC'
```
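A hedged sketch of the kind of check this fix needs: treat pods that have run to completion as done instead of waiting for them to become Ready. The helper name is illustrative, not the e2e framework's actual function:
```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// podReadyOrSucceeded treats a Completed (Succeeded) pod as satisfied so it
// does not block the kube-system readiness wait.
func podReadyOrSucceeded(pod *v1.Pod) bool {
	if pod.Status.Phase == v1.PodSucceeded {
		return true
	}
	if pod.Status.Phase != v1.PodRunning {
		return false
	}
	for _, c := range pod.Status.Conditions {
		if c.Type == v1.PodReady && c.Status == v1.ConditionTrue {
			return true
		}
	}
	return false
}

func main() {
	completed := &v1.Pod{Status: v1.PodStatus{Phase: v1.PodSucceeded}}
	fmt.Println(podReadyOrSucceeded(completed)) // true: does not block the wait
}
```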
Automatic merge from submit-queue
Delete regional static-ip instead of global for type=lb
Global vs region is the difference between
```
$ gcloud compute addresses delete foo --global
$ gcloud compute addresses delete foo --region us-central1
```
Type=LoadBalancer uses the second type and we were doing the first.
Also adds some logging.
Automatic merge from submit-queue
Fix scheduler_perf test so that QPS is non-zero even if there is a scheduling "cold start"
@gmarek ... is something like this more realistic as an expectation? That is, wait till scheduling starts, then wait 1 second before any polling attempts, i.e.
- guarantee a uniform sleep before measuring, by doing it first rather than last
- print the initial status so that it's easier to debug in case of issues
#36532
Automatic merge from submit-queue (batch tested with PRs 38294, 37009, 36778, 38130, 37835)
fix permissions when using fsGroup
Currently, when an fsGroup is specified, the permissions of the defaultMode are not respected and all files created by the atomic writer have mode 777. This is because in `SetVolumeOwnership()` the `filepath.Walk` includes the symlinks created by the atomic writer. The symlinks have mode 777 when read from `info.Mode()`. However, when they are chmod'ed later, the chmod applies to the file the symlink points to, not the symlink itself, resulting in the wrong mode for the underlying file.
This PR skips chmod/chown for symlinks in the walk since those operations are carried out on the underlying file which will be included elsewhere in the walk.
xref https://bugzilla.redhat.com/show_bug.cgi?id=1384458
@derekwaynecarr @pmorie
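A minimal sketch of the walk-and-skip-symlinks idea described above; this is not the literal `SetVolumeOwnership()` code, and the ownership/mode arguments are illustrative:
```go
package main

import (
	"os"
	"path/filepath"
)

// chownChmodSkippingSymlinks walks root and applies chown/chmod to everything
// except symlinks, since those calls follow the link and would re-modify the
// underlying file (which is visited elsewhere in the walk) with the wrong mode.
func chownChmodSkippingSymlinks(root string, uid, gid int, extra os.FileMode) error {
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.Mode()&os.ModeSymlink != 0 {
			return nil // skip symlinks
		}
		if err := os.Chown(path, uid, gid); err != nil {
			return err
		}
		return os.Chmod(path, info.Mode().Perm()|extra)
	})
}

func main() {
	// Path, ids, and mode are placeholders for illustration.
	_ = chownChmodSkippingSymlinks("/tmp/example-volume", 0, 0, 0440)
}
```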
Automatic merge from submit-queue (batch tested with PRs 38173, 38151, 38197, 38221)
test: wait for ready replica set before adopting
Reworked version of https://github.com/kubernetes/kubernetes/pull/36439 which was reverted in https://github.com/kubernetes/kubernetes/pull/38049. This PR doesn't use any of the new status API added in replica sets so it should cause no trouble with upgrade tests.
@kubernetes/deployment @smarterclayton
Automatic merge from submit-queue (batch tested with PRs 37032, 38119, 38186, 38200, 38139)
New ns param for NewClusterVerification
**What this PR does / why we need it**: Allows the test to specify alternate namespaces when waiting for pods to be in a specific state.
**Which issue this PR fixes**: fixes #38138
**Special notes for your reviewer**: Minor fix
**Release note**: None
Automatic merge from submit-queue (batch tested with PRs 37032, 38119, 38186, 38200, 38139)
Detect long-running requests from parsed request info
Follow up to https://github.com/kubernetes/kubernetes/pull/36064
Uses parsed request info to more tightly match verbs and subresources
Removes regex-based long-running request path matching (which is easily fooled)
```release-note
The --long-running-request-regexp flag to kube-apiserver is deprecated and will be removed in a future release. Long-running requests are now detected based on specific verbs (watch, proxy) or subresources (proxy, portforward, log, exec, attach).
```
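A hedged sketch, in the spirit of the change above, of deciding long-running-ness from the parsed verb and subresource rather than a path regexp; the real apiserver helper and types differ:
```go
package main

import "fmt"

type requestInfo struct {
	Verb        string
	Subresource string
}

var (
	longRunningVerbs        = map[string]bool{"watch": true, "proxy": true}
	longRunningSubresources = map[string]bool{"proxy": true, "portforward": true, "log": true, "exec": true, "attach": true}
)

// isLongRunning matches specific verbs and subresources instead of a
// regexp over the request path.
func isLongRunning(r requestInfo) bool {
	return longRunningVerbs[r.Verb] || longRunningSubresources[r.Subresource]
}

func main() {
	fmt.Println(isLongRunning(requestInfo{Verb: "watch"}))                   // true
	fmt.Println(isLongRunning(requestInfo{Verb: "get", Subresource: "log"})) // true
	fmt.Println(isLongRunning(requestInfo{Verb: "get"}))                     // false
}
```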
Automatic merge from submit-queue
Add integration tests for desired state of world populator
Add integration tests for the desired state of world populator.
This adds tests for code introduced here:
https://github.com/kubernetes/kubernetes/issues/26994
Via an integration test we can now verify that if a pod delete
event is somehow missed by the AttachDetach controller, it still
gets cleaned up by the Desired State of World populator.
Automatic merge from submit-queue (batch tested with PRs 38194, 37594, 38123, 37831, 37084)
remove unnecessary fields from genericapiserver config
Cleans up some unnecessary fields in the genericapiserver config.
Automatic merge from submit-queue
Skip not registered nodes in labeling in CA e2e tests
This PR fixes problems with querying for not-yet-registered nodes. The underlying problem is related to the way the test is written. We apply labels to the existing nodes, create pods that require N+1 nodes with the labels, and expect a new node to be added. But the new node is created without the labels. As soon as the node is spotted it is labeled, but sometimes it is too late. CA notices that the new node doesn't solve the problem and asks for another, hoping that this time it will get the node with the labels. The node is added by the MIG, but it takes a minute or more for the node to start and register in kubernetes. At this moment the labeling is started. The list of nodes to be labeled is taken from the MIG. The extra node is there, but it is not in kubernetes yet, so a 404 error is returned on the labeling attempt and the test fails.
This PR filters the list of nodes to be labeled and applies the labels only on the fully registered nodes.
Fixes 404 in #33754
cc: @jszczepkowski @piosz @fgrzadkowski
Automatic merge from submit-queue (batch tested with PRs 36990, 37494, 38152, 37561, 38136)
api federation types
First commit adds types that can back the kubernetes-discovery server with a `kubectl`-compatible way of adding federated servers. The second commit is just generated code.
After we have types, I'd like to start splitting `kubernetes-discovery` into a "legacy" mode which will support what we have today and a "normal" mode which will provide an API federation server like this: https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/federated-api-servers.md that includes both discovery and proxy in a single server. Something like this: https://github.com/openshift/kube-aggregator .
@kubernetes/sig-api-machinery @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 36990, 37494, 38152, 37561, 38136)
Node E2E: Move ssh related functions into ssh.go.
This PR moves all ssh related functions and variables into a separate file `ssh.go`.
This is a minor cleanup preparing for my test framework refactoring work. Will send out the refactor PR later.
/cc @kubernetes/sig-node
Automatic merge from submit-queue (batch tested with PRs 37870, 36643, 37664, 37545)
Add option to disable federation ingress controller
**What this PR does / why we need it**:
Added an option to enable/disable the federation ingress controller, as federated ingresses currently don't work in environments other than GCE/GKE. Also ignores reconciling config maps if no federated ingresses exist.
**Which issue this PR fixes**
fixes #33943
@quinton-hoole
**Release note**:
```release-note
Add `--controllers` flag to federation controller manager to enable/disable the federation ingress controller
```
Automatic merge from submit-queue
[Federation] Separate the cleanup phases of service and service shards so that service shards can be cleaned up even after the service is deleted elsewhere.
Fixes Federated Service e2e test.
This separation is necessary because "Federated Service DNS should be
able to discover a federated service" e2e test recently added a case
where it deletes the service from federation but not the shards from
the underlying clusters.
Because of the way cleanup is currently implemented in the AfterEach block,
we did not clean up any of the underlying shards. Handling this new case
requires separating the two phases of the cleanup.
cc @kubernetes/sig-cluster-federation @nikhiljindal
Automatic merge from submit-queue (batch tested with PRs 38149, 38156, 38150)
Node E2E: Remove setup-node option
This PR removes `setup-node` option, because:
* It is misleading. `setup-node` doesn't really set up the node; when it is specified, the test framework only puts the current user into the docker user group.
* It is not necessary anymore. Because we always run the node e2e test as root now, we don't need to do this.
This is a minor cleanup preparing for my test framework refactoring work. Will send out the refactor PR later.
/cc @kubernetes/sig-node
Automatic merge from submit-queue (batch tested with PRs 38149, 38156, 38150)
Remove girishkalele from most places
@matchstick you might need to help here. I am doing this because the bot is trying to create an issue assigned to @girishkalele but it cannot be created as he is not a member of the org any longer.
Automatic merge from submit-queue (batch tested with PRs 37328, 38102, 37261, 31321, 38146)
Fixes flake: wait for dns pods terminating after test completed
From #37194. Based on #36600. Please only look at the second commit.
As mentioned in [comment](https://github.com/kubernetes/kubernetes/issues/37194#issuecomment-262007174), "DNS horizontal autoscaling" test does not wait for the additional pods to be terminated and this may lead to the failure of later tests.
This fix adds a wait loop at the end of the serial test to ensure the cluster recovers to the original state. The non-serial test does not wait for the additional pods to terminate, because that does not affect other tests (they can run simultaneously), and waiting for pods to terminate would take a certain amount of time.
Note this only fixes certain cases of #37194. I noticed there are other failures irrelevant to the dns autoscaler, like [this one](https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-serial/34/).
@bprashanth @Random-Liu
Automatic merge from submit-queue (batch tested with PRs 37328, 38102, 37261, 31321, 38146)
Make thirdparty codec able to decode DeleteOptions
Fix #37278.
Without this PR, the gvk sent to the delegated codec will be the thirdparty one, which is not recognized by the delegated codec (usually api.Codecs).
Automatic merge from submit-queue (batch tested with PRs 38076, 38137, 36882, 37634, 37558)
Make logging for gcl e2e test more verbose
To help debug https://github.com/kubernetes/kubernetes/issues/37241
CC @piosz
Automatic merge from submit-queue (batch tested with PRs 38111, 38121)
remove rbac super user
Cleaning up cruft and duplicated capabilities as we transition from RBAC alpha to beta. In 1.5, we added a secured loopback connection based on the `system:masters` group name. `system:masters` have full power in the API, so the RBAC super user is superfluous.
The flag will stay in place so that the process can still launch, but it will be disconnected.
@kubernetes/sig-auth
Automatic merge from submit-queue (batch tested with PRs 36352, 36538, 37976, 36374)
test: update deployment helper to return better error messages
@kubernetes/deployment the problem with https://github.com/kubernetes/kubernetes/issues/36270 is that the selector key is never added in the deployment but this change would make it clearer.
Automatic merge from submit-queue (batch tested with PRs 38049, 37823, 38000, 36646)
Revert "test: update rollover test to wait for available rs before adopting"
This reverts commit 5b7bf78f3f from pr #36439 which appears to have mostly broken the gci-gke test.
Automatic merge from submit-queue (batch tested with PRs 37094, 37663, 37442, 37808, 37826)
Moved gobindata, refactored ReadOrDie refs
**What this PR does / why we need it**: Having gobindata inside of test/e2e/framework prevents external projects from importing the framework. Moving it out and managing refs fixes this problem.
**Which issue this PR fixes**: fixes #37007
Automatic merge from submit-queue (batch tested with PRs 37997, 37939, 37990, 36700, 37258)
Add cluster-level AppArmor E2E test
My goal is to reuse this test for an automated cluster upgrade test.
Automatic merge from submit-queue
Kubeadm unit tests for kubeadm/app/master package
Added unit tests for the kubeadm/app/master package testing functionality of tokens.go, kubeconfig.go, manifests.go, pki.go, discovery.go, addons.go, and apiclient.go.
This PR is part of the ongoing effort to add tests (#35025)
/cc @pires @jbeda
Automatic merge from submit-queue
test: update rollover test to wait for available rs before adopting
Scenario that happened in https://github.com/kubernetes/kubernetes/issues/35355#issuecomment-257808460
-- Replica set that is about to be adopted has 2 out of 4 ready replicas
-- Deployment is created with 4 replicas, adopts pre-existing replica set, creates a new one, and starts rolling replicas over to the new replica set.
```
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled down replica set test-rollover-controller to 3
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 1
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747: {replicaset-controller } SuccessfulCreate: Created pod: test-rollover-deployment-2505289747-iuiei
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:04 -0700 PDT - event for test-rollover-deployment-2505289747-iuiei: {default-scheduler } Scheduled: Successfully assigned test-rollover-deployment-2505289747-iuiei to gke-jenkins-e2e-default-pool-33c0400e-6q5m
Nov 2 01:38:17.088: INFO: At 2016-11-02 01:38:05 -0700 PDT - event for test-rollover-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set test-rollover-deployment-2505289747 to 2
```
At this point there is no minimum availability for the Deployment (maxUnavailable is 1 meaning desired minimum available is 3 but we only have 2), and the new replica set uses a non-existent image. New replica set is scaled up to 1 (maxSurge is 1), then old replica set is scaled down by one, because cleanupUnhealthyReplicas observes that it has 2 unhealthy replicas - it can only scale down one though because the [maximum replicas it can cleanup is one](d87dfa2723/pkg/controller/deployment/rolling.go (L125)) (4+1-3-1). New replica set is scaled to 2. Available replicas are still 2 (third replica from the old replica set has yet to come up).
-- Deployment is rolled over with a new update. Test reaches for the WaitForDeploymentStatus check but there are only 2 availableReplicas (maxUnavailable is still violated).
This change makes the test wait for a healthy replica set before proceeding thus it should never hit the scenario described above.
@kubernetes/deployment
- Remaining spaghetti untangled
- Missed bazel update and a few hardcoded refs
- New instance of framework.ReadOrDie reference removed post rebase
- Resolve new clientset rebase
- Fixed e2e/generated BUILD dep
- A space
- Missed gobindata ref in golang.sh
Automatic merge from submit-queue
[etcd] Reduce the etcd surface area in the integration test to minimize deps
This is a code refactor for isolation of client usage.
Automatic merge from submit-queue
Build vendored copy of go-bindata and use that in go generate step
**What this PR does / why we need it**: as the title says, uses the vendored version of `go-bindata` rather than expecting developers to `go get` it (when building outside docker).
**Which issue this PR fixes**: fixes #34067, partially addresses #36655
**Special notes for your reviewer**: we still call `go generate` far too many times:
```console
~/.../src/k8s.io/kubernetes $ which go-bindata
~/.../src/k8s.io/kubernetes $ make
+++ [1116 17:35:28] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:29] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:30] Building go targets for linux/amd64:
cmd/libs/go2idl/deepcopy-gen
+++ [1116 17:35:35] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:35] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:36] Building go targets for linux/amd64:
cmd/libs/go2idl/defaulter-gen
+++ [1116 17:35:41] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:41] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:42] Building go targets for linux/amd64:
cmd/libs/go2idl/conversion-gen
+++ [1116 17:35:47] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:47] Generating bindata:
test/e2e/framework/gobindata_util.go
+++ [1116 17:35:48] Building go targets for linux/amd64:
cmd/libs/go2idl/openapi-gen
+++ [1116 17:35:56] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [1116 17:35:56] Generating bindata:
test/e2e/framework/gobindata_util.go
```
Fixing that is a separate effort, though.
cc @sebgoa @ZhangBanger
Automatic merge from submit-queue
Kubeadm unit tests pkg node
Added unit tests for the kubeadm/app/node package testing functionality of bootstrap.go, csr.go, and discovery.go.
This PR is part of the ongoing effort to add tests (#35025)
/cc @pires @jbeda
Automatic merge from submit-queue
Cleanup old cloud resources after 48 hours
With this pr the ingress e2e purges old leaked resources (>48h), so even if tests fail due to leaks, the entire queue won't close till someone bumps up quota through a manual request.
Automatic merge from submit-queue
Adds termination hook in reboot test for debugging
From #33405 and #36230.
Logs the SSH commands issued for dropping inbound/outbound traffic to a file and dumps them out when the test ends.
The first `sudo iptables -t filter -nL` is called to confirm the rules are injected. The second `sudo iptables -t filter -nL` is to check whether the rules get clobbered. Adds `date` in between to check time frame.
@bprashanth @freehan
Automatic merge from submit-queue
Add the system verification test to the kubeadm preflight checks
And refactors the system verification test to accept a specific writer to write to, in order to customize the output
This PR is targeting v1.5, PTAL
cc @Random-Liu @dchen1107 @kubernetes/sig-cluster-lifecycle
Automatic merge from submit-queue
Update Stateful Set example files for 1.5
1. Remove initialized annotation from statefulset examples
2. Update storage class annotation to beta in statefulset examples
3. Remove alpha limitation on PetSet in cassandra example
cc @erictune @foxish @kow3ns @enisoc @chrislovecnm @kubernetes/sig-apps
```release-note
NONE
```
Automatic merge from submit-queue
move parts of the mega generic run struct out
This splits the main `ServerRunOptions` into composable pieces that are bindable separately and adds easy paths for composing servers to run delegating authentication and authorization.
@sttts @ncdc alright, I think this is as far as I need to go to make the composing servers reasonable to write. I'll try leaving it here
Automatic merge from submit-queue
Fix package aliases to follow golang convention
Some package aliases do not align with the golang convention https://blog.golang.org/package-names. This PR fixes them. Also adds a verify script and presubmit checks.
Fixes #35070.
cc/ @timstclair @Random-Liu
Automatic merge from submit-queue
Revision handling in federated deployment controller
The deployment controller in regular kubernetes automatically adds a revision annotation to deployments. This causes a bit of confusion in the federated controller and tests. This PR skips the revision annotation in checks. In the next K8S release we will need better support for deployment revisions.
Helps with #36588
cc: @nikhiljindal @madhusudancs
Automatic merge from submit-queue
Stop deleting underlying services when federation service is deleted
Fixes https://github.com/kubernetes/kubernetes/issues/36799
Fixing federation service controller to not delete services from underlying clusters when federated service is deleted.
None of the federation controllers should do this unless explicitly asked by the user using DeleteOptions. This is the only federation controller that does so.
cc @kubernetes/sig-cluster-federation @madhusudancs
```release-note
federation service controller: stop deleting services from underlying clusters when federated service is deleted.
```
Automatic merge from submit-queue
Skip rather than fail networking tests on single node
**What this PR does / why we need it**:
Needed for the general e2e tidying we need to do for flaky slow tests, imo pre-1.5, see #31402 and so on.
**Which issue this PR fixes**:
Don't fail multi-node networking tests on a single-node cluster; skip them instead.
Automatic merge from submit-queue
Fix nil pointer dereference in test framework
Checking the `result.Code` prior to `err` in the if statement causes a panic if result is `nil`. It turns out the formatting of the error is already in `IssueSSHCommandWithResult`, so removing redundant code is enough to fix the issue. Logging the SSH result was also redundant, so I removed that as well.
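A minimal illustration of the bug class being fixed: dereferencing the result before checking it for nil panics when the SSH call fails. The types here are stand-ins, not the framework's:
```go
package main

import (
	"errors"
	"fmt"
)

type sshResult struct{ Code int }

func report(result *sshResult, err error) {
	// Buggy ordering: result.Code is evaluated even when result is nil.
	// if result.Code != 0 || err != nil { ... }

	// Safe ordering: check err and nil-ness before touching result.Code.
	if err != nil {
		fmt.Println("ssh failed:", err)
		return
	}
	if result != nil && result.Code != 0 {
		fmt.Println("command exited with code", result.Code)
	}
}

func main() {
	report(nil, errors.New("connection refused")) // would panic with the buggy ordering
	report(&sshResult{Code: 2}, nil)
}
```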
Automatic merge from submit-queue
Node Conformance Test: Final cleanup for node conformance test.
This PR fits node conformance test with recent change.
* Remove `--manifest-path` because the test will get kubelet configuration through `/configz` now. https://github.com/kubernetes/kubernetes/pull/36919
* Add `$TEST_ARGS` so that we can override arguments inside the container.
* Fix a bug in garbage_collector_test.go which caused the framework to try to connect to docker whether or not the test was being run. @dashpole
* Add `${REGISTRY}/node-test:${VERSION}` for convenience.
* Bump up the image version to `0.2`. (the one released with v1.4 is `v0.1`)
I've run the test both with `run_test.sh` script and directly `docker run`. Both of them passed.
After this gets merged, I'll build and push the new test image.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Per-container inode accounting test
Test spins up two pods: one pod uses all inodes, the other pod acts normally. Test ensures the correct pressure is encountered, the inode-hog pod is evicted, and the pod acting normally is not evicted. Test ensures conditions return to normal after the test.
Automatic merge from submit-queue
Fixes dns autoscaling test flakes
Fixes #36457 and fixes #36569.
#36457 is flake due to the 10 minutes timeout for scaling down cluster. Changes to use `scaleDownTimeout` from [test/e2e/cluster_size_autoscaling.go](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/cluster_size_autoscaling.go), which is 15 minutes.
The failure in #36569 is because we get the schedulable node count at the beginning of the test and assume it will not change unless we manually change the cluster size. But the logs below indicate that nodes may become ready after the test has begun.
```
[BeforeEach] [k8s.io] DNS horizontal autoscaling
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/dns_autoscaling.go:71
Nov 10 00:36:26.951: INFO: Condition Ready of node jenkins-e2e-minion-group-x6w1 is false instead of true. Reason: KubeletNotReady, message: Kubenet does not have netConfig. This is most likely due to lack of PodCIDR
STEP: Replace the dns autoscaling parameters with testing parameters
Nov 10 00:36:26.961: INFO: DNS autoscaling ConfigMap updated.
STEP: Wait for kube-dns scaled to expected number
Nov 10 00:36:26.961: INFO: Waiting up to 5m0s for kube-dns reach 8 replicas
...
Expected error:
<*errors.errorString | 0xc420b17ef0>: {
s: "err waiting for DNS replicas to satisfy 8, got 9: timed out waiting for the condition",
}
err waiting for DNS replicas to satisfy 8, got 9: timed out waiting for the condition
not to have occurred
```
This fix puts the logic of counting schedulable nodes into the polling loop. By doing so, the test will have the correct expected replicas count even if schedulable nodes change in between.
@bowei @bprashanth
---
Updates: all `ExpectNoError(err)` are changed to `Expect(err).NotTo(HaveOccurred())`
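A hedged sketch of the fix shape: recompute the expected replica count from the current schedulable-node count on every poll iteration instead of capturing it once before the loop. The helpers and the nodes-to-replicas mapping here are illustrative only:
```go
package main

import (
	"fmt"
	"time"
)

// expectedReplicasFor maps node count to expected kube-dns replicas.
// The real autoscaler uses configurable linear/ladder parameters; one replica
// per node here is purely illustrative.
func expectedReplicasFor(schedulableNodes int) int {
	if schedulableNodes < 1 {
		return 1
	}
	return schedulableNodes
}

// waitForExpectedReplicas recomputes the expectation inside the loop, so the
// check stays correct even if nodes become schedulable after the test starts.
func waitForExpectedReplicas(getSchedulableNodes, getReplicas func() int, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		expected := expectedReplicasFor(getSchedulableNodes()) // recomputed every iteration
		if getReplicas() == expected {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %d kube-dns replicas", expected)
		}
		time.Sleep(interval)
	}
}

func main() {
	err := waitForExpectedReplicas(
		func() int { return 3 }, // schedulable nodes right now
		func() int { return 3 }, // current kube-dns replicas
		time.Minute, time.Second,
	)
	fmt.Println("matched:", err == nil)
}
```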
Automatic merge from submit-queue
federation service e2e: Creating configmap for kube-dns
Ref #37105 and #37143.
Creating kube-dns config map to pass federations flag to kube-dns.
This is required since we moved to the new add-on manager. With the old add-on manager, we were using a kube-dns rc that included the federations flag.
cc @kubernetes/sig-cluster-federation @madhusudancs @bowei @MrHohn
Verified that the tests pass with this change.
Automatic merge from submit-queue
Modify GCI mounter to enable NFSv3
In order to make NFSv3 work, the mounter needs to start the rpcbind daemon. This
change modifies the mounter's Dockerfile and mounter script to start the
rpcbind daemon if it is not running on the host.
After this change, we need to push the image and update the sha number in the changelog.
Automatic merge from submit-queue
Guard the ready replica checking by server version
I fixed replica readiness checking for 1.4->1.5 upgrades by using a field that only exists in versions >=1.4.0 in #36924
This fixed a lot of issues in 1.4->1.5 upgrade testing, but did not fix 1.3->1.5 upgrade tests. I've disabled replica checking for 1.3 masters as the old logic was broken anyway.
This will not affect the 1.3 CI tests. Just 1.3 -> {1.4, 1.5} upgrade tests.
https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gke-container_vm-1.3-container_vm-1.5-upgrade-cluster-new/330?log
is an example of this breakage. These are the tell-tale logs:
```console
Nov 22 09:40:50.469: INFO: 11 / 11 pods in namespace 'kube-system' are running and ready (506 seconds elapsed)
Nov 22 09:40:50.469: INFO: expected 5 pod replicas in namespace 'kube-system', 0 are Running and Ready.
Nov 22 09:40:50.469: INFO: POD NODE PHASE GRACE CONDITIONS
```
Automatic merge from submit-queue
Use netexec container in http lifecycle hook test.
Fixes https://github.com/kubernetes/kubernetes/issues/33636.
The original test is using `"echo -e \"HTTP/1.1 200 OK\n\" | nc -l -p 1234` as a simple http server.
However, it seems that this is not very reliable, which may response before golang thinks it should.
So we get the error:
```
I1106 06:14:13.325397 2096 logs.go:41] Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 200 OK\n\n"; err=<nil>
```
This PR changes the test to use the `netexec` container, which is a simple HTTP server written in Go and used in many of our networking e2e tests. It should be more reliable.
Marking 1.5 since this fixes a 1.5 release-blocking issue. Marking P0 to match the original issue.
@dchen1107
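For reference, a drastically simplified Go sketch of what a netexec-style HTTP server does (this is not the actual netexec code); the point is that the Go HTTP stack only writes the response after the request has been handled:
```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// The Go HTTP stack only writes the response after it has parsed the
	// request, unlike the nc one-liner which can emit its canned reply at
	// the wrong moment.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	log.Fatal(http.ListenAndServe(":1234", nil))
}
```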
Automatic merge from submit-queue
Node E2E: Update ubuntu image to e2e-node-ubuntu-trusty-docker10-v2-image.
@sjenning Hopefully this will unblock https://github.com/kubernetes/kubernetes/pull/32577.
I built a new ubuntu trusty image with docker 10 - `e2e-node-ubuntu-trusty-docker10-v2-image`. The kernel version of the new ubuntu image is `4.4.0-47-generic`.
@dchen1107
/cc @kubernetes/sig-node
Mark v1.5 because this unblocks a v1.5 feature https://github.com/kubernetes/kubernetes/pull/32577.
This disables ready replica checking for 1.3 masters, but only from 1.4
or 1.5 clients. The old logic was broken anyway due to overlapping
labels with replica sets.
Gluster server needs a filesystem that supports extended attributes for its
data. Some distros (Debian, Ubuntu) ship Docker that runs containers on aufs,
which does not support extended attributes and therefore Gluster server fails
there.
We can use tmpfs for Gluster server data volumes instead of a directory inside
the container.
Automatic merge from submit-queue
Fixed e2e tests for HA master.
Set of fixes that allows HA master e2e tests to pass for removal/addition master replicas.
The summary of changes:
- fixed host name in etcd certs,
- added cluster validation after kube-down,
- fixed the number of master replicas in cluster validation,
- made MULTIZONE=true required for HA master deployments, ensured we correctly handle MULTIZONE=true when user wants to create HA master but not kubelets in multiple zones,
- extended verification of master replicas in HA master e2e tests.
Automatic merge from submit-queue
Test flake: ignore dig failure; wait until timeout
The intent of this test was to continue trying until the timeout has
been reached. The extra assertion makes the test fail early when it
should not.
https://github.com/kubernetes/kubernetes/issues/37144
Fixes #37144.
Automatic merge from submit-queue
Update the timeout in CLOSE_WAIT e2e test
I see some test flakes because the timeout is too strict, so I'm updating it to a larger value.
Adding `tail -n 1`, since it looks like there may be leftover state from other runs. We really only care about one of the CLOSE_WAIT entries.
Automatic merge from submit-queue
Test case for scaling down before scale-up has finished
Verifies that the current pet (the one created last by the scale-up action) enters Running before scale-down takes place.
related: #30082
In order to make NFSv3 work, the mounter needs to start the rpcbind daemon. This change modifies the mounter's Dockerfile and mounter script to start the rpcbind daemon if it is not running on the host.
After this change, we need to push the image and update the SHA in the changelog.
I see some test flakes because the timeout is too strict, so I'm updating it to a larger value.
Adding `tail -n 1`, since it looks like there may be leftover state from other runs. We really only care about one of the CLOSE_WAIT entries.
Automatic merge from submit-queue
Validate pet set scale up/down order
This change implements test cases to validate that:
- scale-up is done in order
- scale-down happens in reverse order
- if any pet is unhealthy, scale-up/down halts
related: https://github.com/kubernetes/kubernetes/issues/30082
Automatic merge from submit-queue
Filter out non-RestartAlways mirror pod in restart test.
Fixes #37202.
> A quick fix is to filter out non-RestartAlways pods, because both RestartNever and RestartOnFailure pods can succeed, and we cannot handle terminated mirror pods very well right now.
@yujuhong @gmarek
/cc @kubernetes/sig-node
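A minimal sketch of the filtering idea, with local stand-in types instead of the real Kubernetes API types:
```go
package main

import "fmt"

// Local stand-ins for the pod fields the filter cares about; the real test
// uses the Kubernetes API types instead.
type RestartPolicy string

const RestartPolicyAlways RestartPolicy = "Always"

type Pod struct {
	Name          string
	RestartPolicy RestartPolicy
}

// filterRestartAlways keeps only pods whose containers are always restarted,
// since RestartNever/RestartOnFailure mirror pods may legitimately terminate.
func filterRestartAlways(pods []Pod) []Pod {
	var kept []Pod
	for _, p := range pods {
		if p.RestartPolicy == RestartPolicyAlways {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	pods := []Pod{{"a", RestartPolicyAlways}, {"b", "Never"}}
	fmt.Println(filterRestartAlways(pods)) // only "a" remains
}
```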
Automatic merge from submit-queue
make groupVersionResource listing dynamic for namespace controller
@derekwaynecarr @kubernetes/sig-api-machinery
```release-note
Third party resources are now deleted when a namespace is deleted.
```
Fixes https://github.com/kubernetes/kubernetes/issues/32306
Verifies that the current pet (the one created last by the scale-up action) enters Running before scale-down takes place.
Change-Id: Ib47644c22b3b097bc466f68c5cac46c78dd44552
This change implements test cases to validate that:
scale-up is done in order
scale-down happens in reverse order
if any pet is unhealthy, scale-up/down halts
Change-Id: I4487a7c9823d94a2995bbb06accdfd8f3001354c
Automatic merge from submit-queue
Retry job update after failure to prevent modification conflict
This fixes the #34585 flake.
@janetkuo || @kubernetes/sig-apps ptal
I've been getting too many emails recently about that issue, so I wanted to "clean" my inbox a bit 😉
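The underlying pattern is a retry-on-conflict loop; here is a generic, self-contained sketch (the error value and helper names are illustrative, not the actual client-go API):
```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the apiserver's 409 Conflict error.
var errConflict = errors.New("the object has been modified; please apply your changes to the latest version")

// updateWithRetry retries the update when it hits a modification conflict.
// In the real fix the retried closure re-fetches the Job before mutating it.
func updateWithRetry(attempts int, update func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = update(); err == nil || !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	calls := 0
	err := updateWithRetry(3, func() error {
		calls++
		if calls < 2 {
			return errConflict // first attempt conflicts, second succeeds
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```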
Automatic merge from submit-queue
Fix kubectl Strategic Merge Patch compatibility
As @smarterclayton pointed out in [comment1](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290820) and [comment2](https://github.com/kubernetes/kubernetes/pull/35647#pullrequestreview-8290847) in PR #35647,
we cannot assume that the API servers publish their version or that they share the same version.
This PR removes all the calls of GetServerSupportedSMPatchVersion().
Change the behavior of `apply` and `edit` to retry with the old patch version if the new version fails.
Other usages of SMPatch default to the new version, since they don't update lists of primitives.
fixes #36916
cc: @pwittrock @smarterclayton
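A hedged sketch of the fallback behavior described above, with hypothetical patch callbacks standing in for the real kubectl helpers:
```go
package main

import (
	"errors"
	"fmt"
)

// patchWithFallback tries the newer patch version first and retries with the
// older one if the server rejects it. Both callbacks are hypothetical stand-ins
// for the real kubectl patching helpers.
func patchWithFallback(newPatch, oldPatch func() error) error {
	if err := newPatch(); err != nil {
		// The server may be older and not understand the new patch version.
		return oldPatch()
	}
	return nil
}

func main() {
	err := patchWithFallback(
		func() error { return errors.New("unsupported patch version") },
		func() error { return nil },
	)
	fmt.Println("err:", err) // falls back to the old version and succeeds
}
```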
Automatic merge from submit-queue
Eviction Thresholds Update
Sets the defaults for the eviction-hard threshold for GCE based on what we were using during testing: "memory.available<250Mi,nodefs.available<10%,nodefs.inodesFree<5%".
Sets flags for e2e tests to use eviction-minimum-reclaim: "nodefs.available<5%,nodefs.inodesFree<5%"
This fixes #32537.
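To make the threshold syntax concrete, here is a small sketch that splits such a flag value into signal/limit pairs (illustrative only, not the kubelet's actual parser):
```go
package main

import (
	"fmt"
	"strings"
)

// parseThresholds splits an eviction threshold flag value such as
// "memory.available<250Mi,nodefs.available<10%" into signal -> limit pairs.
// Illustrative only; this is not the kubelet's actual parser.
func parseThresholds(flag string) map[string]string {
	out := map[string]string{}
	for _, part := range strings.Split(flag, ",") {
		if kv := strings.SplitN(part, "<", 2); len(kv) == 2 {
			out[kv[0]] = kv[1]
		}
	}
	return out
}

func main() {
	fmt.Println(parseThresholds("memory.available<250Mi,nodefs.available<10%,nodefs.inodesFree<5%"))
}
```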
Automatic merge from submit-queue
Add limited config-map support to kube-dns
This is an integration bugfix for https://github.com/kubernetes/kubernetes/issues/36194
```release-note
kube-dns
Added --config-map and --config-map-namespace command line options.
If --config-map is set, kube-dns will load dynamic configuration from the config map
referenced by --config-map-namespace, --config-map. The config-map supports
the following properties: "federations".
--federations flag is now deprecated. Prefer to set federations via the config-map.
Federations can be configured by setting the "federations" field to the value currently
set in the command line.
Example:
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-dns
  namespace: kube-system
data:
  federations: abc=def
```
- Adds command line flags --config-map, --config-map-ns.
- Fixes 36194 (https://github.com/kubernetes/kubernetes/issues/36194)
- Update kube-dns yamls
- Update bazel (hack/update-bazel.sh)
- Update known command line flags
- Temporarily reference new kube-dns image (this will be fixed with
a separate commit when the DNS image is created)
Automatic merge from submit-queue
Add log rotation to kubemark
We were running out of disk in our large cluster tests, as we didn't have log rotation enabled.
cc @saad-ali
Automatic merge from submit-queue
remove TPR registration, ease validation requirements
Fixes https://github.com/kubernetes/kubernetes/issues/36007 .
This removes the special casing for TPRs inside of the `UnstructuredObject`, which should allow CRUD against skewed kube api server levels.
@kubernetes/kubectl @kubernetes/sig-cli
@janetkuo
Automatic merge from submit-queue
fix leaking memory backed volumes of terminated pods
Currently, we allow volumes to remain mounted on the node, even though the pod is terminated. This creates a vector for a malicious user to exhaust memory on the node by creating memory backed volumes containing large files.
This PR removes memory backed volumes (emptyDir w/ medium Memory, secrets, configmaps) of terminated pods from the node.
@saad-ali @derekwaynecarr
Automatic merge from submit-queue
Add e2e test for statefulset updates
Verify that one can (manually) update statefulset template
cc @erictune @foxish @kow3ns @kubernetes/sig-apps
Automatic merge from submit-queue
Wait for all Nodes to be schedulable before running e2e tests
This should fix the problem we're seeing when running tests on large clusters.
cc @dchen1107
Automatic merge from submit-queue
Delete taint annotation when removing last taint
It messes with debugging of tests failures.
cc @davidopp @kevin-wangzefeng
Automatic merge from submit-queue
Add more test cases to k8s e2e upgrade tests
**Special notes for your reviewer**:
Added guestbook, secrets, daemonsets, configmaps, jobs to e2e upgrade tests according to the discussions in #35078
Still need to run these test cases in real setup, raised a PR here for initial comments
@quinton-hoole
Automatic merge from submit-queue
Add clarity/retries to proxy url test
Improve one segment of the kube-proxy networking test by:
1. Retrying for 30s
2. Bucketing into 2 failure modes
3. Adding some clarity by describing the exec pod on failure
Although 1 shouldn't be necessary, I don't think we lose anything if the kube-proxy convenience endpoint doesn't respond immediately, and if it fails for 30s straight it is indicative of something that probably requires attention within 1.5.
Fixes https://github.com/kubernetes/kubernetes/issues/32436
Automatic merge from submit-queue
Enable NFSv4 and GlusterFS tests on cluster e2e tests
Enable NFSv4 and GlusterFS tests on cluster e2e tests for GCI images
only.
Automatic merge from submit-queue
Node E2E: Avoid printing test result twice.
This has been a problem for a long time.
`RunSshCommand` includes the command output in the error. If the command running the test fails, the test output is also included in the error. [The runner prints both the test output and the error](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/runner/remote/run_remote.go#L270), which causes the test result to be printed twice. (See the [test result](https://storage.googleapis.com/kubernetes-jenkins/logs/kubelet-gce-e2e-ci/10968/build-log.txt) on node tmp-node-e2e-af900a4d-e2e-node-ubuntu-trusty-docker9-v1-image)
This PR changes `RunSshCommand` so that it no longer puts the command output into the error, leaving the caller to decide how to handle the output when the command fails.
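A simplified sketch of the idea, returning output and error separately so the caller decides what to print (not the real RunSshCommand implementation):
```go
package main

import (
	"fmt"
	"os/exec"
)

// runCommand returns the combined output and the error separately instead of
// folding the (possibly huge) output into the error message, so the caller
// decides what to print. A simplified sketch, not the real RunSshCommand.
func runCommand(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).CombinedOutput()
	return string(out), err
}

func main() {
	out, err := runCommand("sh", "-c", "echo test output; exit 1")
	if err != nil {
		// Print the output once plus a short error, instead of an error that
		// embeds the whole output (which a runner would then print again).
		fmt.Printf("command failed: %v\noutput:\n%s", err, out)
	}
}
```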
Automatic merge from submit-queue
Add e2e test for CockroachDB statefulset
Refactor the code of statefulset e2e test for clustered applications, and add a test for CockroachDB.
The yaml file is copied from examples/cockroachdb/
cc @erictune @foxish @kow3ns @kubernetes/sig-apps
Automatic merge from submit-queue
e2e pod cleanup test: restrict pods to be assigned to nodes observed …
The test checks the individual kubelet /runningPods endpoint based on the
initial list of nodes it observes. It is important that all pods are
scheduled only onto those nodes. Apply node labels to ensure no stray pods on
other nodes.
This fixes #35197
kubectl-in-pod.json has a hardcoded path for kubectl in its hostPath ('/usr/bin/kubectl'). When the test actually runs on GKE, kubectl is available at '/workspace/kubernetes/platforms/linux/amd64/kubectl' and NOT at '/usr/bin/kubectl', so we need to fix the hostPath for the test to work properly.
Fixes #36586
Automatic merge from submit-queue
Garbage collection tests the MaxPerPodContainers and MaxContainers constraints
This is the first version of this test. It tests that containers are garbage collected according to the default configuration.
Automatic merge from submit-queue
Cleanup pod in MatchContainerOutput
MatchContainerOutput always creates a pod and does not clean up. We need to fix this to be better at retrying the scenarios.
When there is an error, say in the first attempt of ExpectNoErrorWithRetries (for example in the "Pods should contain environment variables for services" test), the retry logic calls MatchContainerOutput again, and podClient.create correctly fails because the pod was not cleaned up the first time MatchContainerOutput was called.
Fixes #35089
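A minimal sketch of the fix, with hypothetical createPod/deletePod/match callbacks standing in for the real pod client and output matcher:
```go
package main

import "fmt"

// matchContainerOutput schedules pod cleanup before matching output, so a
// retried call can recreate the pod with the same name. The callbacks are
// hypothetical stand-ins for the real pod client and output matcher.
func matchContainerOutput(createPod func() error, deletePod func(), match func() error) error {
	if err := createPod(); err != nil {
		return err
	}
	defer deletePod() // cleanup runs even when matching fails
	return match()
}

func main() {
	livePods := 0
	err := matchContainerOutput(
		func() error { livePods++; return nil },
		func() { livePods-- },
		func() error { return nil },
	)
	fmt.Println("leftover pods:", livePods, "err:", err)
}
```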
Automatic merge from submit-queue
Change dnsutils image to use alpine
This reduces the size of the dnsutils image and should reduce the # of failed e2e test runs due to image pull timeout.
MatchContainerOutput always creates a pod and does not clean up. We need to fix this to be better at retrying the scenarios.
When there is an error, say in the first attempt of ExpectNoErrorWithRetries (for example in the "Pods should contain environment variables for services" test), the retry logic calls MatchContainerOutput again, and podClient.create correctly fails because the pod was not cleaned up the first time MatchContainerOutput was called.
Fixes #35089
Automatic merge from submit-queue
[kubelet] rename --cgroups-per-qos to --experimental-cgroups-per-qos
This reflects the true nature of the "cgroups per qos" feature.
```release-note
* Rename `--cgroups-per-qos` to `--experimental-cgroups-per-qos` in Kubelet
```
Automatic merge from submit-queue
tests: update memory resource limits
```release-note
NONE
```
On Ubuntu, the `RestartNever` test OOMs its cgroup limit fairly frequently.
This bumps the limit up to something suitably large, since the test isn't testing anything related to this anyway.
Fixes #36159
Fix based on https://github.com/kubernetes/kubernetes/issues/36159#issuecomment-258992255
cc @yujuhong @saad-ali
The test checks the individual kubelet /runningPods endpoint based on the
initial list of nodes it observes. It is important that all pods are
scheduled only onto those nodes. Apply node labels to ensure no stray pods on
other nodes.
On Ubuntu, the `RestartNever` test OOMs its cgroup limit fairly frequently.
This bumps the limit up to something suitably large, since the test isn't testing anything related to this anyway.
Fixes #36159
Automatic merge from submit-queue
Use generous limits in the resource usage tracking tests
These tests are mainly used to catch resource leaks in the soak cluster. Using
higher limits to reduce noise.
This should fix #32942 and #32214.
Automatic merge from submit-queue
Bump GCI version to gci-dev-56-8977-0-0
@vishh @saad
``` release-note
Updating GCI base image to gci-dev-56-8977-0-0. Changelog as follows:
* runc: Eliminate redundant parsing of mountinfo
* Updated kubernetes to v1.4.5
```
Automatic merge from submit-queue
Add back e2e tests for disruption budget
The tests were temporarily removed due to problems with api versions in client-go. The test is almost exactly the same as it used to be before api version change.
cc: @caesarxuchao @davidopp
Automatic merge from submit-queue
Restore event messages for replica sets in the deployment controller
Needed to unblock release upgrade tests (see https://github.com/kubernetes/kubernetes/issues/36453)
@kubernetes/deployment ptal
Automatic merge from submit-queue
Node Conformance & E2E: Get node name from node object.
This PR changes the node e2e test framework to get node name from apiserver instead of test flags.
When a user tried out the node conformance test, they found that the node conformance test does not work properly if the kubelet is started with `hostname-override`.
The reason is that the node conformance test uses [the default node name - `os.Hostname`](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/e2e_node_suite_test.go#L124), which may differ from `hostname-override`. This causes the test pods not to be scheduled, and eventually the test times out.
We could expose a flag from the node conformance test and let users set the node name themselves when they use `hostname-override` on the kubelet. However, having the framework automatically detect it from the apiserver is more user-friendly.
/cc @kubernetes/sig-node
This PR 1) only changes node e2e test framework; 2) fixes a problem in node conformance test which is a 1.5 feature. @saad-ali Can we have this in 1.5?
Automatic merge from submit-queue
Add e2e node test for log path
fixes #34661
A node e2e test to check whether container log files are properly created with the right content.
Since the log files under `/var/log/containers` are actually symlinks to the Docker container log files, we cannot use a pod to mount them in and check them (symlinks are not supported by Docker volumes).
cc @Random-Liu
Automatic merge from submit-queue
Migrates addons from RCs to Deployments
Fixes #33698.
Below addons are being migrated:
- kube-dns
- GLBC default backend
- Dashboard UI
- Kibana
For the new Deployments, the version suffixes are removed from their names. Version-related labels are also removed because they are confusing and no longer needed given how Deployments and the new Addon Manager work.
The `replica` field in `kube-dns` Deployment manifest is removed for the incoming DNS horizontal autoscaling feature #33239.
The `replica` field in `Dashboard` Deployment manifest is also removed because the rescheduler e2e test is manually scaling it.
Some resource limit related fields in `heapster-controller.yaml` are removed, as they will be set up by the `addon resizer` containers. Detailed reasons in #34513.
Three e2e tests are modified:
- `rescheduler.go`: Changed to resize Dashboard UI Deployment instead of ReplicationController.
- `addon_update.go`: Some namespace related changes in order to make it compatible with the new Addon Manager.
- `dns_autoscaling.go`: Changed to examine kube-dns Deployment instead of ReplicationController.
Both of the above tests passed on my own cluster. The upgrade process --- from old Addons with RCs to new Addons with Deployments --- was also tested and worked as expected.
The last commit upgrades the Addon Manager to v6.0. It is still a work in progress and is currently waiting for #35220 to be finished. (The Addon Manager image in use comes from a non-official registry, but it mostly works except for some corner cases.)
@piosz @gmarek could you please review the heapster part and the rescheduler test?
@mikedanese @thockin
cc @kubernetes/sig-cluster-lifecycle
---
Notes:
- The kube-dns manifest still uses *-rc.yaml for the new Deployment. The stale file names are preserved here to allow faster review. I may send out a PR to re-organize kube-dns's file names after this.
- The Heapster Deployment's name remains in the old style (with the `-v1.2.0` suffix) to avoid describing this upgrade transition explicitly. This way we don't need to attach fake apply labels to the old Deployments.
Automatic merge from submit-queue
Adding cascading deletion support to more federation controllers
Ref #33612
Adding cascading deletion support for federated daemonsets and ingress.
The code is the same as that for namespaces. Just ensuring that the DeletionHelper functions are called in the right places in these controllers.
e2e tests coming up in another PR.
cc @kubernetes/sig-cluster-federation @caesarxuchao @madhusudancs @mwielgus
```release-note
federation: Adding support for DeleteOptions.OrphanDependents for federated daemonsets and ingresses. Setting it to false while deleting a federated daemonset or ingress also deletes the corresponding resource from all registered clusters.
```
Automatic merge from submit-queue
Enable NFS and GlusterFS tests in both node and cluster e2e tests
This PR enables NFS and GlusterFS tests in both node and cluster e2e tests.
It also changes the code to use ExecCommandInPod instead of kubectl, since the node does not have kubectl available.
This PR enables NFS and GlusterFS tests in both node and cluster e2e tests for the gci and containervm distros.
It also changes the code to use ExecCommandInPod instead of kubectl, since the node does not have kubectl available.
Automatic merge from submit-queue
Add Windows support to kube-proxy
**What this PR does / why we need it**:
This is the first stab at supporting kube-proxy (userspace mode) on Windows
**Which issue this PR fixes** :
fixes #30278
**Special notes for your reviewer**:
The MVP uses `netsh portproxy` to redirect traffic from `ServiceIP:ServicePort` to a `LocalIP:LocalPort`.
For the next version we are expecting to have guidance from Microsoft Container Networking team.
**Limitations**:
Current implementation does not support DNS queries over UDP as `netsh portproxy` currently only supports TCP. We are working with Microsoft to remediate this.
cc: @brendandburns @dcbw
**Release note**:
```release-note
```
Automatic merge from submit-queue
Extend test timeout for LB creation in large clusters
This will most probably be necessary to test 3000-node clusters.
Automatic merge from submit-queue
Support persistent volume usage for kubernetes running on Photon Controller platform
**What this PR does / why we need it:**
Enable the persistent volume usage for kubernetes running on Photon platform.
Photon Controller: https://vmware.github.io/photon-controller/
_Only the first commit includes the real code change.
The following commits are for third-party vendor dependencies and auto-generated code/docs updates._
Two components are added:
pkg/cloudprovider/providers/photon: support Photon Controller as cloud provider
pkg/volume/photon_pd: support Photon persistent disk as volume source for persistent volume
Usage introduction:
a. Photon Controller is supported as cloud provider.
When choosing Photon Controller as the cloud provider, "--cloud-provider=photon --cloud-config=[path_to_config_file]" is required for kubelet/kube-controller-manager/kube-apiserver. The Photon Controller config file should use the following format:
```
[Global]
target = http://[photon_controller_endpoint_IP]
ignoreCertificate = true
tenant = [tenant_name]
project = [project_name]
overrideIP = true
```
b. Photon persistent disk is supported as volume source/persistent volume source.
yaml usage:
```
volumes:
- name: photon-storage-1
  photonPersistentDisk:
    pdID: "643ed4e2-3fcc-482b-96d0-12ff6cab2a69"
```
pdID is the persistent disk ID from Photon Controller.
c. Enable Photon Controller as volume provisioner.
yaml usage:
```
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: gold_sc
provisioner: kubernetes.io/photon-pd
parameters:
  flavor: persistent-disk-gold
```
The flavor "persistent-disk-gold" needs to be created by Photon platform admin before hand.
Automatic merge from submit-queue
add e2e test for kubectl in a Pod
Add an e2e test to make sure kubectl can talk to the API server when it is mounted in a pod.
Fixes: #33138
Automatic merge from submit-queue
Make GCI nodes mount non tmpfs, ext* & bind mounts using an external mounter
This PR downloads the stage1 & gci-mounter ACIs as part of cluster bring up instead of downloading them dynamically from gcr.io, which was the cause for #36206.
I have also optimized the containerized mounter to pre-load the mounter image once to avoid fetch latency while using it.
Original PR which got reverted: https://github.com/kubernetes/kubernetes/pull/35821
```release-note
GCI nodes use an external mounter script to mount NFS & GlusterFS storage volumes
```
@mtaufen Node e2e is not re-enabled in this PR.
cc @jingxu97
Automatic merge from submit-queue
Node Conformance Test: Containerize the node e2e test
For #30122, #30174.
Based on #32427, #32454.
**Please only review the last 3 commits.**
This PR packages the node e2e test into a docker image:
- 1st commit: Add `NodeConformance` flag in the node e2e framework to avoid starting kubelet and collecting system logs. We do this because:
- There are all kinds of ways to manage kubelet and system logs; for different situations we need to mount different things into the container and run different commands. It is hard and unnecessary to handle that complexity inside the test suite.
- 2nd commit: Remove all `sudo` in the test container. We do this because:
- In most containers, there is no `sudo` command, and there is no need to use `sudo` inside the container.
- It introduces some complexity to use `sudo` inside the test. (https://github.com/kubernetes/kubernetes/issues/29211, https://github.com/kubernetes/kubernetes/issues/26748) In fact we just need to run the test suite with `sudo`.
- 3rd commit: Package the test into a docker container with corresponding `Makefile` and `Dockerfile`. We also added a `run_test.sh` script to start kubelet and run the test container. The script is only for demonstration purpose and we'll also use the script in our node e2e framework. In the future, we should update the script to start kubelet in production way (maybe with `systemd` or `supervisord`).
@dchen1107 @vishh
/cc @kubernetes/sig-node @kubernetes/sig-testing
**Release note**:
``` release-note
Release alpha version node test container gcr.io/google_containers/node-test-ARCH:0.1 for users to verify their node setup.
```
Automatic merge from submit-queue
Adding cascading deletion support for federated secrets
Ref https://github.com/kubernetes/kubernetes/issues/33612
Adding cascading deletion support for federated secrets.
The code is the same as that for namespaces. Just ensuring that the DeletionHelper functions are called in the right places in secret_controller.
Also added e2e tests.
cc @kubernetes/sig-cluster-federation @caesarxuchao
```release-note
federation: Adding support for DeleteOptions.OrphanDependents for federated secrets. Setting it to false while deleting a federated secret also deletes the corresponding secrets from all registered clusters.
```
Automatic merge from submit-queue
Rename experimental-runtime-integration-type to experimental-cri
Also rename the field in the component config to `EnableCRI`
The e2e tests cover cases like cluster size changes, parameter changes, ConfigMap deletion, autoscaler pod deletion, etc. They are separated into a fast part (which can run in parallel) and a slow part (put in [serial]). The fast part of the e2e tests takes around 50 seconds to run.
Automatic merge from submit-queue
Rename ScheduledJobs to CronJobs
I went with @smarterclayton's idea of registering named types in the schema. This way we can support both the new (CronJobs) and old (ScheduledJobs) resource names. Fixes #32150.
fyi @erictune @caesarxuchao @janetkuo
Not ready yet, but getting close there...
**Release note**:
```release-note
Rename ScheduledJobs to CronJobs.
```
This allows us to interrupt/kill the executed command if it exceeds the timeout (not implemented by this commit).
Set the timeout in Exec probes. HTTPGet and TCPSocket probes respect the timeout, while Exec probes used to ignore it.
Add an e2e test for an exec probe with a timeout. However, the test is skipped as long as the default exec handler doesn't support timeouts.
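A tiny standalone sketch of interrupting an exec'd command once a timeout expires (the real exec probe handler would wire the probe's timeoutSeconds into something similar):
```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

func main() {
	// exec.CommandContext kills the child process once the context's deadline
	// passes; this is the mechanism the timeout support relies on.
	ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancel()

	out, err := exec.CommandContext(ctx, "sleep", "10").CombinedOutput()
	fmt.Printf("out=%q err=%v (killed after the 1s timeout)\n", out, err)
}
```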
Automatic merge from submit-queue
Adding more e2e tests for federated namespace cascading deletion and fixing bugs
Ref https://github.com/kubernetes/kubernetes/issues/33612
Adding more e2e tests for testing cascading deletion of federated namespace.
The new tests verify that cascading deletion happens when DeleteOptions.OrphanDependents=false and does not happen when DeleteOptions.OrphanDependents=true.
Also updated the deletion helper to always add the OrphanFinalizer. The generic registry will remove it if DeleteOptions.OrphanDependents=false. Also updated the namespace registry to do the same.
We need to add the orphan finalizer to keep the orphan-by-default behavior. We assume that the dependents are going to be orphaned and hence add that finalizer. If users do not want the orphan behavior, they can say so via DeleteOptions, and the registry will then remove that finalizer.
cc @kubernetes/sig-cluster-federation @caesarxuchao @derekwaynecarr
Automatic merge from submit-queue
Node Conformance Test: Add system verification
For #30122 and #29081.
This PR introduces a system verification test in the node e2e and conformance tests. It runs before the real test. If the system verification fails, the test just fails. The output of the system verification looks like this:
```
I0909 23:33:20.622122 2717 validators.go:45] Validating os...
OS: Linux
I0909 23:33:20.623274 2717 validators.go:45] Validating kernel...
I0909 23:33:20.624037 2717 kernel_validator.go:79] Validating kernel version
KERNEL_VERSION: 3.16.0-4-amd64
I0909 23:33:20.624146 2717 kernel_validator.go:93] Validating kernel config
CONFIG_NAMESPACES: enabled
CONFIG_NET_NS: enabled
CONFIG_PID_NS: enabled
CONFIG_IPC_NS: enabled
CONFIG_UTS_NS: enabled
CONFIG_CGROUPS: enabled
CONFIG_CGROUP_CPUACCT: enabled
CONFIG_CGROUP_DEVICE: enabled
CONFIG_CGROUP_FREEZER: enabled
CONFIG_CGROUP_SCHED: enabled
CONFIG_CPUSETS: enabled
CONFIG_MEMCG: enabled
I0909 23:33:20.679328 2717 validators.go:45] Validating cgroups...
CGROUPS_CPU: enabled
CGROUPS_CPUACCT: enabled
CGROUPS_CPUSET: enabled
CGROUPS_DEVICES: enabled
CGROUPS_FREEZER: enabled
CGROUPS_MEMORY: enabled
I0909 23:33:20.679454 2717 validators.go:45] Validating docker...
DOCKER_GRAPH_DRIVER: aufs
```
It verifies the system following a predefined `SysSpec`:
``` go
// DefaultSysSpec is the default SysSpec.
var DefaultSysSpec = SysSpec{
	OS:            "Linux",
	KernelVersion: []string{`3\.[1-9][0-9].*`, `4\..*`}, // Requires 3.10+ or 4+
	// TODO(random-liu): Add more config
	KernelConfig: KernelConfig{
		Required: []string{
			"NAMESPACES", "NET_NS", "PID_NS", "IPC_NS", "UTS_NS",
			"CGROUPS", "CGROUP_CPUACCT", "CGROUP_DEVICE", "CGROUP_FREEZER",
			"CGROUP_SCHED", "CPUSETS", "MEMCG",
		},
		Forbidden: []string{},
	},
	Cgroups: []string{"cpu", "cpuacct", "cpuset", "devices", "freezer", "memory"},
	RuntimeSpec: RuntimeSpec{
		DockerSpec: &DockerSpec{
			Version:     []string{`1\.(9|\d{2,})\..*`}, // Requires 1.9+
			GraphDriver: []string{"aufs", "overlay", "devicemapper"},
		},
	},
}
```
Currently, it only supports:
- Kernel validation: version validation and kernel configuration validation
- Cgroup validation: validating whether required cgroups subsystems are enabled.
- Runtime Validation: currently, only validates docker graph driver.
The validating framework is ready. The specific validation items could be added over time.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
lister-gen updates
- Remove "zz_generated." prefix from generated lister file names
- Add support for expansion interfaces
- Switch to new generated JobLister
@deads2k @liggitt @sttts @mikedanese @caesarxuchao for the lister-gen changes
@soltysh @deads2k for the informer / job controller changes
Automatic merge from submit-queue
Add cmd support to gcp auth provider plugin
**What this PR does / why we need it**:
Adds the ability for the gcp auth provider plugin to get an access token by shelling out to an external command. We need this because, for GKE, kubectl should be using gcloud credentials. It currently uses Google application default credentials, which causes confusion if the user has configured both with different permissions (previously the two were almost always identical).
**Which issue this PR fixes**:
Addresses #35530 with gcp-only solution, as generic cmd plugin was deemed not useful for other providers.
**Special notes for your reviewer**:
Configuration options are there to support whatever future command gcloud provides for printing the access token of the active user. It also works with the existing command (`gcloud auth print-access-token`).
```release-note
```
Automatic merge from submit-queue
Remove GetRootContext method from VolumeHost interface
Remove the `GetRootContext` call from the `VolumeHost` interface, since Kubernetes no longer needs to know the SELinux context of the Kubelet directory.
Per #33951 and #35127.
Depends on #33663; only the last commit is relevant to this PR.
Automatic merge from submit-queue
Per Volume Inode Accounting
Collects volume inode stats using the same find command as cadvisor. The command is "find _path_ -xdev -printf '.' | wc -c". The output is passed to the summary API, and will be consumed by the eviction manager.
This cannot be merged yet, as it depends on changes adding the InodesUsed field to the summary API, and on the eviction manager consuming it. Expect tests to fail until this happens.
DEPENDS ON #35137
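For intuition, a rough Go approximation of that pipeline, counting entries under a path (illustrative only; the real collector shells out to find so its numbers match cadvisor's, and this sketch does not honor -xdev):
```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
)

// countEntries approximates the find|wc pipeline by counting every directory
// entry under path. Illustrative only: the real collector shells out to find
// so its numbers match cadvisor's, and this sketch does not honor -xdev.
func countEntries(path string) (int64, error) {
	var n int64
	err := filepath.WalkDir(path, func(_ string, _ fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		n++
		return nil
	})
	return n, err
}

func main() {
	n, err := countEntries(".")
	fmt.Println("entries:", n, "err:", err)
}
```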
Automatic merge from submit-queue
[AppArmor] Hold bad AppArmor pods in pending rather than rejecting
Fixes https://github.com/kubernetes/kubernetes/issues/32837
Overview of the fix:
If the Kubelet needs to reject a Pod for a reason that the control plane doesn't understand (e.g. which AppArmor profiles are installed on the node), then it might continuously try to run the pod on the same rejecting node. This change adds a concept of "soft rejection", in which the Pod is admitted, but not allowed to run (and therefore held in a pending state). This prevents the pod from being retried on other nodes, but also prevents the high churn. This is consistent with how other missing local resources (e.g. volumes) are handled.
A side effect of the change is that Pods which are not initially runnable will be retried. This is desired behavior since it avoids a race condition when a new node is brought up but the AppArmor profiles have not yet been loaded on it.
``` release-note
Pods with invalid AppArmor configurations will be held in a Pending state, rather than rejected (failed). Check the pod status message to find out why it is not running.
```
@kubernetes/sig-node @timothysc @rrati @davidopp
Automatic merge from submit-queue
Update how we detect overlapping deployments
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #24152
**Special notes for your reviewer**: cc @kubernetes/deployment
**Release note**:
```release-note
NONE
```
When looking for overlapping deployments, we should also find other deployments that select current deployment's pods,
not just the ones whose pods are selected by current deployment.
Automatic merge from submit-queue
New command: "kubeadm token generate"
As part of #33930, this PR adds a new top-level command to kubeadm to just generate a token for use with the init/join commands. Otherwise, users are left to either figure out how to generate a token on their own, or let `kubeadm init` generate a token, capture and parse the output, and then use that token for `kubeadm join`.
At this point, I was hoping for feedback on the CLI experience, and then I can add tests. I spoke with @mikedanese and he didn't like the original proposal of `kubeadm util generate-token`, so here are the runners-up:
```
$ kubeadm generate-token # <--- current implementation
$ kubeadm generate token # in case kubeadm might generate other things in the future?
$ kubeadm init --generate-token # possibly as a subcommand of an existing one
```
Currently, the output is simply the token on one line without any padding/formatting:
```
$ kubeadm generate-token
1087fd.722b60cdd39b1a5f
```
CC: @kubernetes/sig-cluster-lifecycle
**Release note**:
``` release-note
New kubeadm command: generate-token
```
Automatic merge from submit-queue
Add --force to kubectl delete and explain force deletion
--force is required for --grace-period=0. --now is equivalent to --grace-period=1.
Improve command help to explain what graceful deletion is and warn about
force deletion.
Part of #34160 & #29033
```release-note
In order to bypass graceful deletion of pods (to immediately remove the pod from the API) the user must now provide the `--force` flag in addition to `--grace-period=0`. This prevents users from accidentally force deleting pods without being aware of the consequences of force deletion. Force deleting pods for resources like StatefulSets can result in multiple pods with the same name having running processes in the cluster, which may lead to data corruption or data inconsistency when using shared storage or common API endpoints.
```
Automatic merge from submit-queue
NPD: Add e2e test for NPD v0.2.
Node problem detector has been updated after v0.1, including:
1. Add lookback support. It will look back for a configured time to search for a possible kernel panic before a node reboot.
2. Get node name via downward api.
This PR updates the test to test the new NPD behavior.
@dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Made changes to DELETE API to let v1.DeleteOptions be passed in as a queryParameter
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes #34856
```release-note
DELETE requests can now pass in their DeleteOptions as a query parameter or a body parameter, rather than just as a body parameter.
```
Automatic merge from submit-queue
Set the annotation only if the test requires it.
**What this PR does / why we need it**: Fixes StatefulSet flake
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/36107
**Special notes for your reviewer**: We shouldn't be setting the debug annotation in all our tests, only the ones that bring statefulset pods up one after another. In the absence of the annotation, we have the new default behavior governed by https://github.com/kubernetes/kubernetes/pull/35739
**Release note**:
```release-note
NONE
```
cc @kubernetes/sig-apps @bprashanth @calebamiles
Automatic merge from submit-queue
Temporarily disable GCI mounter in e2e node tests
This is just so we have an off-switch ready to go if we need it. Don't merge unless we need to disable this functionality in the e2e node tests.
Automatic merge from submit-queue
Controller changes for perma failed deployments
This PR adds support for reporting failed deployments based on a timeout
parameter defined in the spec. If there is no progress for the amount
of time defined as progressDeadlineSeconds then the deployment will be
marked as failed by a Progressing condition with a ProgressDeadlineExceeded
reason.
Follow-up to https://github.com/kubernetes/kubernetes/pull/19343
Docs at kubernetes/kubernetes.github.io#1337
Fixes https://github.com/kubernetes/kubernetes/issues/14519
@kubernetes/deployment @smarterclayton
Automatic merge from submit-queue
add secret type to RBD secrets in examples and e2e test
**What this PR does / why we need it**:
This is a followup to recent changes in secret type matching
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
@kubernetes/sig-storage @liggitt
**Release note**:
```release-note
```
Automatic merge from submit-queue
Federated ConfigMap controller
Based on the secrets controller. E2e tests will come in the next PR.
**Release note**:
``` release-note
Federated ConfigMap controller. Supports all the API that regular ConfigMap has.
```
cc: @quinton-hoole @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
Remove statefulset e2e test setup for alpha
Depends on #35731. Once StatefulSet is beta, it doesn't need special treatment for the alpha version in e2e tests.
cc @erictune @foxish @kubernetes/sig-apps
Automatic merge from submit-queue
Fixed kibana test problem
After #36103, a problem was introduced: if you change basePath, dynamic bundle optimization is performed upon startup, which can take several minutes.
This PR fixes the problem by waiting until the optimization is completed.
@piosz
Edit: This problem has no known workarounds if basePath is dynamic (like in our case)
Reference: https://discuss.elastic.co/t/how-to-avoid-dynamic-bundle-optimization/37367
Automatic merge from submit-queue
Remove non-generic options from genericapiserver.Config
Remove non-generic options from genericapiserver.Config. Changes the discovery CIDR/IP information to an interface and then demotes several fields.
I haven't pulled them from genericapiserver.Options, but that's a future option we have. Segregation can come as a follow-up at the very least.
Automatic merge from submit-queue
Enable NFS volume test
This PR fixes the Dockerfile for the NFS server image and enables NFSv4.
After switching to the containerized mounter approach on GCI, this test should pass.
The functionality used to exist entirely in the NodeController, which would previously clean up pods and nodes together. Now, we simply wait for the PodGC to see that the node has been deleted and clean up the pods. This may take a while, hence we set a 1 minute timeout.
Automatic merge from submit-queue
Switch DisruptionBudget api from bool to int allowed disruptions [only v1beta1]
Continuation of #34546. Apparently there is some bug that prevents us from having 2 different incompatible versions of the API in integration tests. So in this PR v1alpha1 is removed until the testing infrastructure is fixed.
Base PR comment:
Currently there is a single bool in the disruption budget API that denotes whether 1 pod can be deleted or not. Every time a pod is deleted, the apiserver flips the bool to false, and the disruption budget controller sets it to true if more deletions are allowed. This works, but it is far from optimal when the user wants to delete multiple pods (for example, by decreasing the replicaset size from 10000 to 8000).
This PR adds a new API version v1beta1 and changes the bool to an int that contains the number of pods that can be deleted at once.
cc: @davidopp @mml @wojtek-t @fgrzadkowski @caesarxuchao
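A sketch of the shape change, with illustrative struct and field names rather than the exact API types:
```go
package main

import "fmt"

// Sketch of the shape change: the old status carried a single bool saying
// whether one pod could be deleted, the new status carries an integer budget.
// Field names here are illustrative, not necessarily the exact API names.
type oldStatus struct {
	PodDisruptionAllowed bool
}

type newStatus struct {
	PodDisruptionsAllowed int32 // how many pods may be deleted right now
}

func main() {
	// e.g. shrinking a replicaset from 10000 to 8000 can proceed in one step
	// when the budget allows 2000 simultaneous disruptions.
	fmt.Println(oldStatus{PodDisruptionAllowed: true}, newStatus{PodDisruptionsAllowed: 2000})
}
```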
--v=2 is low noise (record changes), can be default
--v=3 shows per-request logging
Note: due to the code path with which we integrate with
skydns, we don't see non-PILLAR_DOMAIN requests, so these
will never be logged.
Automatic merge from submit-queue
[Federation] Add unit tests for `kubefed init`'s certificate generator.
Please review only the last commit here. This is based on PR #35594 which will be reviewed independently.
These are a subset of unit tests for code introduced in PR #35594
Design Doc: PR #34484
cc @kubernetes/sig-cluster-federation @quinton-hoole
Automatic merge from submit-queue
promote /healthz and /metrics to genericapiserver
Promotes `/healthz` to genericapiserver with methods to add healthz checks before running.
Promotes `/metrics` to genericapiserver gated by config flag.
@lavalamp adds the healthz checks linked to `postStartHooks` as promised.
Automatic merge from submit-queue
[Kubelet] Use the custom mounter script for Nfs and Glusterfs only
This patch reduces the scope of the containerized mounter to NFS and GlusterFS on GCE + GCI clusters.
This patch also enables the containerized mounter on GCI nodes.
Shepherding multiple PRs through the submit queue is painful, hence I combined them into this PR. Please review each commit individually.
cc @jingxu97 @saad-ali
https://github.com/kubernetes/kubernetes/pull/35652 has also been reverted as part of this PR
Automatic merge from submit-queue
pod and qos level cgroup support
```release-note
[Kubelet] Add alpha support for `--cgroups-per-qos` using the configured `--cgroup-driver`. Disabled by default.
```
Automatic merge from submit-queue
Move Statefulset (previously PetSet) to v1beta1
**What this PR does / why we need it**: #28718
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes #
**Special notes for your reviewer**: depends on #35663 (PetSet rename)
cc @erictune @foxish @kubernetes/sig-apps
**Release note**:
``` release-note
v1beta1/StatefulSet replaces v1alpha1/PetSet.
```
When looking for overlapping deployments, we should also find other deployments that select current deployment's pods,
not just the ones whose pods are selected by current deployment.
--force is required for --grace-period=0. --now is equivalent to --grace-period=1.
Improve command help to explain what graceful deletion is and warn about
force deletion.
Automatic merge from submit-queue
Move .gitattributes annotation to the root, so GitHub will respect them.
This should fix the merge conflicts by letting GitHub use the simpler line-by-line algorithm for this file. Having .gitattributes in a sub-directory would work for local merging, but would show conflicts on the web UI.
Automatic merge from submit-queue
[Federation][join-01] Implement `kubefed join` command.
Supersedes PR #35155.
Please review only the last commit here. This is based on PR #35492 which will be reviewed independently.
I will add a release note separately for this entire feature, so please don't worry too much about the release note here in the PR.
Design Doc: PR #34484
cc @kubernetes/sig-cluster-federation @quinton-hoole @mwielgus
Automatic merge from submit-queue
Making the pod.alpha.kubernetes.io/initialized annotation optional in PetSet pods
**What this PR does / why we need it**: As of now, the absence of the annotation `pod.alpha.kubernetes.io/initialized` in PetSets causes the PetSet controller to effectively "pause". Since it is a debug hook, users expect that its absence has no effect on the working of a PetSet. This PR inverts the logic so that the PetSet controller operates as expected in the absence of the annotation.
Letting the annotation remain alpha seems ok. Renaming it to something more meaningful needs further discussion.
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes https://github.com/kubernetes/kubernetes/issues/35498
**Special notes for your reviewer**:
**Release note**:
``` release-note
The annotation "pod.alpha.kubernetes.io/initialized" on StatefulSets (formerly PetSets) is now optional and only encouraged for debug use.
```
cc @erictune @smarterclayton @bprashanth @kubernetes/sig-apps
@kow3ns The examples will need to be cleaned up as well I think later on to remove them.
Automatic merge from submit-queue
Disable gci-mounter in cri node e2e tests
gci-mounter is still being validated and there are known issues. Do not enable it
for cri tests for now.
Automatic merge from submit-queue
Node controller to not force delete pods
Fixes https://github.com/kubernetes/kubernetes/issues/35145
- [x] e2e tests to test Petset, RC, Job.
- [x] Remove and cover other locations where we force-delete pods within the NodeController.
**Release note**:
``` release-note
Node controller no longer force-deletes pods from the api-server.
* For StatefulSet (previously PetSet), this change means creation of replacement pods is blocked until old pods are definitely not running (indicated either by the kubelet returning from partitioned state, or deletion of the Node object, or deletion of the instance in the cloud provider, or force deletion of the pod from the api-server). This has the desirable outcome of "fencing" to prevent "split brain" scenarios.
* For all other existing controllers except StatefulSet, this has no effect on the ability of the controller to replace pods because the controllers do not reuse pod names (they use generate-name).
* User-written controllers that reuse names of pod objects should evaluate this change.
```
Automatic merge from submit-queue
Disruption e2e: wait for running pods in the table test, too
**What this PR does / why we need it**: Makes the `DisruptionController` tests less flaky.
**Which issue this PR fixes**: fixes #34032
Automatic merge from submit-queue
e2e: Fix GetReadySchedulableNodesOrDie for taints
**What this PR does / why we need it**:
This changes framework.GetReadySchedulableNodesOrDie and
framework.GetMasterAndWorkerNodesOrDie so that nodes that can't take a
generic fake pod due to a taint/toleration mismatch aren't returned.
This is a rehash of #35210, but pulls in the scheduler code.
**Which issue this PR fixes**: c.f. #35210
**Special notes for your reviewer**: I think it's gross that we keep having to manually compute this in e2es. Maybe we need a bug for that?
Automatic merge from submit-queue
Eviction manager evicts based on inode consumption
Fixes #32526. Integrates cadvisor per-container inode stats into the summary API. Makes the eviction manager act on inode consumption so that it evicts the pods using the most inodes.
This PR is pending on a cadvisor godeps update which will be included in PR #35136
This changes framework.GetReadySchedulableNodesOrDie and
framework.GetMasterAndWorkerNodesOrDie so that nodes that can't take a
generic fake pod due to a taint/toleration mismatch aren't returned.
This is a rehash of #35210, but pulls in the scheduler code.
Run:
etcd &
kube-apiserver --etcd-servers=... ...
UPDATE_NODE_APISERVER go test ./test/integration/master
-test.run=TestUpdateNodeObjects -test.v -tags integration
Simulates the core update loops from nodes to the API server, allowing
baseline profiling for steady state of large clusters. May require
tweaking the http.Transport used by the client to support >N idle
connections to the master.
Automatic merge from submit-queue
remove non-reuseable bits of MasterServer
Scrub `master.go` again. I think I'm pretty happy with this shape. I may promote `InstallAPIs` since we're likely to want it downstream.
Automatic merge from submit-queue
Allow apiserver to choose preferred kubelet address type
Follow up to #33718 to stay compatible with clusters using DNS names for master->node communications. Adds the `--kubelet-preferred-address-types` apiserver flag for clusters that prefer a different node address type.
```release-note
The apiserver can now select which type of kubelet-reported address to use for master->node communications, using the --kubelet-preferred-address-types flag.
```
Automatic merge from submit-queue
Misc master and federation cleanups
- misc small cleanups
- make ServerRunOption embeddings explicit in order to make the technical debt in our plumbing code visible.
Automatic merge from submit-queue
kubeadm: added unit test for app/preflight pkg
Added unit test for kubeadm/app/preflight package testing functionality of checks.go.
This PR is part of the ongoing effort to add tests (#35025)
/cc @pires @jbeda
Automatic merge from submit-queue
Verify and update client-go staging area for every PR
We need to keep the staging area up-to-date to prevent PRs from breaking client-go.
It's marked as "WIP" because we need to decide the [versioning strategy](https://github.com/kubernetes/client-go/issues/9) for client-go first. This PR contains breaking changes for client-go.
This is blocking #29934 and potentially #34441
cc @kubernetes/sig-api-machinery
Automatic merge from submit-queue
Let release_1_5 clientset include multiple versions of a group
Fix #35237
This PR makes the versioned clientset include multiple versions of a group. Currently only `batch` has `v1` and `v2alpha1`. The clientset interface now looks like:
```go
BatchV2alpha1() v2alpha1batch.BatchV2alpha1Interface
BatchV1() v1batch.BatchV1Interface
// Deprecated: please explicitly pick a version if possible.
Batch() v1batch.BatchV1Interface
```
Commit "update client-gen to say internalversion rather than unversioned" fixes https://github.com/kubernetes/kubernetes/issues/24481.
cc @kubernetes/sig-api-machinery @soltysh @deads2k @nikhiljindal
```release-note
release_1_5 clientset supports multiple versions of a group.
```
Automatic merge from submit-queue
Make overlapping deployments deletable
@kubernetes/deployment ptal
Fixes https://github.com/kubernetes/kubernetes/issues/34466 by 1) not adding the overlapping annotation to the working deployment, 2) updating observedGeneration for overlapping deployments, and 3) updating the kubectl deployment reaper to do non-cascading deletion for deployments with the overlapping annotation.
Automatic merge from submit-queue
Convert - to _ for protobuf package names
Convert - to _ for protobuf package names to allow protobuf code generation
support for go packages that have - in their names.
@smarterclayton @deads2k @liggitt @sttts @lavalamp @nikhiljindal @kubernetes/sig-api-machinery
Automatic merge from submit-queue
convert SA controller to shared informers
convert the SA controller to shared informer + workqueue.
I think one of @derekwaynecarr @ncdc or @liggitt
Automatic merge from submit-queue
allow authentication through a front-proxy
This allows a front proxy to set a request header and have that be a valid `user.Info` in the authentication chain. To secure this power, a client certificate may be used to confirm the identity of the front proxy
@kubernetes/sig-auth fyi
@erictune per-request
@liggitt you wrote the openshift one, ptal.
Automatic merge from submit-queue
Simplify negotiation in server in preparation for multi version support
This is a pre-factor for #33900 to simplify runtime.NegotiatedSerializer, tighten up a few abstractions that may break when clients can request different client versions, and pave the way for better negotiation.
View this as pure simplification.
Automatic merge from submit-queue
Fixing e2e tests which rely on network disruptions
**What this PR does / why we need it**: It fixes e2e tests on network disruptions
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/27324, https://github.com/kubernetes/kubernetes/issues/35293. https://github.com/kubernetes/kubernetes/pull/33655 changed the kubelet's `--api-server` flag to use the external IP address of the APIServer in GCE. Hence, the iptables rules were failing previously. We now return the external IP for `getMaster()`. This fix is required in order to write e2e tests which test behavior in the face of network disruptions.
/cc @bprashanth @dchen1107 @kubernetes/sig-apps
Automatic merge from submit-queue
Add a retry when reading a file content from a container
To avoid transient failures when reading the file content, add a retry
loop to the function verifyPDContentsViaContainer.
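Roughly, the retry takes this shape (interval, timeout, import path, and names are placeholders rather than the values used in the test):
```go
import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// verifyContentWithRetry polls until the content read from the container
// matches what was written, treating read errors as transient.
func verifyContentWithRetry(read func() (string, error), want string) error {
	return wait.Poll(5*time.Second, 2*time.Minute, func() (bool, error) {
		got, err := read()
		if err != nil {
			return false, nil // transient failure: retry instead of aborting
		}
		return got == want, nil
	})
}
```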
Automatic merge from submit-queue
[PHASE 1] Opaque integer resource accounting.
This change provides a simple way to advertise some amount of arbitrary countable resource for a node in a Kubernetes cluster. Users can consume these resources by including them in pod specs, and the scheduler takes them into account when placing pods on nodes. See the example at the bottom of the PR description for more info.
Summary of changes:
- Defines opaque integer resources as any resource with the prefix `pod.alpha.kubernetes.io/opaque-int-resource-`.
- Prevents the kubelet from overwriting capacity.
- Handles opaque resources in the scheduler.
- Validates the integer-ness of opaque int quantities in the API server.
- Adds tests for the above.
Feature issue: https://github.com/kubernetes/features/issues/76
Design: http://goo.gl/IoKYP1
Issues:
kubernetes/kubernetes#28312, kubernetes/kubernetes#19082
Related:
kubernetes/kubernetes#19080
CC @davidopp @timothysc @balajismaniam
**Release note**:
```release-note
Added support for accounting opaque integer resources.
Allows cluster operators to advertise new node-level resources that would be
otherwise unknown to Kubernetes. Users can consume these resources in pod
specs just like CPU and memory. The scheduler takes care of the resource
accounting so that no more than the available amount is simultaneously
allocated to pods.
```
## Usage example
```sh
$ echo '[{"op": "add", "path": "pod.alpha.kubernetes.io~1opaque-int-resource-bananas", "value": "555"}]' | \
> http PATCH http://localhost:8080/api/v1/nodes/localhost.localdomain/status \
> Content-Type:application/json-patch+json
```
```http
HTTP/1.1 200 OK
Content-Type: application/json
Date: Thu, 11 Aug 2016 16:44:55 GMT
Transfer-Encoding: chunked
{
"apiVersion": "v1",
"kind": "Node",
"metadata": {
"annotations": {
"volumes.kubernetes.io/controller-managed-attach-detach": "true"
},
"creationTimestamp": "2016-07-12T04:07:43Z",
"labels": {
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/os": "linux",
"kubernetes.io/hostname": "localhost.localdomain"
},
"name": "localhost.localdomain",
"resourceVersion": "12837",
"selfLink": "/api/v1/nodes/localhost.localdomain/status",
"uid": "2ee9ea1c-47e6-11e6-9fb4-525400659b2e"
},
"spec": {
"externalID": "localhost.localdomain"
},
"status": {
"addresses": [
{
"address": "10.0.2.15",
"type": "LegacyHostIP"
},
{
"address": "10.0.2.15",
"type": "InternalIP"
}
],
"allocatable": {
"alpha.kubernetes.io/nvidia-gpu": "0",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
},
"capacity": {
"alpha.kubernetes.io/nvidia-gpu": "0",
"pod.alpha.kubernetes.io/opaque-int-resource-bananas": "555",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
},
"conditions": [
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-07-12T04:07:43Z",
"message": "kubelet has sufficient disk space available",
"reason": "KubeletHasSufficientDisk",
"status": "False",
"type": "OutOfDisk"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-07-12T04:07:43Z",
"message": "kubelet has sufficient memory available",
"reason": "KubeletHasSufficientMemory",
"status": "False",
"type": "MemoryPressure"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-08-10T06:27:11Z",
"message": "kubelet is posting ready status",
"reason": "KubeletReady",
"status": "True",
"type": "Ready"
},
{
"lastHeartbeatTime": "2016-08-11T16:44:47Z",
"lastTransitionTime": "2016-08-10T06:27:01Z",
"message": "kubelet has no disk pressure",
"reason": "KubeletHasNoDiskPressure",
"status": "False",
"type": "DiskPressure"
}
],
"daemonEndpoints": {
"kubeletEndpoint": {
"Port": 10250
}
},
"images": [],
"nodeInfo": {
"architecture": "amd64",
"bootID": "1f7e95ca-a4c2-490e-8ca2-6621ae1eb5f0",
"containerRuntimeVersion": "docker://1.10.3",
"kernelVersion": "4.5.7-202.fc23.x86_64",
"kubeProxyVersion": "v1.3.0-alpha.4.4285+7e4b86c96110d3-dirty",
"kubeletVersion": "v1.3.0-alpha.4.4285+7e4b86c96110d3-dirty",
"machineID": "cac4063395254bc89d06af5d05322453",
"operatingSystem": "linux",
"osImage": "Fedora 23 (Cloud Edition)",
"systemUUID": "D6EE0782-5DEB-4465-B35D-E54190C5EE96"
}
}
}
```
After patching, the kubelet's next sync fills in allocatable:
```
$ kubectl get node localhost.localdomain -o json | jq .status.allocatable
```
```json
{
"alpha.kubernetes.io/nvidia-gpu": "0",
"pod.alpha.kubernetes.io/opaque-int-resource-bananas": "555",
"cpu": "2",
"memory": "8175808Ki",
"pods": "110"
}
```
Create two pods, one that needs a single banana and another that needs a truck load:
```
$ kubectl create -f chimp.yaml
$ kubectl create -f superchimp.yaml
```
Inspect the scheduler result and pod status:
```
$ kubectl describe pods chimp
Name: chimp
Namespace: default
Node: localhost.localdomain/10.0.2.15
Start Time: Thu, 11 Aug 2016 19:58:46 +0000
Labels: <none>
Status: Running
IP: 172.17.0.2
Controllers: <none>
Containers:
nginx:
Container ID: docker://46ff268f2f9217c59cc49f97cc4f0f085d5ac0e251f508cc08938601117c0cec
Image: nginx:1.10
Image ID: docker://sha256:82e97a2b0390a20107ab1310dea17f539ff6034438099384998fd91fc540b128
Port: 80/TCP
Limits:
cpu: 500m
memory: 64Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 3
Requests:
cpu: 250m
memory: 32Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 1
State: Running
Started: Thu, 11 Aug 2016 19:58:51 +0000
Ready: True
Restart Count: 0
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
No volumes.
QoS Class: Burstable
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
9m 9m 1 {default-scheduler } Normal Scheduled Successfully assigned chimp to localhost.localdomain
9m 9m 2 {kubelet localhost.localdomain} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Pulled Container image "nginx:1.10" already present on machine
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Created Created container with docker id 46ff268f2f92
9m 9m 1 {kubelet localhost.localdomain} spec.containers{nginx} Normal Started Started container with docker id 46ff268f2f92
```
```
$ kubectl describe pods superchimp
Name: superchimp
Namespace: default
Node: /
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
nginx:
Image: nginx:1.10
Port: 80/TCP
Requests:
cpu: 250m
memory: 32Mi
pod.alpha.kubernetes.io/opaque-int-resource-bananas: 10Ki
Volume Mounts: <none>
Environment Variables: <none>
Conditions:
Type Status
PodScheduled False
No volumes.
QoS Class: Burstable
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
3m 1s 15 {default-scheduler } Warning FailedScheduling pod (superchimp) failed to fit in any node
fit failure on node (localhost.localdomain): Insufficient pod.alpha.kubernetes.io/opaque-int-resource-bananas
```
- Prevents kubelet from overwriting capacity during sync.
- Handles opaque integer resources in the scheduler.
- Adds scheduler predicate tests for opaque resources.
- Validates opaque int resources:
- Ensures supplied opaque int quantities in node capacity,
node allocatable, pod request and pod limit are integers.
- Adds tests for new validation logic (node update and pod spec).
- Adds e2e tests for opaque integer resources.
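For a sense of how such a resource is requested programmatically, here is a hedged sketch (import paths and quantities are illustrative; the chimp/superchimp manifests above are not reproduced):
```go
import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// bananaRequests sketches a container resource request that includes the
// opaque integer resource used in the example above.
func bananaRequests() v1.ResourceList {
	return v1.ResourceList{
		v1.ResourceCPU:    resource.MustParse("250m"),
		v1.ResourceMemory: resource.MustParse("32Mi"),
		v1.ResourceName("pod.alpha.kubernetes.io/opaque-int-resource-bananas"): resource.MustParse("1"),
	}
}
```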
Alter how runtime.SerializeInfo is represented to simplify negotiation
and reduce the need to allocate during negotiation. Simplify the dynamic
client's logic around negotiating type. Add more tests for media type
handling where necessary.
Automatic merge from submit-queue
Update rkt version on GCI nodes to v1.18.0
v1.18.0 avoids outputting debug information by default which happens to
pollute events and kubelet logs.
Automatic merge from submit-queue
Add hack/verify-test-owners.sh to ensure tests always have owners.
This ensures that new tests or changed tests are assigned appropriate owners.
Automatic merge from submit-queue
Swap in tests ownership
To bring test ownership closer to the actual areas of expertise, I made the following swaps. I included @mtaufen to close the cycle. Please hold off on applying the lgtm label until the second reviewer has had a look.
Automatic merge from submit-queue
remove the non-generated client
Removes the non-generated client from kube. The package has a few methods left, but nothing that needs updating when adding new groups.
@ingvagabund
Automatic merge from submit-queue
Fixed gcloud command in logs-generator makefile
I grepped through the code looking for `gcloud` and `push` commands and only found one Makefile missing the `--`. I added it.
fixes #33765 🐛
Automatic merge from submit-queue
Set done to true & return error if RestartPolicy not Always in test framework
Found a small issue with https://github.com/kubernetes/kubernetes/pull/34632: it returns an error if the RestartPolicy is not Always, but the user will never see it because done isn't set to true, so they time out instead.
@Random-Liu because you wrote that PR
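Sketched, the fix amounts to returning done together with the error from the condition function (condition-function style assumed; names hypothetical):
```go
// Report the configuration problem immediately instead of letting the caller
// poll until the timeout expires.
if pod.Spec.RestartPolicy != v1.RestartPolicyAlways {
	return true, fmt.Errorf("pod %q has RestartPolicy %q, expected Always",
		pod.Name, pod.Spec.RestartPolicy)
}
```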
Automatic merge from submit-queue
Adding cascading deletion support to federated namespaces
Ref https://github.com/kubernetes/kubernetes/issues/33612
With this change, whenever a federated namespace is deleted with `DeleteOptions.OrphanDependents = false`, then federation namespace controller first deletes the corresponding namespaces from all underlying clusters before deleting the federated namespace.
cc @kubernetes/sig-cluster-federation @caesarxuchao
```release-note
Adding support for DeleteOptions.OrphanDependents for federated namespaces. Setting it to false while deleting a federated namespace also deletes the corresponding namespace from all registered clusters.
```
Automatic merge from submit-queue
rename kubelet flag mounter-path to experimental-mounter-path
```release-note
* Kubelet flag '--mounter-path' renamed to '--experimental-mounter-path'
```
The flag controls an experimental feature, and the renaming ensures that users do not come to depend on it just yet.
Automatic merge from submit-queue
Add multiple PV/PVC pair handling to persistent volume e2e test
Adds the framework for creating, validating, and deleting groups of PVs and PVCs.
Automatic merge from submit-queue
Speed up some networking tests in large clusters
Since we are getting towards testing larger and larger clusters (hopefully 5000-node ones soon-ish), I'm trying to keep the number of very long-running tests to a minimum.
This should significantly reduce the amount of time spent in the tests from test/e2e/networking.go.
@gmarek
Automatic merge from submit-queue
Remove versioned LabelSelectors
We have LabelSelectors defined in `unversioned`, `batch/v1`, `batch/v2alpha1`, and `extensions/v1beta1`. Their definitions are all the same. I kept the definition in `unversioned` and removed the others. It only makes sense to define a versioned LabelSelector if its definition differs from the unversioned one.
Automatic merge from submit-queue
Marked NodeOutOfDisk test with feature label
Marked NodeOutOfDisk test with feature label to temporarily remove it from the flaky suite.
@madhusudancs @piosz
Automatic merge from submit-queue
always clean gce resources in service e2e
@bprashanth the previous PR was closed when I squashed my commits.
Here is the new change set, please help to review again.
1) Only the following two It() blocks create load balancers. I created a string array to persist the LB names so that they can be cleaned up in AfterEach(), and the array is reset after cleanup:
```
"should be able to change the type and ports of a service [Slow]"
"should be able to create services of type LoadBalancer and externalTraffic=localOnly"
```
2) Directly call the GCE API to delete the resources and ignore any error returned.
Automatic merge from submit-queue
CRI: Add dockershim grpc server.
This PR adds an in-process gRPC server for dockershim.
Flags change:
1. `container-runtime` will not be automatically set to remote when `container-runtime-endpoint` is set. @feiskyer
2. set kubelet flag `--experimental-runtime-integration-type=remote --container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to enable the in-process dockershim grpc server.
3. set node e2e test flag `--runtime-integration-type=remote --container-runtime-endpoint=UNIX_SOCKET_FILE_PATH` to run the node e2e test against the in-process dockershim grpc server.
I've run node e2e test against the remote cri integration, tests which don't rely on stream and log functions can pass.
This unblocks the following work:
1) CRI conformance test.
2) Performance comparison between in-process integration and in-process grpc integration.
@yujuhong @feiskyer
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Make hack/update_owners.py get list from local repo, add --check option.
This should become a verify step soon. The munger understands the * syntax already.
Automatic merge from submit-queue
test-images: server address is now configurable
This commit performs the changes discussed in #32128.
Should be merged before #35301
cc: @pskrzyns
Automatic merge from submit-queue
e2e node plumbing and bundling for GCI mounter
**Note:** The code in this PR only bundles the mounter and modifies `--mounter-path` if it can find `cluster/gce/gci/mounter` in the K8s source dir when building the test bundle.
This bundles the mounter script for GCI with the node e2e tests and allows the `--mounter-path` to be passed to the Kubelet via the node test framework. The node test runner will detect when we are running on a remote GCI node and add the appropriate `--mounter-path` to the `testArgs`.
It also includes a simple node test that mounts a tmpfs volume. This will exercise the Kubelet's mounter code path.
**ITEM OF NOTE:** To get the k8s root dir (in order to copy the mount script into the tarball), I changed `getK8sRootDir` -> `GetK8sRootDir` in `test/e2e_node/build/build.go`. Based on the comment above that function (and the fact that it was private to begin with), I'm not sure this is the best way to do things:
```
// TODO: Dedup / merge this with comparable utilities in e2e/util.go
```
On the other hand, the `e2e/util.go` file mentioned in that comment doesn't exist anymore. This should be resolved before this PR is merged.
Automatic merge from submit-queue
CRI: Add cri test on containervm.
As is discussed with @yujuhong, we need to validate cri on containervm.
@yujuhong @feiskyer
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Fixing a typo in federated secrets test
Realized that our secret e2e test was not running due to a typo in my last PR :)
cc @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
service e2e: remove TODO and subtle changes in logging
Removes the stale `TODO` for external source IP preservation as the e2e test of ESIPP was added.
Changes logging in create service functions: namespace/namespace -> namespace/serviceName.
@bprashanth
Automatic merge from submit-queue
Update elasticsearch and kibana usage
```release-note
Updated default Elasticsearch and Kibana used for elasticsearch logging destination to versions 2.4.1 and 4.6.1 respectively.
```
Updated the controllers for Elasticsearch and Kibana to use newer image versions. Fixed the e2e test because of backward-incompatible Elasticsearch API changes.
Fixed the out-of-sync Elasticsearch controller for CoreOS.
@piosz
Automatic merge from submit-queue
Loadbalanced client src ip preservation enters beta
Sounds like we're going to try out the proposal (https://github.com/kubernetes/kubernetes/issues/30819#issuecomment-249877334) for annotations -> fields on just one feature in 1.5 (scheduler). Or do we want to just convert to fields right now?
Automatic merge from submit-queue
Verify petset status.replicas in e2e test
**What this PR does / why we need it**: follow up #33983. PetSet status.replicas bug is fixed, so adding tests for it (especially for the `should handle healthy pet restarts during scale` case)
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**: cc @erictune @foxish @kubernetes/sig-apps
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Set non-always RestartPolicy for write-pod in pv e2e
Due to https://github.com/kubernetes/kubernetes/pull/34632 the RestartPolicy can't be Always (& it shouldn't be anyway)
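For example (the exact policy chosen is not stated here; anything other than Always satisfies the framework check):
```go
// Give the write pod a non-Always restart policy, e.g. OnFailure.
pod.Spec.RestartPolicy = v1.RestartPolicyOnFailure
```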