Automatic merge from submit-queue
support for mounting local-ssds on GCI
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
@vulpecula @roberthbailey @andyzheng0831 @kubernetes/goog-image
This reverts commit 2494c77972.
The previous commit, which launches the kubelet under `systemd-run`,
fixes an error in detecting the kubelet's cgroup stats.
Fixes#26979
Automatic merge from submit-queue
Fix GKE upgrade e2e util.
containers command group at HEAD no longer accepts --zone. Flag
has to be specified after subcommand group. Fix#27011
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
Automatic merge from submit-queue
Enable WatchCache in test/integration/ tests
We already run cmd/integration/ with watch cache on. We should also run tests in test/integration/ with watch cache on.
@wojtek-t @lavalamp
Automatic merge from submit-queue
Add a custom main instead of the standard test main, to reduce stack …
Adds a custom test main handler (see: `TestMain` in https://golang.org/pkg/testing/ for details)
Partial fix for https://github.com/kubernetes/kubernetes/issues/25965
This does the standard timeout, but strips non-kubernetes stacks out of the stack trace (e.g. it filters things like:
```
goroutine 466 [IO wait, 7 minutes]:
net.runtime_pollWait(0x7fd74c4672c0, 0x72, 0xc821614000)
/usr/local/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8215c21b0, 0x72, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8215c21b0, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc8215c2150, 0xc821614000, 0x1000, 0x1000, 0x0, 0x7fd74c491050, 0xc820014058)
/usr/local/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc820a5a090, 0xc821614000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:172 +0xe4
net/http.noteEOFReader.Read(0x7fd74c465258, 0xc820a5a090, 0xc8215f0068, 0xc821614000, 0x1000, 0x1000, 0x405773, 0x0, 0x0)
/usr/local/go/src/net/http/transport.go:1687 +0x67
net/http.(*noteEOFReader).Read(0xc8215ae1a0, 0xc821614000, 0x1000, 0x1000, 0xc82159ad1d, 0x0, 0x0)
<autogenerated>:284 +0xd0
bufio.(*Reader).fill(0xc8202a2b40)
/usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Peek(0xc8202a2b40, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:132 +0xcc
net/http.(*persistConn).readLoop(0xc8215f0000)
/usr/local/go/src/net/http/transport.go:1073 +0x177
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:857 +0x10a6
```
We may want to get even more aggressive in the future.
@kubernetes/sig-testing
Automatic merge from submit-queue
Move quota usage testing for loadbalancers into unit tests
Fixes https://github.com/kubernetes/kubernetes/issues/26319
* moved testing for node port and load balancer usage in quota to unit tests
* remove node port and node port -> loadbalancer service testing out of e2e
* covered already in replenishment_controller_test scenario
Given the time it takes to even allocate a load balancer, it seems better to test that outside of this test case to avoid unnecessary flakes.
/cc @bprashanth
Automatic merge from submit-queue
Flake 26210: decouple explicit access from port 80
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Reduce huge amount of logs in large cluster tests
When running tests in 2000-node clusters, I got more than 100.000 lines like this:
```
Jun 8 01:03:11.850: INFO: Condition NetworkUnavailable of node gke-gke-large-cluster-default-pool-1-03ee5a12-knrw is true instead of false. Reason: NoRouteCreated, message: Node created w ithout a route
```
that doesn't give much value.
This is PR is reducing the number of logs.
Also includes other improvements:
- Makefile rule to run tests against remote instance using existing host or image
- Makefile will reuse an instance created from an image if it was not torn down
- Runner starts gce instances in parallel with building source
- Runner uses instance ip instead of hostname so that it doesn't need to resolve
- Runner supports cleaning up files and processes on an instance without stopping / deleting it
- Runner runs tests using `ginkgo` binary to support running tests in parallel
Now that GCE routes take an extremely long time to come up and there's
a variance in "Ready" and "Schedulable", start cherry-picking tests
where we really want to have all nodes routable/schedulable for
testing. Adding logging. This will increase test times on large
clusters but should have 0 impact on normal testing.
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Mark runtime conformance tests as flaky and not run them in jenkins CI.
For #26809
@pwittrock As discussed offline, marking runtime tests as flaky for now. I'm not sure if those tests are required. Testing docker in every Kubernetes PR is un-necessary.
These tests can be run periodically in a separate CI. AFAIK, these tests don't seem to exercise any kube features.
When we create a PV, we should created it withoud Spec.ClaimRef.UID.
In rare cases, when 'PV added' event with UID is processed before 'PVC
added' (created by for loop few lines above), the controller does not know
a PVC with this UID and considers the PV as released. Reclaim policy is
then executed and the PV is deleted and it's never bound.
With UID="", the controller waits for the PVC to get created and binds
it.
Different tests should use different objects and watchers - I noticed
sometimes an event from old tests leaked into subsequent test in the
same function.
And add some logs.
Automatic merge from submit-queue
Update Node e2e Core OS image to run systemd with CPU & Memory accounting enabled by default
cc @derekwaynecarr
For #26289
Automatic merge from submit-queue
volume controller: add configurable integration test to stress the binder
The test tries to bind configured nr. of PVs to the same nr. of PVCs. '100' is used by default, which should take ~1-3 seconds (depends on log level). Periodic sync is needed in rare cases, which may add another 10 seconds. - cache from #25881 will help here and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance. Set KUBE_INTEGRATION_PERSISTENTVOLUME_* env. variables as described in persistent_volume_test.go and run the tests:
```
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PERSISTENTVOLUME_SYNC_PERIOD=10s KUBE_INTEGRATION_PERSISTENTVOLUME_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
```
Log level '2' is useful to get timestamps of various events like 'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs: claims are bound'.
Automatic merge from submit-queue
kubelet e2e: enforce that image prepulling must finish before the test
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The image prepulling pod calls docker directly to pull images. If the pod
hasn't finished before running the resource usage tracking test, there'd be a
cpu spike in docker. We'd rather wait and fail if this is the case, before
running the test.
The test tries to bind configured nr. of PVs to the same nr. of PVCs.
'100' is used by default, which should take ~1-3 seconds (depends on log level).
Periodic sync is needed in rare cases, which may add another 10 seconds. - cache
from #25881 will help here and sync should not be needed at all.
The test is configurable and may be reused to measure binder performance.
Set KUBE_INTEGRATION_PV_* env. variables as described in
persistent_volume_test.go and run the tests:
# compile
$ cd test/integration
$ godep go test -tags 'integration no-docker' -c
# run the tests
$ KUBE_INTEGRATION_PV_SYNC_PERIOD=10s KUBE_INTEGRATION_PV_OBJECTS=1000 time ./integration.test -test.run TestPersistentVolumeMultiPVsPVCs -v 2
Log level '2' is useful to get timestamps of various events like
'TestPersistentVolumeMultiPVsPVCs: start' and 'TestPersistentVolumeMultiPVsPVCs:
claims are bound'.
Automatic merge from submit-queue
Stabilize persistent volume integration tests
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Fixes#26499 (at least I hope so...)
Automatic merge from submit-queue
Don't allow deps with no discernible license
This updates the few deps we had with no LICENSE file to current versions that do have that file. It also disallows new deps without obvious licenses.
Automatic merge from submit-queue
Enable node e2e accounting on systemd
Updated the e2e setup.sh script to enable cpu and memory accounting.
Related to https://github.com/kubernetes/kubernetes/issues/26198
/cc @pwittrock
Automatic merge from submit-queue
Disable PodAffinity SchedulerPredicates test
This feature is disabled, so it's not surprising that tests don't work.
cc @davidopp @kevin-wangzefeng
@david-mcmahon - this disables the second test that causes failures in SchedulerPredicates suite. When this and #26695 are merged it should be passing in serial.
Automatic merge from submit-queue
Revert revert of adding resource constraints for master components in density tests
The problem was the time when resource constraints were generated. It turns out that the provider is not set there. This version should work.
cc @roberthbailey @alex-mohr
Automatic merge from submit-queue
Add direct serializer
Fix#25589. Implemented a direct codec that doesn't do conversion, but sets the group, version and kind before serialization as Clayton suggested [here](https://github.com/kubernetes/kubernetes/issues/25589#issuecomment-219168009).
First commit is cherry-picked from #24826.
@kubernetes/sig-api-machinery
Automatic merge from submit-queue
kubelet e2e: bumping cpu limit
The previous limit was too aggressive and caused kubernetes-e2e-gce-serial build 1404 to fail.
Automatic merge from submit-queue
Change Kubelet test to run also in large clusters
This should fix the recent failure of kubelet test in 2000-node cluster.
@zmerlynn - FYI
- add more logs
- wait both for volume and claim to get bound
When binding volumes to claims the controller saves PV first and PVC right
after that. In theory, this saved PV could cause waitForPersistentVolumePhase
to finish and PVC could be checked in the test before the controller saves it.
So, wait for both PVC and PV to get bound and check the results only after
that. This is only a theory, there are no usable logs in integration tests.
Automatic merge from submit-queue
SSH e2e: Limit to 100 nodes, limit combinatorics
[]()This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing much
important.
Fixes#26600
This limits the "for all hosts" to 100 nodes, and also limits the
combinatorial section so that we only do the other SSH command variant
testing on the first host rather than *all* of the hosts. I also
killed one of the variants because it didn't seem to be testing much
important.
Fixes#26600
Automatic merge from submit-queue
kubelet e2e: set cpu/memory limits for docker 1.11
Docker 1.11 consumes more memory. Bump the limit to fix the tests. Also add
new limits for the 100-pod resource usage tracking test.
This fixes#26495
Automatic merge from submit-queue
Fix some gce-only tests to run on gke as well
Enable "Services should work after restarting apiserver [Disruptive]" and DaemonRestart tests, except the 2 that require master ssh access.
Move restart/upgrade related test helpers into their own file in framework package.
- Exit non-0 if infrastructure failures happen
- Exit 0 if no infrastructure failures happen regardless of test results
(Jenkins will use junit.xml to determine test results)
Automatic merge from submit-queue
Use ubuntu-slim to reduce size of the iperf:e2e image
from
```
gcr.io/google_containers/iperf e2e 8b3cc7064090 5 weeks ago 737.9 MB
```
to
```
gcr.io/google_containers/iperf e2e 204325491636 33 seconds ago 61.09 MB
```
related to https://github.com/kubernetes/kubernetes/pull/25784#issuecomment-221706886
ping @bprashanth
Automatic merge from submit-queue
Support per-test-environment ginkgo flags for node e2e tests to facilitate skipping miss behaving tests in PR builder
We had an issue today where some node e2e tests were timing out in the pr builder. We want to be able to skip tests in the pr builder and leave them running in the CI if this happens again.
[]()
Automatic merge from submit-queue
Make Privileged pods node e2e use the framework
Made the test more readable along the way with more logs. This should help us triage failures/flakes in the future.
#24577
Automatic merge from submit-queue
Use pause image depending on the server's platform when testing
Removed all pause image constant strings, now the pause image is chosen by arch. Part of the effort of making e2e arch-agnostic.
The pause image name and version is also now only in two places, and it's documented to bump both
Also removed "amd64" constants in the code. Such constants should be replaced by `runtime.GOARCH` or by looking up the server platform
Fixes: #22876 and #15140
Makes it easier for: #25730
Related: #17981
This is for `v1.3`
@ixdy @thockin @vishh @kubernetes/sig-testing @andyzheng0831 @pensu
Automatic merge from submit-queue
Attach Detach Controller Business Logic
This PR adds the meat of the attach/detach controller proposed in #20262.
The PR splits the in-memory cache into a desired and actual state of the world.
Automatic merge from submit-queue
Flake 21484: retrieve pod log during e2e error
Print the pod log when an error occurs in
> Proxy version 1 should proxy through a service and a pod [Conformance]
e2e test. This will help to understand flake https://github.com/kubernetes/kubernetes/issues/21484 better.
Many tests expect all kube-system pods to be running and ready. The newly
added image prepull add-on pod can in the "succeeded" state. This commit fixes
the tests to allow kube-system pods to be succeeded.
Automatic merge from submit-queue
Downward API implementation for resources limits and requests
This is an implementation of Downward API for resources limits and requests, and it works with environment variables and volume plugin.
This is based on proposal https://github.com/kubernetes/kubernetes/pull/24051. This implementation follows API with magic keys approach as discussed in the proposal.
@kubernetes/rh-cluster-infra
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24179)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Split controller cache into actual and desired state of world.
Controller will only operate on volumes scheduled to nodes that
have the "volumes.kubernetes.io/controller-managed-attach" annotation.
- Add junit test reported
- Write etcd.log, kubelet.log and kube-apiserver.log to files instead of stdout
- Scp artifacts to the jenkins WORKSPACE
Fixes#25966
Automatic merge from submit-queue
in e2e test, when kubectl exec fails to find the container to run a command, it should retry
fix#26076
Without retrying upon "container not found" error, `Pod Disks` test failed on the following error:
```console
[k8s.io] Pod Disks
should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:108
STEP: Creating a kubernetes client
May 23 19:18:02.254: INFO: >>> TestContext.KubeConfig: /root/.kube/config
STEP: Building a namespace api object
STEP: Waiting for a default service account to be provisioned in namespace
[BeforeEach] [k8s.io] Pod Disks
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:69
[It] should schedule a pod w/two RW PDs both mounted to one container, write to PD, verify contents, delete pod, recreate pod, verify contents, and repeat in rapid succession [Slow]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/pd.go:271
STEP: creating PD1
May 23 19:18:06.678: INFO: Successfully created a new PD: "rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c".
STEP: creating PD2
May 23 19:18:11.216: INFO: Successfully created a new PD: "rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c".
May 23 19:18:11.216: INFO: PD Read/Writer Iteration #0
STEP: submitting host0Pod to kubernetes
W0523 19:18:11.279910 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: writing a file in the container
May 23 19:18:39.088: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1394466581702052925' > '/testpd1/tracker0''
May 23 19:18:40.250: INFO: Wrote value: "1394466581702052925" to PD1 ("rootfs-e2e-11dd5f5b-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: writing a file in the container
May 23 19:18:40.251: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- /bin/sh -c echo '1740704063962701662' > '/testpd2/tracker0''
May 23 19:18:41.433: INFO: Wrote value: "1740704063962701662" to PD2 ("rootfs-e2e-141f062d-211b-11e6-a3ff-b8ca3a62792c") from pod "pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c" container "mycontainer"
STEP: reading a file in the container
May 23 19:18:41.433: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:42.585: INFO: Read file "/testpd1/tracker0" with content: 1394466581702052925
STEP: reading a file in the container
May 23 19:18:42.585: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd2/tracker0'
May 23 19:18:43.779: INFO: Read file "/testpd2/tracker0" with content: 1740704063962701662
STEP: deleting host0Pod
May 23 19:18:44.048: INFO: PD Read/Writer Iteration #1
STEP: submitting host0Pod to kubernetes
W0523 19:18:44.132475 4984 request.go:347] Field selector: v1 - pods - metadata.name - pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c: need to check if this is versioned correctly.
STEP: reading a file in the container
May 23 19:18:45.186: INFO: Running '/srv/dev/kubernetes/_output/dockerized/bin/linux/amd64/kubectl kubectl --server=https://130.211.199.187 --kubeconfig=/root/.kube/config exec --namespace=e2e-tests-pod-disks-3t3g8 pd-test-16d3653c-211b-11e6-a3ff-b8ca3a62792c -c=mycontainer -- cat /testpd1/tracker0'
May 23 19:18:46.290: INFO: error running kubectl exec to read file: exit status 1
stdout=
stderr=error: error executing remote command: error executing command in container: container not found ("mycontainer")
)
May 23 19:18:46.290: INFO: Error reading file: exit status 1
May 23 19:18:46.290: INFO: Unexpected error occurred: exit status 1
```
Now I've run this fix on e2e pd test 5 times and no longer see any failure
Automatic merge from submit-queue
Don't dump everything in kubemarks
Don't dump all events etc. in kubemark failures, those are useless anyway with that amount of data.
Automatic merge from submit-queue
Fix panic in auth test failure
I got a spurious failure in the webhook integration test, but couldn't see the error returned because a panic was hit that assumed a body was always returned with the response
Automatic merge from submit-queue
E2e tests for GKE cluster with local SSD.
The test cover node pool with local SSD creation and scheduling a pod that writes and reads from it. Pod access local disk via hostPath.
```release-note
E2e tests for GKE cluster with local SSD.
-OR-
```
Automatic merge from submit-queue
kubelet/cadvisor: Refactor cadvisor disk stat/usage interfaces.
basically
1) cadvisor struct will know what runtime the kubelet is, passed in via additional argument to New()
2) rename cadvisor wrapper function to DockerImagesFsInfo() to ImagesFsInfo() and have linux implementation choose a label based on the runtime inside the cadvisor struct
2a) mock/fake/unsupported modified to take the same additional argument in New()
3) kubelet's wrapper for the cadvisor wrapper is renamed in parallel
4) make all tests use new interface
Automatic merge from submit-queue
Cache Webhook Authentication responses
Add a simple LRU cache w/ 2 minute TTL to the webhook authenticator.
Kubectl is a little spammy, w/ >= 4 API requests per command. This also prevents a single unauthenticated user from being able to DOS the remote authenticator.
Automatic merge from submit-queue
Extend secrets volumes with path control
As per [1] this PR extends secrets mapped into volume with:
* key-to-path mapping the same way as is for configmap. E.g.
```
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "mypod",
"namespace": "default"
},
"spec": {
"containers": [{
"name": "mypod",
"image": "redis",
"volumeMounts": [{
"name": "foo",
"mountPath": "/etc/foo",
"readOnly": true
}]
}],
"volumes": [{
"name": "foo",
"secret": {
"secretName": "mysecret",
"items": [{
"key": "username",
"path": "my-username"
}]
}
}]
}
}
```
Here the ``spec.volumes[0].secret.items`` added changing original target ``/etc/foo/username`` to ``/etc/foo/my-username``.
* secondly, refactoring ``pkg/volumes/secrets/secrets.go`` volume plugin to use ``AtomicWritter`` to project a secret into file.
[1] https://github.com/kubernetes/kubernetes/blob/master/docs/design/configmap.md#changes-to-secret
Automatic merge from submit-queue
Check status of framework.CheckPodsRunningReady
Check status of framework.CheckPodsRunningReady and fail test if it's false, instead of silently
ignoring the failure.
This doesn't fix whatever is causing the pod not to start in #17523 but it does fail the test as soon as it detects the pod didn't start, instead of allowing the testing to proceed.
cc @kubernetes/sig-testing @spxtr @ixdy @kubernetes/rh-cluster-infra
I removed the netexec and goproxy pods from the proxy exec test. Instead
it now runs kubectl locally and the proxy is running in-process. Since
Go won't proxy for localhost requests, this test cannot pass if the API
server is local. However it was already disabled for local clusters.
Automatic merge from submit-queue
SchedulerPredicates e2e test: be more verbose about requested resource
When ``validates resource limits of pods that are allowed to run [Conformance]`` test is run, logs could give more information about requested resource and say it is for cpu and in mili units.
cpu is stored in m units here:
```
nodeToCapacityMap[node.Name] = capacity.MilliValue()
```
Automatic merge from submit-queue
Add a timeout to the node e2e Ginkgo test runner
Also add a few debugging statements to indicate progress.
Should help prevent #25639, since we'll timeout tests before Jenkins times out the build.
Automatic merge from submit-queue
gcr.io/google_containers/mounttest: use Stat instead of Lstat
The current ``mt.go`` implementation use ``os.Lstat`` instead of ``os.Stat`` which does not read symlinks. Since implementation of ``AtomicWriter`` (which relies on existence of symlinks), the updated implementation of secret volume using the ``AtomicWriter`` can not be tested for secret file permission. Replacing ``Lstat`` with ``Stat`` allows to read symlinks and return permissions of target file. The change affects ``--file_perm`` and ``--file_mode`` options only.
``mounttest`` image is currently used by:
##### downwardapi_volume.go
- e2e: Downward API volume
- version: 0.6
- args: --file_content, --break_on_expected_content, --retry_time, --file_content_in_loop
##### empty_dir.go
- e2e: EmptyDir volumes
- version: 0.5
- args: --file_perm, --file_perm, ...
##### host_path.go
- e2e: hostPath
- version: 0.6
- args: --file_mode, ...
##### configmap.go
- e2e: ConfigMap
- version: 0.6
- args: --file_content, --break_on_expected_content, --retry_time, --file_content_in_loop
##### service_accounts.go
- e2e: ServiceAccounts
- version: 0.2
- args: --file_content
Some of the e2e tests use at least one of the affected options. Locally, I have updated all version of mounttest images to 0.7. All e2e tests pass with the new image.
- create 100 PV, ranging from 0 to 99GB; create 1 PVC to claim 50GB. Verify only one PV is bound and rest are pending
- create 2 PVs with different access modes (RWM, RWO), 1 PVC to claim RWM PV. Verify RWM is bound and RWO is not bound.
Signed-off-by: Huamin Chen <hchen@redhat.com>
Automatic merge from submit-queue
Refactor persistent volume controller
Here is complete persistent controller as designed in https://github.com/pmorie/pv-haxxz/blob/master/controller.go
It's feature complete and compatible with current binder/recycler/provisioner. No new features, it *should* be much more stable and predictable.
Testing
--
The unit test framework is quite complicated, still it was necessary to reach reasonable coverage (78% in `persistentvolume_controller.go`). The untested part are error cases, which are quite hard to test in reasonable way - sure, I can inject a VersionConflictError on any object update and check the error bubbles up to appropriate places, but the real test would be to run `syncClaim`/`syncVolume` again and check it recovers appropriately from the error in the next periodic sync. That's the hard part.
Organization
---
The PR starts with `rm -rf kubernetes/pkg/controller/persistentvolume`. I find it easier to read when I see only the new controller without old pieces scattered around.
[`types.go` from the old controller is reused to speed up matching a bit, the code looks solid and has 95% unit test coverage].
I tried to split the PR into smaller patches, let me know what you think.
~~TODO~~
--
* ~~Missing: provisioning, recycling~~.
* ~~Fix integration tests~~
* ~~Fix e2e tests~~
@kubernetes/sig-storage
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24331)
<!-- Reviewable:end -->
Fixes#15632
The key to path mapping allows pod to specify different name (thus location) of each secret.
At the same time refactor the volume plugin to use AtomicWritter to project secrets to files in a volume.
Update e2e Secrets test, the secret file permission has changed from 0444 to 0644
Remove TestPluginIdempotent as the AtomicWritter is responsible for secret creation
Automatic merge from submit-queue
Add init containers to pods
This implements #1589 as per proposal #23666
Incorporates feedback on #1589, creates parallel structure for InitContainers and Containers, adds validation for InitContainers that requires name uniqueness, and comments on a number of implications of init containers.
This is a complete alpha implementation.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/23567)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Add pod status/ready/restartCount conformance test
add more test cases to cover containers which will be terminated/running/failed/pending.
Signed-off-by: liang chenye <liangchenye@huawei.com>
Automatic merge from submit-queue
The remaining API changes for PodDisruptionBudget.
It's mostly the boilerplate required for the registry, some extra codegen, and a few tests.
Will squash once we're sure it's good.
Automatic merge from submit-queue
[e2e] kubectl stdin
Problem: Currently kubectl heavily relies on files which have to be (for lack of a better word :):):)) "written" to the file system. This hinders adoption of something like gobindata, by forcing an intermediary generated-assets directory type thing.
Solution: Lets migrate `kubectl.go` testing over to using standard input streams.
cc @kubernetes/sig-testing @timothysc
Automatic merge from submit-queue
Add pod condition PodScheduled to detect situation when scheduler tried to schedule a Pod, but failed
Set `PodSchedule` condition to `ConditionFalse` in `scheduleOne()` if scheduling failed and to `ConditionTrue` in `/bind` subresource.
Ref #24404
@mml (as it seems to be related to "why pending" effort)
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24459)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Webhook Token Authenticator
Add a webhook token authenticator plugin to allow a remote service to make authentication decisions.
Automatic merge from submit-queue
Move internal types of hpa from pkg/apis/extensions to pkg/apis/autoscaling
ref #21577
@lavalamp could you please review or delegate to someone from CSI team?
@janetkuo could you please take a look into the kubelet changes?
cc @fgrzadkowski @jszczepkowski @mwielgus @kubernetes/autoscaling
Automatic merge from submit-queue
Upgrades tests use chaosmonkey package and ServiceTestJig
Introduce the `chaosmonkey` e2e package for doing disruptive testing (e.g. upgrade testing) more easily.
- [x] `chaosmonkey` package
- [x] migrate upgrade tests to `chaosmonkey` (using WIP `serviceJig`)
- [x] migrate upgrade tests to use `ServiceTestJig` and `chaosmonkey`
Deferred:
- [ ] make `ServiceTestJig` implement `chaosmonkey.Interface`
- [ ] migrate disruptive services tests to use `ServiceTestJig` and `chaosmonkey`
This provides the extensible framework for #15131. We should now easily be able to add tests (e.g. #6084, #23189).
This is a rewrite of #22446.
cc @mikedanese @quinton-hoole @roberthbailey
Assigning to @thockin, who wrote `ServiceTestJig`.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24014)
<!-- Reviewable:end -->
Automatic merge from submit-queue
Let the dynamic client take runtime.Object instead of v1.ListOptions
so that I can pass whatever version of ListOptions to the List/Watch/DeleteCollection methods.
cc @krousey
Automatic merge from submit-queue
Kubelet: Add docker operation timeout
For #23563.
Based on #24748, only the last 2 commits are new.
This PR:
1) Add timeout for all docker operations.
2) Add docker operation timeout metrics
3) Cleanup kubelet stats and add runtime operation error and timeout rate monitoring.
4) Monitor runtime operation error and timeout rate in kubelet perf.
@yujuhong
/cc @gmarek Because of the metrics change.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
enable resource name and service account cases for impersonation
Adds the resource name check since that attribute was added for authorization. Also adds a check against a separate resource for service accounts. Allowing impersonation of service accounts to use a different resource check places control of impersonation with the same users to have the power to get the SA tokens directly.
@kubernetes/kube-iam
@sgallagher FYI
Automatic merge from submit-queue
test/e2e/addon_update: Respect KUBE_SSH_USER
This change makes the e2e tests more consistent as the other ones all already respected this variable.
I didn't do the larger change of re-factoring it to use `framework.SSH` because getting the `scp` portion working there has some significant complexity. If I find time, I'd like to go back and do it since this test needs a little ❤️
cc @yifan-gu, @kubernetes/sig-testing
Automatic merge from submit-queue
Abstract node side functionality of attachable plugins
- Create PhysicalAttacher interface to abstract MountDevice and
WaitForAttach.
- Create PhysicalDetacher interface to abstract WaitForDetach and
UnmountDevice.
- Expand unit tests to check that Attach, Detach, WaitForAttach,
WaitForDetach, MountDevice, and UnmountDevice get call where
appropriet.
Physical{Attacher,Detacher} are working titles suggestions welcome. Some other thoughts:
- NodeSideAttacher or NodeAttacher.
- AttachWatcher
- Call this Attacher and call the Current Attacher CloudAttacher.
- DeviceMounter (although there are way too many things called Mounter right now :/)
This is to address: https://github.com/kubernetes/kubernetes/pull/21709#issuecomment-192035382
@saad-ali
Automatic merge from submit-queue
Support persisting config from kubecfg AuthProvider plugins
Plumbs through an interface to the plugin that can persist a `map[string]string` config for just that plugin. Also adds `config` to the AuthProvider serialization type, and `Login()` to the AuthProvider plugin interface.
Modified the gcp AuthProvider to cache short-term access tokens in the kubecfg file.
Builds on #23066
@bobbyrullo @deads2k @jlowdermilk @erictune
Automatic merge from submit-queue
kubectl describe: show multiple labels/annotations on multiple lines
Small UX improvement: when there is more than one label/annotation, it's more readable to see them on the different lines.
Before:
```console
$ kubectl describe svc
Name: s2i-test
Namespace: test2
Labels: app=s2i-test,foo=bar
...
```
After:
```console
$ kubectl describe svc
Name: s2i-test
Namespace: test2
Labels: app=s2i-test
foo=bar
...
```
This change affects output of the labels/annotations in many of the sub-commands of the `kubectl describe`.
PTAL @smarterclayton @kargakis
Automatic merge from submit-queue
kubectl: more sophisticated pod selection for logs and attach
Trying to get the logs or attach to an object other than a pod
will poll forever if that object has no replicas. This commit adds
a 20s timeout for polling.
@kubernetes/kubectl @deads2k @fabianofranz
Automatic merge from submit-queue
Reimplement 'pause' in C - smaller footprint all around
Statically links against musl. Size of amd64 binary is 3560 bytes.
I couldn't test the arm binary since I have no hardware to test it on, though I assume we want it to work on a raspberry pi.
This PR also adds the gcc5/musl cross compiling image used to build the binaries.
@thockin
Automatic merge from submit-queue
Add subPath to mount a child dir or file of a volumeMount
Allow users to specify a subPath in Container.volumeMounts so they can use a single volume for many mounts instead of creating many volumes. For instance, a user can now use a single PersistentVolume to store the Mysql database and the document root of an Apache server of a LAMP stack pod by mapping them to different subPaths in this single volume.
Also solves https://github.com/kubernetes/kubernetes/issues/20466.
Automatic merge from submit-queue
e2e: Enable persistent volume test
The test is already there and all packages should be already available on all test machines.
It tests:
- binding
- using bound claim in a pod
- recycling NFS volume
(we should see shortly if all nfs packages are really installed as Jenkins tests it...)
Automatic merge from submit-queue
Allow etcd to store protobuf objects
Split storage serialization from client negotiation, and allow API server to take flag controlling serialization.
TODO:
* [x] API server still doesn't start - range allocation object doesn't seem to round trip correctly to etcd
* [ ] Verify that third party resources are ignoring protobuf (add a test)
* [ ] Add integration tests that verify storage is correctly protobuf
* [ ] Add a global default for which storage format to prefer?
Automatic merge from submit-queue
e2e/framework/util.StartPods: don't wait for pods that are not created
When running ``[k8s.io] SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run [Conformance]`` pods can be created in a way in which additional pods have to be create to fully saturate node's capacity CPU in a cluster. Additional pods are created by calling ``framework.StartPods``. The function creates pods with a given label and waits for them (if ``waitForRunning`` is ``true``). This is fine as long as the number of pods to created is non-zero. If there are zero pods to be created and ``waitForRunning`` is ``true``, the function waits forever as there is not going to be any pods with requested label. Thus, resulting in ``Error waiting for 0 pods to be running - probably a timeout``. Causing the e2e test to fail even if it should not.
Adding condition to return from the function immediately if there is not pod to create.
The codec factory should support two distinct interfaces - negotiating
for a serializer with a client, vs reading or writing data to a storage
form (etcd, disk, etc). Make the EncodeForVersion and DecodeToVersion
methods only take Encoder and Decoder, and slight refactoring elsewhere.
In the storage factory, use a content type to control what serializer to
pick, and use the universal deserializer. This ensures that storage can
read JSON (which might be from older objects) while only writing
protobuf. Add exceptions for those resources that may not be able to
write to protobuf (specifically third party resources, but potentially
others in the future).
Automatic merge from submit-queue
Petset controller
Took longer than I expected. Main parts of this pr are:
1. Identity generation based on petset spec (volumes are mapped per discussion in #18016)
2. Ensure that we create/delete pets in sequence
3. Ensuring that we create, wait for healthy, create; or delete, wait for terminationGrace, delete
4. Controller that watches apiserver and drives actual -> desired
PVCs are not deleted, yet.
Automatic merge from submit-queue
kubectl rolling-update support for same image
Fixes#23497.
Enables `kubectl rolling-update --image` to the same image, adding a `--image-pull-policy` flag to remove ambiguity. This allows rolling-update to behave as an "update and/or restart" (https://github.com/kubernetes/kubernetes/issues/23497#issuecomment-212349730), or as a forced update when the same tag can mean multiple versions (e.g. `:latest`). cc @janetkuo @nikhiljindal
- Expand Attacher/Detacher interfaces to break up work more
explicitly.
- Add arguments to all functions to avoid having implementers store
the data needed for operations.
- Expand unit tests to check that Attach, Detach, WaitForAttach,
WaitForDetach, MountDevice, and UnmountDevice get call where
appropriet.
Automatic merge from submit-queue
Use tagged redis image for kubectl test, move json test file out of deprecated examples
Closes#24642
Changes the redis image to use the :e2e tagged version on gcr.io.
Since the examples/ subdir is deprecated in favor of the new kubernetes/kubernetes.github.io I just copied this file to test-manifests/kubectl like some other files.
The test is already there and all packages should be already available on all
test machines (namely: mount.nfs).
It tests:
- binding
- using bound claim in a pod
- recycling NFS volume
Correct the display of the output when e.g. starting the kubelet fails in the node e2e tests. This
change makes it display a string instead of an array of decimal values.
Automatic merge from submit-queue
Framework support for node e2e.
This should let us port existing e2e tests to the node e2e suite, if the tests are node specific.
Automatic merge from submit-queue
Provide flags to use etcd3 backed storage
ref: #24405
What's in this PR?
- Add a new flag "storage-backend" to choose "etcd2" or "etcd3". By default (i.e. empty), it's "etcd2".
- Take out etcd config code into a standalone package and let it create etcd2 or etcd3 storage backend given user input.
Automatic merge from submit-queue
Promote Pod Hostname & Subdomain to fields (were annotations)
Deprecating the podHostName, subdomain and PodHostnames annotations and created corresponding new fields for them on PodSpec and Endpoints types.
Annotation doc: #22564
Annotation code: #20688
Automatic merge from submit-queue
Add support for running clusters on GCI
Google Container-VM Image (GCI) is the next revision of Container-VM. See documentation at https://cloud.google.com/compute/docs/containers/vm-image/. This change adds support for starting a Kubernetes cluster using GCI.
With this change, users can start a kubernetes cluster using the latest kubelet and kubectl release binary built in the GCI image by running:
$ KUBE_OS_DISTRIBUTION="gci" cluster/kube-up.sh
Or run a testing cluster on GCI by running:
$ KUBE_OS_DISTRIBUTION="gci" go run hack/e2e.go -v --up
The commands above will choose the latest GCI image by default.
Automatic merge from submit-queue
Quota ignores pod compute resources on updates
Scenario:
1. define a quota Q that tracks memory and cpu
2. create pod P that uses memory=100Mi, cpu=100m
3. update pod P to use memory=50Mi,cpu=10m
Expected Results:
Step 3 should fail with validation error.
Quota Q should not have changed.
Actual Results:
Step 3 fails validation, but quota Q is decremented to have memory usage down 50Mi and cpu usage down 40m. This is because the quota was getting updated even though the pod was going to fail validation.
Fix:
Quota should only support modifying pod compute resources when pods themselves support modifying their compute resources.
This also fixes https://github.com/kubernetes/kubernetes/issues/24352
/cc @smarterclayton - this is what we discussed.
fyi: @kubernetes/rh-cluster-infra
Automatic merge from submit-queue
add user.Info.GetExtra
I found myself wanting this field (or something like it), when trying to plumb the information about which scopes a particular token has.
Only the token authenticators have that information and I don't want tokens to leak past the authenticator. I thought about extending the `authenticator.Token` interface to include scopes (`[]string`), but that felt a little specific for what I wanted to do. I came up with this as an alternative.
It allows the token authenticator to fill in the information and authorizers already get handed the `user.Info`. It means that implementors can choose to tie the layers together if they wish, using whatever data they think is best.
@kubernetes/kube-iam
generic pod-per-node functionality for testing - 2 node test only
- update framework to decompose pod vs svc creation for composition.
- remove hard coded 2 and pointer to --scale
Automatic merge from submit-queue
Collect and expose runtime's image storage usage via Kubelet's /stats/summary endpoint
This information is useful to users since docker images are typically not stored on the root filesystem.
Kubelet will also consume this feature in the future to decide is evicting images will help with disk usage on the nodes.
cc @kubernetes/sig-node
Automatic merge from submit-queue
Move internal types of job from pkg/apis/extensions to pkg/apis/batch
This addressed the job part of #23216, this is still WIP. Will notify once finished. I'd like to have it in before starting working on ScheduledJob.
@lavalamp @erictune fyi