Many e2e tests check that two values are equal; this is one of the
most common checks in tests. Today gomega.Expect(foo).To(gomega.Equal(bar))
does that. This adds ExpectEqual() to replace that call and make the
code more readable.
In addition, this replaces the calls in apimachinery/generated_clientset.go
as a sample.
In a heavily contended AWS account (that was close to rate limits)
it took more than 2m for the load balancer to begin accepting
requests. This increases the timeout to 3m to give upgrade tests
more of a chance to pass in a contended environment.
Because Linux images cannot run on Windows and vice-versa, separate
tests were added for both OSes, only separated by a [LinuxOnly] tag
in their names.
Based on the given --node-os-distro, we can select which image to
use when spawning the pod.
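A minimal sketch of that selection, assuming the --node-os-distro value is available as a string. The image names and the imageForNodeOS helper are hypothetical and only illustrate the idea:

```go
package main

import "fmt"

// Hypothetical image names, used only for illustration.
const (
	linuxImage   = "example.test/linux-image:1.0"
	windowsImage = "example.test/windows-image:1.0"
)

// imageForNodeOS returns the image to use for the given --node-os-distro
// value. In this sketch, anything other than "windows" is treated as Linux.
func imageForNodeOS(nodeOSDistro string) string {
	if nodeOSDistro == "windows" {
		return windowsImage
	}
	return linuxImage
}

func main() {
	for _, distro := range []string{"gci", "windows"} {
		fmt.Printf("%s -> %s\n", distro, imageForNodeOS(distro))
	}
}
```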
Quite a few images are only used a few times in a few tests. Thus,
the images are being centralized into the agnhost image, reducing
the number of images that have to be pulled and used.
This PR replaces the usage of the following images with agnhost:
- fakegitserver
- hostexec
- liveness
- logs-generator
- no-snat-test
- no-snat-test-proxy
- port-forward-tester
logs-generator used to be configured through env variables, but with
the image centralization, that changed to flags.
This commit updates the agnhost README to reflect that.
Refactors the functions used in agnhost into different modules,
based on their functionality, leaving only the main in the base
folder.
Future commits will add several functionalities to agnhost, so
this change will be necessary to keep it clean.
I looked at all runs over all jobs run against master or release-1.15,
ignored [Feature:.*] or [Serial] tests, and added [Slow] to any tests
whose 50th percentile duration was over 5 minutes.
Misc comments:
- the apimachinery chunking test is the worst offender, at about 15min
- all test cases for all drivers that ran the [Testpattern:.*(xfs)] were
taking longer than 5 minutes, so I got lucky and this was an easy
call; not sure how to support some drivers taking too long for some
test patterns
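The selection rule above can be sketched as follows. This is only an illustration of the 50th percentile check, not the script that was actually used; median and shouldTagSlow are hypothetical names:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// slowThreshold is the cutoff named in the commit message.
const slowThreshold = 5 * time.Minute

// median returns the 50th percentile of the given durations. For an even
// number of samples it averages the two middle values.
func median(durations []time.Duration) time.Duration {
	if len(durations) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), durations...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}

// shouldTagSlow reports whether a test with these run durations would get
// the [Slow] tag under the rule "median over 5 minutes".
func shouldTagSlow(durations ...time.Duration) bool {
	return median(durations) > slowThreshold
}

func main() {
	fmt.Println(shouldTagSlow(4*time.Minute, 6*time.Minute, 15*time.Minute))
	fmt.Println(shouldTagSlow(1*time.Minute, 2*time.Minute, 20*time.Minute))
}
```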
If there are tags in the test name that describe qualities of the
test that make it ineligible for conformance, raise an error. This
is basically the "skip list" that heptio's e2e image used to use.
Thankfully all of our existing Conformance tests lack these tags. I
considered adding [Slow] to the list, but let's save that for another
day.
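A rough sketch of such a check, assuming the test name is available as a string. The tag list below is hypothetical, since this message does not enumerate the tags, and checkConformanceName is an illustrative name:

```go
package main

import (
	"fmt"
	"regexp"
)

// ineligibleTags is a hypothetical list of tags; the real list is not
// spelled out in the commit message.
var ineligibleTags = regexp.MustCompile(`\[(Alpha|Flaky|Disruptive|Feature:[^\]]+)\]`)

// checkConformanceName returns an error if the test name carries a tag that
// makes the test ineligible for conformance.
func checkConformanceName(name string) error {
	if tag := ineligibleTags.FindString(name); tag != "" {
		return fmt.Errorf("test %q has tag %s, which makes it ineligible for conformance", name, tag)
	}
	return nil
}

func main() {
	names := []string{
		"[sig-node] Pods should start [Conformance]",
		"[sig-storage] Volumes should work [Feature:Example] [Conformance]",
	}
	for _, name := range names {
		fmt.Println(checkConformanceName(name))
	}
}
```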
On Windows nodes, security context is disabled. This PR fixes a bug so
that fsGroup is not applied to pods that run on Windows nodes.
Change-Id: Id9870416d2ad8ef791b3b4896d6747a2adbada2f
TestWatchBasedManager was racing with the default namespace creation.
To fix that flake and to ensure integration tests using a shared etcd
don't accidentally overlap in the future, move the three main tests
using the default namespace to separate namespaces, and have
TestWatchBasedManager create that namespace before it runs.
Make StartTestServer wait for default namespace creation, which will
reduce other flakes until future changes completely remove use of default
namespace.
From a failed integration run:
watch_manager_test.go:66: namespaces "default" not found
watch_manager_test.go:66: namespaces "default" not found
watch_manager_test.go:66: namespaces "default" not found
Correctly ensure CRDs can be watched using protobuf when transformed to
PartialObjectMetadata. To do this we add a set of serializers allowed to
be used for "normal" requests (that return CRDs), while the set of serializers
supported by the infrastructure is broader and includes protobuf. During
negotiation we check for transformation requests, and protobuf is
excluded from non-transform requests.
As part of the change, correct an error message when the server returns
a 406 but the client doesn't accept the format to avoid confusing users
who set impossible Accept rules for CRDs (the dynamic client doesn't
support Protobuf, so if the server responds with a protobuf status the
message from the server is lost and the generic error was confusing).
Running hack/make-rules/test-e2e-node.sh or test/e2e_node/conformance/run_test.sh
as a password-less sudo user on a dev box currently requires first
creating a password for that user, and then typing it every time one wants
to run these tests. This patch fixes this by not asking for sudo
credentials if it seems the user can run any command without a
password.
Signed-off-by: Jean Rouge <rougej+github@gmail.com>
We can use framework.ExpectError() to check that an expected error
happens. However, Expect(err).To(HaveOccurred()) can still be used instead,
and that makes the e2e test code less readable.
This adds a check that enforces framework.ExpectError() for readable code.
update bazel build
fix get plugin config method
initialize only needed plugins
fix unit test
fix import duplicate package
update bazel
add docstrings
add weight field to plugin
add plugin to v1alpha1
add plugins at appropriate extension points
remove todo statement
fix import package file path
set plugin json schema
add plugin unit test to option
initial plugin in test integration
initialize only needed plugins
update bazel
rename func
change plugins needed logic
remove v1 alias
change the comment
fix alias shorter
remove blank line
change docstrings
fix map bool to struct
add some docstrings
add unreserve plugin
fix docstrings
move variable inside the for loop
make if else statement cleaner
remove plugin config from reserve plugin unit test
add plugin config and reduce unnecessary options for unit test
update bazel
fix race condition
fix permit plugin integration
change plugins to be pointer
change weight to int32
fix package alias
initial queue sort plugin
rename unreserve plugin
redesign plugin struct
update docstrings
check queue sort plugin amount
fix error message
fix condition
change plugin struct
add disabled plugin for unit test
fix docstrings
handle nil plugin set
conflict.
Add a unit test verifying that deleteValidation is retried.
Add an e2e test verifying that the webhook can intercept configmap and custom
resource deletion, and that the existing object is sent via
admissionreview.OldObject.
Update the admission integration test to verify that the existing object
is passed to the deletion admission webhook as oldObject, both for an
immediate deletion and for an update-on-delete.
The dnsutils and jessie-dnsutils images are installing dnsmasq,
which is required for a few tests checking custom DNS servers and
configurations.
dnsmasq is a Linux specific binary. In order for the tests to also
pass on Windows, this commit adds CoreDNS to the images, so a later
commit will update the tests to use CoreDNS instead of dnsmasq.
This fixes golint failures of the following files:
- test/e2e/framework/networking_utils.go
- test/e2e/framework/service_util.go
- test/e2e/framework/util.go
All golint failures in test/e2e/framework are fixed at this commit.
Remove 'test/e2e/framework' from 'hack/.golint_failures'
The e2e test framework has ExpectNoError() for readable test code.
This replaces Expect(err).NotTo(HaveOccurred()) with it for e2e/lifecycle/bootstrap.
Now that internal types are equivalent, allow the apiserver to serve
metav1 and metav1beta1 depending on the client. Test that in the
apiserver integration test and ensure we get the appropriate responses.
Register the metav1 type in the appropriate external locations.
This is a find/replace within my editor. I made the import
networkingv1beta1 so that it will be easier to replace for
the future v1 migration.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
Move approvers who haven't reviewed any PRs touching test/ in
over six months to emeritus_approvers (and remove from reviewers
as well to avoid assigning to inactive people)
One previously undocumented expectation is that
GetDynamicProvisionStorageClass can be called more than once per test
and then each time returns a new, unique storage class. The in-memory
implementation in driveroperations.go:GetStorageClass ensured that,
but loading from a .yaml file didn't. This caused the multivolume tests
to fail when applied to an already installed GCE driver with the
-storage.testdriver parameter.
This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages
There are many gomega.Expect(err).To(gomega.HaveOccurred()) calls
in e2e tests that expect an error to happen.
However, this test code is confusing because readers
need to pay attention to To() versus NotTo() in each test scenario.
This adds ExpectError() for more readable test code.
In addition, this applies ExpectError() to e2e provisioning.go as a
sample.
* fix duplicated imports of api/core/v1
* fix duplicated imports of client-go/kubernetes
* fix duplicated imports of rest code
* change import name to more reasonable
The framework/ssh.go code was heavily used throughout the framework
and could be useful elsewhere but reusing those methods requires
importing all of the framework.
Extracting these methods to their own package for reuse.
Only a few methods had to be copied into this package from the
rest of the framework to avoid an import cycle.
This reverts commit c50e7fd301 because
it included API changes that shouldn't have been in that PR and
fixing the storage class conflict inside the framework is probably the
wrong place.
**What type of PR is this?**
/kind cleanup
**What this PR does / why we need it**:
Staging the GCE Cloud Provider as part of KEP [20190125-removing-in-tree-providers](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/20190125-removing-in-tree-providers.md). Staging repo setup here https://github.com/kubernetes/legacy-cloud-providers
Moves the GCE cloud provider implementation to staging.
This is in preparation for moving the cloud provider code out of tree entirely.
However we need it in staging while the code needs to be consumed both in/out of tree.
**Which issue(s) this PR fixes**:
Fixes #
**Special notes for your reviewer**:
**Does this PR introduce a user-facing change?**:
```
NONE
```
Updated import dependency tracking.
Factored in the cleanup from #77412
Minor fix to go.mod.
This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.
This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.
This is the continuation of the refactoring of framework/deployment_utils.go
into framework/deployment.
Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>
A conformance test checks that all features specified in the test work
on the target k8s cluster, and that should not depend
on any condition.
This adds a check that conformance tests do not call any Skip
methods. The check detected that the existing conformance test
"creating/deleting custom resource definition objects works"
calls framework.SkipUnlessServerVersionGTE(), so this removes that
Skip as well.
This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.
The container might start before all the networking plumbing has
been successfully completed, causing the Kubernetes reachability
check to fail.
This commit adds a few retries to the connectivity check.
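As an illustration of the retry idea (not the code in this commit), a bounded retry loop around a check could look like this. retry and errNotReady are hypothetical names, and the check function stands in for the real connectivity probe:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error standing in for a failed
// connectivity check.
var errNotReady = errors.New("network not ready")

// retry runs check up to attempts times, sleeping for delay between
// attempts, and returns nil as soon as one attempt succeeds.
func retry(attempts int, delay time.Duration, check func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = check(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("check failed after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Simulate networking that becomes ready on the third attempt.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, err)
}
```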
Try to finish what commit 4c8a65ac01 started; that is, do not assume
cluster.local is a constant base domain, when it is configurable.
This makes DNS e2e tests pass with --dns-domain, which was only being honored
for some tests, not all
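A small sketch of the idea: build service DNS names from a cluster domain parameter instead of a hard-coded cluster.local. serviceFQDN is a hypothetical helper, not a function from the framework:

```go
package main

import "fmt"

// serviceFQDN builds <service>.<namespace>.svc.<clusterDomain> so that the
// cluster domain comes from configuration rather than a constant.
func serviceFQDN(service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, clusterDomain)
}

func main() {
	fmt.Println(serviceFQDN("kubernetes", "default", "cluster.local"))
	fmt.Println(serviceFQDN("kubernetes", "default", "example.org"))
}
```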
Signed-off-by: Tobias Wolf <towolf@gmail.com>
This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.
Also, this change makes zone work per datacenter and cleans up dummy VMs.
There can be multiple datastores found for a given name. The datastore name is
unique only within a datacenter. So this commit returns a list of datastores
for a given datastore name in the FindDatastoreByName() method. The callers are
responsible for handling or finding the right datastore to use among those returned.
This fixes golint failures of the following file:
test/e2e/framework/providers/gce/recreate_node.go
And also, replaces functions using gomega with framework functions.
There was a specific error flow that was commented as only applying
to GKE. This was never tested specifically for GKE (only commented
as such) but that seems to be out of date and can be removed. If
the SAR endpoint does not exist it should be considered an error.
This test creates a pod with custom DNS configurations and expects
them to be set in the containers.
The test pod is using the agnhost image, which can output the container's
DNS configurations that the test checks.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
The change to registrytest was found by liggitt to mitigate a NPE error.
This is necessary since ingress is a cohabitating resource that is not
stored in the default version for the networking resource.
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
- fix shell script issues
- `bx` is deprecated; rename to `ibmcloud`
- remove unnecessary variable replacement in hollow-node_template.yaml
- add replacement logic for HOLLOW_KUBELET_TEST_ARGS and HOLLOW_PROXY_TEST_ARGS
- don't hardcode KUBEMARK_IMAGE_REGISTRY to brandondr96
- make cluster number and spec configurable
- make number and spec of workers configurable
- separate NUM_NODES and KUBEMARK_NUM_NODES
In the DeleteVolumeFinalizer feature in external-snapshotter,
the external-snapshotter needs to update the PVC object to
add a Finalizer if a snapshot is being created from the PVC
and delete the Finalizer after the snapshot is created.
For that reason, we need to add "update" rbac rule for the
PVC object in external-snapshot e2e test manifest file.
DeleteVolumeFinalizer PR is here. It couldn't pass e2e test
until we fix the rbac rule in e2e.
https://github.com/kubernetes-csi/external-snapshotter/pull/47
Dockerhub does not support slashes in the image names, so when the tests are
configured to use a dockerhub registry instead of the current
gcr.io/kubernetes-e2e-test-images registry, the tests using the mentioned image
will fail, as the image cannot exist and cannot be pulled.
The previous IPv6 regex was too loose. This patch adds a better and
stricter regex for IPv6 addresses and makes the IPv4 and IPv6
regexes available as constants inside the framework pkg.
I was running e2e VPA tests and some were failing because the tests
assume that they're running on:
- a zonal single-zone cluster, or
- a regional multi-zone cluster.
And I was running them on a regional, single-zone cluster.
Change-Id: I110a70d2249f981b60cf76d1ad674ccfcedd8fb3
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 2354 to line 3685 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
Other tests that check for default storageclass also
check for cloudprovider such as gce, aws and openstack
and hence are already skipped in bare metal environments.
But this particular test keeps failing because no such check exists.
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 3686 to end of file at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This is equivalent to the "internal" firewall rule that is created for
the regular masters.
The main reason for doing it is to allow prometheus scraping metrics
from various kubemark master components, e.g. kubelet.
Ref. https://github.com/kubernetes/perf-tests/issues/503
- moves these helper functions into e2e/framework/auth
- removes logging from helper functions
- in some cases explicitly returns errors that were implicitly
ignored/logged. In the situations where they should be ignored,
we explicitly check that the condition is met before ignoring it.
- fixes references of these methods to use the right package and
return values
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 1395 to line 2353 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This also changes the following files because a function was renamed
in the above file.
- test/e2e/apps/rc.go
- test/e2e/apps/replica_set.go
This is a part of a series for fixing golint failures for util.go.
- fixes `should not use dot imports` about `ginkgo` and `gomega`
- fixes golint failures from top of file to line 1394 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This also changes the following files because a function was renamed
in the above file.
- test/e2e/e2e.go
- test/e2e/network/network_tiers.go
- test/e2e/network/service.go
Remove usage of the aggregated clientset in the e2e testing framework
itself. We have one test that consumes the clientset in the suite
and it's in test/e2e/apimachinery/aggregator.go, which was recently
promoted to conformance in 8101b86.
This test now obtains a local copy of the aggregated clientset.
The suite still has to compile the internal client in.
One possible solution here is to move this test in a separate suite,
yet it's unclear of how to tackle the problem now that the test
has to run as part of the conformance suite.
It turns out to be a frequent bug that is revealed when nodes don't have
external IP addresses. In the test we assume that in such case there
won't be any addresses of type 'NodeExternalIp', which is invalid. In
such case there will be an address of type 'NodeExternalIP', but with
the empty 'Address' field.
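To illustrate the corrected assumption, the sketch below skips ExternalIP entries whose Address field is empty. NodeAddress here is a simplified local stand-in for the Kubernetes type, and externalIPs is a hypothetical helper:

```go
package main

import "fmt"

// NodeAddress is a simplified stand-in for the Kubernetes node address type.
type NodeAddress struct {
	Type    string
	Address string
}

// NodeExternalIP mirrors the address type string used for external IPs.
const NodeExternalIP = "ExternalIP"

// externalIPs returns only non-empty external addresses. A node without an
// external IP may still carry an ExternalIP entry with an empty Address.
func externalIPs(addrs []NodeAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.Type == NodeExternalIP && a.Address != "" {
			ips = append(ips, a.Address)
		}
	}
	return ips
}

func main() {
	addrs := []NodeAddress{
		{Type: "InternalIP", Address: "10.0.0.1"},
		{Type: NodeExternalIP, Address: ""},
	}
	fmt.Println(len(externalIPs(addrs)))
}
```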
Ref. https://github.com/kubernetes/kubernetes/issues/76374
Adds the following subcommands to the agnhost binary:
- dns-server-list: outputs the DNS server list with which the host was
configured. This can be found in /etc/resolv.conf on Linux and
through some powershell commands on Windows.
- etc-hosts: outputs the host's hosts file. This can be found in /etc/hosts
for Linux, and C:/Windows/System32/drivers/etc/hosts for Windows.
- pause: pauses the binary's execution. This can be used to keep the Pod in
a Running state for various reasons, including executing additional agnhost
commands.
Refactors bits of the code to avoid duplication.
Adds a README for the agnhost image.
Improve readability in e2e network service tests by using differently
named methods for a test with a transition and without a transition.
This replaces a boolean argument, which didn't give any indication w.r.t
its purpose (unless one read the method).
The containers mount the /tmp folder as a HostPath volume
and are supposed to create a new file in it.
The /tmp folder has 777 file permissions, so there shouldn't be any
problems creating a file, even if the container is unprivileged.
iSCSI target (=the server) is implemented in Linux kernel. The "iSCSI
server" pod is not a real server, it just configures the kernel on the
host. In order to run iSCSI tests in parallel, we need to be able to
run multiple such pods on a single node, serving different LUNs to
different tests.
The "server pod" must run with HostNetwork=true to achieve that.
Each pod then creates its own IQN with namespace name, so it can't
collide with other server pods running in another namespaces on the same
node.
some resource out of scope. For example, try to get a namespace inside the default namespace.
It would be rejected by the api-server, but `kubectl auth can-i get namespace --namespace=default`
would give a `yes`. After this patch, a warning is given.
For more detail, please refer to issue #75950
Commands such as "mount" and "grep" do not work on Windows
nodes. This PR fixes the test by replacing those commands
with their Windows equivalents if the node OS is Windows.
Change-Id: I2428128ee407b611067b8e7c000dfff539d17309
This fixes golint failures of the following file:
- test/e2e/framework/volume_util.go
This also changes the following file because a function was renamed
in the above file.
- test/e2e/storage/testsuites/volumes.go
The test package imports cmd/kubeadm, which is far from ideal.
There are a couple of reasons for the import:
1) Marshaling of Ingress from api/extensions/v1beta1.
To fix that, include a local function in e2e/manifest/manifest.go
that does the same as the kubeadm MarshalToYaml.
2) Using PKI helper function in apimachinery and auth tests.
To fix that include a new file under test/utils/pki_helpers.go
that only contains the required helpers instead of including the whole
kubeadm pkiutil package.
There is another related problem:
e2e_node/e2e_node_suite_test.go includes:
k8s.io/kubernetes/cmd/kubeadm/app/util/system
But this has to be done in a follow up.
This fixes golint failures of framework/metrics_util.go.
Cleanup:
- SaturationTime was only used in test/e2e/scalability/density.go.
So this moves it into the e2e test.
- interestingClusterAutoscalerMetrics was not used in filterMetrics()
so this removes the related code.
This fixes golint failures of the following file:
- test/e2e/framework/statefulset_utils.go
This also changes the following file because a function was renamed
in the above file.
- test/e2e/apps/statefulset.go
The new image is meant to be used for testing purposes, whenever there
are significant differences between Linux and Windows in the way
something is obtained or tested. For example, the DNS suffix list can
be found in ``/etc/resolv.conf`` on Linux, but on Windows, such file
does not exist, and one way to obtain the mentioned list would be
through some powershell commands.
The image contains an extendable CLI as the entrypoint, the tests
only having to add the necessary arguments. For the previous example,
passing the ``dns-suffix`` argument will print out the comma separated
DNS suffix list, on both Linux and Windows.
The image name means that it should behave the same way on any host,
no matter the host OS.
The container status is not constant, and can change over time in the
following order:
- Running: When kubelet reports the Pod as running. This state is missable if
the container finishes its command faster than kubelet getting to report this
state.
- Terminated: After the container finishes its command, it enters the Terminated
state, in which it remains for a short period of time before kubelet tries
to restart it.
- Waiting: When kubelet has to wait for the backoff period to expire before actually
restarting the container.
Treating and handling each of these states when calculating the backoff period between
container restarts will make the tests more reliable.
This fixes golint failures on the following files:
- test/e2e/framework/deployment_util.go
- test/e2e/framework/exec_util.go
Cleanup:
- ScaleDeployment() was not used at all, so let's remove it.
- ExecCommandInPod() and ExecCommandInPodWithFullOutput() were called
in the framework only, so let's make them local.
In addition, this replaces the combination of GetCPUSummary() and
FormatCPUSummary() with LogCPUSummary() in e2e/node/kubelet_perf.go
because they are equivalent.
This fixes golint failures on the following files:
- test/e2e/framework/authorizer_util.go
- test/e2e/framework/cleanup.go
- test/e2e/framework/create.go
This fixes golint failures under test/e2e/framework/providers/gce/.
Cleanup:
* FirewallTimeoutDefault is not used at all, so remove it.
* FirewallTestTcpTimeout, FirewallTestHttpPort and FirewallTestUdpPort
are used at test/e2e/network/firewall.go only. So move them.
At the moment, Windows cannot mount individual files into Containers, which means
that the Kubelet-managed hosts file cannot be mounted into the Container, causing
the "should provide DNS for pods for Hostname and Subdomain" test to fail.
The mentioned test has /etc/hosts file entry checks. This commit separates the
DNS check and the /etc/hosts checks into two tests.
The Python code used in the example_cluster_dns test is not compatible with Python3.
Keeping in mind that Python2 will no longer be supported from 2020 onwards, it is a good idea to address this issue.
The GlusterDynamicProvisioner test will not work on GKE because master
node is in a different project and cannot talk to the pod running on
node which is used for gluster provisioner. So add the code to skip the
test on GKE
[stackdriver addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[fluentd-gcp addon] Bump fluentd-gcp-scaler to v0.5.1 to pick up security fixes.
[fluentd-gcp addon] Bump event-exporter to v0.2.4 to pick up security fixes.
[fluentd-gcp addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
[metadata-proxy addon] Bump prometheus-to-sd to v0.5.0 to pick up security fixes.
Because the code was moved, golint is now active. Because users of the
code must adapt to the new location of the code, it makes sense to
also change the API at the same time to address the style comments
from golint ("struct field ApiGroup should be APIGroup", same for
ApiExtensionClient).
The e2e test "Secret should fail to create secret in volume
due to empty secret key" tries to create a secret
with an empty key and checks whether creation fails.
But the secret creation in this test fails with
two errors, one of them due to an invalid secret name. This
lets the test pass even if the functionality
that needs to be tested is broken.
This commit fixes the secret name error (the name should not
contain capital letters) so that the secret creation fails
only for the intended reason.
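The naming rule can be illustrated with a small sketch. It assumes secret names must be lowercase DNS subdomains; the regex is a simplified stand-in for the API server's validation, and the names used are hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Subdomain is a simplified pattern for lowercase DNS subdomain
// names: lowercase alphanumerics, '-' and '.', starting and ending with an
// alphanumeric character.
var dns1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// isValidSecretName reports whether name passes the simplified check.
func isValidSecretName(name string) bool {
	return len(name) <= 253 && dns1123Subdomain.MatchString(name)
}

func main() {
	// Hypothetical names: the first contains a capital letter.
	for _, name := range []string{"secret-emptyKey-test", "secret-emptykey-test"} {
		fmt.Printf("%s valid=%v\n", name, isValidSecretName(name))
	}
}
```

With an invalid name the request is rejected before the empty key matters, so the test cannot tell whether the empty-key validation works.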
Signed-off-by: kanwar saad bin liaqat <kanwar.sbl@gmail.com>
There are two reasons why this is useful:
1. less code to vendor into external users of the framework
The following dependencies become obsolete due to this change (from `dep`):
(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg
2. works around https://github.com/kubernetes/kubernetes/issues/75338
which currently breaks vendoring
Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92db9 (as demonstrated by
586ae281ac).
We try using an atomic with a CAS, as a potential workaround for
issue #74890.
Kudos to @neolit123 for the investigation & idea.
This is a speculative workaround - we are really seeing if this is
better; if not we should revert.
If it is better we should file a bug against go 1.12, and then revert!
Issue #74890
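For readers unfamiliar with the pattern, here is a generic illustration of a compare-and-swap on an atomic value. It is not the code from this change; onceFlag and countWinners are hypothetical names. Only the first caller that swaps 0 to 1 wins, without holding a mutex:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// onceFlag is a one-shot flag: state is 0 until the first successful trySet.
type onceFlag struct {
	state int32
}

// trySet atomically changes state from 0 to 1 and returns true only for
// the caller that performed the change.
func (f *onceFlag) trySet() bool {
	return atomic.CompareAndSwapInt32(&f.state, 0, 1)
}

// countWinners starts the given number of goroutines that all race on the
// same flag and returns how many of them won the CAS.
func countWinners(goroutines int) int {
	var flag onceFlag
	var winners int32
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if flag.trySet() {
				atomic.AddInt32(&winners, 1)
			}
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt32(&winners))
}

func main() {
	fmt.Println(countWinners(50))
}
```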
We REJECT every other case. Close this FIXME.
To get this to work in all cases, we have to process service in
filter.INPUT, since LB IPs might be managed as local addresses.
This is a prefactoring for followup changes that need to use very
similar but subtly different test. Now it is more generic, though it
pushes a little logic up the stack. That makes sense to me.
Need to add brackets to the tag for sig-windows. Also fix an issue: with
the current testing structure, the driver is initialized first and the
framework is set up afterwards. So when the driver is initialized, the OS
is not yet known and we cannot set up the capabilities correctly. Instead we
have to add all the capabilities and supported fs types for both Linux and
Windows. Later in the code, we check the node OS and decide how to
run the test.
The test [sig-node] PreStop should call prestop when killing a pod
[Conformance] uses the nettest image for testing, but it turns out
that this image is configured to listen on the address 0.0.0.0.
Removing the address from the http.ListenAndServe call makes
it start listening on both IPv4 and IPv6 addresses.
Reference: https://github.com/kubernetes/kubernetes/issues/70248
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
The current e2e tests for the Container Lifecycle Hooks were not
using brackets for the IPv6 URL addresses per RFC2732, thus those
tests were failing.
This patch adds brackets to the target URL if it's an IPv6 address.
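One way to get the bracketing right in Go is net.JoinHostPort, which wraps the host in brackets when it contains a colon. This is a sketch of the idea rather than the exact change; hookURL is a hypothetical helper:

```go
package main

import (
	"fmt"
	"net"
)

// hookURL builds an HTTP URL for the given host, port and path.
// net.JoinHostPort adds brackets around IPv6 literals and leaves IPv4
// addresses and host names unchanged.
func hookURL(host, port, path string) string {
	return fmt.Sprintf("http://%s%s", net.JoinHostPort(host, port), path)
}

func main() {
	fmt.Println(hookURL("10.0.0.1", "8080", "/echo"))
	fmt.Println(hookURL("fd00::1", "8080", "/echo"))
}
```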
Reference: https://github.com/kubernetes/kubernetes/issues/70248
The test [k8s.io] Probing container [It] should not be restarted with a
/healthz http liveness probe [NodeConformance] [Conformance]
fails because it uses an nginx image that spawns a server that
only listens on IPv4 by default.
Switching to an image like TestWebserver that listens on IPv4 and IPv6 by default
allows the test to run on IPv4 and IPv6 environments.
Reference: https://github.com/kubernetes/kubernetes/issues/70248
The current regex used in the Downward API e2e tests matches only
IPv4 addresses; consequently those tests fail on IPv6 clusters.
This patch modifies the regex to match IPv4 and IPv6 addresses.
Ref: https://github.com/kubernetes/kubernetes/issues/70248
And add a corresponding flag in kubectl (for apply), even though the
value is defaulted in kubectl with "kubectl".
The flag is required for Apply patch-type, and optional for other PATCH,
CREATE and UPDATE (in which case we fallback on the user-agent).
Clean up the code paths that lead to objects being transformed and output with negotiation.
Remove some duplicate code that was not consistent. Now, watch will respond correctly to
Table and PartialObjectMetadata requests. Add unit and integration tests.
When transforming responses to Tables, only the first watch event for a given type will
include the columns. Columns will not change unless the watch is restarted.
Add a volume attachment printer and tighten up table validation error cases.
Disable protobuf from table conversion because Tables don't have protobuf because they
use `interface{}`
Windows containers do not include a route to the GCE metadata server by
default. This check is causing the "DNS should provide DNS for the
cluster" test to fail for clusters with Windows nodes
(https://testgrid.k8s.io/sig-windows#gce-windows-master&width=20).
Tested that this works by running "DNS should provide DNS for the
cluster" against an e2e cluster with Windows nodes brought up on GCE.
The current IPv6 e2e test for external connectivity uses a
domain name (google.com) as its target.
However, the equivalent IPv4 test uses the well-known Google DNS address
8.8.8.8.
The tests should be consistent, so this patch changes the target to
the Google IPv6 DNS address 2001:4860:4860::8888.
It looks like node does become unschedulable for the pod
but condition does not get added to the pod in time.
Also ginkgo could retry the test and hence it helps to use
unique node label for scheduling.
When using an already installed driver, the snapshot name is the
original driver name. Renaming was incorrectly copied from the in-tree
CSI hostpath driver.
It is useful to apply the storage testsuite also to "external" (=
out-of-tree) storage drivers. One way of doing that is setting up a
custom E2E test suite, but that's still quite a bit of work.
An easier alternative is to parameterize the Kubernetes e2e.test
binary at runtime so that it instantiates the testsuite for one or
more drivers. Some parameters have to be provided before starting the
test because they define configuration and capabilities of the driver
and its storage backend that cannot be discovered at runtime. This is
done by populating the DriverDefinition with the content of the file
that the new -storage.testdriver parameter points to.
The universal .yaml and .json decoder from Kubernetes is used. It's
flexible, but has some downsides:
- currently ignores unknown fields (see https://github.com/kubernetes/kubernetes/pull/71589)
- poor error messages when fields have the wrong type
Storage drivers have to be installed in the test cluster before
starting e2e.test. Only tests involving dynamically provisioned
volumes are currently supported.
This is the 2nd PR to move CSINodeInfo/CSIDriver APIs to
v1beta1 core storage APIs. It includes controller side changes.
It depends on the PR with API changes:
https://github.com/kubernetes/kubernetes/pull/73883
The network connection might not yet be established by the time the
container starts, causing the dig command to fail.
Retrying the dig command will solve this issue. This approach is similar
to the other DNS Conformance tests.
It has been suggested to replace the "e2eteam/busybox:1.29" image
used in the test "should be able to pull image from docker hub [NodeConformance]"
with a nanoserver image manifest list.
Adds a TODO for it.
Kubelet might miss reporting the new Running state when restarting
a pod after its backoff period expired, and thus, the pod will
continue to remain in CrashLoopBackOff state, causing the
"should cap back-off at MaxContainerBackOff" and
"should have their auto-restart back-off timer reset on image update"
tests to fail, since they're waiting for the Pods to enter a Running state.
Waiting for the next Terminated state instead of the next Running state
is more reliable.
Note that this adds 5 seconds to the restart delay due to the fact that
the Container runs for 5 seconds (its command is "sleep 5"), but it is
within the test's expectations.
This PR is the first step to transition CSINodeInfo and CSIDriver
CRDs to in-tree APIs. It adds them to the existing API group
“storage.k8s.io” as core storage APIs.
When running ginkgo directly against the source code of the test suite
instead of using some pre-compiled e2e.test binary, ginkgo no longer
recognized that it runs a Ginkgo testsuite, which broke "-focus" and
"-p".
By re-inserting the magic strings that ginkgo looks for into a
comment, we can restore the desired behavior without affecting the
code.
Fixes: #74827
Adds the test "should be able to pull from private registry with secret [NodeConformance]"
which will pull the image "gcr.io/authenticated-image-pulling/windows-nanoserver:v1".
The mentioned image is a manifest list, and it works for both
Windows Server 1803 and Windows Server 2019. The manifest list
will have to be amended when a new Windows Server is released.
Adds the test "should be able to pull image from gcr.io [NodeConformance]",
which will pull the image "gcr.io/kubernetes-e2e-test-images/windows-nanoserver:v1".
The mentioned image is a manifest list, and it works for both
Windows Server 1803 and Windows Server 2019. The manifest list
will have to be amended when a new Windows Server is released.
The command passed to the Windows Container has been changed to
"ping -t localhost", which will keep the container in the Running state,
which is required and checked by the test.
The image ``quay.io/coreos/etcd:v3.3.10`` does not have Windows
support and Windows Containers cannot be spawned using it.
Makes the etcd image's registry configurable, so the tests can be
configured to use a registry which has Windows support.
* merge pkg/api/v1/node with pkg/util/node
* update test case for utilnode
* remove package pkg/api/v1/node
* move isNodeReady to internal func
* Split GetNodeCondition into e2e and controller pkg
* fix import errors
It is unrealistic to expect a cascading delete to immediately take
effect. Somehow this test got away with it for a while, but we
have finally reached a point where apiserver performance has changed
just enough to expose this flaky expectation.
Use Nginx as the DaemonSet image instead of the ServeHostname image.
This was changed because the ServeHostname has a sleep after terminating
which makes it incompatible with the DaemonSet Rolling Upgrade e2e test.
In addition, make the DaemonSet Rolling Upgrade e2e test timeout a
function of the number of nodes that make up the cluster. This is
required because the more nodes there are, the longer the time it will
take to complete a rolling upgrade.
Signed-off-by: Alexander Brand <alexbrand09@gmail.com>
* changes audit e2e event version scheme; adds internal audit to common audit scheme; removes unneeded comments
* add more detail to audit missing events in e2e/integration tests
* adds version priority to audit scheme; updates comment
Since commit f3d79e152e, the openstack provider has been
rejected by the e2e test runner.
However, there are storage e2e tests which are related to
openstack, so this adds the registration of the openstack
provider for e2e tests.
This package contains public/private key utilities copied directly from
client-go/util/cert. All imports were updated.
Future PRs will actually refactor the libraries.
Updates #71004
When running e2e conformance tests against a public https protected
APIserver the websocket tests would fail because it fell back to using
`ws://` instead of `wss://` for the websocket connection.
This happened because the code that detects whether HTTPS is used only looks for
HTTPS related configuration in the kubeconfig, like a custom CA or
certificates.
The fix is to always use HTTPS when the apiserver URL has the scheme `https://`.
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
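The rule from the websocket fix above (always use HTTPS when the
apiserver URL has the scheme `https://`) can be sketched in Go as
follows. The websocketScheme helper is hypothetical and only shows the
decision itself, not the framework code that was changed.

```go
package main

import (
	"fmt"
	"net/url"
)

// websocketScheme derives the websocket scheme from the apiserver URL
// alone: "https" yields "wss", anything else yields "ws".
// (Hypothetical helper for illustration.)
func websocketScheme(apiserverURL string) (string, error) {
	u, err := url.Parse(apiserverURL)
	if err != nil {
		return "", err
	}
	if u.Scheme == "https" {
		return "wss", nil
	}
	return "ws", nil
}

func main() {
	for _, host := range []string{"https://api.example.com:6443", "http://localhost:8080"} {
		scheme, err := websocketScheme(host)
		fmt.Println(host, "->", scheme, err)
	}
}
```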
The e2e test "[It] should provision storage with different parameters"
depends on specific cloud providers: gce/gke, aws, openstack, vsphere
and azure. With any other cloud provider, such as local, the test is
skipped. However, getRandomClusterZone() was called before the above
cloud provider check, and if the zone label was not found, the test failed.
This makes the test call getRandomClusterZone() only if necessary
to avoid such unnecessary failures.
Fixes: #73771
Moved all flag code from `staging/src/k8s.io/apiserver/pkg/util/[flag|globalflag]` to `component-base/cli/[flag|globalflag]` except for the term function because of unwanted dependencies.
There is a risk that the init function does not reset one of the local
variables that was set by a previous test. To avoid this, all
variables set by init are now in a struct which gets cleaned
completely first.
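The pattern of keeping all per-test variables in one struct and
clearing it first can be sketched like this. The struct and its field
names are hypothetical; only the reset-by-zero-value step is the point.

```go
package main

import "fmt"

// local groups every variable that init sets for one test.
// The field names are hypothetical.
type local struct {
	driverName string
	volumes    []string
}

var l local

// initTest first overwrites the whole struct with its zero value, so
// nothing from a previous test can survive, then sets what it needs.
func initTest(driver string) {
	l = local{}
	l.driverName = driver
}

func main() {
	initTest("first")
	l.volumes = append(l.volumes, "vol-1") // state created during the test
	initTest("second")
	fmt.Println(l.driverName, len(l.volumes))
}
```

Because the reset assigns a fresh zero value, a field that is added to
the struct later is cleared automatically without touching the reset
code.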
The recommended approach for not running unsuitable tests is to skip
them at runtime with an explanation. Filtering out unsuitable test
patterns and thus not even defining unsuitable tests was done earlier
because it was faster than skipping tests at runtime.
But now these tests can be skipped efficiently, so this special case
can be removed.
CreateDriver (now called SetupTest) is a potentially expensive
operation, depending on the driver. Creating and tearing down a
framework instance also takes time (measured at 6 seconds on a fast
machine) and produces quite a bit of log output.
Both can be avoided for tests that skip based on static
information (like for instance the current OS, vendor, driver and test
pattern) by making the test suite responsible for creating framework
and driver.
The lifecycle of the TestConfig instance was confusing because it was
stored inside the DriverInfo, a struct which conceptually is static,
while the TestConfig is dynamic. It is cleaner to separate the two,
even if that means that an additional pointer must be passed into some
functions. Now CreateDriver is responsible for initializing the
PerTestConfig that is to be used by the test.
To make this approach simpler to implement (= fewer functions which
need the pointer) and the tests easier to read, the entire setup and
test definition is now contained in a single function. This is how it
is normally done in Ginkgo. This is easier to read because one can see
at a glance where variables are set, instead of having to trace values
through two additional structs (TestResource and TestInput).
Because we are changing the API already, also other changes are made:
- some function prototypes get simplified
- the naming of functions is changed to match their purpose
(tests aren't executed by the test suite, they only get defined
for later execution)
- unused methods get removed (TestSuite.skipUnsupportedTest is redundant)
This increases type safety and makes the code easier to read because
it becomes obvious that the "test resource" passed to some functions
must be the result of a previous CreateVolume.
This makes it possible to remove:
- functions that never did anything (the DeleteVolume methods in
drivers that never create a volume)
- type casts (in the DeleteVolume implementation)
- the unused DeleteVolume parameters
- the stand-alone DeleteVolume function (which would be just a non-nil
check)
GetPersistentVolumeSource and GetVolumeSource could also become
methods on more specific interfaces - they don't actually use anything
from TestDriver instance which provides them.
The main motivation however is to reduce the number of methods which
might need an explicit test config parameter.
Whether the read test after writing was done on the same node was
random for drivers that weren't locked onto a single node. Now it is
deterministic: it always happens on the same node.
The case with reading on another node is covered separately for test
configurations that support it (not locked onto a single node, more
than one node in the test cluster).
As before, the TestConfig.ClientNodeSelector is ignored by the
provisioning testsuite.
The test "should write entries to /etc/hosts" should have the [LinuxOnly] tag as
it cannot pass on Windows; individual files cannot be mounted in Windows Containers.
This test was missed in the original PR (https://github.com/kubernetes/kubernetes/pull/73204)
TestDynamicProvisioning had multiple ways of choosing additional
checks:
- the PvCheck callback
- the builtin write/read check controlled by a boolean
- the snapshot testing
Complicating matters further, that builtin write/read test had been
more customizable with new fields `NodeSelector` and
`ExpectUnschedulable` which were only set by one particular test (see
https://github.com/kubernetes/kubernetes/pull/70941).
That is confusing and will only get more confusing when adding more
checks in the future. Therefore the write/read check is now a separate
function that must be enabled explicitly by tests that want to run it.
The snapshot checking is also defined only for the snapshot test.
The test that expects unschedulable pods now also checks for that
particular situation itself. Instead of testing it with two pods (the
behavior from the write/read check) that both fail to start, only a
single unschedulable pod is created.
Because node name, node selector and the `ExpectUnschedulable` were
only used for checking, it is possible to simplify `StorageClassTest`
by removing all of these fields.
Expect(err).NotTo(HaveOccurred()) is an anti-pattern in Ginkgo testing
because a test failure doesn't explain what failed (see
https://github.com/kubernetes/kubernetes/issues/34059). We avoid it
now by making the check function itself responsible for checking
errors and including more information in those checks.
When the provisioning test gets stuck, the log fills up with messages
about waiting for a certain pod to run. Now the pod names are
pvc-[volume-tester|snapshot]-[writer|reader] plus the random
number appended by Kubernetes. This makes it easier to see where the
test is stuck.
There is no need to check for empty strings, we can also directly
initialize structs with the value. The end result is the same when the
value is empty (empty string in the struct).
This addresses the two remaining change requests from
https://github.com/kubernetes/kubernetes/pull/69036:
- replace "csi-hostpath-v0" name check with capability
check (cleaner that way)
- add feature tag to "should create snapshot with defaults" because
that is an alpha feature
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Even if snapshots are supported by the driver interface, the driver or
suite might still want to skip a particular test, so those checks
still need to be executed.
Adds the test "should be able to pull image from docker hub [WindowsOnly]",
which will pull a Windows busybox image from dockerhub. Since it is busybox,
the same command will also work for this image.
The busybox image is currently used in other E2E tests, so the image should
already be prepulled on the nodes. Additionally, the image has a manifest list
for Windows Server 1803 and Windows Server 2019, and future versions will be
added to it.
The previous version of CudaVectorAdd test image can still be used
in our testing. A later change will extend the existing gpu e2e tests
to run pods with two containers. One with CudaVectorAdd version1 and
the other with CudaVectorAdd version2 so that we can test both
Cuda versions.
At the moment, Windows cannot mount individual files into Containers, which means
that the Kubelet-managed hosts file cannot be mounted into the Container, causing
the "should provide DNS for the cluster" test to fail.
This test separates the hosts entries checks from the mentioned test to a new test.
This change modifies kubectl run e2e test to remove a race condition
where kubectl run might complete before we can scrape the logs off via
attach.
Tested via running the e2e tests a few times on gce via kubetest.
The feature is gated behind a newly introduced 'dump-systemd-journal' flag.
We want to dump the full systemd journal in our scalability performance tests.
Some of the tests cannot pass using Windows nodes due to various reasons:
- seLinuxOptions are not supported on Windows.
- Running as an UID / GID is not supported on Windows.
- file permissions work differently on Windows, and they cannot be set in
the same manner as on Linux.
- individual files cannot be mounted in Windows Containers.
- Cannot create container using Linux image (e.g.: alpine) on Windows.
Because of this, it has been decided to use the "[LinuxOnly]" tag for the
tests which cannot run on Windows because of the mentioned reasons. This way,
when running tests using Windows nodes, those tests can simply be skipped by
adding the "[LinuxOnly]" tag to the ginkgo.skip argument.
Not accepting --provider= (i.e. setting an empty provider name) broke
some test jobs. As suggested in
https://github.com/kubernetes/kubernetes/pull/73402#issuecomment-459368230,
now --provider= and not passing --provider at all both trigger a
message and then continue as if --provider=skeleton had been used.
is either decided by the schema's version priority, or by the per
resource override.
This fixes a bug where the "batch" group is encoded in v1beta1, which
was hidden when --storage-versions was a valid flag.
The empty string was the default and then triggered a special
warning. There's no good reason for that behavior, so now the special
handling for "unset provider" is gone and "skeleton" is the non-empty
default for the value.
This finishes the work started for 1.13: instead of merely warning
about an unknown value given to --profile, the test/e2e/e2e.test
binary will now print an error and refuse to run.
Fixes: #70200
This allows for overriding defaults (e.g. max requests inflight) in
prow job configs by using KUBEMARK_APISERVER_TEST_ARGS. These are
passed to the start-kubemark-master.sh script as APISERVER_TEST_ARGS.
Some tests use .yaml files to deploy pods, which have hardcoded
images. Those images cannot be used for Windows containers.
The image names can be injected by the tests themselves, based on
the configured registries.
By default, cmd is used to execute commands in Windows containers, but it
has some issues executing the command (no output):
echo 'hostName' | nc -w 1 -u <targetIP> <targetPort>
But it has no issues running (output is correct):
echo hostName | nc -w 1 -u <targetIP> <targetPort>
This causes the test "should function for node-pod communication: udp" to
fail when using Windows Containers.
Until we port the remaining e2e tests to use exec host pods instead
of SSH, it should be possible to use a bastion host when
clusters don't expose public IPs for nodes. Testers can use the
KUBE_SSH_BASTION environment variable set to a `host:port` to
specify the host that the tests will use as a jump host.
If KUBE_SSH_BASTION is specified, the test will first connect to
the bastion using SSH and the provider private key, then establish
a tunnel from the bastion to the target node by its IP. If the
node has an external IP, the external IP must be routable from the
bastion.
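As a small illustration of the `host:port` format expected in
KUBE_SSH_BASTION, the following Go sketch validates such a value with
the standard library. parseBastion is a hypothetical helper; how the
e2e framework actually parses the variable is not shown here.

```go
package main

import (
	"fmt"
	"net"
)

// parseBastion validates a KUBE_SSH_BASTION-style "host:port" value and
// returns its two parts. (Hypothetical helper for illustration.)
func parseBastion(value string) (host, port string, err error) {
	host, port, err = net.SplitHostPort(value)
	if err != nil {
		return "", "", fmt.Errorf("invalid bastion %q: %v", value, err)
	}
	if host == "" || port == "" {
		return "", "", fmt.Errorf("invalid bastion %q: host and port are required", value)
	}
	return host, port, nil
}

func main() {
	host, port, err := parseBastion("bastion.example.com:22")
	fmt.Println(host, port, err)
}
```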
Support KUBE_SSH_KEY_PATH as a single environment variable that if
specified ignores the provider specific settings and loads the
SSH private key from the provided path. Makes it much easier to
specify a consistent signer across providers.
e2e-node tests may use custom system specs to validate that nodes
conform to the specs. The functionality is switched on when the tests
are run with this command:
make SYSTEM_SPEC_NAME=gke test-e2e-node
Currently the command fails with the error:
F1228 16:12:41.568836 34514 e2e_node_suite_test.go:106] Failed to load system spec: open /home/rojkov/go/src/k8s.io/kubernetes/k8s.io/kubernetes/cmd/kubeadm/app/util/system/specs/gke.yaml: no such file or directory
Move the spec file under `test/e2e_node/system/specs` and introduce a single
public constant referring to the file to use instead of multiple private constants.
When we are running apiserver related code, we do not currently capture
the logs from `httplog.NewLogged` and `trace.LogIfLong` since the
default log verbosity is not set. So just make sure we have a minimum
verbosity set in these circumstances.
Change-Id: I64a30029778615e679b244ddba801833218d1573
Remove SLOW tag and update description for KUBEDESCRIBE(Probing container) and SIGDESCRIBE(EmptyDir Wrapper Volume)
Remove slow references for tests that execute below 5 minutes
Some tests which utilized SSH to run commands on nodes would
first look for external IPs but fall back to internal IPs
since those could be reachable by the testing program.
This change adds that same fallback logic to another method
used to find the appropriate SSH address for each node.
Fixes #68747
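The fallback described above (prefer an external IP, otherwise use an
internal one) can be sketched in Go as follows. The nodeAddress type
and sshAddress function are simplified, hypothetical stand-ins; the
strings "ExternalIP" and "InternalIP" mirror the Kubernetes node
address types.

```go
package main

import "fmt"

// nodeAddress is a simplified stand-in for a node's address entry.
type nodeAddress struct {
	kind    string // "ExternalIP" or "InternalIP"
	address string
}

// sshAddress returns the first external IP if one exists and otherwise
// falls back to the first internal IP. The boolean is false when the
// node has neither.
func sshAddress(addrs []nodeAddress) (string, bool) {
	internal := ""
	for _, a := range addrs {
		switch a.kind {
		case "ExternalIP":
			if a.address != "" {
				return a.address, true
			}
		case "InternalIP":
			if internal == "" {
				internal = a.address
			}
		}
	}
	return internal, internal != ""
}

func main() {
	node := []nodeAddress{{kind: "InternalIP", address: "10.0.0.5"}}
	addr, ok := sshAddress(node)
	fmt.Println(addr, ok)
}
```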
PR #70862 made each driver responsible for resetting its config, but
as it turned out, one place was missed in that PR: the in-tree gcepd
sets a node selector. Not resetting that caused other tests to fail
randomly depending on test execution order.
Now the test suite resets the config by taking a copy after setting up
the driver and restoring that copy before each test.
Long term the intention is to separate the entire test config from the
static driver info (https://github.com/kubernetes/kubernetes/issues/72288),
but for now resetting the config is the fastest way to fix the test flake.
Fixes: #72378
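The copy-and-restore approach can be sketched in Go as follows. The
testConfig type and its fields are hypothetical. The sketch also shows
one assumption worth stating: if the config contains a map, a plain
struct assignment would share that map, so the copy has to duplicate it.

```go
package main

import "fmt"

// testConfig is a hypothetical, reduced test config.
type testConfig struct {
	prefix       string
	nodeSelector map[string]string
}

// clone returns a copy that does not share the nodeSelector map with
// the original.
func (c testConfig) clone() testConfig {
	out := testConfig{prefix: c.prefix}
	if c.nodeSelector != nil {
		out.nodeSelector = make(map[string]string, len(c.nodeSelector))
		for k, v := range c.nodeSelector {
			out.nodeSelector[k] = v
		}
	}
	return out
}

func main() {
	cfg := testConfig{prefix: "gcepd"}
	saved := cfg.clone() // snapshot taken after setting up the driver

	// A test changes the live config, e.g. by setting a node selector.
	cfg.nodeSelector = map[string]string{"zone": "a"}

	cfg = saved.clone() // restore before the next test
	fmt.Println(cfg.prefix, len(cfg.nodeSelector))
}
```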
Exposing framework.VolumeTestConfig as part of the testsuite package
API was confusing because it was unclear which of the values in it
really have an effect. How it was set also was a bit awkward: a test
driver had a copy that had to be overwritten at test runtime and then
might have been updated and/or overwritten again by the driver.
Now testsuites has its own test config structure. It contains the
values that might have to be set dynamically at runtime. Instead of
overwriting a copy of that struct inside the test driver, the test
driver takes some common defaults (specifically, the framework pointer
and the prefix) when it gets initialized and then manages its own
copy. For example, the hostpath driver has to lock the pods to a
single node.
framework.VolumeTestConfig is still used internally and test drivers
can decide to run tests with a fully populated instance if needed (for
example, after setting up an NFS server).
This makes it possible to use the testsuites package out-of-tree
without pulling in unnecessary dependencies and code (in
test/e2e/storage/vsphere) that defines tests that are not wanted in a
custom test suite.
Different drivers support different volume sizes. Some have certain
minimum sizes, some maximum sizes. Instead of hard-coding some kind of
default into the testsuites, now each driver that supports dynamic
provisioning has to provide the size.
The setup of the V0 hostpath driver was done with copy-and-paste and
then changing just the driver name and the manifests. The same can be
achieved by making the base struct a bit more configurable, which
simplifies future changes (less code).
Renaming the provisioner container was unnecessary and was reverted to
make it possible to use the same patch configuration.
While at it, also fix the InitHostV0PathCSIDriver typo.
Commit 503f654d7a, "Update CSI tests to
point to 1.0.0 external bits", changed to images published on gcr.io,
perhaps because the images on quay.io weren't ready yet. We now have
up-to-date images on quay.io and should be using those.
The filename can overlap when multiple resources have the same name (but
obviously are of a different type). Include the name of the type in the
file name to prevent the overlap.
dnsmasq 2.79 introduced a change to respond to all norecurse queries with ServFail.
This is to prevent cache snooping where an adversary can figure out if a particular hostname has been looked up or not.
These tests do not need the norecurse flag, hence removing it.
Tests might want to use their own options for specifying a file path,
while still using the abstract file access API. For example,
framework.CreateFromManifests might be used to create a mixture of
files under the repo root and from elsewhere. To support this,
absolute paths can now be given to the testfiles package and they will
be read directly.
While debugging issues I found myself having to change the constant in the code
for a cluster > 20 nodes, and then on a very small cluster I found myself passing
0 to avoid the mostly useless output (it's useful in specific scenarios but generates
a *lot* of output that doesn't help debugging the rest of the time).
The driver dynamically provisions its volumes in /tmp. To preserve the data
across driver restarts, the directory must be mapped to a more persistent
place; /tmp on the host seems to be the safest choice.
Some mounttest related tests are checking the file permissions set on the
container files, but the default file permissions on Windows are 775 instead of
644, causing some tests to fail.
Keep in mind that file permissions work differently on Windows, and setting file
permissions via Kubernetes is not currently supported on Windows.
This change renames the '--experimental-encryption-provider-config'
flag to '--encryption-provider-config'. The old flag is accepted but
generates a warning.
In 1.14, we will drop support for '--experimental-encryption-provider-config'
entirely.
Co-authored-by: Stanislav Laznicka <slaznick@redhat.com>
Attempting to retrieve logs for a container that hasn't started yet
may have been the reason for the "the server does not allow this
method on the requested resource" error that showed up on the GCE CI
test cluster (issue #70888).
If we wait with retrieving logs until the pod is running or has
terminated, then we might be able to avoid that error. It's the right
thing to do either way and not complicated to implement.
A test name should not be the subset of another, because then it is
impossible to focus on it.
In this case, -ginkgo.focus=should.provision.storage ran both "should
provision storage" and "should provision storage with mount options"
without the ability to select just the former.
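The problem can be reproduced with a plain regular-expression match.
The sketch below assumes that the focus expression is applied as an
unanchored regular expression to each test name; the focused helper is
hypothetical and not part of Ginkgo.

```go
package main

import (
	"fmt"
	"regexp"
)

// focused returns the test names selected by a focus expression, using
// an unanchored regular-expression match against each name.
func focused(focus string, names []string) []string {
	re := regexp.MustCompile(focus)
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{
		"should provision storage",
		"should provision storage with mount options",
	}
	// The shorter name is contained in the longer one, so this
	// expression selects both tests.
	fmt.Println(focused("should.provision.storage", names))
	// The longer name can still be selected on its own.
	fmt.Println(focused("should.provision.storage.with.mount.options", names))
}
```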
There were providers which would:
- allow overrides of the file base name but not the path
- allow overrides of the file path
- not allow any overrides at all
This change makes it such that each of the supported providers
can override the SSH key location using an env var. The env
var itself may vary based on the provider though.
If given an absolute path to the key, it is used. If given
a relative path, it will be made relative to ~/.ssh.
Fixes: #68747
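The path rule above (absolute paths are used as given, relative paths
are resolved under ~/.ssh) can be sketched in Go as follows.
resolveKeyPath is a hypothetical helper; the home directory is passed
in explicitly so that the function stays easy to test, and the example
paths assume a Unix-like system.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveKeyPath returns key unchanged when it is absolute and
// otherwise joins it onto the .ssh directory below home.
// (Hypothetical helper for illustration.)
func resolveKeyPath(key, home string) string {
	if filepath.IsAbs(key) {
		return key
	}
	return filepath.Join(home, ".ssh", key)
}

func main() {
	fmt.Println(resolveKeyPath("/etc/keys/cluster_rsa", "/home/tester"))
	fmt.Println(resolveKeyPath("cluster_rsa", "/home/tester"))
}
```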
Ginkgo expects the caller to pass the appropriate skip, which we do
not do. Slightly improve call site messages by eliding the util.go
stack frame. Also drop the timestamp from skip messages since skip
is almost always called to check for global conditions, not temporal
ones.
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has an explicit InitFlags(), so we add calls to it as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
This previously caused a panic when moving lastKnownGood between two
non-nil values, because we were comparing the interface wrapper instead
of comparing the NodeConfigSources. The case of moving from one non-nil
lastKnownGood config to another doesn't appear to be tested by the e2e
node tests. I added a unit test and an e2e node test to help catch bugs
with this case in the future.
When the node lease feature is enabled, kubelet reports node status to the api server
only if there is some change or it didn't report over the last report interval.
The existence of /proc/net/nf_conntrack depends on the Linux kernel
config CONFIG_NF_CONNTRACK_PROCFS, which is disabled on Ubuntu 16.04.
So the e2e test "should set TCP CLOSE_WAIT timeout" fails there.
This adds an existence check for /proc/net/nf_conntrack and skips
the test if the file does not exist.
In addition, this makes IssueSSHCommandWithResult return Stderr in
the error message if err is nil to check "No such file or directory"
as nonexistence of /proc/net/nf_conntrack.
This patch introduces glusterfsPersistentVolumeSource in addition
to glusterfsVolumeSource. All fields remain the same as in
glusterfsVolumeSource, with the addition of a new field
called `EndpointsNamespace` that defines the namespace of the endpoint
in the spec.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Creating secrets is useful for CSI drivers like ceph-csi which have to
be configured via secrets.
While at it, the UniqueName method gets replaced with
MetaNamespaceKeyFunc which does the same thing (at least as long as
non-namespaced items don't have a redundant namespace set) and the
factory types aren't exported anymore (not necessary).
This way we can be sure that the kubelet can't communicate with the
master, even if it falls back to the internal/external IP (which seems to
be the case with DNS)
Issue #56787
As discussed in #68905,
some tests in test/e2e/common/host_path.go are covered by
test/e2e/storage/testsuites/subpath.go,
so we don't need to maintain them in test/e2e/common/host_path.go
anymore.
The striped cache used by the token cache is slightly more sophisticated;
however, the simple cache provides almost exactly the same behavior. I used
the striped cache rather than the simple cache because:
* It has been used without issue as the primary token cache.
* It performs better under load.
* It is already exposed in the public API of the token cache package.
The command executed in ValidateController function uses the
image name of the running container. This is a problem with multiarch
images, since the image name is the name of the image specific to
the architecture, but the image passed as a parameter is the multiarch
one (as the tests are architecture agnostic), making the test fail.
This patch fixes it by making the logic use a command that gets the
multiarch name given in the container spec.
Signed-off-by: Miguel Herranz <miguel@midokura.com>
This change updates the etcd storage path test to exercise custom
resource storage by creating custom resource definitions before
running the test.
Duplicated custom resource definition test logic was consolidated.
Signed-off-by: Monis Khan <mkhan@redhat.com>
Put into OWNERS files what has effectively been happening
for the past few months: we review and shepherd PRs
through other SIG reviewers before bringing them to sig-arch
for a final /approve.
The goal is to eventually grow approvers when it's been
demonstrated that we have sufficient context.
Tests for scaling down based on external metric are flaky. I think this
is because they:
- Start with 2 replicas,
- Export metric value == 1/2 target,
- Expect scale down to 1.
Since the expected recommendation is exactly 1 it might flake (and with
scale down stabilization any recommendations higher than 1 will
persist).
Change expected value of the metric so recommended size will be lower
than 1. This should make those tests less flaky.
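The arithmetic behind this flake can be sketched with the basic
autoscaler formula, desired = ceil(current * metric / target). The Go
code below is a simplified model that ignores tolerance and scale-down
stabilization; the concrete metric values are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the basic scaling formula
// ceil(current * metric / target). Tolerance and stabilization are
// deliberately left out of this simplified model.
func desiredReplicas(current int, metric, target float64) int {
	return int(math.Ceil(float64(current) * metric / target))
}

func main() {
	// Metric at exactly half the target: the recommendation sits
	// right on the boundary between 1 and 2 replicas.
	fmt.Println(desiredReplicas(2, 50, 100))
	// A reading slightly above half the target already rounds up to 2.
	fmt.Println(desiredReplicas(2, 51, 100))
	// A metric clearly below half the target leaves room for noise.
	fmt.Println(desiredReplicas(2, 40, 100))
}
```

With the metric clearly below half of the target, small fluctuations
in the reported value no longer push the rounded result above 1.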
The detailed dumps of original and patched item content were useful
while developing the feature, but are less relevant now and too
verbose. They might be relevant again, so they are left in the code as
comments.
What gets logged now is just a single-line "creating" resp. "deleting"
message with the type of the item and its unique name.
This also enhances some other aspects of the original logging:
- the namespace is included for item types that are namespaced
- the "deleting" message no longer gets replicated in each factory
method
Fixes: #70448
- Scale down based on custom metric was flaking. Increase target value
of the metric.
- Scale down based on CPU was flaking during stabilization. Increase
tolerance of stabilization (caused by resource consumer using more CPU
than requested).
Debugging the CSI driver tests depends a lot on the output of the CSI
sidecar containers and the CSI driver, but that information was not
captured automatically and thus unavailable after a test run. This is
particularly bad when running in a remote CI system, but also manually
watching the cluster during a test was cumbersome.
Now pod events and log messages get copied to the test's output at the
time that they happen (when running without report directory) or get
written to individual log files (when running with report directory in
the CI).
Ensuring that CSI drivers get deployed for testing exactly as intended
was problematic because the original .yaml files had to be converted
into code. e2e/manifest helped a bit, but not enough:
- could not load all entities
- didn't handle loading .yaml files with multiple entities
- actually creating and deleting entities still had to be done in tests
The new framework utility code handles all of that, including the
tricky cleanup operation that tests got wrong (AfterEach does not get
called after test failures!).
In addition, it ensures that each test gets its own instance of the
entities.
The PSP role binding for hostpath is now necessary because we switch
from creating a pod directly to creation via the StatefulSet
controller, which runs with fewer privileges.
Without this, the hostpath test runs into these errors in the
kubernetes-e2e-gce job:
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpath-attacher: {statefulset-controller } FailedCreate: create Pod csi-hostpath-attacher-0 in StatefulSet csi-hostpath-attacher failed error: pods "csi-hostpath-attacher-0" is forbidden: unable to validate against any pod security policy: []
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpath-provisioner: {statefulset-controller } FailedCreate: create Pod csi-hostpath-provisioner-0 in StatefulSet csi-hostpath-provisioner failed error: pods "csi-hostpath-provisioner-0" is forbidden: unable to validate against any pod security policy: []
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpathplugin: {daemonset-controller } FailedCreate: Error creating: pods "csi-hostpathplugin-" is forbidden: unable to validate against any pod security policy: []
The extra role binding is silently ignored on clusters which don't
have this particular role.
When we get an unsupported provider message, it often isn't clear what
method actually failed - add more information to the error message.
Issue #70280
Setting command line arguments via env variables that are not needed
by the binaries is just unnecessarily complex. The driver renaming
code in the E2E manifest PR would have to be made more complex to deal
with such a deployment. It is easier for that code and humans who look
at the .yaml file to remove the indirection.
I increased request from 500mCPU to 1CPU in
https://github.com/kubernetes/kubernetes/pull/70125 but this interacts
poorly with other e2e tests (higher requests mean pods fail to schedule).
So revert that change.
Force ELB to ensureHealthCheck when target pool exists.
Add e2e test to ensure that HC interval will be reconciled when
kube-controller-manager restarts.
Health checks with bigger thresholds and larger intervals will not be reconciled.
Add unittest for ILB and ELB to ensure HC reconciles and is configurable.
With CRI-O we've been hitting a lot of flakes with the following test:
[sig-apps] CronJob should remove from active list jobs that have been deleted
The events shown in the test failures in both kube and openshift were the following:
STEP: Found 13 events.
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid: {cronjob-controller } SuccessfulCreate: Created job forbid-1540412040
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid-1540412040: {job-controller } SuccessfulCreate: Created pod: forbid-1540412040-z7n7t
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid-1540412040-z7n7t: {default-scheduler } Scheduled: Successfully assigned e2e-tests-cronjob-rjr2m/forbid-1540412040-z7n7t to 127.0.0.1
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Pulled: Container image "docker.io/library/busybox:1.29" already present on machine
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Created: Created container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Started: Started container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:12 +0000 UTC - event for forbid: {cronjob-controller } MissingJob: Active job went missing: forbid-1540412040
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid: {cronjob-controller } SuccessfulCreate: Created job forbid-1540412100
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid-1540412100: {job-controller } SuccessfulCreate: Created pod: forbid-1540412100-rq89l
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid-1540412100-rq89l: {default-scheduler } Scheduled: Successfully assigned e2e-tests-cronjob-rjr2m/forbid-1540412100-rq89l to 127.0.0.1
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Started: Started container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Created: Created container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Pulled: Container image "docker.io/library/busybox:1.29" already present on machine
The test code is racy because the Forbid policy can still let the controller create
a new pod for the cronjob. CRI-O is fast at re-creating the pod, so by the time
the test code reaches the check, it fails. The events are as follows:
[It] should remove from active list jobs that have been deleted
/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/apps/cronjob.go:192
STEP: Creating a ForbidConcurrent cronjob
STEP: Ensuring a job is scheduled
STEP: Ensuring exactly one is scheduled
STEP: Deleting the job
STEP: deleting Job.batch forbid-1540412040 in namespace e2e-tests-cronjob-rjr2m, will wait for the garbage collector to delete the pods
Oct 24 20:14:02.533: INFO: Deleting Job.batch forbid-1540412040 took: 2.699182ms
Oct 24 20:14:02.634: INFO: Terminating Job.batch forbid-1540412040 pods took: 100.223228ms
STEP: Ensuring job was deleted
STEP: Ensuring there are no active jobs in the cronjob
[AfterEach] [sig-apps] CronJob
/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:148
It looks clear that by the time we're ensuring that there are no more active jobs, a
new job could _already_ have been spun up, making the test flaky.
This PR fixes the above by making sure that the _deleted_ job is no longer in the Active
list. Another pod may already be running with a different UUID, which is
fine for the purpose of the test.
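A minimal sketch of the relaxed check, assuming a simplified `activeJob` type in place of the object references in the CronJob status (the type, the helper name and the UID values are illustrative, not taken from the test code):

```go
package main

import "fmt"

// activeJob is a simplified, hypothetical stand-in for an entry in a
// CronJob's active list.
type activeJob struct {
	Name string
	UID  string
}

// deletedJobStillActive reports whether the job with the given UID is
// still listed as active. Other entries, such as a replacement job that
// the controller has already created, are ignored on purpose.
func deletedJobStillActive(active []activeJob, deletedUID string) bool {
	for _, j := range active {
		if j.UID == deletedUID {
			return true
		}
	}
	return false
}

func main() {
	// The controller already created a replacement job after the deletion.
	active := []activeJob{{Name: "forbid-1540412100", UID: "uid-new"}}
	fmt.Println(deletedJobStillActive(active, "uid-old")) // false
}
```

Checking for the absence of one specific job, instead of requiring an empty list, is what makes the assertion independent of how quickly the next job gets created.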
Signed-off-by: Antonio Murdaca <runcom@linux.com>
The E2E refactoring tightened the sanity checking of the --provider
parameter such that it only allowed known providers. That seemed to
make sense because it catches typos, but it turned out that various
callers depended on the "accept arbitrary provider value" behavior,
therefore it gets restored.
Provisioning test for retain policy requires each driver's backend volume
deletion logic. Without it, volume leakage happens. Move this test back
to volume_provisioning.go and test it only for gce, until general
backend volume deletion code for each driver becomes available.
Fixes: #70191
Resource consumer might use slightly more CPU than requested. That
resulted in HPA sometimes increasing size of deployments during e2e
tests. Deflake tests by:
- Scaling up CPU requests in those tests. Resource consumer might go a fixed
number of milli CPU seconds above target. Having higher requests makes
the test less sensitive.
- On scale down, consume CPU in the middle between what would generate a
recommendation of the expected size and 1 pod fewer (instead of right on the
edge between expected and expected +1).
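The arithmetic behind the second point can be sketched as follows. This assumes a simplified model in which the recommended size is the total consumption divided by the per-pod target, rounded up; the real HPA also applies a tolerance, which is ignored here. The function names and the numbers are illustrative only.

```go
package main

import "fmt"

// recommendedReplicas is a simplified model of the recommendation:
// total consumption divided by the per-pod target, rounded up.
func recommendedReplicas(totalMilliCPU, targetPerPodMilliCPU int) int {
	return (totalMilliCPU + targetPerPodMilliCPU - 1) / targetPerPodMilliCPU
}

// midpointMilliCPU returns the consumption halfway between the largest
// load that still fits expectedPods-1 pods and the largest load that
// still fits expectedPods pods.
func midpointMilliCPU(expectedPods, targetPerPodMilliCPU int) int {
	lower := (expectedPods - 1) * targetPerPodMilliCPU
	upper := expectedPods * targetPerPodMilliCPU
	return (lower + upper) / 2
}

func main() {
	const target = 100   // milli-CPU per pod (assumed)
	const overshoot = 40 // extra milli-CPU the consumer might burn (assumed)

	mid := midpointMilliCPU(3, target)
	fmt.Println(mid)                                       // 250
	fmt.Println(recommendedReplicas(mid+overshoot, target)) // 3
	fmt.Println(recommendedReplicas(300+overshoot, target)) // 4
}
```

With consumption at the midpoint, the assumed overshoot still yields 3 pods, while the same overshoot on the edge at 300 tips the recommendation to 4.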
Some variables were int32 but always cast to int before use. Make them
int.
The service account authenticator isn't the only authenticator that
should respect API audience. The authentication config structure should
reflect that.
If dig fails to reach the DNS, it will output errors to stdout, which
can cause the test -n to falsely pass:
ubuntu@ubuntu:~$ dig +notcp +noall +answer +search kubernetes.default A
;; connection timed out; no servers could be reached
command terminated with exit code 9
ubuntu@ubuntu:~$ test -n "$(dig +notcp +noall +answer +search kubernetes.default A)" && echo OK
OK
This patch solves this issue by making sure the dig command actually succeeds
before checking its output.
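The same rule, expressed as a small Go sketch: the output is only trusted when the command itself succeeded. `probeOK` and `errExit` are hypothetical names, and the answer string in the second call is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errExit stands in for a non-zero exit status from the probe command.
var errExit = errors.New("command terminated with exit code 9")

// probeOK requires both a successful exit and non-empty output. Trailing
// newlines are dropped first, as shell command substitution does.
func probeOK(output string, runErr error) bool {
	if runErr != nil {
		return false
	}
	return strings.TrimRight(output, "\n") != ""
}

func main() {
	timedOut := ";; connection timed out; no servers could be reached\n"
	fmt.Println(probeOK(timedOut, errExit)) // false
	fmt.Println(probeOK("10.0.0.1\n", nil)) // true
}
```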
The test "should run with the expected status" passes with and without
the set SELinuxOptions, but removing it will ensure that the test will
be able to run and pass on Windows nodes as well.
This change updates the ETCD storage test so that its data is
exported. Thus it can be used by other tests. The dry run test was
updated to consume this data instead of having a duplicate copy.
The code to start a master that can be used for "one of every
resource" style tests was also factored out. It is reused in the
dry run test as well.
This prevents these tests from drifting in the future and reduces
the long term maintenance burden.
Signed-off-by: Monis Khan <mkhan@redhat.com>
In some storage tests, kubelet is stopped first and the test checks for the node
NotReady state. However, if the node fails to reach this state, kubelet is
never restarted because the defer function is placed after the stop
kubelet command. This PR fixes this issue.
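The ordering issue can be illustrated with a small sketch. `fakeKubelet`, `alwaysFails` and `runDisruptiveStep` are hypothetical stand-ins, not framework code; the point is only that the restart is deferred before any step that can fail.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeKubelet is a hypothetical stand-in for the kubelet service on a node.
type fakeKubelet struct{ running bool }

func (k *fakeKubelet) stop()  { k.running = false }
func (k *fakeKubelet) start() { k.running = true }

// alwaysFails simulates the node never reaching the NotReady state.
func alwaysFails() error { return errors.New("timed out") }

// runDisruptiveStep registers the restart first, then stops the kubelet
// and runs the check, so the kubelet comes back even if the check fails.
func runDisruptiveStep(k *fakeKubelet, check func() error) error {
	defer k.start()
	k.stop()
	if err := check(); err != nil {
		return fmt.Errorf("node did not become NotReady: %w", err)
	}
	return nil
}

func main() {
	k := &fakeKubelet{running: true}
	err := runDisruptiveStep(k, alwaysFails)
	fmt.Println(err != nil, k.running) // true true
}
```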
Make CreatePrivilegedPSPBinding reentrant so tests using it (e.g. DNS) can be
executed more than once against a cluster. Without this change, such tests will
fail because the PSP already exists, short circuiting test setup.
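A sketch of what "reentrant" means here, using a hypothetical in-memory `fakeCluster` instead of a real API client. Against a real cluster the equivalent check would typically be `apierrors.IsAlreadyExists` from k8s.io/apimachinery; whether the actual change uses exactly that is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the API server's "already exists" error.
var errAlreadyExists = errors.New("already exists")

// fakeCluster is a hypothetical in-memory stand-in for the API server.
type fakeCluster struct{ objects map[string]bool }

func (c *fakeCluster) create(name string) error {
	if c.objects[name] {
		return errAlreadyExists
	}
	c.objects[name] = true
	return nil
}

// ensureExists creates the object and treats "already exists" as success,
// so repeated calls against the same cluster are safe.
func ensureExists(c *fakeCluster, name string) error {
	err := c.create(name)
	if err != nil && !errors.Is(err, errAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	c := &fakeCluster{objects: map[string]bool{}}
	// The object name is made up for this example.
	fmt.Println(ensureExists(c, "example-psp"), ensureExists(c, "example-psp"))
}
```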
Not all users of the E2E framework want to run cloud-provider specific
tests. By splitting out the code it becomes possible to decide in
an E2E test suite which providers are supported.
This is achieved in two ways:
- the framework calls certain functions through a provider
interface instead of calling specific cloud provider functions
directly
- tests that are cloud-provider specific directly import the
new provider packages
The ingress test utilities are only needed by a few tests. Splitting
them out into a separate package makes the framework simpler for test
suites not using those tests.
Fixes: #66649
When a test fails because the PV/PVC Protection finalizer is
missing, the error message is just
Expected
<bool>: false
to be true
which does not explain what happened during the test.
This adds more debugging info to the tests.
Some commands used in tests are Linux specific and do not exist
or do not behave the same on Windows nodes. This can cause those
tests to fail on Windows nodes.
Replaces the mentioned commands with ones that behave the same on
both Linux and Windows.
The test "Should be able to deny attaching pod" can randomly fail
because it never waits for the pod to enter a "Running" state. Because
of this, the "kubectl attach" command can potentially fail, if the pod is
not in the correct state.
Additionally, if the "kubectl attach" actually manages to attach, then the
test will hang. The command should be executed with a timeout.
After the call to ioutil.TempDir, the directory has already been
created, and MkdirAll therefore can't do anything. The mode argument
in particular is misleading.
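A short sketch of the point, using `os.MkdirTemp`, the current replacement for `ioutil.TempDir` (the helper names and the directory prefix are illustrative):

```go
package main

import (
	"fmt"
	"os"
)

// makeScratchDir returns a fresh temporary directory. os.MkdirTemp creates
// the directory itself, so no follow-up MkdirAll is needed.
func makeScratchDir() (string, error) {
	return os.MkdirTemp("", "e2e-scratch-")
}

// isDir reports whether path exists and is a directory.
func isDir(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	dir, err := makeScratchDir()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	fmt.Println(isDir(dir)) // true
}
```

A later `os.MkdirAll(dir, mode)` on that path returns nil without changing the permissions of the existing directory, which is why passing a mode there is misleading.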
Tests settings should be defined in the test source code itself
because conceptually the framework is a separate entity that not all
test authors can modify.
For the sake of backwards compatibility the names of the command line
flags are not changed.
Tests settings should be defined in the test source code itself
because conceptually the framework is a separate entity that not all
test authors can modify.
Using the new framework/config code also has several advantages:
- defaults can be set with less code
- no confusion around what's a duration
- the options can also be set via command line flags
While at it, a minor bug gets fixed:
- readConfig() returns only defaults when called while
registering Ginkgo tests because Viperize() gets called later,
so the scale in the logging soak test couldn't really be configured;
now the value is read when the test runs and thus can be changed
The options get moved into the "instrumentation.logging"
resp. "instrumentation.monitoring" group to make it more obvious where
they are used. This is a breaking change, but that was already
necessary to improve the duration setting from plain integer to a
proper time duration.
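The idea of test-owned, grouped options with a proper duration type can be sketched with the standard `flag` package alone. This does not use framework/config; the struct, the option names below the "instrumentation.logging" group and the defaults are assumptions made for the example.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// loggingSoakConfig is a hypothetical settings struct owned by the test.
type loggingSoakConfig struct {
	Scale    int
	Interval time.Duration
}

// registerFlags defines the options under a hierarchical prefix.
func registerFlags(fs *flag.FlagSet, cfg *loggingSoakConfig) {
	fs.IntVar(&cfg.Scale, "instrumentation.logging.soak.scale", 1,
		"number of soak namespaces (example option)")
	fs.DurationVar(&cfg.Interval, "instrumentation.logging.soak.interval",
		10*time.Second, "delay between log lines (example option)")
}

// parse builds a flag set, registers the options and parses args.
func parse(args []string) (loggingSoakConfig, error) {
	var cfg loggingSoakConfig
	fs := flag.NewFlagSet("e2e", flag.ContinueOnError)
	registerFlags(fs, &cfg)
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, err := parse([]string{"-instrumentation.logging.soak.interval=500ms"})
	fmt.Println(cfg.Scale, cfg.Interval, err) // 1 500ms <nil>
}
```

Because the struct is filled at parse time, a test that reads `cfg` while it runs sees the configured values, whereas reading the settings during test registration would only see defaults.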
Tests shouldn't have to use the central context for their settings,
because conceptually tests and framework get developed independently.
This does not yet use the new framework/config utility code because
that code still needs to be reviewed.
Besides moving the flags, they also get renamed from the top-level
"--csiImage{Version|Registry}" to
"--storage.csi.image.{version|registry}". These flags were introduced
fairly recently and shouldn't be in use much, so now is a good time to
introduce a hierarchical naming for storage flags, in particular
because more flags will be added soon.
Storing settings in the framework's TestContext is not something that
out-of-tree test authors can do because for them the framework is a
read-only upstream component. Conceptually the same is true for
in-tree tests, so the recommended approach is to define configuration
settings in the code that uses them.
How to do that is a bit uncertain. Viper has several
drawbacks (maintenance status uncertain, cannot list supported
options, cannot validate the configuration file). How to handle
configuration files is currently getting discussed for kubeadm, with
similar concerns about
Viper (https://github.com/kubernetes/kubeadm/issues/1040).
Instead of making a choice now for E2E, the recommendation is that
test authors continue to define command line flags as before, except
that they should do it in their own code and with better flag names.
But the ability to read options also from a file is useful, so
several enhancements get added:
- all settings defined via flags can also be read from a
configuration file, without extra work for test authors
- framework/config makes it possible to populate a struct directly
and define flags with a single function call
- a path and file suffix can be given to --viper-config (as in
"--viper-config /tmp/e2e.json") instead of expecting the file in
the current directory; as before, just plain "--viper-config e2e"
still works
- if "--viper-config" is set, the file must exist; otherwise the
"e2e" config is optional (as before)
- errors from Viper are no longer silently ignored, so syntax errors
are detected early
- Viper support is optional: test suite authors who don't want
it are not forced to use it by the e2e/framework
Individual implementations are not yet being moved.
Fixed all dependencies which call the interface.
Fixed golint exceptions to reflect the move.
Added project info as per @dims and
https://github.com/kubernetes/kubernetes-template-project.
Added dims to the security contacts.
Fixed minor issues.
Added missing template files.
Copied ControllerClientBuilder interface to cp.
This allows us to break the only dependency on K8s/K8s.
Added TODO to ControllerClientBuilder.
Fixed GoDeps.
Factored in feedback from JustinSB.
Some e2e tests are skipped depending on the Linux distribution of
master and node, and the options can be one of "debian", "ubuntu",
"gci" or "custom". This updates the help message of the options.
The words "error" and "fail" are magic in test output, and are
highlighted in the build logs. Change some test strings to avoid
hitting the highlighting in normal operation.
The new test/e2e/framework/testfiles package makes it possible to
write tests that do not depend on a specific way of providing
additional test files at runtime. Such tests and the framework are
then more easily reused in other test suites.
In the test/e2e suite file access is enabled based on the existing
"repo-root" command line parameter and the built-in bindata. Tests
using the new API will first check for files under "repo-root" and
then fall back to the builtin data. This way, users of a test binary
can modify those files without having to rebuild the binary.
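The lookup order described above can be sketched as a chain of sources. The `fileSource` interface, `mapSource` and `readFirst` are hypothetical and do not reproduce the API of the testfiles package; in-memory maps stand in for both "repo-root" and the built-in bindata.

```go
package main

import (
	"errors"
	"fmt"
)

// fileSource is a hypothetical interface for one place that test files
// can come from.
type fileSource interface {
	read(path string) ([]byte, bool)
}

// mapSource serves files from memory; it stands in for both the
// repo-root directory and the built-in data in this sketch.
type mapSource map[string]string

func (m mapSource) read(path string) ([]byte, bool) {
	data, ok := m[path]
	return []byte(data), ok
}

// readFirst returns the file from the first source that provides it.
func readFirst(path string, sources ...fileSource) ([]byte, error) {
	for _, s := range sources {
		if data, ok := s.read(path); ok {
			return data, nil
		}
	}
	return nil, errors.New("test file not found: " + path)
}

func main() {
	repoRoot := mapSource{"manifests/a.yaml": "edited on disk"}
	builtin := mapSource{"manifests/a.yaml": "built in", "manifests/b.yaml": "built in"}

	a, _ := readFirst("manifests/a.yaml", repoRoot, builtin)
	b, _ := readFirst("manifests/b.yaml", repoRoot, builtin)
	fmt.Println(string(a), "/", string(b)) // edited on disk / built in
}
```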
"repo-root" is still needed because at least some tests check for
additional files (secret.yaml, via ingress_utils.go) that are not part
of the upstream source code and thus may or may not be built into a
test binary.
Tests using bindata or repo-root directly get modified to use the new
API, or removed when they are obsolete: test/e2e/examples.go depended
on files that were removed in
https://github.com/kubernetes/kubernetes/pull/61246 and thus can no
longer be run in Kubernetes. Moving the tests to kubernetes/examples
is tracked in https://github.com/kubernetes/examples/issues/214.
The file removal did not break the automated E2E testing probably
because the tests are under the Feature:Example tag and thus not
enabled during normal CI runs.
Removing also the obsolete tests makes it simpler to rework the
"repo-root" setting because less code uses it.
Related-to: #66649 and #23987
New command is now `kubectl diff` rather than `kubectl alpha diff` since
it's moving out of alpha soon, and will be using dry-run apply to
produce the diff rather than the custom merge logic.
This change makes the schema structs consistently use non-pointer
receivers. This makes it easier to call these methods since these
structs are used as values instead of pointers.
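A small illustration of why value receivers are more convenient for structs that are stored by value (`schemaField` and `lookup` are made-up names, not the actual schema structs):

```go
package main

import "fmt"

// schemaField is a hypothetical struct that is stored in maps by value.
type schemaField struct{ name string }

// Describe uses a value receiver, so it can be called on values that are
// not addressable, such as map elements.
func (f schemaField) Describe() string { return "field " + f.name }

// lookup calls the method directly on a map element. With a pointer
// receiver this call would not compile, because map elements are not
// addressable.
func lookup(fields map[string]schemaField, key string) string {
	return fields[key].Describe()
}

func main() {
	fields := map[string]schemaField{"spec": {name: "spec"}}
	fmt.Println(lookup(fields, "spec")) // field spec
}
```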
Signed-off-by: Monis Khan <mkhan@redhat.com>
Bootstrap initializes the necessary vSphere objects before the tests are
run. A call to Bootstrap was missing in persistent_volumes-vsphere.go's
BeforeEach. This results in a panic while running e2e tests for the 'vsphere'
provider with a stack trace like this:
/usr/local/go/src/runtime/panic.go:502 +0x229
github.com/docker/kube-e2e-image/vendor/k8s.io/kubernetes/test/e2e/storage/vsphere.glob..func1.1()
/go/src/github.com/docker/kube-e2e-image/vendor/k8s.io/kubernetes/test/e2e/storage/vsphere/persistent_volumes-vsphere.go:77
+0xa21
github.com/docker/kube-e2e-image/vendor/github.com/onsi/ginkgo/internal/leafnodes.(*runner).runSync(0xc4217c9b60,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
/go/src/github.com/docker/kube-e2e-image/e2e_test.go:88 +0x2c8
testing.tRunner(0xc4206e01e0, 0x4212900)
/usr/local/go/src/testing/testing.go:777 +0xd0
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:824 +0x2e0
This change fixes the Panic by calling Bootstrap.
Testing:
After this change, tests with FOCUS set to "PersistentVolumes:vsphere"
don't panic. They pass as expected.
Signed-off-by: Anusha Ragunathan <anusha.ragunathan@docker.com>
Normally the pod would get created via a DaemonSet controller, but
during testing it is easier to create it directly. We just need to
ignore errors (like 'No API token found for service account
"csi-service-account"') and retry for a while. If the error persists,
the test will still abort and report the error eventually.
This problem also occurs elsewhere, so a utility function in the
framework for it seems justified.
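A minimal sketch of such a retry helper, written with the standard library only. `retryUntil` and `errTransient` are hypothetical names; the framework's own utility may look different, for example by building on the polling helpers in k8s.io/apimachinery.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a temporary failure such as a missing
// service account token.
var errTransient = errors.New("no API token found yet")

// retryUntil calls fn until it succeeds or the attempts are used up,
// sleeping between tries. The last error is reported if it persists.
func retryUntil(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retryUntil(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(err, calls) // <nil> 3
}
```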
Fixes: #68776
This fixes the Redis StatefulSet e2e test by adding the missing `publishNotReadyAddresses: true` field, which was accidentally left out in #63742.
Without this fix, the redis e2e test will fail because the pod is unable
to lookup the service:
```
2018/09/14 13:57:10 lookup redis on 172.31.9.247:53: no such host
```
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
The current interface is kind of clunky and not super easy to use, since
you have to specify parameters to specify which versions to diff. Also
the default isn't the most useful setting.
Change the interface by removing all the parameters and force only one
useful use-case, that is: diffing what's currently live against
what would be live if applied.
The EtcdMain function in integration tests is designed to launch an etcd
instance only when running in a bazel test. For non-bazel, we rely on
hack/make-rules/test-integration.sh to bring up the etcd instance.
This patch fixes the following in EtcdMain:
1. If etcd is not found in ${RUNFILES_DIR} then look in ${PATH}.
2. Try to connect to the etcd started by `make test-integration`; if it
is up, then don't start etcd.
3. Gracefully shut down etcd after tests.
4. Get a port from the OS instead of deriving it from argv[0].
5. Don't use sync.Once.
The benefit of this change is that integration tests work with `go test`
as well as `make test-integration` without users needing to do anything
special. That makes it much easier to pass go testing flags to tests and
integrate with IDEs.
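Item 4 in the list above relies on a common technique: listen on port 0 and let the kernel choose. A sketch, assuming a hypothetical helper named `freePort` (not the name used in EtcdMain):

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port on the loopback
// interface by listening on port 0 and reading back the chosen port.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(port > 0) // true
}
```

The listener is closed before the port is handed to another process, so a small window remains in which something else could grab the port. For test setups this is usually accepted.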
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Added unschedulable and network-unavailable toleration.
Signed-off-by: Da K. Ma <klaus1982.cn@gmail.com>
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #61312
fixes: https://github.com/kubernetes/kubernetes/issues/67606
**Release note**:
```release-note
If `TaintNodesByCondition` is enabled, add `node.kubernetes.io/unschedulable` and
`node.kubernetes.io/network-unavailable` automatically to DaemonSet pods.
```
Automatic merge from submit-queue (batch tested with PRs 65250, 68241). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Initial node performance testing framework.
This PR adds a framework for node performance testing.
Partially fixes: https://github.com/kubernetes/kubernetes/issues/65249.
Use the following command to run this test:
```sh
make test-e2e-node FOCUS="Node Performance Testing" SKIP="" PARALLELISM=1
```
It has been tested in the following environment:
- n1-standard-16
- Ubuntu 16.04
- docker 17.03.2
Note to reviewers:
This PR won't pass node e2e since the docker images in https://github.com/kubernetes/kubernetes/pull/65251 are required for this to function. The node e2e will fail when trying to pull the required images for testing.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
CSI Node info registration in kubelet
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #67683
**Special notes for your reviewer**:
Feature issue: https://github.com/kubernetes/features/issues/557
Design doc: https://github.com/kubernetes/community/pull/2034
Missing pieces:
* CSI client retry and exponential backoff logic.
* CSINodeInfo object validation
* e2e test with all the CSI machinery.
An RBAC rule is also added to support external-provisioner topology updates.
**Release note**:
```release-note
Registers volume topology information reported by a node-level Container Storage Interface (CSI) driver. This enables Kubernetes support of CSI topology mechanisms.
```
Automatic merge from submit-queue (batch tested with PRs 68341, 68385). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fixed error message reporting for Gluster driver tests
Quick fix for error message for Gluster driver tests. This doesn't solve the problem but will make it easier to pinpoint the issue if it flakes again on CI.
Related to: #68373
/sig storage
/kind bug
/assign @msau42
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67950, 68195). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Remove e2e-image-puller
**What this PR does / why we need it**:
A long time ago, we added the image prepulling as a workaround due to
the overwhelming number of flakes caused by pulling during the tests.
This functionality has been broken for a while now, since we switched to a
COS image where mounting the `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63355
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Update default etcd server to 3.2.24 for kubernetes 1.12
**What this PR does / why we need it**:
Update default etcd server to 3.2.24 for kubernetes 1.12
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
xref #68147
**Special notes for your reviewer**:
NONE
**Release note**:
```
Update default etcd server to 3.2.24 for kubernetes 1.12
```
/assign @wojtek-t @jpbetz @dims
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews @gyuho
What this PR does / why we need it:
Simple code and typo fixes in the NFS tests. The NFS tests are useful as an example of how to configure an NFS server, and this typo was hurting code comprehension.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
none
Special notes for your reviewer:
none
Release note:
none
Automatic merge from submit-queue (batch tested with PRs 68087, 68256, 64621, 68299, 68296). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Bump addon-manager to v8.7
**What this PR does / why we need it**:
Major changes:
- Support extra `--prune-whitelist` resources in kube-addon-manager.
- Update kubectl to v1.10.7.
Basically picking up https://github.com/kubernetes/kubernetes/pull/67743.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @Random-Liu @mikedanese
**Release note**:
```release-note
Bump addon-manager to v8.7
- Support extra `--prune-whitelist` resources in kube-addon-manager.
- Update kubectl to v1.10.7.
```
This changes the custom metrics client logic over to support multiple versions
of the custom metrics API by checking discovery to find the appropriate versions.
Fixes #68011
Co-authored-by: Solly Ross <sross@redhat.com>
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fix gce localssd pv tests
**What this PR does / why we need it**:
When running local PV tests against GCE local SSD, the test uses the disk directly, so it doesn't need to create a tmp dir like the other test formats. Fsgroup tests do not create test-file, so don't error on cleanup if the file doesn't exist.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #68308
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 68171, 67945, 68233). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Move the CloudControllerManagerConfiguration to an API group in `cmd/`
**What this PR does / why we need it**:
This PR is the last piece of https://github.com/kubernetes/kubernetes/issues/67233.
It moves the `CloudControllerManagerConfiguration` to its own `cloudcontrollermanager.config.k8s.io` config API group, but unlike the other components this API group is "private" (only available in `k8s.io/kubernetes`, which limits consumer base), as it's located entirely in `cmd/` vs a staging repo.
This decision was made for now as we're not sure what the story for the ccm loading ComponentConfig files is, and probably a "real" file-loading ccm will never exist in core, only helper libraries. Eventually the ccm will only be a library in any case, and implementors will/can use the base types the ccm library API group provides. It's probably good to note that there is no practical implication of this change as the ccm **cannot** read ComponentConfig files. Hence the code move isn't user-facing.
With this change, we're able to remove `pkg/apis/componentconfig`, as this was the last consumer. That is hence done in this PR as well (so the move is easily visible in git, vs first one "big add" then a "big remove"). The only piece of code that was used was the flag helper structs, so I moved them to `pkg/util/flag` that I think makes sense for now.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref: kubernetes/community#2354
**Special notes for your reviewer**:
This PR builds on top of (first two commits, marked as `Co-authored by: @stewart-yu`) https://github.com/kubernetes/kubernetes/pull/67689
**Release note**:
```release-note
NONE
```
/assign @liggitt @sttts @thockin @stewart-yu
Automatic merge from submit-queue (batch tested with PRs 68171, 67945, 68233). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
[e2e] verifying LimitRange update is effective before creating new pod
**What this PR does / why we need it**:
Refer to the flaky test mentioned in #68170, LimitRange updating should be verified before creating new pod.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #68170
**Special notes for your reviewer**:
/cc bsalamat k82cn
/sig scheduling
**Release note**:
```release-note
[e2e] verifying LimitRange update is effective before creating new pod
```
In the e2e test for "apply set/view last-applied",
the failure message is `Missing "replicas": 2 in kubectl
view-last-applied`, even though the `replicas` key is present.
This changes `Missing` to `Presenting`.
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
CSI: skip attach for non-attachable drivers
**What this PR does / why we need it**:
This is the implementation of https://github.com/kubernetes/community/pull/2523. CSI volumes that don't need attach/detach no longer need an external attacher running.
WIP:
* contains #67803 to get CSIDriver API. Ignore the first commit.
* ~~missing e2e test~~
/sig storage
cc: @saad-ali @vladimirvivien @verult @msau42 @gnufied @davidz627
**Release note**:
```release-note
CSI volume plugin does not need external attacher for non-attachable CSI volumes.
```
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Replace git volume with configmap in emptydir wrapper conflict test
**What this PR does / why we need it**: GitRepoVolumeSource is deprecated, use a ConfigMap instead. (This test is part of the conformance suite, so it would be good to allow downstreams to disable/not support gitRepo)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67691, 68147). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Bump versions of components with latest security patches.
**What this PR does / why we need it**:
Upgrade versions of monitoring components used on GCP, to include latest security patches.
**Release note**:
```release-note
[fluentd-gcp-scaler addon] Bump fluentd-gcp-scaler to 0.4 to pick up security fixes.
[prometheus-to-sd addon] Bump prometheus-to-sd to 0.3.1 to pick up security fixes, bug fixes and new features.
[event-exporter addon] Bump event-exporter to 0.2.3 to pick up security fixes.
```
Automatic merge from submit-queue (batch tested with PRs 67709, 67556). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fix volume scheduling issue with pod affinity and anti-affinity
**What this PR does / why we need it**:
The previous design of the volume scheduler had volume assume + bind done before pod assume + bind. This causes issues when trying to evaluate future pods with pod affinity/anti-affinity because the pod has not been assumed while the volumes have been decided.
This PR changes the design so that volume and pod are assumed first, followed by volume and pod binding. Volume binding waits (asynchronously) for the operations to complete or error. This eliminates the subsequent passes through the scheduler to wait for volume binding to complete (although pod events or resyncs may still cause the pod to run through scheduling while binding is still in progress). This design also aligns better with the scheduler framework design, so will make it easier to migrate in the future.
Many changes had to be made in the volume scheduler to handle this new design, mostly around:
* How we cache pending binding operations. Now, any delayed binding PVC that is not fully bound must have a cached binding operation. This also means bind API updates may be repeated.
* Waiting for the bind operation to fully complete, and detecting failure conditions to abort the bind and retry scheduling.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #65131
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue where pod scheduling may fail when using local PVs and pod affinity and anti-affinity without the default StatefulSet OrderedReady pod management policy
```
Automatic merge from submit-queue (batch tested with PRs 66840, 68159). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
TTL for cleaning up Jobs after they finish
**What this PR does / why we need it**: https://github.com/kubernetes/features/issues/592
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64470
For https://github.com/kubernetes/features/issues/592
**Special notes for your reviewer**: @kubernetes/sig-apps-pr-reviews
**Release note**:
```release-note
Add a TTL mechanism to clean up Jobs after they finish.
```
Automatic merge from submit-queue (batch tested with PRs 67736, 68123, 68138). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Update external provisioner test to use latest nfs-provisioner
**What this PR does / why we need it**: latest nfs-provisioner will work with cri-containerd, so let's update it
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**: I want to move this test to use nfs-client-provisioner soon anyway since a lot of our e2e tests already use a containerized nfs server and it would be good to be consistent. So this can be treated as something of a stopgap but it would be nice to have ASAP to unblock https://github.com/kubernetes-incubator/external-storage/issues/432#issuecomment-417511065
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67736, 68123, 68138). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Port security context NodeConformance e2e_node tests to e2e
**What this PR does / why we need it**:
Port all [NodeConformance] SecurityContext e2e_node tests to e2e/common.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #67032
**Special notes for your reviewer**:
- This PR is a continuing effort to close #67032.
- Removed ContainerRuntime constraint [as discussed](https://github.com/kubernetes/kubernetes/pull/67032#discussion_r214201870).
- Porting all [NodeConformance] tests to e2e/common which do not have node dependencies.
- Does it make sense to port [privileged test](https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/security_context_test.go#L558) to e2e/common and remove [NodeFeature:HostAccess] label from test name?
**Release note**:
```release-note
NONE
```
/area conformance
@kubernetes/sig-node-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 63011, 68089, 67944, 68132). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Support both directory and block device for local volume plugin FileSystem VolumeMode
xref: [local storage dynamic provisioning design #1914](https://github.com/kubernetes/community/pull/1914)
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Support both directory and block device for local volume plugin FileSystem VolumeMode
```
A long time ago, we added image prepulling as a workaround for the
overwhelming number of flakes caused by pulling images during the tests.
This functionality has been broken for a while, since we switched to a
COS image where mounting the `docker` binary into `busybox` stopped working.
So this is just dead code that we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Add --server-dry-run flag to `kubectl apply`
- Adds the flag
- Changes the helper so that we can pass options for patch
- Adds a test to make sure it doesn't change the object
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
Add new `--server-dry-run` flag to `kubectl apply` so that the request will be sent to the server with the dry-run flag (alpha), which means that changes won't be persisted.
```
Automatic merge from submit-queue (batch tested with PRs 67864, 68158). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Update echoserver version used to 2.2
Change-Id: Ic1dcb2c64ac682ca601ab2589fd6af70d4e09620
**What this PR does / why we need it**:
In https://github.com/kubernetes/kubernetes/pull/67578 we updated the image. Let's please switch to the new image
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Taint nodes in parallel.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #67823
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 63437, 68081). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Enable ImageLocalityPriority by default with integration tests
**What this PR does / why we need it**:
This PR is a follow-up to [#63842](https://github.com/kubernetes/kubernetes/issues/63842). It moves the ImageLocalityPriority function to default priority functions of the default algorithm provider and adds integration tests for the updated scheduling policy.
- Compared to [#64662](https://github.com/kubernetes/kubernetes/pull/64662), this PR does not provide an e2e test, due to concerns that a large image may add too much overhead to the testing infrastructure and pipeline. We should add e2e tests that use sufficiently large image(s) in follow-up PRs.
- Compared to [#64662](https://github.com/kubernetes/kubernetes/pull/64662), this PR simplifies the code changes and keeps code changes under test/integration/scheduler/.
- The PR contains a bug fix for [#65745](https://github.com/kubernetes/kubernetes/pull/65745) - caught by the integration test - where the image states are not properly cloned to the scheduler's cachedNodeInfoMap. We might split this fix into a separate PR.
The integration test covers the following scenario: a pod requiring a large image (~3GB) is submitted to the cluster, and a single node in the cluster already has that image; the pod should get scheduled to that node. We might also consider whether more scenarios are desired.
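The scenario can be illustrated with a small sketch that prefers the node already holding the most bytes of the requested images. This is an assumption-laden simplification, not the actual ImageLocalityPriority scoring formula; `node`, `localBytes`, and `pickNode` are hypothetical names.

```go
package main

import "fmt"

// node is a hypothetical, simplified view of a node: image name -> size in bytes.
type node struct {
	Name   string
	Images map[string]int64
}

// localBytes sums the sizes of the requested images already present on the node.
func localBytes(n node, wanted []string) int64 {
	var total int64
	for _, img := range wanted {
		total += n.Images[img] // a missing image contributes 0
	}
	return total
}

// pickNode returns the name of the node holding the most requested image bytes.
// Ties keep the earliest node in the slice; an empty slice yields "".
func pickNode(nodes []node, wanted []string) string {
	best, bestBytes := "", int64(-1)
	for _, n := range nodes {
		if b := localBytes(n, wanted); b > bestBytes {
			best, bestBytes = n.Name, b
		}
	}
	return best
}

func main() {
	const gb = int64(1) << 30
	nodes := []node{
		{Name: "node-a", Images: map[string]int64{"small:1": 10 << 20}},
		{Name: "node-b", Images: map[string]int64{"large:1": 3 * gb}},
	}
	// Only node-b has the large image, so it is preferred.
	fmt.Println(pickNode(nodes, []string{"large:1"}))
}
```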
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
Kindly ping @resouer and @bsalamat
**Release note**:
```release-note
None
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
cloud-ctrl-mgr: enable secure port 10258
This PR enables authn+authz (delegated to the kube-apiserver) and the secure port 10258 for the cloud-controller-manager. In addition, the insecure port is disabled.
This is the counterpart PR to https://github.com/kubernetes/kubernetes/pull/64149.
Moreover, it adds integration test coverage for the `--port` and `--secure-port` flags, plus the testserver infrastructure to test flags in general inside integration tests.
```release-note
Enable secure serving on port 10258 to cloud-controller-manager (configurable via `--secure-port`). Delegated authentication and authorization have to be configured like for aggregated API servers.
```
Automatic merge from submit-queue (batch tested with PRs 67578, 68154, 68162, 65545). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fixes #67561: multiple identical headers produced a wrong result on gcr.io/google-containers/echoserver:1.10
**What this PR does / why we need it**:
Fix a bug in echoserver
**Which issue(s) this PR fixes**:
Fixes #67561
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67571, 67284, 66835, 68096, 68152). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
apiserver returns continue together with the 410 error
Implements https://github.com/kubernetes/kubernetes/issues/66981#issuecomment-410845134.
Closes #66981.
/sig api-machinery
/assign @lavalamp @liggitt @smarterclayton
```release-note
Upon receiving a LIST request with expired continue token, the apiserver now returns a continue token together with the 410 "the from parameter is too old" error. If the client does not care about getting a list from a consistent snapshot, the client can use this token to continue listing from the next key, but the returned chunk will be from the latest snapshot.
```
Automatic merge from submit-queue (batch tested with PRs 65251, 67255, 67224, 67297, 68105). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Sync peer-finder code from contrib repo
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/contrib/issues/2643
**Special notes for your reviewer**:
This is just a code sync-up PR from https://github.com/kubernetes/contrib/pull/2644
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
CSI Cluster Registry and Node Info CRDs
**What this PR does / why we need it**:
Introduces the new `CSIDriver` and `CSINodeInfo` API Object as proposed in https://github.com/kubernetes/community/pull/2514 and https://github.com/kubernetes/community/pull/2034
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/features/issues/594
**Special notes for your reviewer**:
Per the discussion in https://groups.google.com/d/msg/kubernetes-sig-storage-wg-csi/x5CchIP9qiI/D_TyOrn2CwAJ the API is being added to the staging directory of the `kubernetes/kubernetes` repo because the consumers will be attach/detach controller and possibly kubelet, but it will be installed as a CRD (because we want to move in the direction where the API server is Kubernetes agnostic, and all Kubernetes specific types are installed).
**Release note**:
```release-note
Introduce CSI Cluster Registration mechanism to ease CSI plugin discovery and allow CSI drivers to customize Kubernetes' interaction with them.
```
CC @jsafrane
Automatic merge from submit-queue (batch tested with PRs 68051, 68130, 67211, 68065, 68117). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Update `kubectl get` sorter to deal with server-side printing
**Release note**:
```release-note
NONE
```
### Why?
Currently, we default to non-server-side printing when sorting items in `kubectl get`. This means that instead of taking advantage of having the server tell `kubectl` how to display information, `kubectl` falls back to using hardcoded resource types to figure out how to print its output. This does not really work with resources that `kubectl` does not know about, and it goes against our goal of snipping any dependencies that `kubectl` has on the core repo.
This patch adds a sorter capable of dealing with Table objects sent by the server when using "server-side printing".
A few things left to take care of:
- ~~[ ] When printing `all` resources, this implementation does not handle sorting every single Table object, but rather _only_ the rows in each object. As a result, output will contain sorted resources of the same _kind_, but the overall list of mixed resources will _not_ itself be sorted. Example:~~
```bash
$ kubectl get all --sort-by .metadata.name
NAME READY STATUS RESTARTS AGE
# pods here will be sorted:
pod/bar 0/2 Pending 0 31m
pod/foo 1/1 Running 0 37m
NAME DESIRED CURRENT READY AGE
# replication controllers here will be sorted as well:
replicationcontroller/baz 1 1 1 37m
replicationcontroller/buz 1 1 1 37m
# ... but the overall mixed list of rc's and pods will not be sorted
```
This occurs because each Table object received from the server contains all rows for that resource _kind_. We would need a way to build an ambiguous Table object containing all rows for all objects regardless of their type to have a fully sorted mixed-object output.
- [ ] handle sorting by column-names, rather than _only_ with jsonpaths (Tracked in https://github.com/kubernetes/kubernetes/issues/68027)
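The per-Table behaviour described above can be sketched as follows: each table's rows are sorted on their own, so mixed output is sorted within each kind but not across kinds. The `row` and `table` types here are simplified, hypothetical stand-ins for the server-side Table object, and `sortRows` is not the actual `kubectl` sorter.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a simplified stand-in for one Table row: the printable cells plus
// a flattened view of the object the row was built from.
type row struct {
	Cells  []string
	Object map[string]string // e.g. "metadata.name" -> "foo"
}

// table is a simplified stand-in for a Table returned for one resource kind.
type table struct {
	Kind string
	Rows []row
}

// sortRows sorts the rows of a single table by the given object field.
// The sort is stable, so rows with equal keys keep their server order.
func sortRows(t *table, field string) {
	sort.SliceStable(t.Rows, func(i, j int) bool {
		return t.Rows[i].Object[field] < t.Rows[j].Object[field]
	})
}

func main() {
	tables := []table{
		{Kind: "pod", Rows: []row{
			{Cells: []string{"pod/foo"}, Object: map[string]string{"metadata.name": "foo"}},
			{Cells: []string{"pod/bar"}, Object: map[string]string{"metadata.name": "bar"}},
		}},
		{Kind: "replicationcontroller", Rows: []row{
			{Cells: []string{"replicationcontroller/buz"}, Object: map[string]string{"metadata.name": "buz"}},
			{Cells: []string{"replicationcontroller/baz"}, Object: map[string]string{"metadata.name": "baz"}},
		}},
	}

	// Each table is sorted independently; the tables themselves keep their order.
	for i := range tables {
		sortRows(&tables[i], "metadata.name")
		for _, r := range tables[i].Rows {
			fmt.Println(r.Cells[0])
		}
	}
}
```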
cc @soltysh @kubernetes/sig-cli-maintainers @seans3 @mengqiy
Automatic merge from submit-queue (batch tested with PRs 68051, 68130, 67211, 68065, 68117). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Wait for Scheduler cache empty.
Signed-off-by: Klaus Ma <klaus1982.cn@gmail.com>
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #68126
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 67756, 64149, 68076, 68131, 68120). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
kube-ctrl-mgr: enable secure port 10257
This PR enables authn+authz (delegated to the kube-apiserver) and the secure port 10257 for the kube-controller-manager. In addition, the insecure port is disabled.
Moreover, it adds integration test coverage for the `--port` and `--secure-port` flags, plus the testserver infrastructure to test flags in general inside integration tests.
```release-note
Enable secure serving on port 10257 to kube-controller-manager (configurable via `--secure-port`). Delegated authentication and authorization have to be configured like for aggregated API servers.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fix failing hostpath subpath reconstruction tests
**What this PR does / why we need it**:
Fix the hostpath subpath reconstruction tests, which are failing.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes: #68093
**Special notes for your reviewer**:
/sig storage
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fix multizone gce pd subpath test
**What this PR does / why we need it**:
The format pod for readonly tests also needs to fill in the NodeSelector for inline gce pd volumes.
Also rename "gce" driver to "gcepd"
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #68085
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67368, 59930, 68074). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Fix subpath tests not to fail in namespace deletion
**What this PR does / why we need it**:
This PR fixes the subpath tests below so that they do not fail during namespace deletion:
- subPath should support restarting containers using directory as subpath
- subPath should support restarting containers using file as subpath
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes: #68073
**Special notes for your reviewer**:
/sig storage
/sig testing
**Release note**:
```release-note
NONE
```
This is the old behaviour, and we did not intend to change it when enabling authn/z in general.
As in the kube-apiserver, this sets the "system:unsecured" user info.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Move kubelet internal ComponentConfig types to `pkg/kubelet/apis/config`
**What this PR does / why we need it**:
This PR is split out from the main PR of https://github.com/kubernetes/kubernetes/pull/67263, in order to make merging each scoped piece of the puzzle easier and smoother.
This PR simply moves `k8s.io/kubernetes/pkg/kubelet/apis/kubeletconfig` as-is to `k8s.io/kubernetes/pkg/kubelet/apis/config`, as agreed in the KEP.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref: kubernetes/community#2354
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
@kubernetes/sig-node-pr-reviews
/assign @mtaufen @thockin @liggitt
Automatic merge from submit-queue (batch tested with PRs 67764, 68034, 67836). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Remove feature tag from dynamic provisioning topology tests
**What this PR does / why we need it**:
Now that the feature has been moved to beta, remove feature tag to let it run in the standard CI suite.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Add etcd DB size monitoring in density test
/cc @wojtek-t
fyi - @jpbetz @gyuho @kubernetes/sig-scalability-misc
```release-note
NONE
```
This commit fixes the subpath tests below so that they do not fail during namespace deletion:
- subPath should support restarting containers using directory as subpath
- subPath should support restarting containers using file as subpath
Fixes: #68073