ingress: use new serviceBackend split
ingress: remove all v1beta1 restrictions on creation
This change removes the creation and update restrictions enforced in
k8s 1.18 that disallowed resource backends.
Paths are no longer
required to be valid regexes, and PathType is now user-specified
and no longer defaulted.
Also remove all TODOs in staging/net/v1 types
Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
Lowering the amount of CPU allocated to this workload makes its
allocated resources similar to the other npb and tf workloads in
these tests.
This will also allow running all three workloads on an n1-standard-16 GCP
instance, which has 16 CPUs and 60 GB of memory.
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
For the beta server-side dry-run feature, `kubectl apply` provided the
`--server-dry-run` flag.
As of 1.18, this flag was deprecated and marked for removal after one
release.
- Remove the ServerDryRun field and delegate it entirely to the resource.Helper
- Use resource.Helper for deletions (as in `kubectl apply --force`)
instead of using the pruner's method that uses a dynamic client
- Reduce the number of resource.Helpers and the number of places where
apply checks for server-side dry-run (see the sketch below)
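A minimal sketch of the resulting pattern, assuming the `Helper.DryRun`
chaining method that this delegation implies; the wiring is illustrative,
not the exact apply code:

```go
package apply

import (
	"k8s.io/cli-runtime/pkg/resource"
)

// helperFor builds the one helper apply needs for an object. The dry-run
// decision lives in the Helper itself, instead of a separate ServerDryRun
// field that every caller has to consult.
func helperFor(info *resource.Info, serverDryRun bool) *resource.Helper {
	return resource.NewHelper(info.Client, info.Mapping).
		DryRun(serverDryRun)
}
```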
the test "executing a command with run and attach without stdin"
is inherently flaky, there are several discussion but seems that
it requires changing the way the kubectl run and attach works.
The test fails if we are not able to attach before the container prints
"stdin closed", but hasn't exited yet.
Because the race seems difficult to solve, we can wait 5 seconds
before printing to give time to kubectl to attach to the container.
- use the in-cache Pod, instead of the real-time Pod (fetched by calling the API server), to mark it as unschedulable
in the internal schedulingQ
- remove the backoff logic, as we no longer call the API server
- change the whole logic to a synchronous call
We have little coverage around node addition and removal. Since distinct event handlers interact, it is important to cover this in integration tests.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
ensure that when a pod servicing UDP traffic is deleted, the conntrack entries
are cleaned up and another backend can pick up the traffic with minimal
interruption
With NodePort services and long-running connections, stale conntrack entries
left behind on pod deletion can halt the flow of traffic. Add a test case to
check that conntrack entries are cleaned up.
In case two or more controllers share the informers created through InitTestScheduler,
it's not safe to start the informers until all controllers have set their informer
indexers. Otherwise, some controllers might fail to register their indexers
in time. Thus, it's the responsibility of each consumer to make sure all informers
are started only after all controllers have had time to get initialized.
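A sketch of the required ordering with client-go (the index name and
function are illustrative, not taken from the scheduler tests):

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

func setup(client kubernetes.Interface, stopCh <-chan struct{}) error {
	factory := informers.NewSharedInformerFactory(client, 0)
	podInformer := factory.Core().V1().Pods().Informer()

	// Every controller registers its indexers first; AddIndexers returns
	// an error once the shared informer has already started.
	if err := podInformer.AddIndexers(cache.Indexers{
		"byNode": func(obj interface{}) ([]string, error) {
			return []string{obj.(*v1.Pod).Spec.NodeName}, nil
		},
	}); err != nil {
		return err
	}

	// Only after all controllers are initialized does the consumer start
	// the informers and wait for the caches to sync.
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	return nil
}
```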
ginkgo has a weird bug: AfterEach does not get called when the
test suite exits with a certain kind of interrupt (Ctrl-C, for example).
More info - https://github.com/onsi/ginkgo/issues/222
We work around this issue in Kubernetes by adding a special hook into
the AfterSuite call, but AfterSuite cannot be used to perform certain
kinds of cleanup because it can race with the AfterEach hook, and the
framework.AfterEach hook will set framework.ClientSet to nil.
This presents a problem for cleaning up the CSI driver and test pods. This
PR removes cleanup of the driver manifest via CleanupAction, because that
is unsafe and racy (f.ClientSet may disappear!), and makes
AfterSuite hooks run in an ordered fashion.
We should read and verify the data before actually closing the
connection, to avoid connection-based races within the test.
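The pattern, sketched (the helper name is made up; the real test code
differs):

```go
package conntest

import (
	"io"
	"net"
	"testing"
)

// verifyThenClose reads and checks the expected payload before closing,
// so the verification cannot race with the connection teardown.
func verifyThenClose(t *testing.T, conn net.Conn, expected string) {
	buf := make([]byte, len(expected))
	if _, err := io.ReadFull(conn, buf); err != nil {
		t.Fatalf("read: %v", err)
	}
	if got := string(buf); got != expected {
		t.Fatalf("got %q, want %q", got, expected)
	}
	conn.Close()
}
```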
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Removes the fatal error from getIP and moves it into the
retry loop so that the application will not immediately
crash on failure.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
This updates the error messages when registering a
node to be more explicit about what error occurred
and how long it will wait to retry.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Currently the guestbook application will fail if it is unable
to resolve the TCP address on the first attempt. If pod networking
is not set up when the application starts, then it will be
unable to resolve, leading to frequent failures. This moves
the address resolution into the retry block so it will try
again if unsuccessful on the first attempt.
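A sketch of the resulting shape, using only the standard library (names
are illustrative, not the actual guestbook code):

```go
package guestbook

import (
	"log"
	"net"
	"time"
)

// connectWithRetry resolves *inside* the loop: pod networking may not be
// ready on the first attempt, so resolution failures are retried too.
func connectWithRetry(addr string, attempts int) (net.Conn, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		tcpAddr, err := net.ResolveTCPAddr("tcp", addr)
		if err == nil {
			conn, derr := net.DialTCP("tcp", nil, tcpAddr)
			if derr == nil {
				return conn, nil
			}
			lastErr = derr
		} else {
			lastErr = err
		}
		log.Printf("attempt %d failed: %v; retrying in 1s", i+1, lastErr)
		time.Sleep(time.Second)
	}
	return nil, lastErr
}
```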
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
The current /exit method is not sufficient to test graceful shutdown
behaviors within Kube that allow services to remain available during
rolling restarts. Add support for `wait=DURATION` and
`timeout=DURATION` to the exit handler and wire that to the Go http
server's graceful termination.
With these options, netexec can be used in a pod to simulate graceful
shutdown by adding a preStop handler that hits the exit endpoint with
a timeout and wait period.
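A hedged sketch of such a handler; the `wait`/`timeout` parameter names
come from this change, the rest of the wiring is illustrative:

```go
package main

import (
	"context"
	"net/http"
	"os"
	"time"
)

func main() {
	mux := http.NewServeMux()
	srv := &http.Server{Addr: ":8080", Handler: mux}

	mux.HandleFunc("/exit", func(w http.ResponseWriter, r *http.Request) {
		wait, _ := time.ParseDuration(r.FormValue("wait"))       // keep serving this long
		timeout, _ := time.ParseDuration(r.FormValue("timeout")) // then drain for up to this long
		go func() {
			time.Sleep(wait)
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			// Go's graceful termination: stop accepting new connections
			// and let in-flight requests finish before exiting.
			srv.Shutdown(ctx)
			os.Exit(0)
		}()
	})

	srv.ListenAndServe()
}
```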
kubelet sometimes calls NodeStageVolume and NodePublishVolume too
often, which breaks this test and leads to flakiness. The test isn't
about that, so we can relax the checking and it still covers what it
was meant to cover.
collectPodsAndNetworkPolicies() is called to collect diagnostics
after a failure. Previously, if it encountered a failure in getting
the logs it would call Failf(), discarding the rest of the diagnostics
immediately.
Following changes in #87730, Kubelet calls hcsshim directly to gather stats.
However, unlike the `docker stats` API that was used before, hcsshim does not
keep information about exited containers.
When the Kubelet lists containers (`docker_container.go:ListContainers()`),
it sets `All: true`, retrieving non-running containers.
When docker stats is called with such a container ID, it returns valid JSON
with all values set to 0. The non-running containers are filtered out later in the process.
When hcsshim is called with such a container ID, it returns an error, effectively
stopping the stats retrieval for all containers.
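A plausible shape of the fix, heavily hedged: the stats callback below
stands in for the hcsshim call, and the real kubelet code is structured
differently. The idea is to filter out non-running containers before
requesting stats, so a single lookup error cannot abort the whole listing.

```go
package winstats

import (
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1alpha2"
)

// collectRunningStats skips exited containers up front; hcsshim keeps no
// information about them and would return an error for their IDs.
func collectRunningStats(containers []*runtimeapi.Container,
	statsFor func(id string) (*runtimeapi.ContainerStats, error)) []*runtimeapi.ContainerStats {
	var out []*runtimeapi.ContainerStats
	for _, c := range containers {
		if c.State != runtimeapi.ContainerState_CONTAINER_RUNNING {
			continue
		}
		if s, err := statsFor(c.Id); err == nil {
			out = append(out, s)
		}
	}
	return out
}
```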
"Volumes GlusterFS should be mountable" is a bit flaky in a downstream CI.
This PR make "should be mountable" test on par with the other GlusterFS
tests (in_tree.go: DeleteVolume())
commit 43c56eb403 introduced a change
where CPUAccounting, MemoryAccounting and TasksAccounting are enabled for
the systemd service.
It causes a regression on RHEL 7.8, where systemd-run doesn't allow
setting TasksAccounting.
Since Delegate= already enables all the controllers, it is superfluous
to specify them.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
In caf0d1d61874a2c8687b7deb773eca30ddaee5b6 we documented a policy to
ensure that conformance tests do not rely on the existence or use of
kubelet APIs directly. Based on that, we should drop conformance for
the two tests here that use the "/logs" endpoint directly.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The test "should not change the subpath mount on a container restart if the environment variable changes"
creates a pod with the liveness probe: cat /volume_mount/test.log. The test then
deletes that file, which causes the probe to fail and the container to be restarted.
After which it recreates the file by exec-ing into the pod, but there is a chance
that the container was not created yet, or it did not start yet.
This commit adds a few retries to the exec command.
this is mainly to ensure integration tests (which all end in _test)
are properly bossed around for their imports
I had to adjust some of the _test files to adhere to existing
reverse_rules specified elsewhere
specifically:
- cmd/kubeadm/.import-restrictions
  - we don't need to explicitly allow k8s.io repos (external or published)
- rm pkg/controller/.import-restrictions
  - pkg/client/unversioned was removed in 59042
- pkg/kubectl/.import-restrictions
  - pkg/printers is no longer used
  - pkg/api was masking all of the pkg/apis prefixes
- rm staging/src/k8s.io/code-generator/cmd/lister-gen/.import-restrictions
  - noop / empty file
- test/e2e/framework/.import-restrictions
  - we don't need to explicitly allow k8s.io repos (external or published)
YAML has comments, so we can explain why we have certain rules or
certain prefixes.
For those files that weren't already commented YAML, I converted them to
YAML and took a best guess at comments, based on the PRs that introduced
or updated them.
When a test pattern or storage class uses late binding, the cleanup
code didn't know about the PV that may have been created for the PVC
since setting it up, and thus also didn't wait for PV deletion.
This is problematic for test isolation, because the next test was
allowed to start before the previous one had fully cleaned up. Worse,
if the driver gets removed after the test, the volume might never get deleted.
'docker pull' is a time-consuming operation. It makes sense to check
whether the image exists locally before pulling it from a registry.
Check whether the image exists by running 'docker inspect', and only
pull if it doesn't.
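The equivalent logic, sketched in Go for consistency with the other
examples here (the actual change is in shell); `docker inspect` exits
non-zero when the image is absent:

```go
package images

import "os/exec"

// pullIfMissing hits the registry only when the image is not already
// present in the local daemon. Illustrative helper, not the real script.
func pullIfMissing(image string) error {
	if err := exec.Command("docker", "inspect", "--type=image", image).Run(); err == nil {
		return nil // image exists locally; skip the expensive pull
	}
	return exec.Command("docker", "pull", image).Run()
}
```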
The service allocator is used to allocate IP addresses for the
Service IP allocator and node ports for the Service NodePort
allocator. It uses a bitmap backed by etcd to store the allocations,
and tries to allocate the resources directly from local memory
instead of from etcd, which can cause issues in environments with
high concurrency.
In deployments with multiple apiservers, it may happen that the
resource allocation information is out of sync; this is more
noticeable with NodePorts. For example:
1. apiserver A creates a service with NodePort X
2. apiserver B deletes the service
3. apiserver A creates the service again
If the allocation data of apiserver A wasn't refreshed with the
deletion performed by apiserver B, apiserver A fails the allocation
because its data is out of sync. The Repair loops solve the problem
later, but there are some use cases that require improving the
concurrency of the allocation logic.
Instead of doing the Allocate and Release operations purely locally,
we can check whether the local data is up to date with etcd,
and operate over the most recent version of the data.
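Sketched as a compare-and-swap loop; the bitmapStore interface below is
made up to illustrate the idea, the real allocator lives in the service
registry:

```go
package allocator

import "errors"

// errConflict signals that another apiserver updated the bitmap first.
var errConflict = errors.New("stale snapshot")

// bitmapStore is a hypothetical etcd-backed store: Get returns the latest
// allocation bitmap with its revision; CAS writes only if that revision
// is still current.
type bitmapStore interface {
	Get() (bitmap []byte, rev int64, err error)
	CAS(rev int64, bitmap []byte) error
}

// allocate always operates on the most recent etcd state instead of a
// possibly stale local copy, retrying when another apiserver races us.
func allocate(s bitmapStore, offset int) error {
	for {
		bm, rev, err := s.Get()
		if err != nil {
			return err
		}
		bm[offset/8] |= 1 << (offset % 8) // mark the port/IP as taken
		err = s.CAS(rev, bm)
		if errors.Is(err, errConflict) {
			continue // out of sync; redo the allocation on fresh data
		}
		return err
	}
}
```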
- add ./hack/tools/go.mod; this makes ./hack/tools a distinct module
- hack/tools/tools.go underscore-imports bazel-related tools (see the
sketch below); over time we can add others
- hack/*.sh scripts will cd to hack/tools and go install tools from there
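The tools.go pattern looks like this (the exact import list is
illustrative):

```go
// +build tools

// Package tools records build-time tool dependencies. The blank imports
// let the hack/tools go.mod track tool versions without linking the
// tools into any binary; the "tools" build tag keeps this file out of
// normal builds.
package tools

import (
	_ "github.com/bazelbuild/bazel-gazelle/cmd/gazelle"
	_ "github.com/bazelbuild/buildtools/buildifier"
)
```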
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
During a parallel test run these tests were observed to cause a
preemption of another test:
```
Apr 19 16:59:06.749: INFO: At 2020-04-19 16:58:52 +0000 UTC - event for pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1: {default-scheduler } Scheduled: Successfully assigned e2e-init-container-7691/pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1 to ip-10-0-148-234.us-west-2.compute.internal
Apr 19 16:59:06.750: INFO: At 2020-04-19 16:58:54 +0000 UTC - event for pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1: {default-scheduler } Preempted: Preempted by e2e-resourcequota-priorityclass-8850/testpod-pclass9 on node ip-10-0-148-234.us-west-2.compute.internal
```
These tests have no need to actually land on a node to validate resource quota and so we can set an impossible scheduling condition. Hopefully we don't have tests that too broadly check impossible scheduling conditions.
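For example, an unsatisfiable node selector keeps such a pod Pending
forever without triggering preemption (the label key below is made up):

```go
package quota

import v1 "k8s.io/api/core/v1"

// makeUnschedulable pins the pod to a label no node carries. Resource
// quota is enforced at admission time, so the pod never needs to land
// on a node; it simply stays Pending.
func makeUnschedulable(pod *v1.Pod) {
	pod.Spec.NodeSelector = map[string]string{
		"x-test.k8s.io/unsatisfiable": "true",
	}
}
```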
The admission cache may take longer to see both ingress classes
than it takes to create the ingress. We must loop until we see
the appropriate error, cleaning up after ourselves as we go.
Don't set a connection deadline for reading, because the read operation will
fail if no data is received before the deadline, and the connection will not
be kept in the CLOSE_WAIT state.
Copy the csi-hostpath driver manifests from
kubernetes-csi/csi-driver-host-path. This bumps the version of all images
to the release shipped alongside Kubernetes 1.18.
As seen in one case (https://github.com/intel/pmem-csi/issues/587), a
pod can reach the "not running" state although its ephemeral volumes
are still being torn down by kubelet and the CSI driver. What happens
then is that the test returns too early, and even deleting the
namespace, and thus the pod, succeeds before the NodeUnpublishVolume
really finishes.
To avoid this, StopPod now waits for the pod to really disappear.
The agnhost image used for testing has a `netexec` path which supports
two new flags, `--tls-cert-file` and `--tls-private-key-file`. If the
former is provided, the HTTP server will be upgraded to HTTPS, using the
certificate (and private key) provided.
By default, there are keys already mounted into the container at
`/localhost.crt` and `/localhost.key`, which contain PEM-encoded TLS
certs with IP SANs for `127.0.0.1` and `[::1]`.
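The conditional upgrade can be pictured like this (the flag names are
from this change; the server wiring is illustrative):

```go
package main

import (
	"flag"
	"log"
	"net/http"
)

var (
	tlsCertFile       = flag.String("tls-cert-file", "", "serve HTTPS with this certificate")
	tlsPrivateKeyFile = flag.String("tls-private-key-file", "", "private key for --tls-cert-file")
)

func main() {
	flag.Parse()
	srv := &http.Server{Addr: ":8080"}
	if *tlsCertFile != "" {
		// e.g. the baked-in /localhost.crt and /localhost.key defaults.
		log.Fatal(srv.ListenAndServeTLS(*tlsCertFile, *tlsPrivateKeyFile))
	}
	log.Fatal(srv.ListenAndServe())
}
```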
This adds two new tests covering EndpointSlices, including new coverage of
the self-referential Endpoints and EndpointSlices that need to be
created by the API Server, and the lifecycle of EndpointSlices from
creation to deletion. This also removes the [feature] indicator from the
name, to ensure that this test will run more often now that the feature
is enabled by default.
Adds reviewers to the OWNERS files in the kubernetes/test/images folder.
The reviewers are added automatically, based on their contributions on
an image (>= 20% code churn).
Note that the code churn is taken into account for authors, and not committers.
Adds OWNERS files for: cuda-vector-add, nonewprivs, pets, redis, volume.
Adds reviewers to the OWNERS files in the kubernetes/test/images folder.
The reviewers are added automatically, based on their contributions on
an image (>= 20% code churn).
Note that the code churn is taken into account for authors, and not committers.
Adds OWNERS files for: apparmor-loader, echoserver, jessie-dnsutils, metadata-concealment,
sample-apiserver.
The build times are a bit high for the image builder (~50 minutes), and they will grow a bit more
when Windows support is added to the other test images. This commit changes the
machineType to N1_HIGHCPU_8.
Reenables Windows test image building. Added DOCKER_CERT_BASE_PATH (default value: $HOME),
which will contain the path where the certificates needed for Remote Docker Connection can
be found.
If a REMOTE_DOCKER_URL was not set for a particular OS version, exclude that image from the
manifest list. This fixes an issue where, if REMOTE_DOCKER_URL was not set for Windows Server 1909,
the Windows images were completely excluded from the manifest list, including those for Windows
Server 1809 and 1903, which could have been built and pushed.
Sets "test-webserver" as the default CMD for kitten and nautilus. Since they are now based on
agnhost, they should be set to run test-webserver to maintain previous behaviour.
Bumps the agnhost version to 2.13, as 2.12 has already been promoted. 2.13 will contain
Windows support.
Adds Windows support for the kitten and nautilus images, so they can be promoted together
with agnhost (they were not previously promoted).
Adds OWNERS files to: agnhost, busybox, kitten, nautilus.
The timeouts for the two loops inside the test are now bounded
by an upper limit on the duration of the entire test, instead of
each having its own, rather arbitrary timeout.
The functionality included in the e2e/manifests is useful for writing
e2e tests and will be a good addition to the test framework as a
sub-package.
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
Before https://github.com/kubernetes/kubernetes/pull/83084, `kubectl
apply --prune` could prune resources in all namespaces specified in
config files. After that PR was merged, only a single namespace is
considered for pruning. That is fine when the namespace is explicitly
specified with the --namespace option, but the PR falls back to the
default namespace (or the one from kubeconfig) when no command-line
flag overrides it.
That breaks the existing usage of `kubectl apply --prune` without the
--namespace option. If --namespace is not used, there is no error,
and no one notices this issue unless they actually check that pruning
happens. This issue also prevents resources in multiple namespaces in
a config file from being pruned.
kubectl 1.16 does not have this bug. Let's see the difference between
kubectl 1.16 and kubectl 1.17. Suppose the following config file:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: foo
  namespace: a
  labels:
    pl: foo
data:
  foo: bar
---
apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: bar
  namespace: a
  labels:
    pl: foo
data:
  foo: bar
```
Apply it with `kubectl apply -f file`. Then comment out ConfigMap foo
in this file. kubectl 1.16 prunes ConfigMap foo with the following
command:
$ kubectl-1.16 apply -f file -l pl=foo --prune
configmap/bar configured
configmap/foo pruned
But kubectl 1.17 does not prune ConfigMap foo with the same command:
$ kubectl-1.17 apply -f file -l pl=foo --prune
configmap/bar configured
With this patch, kubectl once again can prune the resource as before.
/cluster/kubeadm.sh is used to find the kubeadm binary.
This file is legacy and is removed.
Remove /test/cmd/kubeadm.sh. This file contains a function that is used
to build kubeadm and invoke "make test". Move the function contents
to hack/make-rules/test-cmd.sh.
Stop sourcing /test/cmd/kubeadm.sh in /test/cmd/legacy-script.sh.
Also remove the --kubeadm-path invocation, as this can be handled
with an env variable directly.