The existing termination period of 600 seconds for pods on the
e2e test causes that those pods are kept running after the
test has finished. 100 seconds is a good compromise to avoid
leaving pods lingering and more than enought for the test to finish.
Change-Id: I993162a77125345df1829044dc2514e03b13a407
add a new functionality to the agnhost image to run a server that
closes the connections received by sending a RST.
If a TCP servers closes the connection before all the socket is read,
it sends a RST. This implementations just reads only one byte from the
connection and closes it after that, that means that in order for this
to work the client has to send at least 2 bytes of data.
Using a simple curl is enough to trigger a RST:
curl http://127.0.0.1:8080
curl: (56) Recv failure: Connection reset by peer
Change-Id: I238fba0f790f2c92b37c732f51910a8b125f65db
image_list.go is one of the files included in the non-test variant Go build list, but its getSampleDevicePluginPod function references readDaemonSetV1OrDie function defined in device_plugin_test.go which is included in the test variant Go build list only. (The file name is *_test.go).
As a result, "go build" fails with the undefined reference error.
In practice, that may not be an issue since k8s project contributors aren't meant to run go build on this package. However, tools that depend on go build to operate - e.g., gopls or govulncheck ./... - will report this as an error.
Fix this error and make test/e2e package pass go build by moving this file to also test-only source code.
Sometimes, the kube-proxy endpoints take longer than 10 seconds to
be ready, so the initial connection check fails (via `gomega.Eventually`).
This patch uses a separate longer timeout for `gomega.Eventually`.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
* Fix flaky HPA e2e tests by not failing on context cancelled
Consume requests are sent during test execution in a loop in a separate goroutine. Once the test completes, it is expected that a consumption request may be pending. Cancelling the request during cleanup should not cause test failures.
Tests started being flaky since #112923 introduced passing test context that gets cancelled during cleanup.
* Use PollUntilContextTimeout and restructure error ignoring logic
Additional test cases added:
Keeps device plugin assignments across pod and kubelet restarts (no device plugin re-registration)
Keeps device plugin assignments after the device plugin has re-registered (no kubelet or pod restart)
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Add a test suit to simulate node reboot (achieved by removing pods
using CRI API before kubelet is restarted).
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This test captures that scenario where after kubelet restart,
application pod comes up and the device plugin pod hasn't re-registered
itself, the pod fails with admission error. It is worth noting that
once the device plugin pod has registered itself, another
application pod requesting devices ends up running
successfully.
For the test case where kubelet is restarted and device plugin
has re-registered without involving pod restart, since the
pod after kubelet restart ends up with admission error,
we cannot be certain the device that the second pod (pod2) would
get. As long as, it gets a device we consider the test to pass.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Breakdown of the steps implemented as part of this e2e test is as follows:
1. Create a file `registration` at path `/var/lib/kubelet/device-plugins/sample/`
2. Create sample device plugin with an environment variable with
`REGISTER_CONTROL_FILE=/var/lib/kubelet/device-plugins/sample/registration` that
waits for a client to delete the control file.
3. Trigger plugin registeration by deleting the abovementioned directory.
4. Create a test pod requesting devices exposed by the device plugin.
5. Stop kubelet.
6. Remove pods using CRI to ensure new pods are created after kubelet restart.
7. Restart kubelet.
8. Wait for the sample device plugin pod to be running. In this case,
the registration is not triggered.
9. Ensure that resource capacity/allocatable exported by the device plugin is zero.
10. The test pod should fail with `UnexpectedAdmissionError`
11. Delete the test pod.
12. Delete the sample device plugin pod.
13. Remove `/var/lib/kubelet/device-plugins/sample/` and its content, the directory
created to control registration
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>