Previously the code used to delete pods serially.
In this patch we factor out code to do that in parallel,
using goroutines.
This shaves some time in the e2e tm test run with no intended
changes in behaviour.
Signed-off-by: Francesco Romani <fromani@redhat.com>
The image "e2eteam/powershell-helper:6.2.7-linux-cache" is a Linux image. Because we're running "docker buildx build --platform windows/amd64", docker buildx will consider it as a Windows image unless we explicitly specify otherwise. If the image's platform is not correctly identified, we can run into problems when trying to build the image.
We are already doing something similar with the windows-servercore-cache image.
These tests were not performing evictions because:
- Test pods were not on the test node (fixed with spec.nodeName)
- The test pods are unmanaged and require the --force flag to be evicted
(added the flag)
- The test node does not run pods, which prevents pod deletion from
finalizing (worked around with --skip-wait-for-delete-timeout)
We can cache the powershell-helper image's results into a scratch Linux image using
docker buildx. This will allow us to spend less time pulling the data we need from the
powershell-helper image when we need it.
Additionally, docker buildx might have some issues with cross-registry images, so this
will allow us to circumvent it.
While adding annotations to the namespace, using the Update API may result in
conflicts as "the object has been modified; please apply your changes to the
latest version and try again". Use Patch API to avoid this.
Signed-off-by: Chaitanya Bandi <kbandi@cs.stonybrook.edu>
While labeling the namespace using the Update API may result in conflicts as
"the object has been modified; please apply your changes to the latest version
and try again". Use Patch API to avoid this.
Signed-off-by: Chaitanya Bandi <kbandi@cs.stonybrook.edu>
The Topology Manager e2e tests wants to run on real multi-NUMA system
and want to consume real devices supported by device plugins; SRIOV
devices happen to be the most commonly available of such devices.
The tests need to wait for resource availability before to actually
run the tests, or they will fail with a false negative, also relatively
hard to debug.
An optimization was added in commit 56106439cf to minimize the restarts,
speed up the execution and make a nasty, yet not fully understood, flake
with SRIOV device plugin much less likely.
Unfortunately the pod-scope tests were mistakenly left over.
This Patch fixes that.
CI lanes did NOT fail (and will not fail) because the CI machines aren't
multi NUMA nor expose SRIOV devices, so the relevant portion of the test
will just skip, avoiding the issue.
However, this resurfaces when running the testsuite on bare metal; this
is how we noticed.
Signed-off-by: Francesco Romani <fromani@redhat.com>