- Test was failing because `sleep infinity` inside the busybox container
was sending the container into a crash loop. `sleep infinity` isn't
supported by busybox's version of sleep, so replace it with a
`while true; sleep` loop (see the sketch after this list).
- Replace the gdbus invocation used to emit the dbus message with dbus-send.
The test was failing on Ubuntu, which doesn't have gdbus installed;
dbus-send is installed on both COS and Ubuntu, so use it instead.
- Replace the check of the pod phase with the test util function `PodRunningReady`,
which checks both the phase and the pod's Ready condition (also shown in the sketch below).
- Add some more verbose logging to ease future debugging.
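For illustration, a minimal sketch of the replacement container command and
the readiness check; the pod/container names and the image tag are made up
for this example, while `PodRunningReady` is the shared helper from
k8s.io/kubernetes/test/utils:

```go
import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/kubernetes/test/e2e/framework"
	testutils "k8s.io/kubernetes/test/utils"
)

// makeAlwaysRunningPod returns a busybox pod that stays alive without
// relying on `sleep infinity`, which busybox's sleep does not understand.
func makeAlwaysRunningPod() *v1.Pod {
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "test-pod"},
		Spec: v1.PodSpec{
			Containers: []v1.Container{{
				Name:    "busybox-container",
				Image:   "busybox",
				Command: []string{"sh", "-c", "while true; do sleep 5; done"},
			}},
		},
	}
}

// expectPodRunningReady checks both the pod phase and the Ready condition
// via the shared test util, instead of inspecting the phase only.
func expectPodRunningReady(pod *v1.Pod) {
	ok, err := testutils.PodRunningReady(pod)
	if err != nil || !ok {
		framework.Failf("pod %q is not running and ready: %v", pod.Name, err)
	}
}
```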
The server service monitors the kubelet service and restarts it once the
service is down. To avoid restarting the kubelet twice, we stop the kubelet
service and wait until the kubelet is restarted and the node becomes ready.
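A rough sketch of the stop-and-wait flow, assuming the kubelet runs as a
systemd unit named "kubelet" and polling the node Ready condition; the
function name, unit name, and timeouts are illustrative, not the exact code:

```go
import (
	"context"
	"fmt"
	"os/exec"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	clientset "k8s.io/client-go/kubernetes"
)

// stopKubeletAndWaitForReady stops the kubelet service and waits until the
// monitoring service has restarted it and the node reports Ready again.
// The "kubelet" unit name and the timeouts are assumptions for illustration.
func stopKubeletAndWaitForReady(c clientset.Interface, nodeName string) error {
	if out, err := exec.Command("systemctl", "stop", "kubelet").CombinedOutput(); err != nil {
		return fmt.Errorf("failed to stop the kubelet service: %v, output: %s", err, out)
	}
	return wait.PollImmediate(2*time.Second, 5*time.Minute, func() (bool, error) {
		node, err := c.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
		if err != nil {
			return false, nil
		}
		for _, cond := range node.Status.Conditions {
			if cond.Type == v1.NodeReady && cond.Status == v1.ConditionTrue {
				return true, nil
			}
		}
		return false, nil
	})
}
```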
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
With the memory manager static policy:
- start multiple guaranteed pods and verify that the pods start successfully
- start a workload pod on each NUMA node to consume its memory, then start a
pod that requests more memory than any single NUMA node has; the pod should
fail to start with an admission error, because no single NUMA node has enough
free memory and each NUMA node is already used for a single-NUMA-node
allocation
The test requires at least two NUMA nodes.
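A minimal sketch of the kind of guaranteed pod (requests == limits) used in
these scenarios; the names and resource amounts are illustrative, and for the
negative case the memory request would simply exceed what any single NUMA node
can provide:

```go
import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// guaranteedMemoryPod returns a pod in the Guaranteed QoS class
// (requests == limits), which is what the memory manager static policy
// acts on. Names and amounts are illustrative only.
func guaranteedMemoryPod(name, memory string) *v1.Pod {
	res := v1.ResourceList{
		v1.ResourceCPU: resource.MustParse("1"),
		// e.g. "128Mi"; for the negative case, more than any NUMA node has
		v1.ResourceMemory: resource.MustParse(memory),
	}
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: v1.PodSpec{
			RestartPolicy: v1.RestartPolicyNever,
			Containers: []v1.Container{{
				Name:      name,
				Image:     "busybox",
				Command:   []string{"sh", "-c", "sleep 10"},
				Resources: v1.ResourceRequirements{Requests: res, Limits: res},
			}},
		},
	}
}
```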
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Provides basic e2e tests to verify that pods start successfully with the
MemoryManager enabled.
Verifies both MemoryManager policies, and when the node has multiple NUMA
nodes it also verifies the memory pinning.
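One plausible way to check the pinning on a multi-NUMA node, assuming cgroup
v1 and an exec helper on the framework object; the cgroup path and helper are
assumptions for illustration, not necessarily what the test actually does:

```go
import (
	"strings"

	"github.com/onsi/gomega"
	v1 "k8s.io/api/core/v1"
	"k8s.io/kubernetes/test/e2e/framework"
)

// verifyMemoryPinning reads the memory nodes allowed for the pod from its
// cpuset cgroup (cgroup v1 layout assumed) and compares them with the
// expected NUMA nodes, e.g. "0" or "0-1".
func verifyMemoryPinning(f *framework.Framework, pod *v1.Pod, expectedNUMANodes string) {
	out := f.ExecCommandInContainer(pod.Name, pod.Spec.Containers[0].Name,
		"cat", "/sys/fs/cgroup/cpuset/cpuset.mems")
	framework.Logf("pod %q allowed memory nodes: %q", pod.Name, out)
	gomega.Expect(strings.TrimSpace(out)).To(gomega.Equal(expectedNUMANodes))
}
```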
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
In the CPU manager and topology manager e2e tests it is possible that one of
the test's steps fails and the CPU manager state file is not cleaned up. Move
the deletion of the state file to `AfterEach` to guarantee that the state file
is always removed from the node.
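A minimal sketch of the cleanup, assuming the default state file location
/var/lib/kubelet/cpu_manager_state; the surrounding Describe block and the
kubelet restart that recreates a fresh state file are omitted:

```go
// Inside the Describe block of the CPU manager / topology manager tests:
ginkgo.AfterEach(func() {
	// Always remove the CPU manager state file, even when one of the test
	// steps failed, so the next test starts from a clean node.
	if err := os.Remove("/var/lib/kubelet/cpu_manager_state"); err != nil && !os.IsNotExist(err) {
		framework.Logf("failed to remove the CPU manager state file: %v", err)
	}
	// A kubelet restart (omitted here) makes it recreate a fresh state file.
})
```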
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
- fix the issue when the test runs on a node with a single CPU
- fix the issue when the CPU topology has only one core per socket; this can
be easily reproduced by configuring a multi-NUMA VM where each socket has
only one core
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Changes the default cluster DNS domain to an empty string to align with the
default kubelet configuration value.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Removes a comment from the daemons function that previously indicated that a
check was being run to make sure the docker daemon was running.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Relaxes the matching of the pod_memory_working_set_bytes metric so that we
won't error due to the presence of other pods.
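A sketch of the kind of relaxation, assuming the metrics are matched with
gomega/gstruct: switching from MatchAllElements to MatchElements with
IgnoreExtras lets samples from other pods pass without failing the assertion.
The function name here is illustrative:

```go
import (
	"github.com/onsi/gomega/gstruct"
	"github.com/onsi/gomega/types"
)

// Before (illustrative): MatchAllElements fails as soon as a sample from an
// unrelated pod shows up, because every element must be listed.
// After: MatchElements with IgnoreExtras ignores the extra pods' samples.
func podWorkingSetMatcher(podID gstruct.Identifier, expected gstruct.Elements) types.GomegaMatcher {
	return gstruct.MatchElements(podID, gstruct.IgnoreExtras, expected)
}
```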
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
The e2e topology manager tests want to verify resource alignment using
devices, and the easiest devices to use at the moment are SRIOV devices.
The resource alignment test cases are run in a loop, once for each supported
policy.
The tests manage the SRIOV device plugin; up until now, the plugin was set up
and torn down in each iteration of the loop.
There is no real need for that. Each iteration must reconfigure (and thus
restart) the kubelet, but the device plugin can be set up and torn down just
once for all the policies.
The kubelet can reconnect just fine to a running device plugin.
This way, we greatly reduce the interactions and the complexity of the test
environment, making it easier to understand and more robust, and we trim a
few minutes off the execution time.
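A rough sketch of the restructured flow; the helper names are illustrative,
only the shape matters: set up the SRIOV device plugin once, then loop over
the policies, reconfiguring and restarting only the kubelet:

```go
// Illustrative structure only; the helper names are not the real ones.
sd := setupSRIOVDevicePlugin(f)        // done once, before the policy loop
defer teardownSRIOVDevicePlugin(f, sd) // done once, after all the policies

for _, policy := range []string{"single-numa-node", "restricted", "best-effort", "none"} {
	// Only the kubelet is reconfigured and restarted per policy; it
	// reconnects to the already-running SRIOV device plugin.
	configureTopologyManagerInKubelet(f, policy)
	runResourceAlignmentTests(f, policy, sd)
}
```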
However, this patch also hides (not solves) a test flake we observed in some
environments. The issue is hard to reproduce and not well understood, but it
seems to be caused by doing the sriovdp setup/teardown in each policy testing
loop.
Investigation so far suggests that the kubelet sometimes has stale state
after the sriovdp teardown/setup cycle, leading to flakes and false
negatives.
We tried to address this in https://github.com/kubernetes/kubernetes/pull/95611
with no conclusive results yet.
This patch was posted because overall we believe its gains exceed the
drawbacks (hiding the aforementioned flake), and because understanding the
potential interaction issues between the sriovdp and the kubelet deserves a
separate test.
Signed-off-by: Francesco Romani <fromani@redhat.com>