On the multi NUMA node environment, kernel splits hugepages allocated under
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages file equally between NUMA nodes.
That makes it harder to predict where several pods will start because the number
of hugepages on each NUMA node will depend on the amount of NUMA nodes under the environment.
The memory manager test will allocate hugepages on the specific NUMA node to make
the test more predictable on the multi NUMA nodes environment.
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Let's wait for the local node (aka the kubelet)
to be ready before to query podresources again,
to avoid false negatives.
Co-authored-by: Artyom Lukianov <alukiano@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
DKC is being removed and we don't want it to continue flaking the rest
of our tests. Lets disable them when dkc is disabled rather than hard
failing. This fits more in line with our other E2Es, and reduces the
maintenance load in test-infra.
we need to make sure the system state is completely cleaned up
again, to avoid to mess up with the shared node state, before
we transition from one test to another.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Since commit 42dd01aa3f the cpuRequest is in millicores, hence
we need to properly check translating to exclusive cpus
when verifying the resource allocation.
Signed-off-by: Francesco Romani <fromani@redhat.com>
the intent is to make the code more readable, no intended
changes in behaviour. Now it should be a bit more explicit
why the code is checking some values.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: wangyysde <net_use@bzhy.com>
Generation swagger.json.
Use v2 path for hpa_cpu_field.
run update-codegen.sh
Signed-off-by: wangyysde <net_use@bzhy.com>
Some of the networking tests are flaking, and logging the command stdout and stderr
might show us some additional information about the the underlying issue when it
occurs.
Some tests have a short timeout for starting the pods (1 minute), but if
those tests happen to be the first ones to run, and the images have to be
pulled, then the test could timeout, especially with larger images. This
commit will allow us to prepull commonly used E2E test images, so this issue
can be avoided.
The logic to detect stale endpoints was not assuming the endpoint
readiness.
We can have stale entries on UDP services for 2 reasons:
- an endpoint was receiving traffic and is removed or replaced
- a service was receiving traffic but not forwarding it, and starts
to forward it.
Add an e2e test to cover the regression