On reconsidering the purpose of the test, i.e. to test the lock
contention flags (--lock-file-contention and --lock-file), it makes
sense to test only the actual functionality: the kubelet
should stop once there is lock contention.
It is in no way the responsibility of the kubelet to restart; that would
be the responsibility of a higher-level system such as systemd.
Hence this commit removes the check that releases the lock and then
verifies whether the kubelet is healthy again, since that is out of
scope for the kubelet's responsibilities.
Signed-off-by: Imran Pochi <imran@kinvolk.io>
The remote runtime implementation now supports the `verbose` fields,
which are required for consumers like cri-tools to enable multi CRI
version support.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
A cpu/topology manager e2e test wants to require one exclusive CPU
and a share of CPU time; let's round up the allocatable CPU requirements
(from 1 to 2) to reduce the chances of false negatives.
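
The rounding can be sketched as follows. This is a minimal sketch, not the
test's actual code: `requiredWholeCPUs` is a hypothetical helper, and the
shared amount is assumed to be expressed in milli-CPU.

```go
package main

// milliPerCPU is the number of milli-CPU units in one whole CPU.
const milliPerCPU = 1000

// requiredWholeCPUs (hypothetical helper) returns how many whole
// allocatable CPUs are needed to fit the given number of exclusive CPUs
// plus a shared slice of CPU time, rounding any fraction up.
func requiredWholeCPUs(exclusiveCPUs, sharedMilliCPU int64) int64 {
	total := exclusiveCPUs*milliPerCPU + sharedMilliCPU
	return (total + milliPerCPU - 1) / milliPerCPU
}
```

Under these assumptions, one exclusive CPU plus any non-zero share requires
two allocatable CPUs, while one exclusive CPU with no share requires one.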
Signed-off-by: Francesco Romani <fromani@redhat.com>
Even though CI machines _usually_ have at least two CPUs,
let's rather not assume this holds true, and let's actually
check the allocatable CPUs, skipping even the simplest
tests if the assumption is broken, to avoid false negatives.
Signed-off-by: Francesco Romani <fromani@redhat.com>
The existing cpu/topology manager tests correctly check for the
node resources and skip if the detected resources are not enough
to run the tests, to avoid false negatives.
Unfortunately they do the check against the node capacity, while
the correct approach is to check the allocatable resources.
The existing check is correct only in a narrow set of cases;
otherwise it can still lead to false negatives.
This PR fixes that.
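
The difference between the two checks can be sketched as follows. This is a
minimal sketch with hypothetical types and names (`nodeCPU`, `shouldSkip`),
assuming allocatable CPU is capacity minus the CPU reserved on the node; it
does not use the Kubernetes API types.

```go
package main

// nodeCPU (hypothetical type) models a node's CPU resources in milli-CPU.
type nodeCPU struct {
	capacityMilli int64 // total CPU on the node
	reservedMilli int64 // CPU set aside and not available to pods
}

// allocatableMilli returns the CPU that pods can actually request.
func (n nodeCPU) allocatableMilli() int64 {
	a := n.capacityMilli - n.reservedMilli
	if a < 0 {
		return 0
	}
	return a
}

// shouldSkip (hypothetical helper) reports whether a test that needs
// requiredCPUs whole CPUs must be skipped given availableMilli.
func shouldSkip(availableMilli, requiredCPUs int64) bool {
	return availableMilli < requiredCPUs*1000
}
```

For a node with 2000m capacity and 500m reserved, a test needing two CPUs is
not skipped when checked against capacity, yet it is skipped when checked
against allocatable, which is the case the capacity-based check misses.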
Signed-off-by: Francesco Romani <fromani@redhat.com>
Log the CPU capacity and allocatable resources of
the node running the tests, to make troubleshooting
test failures easier.
Signed-off-by: Francesco Romani <fromani@redhat.com>
With the removal of the kubelet AppArmor profile validation in
https://github.com/kubernetes/kubernetes/pull/97966 we passed the
responsibility for the desired behavior to the container runtime.
Therefore we have to change the e2e test, which silently broke after
that PR merged.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
This reverts commit 9d09c9d246.
This E2E test was reverted because it was failing continuously;
see #104307 for details.
This commit re-reverts that change and brings back the LockContention test,
with the [Serial] tag added to the test.
Revert to the previous behavior (1.21/1.20) of setting the pod phase to
Failed during graceful node shutdown.
Setting pods to the Failed phase ensures that external controllers that
manage pods, such as deployments, create new pods to replace those that
are shut down. Many customers have taken a dependency on this behavior
and changing it was a breaking change in 1.22, so this change reverts to
the previous behavior.
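
The effect on controllers can be sketched with a simplified model. This is a
minimal sketch, not kubelet or controller code: the types, the
`terminateForShutdown` and `podsToReplace` helpers, and the reason string are
all hypothetical.

```go
package main

// podPhase and pod are hypothetical stand-ins for the real API types.
type podPhase string

const (
	phaseRunning podPhase = "Running"
	phaseFailed  podPhase = "Failed"
)

type pod struct {
	name   string
	phase  podPhase
	reason string
}

// terminateForShutdown models what this change restores: a pod stopped
// during graceful node shutdown ends up in the Failed phase.
func terminateForShutdown(p *pod) {
	p.phase = phaseFailed
	p.reason = "NodeShutdown" // hypothetical reason string
}

// podsToReplace models a controller that creates replacements only for
// pods it observes in the Failed phase.
func podsToReplace(pods []pod) []string {
	var names []string
	for _, p := range pods {
		if p.phase == phaseFailed {
			names = append(names, p.name)
		}
	}
	return names
}
```

In this model a pod that is stopped without its phase changing is never
picked up by `podsToReplace`, which is why the phase matters to controllers.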
Signed-off-by: David Porter <david@porter.me>