The node-kubelet-flaky e2e job that runs the the
`Node Performance Testing [Serial] [Slow] [Flaky]` e2e tests have been
flaking because of inconsistencies on the cpu manager checkpoint file.
This seems to be caused because the checkpoint file is deleted (which is
what needs to happen in order to change the CPU manager policy which is
used for these e2e tests) right after the e2e tests asserts that a pod
does not exist anymore.
However, after a pod is deleted, the CPU manager may still be cleaning
up the resources used by the pod which may result in the checkpoint file
being created.
Whenever this happened, the kubelet would panic if we then try to
subsequently change the CPU manager policy to "static" from "none" or
vice versa (this is done 4 times in these tests).
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
Most of these could have been refactored automatically but it wouldn't
have been uglier. The unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
Remove the "OrDie" from the name (since it doesn't "or die") and add
an extra check that there is at least 1 node available, since many
callers already did that themselves, and many others should have.