Commit Graph

41 Commits

Author SHA1 Message Date
Francesco Romani
00b41334bf e2e: node: podresources: fix restart wait
Fix the waiting logic in the e2e test loop to wait
for resources to be reported again instead of making logic on the
timestamp. The idea is that waiting for resource availability
is the canonical way clients should observe the desired state,
and it should also be more robust than comparing timestamps,
especially on CI environments.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:55 +01:00
Francesco Romani
92e00203e0 e2e: node: unify sample device plugin utilities
Start to consolidate the sample device plugin utility
and constants in a central place, because we need
to use it in different e2e tests.

Having a central dependency is better than a maze of
entangled e2e tests depending on each other helpers.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:55 +01:00
Francesco Romani
871201ba64 e2e: node: remove kubevirt device plugin
The podresources e2e tests want to exercise the case on which a device
plugin doesn't report topology affinity. The only known device plugin
which had this requirement and didn't depend on specialized hardware
was the kubevirt device plugin, which was however deprecated after
we started using it.

So the e2e tests are now broken, and in any case they can't depend on
unmaintained and liable to be obsolete code.

To unblock the state and preserve some e2e signal, we switch to the
sample device plugin, which is a stub implementation and which is
managed in-tree, so we can maintain it and ensure it fits the e2e test
usecase.

This is however a regression, because e2e tests should try their hardest
to use real devices and avoid any mocking or faking.
The upside is that using a OS-neutral device plugin for the tests enables
us to run on all the supported platform (windows!) so this could allow
us to transition these tests to conformance.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:22 +01:00
Francesco Romani
aa1a0385e2 e2e: node: podresources: internal cleanup
rename getPodResources for clarity. Allow to return error
(and not use ginkgo expectations), so it can actually be used
as intended inside `Eventually` blocks without blow up at the
first failure.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 13:49:13 +01:00
Ian K. Coolidge
cbb985a310 cpuset: Delete 'builder' methods
All usage of builder pattern is convertible to cpuset.New()
with the same or fewer lines of code.

Migrate Builder.Add to a private method of CPUSet, with a comment
that it is only intended for internal use to preserve immutable
propoerty of the exported interface.

This also removes 'require' library dependency, which avoids
non-standard library usage.
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
f3829c4be3 cpuset: Rename 'NewCPUSet' to 'New' 2023-01-06 23:32:51 +00:00
Ian K. Coolidge
a0c989b99a cpuset: Remove *Int64 methods
These are rarely used and can be accommodated with a trivial helper.
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
67a057d4f2 cpuset: Remove 'MustParse' method
Removes exit/fatal from cpuset library.

Usage in podresources test was not necessary.

Library reference in cpu_manager_test was moved to a local function, and
converted to use e2e test framework error catching.
2023-01-06 23:32:51 +00:00
Patrick Ohly
2f6c4f5eab e2e: use Ginkgo context
All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
2022-12-16 20:14:04 +01:00
Patrick Ohly
d4729008ef e2e: simplify test cleanup
ginkgo.DeferCleanup has multiple advantages:
- The cleanup operation can get registered if and only if needed.
- No need to return a cleanup function that the caller must invoke.
- Automatically determines whether a context is needed, which will
  simplify the introduction of context parameters.
- Ginkgo's timeline shows when it executes the cleanup operation.
2022-12-13 08:09:01 +01:00
Patrick Ohly
df5d84ae81 e2e: accept context from Ginkgo
Every ginkgo callback should return immediately when a timeout occurs or the
test run manually gets aborted with CTRL-C. To do that, they must take a ctx
parameter and pass it through to all code which might block.

This is a first automated step towards that: the additional parameter got added
with

    sed -i 's/\(framework.ConformanceIt\|ginkgo.It\)\(.*\)func() {$/\1\2func(ctx context.Context) {/' \
        $(git grep -l -e framework.ConformanceIt -e ginkgo.It )
    $GOPATH/bin/goimports -w $(git status | grep modified: | sed -e 's/.* //')

log_test.go was left unchanged.
2022-12-10 19:50:18 +01:00
Kubernetes Prow Robot
45636684a4
Merge pull request #112897 from fromanirh/podresources-metrics-e2e-tests
register podresources metrics
2022-10-19 13:57:18 -07:00
Francesco Romani
3c60c1a10c node: e2e: add podresources metrics tests
add tests to ensure the podresources metrics are exposed,
and basic sanity tests for their values.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-10-06 15:14:56 +02:00
Patrick Ohly
dfdf88d4fa e2e: adapt to moved code
This is the result of automatically editing source files like this:

    go install golang.org/x/tools/cmd/goimports@latest
    find ./test/e2e* -name "*.go" | xargs env PATH=$GOPATH/bin:$PATH ./e2e-framework-sed.sh

with e2e-framework-sed.sh containing this:

sed -i \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecCommandInContainer(/e2epod.ExecCommandInContainer(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecCommandInContainerWithFullOutput(/e2epod.ExecCommandInContainerWithFullOutput(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecShellInContainer(/e2epod.ExecShellInContainer(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecShellInPod(/e2epod.ExecShellInPod(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecShellInPodWithFullOutput(/e2epod.ExecShellInPodWithFullOutput(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.ExecWithOptions(/e2epod.ExecWithOptions(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.MatchContainerOutput(/e2eoutput.MatchContainerOutput(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.PodClient(/e2epod.NewPodClient(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.PodClientNS(/e2epod.PodClientNS(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.TestContainerOutput(/e2eoutput.TestContainerOutput(\1, /" \
    -e "s/\(f\|fr\|\w\w*\.[fF]\w*\)\.TestContainerOutputRegexp(/e2eoutput.TestContainerOutputRegexp(\1, /" \
    -e "s/framework.AddOrUpdateLabelOnNode\b/e2enode.AddOrUpdateLabelOnNode/" \
    -e "s/framework.AllNodes\b/e2edebug.AllNodes/" \
    -e "s/framework.AllNodesReady\b/e2enode.AllNodesReady/" \
    -e "s/framework.ContainerResourceGatherer\b/e2edebug.ContainerResourceGatherer/" \
    -e "s/framework.ContainerResourceUsage\b/e2edebug.ContainerResourceUsage/" \
    -e "s/framework.CreateEmptyFileOnPod\b/e2eoutput.CreateEmptyFileOnPod/" \
    -e "s/framework.DefaultPodDeletionTimeout\b/e2epod.DefaultPodDeletionTimeout/" \
    -e "s/framework.DumpAllNamespaceInfo\b/e2edebug.DumpAllNamespaceInfo/" \
    -e "s/framework.DumpDebugInfo\b/e2eoutput.DumpDebugInfo/" \
    -e "s/framework.DumpNodeDebugInfo\b/e2edebug.DumpNodeDebugInfo/" \
    -e "s/framework.EtcdUpgrade\b/e2eproviders.EtcdUpgrade/" \
    -e "s/framework.EventsLister\b/e2edebug.EventsLister/" \
    -e "s/framework.ExecOptions\b/e2epod.ExecOptions/" \
    -e "s/framework.ExpectNodeHasLabel\b/e2enode.ExpectNodeHasLabel/" \
    -e "s/framework.ExpectNodeHasTaint\b/e2enode.ExpectNodeHasTaint/" \
    -e "s/framework.GCEUpgradeScript\b/e2eproviders.GCEUpgradeScript/" \
    -e "s/framework.ImagePrePullList\b/e2epod.ImagePrePullList/" \
    -e "s/framework.KubectlBuilder\b/e2ekubectl.KubectlBuilder/" \
    -e "s/framework.LocationParamGKE\b/e2eproviders.LocationParamGKE/" \
    -e "s/framework.LogSizeDataTimeseries\b/e2edebug.LogSizeDataTimeseries/" \
    -e "s/framework.LogSizeGatherer\b/e2edebug.LogSizeGatherer/" \
    -e "s/framework.LogsSizeData\b/e2edebug.LogsSizeData/" \
    -e "s/framework.LogsSizeDataSummary\b/e2edebug.LogsSizeDataSummary/" \
    -e "s/framework.LogsSizeVerifier\b/e2edebug.LogsSizeVerifier/" \
    -e "s/framework.LookForStringInLog\b/e2eoutput.LookForStringInLog/" \
    -e "s/framework.LookForStringInPodExec\b/e2eoutput.LookForStringInPodExec/" \
    -e "s/framework.LookForStringInPodExecToContainer\b/e2eoutput.LookForStringInPodExecToContainer/" \
    -e "s/framework.MasterAndDNSNodes\b/e2edebug.MasterAndDNSNodes/" \
    -e "s/framework.MasterNodes\b/e2edebug.MasterNodes/" \
    -e "s/framework.MasterUpgradeGKE\b/e2eproviders.MasterUpgradeGKE/" \
    -e "s/framework.NewKubectlCommand\b/e2ekubectl.NewKubectlCommand/" \
    -e "s/framework.NewLogsVerifier\b/e2edebug.NewLogsVerifier/" \
    -e "s/framework.NewNodeKiller\b/e2enode.NewNodeKiller/" \
    -e "s/framework.NewResourceUsageGatherer\b/e2edebug.NewResourceUsageGatherer/" \
    -e "s/framework.NodeHasTaint\b/e2enode.NodeHasTaint/" \
    -e "s/framework.NodeKiller\b/e2enode.NodeKiller/" \
    -e "s/framework.NodesSet\b/e2edebug.NodesSet/" \
    -e "s/framework.PodClient\b/e2epod.PodClient/" \
    -e "s/framework.RemoveLabelOffNode\b/e2enode.RemoveLabelOffNode/" \
    -e "s/framework.ResourceConstraint\b/e2edebug.ResourceConstraint/" \
    -e "s/framework.ResourceGathererOptions\b/e2edebug.ResourceGathererOptions/" \
    -e "s/framework.ResourceUsagePerContainer\b/e2edebug.ResourceUsagePerContainer/" \
    -e "s/framework.ResourceUsageSummary\b/e2edebug.ResourceUsageSummary/" \
    -e "s/framework.RunHostCmd\b/e2eoutput.RunHostCmd/" \
    -e "s/framework.RunHostCmdOrDie\b/e2eoutput.RunHostCmdOrDie/" \
    -e "s/framework.RunHostCmdWithFullOutput\b/e2eoutput.RunHostCmdWithFullOutput/" \
    -e "s/framework.RunHostCmdWithRetries\b/e2eoutput.RunHostCmdWithRetries/" \
    -e "s/framework.RunKubectl\b/e2ekubectl.RunKubectl/" \
    -e "s/framework.RunKubectlInput\b/e2ekubectl.RunKubectlInput/" \
    -e "s/framework.RunKubectlOrDie\b/e2ekubectl.RunKubectlOrDie/" \
    -e "s/framework.RunKubectlOrDieInput\b/e2ekubectl.RunKubectlOrDieInput/" \
    -e "s/framework.RunKubectlWithFullOutput\b/e2ekubectl.RunKubectlWithFullOutput/" \
    -e "s/framework.RunKubemciCmd\b/e2ekubectl.RunKubemciCmd/" \
    -e "s/framework.RunKubemciWithKubeconfig\b/e2ekubectl.RunKubemciWithKubeconfig/" \
    -e "s/framework.SingleContainerSummary\b/e2edebug.SingleContainerSummary/" \
    -e "s/framework.SingleLogSummary\b/e2edebug.SingleLogSummary/" \
    -e "s/framework.TimestampedSize\b/e2edebug.TimestampedSize/" \
    -e "s/framework.WaitForAllNodesSchedulable\b/e2enode.WaitForAllNodesSchedulable/" \
    -e "s/framework.WaitForSSHTunnels\b/e2enode.WaitForSSHTunnels/" \
    -e "s/framework.WorkItem\b/e2edebug.WorkItem/" \
    "$@"

for i in "$@"; do
    # Import all sub packages and let goimports figure out which of those
    # are redundant (= already imported) or not needed.
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2edebug "k8s.io/kubernetes/test/e2e/framework/debug"' "$i"
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2ekubectl "k8s.io/kubernetes/test/e2e/framework/kubectl"' "$i"
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2enode "k8s.io/kubernetes/test/e2e/framework/node"' "$i"
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2eoutput "k8s.io/kubernetes/test/e2e/framework/pod/output"' "$i"
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2epod "k8s.io/kubernetes/test/e2e/framework/pod"' "$i"
    sed -i -e '/"k8s.io.kubernetes.test.e2e.framework"/a e2eproviders "k8s.io/kubernetes/test/e2e/framework/providers"' "$i"
    goimports -w "$i"
done
2022-10-06 08:19:47 +02:00
Dave Chen
857458cfa5 update ginkgo from v1 to v2 and gomega to 1.19.0
- update all the import statements
- run hack/pin-dependency.sh to change pinned dependency versions
- run hack/update-vendor.sh to update go.mod files and the vendor directory
- update the method signatures for custom reporters

Signed-off-by: Dave Chen <dave.chen@arm.com>
2022-07-08 10:44:46 +08:00
Sergiusz Urbaniak
1495c9f2cd
test/e2e/*: default existing tests to privileged pod security policy
This is to ensure that all existing tests don't break when defaulting
the pod security policy to restricted in the e2e test framework.
2022-04-05 08:41:12 +02:00
ahrtr
fe95aa614c io/ioutil has already been deprecated in golang 1.16, so replace all ioutil with io and os 2022-02-03 05:32:12 +08:00
Francesco Romani
bf9bab5bc6 e2e: podresources: wait for local node ready again
Let's wait for the local node (aka the kubelet)
to be ready before to query podresources again,
to avoid false negatives.

Co-authored-by: Artyom Lukianov <alukiano@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-11-09 19:02:19 +01:00
Francesco Romani
14105c09fb e2e: node: wait for kvm plugin removal
we need to make sure the system state is completely cleaned up
again, to avoid to mess up with the shared node state, before
we transition from one test to another.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-11-09 11:43:55 +01:00
Francesco Romani
4b46c3a0d2 e2e: node: podresources: fix exclusive cpus check
Since commit 42dd01aa3f the cpuRequest is in millicores, hence
we need to properly check translating to exclusive cpus
when verifying the resource allocation.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-11-09 11:16:54 +01:00
Francesco Romani
a6e8f7530a e2e: node: podresources: add internal helpers
the intent is to make the code more readable, no intended
changes in behaviour. Now it should be a bit more explicit
why the code is checking some values.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-11-09 11:16:54 +01:00
Artyom Lukianov
50fdcdfc59 e2e_node: refactor code to use a single method to update the kubelet config
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:44:35 +02:00
Artyom Lukianov
b6211657bf e2e_node: drop usage of DynamicKubeletConfig
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:26:19 +02:00
Kubernetes Prow Robot
e450e3331f
Merge pull request #105482 from endocrimes/dani/kubeletconfig
e2e_node: remove unnecessary dynamic config changes
2021-10-28 07:04:27 -07:00
Martin Schimandl
c9edee165a
Cleanup FeatureGate skippers (#105428)
* Cleanup FeatureGate skippers

* Perform changes requested by review

* some more review related changes

* Rename skipper functions to make code more readable

* add utilfeature back in
2021-10-20 01:47:57 -07:00
Kubernetes Prow Robot
fe62fcc9b4
Merge pull request #105516 from fromanirh/e2e-kubelet-restart-improvements
e2e: node: kubelet restart improvements
2021-10-14 17:58:54 -07:00
Kubernetes Prow Robot
894ceb63d0
Merge pull request #105003 from swatisehgal/getallocatable-to-beta
podresource-api: getAllocatableResources to Beta
2021-10-13 17:43:27 -07:00
Francesco Romani
d15bff2839 e2e: node: expose the running flag
Each e2e test knows it wants to restart a running kubelet or a
non-running kubelet. The vast majority of times, we want to
restart a running kubelet (e.g. to change config or to check
some properties hold across kubelet crashes/restarts), but sometimes
we stop the kubelet, do some actions and only then restart.

To accomodate both use cases, we just expose the `running` boolean
flag to the e2e tests.

Having the `restartKubelet` explicitly restarting a running kubelet
helps us to trobuleshoot e2e failures on which the kubelet
was supposed to be running, while it was not; attempting a restart
in such cases only murkied the waters further, making the
troubleshooting and the eventual fix harder.

In the happy path, no expected change in behaviour.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-10-07 22:15:28 +02:00
Swati Sehgal
5043b431b4 excludesharedpool: e2e tests: Test cases for pods with non-integral CPUs
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-07 15:39:41 +01:00
Swati Sehgal
42dd01aa3f excludesharedpool: e2e tests: code refactor to handle non-integral CPUs
This patch changes cpuCount to cpuRequest in order to cater to cases
where guaranteed pods make non-integral CPU Requests.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-07 15:39:40 +01:00
Kubernetes Prow Robot
c4d802b0b5
Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology
podresources: do not export empty NUMA topology
2021-10-07 07:11:46 -07:00
Swati Sehgal
9337902648 podresource: move the checkForTopology logic inline
As per the recommendation here: https://github.com/kubernetes/kubernetes/pull/103289#pullrequestreview-766949859
we move the check inline.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-06 11:31:48 +01:00
Danielle Lancashire
742d3d36f5 e2e_node: cleanup features in podresources 2021-10-05 14:39:59 +02:00
Swati Sehgal
01dacd0463 podresource-api: getAllocatableResources to Beta
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-01 16:48:29 +01:00
Francesco Romani
54c7d8fbb1 e2e: TM: add option to fail instead of skip
The Topology Manager e2e tests wants to run on real multi-NUMA system
and want to consume real devices supported by device plugins; SRIOV
devices happen to be the most commonly available of such devices.

CI machines aren't multi NUMA nor expose SRIOV devices, so the biggest portion
of the tests will just skip, and we need to keep it like this until we
figure out how to enable these features.

However, some organizations can and want to run the testsuite on bare metal;
in this case, the current test will skip (not fail) with misconfigured
boxes, and this reports a misleading result. It will be much better to
fail if the test preconditions aren't met.

To satisfy both needs, we add an option, controlled by an environment
variable, to fail (not skip) if the machine on which the test run
doesn't meet the expectations (multi-NUMA, 4+ cores per NUMA cell,
expose SRIOV VFs).
We keep the old behaviour as default to keep being CI friendly.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-09-13 13:23:36 +02:00
Alexey Perevalov
bb81101570 podresource: do not export NUMA topology if it's empty
If device plugin returns device without topology, keep it internaly
as NUMA node -1, it helps at podresources level to not export NUMA
topology, otherwise topology is exported with NUMA node id 0,
which is not accurate.

It's imposible to unveile this bug just by tracing json.Marshal(resp)
in podresource client, because NUMANodes field ID has json property
omitempty, in this case when ID=0 shown as emtpy NUMANode.
To reproduce it, better to iterate on devices and just
trace dev.Topology.Nodes[0].ID.

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2021-08-24 15:38:21 +00:00
Francesco Romani
fc0955c26a e2e: topomgr: use deletePodSync for faster delete
Previously the code used to delete pods serially.
In this patch we factor out code to do that in parallel,
using goroutines.

This shaves some time in the e2e tm test run with no intended
changes in behaviour.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-18 13:28:16 +01:00
Francesco Romani
d7a30e1b08 podresources: getallocatable: add feature gate
Add feature gate to disable the GetAllocatableResources API.
The feature gate isd alpha stage, disabled by default.

Add e2e test to demonstrate the behaviour with feature gate disabled.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:14:56 +01:00
Francesco Romani
adfff27279 node: e2e: run deleteSync in parallel
speedup the cleanup after testcases deleting pods in separate
goroutines.
The post-test cleanup stage must be done carefully since pod require
exclusive allocation - so pods must take all the steps to properly
cleanup the tests to avoid to pollute the environment, but
this has a negative effect on test duration (take longer).

Hence, we add safe speedups like doing pod deletions in parallel.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:13:36 +01:00
Francesco Romani
9c69db3f04 e2e: node: add tests for GetAllocatableResources
Add e2e tests for the new GetAllocatableResources API.
The tests are added in the `podresources_test` suite
created previously in this series.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:13:36 +01:00
Francesco Romani
4e7434028c e2e: node: bootstrap podresources tests
Start e2e tests for the existing List() API.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:13:35 +01:00