34 Commits

Author SHA1 Message Date
Aaron Crickenberger
28768166f5 decouple testfiles from framework
This drops testfiles.ReadOrDie and updates testfiles.Exists to return an
error, forcing the caller to decide whether to call framework.Fail or do
something else.

It makes for a slightly less friendly API, but also means the package is
decoupled from framework again, as per the comments at the top of the
file.
2020-06-29 14:54:09 -07:00
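The pattern described in the 28768166f5 entry, where the helper returns an error and the caller chooses how to react, can be sketched roughly as follows. This is a minimal illustration and not the actual testfiles code: the `exists` function and the path used here are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// exists is a hypothetical helper. Instead of aborting the test run on
// failure, it reports the problem and leaves the decision to the caller.
func exists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, os.ErrNotExist) {
		// A missing file is a valid answer, not an error.
		return false, nil
	}
	return false, err
}

func main() {
	found, err := exists("/hypothetical/test/file.yaml")
	if err != nil {
		// The caller decides: fail the test, skip it, or fall back.
		fmt.Println("cannot check file:", err)
		return
	}
	fmt.Println("found:", found)
}
```

The trade-off mentioned in the commit message shows up in `main`: every call site needs its own error handling, but the helper no longer depends on any test framework.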
David Porter
f5b8c3d746 Mark Topology Manager Test as non-alpha and NodeFeature 2020-05-26 12:10:18 -07:00
drfish
dfab6b637f Update .import-aliases for e2e test framework 2020-03-25 11:40:02 +08:00
Kubernetes Prow Robot
5708511499
Merge pull request #88708 from mikedanese/deleteopts
Migrate clientset metav1.DeleteOpts to pass-by-value
2020-03-05 23:09:23 -08:00
Kubernetes Prow Robot
50dd75f9c5
Merge pull request #88773 from vpickard/e2e-topology-manager-sriovdpReady
e2e-topology-manager: Wait for SR-IOV device plugin
2020-03-05 20:04:38 -08:00
Mike Danese
c58e69ec79 automated refactor 2020-03-05 14:59:46 -08:00
Kubernetes Prow Robot
1f2e1967d1
Merge pull request #88566 from Deepthidharwar/topology-mgr-numa-tests
Enable running cpu-mgr-multiNUMA e2e tests with Topology manager
2020-03-05 05:38:37 -08:00
vpickard
61565b3f6c e2e-topology-manager: Wait for SR-IOV device plugin
Make sure the SR-IOV device plugin is ready, and that
there are enough SR-IOV devices allocatable before
spinning up test pods.

Signed-off-by: vpickard <vpickard@redhat.com>
2020-03-04 10:07:35 -05:00
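The readiness wait described in the 61565b3f6c entry amounts to polling until enough devices are allocatable. A minimal sketch follows, under the assumption that the allocatable count can be fetched through a callback; `waitForAllocatable` and its parameters are hypothetical names, not the helpers used in the real test.

```go
package main

import (
	"fmt"
	"time"
)

// waitForAllocatable polls get until it reports at least want devices,
// or until timeout expires. All names here are illustrative.
func waitForAllocatable(get func() int64, want int64, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if got := get(); got >= want {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %d allocatable devices", want)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulated source: one more device becomes allocatable on each poll.
	var devices int64
	get := func() int64 { devices++; return devices }

	if err := waitForAllocatable(get, 3, 10*time.Millisecond, time.Second); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("enough devices are allocatable")
}
```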
Deepthi Dharwar
1ede096465 Enable topology-manager-e2e tests to run on MultiNUMA nodes.
Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>
2020-03-02 22:36:43 +05:30
Deepthi Dharwar
a4b59a5d7c Currently the SRIOV detection logic reports an error if it fails to detect an SRIOV device
on the system. This patch aims to fix that.

Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>
2020-03-02 19:31:37 +05:30
Francesco Romani
64904d0ab8 e2e: topomgr: extend tests to all the policies
Per https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0035-20190130-topology-manager.md#multi-numa-systems-tests
we validate only the results for single-numa node policy,
because there is no simple and reliable way to validate
the allocation performed by the other policies.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-20 18:22:34 +01:00
Francesco Romani
a249b93687 e2e: topomgr: address reviewer comments
Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-20 10:31:09 +01:00
Francesco Romani
833519f80b e2e: topomgr: properly clean up after completion
Due to an oversight, the e2e topology manager tests
were leaking a configmap and a serviceaccount.
This patch ensures a proper cleanup.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
7c12251c7a e2e: topomgr: add multi-container tests
Add tests to check alignment of pods which contain more than one
container.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
8e9d76f1b9 e2e: topomgr: validate all containers in pod
Up until now, the test validated the alignment of resources
only in the first container in a pod. That was just an oversight.
With this patch, we validate all the containers in a given pod.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
ddc18eae67 e2e: topomgr: autodetect NUMA position of VF devs
Add autodetection code to figure out which NUMA node
the devices are attached to.
This autodetection works under the assumption that all the VFs in
the system must be used for the tests.
Should this not be the case, or in general to handle non-trivial
configurations, we keep the annotations mechanism added to the
SRIOV device plugin config map.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
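One way to autodetect the NUMA node of a PCI device, as the ddc18eae67 entry describes, is to read the `numa_node` attribute that Linux exposes in sysfs for each PCI device. The sketch below assumes that approach; the function names and the PCI address are hypothetical, and the real test code may work differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNUMANode converts the content of a sysfs numa_node file to an int.
// The kernel reports -1 when the device has no NUMA affinity.
func parseNUMANode(raw string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(raw))
}

// numaNodeOfDevice reads <sysfsRoot>/bus/pci/devices/<pciAddr>/numa_node.
// sysfsRoot is a parameter so the function can be pointed at a fake tree.
func numaNodeOfDevice(sysfsRoot, pciAddr string) (int, error) {
	path := filepath.Join(sysfsRoot, "bus", "pci", "devices", pciAddr, "numa_node")
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return parseNUMANode(string(data))
}

func main() {
	// The PCI address is only an example and may not exist on this host.
	node, err := numaNodeOfDevice("/sys", "0000:00:00.0")
	if err != nil {
		fmt.Println("could not read NUMA node:", err)
		return
	}
	fmt.Println("NUMA node:", node)
}
```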
Francesco Romani
bb6beb99e5 e2e: topomgr: early check to detect VFs, not PFs
The e2e_node topology_manager check has an early, quick check
to rule out systems without SRIOV devices, thus skipping the tests.

The first version of the check detected PFs (Physical Functions),
under the assumption that VFs (Virtual Functions) had already been
created. This works because, obviously, you can't have VFs without PFs.

However, it's a little safer and easier to understand if we check
directly for VFs, bailing out from systems which don't provide them.

Nothing changes for properly configured test systems.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:41 +01:00
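A direct check for VFs, as the bb6beb99e5 entry describes, could rely on the fact that on Linux a VF's sysfs directory contains a `physfn` link pointing back to its parent PF. The sketch below assumes that detection method; `isVirtualFunction` and `countVFs` are hypothetical names and not necessarily what the test uses.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isVirtualFunction reports whether the PCI device directory has a
// "physfn" entry, which sysfs provides only for SR-IOV virtual functions.
func isVirtualFunction(devDir string) bool {
	_, err := os.Lstat(filepath.Join(devDir, "physfn"))
	return err == nil
}

// countVFs counts the VFs found under <sysfsRoot>/bus/pci/devices.
func countVFs(sysfsRoot string) (int, error) {
	devDirs, err := filepath.Glob(filepath.Join(sysfsRoot, "bus", "pci", "devices", "*"))
	if err != nil {
		return 0, err
	}
	count := 0
	for _, dir := range devDirs {
		if isVirtualFunction(dir) {
			count++
		}
	}
	return count, nil
}

func main() {
	vfs, err := countVFs("/sys")
	if err != nil {
		fmt.Println("cannot scan PCI devices:", err)
		return
	}
	if vfs == 0 {
		fmt.Println("no VFs found, the tests would be skipped")
		return
	}
	fmt.Println("VFs found:", vfs)
}
```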
Francesco Romani
70cce5e3f1 e2e: topomgr: introduce sriov setup/teardown funcs
Reorganize the code with setup and teardown functions,
to make room for the future addition of more device plugin
support, and to make the code a bit tidier.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
2f0a6d2c76 e2e: topomgr: use constants for test limits
Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
fee1dba054 e2e: topomgr: improve the test logs
Clarify which test is doing what, to make
the test output easier to understand.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
83c344647f e2e: topomgr: better check for AffinityError
Add a helper function to check if a Pod failed
admission for Topology Affinity Error.
So far we only check the Status.Reason.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
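A helper of the kind the 83c344647f entry describes, checking only the status reason, might look like the sketch below. The `podStatus` type is a simplified local stand-in for the Kubernetes pod status, and the pattern matched against the reason is an assumption, not a confirmed value from the source.

```go
package main

import (
	"fmt"
	"regexp"
)

// podStatus is a simplified, hypothetical stand-in for a pod's status.
type podStatus struct {
	Phase  string
	Reason string
}

// affinityErrorRE is an assumed pattern for the rejection reason.
var affinityErrorRE = regexp.MustCompile(`Topology.*Affinity.*Error`)

// isTopologyAffinityError looks only at the Reason field, mirroring the
// "so far we only check the Status.Reason" note in the commit message.
func isTopologyAffinityError(status podStatus) bool {
	return affinityErrorRE.MatchString(status.Reason)
}

func main() {
	rejected := podStatus{Phase: "Failed", Reason: "TopologyAffinityError"}
	other := podStatus{Phase: "Failed", Reason: "OutOfcpu"}
	fmt.Println(isTopologyAffinityError(rejected), isTopologyAffinityError(other))
}
```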
Francesco Romani
512a4e8a3e e2e: topomgr: reduce node readiness timeout
Five minutes was initially used only to be overcautious.
From my experiments, the node is usually ready in less than a minute.
Double it to give some buffer space.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
3b4122bd03 e2e: topomgr: get and use topology hints from conf
To properly implement some e2e tests, we need to know
some basic topology facts about the system running the tests.
The bare minimum we need to know is how many PCI SRIOV devices
are attached to which NUMA node.

This way we know which core we can reserve for kube services,
and which NUMA socket we can take to test full socket reservation.

To let the tests know the PCI device topology, we use annotations
in the SRIOV device plugin ConfigMap we need anyway.
The format is

```yaml
  metadata:
    annotations:
      pcidevice_node0: "2"
      pcidevice_node1: "0"
```

with one annotation per NUMA node in the system.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
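The annotation format shown in the 3b4122bd03 entry can be turned into a per-NUMA-node device count with a small parser. The sketch below assumes the annotations are available as a plain string map; `pciDevicesPerNUMANode` is a hypothetical name, not the function used in the test suite.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pciDevicesPerNUMANode extracts "pcidevice_node<N>: <count>" annotations
// and returns a map from NUMA node index to device count. Annotations
// without the expected prefix are ignored.
func pciDevicesPerNUMANode(annotations map[string]string) (map[int]int, error) {
	const prefix = "pcidevice_node"
	counts := make(map[int]int)
	for key, value := range annotations {
		if !strings.HasPrefix(key, prefix) {
			continue
		}
		node, err := strconv.Atoi(strings.TrimPrefix(key, prefix))
		if err != nil {
			return nil, fmt.Errorf("bad NUMA node index in %q: %w", key, err)
		}
		count, err := strconv.Atoi(value)
		if err != nil {
			return nil, fmt.Errorf("bad device count for %q: %w", key, err)
		}
		counts[node] = count
	}
	return counts, nil
}

func main() {
	counts, err := pciDevicesPerNUMANode(map[string]string{
		"pcidevice_node0": "2",
		"pcidevice_node1": "0",
	})
	if err != nil {
		fmt.Println("invalid annotations:", err)
		return
	}
	fmt.Println("devices on node 0:", counts[0])
	fmt.Println("devices on node 1:", counts[1])
}
```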
Francesco Romani
d9d652e867 e2e: topomgr: initial negative tests
A negative test is when we request a guaranteed (gu) Pod we know the system cannot
fulfill - hence we expect rejection from the topology manager.

Unfortunately, besides the trivial case of excessive cores (request
more socket than a NUMA node provides) we cannot easily test the
devices, because crafting a proper pod will require detailed knowledge
of the hw topology.

Let's consider a hypothetical two-node NUMA system with two PCIe buses,
one per NUMA node, with a SRIOV device on each bus.
A proper negative test would require two SRIOV devices, which the system
can provide but not on the same single NUMA node.
Requiring for example three devices (one more than the system provides)
will lead to a different, legitimate admission error.

For these reasons we bootstrap the testing infra for the negative tests,
but we add just the simplest one.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
ee92b4aae0 e2e: topomgr: add more positive tests
This patch builds on the topology manager e2e infrastructure to
add more positive e2e test cases.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
1b5801a086 e2e: topomgr: add option to specify the SRIOV conf
We cannot anticipate all the possible configurations
needed by the SRIOV device plugin: there is too much variety.

Hence, we need to allow the test environment to supply
a host-specific ConfigMap to properly configure the device
plugin and avoid false negatives.

We still provide the default config map as fallback and reference.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
6687fcc78c e2e: topomgr: autodetect SRIOV resource to use
The SRIOV device plugin can create different resources depending
on both the hardware present on the system and the configuration.
As long as we have at least one SRIOV device, the tests don't actually
care about which specific device it is.

Previously, the test hardcoded the most common Intel SRIOV device
identifier. This patch lifts the restriction and lets the test
autodetect and use what's available.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
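The autodetection described in the 6687fcc78c entry can be pictured as a scan over the node's allocatable resources for anything that looks like an SRIOV resource. The sketch below is only an illustration under two assumptions: the resources are available as a name-to-quantity map, and a case-insensitive "sriov" substring identifies them. The function name and the matching rule are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findSRIOVResource returns the first resource (in sorted name order)
// whose name contains "sriov" and whose quantity is positive.
// The substring match is an assumed heuristic for this sketch.
func findSRIOVResource(allocatable map[string]int64) (string, int64, bool) {
	names := make([]string, 0, len(allocatable))
	for name := range allocatable {
		names = append(names, name)
	}
	sort.Strings(names) // make the choice deterministic

	for _, name := range names {
		if strings.Contains(strings.ToLower(name), "sriov") && allocatable[name] > 0 {
			return name, allocatable[name], true
		}
	}
	return "", 0, false
}

func main() {
	// Example data only; the resource names depend on the plugin config.
	name, amount, ok := findSRIOVResource(map[string]int64{
		"cpu":                             8,
		"intel.com/intel_sriov_netdevice": 4,
	})
	if !ok {
		fmt.Println("no SRIOV resource available")
		return
	}
	fmt.Println("using", name, "with", amount, "devices")
}
```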
Francesco Romani
fa26fb6817 e2e: topomgr: check pod resource alignment
This patch extends and completes the previously-added
empty topology manager test for single-NUMA node policy
by adding reporting in the test pod and checking
the resource alignment.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
cd7e3d626c e2e: topomgr: add test infra
This patch adds all the testing infra and utilities needed
to run e2e topology manager tests. This includes setting up
a guaranteed pod which needs some devices.

The simplest real devices available for the purpose
are the SRIOV devices, hence we use them.

This patch pulls the SRIOV device plugin from
the official, yet external, repository.
We do it as closely as possible to how it is done for the nvidia GPU plugin.

This patch also performs minor refactoring for some
test framework utilities, needed to support the new
e2e tests.

Finally, we add an empty e2e topology manager test,
to be completed by the next patch.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
1fdf262137 e2e: topomgr: explicitly save the kubelet config
For the sake of readability, save the old Kubelet config
once.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
tanjunchen
7ff3a1f8db test/e2e/framework: remove skip.go and use e2eskipper subpackage 2020-02-01 01:18:48 +08:00
vpickard
31b0d7f853 e2e-topology-manager: Fix package name
Change package name to e2enode

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:37:35 -05:00
vpickard
fba4a7be34 e2e-topology-manager: fixes for gofmt
Some cleanup for gofmt fixes

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:32:58 -05:00
vpickard
337fdf2f37 [WIP] e2e-topology-manager: Initial commit for E2E tests
This is the initial commit for E2E testing for Topology
Manager.

For now, run a subset of the CPU Manager tests.

Additional tests will be forthcoming.

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:32:58 -05:00