kubernetes

Author	SHA1	Message	Date
Francesco Romani	16d5ac3689	node: e2e: docs and fix for teardownSRIOVConfig Document why teardownSRIOVPod has to wait for all the containers to be gone before returning, and why this is important. Additionally, change the code to wait for all the containers to be gone, not just the first. This is both a little cleaner and a little safer, even though it seems the current code has caused no issues so far. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Francesco Romani	4e7434028c	e2e: node: bootstrap podresources tests Start e2e tests for the existing List() API. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:35 +01:00
Kubernetes Prow Robot	bd902db13d	Merge pull request #98342 from cynepco3hahue/e2e_move_delete_state_file_to_after_each e2e: move deleteState file to the AfterEach	2021-02-24 11:10:50 -08:00
pacoxu	a10bdfed09	fix all keps links 404 for kep folder migration Signed-off-by: pacoxu <paco.xu@daocloud.io>	2021-02-01 19:41:59 +08:00
Artyom Lukianov	97ac255513	e2e: move deleteState file to the AfterEach In the CPU manager and topology manager e2e tests, it is possible for one of the test steps to fail without cleaning up the CPU manager state file. Move the deletion of the state file to `AfterEach` to guarantee that the state file is always removed from the node. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-01-26 20:34:17 +02:00
Francesco Romani	56106439cf	node: e2e: bring up/down SRIOV DP just once The e2e topology manager tests want to test resource alignment using devices, and the easiest devices to use at the moment are SRIOV devices. The resource alignment test cases are run for each supported policy, in a loop. The tests manage the SRIOV device plugin; up until now, the plugin was set up and torn down on each loop iteration. There is no real need for that. Each loop must reconfigure (thus restart) the kubelet, but the device plugin can be set up and torn down just once for all the policies. The kubelet can reconnect just fine to a running device plugin. This way, we greatly reduce the interactions and the complexity of the test environment, making it easier to understand and more robust, and we trim some minutes from the execution time. However, this patch also hides (but does not solve) a test flake we observed on some environments. The issue is hard to reproduce and not well understood, but it seems to be caused by doing the sriov dp setup/teardown in each policy testing loop. Investigation so far suggests that the kubelet sometimes has stale state after the sriovdp teardown/setup cycle, leading to flakes and false negatives. We tried to address this in https://github.com/kubernetes/kubernetes/pull/95611 with no conclusive results yet. This patch was posted because overall we believe its gains exceed the drawbacks (hiding the aforementioned flake) and because understanding the potential interaction issues between the sriovdp and the kubelet deserves a separate test. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-11-13 10:04:31 +01:00
Pawel Rapacz	16c7bf4db4	Implement e2e tests for pod scope alignment A suite of e2e tests was created for Topology Manager so as to test pod scope alignment feature. Co-authored-by: Pawel Rapacz <p.rapacz@partner.samsung.com> Co-authored-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com> Signed-off-by: Cezary Zukowski <c.zukowski@samsung.com>	2020-11-12 12:25:55 +01:00
Francesco Romani	82a730f116	e2e: topomgr: fix ginkgo log Due to a rebase glitch the fmt.Sprintf() was lost. This patch restores it, improving the readability of the logs. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-10-19 19:28:01 +02:00
Francesco Romani	009b5356cb	e2e: node: topomgr: avoid plugin leak on test fail We need to make sure we tear down the sriov device plugin pod should the tests fail, to avoid leaking pods in the test environment. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-10-14 23:01:58 +02:00
Aaron Crickenberger	28768166f5	decouple testfiles from framework This drops testfiles.ReadOrDie and updates testfiles.Exists to return an error, forcing the caller to decide whether to call framework.Fail or do something else. It makes for a slightly less friendly API, but also means the package is decoupled from framework again, as per the comments at the top of the file	2020-06-29 14:54:09 -07:00
David Porter	f5b8c3d746	Mark Topology Manager Test as non-alpha and NodeFeature	2020-05-26 12:10:18 -07:00
drfish	dfab6b637f	Update .import-aliases for e2e test framework	2020-03-25 11:40:02 +08:00
Kubernetes Prow Robot	5708511499	Merge pull request #88708 from mikedanese/deleteopts Migrate clientset metav1.DeleteOpts to pass-by-value	2020-03-05 23:09:23 -08:00
Kubernetes Prow Robot	50dd75f9c5	Merge pull request #88773 from vpickard/e2e-topology-manager-sriovdpReady e2e-topology-manager: Wait for SR-IOV device plugin	2020-03-05 20:04:38 -08:00
Mike Danese	c58e69ec79	automated refactor	2020-03-05 14:59:46 -08:00
Kubernetes Prow Robot	1f2e1967d1	Merge pull request #88566 from Deepthidharwar/topology-mgr-numa-tests Enable running cpu-mgr-multiNUMA e2e tests with Topology manager	2020-03-05 05:38:37 -08:00
vpickard	61565b3f6c	e2e-topology-manager: Wait for SR-IOV device plugin Make sure the SR-IOV device plugin is ready, and that there are enough SR-IOV devices allocatable before spinning up test pods. Signed-off-by: vpickard <vpickard@redhat.com>	2020-03-04 10:07:35 -05:00
Deepthi Dharwar	1ede096465	Enable topology-manager-e2e tests to run on MultiNUMA nodes. Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>	2020-03-02 22:36:43 +05:30
Deepthi Dharwar	a4b59a5d7c	Currently the SRIOV detection logic reports an error if it fails to detect an SRIOV device on the system. This patch aims to fix that. Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>	2020-03-02 19:31:37 +05:30
Francesco Romani	64904d0ab8	e2e: topomgr: extend tests to all the policies Per https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0035-20190130-topology-manager.md#multi-numa-systems-tests we validate only the results for the single-numa-node policy, because there is no simple and reliable way to validate the allocation performed by the other policies. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-20 18:22:34 +01:00
Francesco Romani	a249b93687	e2e: topomgr: address reviewer comments Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-20 10:31:09 +01:00
Francesco Romani	833519f80b	e2e: topomgr: properly clean up after completion Due to an oversight, the e2e topology manager tests were leaking a configmap and a serviceaccount. This patch ensures a proper cleanup Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-19 17:15:42 +01:00
Francesco Romani	7c12251c7a	e2e: topomgr: add multi-container tests Add tests to check alignment of pods which contain more than one container. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-19 17:15:42 +01:00
Francesco Romani	8e9d76f1b9	e2e: topomgr: validate all containers in pod Up until now, the test validated the alignment of resources only in the first container in a pod. That was just an oversight. With this patch, we validate all the containers in a given pod. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-19 17:15:42 +01:00
Francesco Romani	ddc18eae67	e2e: topomgr: autodetect NUMA position of VF devs Add autodetection code to figure out which NUMA node the devices are attached to. This autodetection works under the assumption that all the VFs in the system must be used for the tests. Should this not be the case, or in general to handle non-trivial configurations, we keep the annotation mechanism added to the SRIOV device plugin config map. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-19 17:15:42 +01:00
Francesco Romani	bb6beb99e5	e2e: topomgr: early check to detect VFs, not PFs The e2e_node topology_manager check has an early, quick check to rule out systems without sriov devices, thus skipping the tests. The first version of the check detected PFs (Physical Functions), under the assumption that VFs (Virtual Functions) had already been created. This works because, obviously, you can't have VFs without PFs. However, it's a little safer and easier to understand if we check directly for VFs, bailing out on systems which don't provide them. Nothing changes for properly configured test systems. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-19 17:15:41 +01:00
Francesco Romani	70cce5e3f1	e2e: topomgr: introduce sriov setup/teardown funcs Reorganize the code with setup and teardown functions, to make room for the future addition of more device plugin support, and to make the code a bit tidier. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:54 +01:00
Francesco Romani	2f0a6d2c76	e2e: topomgr: use constants for test limits Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:54 +01:00
Francesco Romani	fee1dba054	e2r: topomgr: improve the test logs Add clarification to which test is doing what, to make the test output easier to understand. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:54 +01:00
Francesco Romani	83c344647f	e2e: topomgr: better check for AffinityError Add a helper function to check if a Pod failed admission for Topology Affinity Error. So far we only check the Status.Reason. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:54 +01:00
Francesco Romani	512a4e8a3e	e2e: topomgr: reduce node readiness timeout Five minutes was initially used only to be overcautious. From my experiments, the node is usually ready in less than a minute. Double it to give some buffer space. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:54 +01:00
Francesco Romani	3b4122bd03	e2e: topomgr: get and use topology hints from conf To properly implement some e2e tests, we need to know some basic topology facts about the system running the tests. The bare minimum we need to know is how many PCI SRIOV devices are attached to which NUMA node. This way we know which core we can reserve for kube services, and which NUMA socket we can take to test full socket reservation. To let the tests know the PCI device topology, we use annotations in the SRIOV device plugin ConfigMap, which we need anyway. The format is ```yaml metadata: annotations: pcidevice_node0: "2" pcidevice_node1: "0" ``` with one annotation per NUMA node in the system. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	d9d652e867	e2e: topomgr: initial negative tests A negative test is when we request a guaranteed Pod that we know the system cannot fulfill, hence we expect rejection from the topology manager. Unfortunately, besides the trivial case of excessive cores (request more socket than a NUMA node provides) we cannot easily test the devices, because crafting a proper pod would require detailed knowledge of the hw topology. Let's consider a hypothetical two-node NUMA system with two PCIe buses, one per NUMA node, with an SRIOV device on each bus. A proper negative test would require two SRIOV devices, which the system can provide, but not on the same single NUMA node. Requiring for example three devices (one more than the system provides) will lead to a different, legitimate admission error. For these reasons we bootstrap the testing infra for the negative tests, but we add just the simplest one. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	ee92b4aae0	e2e: topomgr: add more positive tests this patch builds on the topology manager e2e infrastructure to add more positive e2e test cases. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	1b5801a086	e2e: topomgr: add option to specify the SRIOV conf We cannot anticipate all the possible configurations needed by the SRIOV device plugin: there is too much variety. Hence, we need to allow the test environment to supply a host-specific ConfigMap to properly configure the device plugin and avoid false negatives. We still provide the default config map as fallback and reference. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	6687fcc78c	e2e: topomgr: autodetect SRIOV resource to use The SRIOV device plugin can create different resources depending on both the hardware present on the system and the configuration. As long as we have at least one SRIOV device, the tests don't actually care about which specific device it is. Previously, the test hardcoded the most common Intel SRIOV device identifier. This patch lifts the restriction and lets the test autodetect and use what's available. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	fa26fb6817	e2e: topomgr: check pod resource alignment This patch extends and completes the previously-added empty topology manager test for single-NUMA node policy by adding reporting in the test pod and checking the resource alignment. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	cd7e3d626c	e2e: topomgr: add test infra This patch adds all the testing infra and utilities needed to run e2e topology manager tests. This includes setting up a guaranteed pod which needs some devices. The simplest real devices available for the purpose are the SRIOV devices, hence we use them. This patch pulls the SRIOV device plugin from the official, yet external, repository. We do it as closely as possible to how it is done for the NVIDIA GPU plugin. This patch also performs minor refactoring of some test framework utilities, needed to support the new e2e tests. Finally, we add an empty e2e topology manager test, to be completed by the next patch. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
Francesco Romani	1fdf262137	e2e: topomgr: explicit save the kubelet config For the sake of readability, save the old Kubelet config once. Signed-off-by: Francesco Romani <fromani@redhat.com>	2020-02-10 22:47:53 +01:00
tanjunchen	7ff3a1f8db	test/e2e/framework: remove skip.go and use e2eskipper subpackage	2020-02-01 01:18:48 +08:00
vpickard	31b0d7f853	e2e-topology-manager: Fix package name Change package name to e2enode Signed-off-by: vpickard <vpickard@redhat.com>	2019-12-12 16:37:35 -05:00
vpickard	fba4a7be34	e2e-topology-manager: fixes for gofmt Some cleanup for gofmt fixes Signed-off-by: vpickard <vpickard@redhat.com>	2019-12-12 16:32:58 -05:00
vpickard	337fdf2f37	[WIP] e2e-topology-manager: Initial commit for E2E tests This is the initial commit for E2E testing for Topology Manager. For now, run a subset of the CPU Manager tests. Additional tests will be forthcoming. Signed-off-by: vpickard <vpickard@redhat.com>	2019-12-12 16:32:58 -05:00

43 Commits