Commit Graph

1886 Commits

Author SHA1 Message Date
RobertKielty
10bdcb8ef1 fixes 92907 improves test error output 2020-07-08 13:10:50 +01:00
Kubernetes Prow Robot
9eced04014
Merge pull request #91007 from lsytj0413/fix-89443
test(e2e_node): Parallelize prepulling all images in `e2e_node` tests
2020-07-08 04:19:07 -07:00
Seth Jennings
b1f91d33e7 add dashpole and sjenning to cmd/kubelet OWNERS 2020-07-02 14:20:41 -05:00
Amim Knabben
2a392bf8fc Fetching kubelet address:port from kubelet configuration 2020-07-02 14:15:44 -04:00
alejandrox1
4338053d8f Renamed image "white lists" to pre-pull image lists in test
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
2020-07-01 13:48:47 -04:00
Aaron Crickenberger
28768166f5 decouple testfiles from framework
This drops testfiles.ReadOrDie and updates testfiles.Exists to return an
error, forcing the caller to decide whether to call framework.Fail or do
something else.

It makes for a slightly less friendly API, but also means the package is
decoupled from framework again, as per the comments at the top of the
file
2020-06-29 14:54:09 -07:00
Kubernetes Prow Robot
a03db636da
Merge pull request #91366 from giuseppe/cgroupfs-cgroupv2
vendor: update google/cadvisor and opencontainers/runc
2020-06-26 04:17:31 -07:00
Kubernetes Prow Robot
4a91ecb976
Merge pull request #91863 from knabben/kubelet-memcg-notification
Moving Kubelet kernel-memcg-notification to configuration file
2020-06-25 00:20:37 -07:00
Giuseppe Scrivano
e94aebf4cb
pkg/kubelet: adapt to new libcontainer API
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-06-24 18:39:51 +02:00
Amim Knabben
c39cf28ed3 Moving Kubelet kernel-memcg-notification to configuration file 2020-06-24 06:44:00 -04:00
RobertKielty
5c27c7e304 renames CommonImageWhiteList to PrePulledImages in e2e/common/util
Part of work to remove racist language, this name change also improves on the
  semantics of this variable name as it was not actually a list of permissible
  images but rather a list of images that are required for e2e_node tests that
  are to be pre-pulled so that they are available prior to running e2e tests.

  Worth noting that this list of images is "union merged" with another list when
  setting up e2e_node tests and as such there is the possibility for overlap.

2020-06-21 18:46:06 +01:00
Kubernetes Prow Robot
219c856ce2
Merge pull request #91555 from daixiang0/scr
*.sh: cleanup all white noise
2020-06-20 05:26:53 -07:00
Warren Fernandes
296f50365b Rename NodeImageWhiteList to NodePrePullImageList 2020-06-19 16:12:27 -06:00
Kubernetes Prow Robot
62d091a49e
Merge pull request #91813 from bart0sh/PR0090-e2e_node-benchmark-decrease-number-of-pods
e2e_node: fix node-kubelet-benchmark test
2020-06-19 11:36:43 -07:00
Sergey Kanzhelev
2baed83b5c remove stale TODO after this PR: #92204 2020-06-18 22:55:21 +00:00
Kubernetes Prow Robot
99019502bd
Merge pull request #92234 from alejandrox1/add-cleanup-time-node-perf
Added a buffer period in the node performance tests
2020-06-18 06:04:10 -07:00
alejandrox1
9263dd1f02 Added a buffer period in the node performance tests
The node-kubelet-flaky e2e job that runs the
`Node Performance Testing [Serial] [Slow] [Flaky]` e2e tests has been
flaking because of inconsistencies in the CPU manager checkpoint file.
This seems to happen because the checkpoint file is deleted (which is
what needs to happen in order to change the CPU manager policy used
for these e2e tests) right after the e2e test asserts that a pod
does not exist anymore.
However, after a pod is deleted, the CPU manager may still be cleaning
up the resources used by the pod which may result in the checkpoint file
being created.
Whenever this happened, the kubelet would panic if we then try to
subsequently change the CPU manager policy to "static" from "none" or
vice versa (this is done 4 times in these tests).

Signed-off-by: alejandrox1 <alarcj137@gmail.com>
2020-06-17 18:33:44 -04:00
Davanum Srinivas
01183e51f0
Check for either Docker or Containerd getting active for e2e_node tests
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-06-16 20:08:01 -04:00
Kubernetes Prow Robot
410b929e78
Merge pull request #91471 from MHBauer/rm-old-config
remove out of date test config
2020-06-10 04:39:07 -07:00
Artyom Lukianov
a4b367a6a3 Refactor and add new tests to hugepages e2e tests
Add tests to cover usage of multiple hugepages with different
page sizes under the same pod.

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2020-06-08 11:23:54 +03:00
Kubernetes Prow Robot
4c8e5c5a50
Merge pull request #91543 from bsdnet/runner
Simplify the logic by removing dead code and enhance logging
2020-06-06 17:29:45 -07:00
Aaron Crickenberger
019d3ee438 update e2e_node OWNERS file
specifically:
- move inactive approvers to emeritus
- add newly active contributors as reviewers
- add a sig/node label to PRs that touch this dir
2020-06-05 10:44:29 -07:00
Ed Bartosh
fa31c2c59c e2e_node: fix node-kubelet-benchmark test
e2e_node tests trigger OOM events on COS versions > 73-11636-0-0
possibly because of this change in the COS v.73-11636-0-0:
 Made containerd run as a standalone systemd service

OOM killer usually kills cadvisor and e2e_node.test processes
causing node-kubelet-benchmark failures.

Decreasing the number of pods from 105 to 90 frees enough memory for
the test to succeed.
2020-06-05 12:51:45 +03:00
Kubernetes Prow Robot
8ce1b535ee
Merge pull request #80831 from odinuge/hugetlb-pagesizes-cleanup
Add support for removing unsupported huge page sizes
2020-06-04 23:41:43 -07:00
Roy Yang
d79f0c6b39 Simplify the logic by removing dead code and enhance logging
Signed-off-by: Roy Yang <royyang@google.com>
2020-06-01 23:43:04 -07:00
Kubernetes Prow Robot
5592b5d67a
Merge pull request #91470 from MHBauer/fail-0-remote-images
explicitly fail if no images are found when running remote tests
2020-06-01 20:58:14 -07:00
Kubernetes Prow Robot
46d08c89ab
Merge pull request #91363 from alejandrox1/tune-node-perf-workloads
Tuned npb is workload resources
2020-05-30 23:25:53 -07:00
lsytj0413
64094899b0 test(e2e_node): Parallelize prepulling all images in e2e_node tests 2020-05-30 18:07:23 +08:00
Xiang Dai
e09bc312cb *.sh: cleanup all white noise
Signed-off-by: Xiang Dai <long0dai@foxmail.com>
2020-05-29 09:56:00 +08:00
Ed Bartosh
e6192a87af Remove unused e2e test image config
node_e2e tests use benchmark image configs from the test-infra repository.
This one is outdated and not used anywhere.
2020-05-28 14:44:31 +03:00
Kubernetes Prow Robot
6b15b1f4a6
Merge pull request #91467 from bobbypage/topology-manager-test
Mark Topology Manager Test as non-alpha and NodeFeature
2020-05-26 16:49:14 -07:00
Kubernetes Prow Robot
30eeacbf22
Merge pull request #91384 from alejandrox1/alejandrox1-patch-1
Added cadvisor test suite to flag info message
2020-05-26 16:48:56 -07:00
Morgan Bauer
a9b999c00d
remove out of date test config 2020-05-26 14:46:38 -07:00
Morgan Bauer
58924c2de5
explicitly fail if no images are found when running remote tests
The previous implementation succeeds if no images are run. This causes
silent failures when image matchers are provided that do not match any image.
2020-05-26 14:08:27 -07:00
David Porter
f5b8c3d746 Mark Topology Manager Test as non-alpha and NodeFeature 2020-05-26 12:10:18 -07:00
Jorge Alarcon Ochoa
a069eec2bb Added cadvisor test suite to flag info message
The cadvisor test suite is not mentioned in the remote runner's
`--test-suite` flag.
This PR will mention the existence of the cadvisor test suite.
2020-05-23 19:10:32 -04:00
Stephen Augustus
b692502a9d Update CNI to v0.8.6
Signed-off-by: Stephen Augustus <saugustus@vmware.com>
2020-05-22 17:48:56 -04:00
alejandrox1
ebd84a5517 Tuned npb is workload resources
Lowering the amount of CPU allocated to this workload makes its
resource allocation similar to that of the other npb and tf workloads in
these tests.
This will also allow running all three workloads on an n1-standard-12 GCP
instance, which has 16 cpus and 60 GB.

Signed-off-by: alejandrox1 <alarcj137@gmail.com>
2020-05-22 09:30:43 -04:00
Davanum Srinivas
0608e8be25
update bazel BUILD files
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-20 10:57:47 -04:00
Davanum Srinivas
5692926914
Move packages for slightly better UX for consumers
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-20 10:57:46 -04:00
Davanum Srinivas
07d88617e5
Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Giuseppe Scrivano
b21b1a5436
test, e2e_node: drop superfluous systemd properties
commit 43c56eb403 introduced a change
where CPUAccounting, CPUAccounting and TasksAccounting are enabled for
the systemd service.

It causes a regression on RHEL 7.8, where systemd-run doesn't allow
setting TasksAccounting.

Since Delegate= already enables all the controllers, it is superfluous
to specify them.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-05-04 09:22:35 +02:00
Guangming Wang
e92a91eb72 cleanup: no need nil check before range 2020-04-27 22:12:12 +08:00
Kubernetes Prow Robot
a62cfe8451
Merge pull request #75111 from tnozicka/fix-e2e-watches
Fix watches in e2e tests
2020-04-23 19:20:07 -07:00
Tomas Nozicka
c62db98e95 Update Bazel 2020-04-23 17:27:13 +02:00
Tomas Nozicka
d0a4c52392 Fix watches in apparmor e2e 2020-04-23 17:26:28 +02:00
tanjunchen
de99aaf8d2 test/e2e_node/gpu_device_plugin_test.go:Remove prometheus dependencies from k/k 2020-04-23 14:23:43 +08:00
Kubernetes Prow Robot
d92fdebd85
Merge pull request #89897 from giuseppe/test-e2e-node
kubelet: fix e2e-node cgroups test on cgroup v2
2020-04-20 15:54:12 -07:00
Ed Bartosh
88478f3749 e2e_node: check if image exists locally before pulling
'docker pull' is a time consuming operation. It makes sense to check
if image exists locally before pulling it from a registry.

Checked if image exists by running 'docker inspect'. Only pull if
image doesn't exist.
2020-04-20 12:27:29 +03:00
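The inspect-before-pull logic from commit 88478f3749 can be sketched roughly as below. This is an illustration under stated assumptions, not the test suite's actual code: the `runner` type, `ensureImage`, `execRunner`, and `errNoSuchImage` are hypothetical names, and the sketch relies on `docker inspect` exiting with a non-zero status when the image is not present locally.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// errNoSuchImage is a hypothetical error returned by the fake runner below.
var errNoSuchImage = errors.New("no such image")

// runner runs a command and returns a non-nil error if it fails.
type runner func(name string, args ...string) error

// execRunner runs the command for real through os/exec.
func execRunner(name string, args ...string) error {
	return exec.Command(name, args...).Run()
}

// ensureImage pulls image only when "docker inspect" cannot find it locally.
// It reports whether a pull was attempted.
func ensureImage(run runner, image string) (bool, error) {
	if err := run("docker", "inspect", image); err == nil {
		return false, nil // already present, skip the slow pull
	}
	if err := run("docker", "pull", image); err != nil {
		return true, fmt.Errorf("pulling %s: %w", image, err)
	}
	return true, nil
}

func main() {
	// fake stands in for the docker binary so the sketch runs anywhere.
	local := map[string]bool{"busybox": true}
	fake := func(name string, args ...string) error {
		if args[0] == "inspect" && !local[args[1]] {
			return errNoSuchImage
		}
		return nil
	}
	for _, image := range []string{"busybox", "nginx"} {
		pulled, err := ensureImage(fake, image)
		fmt.Println(image, "pulled:", pulled, "err:", err)
	}
	_ = execRunner // pass execRunner instead of fake to use the real binary
}
```

Injecting the runner keeps the decision logic testable without a Docker daemon.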
tanjunchen
f76da50c7d test/e2e/framework/util.go:move DsFromManifest to test/e2e/framework/manifest , and rename it to DaemonSetFromURL 2020-04-14 09:54:41 +08:00
Giuseppe Scrivano
43c56eb403
e2e_node: adapt tests to cgroup v2
and fix node_container_manager_test to run with the systemd cgroup
manager.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-09 16:18:05 +02:00
Andrew Sy Kim
2e56866c97 move apparmor annotation constants to k8s.io/api/core/v1
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
2020-04-06 10:22:04 -04:00
Kubernetes Prow Robot
4e9dd8fd36
Merge pull request #89454 from gavinfish/import-aliases
Update .import-aliases for e2e test framework
2020-03-27 14:35:54 -07:00
张潇
a6d0f8e3dc
beta.kubernetes.io/arch is already deprecated 2020-03-25 13:34:43 +08:00
drfish
dfab6b637f Update .import-aliases for e2e test framework 2020-03-25 11:40:02 +08:00
Kubernetes Prow Robot
89dfebb214
Merge pull request #89359 from gongguan/process
eviction by process number
2020-03-24 15:27:25 -07:00
louisgong
0efb70c0a2 eviction by process number 2020-03-24 09:25:04 +08:00
tanjunchen
bed22fbb44 WaitForPodReady is simply a wrapper function for the e2epod package,
and it created an invalid dependency from the core framework on an e2e framework subpackage.

So we can use e2epod.WaitTimeoutForPodReadyInNamespace to remove the invalid dependency.

The main purpose of this PR is to remove the core framework package's dependency on the pod subpackage.
2020-03-22 23:08:52 +08:00
Odin Ugedal
a233b9aab0
Add verbose message when more than one kubelet is running 2020-03-19 13:08:08 +01:00
Odin Ugedal
8b6160a367
Add support for stopping kubelet in node-e2e
This makes it possible to stop the kubelet, do some work, and then
start it again.
2020-03-19 13:08:08 +01:00
Odin Ugedal
2830827442
Add support for removing unsupported huge page sizes
When kubelet is restarted, it will now remove the resources for huge
page sizes no longer supported. This is required when:
- node disables huge pages
- changing the default huge page size in older versions of linux
(because it will then only support the newly set default).
- Software updates that change what sizes are supported (eg. by changing
boot parameters).
2020-03-19 13:08:08 +01:00
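As a rough illustration of the cleanup described in commit 2830827442, the sketch below drops huge page resources whose size is no longer supported. It assumes resources are keyed by names such as `hugepages-2Mi`; the function `pruneHugePages` and the plain `map[string]int64` representation are hypothetical simplifications, not the kubelet's real types.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// hugePagesPrefix is the resource-name prefix, as in "hugepages-2Mi".
const hugePagesPrefix = "hugepages-"

// pruneHugePages returns a copy of capacity without huge page resources whose
// page size is not in supported, plus the sorted names that were dropped.
func pruneHugePages(capacity map[string]int64, supported map[string]bool) (map[string]int64, []string) {
	kept := make(map[string]int64, len(capacity))
	var removed []string
	for name, quantity := range capacity {
		isHuge := strings.HasPrefix(name, hugePagesPrefix)
		if isHuge && !supported[strings.TrimPrefix(name, hugePagesPrefix)] {
			removed = append(removed, name)
			continue
		}
		kept[name] = quantity
	}
	sort.Strings(removed)
	return kept, removed
}

func main() {
	// Hypothetical node state recorded before a restart.
	capacity := map[string]int64{"cpu": 8, "hugepages-2Mi": 512, "hugepages-1Gi": 4}

	// After the restart only 2Mi pages are supported.
	kept, removed := pruneHugePages(capacity, map[string]bool{"2Mi": true})
	fmt.Println("kept:", kept)
	fmt.Println("removed:", removed)
}
```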
Kubernetes Prow Robot
5708511499
Merge pull request #88708 from mikedanese/deleteopts
Migrate clientset metav1.DeleteOpts to pass-by-value
2020-03-05 23:09:23 -08:00
Kubernetes Prow Robot
50dd75f9c5
Merge pull request #88773 from vpickard/e2e-topology-manager-sriovdpReady
e2e-topology-manager: Wait for SR-IOV device plugin
2020-03-05 20:04:38 -08:00
Kubernetes Prow Robot
e23e7204f2
Merge pull request #88558 from egernst/e2e_node-PodOverhead
e2e node pod overhead
2020-03-05 20:04:11 -08:00
Mike Danese
76f8594378 more artisanal fixes
Most of these could have been refactored automatically but it would
have been uglier. The unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
2020-03-05 14:59:47 -08:00
Mike Danese
aaf855c1e6 deref all calls to metav1.NewDeleteOptions that are passed to clients.
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to cleanup, but I want to minimize apiserver side
changes as much as possible in the client signature refactor.
2020-03-05 14:59:46 -08:00
Mike Danese
c58e69ec79 automated refactor 2020-03-05 14:59:46 -08:00
Kubernetes Prow Robot
1f2e1967d1
Merge pull request #88566 from Deepthidharwar/topology-mgr-numa-tests
Enable running cpu-mgr-multiNUMA e2e tests with Topology manager
2020-03-05 05:38:37 -08:00
vpickard
61565b3f6c e2e-topology-manager: Wait for SR-IOV device plugin
Make sure the SR-IOV device plugin is ready, and that
there are enough SR-IOV devices allocatable before
spinning up test pods.

Signed-off-by: vpickard <vpickard@redhat.com>
2020-03-04 10:07:35 -05:00
Deepthi Dharwar
1ede096465 Enable topology-manager-e2e tests to run on MultiNUMA nodes.
Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>
2020-03-02 22:36:43 +05:30
Deepthi Dharwar
4abbce4549 Refactor CPUManager-e2e-tests so that they can be reused by topology-manager-e2e-testsuite.
Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>
2020-03-02 22:36:31 +05:30
Deepthi Dharwar
a4b59a5d7c Currently SRIOV detection logic is reporting error if it fails to detect SRIOV device
on the system. This patch aims to fix the same.

Signed-off-by: Deepthi Dharwar <ddharwar@redhat.com>
2020-03-02 19:31:37 +05:30
Eric Ernst
aa12e1f8c4 e2e_node add test for PodOverhead feature
This test will verify that the Pod cgroup created takes Overhead into
account.

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
2020-02-28 23:00:39 +00:00
Kubernetes Prow Robot
624da8b9a3
Merge pull request #88110 from fromanirh/refactor-get-current-kubelet-conf
e2e: e2e_node: refactor getCurrentKubeletConfig
2020-02-24 13:11:36 -08:00
Kubernetes Prow Robot
26f8535838
Merge pull request #88354 from jiahuif/nodee2e-stack-protector-flags
node-e2e testing: fix alias for stack protector kernel config.
2020-02-21 18:31:51 -08:00
Jiahui Feng
68b7564e7e fix alias for stack protector kernel config.
- fix YAML syntax
- alias -> aliases
- no need for CONFIG prefix
- add renamed config since 4.18
2020-02-21 11:04:48 -08:00
Kubernetes Prow Robot
57764e34d4
Merge pull request #87921 from Deepthidharwar/cpu-mgr-e2etest-NUMA-nodes
e2e test CPU-Manager: Extend CPUManager e2e tests to run on MultiNUMA node with/without HT
2020-02-20 16:29:57 -08:00
Kubernetes Prow Robot
3ae1b0ce80
Merge pull request #88234 from fromanirh/topomgr-e2e-tests-multicnt
e2e topology manager: single-numa-node multi container tests
2020-02-20 10:35:56 -08:00
Francesco Romani
64904d0ab8 e2e: topomgr: extend tests to all the policies
Per https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/0035-20190130-topology-manager.md#multi-numa-systems-tests
we validate only the results for the single-numa node policy,
because there is no simple and reliable way to validate
the allocation performed by the other policies.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-20 18:22:34 +01:00
Francesco Romani
a249b93687 e2e: topomgr: address reviewer comments
Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-20 10:31:09 +01:00
Kubernetes Prow Robot
937008e3ac
Merge pull request #81226 from claudiubelu/tests/reduce-to-agnhost-part-4
tests: Replaces images used with agnhost (part 4)
2020-02-20 01:13:03 -08:00
Francesco Romani
833519f80b e2e: topomgr: properly clean up after completion
Due to an oversight, the e2e topology manager tests
were leaking a configmap and a serviceaccount.
This patch ensures a proper cleanup

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
7c12251c7a e2e: topomgr: add multi-container tests
Add tests to check alignment of pods which contains more than one
container.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
8e9d76f1b9 e2e: topomgr: validate all containers in pod
Up until now, the test validated the alignment of resources
only in the first container in a pod. That was just an oversight.
With this patch, we validate all the containers in a given pod.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
ddc18eae67 e2e: topomgr: autodetect NUMA position of VF devs
Add autodetection code to figure out which NUMA node
the devices are attached to.
This autodetection works under the assumption that all the VFs in
the system must be used for the tests.
Should this not be the case, or in general to handle non-trivial
configurations, we keep the annotations mechanism added to the
SRIOV device plugin config map.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:42 +01:00
Francesco Romani
0c2827cb50 e2e: topomgr: remove single-numa node hack
On single-NUMA node systems the numa_node of sriov devices was
sometimes reported as "-1" instead of, say, 0. This makes some
tests that should succeed[0] fail unexpectedly.

The reporting works as expected on real multi-NUMA node systems.

This small workaround was added to handle this corner case,
but it makes the code less readable overall and a bit too lenient,
hence we remove it.

+++

[0] on a single NUMA node system some resources are obviously
always aligned if the pod can be admitted. It boils down to the
node capacity at pod admittal time.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:41 +01:00
Francesco Romani
bb6beb99e5 e2e: topomgr: early check to detect VFs, not PFs
The e2e_node topology_manager tests have an early, quick check
to rule out systems without SRIOV devices, thus skipping the tests.

The first version of the check detected PFs (Physical Functions),
under the assumption that VFs (Virtual Functions) had already been
created. This works because, obviously, you can't have VFs without PFs.

However, it's a little safer and easier to understand if we check
directly for VFs, bailing out from systems which don't provide them.

Nothing changes for properly configured test systems.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-19 17:15:41 +01:00
Claudiu Belu
f7942290af tests: Replaces images used with agnhost (part 4)
Quite a few images are only used a few times in a few tests. Thus,
the images are being centralized into the agnhost image, reducing
the number of images that have to be pulled and used.

This PR replaces the usage of the following images with agnhost:

- resource-consumer-controller
- test-webserver
2020-02-18 16:29:49 -08:00
Francesco Romani
53cda47913 update bazel configuration
Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-16 10:07:12 +01:00
Kubernetes Prow Robot
4a45ae3236
Merge pull request #87645 from fromanirh/topomgr-e2e-tests
e2e-topology-manager: single-NUMA-node test
2020-02-14 08:06:19 -08:00
Kubernetes Prow Robot
4e8a7f4a4b
Merge pull request #87355 from mattjmcnaughton/mattjmcnaughton/remove-unnecessary-sudo-from-e2e
Clean up TODO around running test as sudo
2020-02-14 06:10:17 -08:00
Francesco Romani
bb770c0325 e2e: getCurrentKubeletConfig: move in subpkg
Address review comments and move the helper function
in the `framework/kubelet` package to avoid circular deps
(see https://github.com/kubernetes/kubernetes/issues/81245)

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-14 10:51:08 +01:00
Deepthi Dharwar
3342e4a09a Extend CPUManager e2e tests to run on MultiNUMA node with/without HT 2020-02-14 07:59:13 +05:30
Francesco Romani
08ba240c6b e2e: e2e_node: refactor getCurrentKubeletConfig
this patch moves the helper getCurrentKubeletConfig function,
used in both e2e and e2e_node tests and previously duplicated,
in the common framework.

There are no intended changes in behaviour.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-13 12:53:15 +01:00
Kubernetes Prow Robot
921ef35e64
Merge pull request #87949 from 928234269/non_ascii_01
Fix non-ascii characters in test/e2e_node and test/network.
2020-02-10 17:22:01 -08:00
Francesco Romani
70cce5e3f1 e2e: topomgr: introduce sriov setup/teardown funcs
Reorganize the code with setup and teardown functions,
to make room for the future addition of more device plugin
support, and to make the code a bit tidier.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
2f0a6d2c76 e2e: topomgr: use constants for test limits
Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
fee1dba054 e2e: topomgr: improve the test logs
Add clarification to which test is doing what, to make
the test output easier to understand.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
83c344647f e2e: topomgr: better check for AffinityError
Add a helper function to check if a Pod failed
admission for Topology Affinity Error.
So far we only check the Status.Reason.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
512a4e8a3e e2e: topomgr: reduce node readiness timeout
Five minutes was initially used only to be overcautious.
From my experiments, the node is usually ready in less than a minute.
Double it to give some buffer space.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:54 +01:00
Francesco Romani
3b4122bd03 e2e: topomgr: get and use topology hints from conf
To properly implement some e2e tests, we need to know
some basic topology facts about the system running the tests.
The bare minimum we need to know is how many PCI SRIOV devices
are attached to which NUMA node.

This way we know which core we can reserve for kube services,
and which NUMA socket we can take to test full socket reservation.

To let the tests know the PCI device topology, we use annotations
in the SRIOV device plugin ConfigMap we need anyway.
The format is

```yaml
  metadata:
    annotations:
      pcidevice_node0: "2"
      pcidevice_node1: "0"
```

with one annotation per NUMA node in the system.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
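A test could read the annotation format from commit 3b4122bd03 along the lines of the Go sketch below. The function name `devicesPerNUMANode` is hypothetical, and the annotations are passed as a plain map rather than a real ConfigMap object; only the `pcidevice_node<N>` key format comes from the commit message.

```go
package main

import (
	"fmt"
	"strconv"
)

// devicesPerNUMANode reads the annotations named pcidevice_node<N>, for N in
// [0, numaNodes), and returns the PCI device count of each NUMA node.
func devicesPerNUMANode(annotations map[string]string, numaNodes int) ([]int, error) {
	counts := make([]int, numaNodes)
	for node := 0; node < numaNodes; node++ {
		key := fmt.Sprintf("pcidevice_node%d", node)
		value, ok := annotations[key]
		if !ok {
			return nil, fmt.Errorf("missing annotation %q", key)
		}
		count, err := strconv.Atoi(value)
		if err != nil || count < 0 {
			return nil, fmt.Errorf("annotation %q: invalid device count %q", key, value)
		}
		counts[node] = count
	}
	return counts, nil
}

func main() {
	// Same values as the YAML example in the commit message.
	annotations := map[string]string{
		"pcidevice_node0": "2",
		"pcidevice_node1": "0",
	}
	counts, err := devicesPerNUMANode(annotations, 2)
	fmt.Println(counts, err)
}
```

Requiring one annotation per NUMA node, as the commit describes, makes a missing entry an error rather than an implicit zero.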
Francesco Romani
d9d652e867 e2e: topomgr: initial negative tests
A negative test is one where we request a guaranteed (gu) Pod that we know
the system cannot fulfill, hence we expect rejection from the topology manager.

Unfortunately, besides the trivial case of excessive cores (requesting
more cores than a NUMA node provides) we cannot easily test the
devices, because crafting a proper pod will require detailed knowledge
of the hw topology.

Let's consider a hypothetical two-node NUMA system with two PCIe busses,
one per NUMA node, with a SRIOV device on each bus.
A proper negative test would require two SRIOV devices, which the system
can provide but not on the same single NUMA node.
Requiring for example three devices (one more than the system provides)
will lead to a different, legitimate admission error.

For these reasons we bootstrap the testing infra for the negative tests,
but we add just the simplest one.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
ee92b4aae0 e2e: topomgr: add more positive tests
this patch builds on the topology manager e2e infrastructure to
add more positive e2e test cases.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
1b5801a086 e2e: topomgr: add option to specify the SRIOV conf
We cannot anticipate all the possible configurations
needed by the SRIOV device plugin: there is too much variety.

Hence, we need to allow the test environment to supply
a host-specific ConfigMap to properly configure the device
plugin and avoid false negatives.

We still provide the default config map as fallback and reference.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
6687fcc78c e2e: topomgr: autodetect SRIOV resource to use
The SRIOV device plugin can create different resources depending
on both the hardware present on the system and the configuration.
As long as we have at least one SRIOV device, the tests don't actually
care about which specific device it is.

Previously, the test hardcoded the most common Intel SRIOV device
identifier. This patch lifts the restriction and lets the test
autodetect and use what's available.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
fa26fb6817 e2e: topomgr: check pod resource alignment
This patch extends and completes the previously-added
empty topology manager test for single-NUMA node policy
by adding reporting in the test pod and checking
the resource alignment.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
cd7e3d626c e2e: topomgr: add test infra
This patch adds all the testing infra and utilities needed
to run e2e topology manager tests. This includes setting up
a guaranteed pod which needs some devices.

The simplest real devices available for the purpose
are the SRIOV devices, hence we use them.

This patch pulls the SRIOV device plugin from
the official, yet external, repository.
We do it as closely as possible to how it is done for the nvidia GPU plugin.

This patch also performs minor refactoring for some
test framework utilities, needed to support the new
e2e tests.

Finally, we add an empty e2e topology manager test,
to be completed by the next patch.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Francesco Romani
1fdf262137 e2e: topomgr: explicit save the kubelet config
For the sake of readability, save the old Kubelet config
once.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-02-10 22:47:53 +01:00
Mike Danese
25651408ae generated: run refactor 2020-02-08 12:30:21 -05:00
Sakura
44bf3475ea
Fix non-ascii characters in test/e2e_node and test/network.
Signed-off-by: Sakura <longfei.shang@daocloud.io>
2020-02-08 17:47:19 +08:00
Mike Danese
2637772298 some manual fixes 2020-02-07 18:17:40 -08:00
Mike Danese
3aa59f7f30 generated: run refactor 2020-02-07 18:16:47 -08:00
tanjunchen
7ff3a1f8db test/e2e/framework: remove skip.go and use e2eskipper subpackage 2020-02-01 01:18:48 +08:00
Kubernetes Prow Robot
b32725b80b
Merge pull request #86413 from ohsewon/e2e_typo_fix
Fix cpu manager e2e test typo
2020-01-30 10:31:48 -08:00
Kubernetes Prow Robot
89714227ff
Merge pull request #78819 from justaugustus/cni
cni: Update CNI version to v0.8.5
2020-01-30 02:12:14 -08:00
Mike Danese
d55d6175f8 refactor 2020-01-29 08:50:45 -08:00
Stephen Augustus
1174e6698e cni: Update CNI version to v0.8.5
Signed-off-by: Stephen Augustus <saugustus@vmware.com>
2020-01-29 04:41:29 -05:00
Stephen Augustus
96f2588b61 cni: Update CNI download URLs to use new GCS bucket (k8s-artifacts-cni)
Signed-off-by: Stephen Augustus <saugustus@vmware.com>
2020-01-29 02:32:22 -05:00
Kubernetes Prow Robot
4fc5254c2f
Merge pull request #87456 from mattjmcnaughton/mattjmcnaughton/delete-todo-to-use-docker-client
Delete TODO to use docker client
2020-01-22 20:37:37 -08:00
Kubernetes Prow Robot
a06d16565c
Merge pull request #86184 from vpickard/e2e-topologyManager
e2e-topology-manager: Initial commit for E2E tests
2020-01-22 08:41:14 -08:00
mattjmcnaughton
d6d08b152e
Delete TODO to use docker client
Re conversation in https://github.com/kubernetes/kubernetes/pull/87373,
we should keep the current behavior (i.e. using the docker binary
instead of the docker client). Delete the TODO instructing us to change
the behavior.
2020-01-22 08:45:07 -05:00
mattjmcnaughton
16de853c5d
Clean up TODO around running test as sudo
Re the TODO, this command no longer needs to be prefixed by `sudo`, as
the test is already running as `root`.
2020-01-18 13:36:57 -05:00
Davanum Srinivas
d7d316e1e7
switch to docker command line 2020-01-17 21:08:13 -05:00
Kubernetes Prow Robot
9277bac9b8
Merge pull request #87003 from odinuge/node-e2e-instance-failure
Add error check for instance insert in node e2e
2020-01-16 13:14:45 -08:00
zouyee
c1de3d6e5b fix ci-kubernetes-node-kubelet-serial Non-system critical priority classes are not allowed to have a value larger than HighestUserDefinablePriority
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2020-01-14 09:43:04 +08:00
Odin Ugedal
c04ead5fb1
Add error check for instance insert
Not all errors will happen in sync during Instances.Insert(...).Do(), so
it is important to verify the operation object to see why insert fails.
An example is when exceeding the resource quota.

Eg.
could not create instance test-cos-beta-80-12739-29-0: [&{Code:QUOTA_EXCEEDED Location: Message:Quota 'CPUS' exceeded.  Limit: 24.0 in region europe-west6. ForceSendFields:[] NullFields:[]}

This fixes the issue where tests will fail "silently" when instance
insert fails.
2020-01-09 09:49:38 +01:00
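The two-stage check from commit c04ead5fb1 can be sketched as follows. The `operation` and `opError` types are simplified, hypothetical stand-ins for the object an insert call returns, and `checkInsert` is an illustrative name; the point is only that a nil call error is not enough, because errors such as an exceeded quota are reported on the operation itself.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// opError and operation are simplified, hypothetical stand-ins for the
// operation object returned by an instance insert call.
type opError struct {
	Code    string
	Message string
}

type operation struct {
	Instance string
	Errors   []opError
}

// checkInsert returns an error when the call itself failed or when the
// returned operation carries errors that were reported asynchronously.
func checkInsert(op *operation, callErr error) error {
	if callErr != nil {
		return fmt.Errorf("insert call failed: %w", callErr)
	}
	if op == nil {
		return errors.New("insert returned no operation")
	}
	if len(op.Errors) > 0 {
		messages := make([]string, 0, len(op.Errors))
		for _, e := range op.Errors {
			messages = append(messages, e.Code+": "+e.Message)
		}
		return fmt.Errorf("could not create instance %s: %s",
			op.Instance, strings.Join(messages, "; "))
	}
	return nil
}

func main() {
	failed := &operation{
		Instance: "test-instance",
		Errors:   []opError{{Code: "QUOTA_EXCEEDED", Message: "Quota 'CPUS' exceeded."}},
	}
	fmt.Println(checkInsert(failed, nil))
	fmt.Println(checkInsert(&operation{Instance: "ok-instance"}, nil))
}
```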
Kubernetes Prow Robot
9781bb60e0
Merge pull request #86438 from klueska/upstream-e2e-approver
Add klueska as an approver in test/e2e_node/OWNERS
2020-01-06 11:12:16 -08:00
tanjunchen
fc3b210ad8 if no cycle dependency , use framework in test/e2e_node subpackage 2020-01-02 15:52:05 +08:00
Kenichi Omichi
52ddae0267 Remove Delete/CreateSyncInNamespace()
DeleteSyncInNamespace() was used at an e2e node test and DeleteSync()
only. In addition, the part of the e2e node test can be replaced with
DeleteSync(). CreateSyncInNamespace() is the same thing and can be
replaced with CreateSync(). So this replaces these functions and
removes them for the cleanup.
2019-12-30 18:59:42 +00:00
Kubernetes Prow Robot
a097243cba
Merge pull request #86062 from haosdent/clean-e2e-framework-gpu
e2e: move funs of framework/gpu to e2e_node
2019-12-28 21:23:39 -08:00
danielqsj
6596a14d39 add missing alias of api errors under test 2019-12-26 17:29:38 +08:00
Kubernetes Prow Robot
9fa1e00be9
Merge pull request #83437 from matthyx/startupprobe-beta
Promote StartupProbe to beta for 1.18
2019-12-20 00:59:32 -08:00
Kevin Klues
c60802e893 Add klueska as an approver in test/e2e_node/OWNERS 2019-12-19 15:37:03 +01:00
Kubernetes Prow Robot
4e35750abc
Merge pull request #86156 from tanjunchen/use-framework-Equal-test-e2e_node
test/e2e_node/:use framework.Equal() instead of using gomega.Expect(b…
2019-12-19 02:39:56 -08:00
Kubernetes Prow Robot
2f39e7304d
Merge pull request #86119 from haosdent/clean-e2e-framework-metrics
e2e: move funs of framework/metrics to e2e_node
2019-12-19 00:37:56 -08:00
sewon.oh
745248dd6f
Fix cpu manager e2e test typo
Signed-off-by: sewon.oh <sewon.oh@samsung.com>
2019-12-19 13:41:05 +09:00
Kubernetes Prow Robot
a1fc96f41e
Merge pull request #84462 from klueska/upstream-cpu-manager-update-state-semantics
Update CPUManager stored state semantics
2019-12-17 12:00:12 -08:00
Haosdent Huang
973fddd155 e2e: move funs of framework/gpu to e2e_node 2019-12-16 00:53:01 +08:00
Haosdent Huang
4536ed50a0 e2e: move funs of framework/deviceplugin to e2e_node 2019-12-16 00:46:56 +08:00
Haosdent Huang
8d3a8d5a6c e2e: move funs of framework/metrics to e2e_node 2019-12-16 00:27:58 +08:00
Matthias Bertschy
6603f41a13 Promote StartupProbe to beta for 1.18 2019-12-15 14:49:34 +01:00
vpickard
0e644c8749 e2e-topology-manager: Fix bazel tests
Fix some tests

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 19:52:59 -05:00
vpickard
31b0d7f853 e2e-topology-manager: Fix package name
Change package name to e2enode

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:37:35 -05:00
vpickard
fba4a7be34 e2e-topology-manager: fixes for gofmt
Some cleanup for gofmt fixes

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:32:58 -05:00
vpickard
337fdf2f37 [WIP] e2e-topology-manager: Initial commit for E2E tests
This is the initial commit for E2E testing for Topology
Manager.

For now, run a subset of the CPU Manager tests.

Additional tests will be forthcoming.

Signed-off-by: vpickard <vpickard@redhat.com>
2019-12-12 16:32:58 -05:00
Kubernetes Prow Robot
b38dfb3ccb
Merge pull request #85522 from YuikoTakada/local-latencies
Fix func VerifyLatencyWithinThreshold() to local
2019-12-11 14:30:32 -08:00
Kevin Klues
69f8053850 Update top-level CPUManager to adhere to new state semantics
For now, we just pass 'nil' as the set of 'initialContainers' for
migrating from old state semantics to new ones. In a subsequent commit
will we pull this information from higher layers so that we can pass it
down at this stage properly.
2019-12-11 23:02:51 +01:00
tanjunchen
35b0f1f7dd test/e2e_node/:use framework.Equal() instead of using gomega.Expect(bool).To(gomega.BeTrue(),explain) 2019-12-11 18:50:29 +08:00
Kubernetes Prow Robot
4fe4ff885d
Merge pull request #85934 from xueweiz/e2e-node
Convert ExpectEqual(err, nil) to ExpectNoError(err)
2019-12-05 15:55:03 -08:00