The PR https://github.com/kubernetes/kubernetes/pull/100041 updated
node-problem-detector to v0.8.7, but unfortunately we didn't also
update the image used in the e2e_node tests.
As a result, the tests were failing like:
E2eNode Suite: [sig-node] NodeProblemDetector [NodeFeature:NodeProblemDetector] [Serial] SystemLogMonitor should generate node condition and events for corresponding errors
_output/local/go/src/k8s.io/kubernetes/test/e2e_node/node_problem_detector_linux.go:301
Timed out after 60.000s.
Expected success, but got an error:
<*errors.errorString | 0xc0011f2600>: {
s: "expected total number of events was 4, actual events counted was 7\nEvents
This in turn was one of the contributing factors making the
pull-kubernetes-node-kubelet-serial lane fail constantly.
This patch updates the image used in the tests, fixing the failure.
Signed-off-by: Francesco Romani <fromani@redhat.com>
The CPUManager graduated to beta a while ago (k8s 1.10?)
so let's get rid of the obsolete Alpha tag on its e2e tests.
Signed-off-by: Francesco Romani <fromani@redhat.com>
- verify memory manager data returned by `GetAllocatableResources`
- verify pod container memory manager data
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
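For context, this is roughly how a test can query the kubelet
podresources API for the data being verified; a minimal sketch assuming
the default kubelet socket path and the GetV1Client helper from
pkg/kubelet/apis/podresources, not the exact test code:

    package main

    import (
        "context"
        "fmt"
        "time"

        kubeletpodresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
        "k8s.io/kubernetes/pkg/kubelet/apis/podresources"
    )

    func main() {
        // Assumed default podresources socket; adjust for the node under test.
        endpoint := "unix:///var/lib/kubelet/pod-resources/kubelet.sock"
        client, conn, err := podresources.GetV1Client(endpoint, 10*time.Second, 16*1024*1024)
        if err != nil {
            panic(err)
        }
        defer conn.Close()
        // The test compares this response against the expected
        // memory manager allocatable data.
        resp, err := client.GetAllocatableResources(context.TODO(),
            &kubeletpodresourcesv1.AllocatableResourcesRequest{})
        if err != nil {
            panic(err)
        }
        fmt.Printf("allocatable memory blocks: %v\n", resp.GetMemory())
    }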
The apiserver and the test suite in node e2e run under the sshd daemon,
which can limit the number of files they can open. Set a higher limit
to address the issue.
Signed-off-by: Odin Ugedal <odin@uged.al>
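For illustration only, a sketch of the idea, assuming we can simply
raise RLIMIT_NOFILE for the current process before spawning the server
and test processes (the actual patch may set the limit elsewhere, e.g.
in the sshd or service configuration):

    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        var rl syscall.Rlimit
        if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
            panic(err)
        }
        // Raise the soft limit up to the hard limit; child processes
        // spawned afterwards inherit it.
        rl.Cur = rl.Max
        if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
            panic(err)
        }
        fmt.Printf("RLIMIT_NOFILE: soft=%d hard=%d\n", rl.Cur, rl.Max)
    }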
Node e2e tests exceeding the global timeout are sent SIGINT, resulting
in no artifacts or console output. This change ignores the first SIGINT;
since all child processes are also stopped by that SIGINT, we can
clean up before exiting.
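A minimal sketch of the signal handling described above, assuming a Go
wrapper around the test run; names and details are illustrative, not
the actual patch:

    package main

    import (
        "fmt"
        "os"
        "os/signal"
        "syscall"
    )

    func main() {
        sigs := make(chan os.Signal, 2)
        signal.Notify(sigs, syscall.SIGINT)
        go func() {
            // First SIGINT: the children received it too and are shutting
            // down, so keep running to collect artifacts and clean up.
            <-sigs
            fmt.Fprintln(os.Stderr, "SIGINT received, cleaning up before exit")
            // Second SIGINT: exit immediately.
            <-sigs
            os.Exit(1)
        }()
        runTestsAndCollectArtifacts() // hypothetical helper for the real work
    }

    func runTestsAndCollectArtifacts() { /* ... */ }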
Make sure to use SIGKILL so that the service is killed in a dirty way.
If the container runtime uses "Restart=on-abnormal" in its systemd unit,
killing with SIGTERM will not restart the service, as the kill looks
intentional and clean. cri-o uses this option by default.
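As an illustration only (the service name here assumes cri-o), killing
the unit dirtily so that Restart=on-abnormal still restarts it might
look like:

    package main

    import "os/exec"

    // killRuntimeService sends SIGKILL to the runtime's systemd unit.
    // SIGKILL counts as abnormal termination, so a unit configured with
    // Restart=on-abnormal is restarted; SIGTERM would look like a clean,
    // intentional stop and the unit would stay down.
    func killRuntimeService() error {
        return exec.Command("systemctl", "kill", "--signal=SIGKILL", "crio").Run()
    }

    func main() {
        if err := killRuntimeService(); err != nil {
            panic(err)
        }
    }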
The current test assumes that the test pod is deleted when the test
namespace is deleted. However, namespace deletion is an asynchronous
operation. The pod may still be running and holding hugepages
resources when the next test case creates another pod that requests
the same hugepages resources. This can cause the kubelet to fail the
test pod with this kind of error:
OutOfhugepages-2Mi: Node didn't have enough resource: hugepages-2Mi
requested: 6291456, used: 6291456, capacity: 10485760
Explicitly deleting the test pod should fix this issue.
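A minimal sketch of the fix, assuming the e2e framework's PodClient
helpers (the helper name and timeout are illustrative):

    package e2enode

    import (
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/kubernetes/test/e2e/framework"
    )

    // releaseHugepagesPod deletes the pod and blocks until it is gone, so
    // its hugepages allocation is returned before the next test case runs.
    func releaseHugepagesPod(f *framework.Framework, podName string) {
        f.PodClient().DeleteSync(podName, metav1.DeleteOptions{}, 2*time.Minute)
    }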
Previously the code used to delete pods serially.
This patch factors out that code to delete the pods in parallel,
using goroutines.
This shaves some time off the e2e topology manager test run, with no
intended change in behaviour.
Signed-off-by: Francesco Romani <fromani@redhat.com>
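The parallel deletion boils down to one goroutine per pod plus a
WaitGroup; a sketch under assumed helper names, not the exact patch:

    package e2enode

    import (
        "sync"

        "k8s.io/kubernetes/test/e2e/framework"
    )

    // deletePodsAsync deletes the given pods in parallel and waits for all
    // deletions to complete before returning.
    func deletePodsAsync(f *framework.Framework, podNames []string) {
        var wg sync.WaitGroup
        for _, name := range podNames {
            wg.Add(1)
            go func(name string) {
                defer wg.Done()
                deletePodSyncByName(f, name) // assumed per-pod synchronous delete helper
            }(name)
        }
        wg.Wait()
    }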
The Topology Manager e2e tests want to run on a real multi-NUMA system
and consume real devices supported by device plugins; SRIOV devices
happen to be the most commonly available such devices.
The tests need to wait for resource availability before actually
running, or they will fail with a false negative that is also
relatively hard to debug.
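Waiting for resource availability can be as simple as polling the
node's allocatable resources until the SRIOV resource shows up; a
sketch with assumed resource name, function name and timeouts:

    package e2enode

    import (
        "context"
        "time"

        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        clientset "k8s.io/client-go/kubernetes"
    )

    // waitForSRIOVResources polls until the given extended resource (e.g.
    // the SRIOV device plugin's resource name) is allocatable on the node.
    func waitForSRIOVResources(ctx context.Context, cli clientset.Interface, nodeName, resName string) error {
        return wait.PollImmediate(5*time.Second, 5*time.Minute, func() (bool, error) {
            node, err := cli.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
            if err != nil {
                return false, err
            }
            qty, ok := node.Status.Allocatable[v1.ResourceName(resName)]
            return ok && !qty.IsZero(), nil
        })
    }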
An optimization was added in commit 56106439cf to minimize the restarts,
speed up the execution, and make a nasty, not yet fully understood flake
with the SRIOV device plugin much less likely.
Unfortunately, the pod-scope tests were mistakenly left out of that
optimization. This patch fixes that.
CI lanes did NOT fail (and will not fail) because the CI machines are
neither multi-NUMA nor do they expose SRIOV devices, so the relevant
portion of the test is just skipped, avoiding the issue.
However, the issue resurfaces when running the test suite on bare
metal; this is how we noticed.
Signed-off-by: Francesco Romani <fromani@redhat.com>