kubernetes

Author	SHA1	Message	Date
Kubernetes Prow Robot	e18fa74551	Merge pull request #115590 from swatisehgal/topology-mgr-duration-metrics node: topology-mgr: Add metric to measure topology manager admission latency	2023-02-15 07:12:25 -08:00
Swati Sehgal	cf21dcef51	node: topology-mgr: e2e: changes to validate admission latency metrics The component was previously incorrect. This patch updates to the correct component name. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-02-15 13:59:56 +00:00
ravisantoshgudimetla	d65262d1f9	Remove cgo dependency	2023-02-13 11:16:39 -05:00
Sascha Grunert	85106dc327	Allow SSH e2e node base64 key injection With the change of the CRI-O jobs to use butane, we now have a verification for base64 data urls in place. This means that the following URL is invalid: ``` data:text/plain;base64,GCE_SSH_PUBLIC_KEY_FILE_CONTENT ``` This means we have to pass valid base64 to the URL. To fix that, we now allow to inject SSH key values with both, the `GCE_SSH_PUBLIC_KEY_FILE_CONTENT` field and its base64 encoded variant. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2023-02-09 16:17:11 +01:00
Patrick Ohly	136f89dfc5	e2e: use error wrapping with %w The recently introduced failure handling in ExpectNoError depends on error wrapping: if an error prefix gets added with `fmt.Errorf("foo: %v", err)`, then ExpectNoError cannot detect that the root cause is an assertion failure and then will add another useless "unexpected error" prefix and will not dump the additional failure information (currently the backtrace inside the E2E framework). Instead of manually deciding on a case-by-case basis where %w is needed, all error wrapping was updated automatically with `sed -i "s/fmt.Errorf\(.*\): '*\(%s\|%v\)'*\",\(.* err)\)/fmt.Errorf\1: %w\",\3/" $(git grep -l 'fmt.Errorf' test/e2e*)` This may be unnecessary in some cases, but it's not wrong.	2023-02-06 15:39:13 +01:00
Patrick Ohly	9df3e2a47a	e2e: replace WaitForPodToDisappear with WaitForPodNotFoundInNamespace WaitForPodToDisappear was always called such that it listed all pods, which made it less efficient than trying to get just the one pod it was checking for. Being able to customize the poll interval in practice wasn't useful, therefore it can be replaced with WaitForPodNotFoundInNamespace.	2023-02-06 15:39:12 +01:00
Antonio Ojea	7f5ae1c0c1	Revert "e2e: wait for pods with gomega"	2023-02-06 12:08:22 +01:00
Kubernetes Prow Robot	85aa0057c6	Merge pull request #113298 from pohly/e2e-wait-for-pods-with-gomega e2e: wait for pods with gomega	2023-02-04 05:26:29 -08:00
Kubernetes Prow Robot	d415647739	Merge pull request #115441 from bobbypage/busybox-mirror-test test: Use preloaded busybox image in mirror pod test	2023-02-01 12:21:36 -08:00
Kubernetes Prow Robot	3a4cef70f2	Merge pull request #115445 from bobbypage/gh-115381 test: Fix node e2e device plugin flake	2023-02-01 02:55:06 -08:00
Kubernetes Prow Robot	bb7c9739a3	Merge pull request #114759 from my-git9/chore/k8staint chore: add k8s node-role.kubernetes.io/control-plane taint	2023-01-31 21:01:17 -08:00
David Porter	225658884b	test: Fix node e2e device plugin flake The device plugin test expects that no other pods are running prior to the test starting. However, it has been observed that in some cases some resources may still be around from previous tests. This is because the deletion of resources from other tests is handled by deleting that test's framework's namespace which is done asynchronously without waiting for the other test's namespace to be deleted. As a result, when the node e2e device plugin starts, there may still be other pods in process of termination. To work around this, add a retry to the device plugin test to account for the time it takes to delete the resources from the prior test. Signed-off-by: David Porter <david@porter.me>	2023-01-31 17:36:10 -08:00
David Porter	a3291a87d7	test: Use preloaded busybox image in mirror pod test Instead of hardcoding the busybox image, use the one that is preloaded during the test using imageutils. Signed-off-by: David Porter <david@porter.me>	2023-01-31 13:34:13 -08:00
Patrick Ohly	222f655062	e2e: use error wrapping with %w The recently introduced failure handling in ExpectNoError depends on error wrapping: if an error prefix gets added with `fmt.Errorf("foo: %v", err)`, then ExpectNoError cannot detect that the root cause is an assertion failure and then will add another useless "unexpected error" prefix and will not dump the additional failure information (currently the backtrace inside the E2E framework). Instead of manually deciding on a case-by-case basis where %w is needed, all error wrapping was updated automatically with `sed -i "s/fmt.Errorf\(.*\): '*\(%s\|%v\)'*\",\(.* err)\)/fmt.Errorf\1: %w\",\3/" $(git grep -l 'fmt.Errorf' test/e2e*)` This may be unnecessary in some cases, but it's not wrong.	2023-01-31 13:01:39 +01:00
Patrick Ohly	6eea1b2efa	e2e: replace WaitForPodToDisappear with WaitForPodNotFoundInNamespace WaitForPodToDisappear was always called such that it listed all pods, which made it less efficient than trying to get just the one pod it was checking for. Being able to customize the poll interval in practice wasn't useful, therefore it can be replaced with WaitForPodNotFoundInNamespace.	2023-01-31 13:01:39 +01:00
Kubernetes Prow Robot	981c4d59fb	Merge pull request #115155 from adrianreber/2023-01-18-checkpoint-test-result Extend checkpoint e2e test to check for results	2023-01-30 18:43:16 -08:00
Kubernetes Prow Robot	4df945853e	Merge pull request #115137 from swatisehgal/topologymgr-metrics node: topologymgr: add metrics about admission requests and errors	2023-01-30 18:43:00 -08:00
David Porter	b96290c08f	e2e node: Update runtime class handler skip logic There are two runtime class tests which required the container runtime config to include explicit configuration for `test-handler`. The current logic skips these tests in non GCE environments. This skip is too strict since the test is skipped in node e2e environments and in other environments such as kind, which support running the test and also configure `test-handler`. Instead of skipping based on provider, add a new function `NodeSupportsPreconfiguredRuntimeClassHandler` which examines the underlying container runtime config and checks if the config includes `test-handler`. The check is a bit brittle since it assumes container runtime config paths, but it is a net improvement over skipping the test entirely on non GCE environments. This results in the test working in the common test environments, namely GCE kube-up, node e2e, and kind. Signed-off-by: David Porter <david@porter.me>	2023-01-24 14:43:24 -08:00
Kubernetes Prow Robot	765f2ef7c7	Merge pull request #114981 from adisky/revert [Test] Revert "Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] K…	2023-01-24 03:58:15 -08:00
Adrian Reber	86b62b86d8	Extend checkpoint e2e test to check for results When the e2e_node/checkpoint_container.go test was introduced no CRI implementation supported the new CheckpointContainer RPC yet. With the release of CRI-O 1.25 the CheckpointContainer is implemented and the test has been extended to see if the content of the checkpoint is as expected. The test is skipped if the ContainerCheckpoint feature gate is disabled or if the CRI implementation does not support the CheckpointContainer RPC. Signed-off-by: Adrian Reber <areber@redhat.com>	2023-01-23 18:07:35 +00:00
Swati Sehgal	340db7109d	node: e2e: topologymgr: add tests for topology manager metrics Add node e2e tests to verify population of topology metrics. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-01-19 14:40:37 +00:00
Swati Sehgal	51c6a1fbe7	node: e2e: cpumgr: Rename: s/getCPUManagerMetrics/getKubeletMetrics Since we need to gather kubelet metrics for CPU Manager and Topology Manager, renaming this function to a more generic name. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-01-19 14:18:05 +00:00
Aditi Sharma	d83c37c311	Update CNI version to 1.2.0 Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>	2023-01-18 13:24:40 +05:30
Aditi Sharma	0a7b2eeae0	Revert "Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet should correctly account for terminated pods after restart" This reverts commit `572360c5a5`.	2023-01-11 15:05:49 +05:30
Ian K. Coolidge	cbb985a310	cpuset: Delete 'builder' methods All usage of builder pattern is convertible to cpuset.New() with the same or fewer lines of code. Migrate Builder.Add to a private method of CPUSet, with a comment that it is only intended for internal use to preserve the immutable property of the exported interface. This also removes 'require' library dependency, which avoids non-standard library usage.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	f3829c4be3	cpuset: Rename 'NewCPUSet' to 'New'	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	e5143d16c2	cpuset: Make 'ToSlice*' methods look like 'set' methods In 'set', conversions to slice are done also, but with different names: ToSliceNoSort() -> UnsortedList() ToSlice() -> List() Reimplement List() in terms of UnsortedList to save some duplication.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	a0c989b99a	cpuset: Remove *Int64 methods These are rarely used and can be accommodated with a trivial helper.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	67a057d4f2	cpuset: Remove 'MustParse' method Removes exit/fatal from cpuset library. Usage in podresources test was not necessary. Library reference in cpu_manager_test was moved to a local function, and converted to use e2e test framework error catching.	2023-01-06 23:32:51 +00:00
ZhangKe10140699	572360c5a5	Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet should correctly account for terminated pods after restart	2023-01-05 09:10:57 +08:00
xin.li	10ca605cdd	chore: add k8s node-role.kubernetes.io/control-plane taint Signed-off-by: xin.li <xin.li@daocloud.io>	2023-01-02 21:04:43 +08:00
Antonio Ojea	e0a23577d2	pass context to gomega Change-Id: Ibef02a52d8922984a09efa48361b9876fc91287c	2022-12-19 13:14:02 +00:00
Patrick Ohly	2f6c4f5eab	e2e: use Ginkgo context All code must use the context from Ginkgo when doing API calls or polling for a change, otherwise the code would not return immediately when the test gets aborted.	2022-12-16 20:14:04 +01:00
Kubernetes Prow Robot	770b39c65b	Merge pull request #114072 from Tal-or/deflake_e2e_cpumanager_metrics_tests e2e: cpumanager: proper test clean-up	2022-12-14 11:55:45 -08:00
Kubernetes Prow Robot	7403090e40	Merge pull request #113309 from swatisehgal/devicemgr-e2e-remove-flakiness node: e2e: device plugins: Deflake e2e tests	2022-12-14 10:47:34 -08:00
Swati Sehgal	213a6edc57	node: e2e: Add descriptive messages for operation/error checks Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-12-13 14:54:48 +00:00
Patrick Ohly	d4729008ef	e2e: simplify test cleanup ginkgo.DeferCleanup has multiple advantages: - The cleanup operation can get registered if and only if needed. - No need to return a cleanup function that the caller must invoke. - Automatically determines whether a context is needed, which will simplify the introduction of context parameters. - Ginkgo's timeline shows when it executes the cleanup operation.	2022-12-13 08:09:01 +01:00
Swati Sehgal	62e4d39c2f	node: e2e: address review comments (2022/12/12) - use `ginkgo.DeferCleanup` instead of clean up in the AfterEach block - encourage use of ginkgo by not extending expect.go Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-12-12 16:31:40 +00:00
Swati Sehgal	a9e3689e63	node: e2e: ensure clean cluster state before e2e tests are run Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-12-12 14:50:36 +00:00
Swati Sehgal	7e880d1bab	node: e2e: ensure log rotation pod is deleted after test Some node e2e tests check for expected number of pods running on the node to verify the correct state of that node after running test scenarios. An example of such a check is in the device plugin end to end test here: [1]. If the node is not left in a clean state after an e2e test finishes running, it can lead to flaky tests because the node might have unexpected pods running on the node. In order to avoid that, we make sure that the test pods are cleaned up after the test runs. [1]: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/device_plugin_test.go#L189-L190 Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-12-12 14:50:32 +00:00
Patrick Ohly	df5d84ae81	e2e: accept context from Ginkgo Every ginkgo callback should return immediately when a timeout occurs or the test run manually gets aborted with CTRL-C. To do that, they must take a ctx parameter and pass it through to all code which might block. This is a first automated step towards that: the additional parameter got added with `sed -i 's/\(framework.ConformanceIt\|ginkgo.It\)\(.*\)func() {$/\1\2func(ctx context.Context) {/' $(git grep -l -e framework.ConformanceIt -e ginkgo.It )` followed by `$GOPATH/bin/goimports -w $(git status | grep modified: | sed -e 's/.* //')`. log_test.go was left unchanged.	2022-12-10 19:50:18 +01:00
Talor Itzhak	56c5a95849	e2e: cpumanager: proper test clean-up One of the cpumanager tests doesn't remove the pod that got created during the test. This causes pollution of other tests and failures from time to time (depends on the test execution order). In order to deflake the tests, we should delete the pod and wait for it to be completely removed. Signed-off-by: Talor Itzhak <titzhak@redhat.com>	2022-11-22 17:25:52 +02:00
Andrew Sy Kim	6c8eacb157	test/e2e_node: set apiserver kubelet preferred addresses Signed-off-by: Andrew Sy Kim <andrewsy@google.com>	2022-11-21 15:09:04 -05:00
Michal Wozniak	c803892bd8	Enable the feature into beta	2022-11-09 09:02:40 +01:00
Michal Wozniak	52cd6755eb	Add pod disruption conditions for kubelet initiated failures	2022-11-07 11:23:22 +01:00
Kubernetes Prow Robot	565f582c4b	Merge pull request #113199 from bobbypage/node_e2e_stop_kubelet test: Stop kubelet systemd service after node e2e	2022-11-06 01:34:16 -08:00
David Ashpole	64af1adace	Second attempt: Plumb context to Kubelet CRI calls (#113591) * plumb context from CRI calls through kubelet * clean up extra timeouts * try fixing incorrectly cancelled context	2022-11-05 06:02:13 -07:00
Kubernetes Prow Robot	6fe5429969	Merge pull request #113273 from bobbypage/restart_test_fix test: Fix e2e_node restart_test flake	2022-11-04 05:14:14 -07:00
Kubernetes Prow Robot	a9f87ad6c8	Merge pull request #113384 from pohly/e2e-formatting e2e: formatting enhancements	2022-11-02 21:40:08 -07:00
Francesco Romani	ff44dc1932	cpumanager: the FG is locked to default (ON) hence we can remove the if() guards, the feature is always available. Signed-off-by: Francesco Romani <fromani@redhat.com>	2022-11-02 18:41:41 +01:00


2365 Commits