2688 Commits

Author SHA1 Message Date
Gunju Kim
d2b803246a Don't reuse the device allocated to the restartable init container 2023-10-17 18:28:29 +09:00
carlory
5d0f8530f6 fix Huge Pages failing test 2023-10-16 23:13:32 +08:00
charles-chenzz
7d31b5ffd0 Add test case for sandbox condition if pod fails to mount volume from a missing secret 2023-10-16 22:04:04 +08:00
Kevin Hannon
dd9c3358f5 Revert "podresources: e2e: force eager connection" 2023-10-16 09:46:04 -04:00
Shiming Zhang
33f2d487e2 Promote KEP-2681 to beta in 1.29 2023-10-16 10:10:35 +08:00
Kubernetes Prow Robot
378866edba Merge pull request #120518 from saschagrunert/metrics-container-start
kubelet: fix metric `container_start_time_seconds` timestamp
2023-10-15 07:05:37 +02:00
Kubernetes Prow Robot
675a64eaa6 Merge pull request #121129 from carlory/cleanup-e2e-framework-equal
remove deprecated framework.ExpectEqual
2023-10-14 23:50:37 +02:00
Kubernetes Prow Robot
ae9dc3330e Merge pull request #120874 from ruquanzhao/fixDevicePluginProbeCI
fix DevicePluginProbe node-e2e: pod and kubelet restarts
2023-10-14 23:50:28 +02:00
Kevin Hannon
1ae5429629 add potential fixes for flakiness in eviction tests 2023-10-13 11:36:44 -04:00
Kubernetes Prow Robot
4c8fca2f06 Merge pull request #112894 from pohly/e2e-framework-test-labels
e2e framework: test labels
2023-10-13 02:40:43 +02:00
Kubernetes Prow Robot
2b4ef19578 Merge pull request #121191 from dims/update-busybox-sha-based-image-to-match-tag-1.36-1-1
Update busybox SHA based image to match tag - 1.36.1-1
2023-10-12 22:49:43 +02:00
Kubernetes Prow Robot
8923c3c871 Merge pull request #119659 from kannon92/beta-pod-ready-to-start
[KEP-3085] Promote PodReadyToStartContainers to beta in 1.29
2023-10-12 22:49:16 +02:00
Davanum Srinivas
968d6b8a32 Update busybox SHA based image to match tag - 1.36.1-1
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-10-12 14:17:36 -04:00
Kevin Hannon
c94240e2e2 move kubelet constant for podreadytostart to staging 2023-10-12 11:18:11 -04:00
Kubernetes Prow Robot
38a1ec75f0 Merge pull request #119882 from ffromani/podres-client-wait
podresources: e2e: force eager connection
2023-10-12 15:59:55 +02:00
Kubernetes Prow Robot
dc1cde6e02 Merge pull request #121044 from charles-chenzz/e2e_pod_readytostart_false
[KEP-3085]: check PodReadyToStartContainers condition after gracefulshutdown
2023-10-11 20:29:32 +02:00
carlory
2c1836bc24 remove deprecated framework.ExpectEqual 2023-10-11 12:43:10 +08:00
RuquanZhao
babac47c6f fix DevicePluginProbe node-e2e: pod and kubelet restarts
The kubelet restarts working pods with an exponential back-off delay,
capped at a maximum of 5 minutes. The 1-minute wait may happen to fall
within the back-off period.

Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>
2023-10-11 10:15:32 +08:00
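
The timing problem described in the commit above can be illustrated with a small sketch. It assumes a starting delay of 10 seconds that doubles on every restart; only the 5-minute cap and the 1-minute wait come from the commit message, and `backoffDelay` is a hypothetical helper, not kubelet code.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns the delay before restart number n (0-based).
// The delay doubles from initial on each restart and never exceeds maxDelay.
func backoffDelay(n int, initial, maxDelay time.Duration) time.Duration {
	d := initial
	for i := 0; i < n && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

func main() {
	const (
		initial  = 10 * time.Second // assumed starting delay
		maxDelay = 5 * time.Minute  // cap stated in the commit message
		wait     = 1 * time.Minute  // the wait the commit message refers to
	)
	for n := 0; n < 7; n++ {
		d := backoffDelay(n, initial, maxDelay)
		fmt.Printf("restart %d: delay %v, longer than the wait: %t\n", n, d, d > wait)
	}
}
```

Under these assumptions, a fixed 1-minute wait is shorter than the back-off delay after only a few restarts, which is the situation the commit message describes.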
Kubernetes Prow Robot
bdcb73d6b3 Merge pull request #120460 from tzneal/deflake-oom-tests-on-containerd
skip the reason check for OOM reason test if it will fail
2023-10-11 01:03:17 +02:00
Patrick Ohly
19ecf93ec3 e2e: define features and node features
The list is based on the -list-tests output.
2023-10-10 18:15:49 +02:00
Patrick Ohly
f2d34426f8 e2e: enhance SIGDescribe
framework.SIGDescribe is better because:
- Ginkgo uses the source code location of the test, not of the wrapper,
  when reporting progress.
- Additional annotations can be passed.

To make this a drop-in replacement, framework.SIGDescribe generates a function
that can be used instead of the former SIGDescribe functions.

windows.SIGDescribe contained some additional code to ensure that tests are
skipped when not running with a suitable node OS. This gets moved into a
separate wrapper generator, to allow using framework.SIGDescribe as intended.
To ensure that all callers were modified, the windows.sigDescribe isn't
exported anymore (wasn't necessary in the first place!).
2023-10-10 18:15:49 +02:00
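
The "function that generates a function" idea in the commit above can be sketched in plain Go. Everything here is hypothetical: `describeFunc`, `sigDescribe`, and `skipUnlessOS` are illustrative stand-ins rather than the Ginkgo or e2e framework API, and the `[sig-...]` prefix format is an assumption.

```go
package main

import "fmt"

// describeFunc is a hypothetical stand-in for a Ginkgo-style Describe call:
// it runs body under the given name and returns the full test name.
type describeFunc func(name string, body func()) string

// sigDescribe generates a describeFunc that prefixes every name with a SIG tag.
func sigDescribe(sig string) describeFunc {
	return func(name string, body func()) string {
		full := fmt.Sprintf("[sig-%s] %s", sig, name)
		body()
		return full
	}
}

// skipUnlessOS is a separate wrapper generator: it wraps any describeFunc so
// that the body only runs when the node OS matches the wanted OS.
func skipUnlessOS(wantOS, nodeOS string, next describeFunc) describeFunc {
	return func(name string, body func()) string {
		return next(name, func() {
			if nodeOS != wantOS {
				fmt.Printf("skipping %q: node OS is %s\n", name, nodeOS)
				return
			}
			body()
		})
	}
}

func main() {
	describe := skipUnlessOS("windows", "linux", sigDescribe("windows"))
	full := describe("Memory limits", func() { fmt.Println("running test body") })
	fmt.Println(full)
}
```

Keeping the OS check in its own wrapper leaves the generic generator unchanged, which is the separation the commit message describes.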
carlory
d5d7fb595e e2e_node: stop using deprecated framework.ExpectEqual 2023-10-09 16:42:42 +08:00
Katarzyna Lach
122ff5a212 Move grpc rate limiter from podresource folder
Rate limiter.go file is a generic file implementing the
grpc Limiter interface. This file can be reused by other gRPC
APIs, not only by podresource.

Change-Id: I905a46b5b605fbb175eb9ad6c15019ffdc7f2563
2023-10-09 07:22:23 +00:00
charles-chenzz
ccc6458683 e2e_node: add test case to check that the pod ready to start condition is set to false after terminating 2023-10-08 20:40:36 +08:00
Gunju Kim
8b5f30ef09 Don't reuse CPU set of a restartable init container 2023-10-06 22:16:15 +09:00
Kubernetes Prow Robot
f19b62fc09 Merge pull request #120959 from pohly/e2e-test-whitespace-cleanup
e2e: remove redundant spaces in test names
2023-10-05 00:41:59 +02:00
Patrick Ohly
0e8a1f1816 e2e: remove redundant spaces in test names
The spaces are redundant because Ginkgo will add them itself when concatenating
the different test name components. Upcoming change in the framework will
enforce that there are no such redundant spaces.
2023-09-29 08:30:57 +02:00
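
Why the spaces are redundant can be shown with a simplified sketch. It assumes that name components are joined with a single space, as the commit above states; `fullTestName` and `hasRedundantSpace` are hypothetical helpers, not Ginkgo or framework functions.

```go
package main

import (
	"fmt"
	"strings"
)

// fullTestName mimics, in simplified form, how nested test name components
// are concatenated with a single space between them.
func fullTestName(components ...string) string {
	return strings.Join(components, " ")
}

// hasRedundantSpace reports whether a component starts or ends with
// whitespace, which would yield a double space after concatenation.
func hasRedundantSpace(component string) bool {
	return component != strings.TrimSpace(component)
}

func main() {
	components := []string{"[sig-node]", " Pods", "should start "}
	fmt.Printf("%q\n", fullTestName(components...))
	for _, c := range components {
		if hasRedundantSpace(c) {
			fmt.Printf("component %q has a redundant space\n", c)
		}
	}
}
```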
Davanum Srinivas
d900217664 fix missed branch - targets when building using arm64
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-09-27 15:52:37 -04:00
Davanum Srinivas
52f5093d77 Build kubelet with CGO for sig-node e2e tests (not ginkgo)
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-09-26 08:32:59 -04:00
Kubernetes Prow Robot
884bc96fec Merge pull request #120773 from swatisehgal/tm-metrics-e2e-deflake
topology-mgr: metrics: Deflake Topology Manager metrics e2e tests
2023-09-20 11:26:26 -07:00
Kubernetes Prow Robot
7fb7e2625b Merge pull request #120401 from shijinye/e2eclean-node-notequal
cleanup:e2e:stop using deprecated framework.ExpectNotEqual
2023-09-20 11:26:19 -07:00
Kubernetes Prow Robot
3191493cea Merge pull request #119402 from Tal-or/e2e_podres_terminal_pods
e2e:podresources: verify count for terminal pods
2023-09-20 11:26:11 -07:00
Swati Sehgal
f5d915b594 topology-mgr: metrics: Deflake Topology Manager metrics e2e tests
On local execution of the Topology Manager metrics tests, the pass rate was 100%.
Yet the Topology Manager metrics tests are failing consistently in upstream
CI: https://testgrid.k8s.io/sig-node-presubmits#pr-kubelet-serial-gce-e2e-topology-manager.

From the logs, these failures were identified as timeouts,
so we are increasing the default timeout as well as the polling interval
used to obtain KubeletMetrics to deflake this test.

We have noticed a similar flake in the CPU Manager metrics tests as well:
https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/directory/pull-kubernetes-node-kubelet-serial-cpu-manager/1701615009836044288.
Once it is confirmed that the issue is resolved for the Topology Manager test,
we will fix this for CPU Manager as well in a follow-up PR.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-09-20 13:37:27 +01:00
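
The two knobs mentioned in the commit above, an overall timeout and a polling interval, can be sketched as a plain polling loop. This is not the helper the e2e tests use; `pollUntil`, `errTimeout`, and the durations are hypothetical and chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errTimeout = errors.New("timed out waiting for condition")

// pollUntil calls check every interval until it returns true or until the
// next attempt would start after the timeout has elapsed.
func pollUntil(timeout, interval time.Duration, check func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if check() {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Stand-in for "the metrics have reached the expected values".
	ready := func() bool {
		attempts++
		return attempts >= 3
	}
	err := pollUntil(time.Second, 10*time.Millisecond, ready)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

A longer timeout gives a slow CI machine more time to converge, while the interval controls how often the condition is re-evaluated.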
Kubernetes Prow Robot
a68093a3ff Merge pull request #120506 from alexzielenski/import-restrictions
Update e2e import restrictions
2023-09-13 21:56:22 -07:00
Kubernetes Prow Robot
160fe010f3 Merge pull request #120464 from gjkim42/deflake-container-lifecycle-e2e-test
e2e_node: Assign enough time to finish the postStart hook
2023-09-12 17:44:44 -07:00
Kubernetes Prow Robot
04e5914079 Merge pull request #120349 from ruquanzhao/fixTopologyManagerJobs
e2e-node: fix TopologyManager test jobs.
2023-09-12 17:44:37 -07:00
Kubernetes Prow Robot
8aeebda818 Merge pull request #120306 from Rei1010/nodeClean
e2e_node:stop using deprecated framework.ExpectError
2023-09-12 17:44:23 -07:00
Todd Neal
af151eeba2 specifically check that the pod was successful 2023-09-12 13:40:20 -05:00
Gunju Kim
1fb4eee94e Use container log instead of termination log
Since the termination log cannot be accessed until the container is
terminated, use the container log.
2023-09-11 22:55:09 +09:00
Sascha Grunert
5e0931336b kubelet: fix metric container_start_time_seconds's timestamp
Adapting the tests and reverting https://github.com/kubernetes/kubernetes/pull/103429

Carry-over from https://github.com/kubernetes/kubernetes/pull/117881

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-09-08 09:13:37 +02:00
Alexander Zielenski
7a13b11af0 update e2e import restrictions 2023-09-07 12:20:29 -07:00
Kubernetes Prow Robot
b27670dfbd Merge pull request #118740 from saschagrunert/kubelet-label-types
Make kubelet label types public
2023-09-06 23:46:57 -07:00
Francesco Romani
2ea47038b9 podresources: e2e: force eager connection
Add and use more facilities to the *internal* podresources client.
Checking e2e test runs, we see quite a few errors like
```
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused"
```

This is likely caused by kubelet restarts, which we do a lot in e2e tests,
combined with the fact that gRPC connects lazily AND we don't really
check the errors in client code - we just bubble them up.

While it's arguably bad that we don't check error codes properly, it's also
true that in the main use case, e2e tests, the functions should never
fail except in a few well-known cases; we're connecting over a
super-reliable unix domain socket after all.

So we centralize the fix by adding a function (along with minor
cleanups) that triggers the connection and ensures it happens,
localizing the changes here. The main advantage is that this approach
is opt-in, composable, and doesn't leak gRPC details into the client
code.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-09-07 08:24:49 +02:00
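
The idea of forcing the connection up front, instead of discovering a refused connection on the first call, can be sketched with the Go standard library alone. This is not the function added by the commit above and it does not use gRPC; `waitForSocket` is a hypothetical helper, and the demo uses a throwaway socket rather than the kubelet one.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// waitForSocket retries dialing the unix socket at path until a connection
// succeeds or timeout elapses, so that callers fail early and in one place.
func waitForSocket(path string, timeout, retryEvery time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	var d net.Dialer
	for {
		conn, err := d.DialContext(ctx, "unix", path)
		if err == nil {
			return conn.Close()
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("socket %s not reachable: %w (last error: %v)", path, ctx.Err(), err)
		case <-time.After(retryEvery):
			// Try again; the server may still be restarting.
		}
	}
}

func main() {
	// Demo against a temporary socket so the sketch runs without a kubelet.
	dir, err := os.MkdirTemp("", "eager")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock")
	ln, err := net.Listen("unix", sock)
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	fmt.Println("listening socket:", waitForSocket(sock, time.Second, 50*time.Millisecond))
	fmt.Println("missing socket:", waitForSocket(sock+".missing", 200*time.Millisecond, 50*time.Millisecond))
}
```

Because the check lives in a single helper, callers opt in explicitly and the rest of the client code stays unaware of how the connection is established.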
Todd Neal
94afd6e3a4 skip the reason check for OOM tests if it will fail
This is currently flaking badly due to a race between cgroup deletion
and the runtime detecting the OOM kill.
2023-09-06 12:20:02 -05:00
Gunju Kim
b468e4eb1c e2e_node: Assign enough time to finish the postStart hook
This deflakes the "Containers Lifecycle should not launch second
container before PostStart of the first container completed" test by
assigning enough time to finish the postStart hook.
2023-09-07 00:42:54 +09:00
Kubernetes Prow Robot
56cc5e77a1 Merge pull request #120441 from tzneal/revert-npd-update
Revert "bump npd to v0.8.14"
2023-09-06 06:39:04 -07:00
Kubernetes Prow Robot
debe30de70 Merge pull request #120281 from gjkim42/feature-gate-sidecar-containers-in-kuberuntime
Feature-gate SidecarContainers code in pkg/kubelet/kuberuntime
2023-09-05 18:34:54 -07:00
Todd Neal
355ae44a3c Revert "bump npd to v0.8.14"
This reverts commit 7b44d73f73.
2023-09-05 20:28:53 -05:00
jinye
a774887262 cleanup:e2e:stop using deprecated framework.ExpectNotEqual 2023-09-05 18:16:57 +08:00
RuquanZhao
bfc3c2110f e2e-node: fix TopologyManager test jobs.
Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>
2023-09-01 17:53:16 +08:00