Commit Graph

2860 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
fa4b8f32ac Merge pull request #125935 from gjkim42/fix-125880
Terminate restartable init containers ignoring not-started containers
2024-07-23 15:45:11 -07:00
Kubernetes Prow Robot
a4f9910c51 Merge pull request #126014 from PannagaRao/kep-ephemeral-storage-quota
pkg/volume/*: Enable quotas in user namespace
2024-07-23 09:21:02 -07:00
Kubernetes Prow Robot
7590cb7adf Merge pull request #125257 from vinayakankugoyal/armor
KEP-24: Update AppArmor feature gates to GA stage.
2024-07-23 09:20:52 -07:00
Kubernetes Prow Robot
3e9a73d558 Merge pull request #126058 from AnishShah/patch-2
Deflake kubernetes-node-swap-fedora-serial jobs
2024-07-22 15:48:42 -07:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Patrick Ohly
d11b58efe6 DRA kubelet: refactor gRPC call timeouts
Some of the E2E node tests were flaky. Their timeout apparently was chosen
under the assumption that kubelet would retry immediately after a failed gRPC
call, with a factor of 2 as safety margin. But according to
0449cef8fd,
kubelet has a different, higher retry period of 90 seconds, which was exactly
the test timeout. The test timeout has to be higher than that.

As the tests don't use the gRPC call timeout anymore, it can be made
private. While at it, the name and documentation get updated.
2024-07-22 18:09:34 +02:00
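
To illustrate the timeout reasoning in d11b58efe6: a test timeout equal to the kubelet retry period (90 seconds, per the commit message) leaves no room for the retry to complete, so the timeout has to be strictly larger. The following is a minimal sketch in Python, not the actual Go test code; the function names and the 30-second margin are assumptions made up for this example.

```python
# Retry period quoted in the commit message above.
KUBELET_RETRY_PERIOD_S = 90.0


def timeout_is_safe(test_timeout_s: float, retry_period_s: float) -> bool:
    """A test timeout is only safe if it is strictly longer than one retry period."""
    return test_timeout_s > retry_period_s


def min_test_timeout(retry_period_s: float, margin_s: float = 30.0) -> float:
    """Return a timeout that covers one full retry period plus a margin.

    The default margin is a hypothetical value chosen for this sketch.
    """
    if margin_s <= 0:
        raise ValueError("margin must be positive so the timeout exceeds the retry period")
    return retry_period_s + margin_s


if __name__ == "__main__":
    # The old situation: timeout == retry period, which is not enough.
    print(timeout_is_safe(90.0, KUBELET_RETRY_PERIOD_S))
    # A timeout with a margin on top of the retry period.
    print(min_test_timeout(KUBELET_RETRY_PERIOD_S))
```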
Patrick Ohly
0b62bfb690 DRA e2e: adapt to v1alpha3 API
2024-07-22 18:09:34 +02:00
Itamar Holder
a6df16af85 node e2e test: exclude critical pods from swapping
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-07-22 17:56:52 +03:00
PannagaRamamanohara
d16fd6a915 pkg/volume: Use QuotaMonitoring in UserNamespace
Enable LocalStorageCapacityIsolationFSQuotaMonitoring
only when hostUsers in PodSpec is set to false.
Modify unit tests and e2e tests to verify

Signed-off-by: PannagaRamamanohara <pbhojara@redhat.com>
2024-07-22 09:43:57 -04:00
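
The condition described in d16fd6a915 can be sketched as a small predicate: quota monitoring applies only when the feature gate is on and the pod explicitly sets hostUsers to false. This is a hedged Python illustration, not the kubelet's Go implementation; the function and parameter names are hypothetical, and treating an unset hostUsers as "uses the host user namespace" is an assumption of this sketch.

```python
from typing import Optional


def fs_quota_monitoring_enabled(feature_gate_on: bool, host_users: Optional[bool]) -> bool:
    """Return True only for pods that run in their own user namespace.

    host_users mirrors the PodSpec hostUsers field: None means unset.
    Assumption: an unset field behaves like True (host user namespace).
    """
    return feature_gate_on and host_users is False


if __name__ == "__main__":
    for gate in (True, False):
        for host_users in (None, True, False):
            print(gate, host_users, fs_quota_monitoring_enabled(gate, host_users))
```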
Anish Shah
665df5794e wait for pod to be ready before continuing with the test
This test is flaky. I have noticed that this happens because the pod is not READY when it is being deleted at the end of the test. This fix ensures that the pod is READY before continuing with the rest of the test.
2024-07-22 05:26:59 +00:00
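
The fix in 665df5794e amounts to polling for readiness before moving on. A generic version of that pattern might look like the sketch below. It is written in Python for illustration only; the real test uses the Kubernetes e2e framework in Go, and `wait_until` and its parameters are names invented here.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 0.1,
) -> bool:
    """Poll condition() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical stand-in for "is the pod READY?": true on the third check.
    checks = {"n": 0}

    def pod_ready() -> bool:
        checks["n"] += 1
        return checks["n"] >= 3

    print(wait_until(pod_ready, timeout_s=1.0, interval_s=0.01))
```

Only once such a wait returns successfully would the test go on to delete the pod.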
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Kubernetes Prow Robot
f2428d66cc Merge pull request #125163 from pohly/dra-kubelet-api-version-independent-no-rest-proxy
DRA: make kubelet independent of the resource.k8s.io API version
2024-07-18 17:47:48 -07:00
Patrick Ohly
616a014347 DRA: move ResourceSlice publishing into DRA drivers
This is a first step towards making kubelet independent of the resource.k8s.io
API versioning because it now doesn't need to copy structs defined by that API
from the driver to the API server. The next step is removing the other
direction (reading ResourceClaim status and passing the resource handle to
drivers).

The drivers must get deployed so that they have their own connection to the API
server. Securing at least the writes via a validating admission policy should
be possible.

As before, the kubelet removes all ResourceSlices for its node at startup, then
DRA drivers recreate them if (and only if) they start up again. This ensures
that there are no orphaned ResourceSlices when a driver gets removed while the
kubelet was down.

While at it, logging gets cleaned up and updated to use structured, contextual
logging as much as possible. gRPC requests and streams now use a shared,
per-process request ID and streams also get logged.
2024-07-18 09:09:19 +02:00
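
The startup behaviour described in 616a014347 (kubelet deletes all ResourceSlices for its node, drivers recreate theirs only if they start again) can be modelled with an in-memory store. This is a simplified Python sketch under stated assumptions: `FakeAPIServer`, `kubelet_startup`, `driver_startup`, and the slice naming scheme are all hypothetical and do not correspond to real Kubernetes APIs.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ResourceSlice:
    name: str
    node: str
    driver: str


class FakeAPIServer:
    """In-memory stand-in for the API server, keyed by slice name."""

    def __init__(self) -> None:
        self.slices: Dict[str, ResourceSlice] = {}

    def create(self, s: ResourceSlice) -> None:
        self.slices[s.name] = s

    def delete_for_node(self, node: str) -> int:
        doomed = [name for name, s in self.slices.items() if s.node == node]
        for name in doomed:
            del self.slices[name]
        return len(doomed)


def kubelet_startup(api: FakeAPIServer, node: str) -> int:
    """Kubelet removes every slice for its own node, regardless of driver."""
    return api.delete_for_node(node)


def driver_startup(api: FakeAPIServer, node: str, driver: str) -> None:
    """A driver that starts up again republishes its own slice."""
    api.create(ResourceSlice(name=f"{node}-{driver}", node=node, driver=driver))


if __name__ == "__main__":
    api = FakeAPIServer()
    driver_startup(api, "node-a", "driver-x")
    driver_startup(api, "node-a", "driver-y")  # removed while kubelet was down
    driver_startup(api, "node-b", "driver-x")

    kubelet_startup(api, "node-a")             # wipes both node-a slices
    driver_startup(api, "node-a", "driver-x")  # only driver-x comes back
    print(sorted(api.slices))
```

Because driver-y never restarts in this scenario, its slice for node-a is not recreated, which is the "no orphaned ResourceSlices" property the commit message describes.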
Patrick Ohly
3d4bc44a2f dra e2e node: add test case for ResourceSlice handling during kubelet startup
Any redundant object must get deleted, but not the ones of other names.
2024-07-18 09:09:19 +02:00
Kubernetes Prow Robot
b68a58d372 Merge pull request #126141 from Nordix/esotsal/fix-126135
test/e2e_node: Fix pod_resize tests in CI
2024-07-17 16:29:25 -07:00
Peter Hunt
3d8cb4fa89 e2e_node: loosen proc mount test
the exact number of lines/ro lines is not important, just that there are more than 0 ro lines
and more than 1 line total.

this helps accommodate different architectures that implement different kernel APIs

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-07-17 13:26:23 -04:00
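
The loosened check from 3d8cb4fa89 (at least one read-only line and more than one line in total) could be expressed roughly as follows. This Python sketch assumes input in the /proc/mounts layout, where the fourth whitespace-separated field holds the comma-separated mount options; the actual Go test may parse its input differently, and the function names are invented for this example.

```python
from typing import List


def is_read_only(line: str) -> bool:
    """Assumed layout: device mountpoint fstype options dump pass."""
    fields = line.split()
    return len(fields) >= 4 and "ro" in fields[3].split(",")


def proc_mounts_look_sane(lines: List[str]) -> bool:
    """Pass if there is more than 1 line in total and more than 0 ro lines.

    Exact counts are deliberately not checked, since they vary by architecture.
    """
    non_empty = [line for line in lines if line.strip()]
    ro_lines = [line for line in non_empty if is_read_only(line)]
    return len(non_empty) > 1 and len(ro_lines) > 0


if __name__ == "__main__":
    sample = [
        "proc /proc proc rw,nosuid,nodev,noexec 0 0",
        "proc /proc/sys proc ro,nosuid,nodev,noexec 0 0",
    ]
    print(proc_mounts_look_sane(sample))
```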
Kubernetes Prow Robot
ad72be434d Merge pull request #125417 from bitoku/splitfs
KEP-4191: Split Image Filesystem add end-to-end tests
2024-07-16 23:27:06 -07:00
Sotiris Salloumis
3a01281d2f test/e2e_node: pod_resize tests
add NodeAlphaFeature label, as the feature is in alpha to be skipped in CI
add missing Arm64 check
2024-07-17 07:55:44 +02:00
Kubernetes Prow Robot
a00c834ebf Merge pull request #123303 from haircommander/proc-mount-e2e-tests
KEP-4265: add e2e tests for ProcMountType
2024-07-16 19:37:05 -07:00
Peter Hunt
a20a8225cf e2e_node: skip proc mount tests on nodes without userns support in the runtime
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Co-authored-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-07-16 17:46:23 -04:00
Peter Hunt
d6ee9ca860 test/e2e_node: add proc mount tests
including one Alpha only test, as the feature is in alpha

Signed-off-by: Peter Hunt <pehunt@redhat.com>
Co-authored-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-07-16 17:45:26 -04:00
Kubernetes Prow Robot
157f4b94d8 Merge pull request #125753 from SergeyKanzhelev/devicePluginFailuresTests
device plugin failure tests
2024-07-16 04:36:59 -07:00
Kubernetes Prow Robot
bfffd43108 Merge pull request #124296 from Nordix/esotsal/e2e_node_pod_resize_test
Add Pod Resize Node E2E test using framework in test/e2e_node
2024-07-15 19:27:23 -07:00
Kubernetes Prow Robot
2263f2d719 Merge pull request #124148 from cyclinder/add_flag_kubelet
kubelet: Add a TopologyManager policy option: max-allowable-numa-nodes
2024-07-15 19:27:16 -07:00
Kubernetes Prow Robot
5427708866 Merge pull request #125404 from mimowo/fix-kubelet-podip
Fix that PodIP field is temporarily removed for a terminal pod
2024-07-15 16:41:10 -07:00
Vinayak Goyal
bc06071495 Update AppArmor feature gates to GA stage.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2024-07-15 23:29:37 +00:00
Kubernetes Prow Robot
48eef1fc4f Merge pull request #125867 from zhifei92/fix-e2e-node-density
Fix the bug related to cleaning up density test pods
2024-07-15 11:55:09 -07:00
Davanum Srinivas
133c4290c7 Fix for OOMKiller test consistently failing in EC2 cgroupv1 serial jobs
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-07-13 18:44:15 -04:00
Michal Wozniak
5f1ab75d27 Fix that PodIP field is not set for terminal pod
2024-07-12 21:36:12 +02:00
Davanum Srinivas
2db4c4aaab Set ginkgo time if not specified explicitly
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-07-12 11:33:22 -04:00
Kevin Hannon
950781a342 add e2e tests for split filesystem
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2024-07-12 14:19:17 +00:00
Sotiris Salloumis
99f90934b4 Add Pod Resize Node E2E test using framework in test/e2e_node
2024-07-12 15:53:53 +02:00
zhifei92
115092b374 fix(e2e_node): density cleanup pods
2024-07-11 15:39:52 +08:00
Sergey Kanzhelev
541f2af78d device plugin failure tests
2024-07-10 20:14:59 +00:00
Kubernetes Prow Robot
672af9406e Merge pull request #125981 from dims/cleanup-pods-after-test-runs
[e2e-node] Cleanup pods after the test runs
2024-07-09 15:01:01 -07:00
Davanum Srinivas
f6836df520 [e2e-node] Cleanup pods after the test runs
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-07-09 16:53:28 -04:00
Gunju Kim
a03affab78 Terminate restartable init containers ignoring not-started containers
This ensures that the restartable init containers receive a termination
signal even if there are any not-started restartable init containers, by
ignoring the not-running containers.
2024-07-10 05:50:51 +09:00
Kubernetes Prow Robot
4a214f6ad9 Merge pull request #125461 from mimowo/pod-disruption-conditions-ga
Graduate PodDisruptionConditions to stable
2024-07-09 11:08:13 -07:00
cyclinder
87129c350a kubelet: Add a TopologyManager policy option: "max-allowable-numa-nodes"
Signed-off-by: cyclidner <kuocyclinder@gmail.com>
2024-07-09 22:26:24 +08:00
Davanum Srinivas
2dccf29f33 Fix for "Merged kubelet config does not match the expected configuration" in cgroupv1-based jobs
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-07-03 10:54:09 -04:00
Sascha Grunert
2df920120a Fix kubelet AppArmor rejection test
The corresponding e2e test needs to be adjusted alongside the
merged PR: https://github.com/kubernetes/kubernetes/pull/125776.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-07-03 10:54:22 +02:00
Kubernetes Prow Robot
ac9aec9f9b Merge pull request #125116 from pohly/dra-one-of-source
DRA: remove "source" indirection from v1 Pod API
2024-06-28 12:46:45 -07:00
Michal Wozniak
780191bea6 review remarks for graduating PodDisruptionConditions
2024-06-28 17:32:27 +02:00
Michal Wozniak
bf0c9885a4 Graduate PodDisruptionConditions to stable
2024-06-28 16:36:51 +02:00
Patrick Ohly
bde9b64cdf DRA: remove "source" indirection from v1 Pod API
This makes the API nicer:

    resourceClaims:
    - name: with-template
      resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      resourceClaimName: test-shared-claim

Previously, this was:

    resourceClaims:
    - name: with-template
      source:
        resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      source:
        resourceClaimName: test-shared-claim

A more long-term benefit is that other, future alternatives
might not make sense under the "source" umbrella.

This is a breaking change. It's justified because DRA is still
alpha and will have several other API breaks in 1.31.
2024-06-27 17:53:24 +02:00
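
The API change in bde9b64cdf is a mechanical flattening of each resourceClaims entry. For readers migrating manifests by hand, the transformation can be sketched as below. This is an illustrative Python helper, not part of Kubernetes; `flatten_resource_claims` is a hypothetical name, and the sketch assumes the manifest has already been loaded into plain dicts and lists.

```python
from typing import Any, Dict, List


def flatten_resource_claims(claims: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Move the fields nested under "source" up into each claim entry.

    Entries that already use the flat form are returned unchanged (as copies).
    """
    flattened = []
    for claim in claims:
        new_claim = {key: value for key, value in claim.items() if key != "source"}
        new_claim.update(claim.get("source") or {})
        flattened.append(new_claim)
    return flattened


if __name__ == "__main__":
    # The "previously" form from the commit message above.
    old = [
        {"name": "with-template",
         "source": {"resourceClaimTemplateName": "test-inline-claim-template"}},
        {"name": "with-claim",
         "source": {"resourceClaimName": "test-shared-claim"}},
    ]
    for entry in flatten_resource_claims(old):
        print(entry)
```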
Kubernetes Prow Robot
25a43070ee Merge pull request #123468 from ffromani/fix-mm-metrics-test
node: memory manager: fix the metrics tests
2024-06-26 12:00:45 -07:00
Kubernetes Prow Robot
b29dce0757 Merge pull request #125627 from yt-huang/clean-up
drop deprecated PollWithContext and adopt PollUntilContextTimeout ins…
2024-06-26 10:58:55 -07:00
Kubernetes Prow Robot
2200f5ef1b Merge pull request #125446 from AkihiroSuda/rro-e2e-remove-withserial
e2e_node/mount_rro_linux_test.go: remove unneeded WithSerial
2024-06-25 14:18:12 -07:00
Kubernetes Prow Robot
0913b90809 Merge pull request #125402 from iholder101/swap/skip-e2e-test-if-no-swap
[KEP-2400]: Swap e2e tests: skip swap stress tests if swap is not provisioned
2024-06-25 14:17:58 -07:00
Francesco Romani
5b6fe2f8db e2e: node: ensure no pod leaks in the container_manager test
During the debugging of https://github.com/kubernetes/kubernetes/pull/123468
it became quite evident there are unexpected pods, leftovers from
the container_manager_test. But we need stronger isolation among tests
to have good signal, so we add these safeguards (xref:
https://github.com/kubernetes/kubernetes/pull/123468#issuecomment-1977977609
)

Signed-off-by: Francesco Romani <fromani@redhat.com>
2024-06-25 07:40:41 +02:00