Commit Graph

2771 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
65f8129e51 Merge pull request #124668 from bart0sh/PR143-e2e-node-fix-containers-lifecycle
node_e2e: refactor RunTogether function
2024-05-06 15:11:46 -07:00
Kubernetes Prow Robot
de1674829c Merge pull request #123886 from adrianreber/2024-03-12-criu-not-found
Handle containerd "CRIU not found" error message
2024-05-04 06:54:28 -07:00
Ed Bartosh
6ecf0da1a5 node_e2e: refactor RunTogether function
2024-05-02 13:41:47 +03:00
Matthias Bertschy
f7ea5f3fe1 e2e lifecycle: increase delay for restartable init containers
Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
2024-05-01 22:12:04 +02:00
Matthias Bertschy
8833b4def0 e2e lifecycle: fix finishing -> exiting
Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
2024-05-01 18:27:13 +02:00
Matthias Bertschy
851d149a88 e2e lifecycle: use millisecond resolution for logs
Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
2024-05-01 18:27:10 +02:00
Kubernetes Prow Robot
d0fddf143b Merge pull request #122148 from pohly/controllers-context-support
controllers + apiserver: enhance context support
2024-04-30 01:30:09 -07:00
Kubernetes Prow Robot
1fd835ce59 Merge pull request #123398 from ffromani/remove-legacy-checkpoint
node: devicemgr: remove obsolete pre-1.20 checkpoint file support
2024-04-29 14:46:53 -07:00
Patrick Ohly
b92273a760 apiserver + controllers: enhance context support
27a68aee3a introduced context support for events. Creating an event
broadcaster with context makes tests more resilient against leaking goroutines
when that context gets canceled at the end of a test and enables per-test
output via ktesting.

The context could get passed to the constructor. A cleaner solution is to
enhance context support for the apiserver and then pass the context into the
controller's run method. This ripples up the call stack to all places which
start an apiserver.
2024-04-29 20:59:21 +02:00
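The pattern described in b92273a760, passing a context into a controller's run method so that background goroutines stop when a test ends, can be sketched as follows. This is a minimal illustration using only the standard library; the `controller` type and its `Run` and `Wait` methods are hypothetical names and are not the Kubernetes API.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// controller is a hypothetical stand-in for a component with background work.
type controller struct {
	wg      sync.WaitGroup
	stopped bool
}

// Run starts a worker goroutine whose lifetime is tied to ctx.
func (c *controller) Run(ctx context.Context) {
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		ticker := time.NewTicker(10 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				// The caller canceled the context: exit instead of leaking.
				c.stopped = true
				return
			case <-ticker.C:
				// Periodic work would happen here.
			}
		}
	}()
}

// Wait blocks until the worker goroutine has exited.
func (c *controller) Wait() { c.wg.Wait() }

// runAndCancel starts a controller, cancels its context, and reports
// whether the worker observed the cancellation.
func runAndCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	c := &controller{}
	c.Run(ctx)
	cancel()
	c.Wait()
	return c.stopped
}

func main() {
	fmt.Println("worker stopped:", runAndCancel())
}
```

In a test, the context would typically be canceled during cleanup, so no goroutine outlives the test that started it.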
Ed Bartosh
e4c6adacf0 Revert "add coverage tests for probes behavior"
This reverts commit 9be9832184.
2024-04-24 20:56:46 +03:00
Kubernetes Prow Robot
9db6aac7f3 Merge pull request #124086 from matthyx/probes
add coverage tests for probes behavior
2024-04-23 17:02:17 -07:00
Kubernetes Prow Robot
1a4f5a30f0 Merge pull request #124097 from Nordix/esotsal/cpu_manager_test_clean_code
e2e_node: clean cpu_manager test
2024-04-22 18:26:12 -07:00
Kubernetes Prow Robot
11ca079137 Merge pull request #124396 from mimowo/make-sure-traps-are-registered
Make e2e node tests more resilient by ensuring the SIGTERM trap is registered
2024-04-22 17:25:40 -07:00
Kubernetes Prow Robot
d8f8c7fae0 Merge pull request #124288 from pohly/test-e2e-node-debugger
e2e node: debugger support
2024-04-22 08:43:27 -07:00
huweiwen
6ec421e2cf test/e2e: do not use global variable for image
We have the "-kube-test-repo-list" command line flag to override the image registry. If we store the image in a global variable, that override cannot take effect.

This can cause puzzling bugs, e.g. the containerIsUnused() function will compare against an incorrect image address.
2024-04-22 19:29:39 +08:00
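The problem described in 6ec421e2cf can be reproduced in miniature: a value computed before flag parsing (as a package-level variable would be) keeps the default, while a value computed on demand sees the override. The sketch below uses only the standard `flag` package; the `-registry` flag, the `config` type, and the registry names are hypothetical and do not come from the Kubernetes test framework.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds a hypothetical registry setting that a flag can override.
type config struct{ registry string }

// image builds the image address from the current registry value.
func (c *config) image() string { return c.registry + "/busybox" }

// demo returns the image address computed before and after flag parsing.
func demo(args []string) (early, late string) {
	cfg := &config{}
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.StringVar(&cfg.registry, "registry", "registry.example", "image registry (hypothetical flag)")

	// Anti-pattern: evaluated before parsing, like a global initialized at startup.
	early = cfg.image()

	if err := fs.Parse(args); err != nil {
		return early, ""
	}

	// Evaluated on demand, after parsing, so the override is visible.
	late = cfg.image()
	return early, late
}

func main() {
	early, late := demo([]string{"-registry=mirror.example"})
	fmt.Println("computed before parsing:", early)
	fmt.Println("computed after parsing: ", late)
}
```

Calling a function each time the image is needed, rather than caching the result in a global, avoids the stale value.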
Dan Winship
01c7378531 Fix up pod hostIPs e2e
- The feature is GA so there's no feature gate so it doesn't need any
  special label now.

- The test is not dual-stack-specific, so it shouldn't claim to be.

- It asserted node-IP-assigning behavior that is not guaranteed to
  work on all clouds. (Among other things: that there are no "extra"
  InternalIPs, and that there are InternalIPs of every supported IP
  family, rather than there only being ExternalIPs of some families.)
2024-04-20 10:31:29 -04:00
Michal Wozniak
dccb775d6e Make e2e node tests more resilient by ensuring the SIGTERM trap is registered
2024-04-19 09:05:36 +02:00
Kubernetes Prow Robot
1d171a7501 Merge pull request #124289 from pohly/test-e2e-node-verbosity-fix
e2e node: fix -v support
2024-04-18 04:24:23 -07:00
Kubernetes Prow Robot
6f995a4bbc Merge pull request #124181 from testwill/close_tmpfile
fix: close tmp file
2024-04-18 03:24:08 -07:00
Kubernetes Prow Robot
7f67cb5960 Merge pull request #123969 from liangyuanpeng/cleanup_rand
cleanup: delete rand.Seed(time.Now().UnixNano()) and use the global number generator.
2024-04-18 02:10:26 -07:00
Kubernetes Prow Robot
f88d2454c5 Merge pull request #123950 from kannon92/move-eviction-tests
move system node critical test to eviction test lane
2024-04-18 02:10:17 -07:00
Kubernetes Prow Robot
0c55f74aed Merge pull request #123894 from saschagrunert/cni-plugins
Update cni-plugins to v1.4.1
2024-04-18 01:04:39 -07:00
Patrick Ohly
d97b67d97a e2e node: support running the test binary under a debugger
Single-stepping interactively through a test can be useful to understand what's
happening and to investigate the state at each step.

Similar support was added early to hack/ginkgo-e2e.sh, so the same env variable
is used again.
2024-04-16 11:46:28 +02:00
Francesco Romani
181fb0da51 node: devicemgr: remove obsolete pre-1.20 checkpoint file support
In commit 2f426fdba6 we added
compatibility (and tests) to deal with pre-1.20 checkpoint files.
We are now well past the end of support for pre-1.20 kubelets,
so we can get rid of this code.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2024-04-15 14:01:56 +02:00
Patrick Ohly
ff541e7924 e2e node: fix -v support
Since 43539c855f (first released in
v1.30.0-alpha.2), the test/e2e/framework manages -v and -vmodule and uses them
for a logger which writes to the Ginkgo output stream.

This did not work for test/e2e_node, because:
- logs.AddFlags(pflag.CommandLine) registers its own -v and -vmodule flags
- pflag.CommandLine.AddGoFlagSet(flag.CommandLine) skips the corresponding
  flags in the flag.CommandLine
- pflag.Parse() initializes the settings in the "logs" package even though
  those are not used at runtime

The solution is to not use the "logs" package.
2024-04-12 12:27:29 +02:00
Kevin Hannon
43e0bd4304 mark flaky jobs as flaky and move them to a different job
2024-04-08 09:27:15 -04:00
guoguangwu
ad7799d07d fix: close tmp file
Signed-off-by: guoguangwu <guoguangwug@gmail.com>
2024-04-06 10:55:08 +08:00
Sotiris Salloumis
87e113261d e2e_node: clean cpu_manager test
2024-03-28 12:41:07 +01:00
Matthias Bertschy
9be9832184 add coverage tests for probes behavior
Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
2024-03-27 22:31:47 +01:00
Kubernetes Prow Robot
f4e246bc93 Merge pull request #123908 from Nordix/esotsal/OOMKiller
oomkiller_linux_test: fix warnings
2024-03-27 11:42:19 -07:00
Kubernetes Prow Robot
20d0ab7ae8 Merge pull request #124011 from bart0sh/PR138-e2e_node-fix-podresurces-failure
e2e_node: fix podresources test
2024-03-22 08:16:08 -07:00
Kubernetes Prow Robot
95a6f2e4dc Merge pull request #124010 from bart0sh/PR137-e2e_node-fix-admission-error
Fix admission error on podresources e2e test
2024-03-21 14:14:13 -07:00
Ed Bartosh
6f5240b19c e2e_node: fix podresources test
Fixed the `The phase of Pod e2e-test-pod is Succeeded which is unexpected`
error. `e2epod.NewPodClient(f).CreateSync` is unable to catch the 'Running'
status of the pod because the pod finishes too fast.
Using the `Create` API should solve the issue, as it doesn't query the pod
status.
2024-03-21 13:11:03 +02:00
Ed Bartosh
9ce994af9f e2e_node: remove Dbus test case
The test case restarts dbus and systemd, which is considered a dangerous
practice and caused a slowdown of the test cases for CRI-O Serial jobs.
2024-03-20 18:38:11 +02:00
Ed Bartosh
247392271f Fix admission error
Fixed UnexpectedAdmissionError: Allocate failed due to not enough cpus
available to satisfy request: requested=2, available=1, which is unexpected
2024-03-20 18:03:13 +02:00
Lan Liang
dc992adad3 cleanup: delete rand.Seed(time.Now().UnixNano()) and use the global number generator.
see https://tip.golang.org/doc/go1.20

Signed-off-by: Lan Liang <gcslyp@gmail.com>
2024-03-18 08:10:12 +00:00
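As background for dc992adad3: since Go 1.20 the top-level `math/rand` functions are seeded randomly at startup and `rand.Seed` is deprecated, so an explicit `rand.Seed(time.Now().UnixNano())` call is no longer needed. A minimal sketch of the two resulting patterns follows; the helper names `pick` and `newSeeded` are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pick returns a random element using the global generator.
// With Go 1.20 or later, no explicit seeding is required.
func pick(items []string) string {
	return items[rand.Intn(len(items))]
}

// newSeeded returns a private generator for cases that need a
// reproducible sequence, instead of seeding the global one.
func newSeeded(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

func main() {
	fmt.Println("random pick:", pick([]string{"a", "b", "c"}))

	// Two generators with the same seed yield the same sequence.
	r1, r2 := newSeeded(42), newSeeded(42)
	fmt.Println("reproducible:", r1.Intn(1000) == r2.Intn(1000))
}
```

A private `*rand.Rand` keeps reproducibility local to the code that needs it and leaves the global generator untouched.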
Sotiris Salloumis
a7f23e46da Fix OOMKiller test warnings
2024-03-17 09:16:24 +01:00
Kevin Hannon
0bdc4c3911 move system node critical test to eviction test lane
2024-03-15 10:35:02 -04:00
Akihiro Suda
1dc05009fe api: NodeStatus: rename RuntimeClasses to RuntimeHandlers
The runtime classes are apiserver's concept, while the handlers are kubelet's concept.
For NodeStatus, it makes more sense to return the latter ones here.

This commit modifies the following files:

- pkg/apis/core/types.go
- staging/src/k8s.io/api/core/v1/types.go
- pkg/kubelet/nodestatus/setters.go
- pkg/kubelet/kubelet_node_status.go
- pkg/registry/core/node/strategy.go
- test/e2e_node/mount_rro_linux_test.go

Other changes were auto-generated by running `make update`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-14 08:06:39 +09:00
Kevin Hannon
19ae61bab0 innocent-pod should not be evicted due to exceeding requests/limits
2024-03-12 13:37:11 -04:00
Sascha Grunert
a35b75ee57 Update cni-plugins to v1.4.1
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-03-12 12:06:52 +01:00
Adrian Reber
cba34770d2 Handle containerd "CRIU not found" error message
During the PR to get "Forensic Container Checkpointing" enabled in
containerd the decision was made to not correctly report if containerd
cannot find the CRIU binary. The reason was that the e2e_node checkpoint
test did not understand the error message.

The e2e_node checkpoint test is skipped if the container runtime (CRI-O
or containerd) does not enable checkpoint support or if checkpoint
support is not implemented.

This commit adds another reason to skip the test: the underlying OS
which is used to test "Forensic Container Checkpointing" in combination
with containerd or CRI-O is missing the CRIU binary.

This was encountered on Google's Container-Optimized OS (COS) based
tests where CRIU was not installed.

With this change merged it is possible for containerd to return the
correct error message without breaking Kubernetes e2e tests.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-03-12 08:13:53 +00:00
Akihiro Suda
ea14ccdf13 e2e_node: mount_rro: fix error string comparison
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-11 11:50:25 +09:00
Akihiro Suda
5cc1e56248 e2e_node: mount_rro: add SkipUnlessFeatureGateEnabled(RecursiveReadOnlyMounts)
Fix issue 123848

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-11 11:50:25 +09:00
Akihiro Suda
d4925ce8f8 e2e: KEP-3857: Recursive Read-only (RRO) mounts
Usage:
```
make test-e2e-node \
  TEST_ARGS='--service-feature-gates=RecursiveReadOnlyMounts=true --kubelet-flags="--feature-gates=RecursiveReadOnlyMounts=true"' \
  FOCUS="Mount recursive read-only" SKIP=""
```

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Patrick Ohly
d59676a545 dra kubelet: publish NodeResourceSlices
The information is received from the DRA driver plugin through a new gRPC
streaming interface. This is backwards compatible with old DRA driver kubelet
plugins: their gRPC server will return "not implemented", and that can be
handled by kubelet. Therefore no API break is needed.

However, DRA drivers need to be updated because the Go API changed. They can
return
    status.New(codes.Unimplemented, "no node resource support").Err()
if they don't support the new ListAndWatchResources method and
structured parameters.

The controller in kubelet then synchronizes this information from the driver
with NodeResourceSlice objects, creating, updating and deleting them as needed.
2024-03-07 22:22:13 +01:00
Kubernetes Prow Robot
bd25605619 Merge pull request #123435 from tallclair/apparmor-ga
AppArmor fields API
2024-03-06 15:35:14 -08:00
Tim Allclair
0eb5f52d06 Rename AppArmor annotation constants with Deprecated
2024-03-06 10:46:31 -08:00
Kubernetes Prow Robot
3686ceb5b8 Merge pull request #122745 from kannon92/swap-no-swap-default
[KEP-2400] add no swap as the default option for swap
2024-03-05 16:32:40 -08:00
Kubernetes Prow Robot
5fd38a8c78 Merge pull request #122907 from sohankunkerkar/prepare-kep-3983-for-beta
[KEP-4419]: promote KubeletConfigDropInDir feature to beta
2024-03-05 14:45:39 -08:00