2355 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
d415647739 Merge pull request #115441 from bobbypage/busybox-mirror-test
test: Use preloaded busybox image in mirror pod test
2023-02-01 12:21:36 -08:00
Kubernetes Prow Robot
3a4cef70f2 Merge pull request #115445 from bobbypage/gh-115381
test: Fix node e2e device plugin flake
2023-02-01 02:55:06 -08:00
Kubernetes Prow Robot
bb7c9739a3 Merge pull request #114759 from my-git9/chore/k8staint
chore: add k8s node-role.kubernetes.io/control-plane taint
2023-01-31 21:01:17 -08:00
David Porter
225658884b test: Fix node e2e device plugin flake
The device plugin test expects that no other pods are running prior to
the test starting. However, it has been observed that in some cases
some resources may still be around from previous tests. This is because
the deletion of resources from other tests is handled by deleting that
test's framework's namespace which is done asynchronously without
waiting for the other test's namespace to be deleted.

As a result, when the node e2e device plugin starts, there may still be
other pods in process of termination. To work around this, add a retry
to the device plugin test to account for the time it takes to delete the
resources from the prior test.

Signed-off-by: David Porter <david@porter.me>
2023-01-31 17:36:10 -08:00
David Porter
a3291a87d7 test: Use preloaded busybox image in mirror pod test
Instead of hardcoding the busybox image, use the one that is preloaded
during the test using imageutils.

Signed-off-by: David Porter <david@porter.me>
2023-01-31 13:34:13 -08:00
Kubernetes Prow Robot
981c4d59fb Merge pull request #115155 from adrianreber/2023-01-18-checkpoint-test-result
Extend checkpoint e2e test to check for results
2023-01-30 18:43:16 -08:00
Kubernetes Prow Robot
4df945853e Merge pull request #115137 from swatisehgal/topologymgr-metrics
node: topologymgr: add metrics about admission requests and errors
2023-01-30 18:43:00 -08:00
David Porter
b96290c08f e2e node: Update runtime class handler skip logic
There are two runtime class tests which required the container runtime
config to include explicit configuration for `test-handler`. The current
logic skips these tests in non GCE environments. This skip is too strict
since the test is skipped in node e2e environments and in other
environments such as kind, which support running the test and also
configure `test-handler`.

Instead of skipping based on provider, add a new function
`NodeSupportsPreconfiguredRuntimeClassHandler` which examines the
underlying container runtime config and checks if the config includes
`test-handler`. The check is a bit brittle since it assumes container
runtime config paths, but it is a net improvement over skipping the test
entirely on non GCE environments.

This results in the test working in the common test environments, namely
GCE kube-up, node e2e, and kind.

Signed-off-by: David Porter <david@porter.me>
2023-01-24 14:43:24 -08:00
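A minimal sketch of the kind of check b96290c08f describes, assuming the runtime config is a text file that names the handler literally. The helper names and the candidate paths below are assumptions made for illustration; the commit message does not list the paths or show how `NodeSupportsPreconfiguredRuntimeClassHandler` is implemented.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// configMentionsHandler reports whether the runtime config text mentions
// the given handler name anywhere.
func configMentionsHandler(config, handler string) bool {
	return strings.Contains(config, handler)
}

// anyConfigMentionsHandler reads each candidate path and reports whether at
// least one readable config mentions the handler. Unreadable or missing
// files are skipped, which is where the brittleness noted in the commit
// message comes from: a config stored elsewhere is never seen.
func anyConfigMentionsHandler(paths []string, handler string) bool {
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue
		}
		if configMentionsHandler(string(data), handler) {
			return true
		}
	}
	return false
}

func main() {
	// Assumed candidate locations, chosen only for this example.
	paths := []string{"/etc/containerd/config.toml", "/etc/crio/crio.conf"}
	fmt.Println(anyConfigMentionsHandler(paths, "test-handler"))
}
```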
Kubernetes Prow Robot
765f2ef7c7 Merge pull request #114981 from adisky/revert
[Test] Revert "Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] K…
2023-01-24 03:58:15 -08:00
Adrian Reber
86b62b86d8 Extend checkpoint e2e test to check for results
When the e2e_node/checkpoint_container.go test was introduced no CRI
implementation supported the new CheckpointContainer RPC yet.

With the release of CRI-O 1.25 the CheckpointContainer is implemented
and the test has been extended to see if the content of the checkpoint
is as expected.

The test is skipped if the ContainerCheckpoint feature gate is disabled
or if the CRI implementation does not support the CheckpointContainer
RPC.

Signed-off-by: Adrian Reber <areber@redhat.com>
2023-01-23 18:07:35 +00:00
Swati Sehgal
340db7109d node: e2e: topologymgr: add tests for topology manager metrics
Add node e2e tests to verify population of topology metrics.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-01-19 14:40:37 +00:00
Swati Sehgal
51c6a1fbe7 node: e2e: cpumgr: Rename: s/getCPUManagerMetrics/getKubeletMetrics
Since we need to gather kubelet metrics for both the CPU Manager and the
Topology Manager, rename this function to a more generic name.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-01-19 14:18:05 +00:00
Aditi Sharma
d83c37c311 Update CNI version to 1.2.0
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2023-01-18 13:24:40 +05:30
Aditi Sharma
0a7b2eeae0 Revert "Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet should correctly account for terminated pods after restart"
This reverts commit 572360c5a5.
2023-01-11 15:05:49 +05:30
Ian K. Coolidge
cbb985a310 cpuset: Delete 'builder' methods
All usage of builder pattern is convertible to cpuset.New()
with the same or fewer lines of code.

Migrate Builder.Add to a private method of CPUSet, with a comment
that it is only intended for internal use to preserve the immutable
property of the exported interface.

This also removes 'require' library dependency, which avoids
non-standard library usage.
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
f3829c4be3 cpuset: Rename 'NewCPUSet' to 'New'
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
e5143d16c2 cpuset: Make 'ToSlice*' methods look like 'set' methods
In 'set', conversions to slice are also available, but under different names:

ToSliceNoSort() -> UnsortedList()
ToSlice() -> List()

Reimplement List() in terms of UnsortedList to save some duplication.
2023-01-06 23:32:51 +00:00
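The shape of the API after e5143d16c2 and f3829c4be3 might look roughly like this. `CPUSet` here is a simplified stand-in written for this example, not the actual package: it only shows a `New` constructor and a `List()` that reuses `UnsortedList()` and sorts the result.

```go
package main

import (
	"fmt"
	"sort"
)

// CPUSet is a simplified, illustrative set of CPU IDs.
type CPUSet struct {
	elems map[int]struct{}
}

// New builds a CPUSet from the given CPU IDs; duplicates collapse.
func New(cpus ...int) CPUSet {
	s := CPUSet{elems: make(map[int]struct{}, len(cpus))}
	for _, c := range cpus {
		s.elems[c] = struct{}{}
	}
	return s
}

// UnsortedList returns the CPU IDs in unspecified order.
func (s CPUSet) UnsortedList() []int {
	result := make([]int, 0, len(s.elems))
	for c := range s.elems {
		result = append(result, c)
	}
	return result
}

// List returns the CPU IDs in ascending order. It is written in terms of
// UnsortedList so the copy loop exists in only one place.
func (s CPUSet) List() []int {
	result := s.UnsortedList()
	sort.Ints(result)
	return result
}

func main() {
	fmt.Println(New(3, 1, 2, 1).List()) // prints [1 2 3]
}
```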
Ian K. Coolidge
a0c989b99a cpuset: Remove *Int64 methods
These are rarely used and can be accommodated with a trivial helper.
2023-01-06 23:32:51 +00:00
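The "trivial helper" mentioned in a0c989b99a could be as small as the following. The name `toInt64` is hypothetical; the commit message does not say what helper callers actually use.

```go
package main

import "fmt"

// toInt64 converts a slice of int CPU IDs to int64, preserving order.
// Hypothetical helper for callers that previously relied on *Int64 methods.
func toInt64(in []int) []int64 {
	out := make([]int64, 0, len(in))
	for _, v := range in {
		out = append(out, int64(v))
	}
	return out
}

func main() {
	fmt.Println(toInt64([]int{0, 2, 5})) // prints [0 2 5]
}
```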
Ian K. Coolidge
67a057d4f2 cpuset: Remove 'MustParse' method
Removes exit/fatal from cpuset library.

Usage in podresources test was not necessary.

Library reference in cpu_manager_test was moved to a local function, and
converted to use e2e test framework error catching.
2023-01-06 23:32:51 +00:00
ZhangKe10140699
572360c5a5 Fix:[Flake] [sig-node] Restart [Serial] [Slow] [Disruptive] Kubelet should correctly account for terminated pods after restart
2023-01-05 09:10:57 +08:00
xin.li
10ca605cdd chore: add k8s node-role.kubernetes.io/control-plane taint
Signed-off-by: xin.li <xin.li@daocloud.io>
2023-01-02 21:04:43 +08:00
Antonio Ojea
e0a23577d2 pass context to gomega
Change-Id: Ibef02a52d8922984a09efa48361b9876fc91287c
2022-12-19 13:14:02 +00:00
Patrick Ohly
2f6c4f5eab e2e: use Ginkgo context
All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
2022-12-16 20:14:04 +01:00
Kubernetes Prow Robot
770b39c65b Merge pull request #114072 from Tal-or/deflake_e2e_cpumanager_metrics_tests
e2e: cpumanager: proper test clean-up
2022-12-14 11:55:45 -08:00
Kubernetes Prow Robot
7403090e40 Merge pull request #113309 from swatisehgal/devicemgr-e2e-remove-flakiness
node: e2e: device plugins: Deflake e2e tests
2022-12-14 10:47:34 -08:00
Swati Sehgal
213a6edc57 node: e2e: Add descriptive messages for operation/error checks
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-12-13 14:54:48 +00:00
Patrick Ohly
d4729008ef e2e: simplify test cleanup
ginkgo.DeferCleanup has multiple advantages:
- The cleanup operation can get registered if and only if needed.
- No need to return a cleanup function that the caller must invoke.
- Automatically determines whether a context is needed, which will
  simplify the introduction of context parameters.
- Ginkgo's timeline shows when it executes the cleanup operation.
2022-12-13 08:09:01 +01:00
Swati Sehgal
62e4d39c2f node: e2e: address review comments (2022/12/12)
- use `ginkgo.DeferCleanup` instead of clean up in the AfterEach block
- encourage use of ginkgo by not extending expect.go

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-12-12 16:31:40 +00:00
Swati Sehgal
a9e3689e63 node: e2e: ensure clean cluster state before e2e tests are run
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-12-12 14:50:36 +00:00
Swati Sehgal
7e880d1bab node: e2e: ensure log rotation pod is deleted after test
Some node e2e tests check for expected number of pods running
on the node to verify the correct state of that node after running
test scenarios. An example of such a check is in the device plugin
end to end test here: [1].

If the node is not left in a clean state after an e2e test finishes
running, it can lead to flaky tests because the node might have
unexpected pods running on the node.

In order to avoid that, we make sure that the test pods are
cleaned up after the test runs.

[1]: https://github.com/kubernetes/kubernetes/blob/master/test/e2e_node/device_plugin_test.go#L189-L190

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-12-12 14:50:32 +00:00
Patrick Ohly
df5d84ae81 e2e: accept context from Ginkgo
Every ginkgo callback should return immediately when a timeout occurs or the
test run manually gets aborted with CTRL-C. To do that, they must take a ctx
parameter and pass it through to all code which might block.

This is a first automated step towards that: the additional parameter got added
with

    sed -i 's/\(framework.ConformanceIt\|ginkgo.It\)\(.*\)func() {$/\1\2func(ctx context.Context) {/' \
        $(git grep -l -e framework.ConformanceIt -e ginkgo.It )
    $GOPATH/bin/goimports -w $(git status | grep modified: | sed -e 's/.* //')

log_test.go was left unchanged.
2022-12-10 19:50:18 +01:00
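The effect of the sed expression in df5d84ae81 on a single line can be reproduced with Go's regexp package, as a way to see what the rewrite does. This is an illustrative translation, not part of the commit; `addContextParam` is a hypothetical name, and unlike the sed pattern the dots are escaped here so they match only literal dots.

```go
package main

import (
	"fmt"
	"regexp"
)

// itCallback matches a line that calls framework.ConformanceIt or ginkgo.It
// and ends with a parameterless callback literal, "func() {".
var itCallback = regexp.MustCompile(`(framework\.ConformanceIt|ginkgo\.It)(.*)func\(\) \{$`)

// addContextParam rewrites the trailing "func() {" of a matching line to
// "func(ctx context.Context) {" and leaves other lines unchanged.
func addContextParam(line string) string {
	return itCallback.ReplaceAllString(line, "${1}${2}func(ctx context.Context) {")
}

func main() {
	fmt.Println(addContextParam(`ginkgo.It("should run", func() {`))
	fmt.Println(addContextParam(`ginkgo.By("a step that is not a callback")`))
}
```

As in the original command, adding the `context` import is a separate step (goimports there).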
Talor Itzhak
56c5a95849 e2e: cpumanager: proper test clean-up
One of the cpumanager tests doesn't remove the pod
that got created during the test.

This causes pollution of other tests and failures
from time to time (depends on the test execution order).

In order to deflake the tests, we should delete the pod
and wait for it to be completely removed.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2022-11-22 17:25:52 +02:00
Andrew Sy Kim
6c8eacb157 test/e2e_node: set apiserver kubelet preferred addresses
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2022-11-21 15:09:04 -05:00
Michal Wozniak
c803892bd8 Enable the feature into beta
2022-11-09 09:02:40 +01:00
Michal Wozniak
52cd6755eb Add pod disruption conditions for kubelet initiated failures
2022-11-07 11:23:22 +01:00
Kubernetes Prow Robot
565f582c4b Merge pull request #113199 from bobbypage/node_e2e_stop_kubelet
test: Stop kubelet systemd service after node e2e
2022-11-06 01:34:16 -08:00
David Ashpole
64af1adace Second attempt: Plumb context to Kubelet CRI calls (#113591)
* plumb context from CRI calls through kubelet

* clean up extra timeouts

* try fixing incorrectly cancelled context
2022-11-05 06:02:13 -07:00
Kubernetes Prow Robot
6fe5429969 Merge pull request #113273 from bobbypage/restart_test_fix
test: Fix e2e_node restart_test flake
2022-11-04 05:14:14 -07:00
Kubernetes Prow Robot
a9f87ad6c8 Merge pull request #113384 from pohly/e2e-formatting
e2e: formatting enhancements
2022-11-02 21:40:08 -07:00
Francesco Romani
ff44dc1932 cpumanager: the FG is locked to default (ON)
hence we can remove the if() guards, the feature
is always available.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-11-02 18:41:41 +01:00
Kubernetes Prow Robot
63e40b1ed4 Merge pull request #113548 from aojea/revert_113408
Revert "plumb context from CRI calls through kubelet"
2022-11-02 10:13:14 -07:00
Kubernetes Prow Robot
e8449012e2 Merge pull request #113512 from ehashman/rm-ehashman-node
Remove ehashman from sig-node roles
2022-11-02 08:36:00 -07:00
Antonio Ojea
9c2b333925 Revert "plumb context from CRI calls through kubelet"
This reverts commit f43b4f1b95.
2022-11-02 13:37:23 +00:00
Kubernetes Prow Robot
7b84436168 Merge pull request #113408 from dashpole/kubelet_context
Plumb context to Kubelet CRI calls
2022-11-01 19:59:08 -07:00
Kubernetes Prow Robot
4a0bb39d2a Merge pull request #113282 from xmcqueen/master
Image Version Bump in Manifest for Node Perf Test tf-wide-deep
2022-11-01 19:58:45 -07:00
Elana Hashman
9d2d392802 Remove ehashman from sig-node roles
2022-11-01 12:16:43 -07:00
Patrick Ohly
5a01a52b0c test: extend gomega to use YAML for API types
Some of our API types contain fields that get rendered very poorly by
gomega.format.Object because they contain lots of internal information, for
example CreationTimestamp. As a result, a dump of a full API object typically
gets truncated.

What we want is a representation that is a) multi-line (in contrast to the
stringer implemented by our types) and b) drops empty fields where it
was defined that this is okay.

The normal YAML representation fits that requirement. We just need to teach
gomega how and when to do that. This cannot be done for each type through a
generated GomegaString method (lots of code, additional dependency in public
API on YAML encoder), but it can be done inside tests by adding a formatting
handler (new gomega feature).
2022-10-28 15:43:48 +02:00
David Ashpole
f43b4f1b95 plumb context from CRI calls through kubelet
2022-10-28 02:55:28 +00:00
Francesco Romani
bdc08eaa4b e2e: node: add tests for cpumanager metrics
Add tests to verify the cpumanager metrics are populated.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-10-27 14:40:56 +02:00
Brian McQueen
08c22d6d9a bumped version of tf-wide-deep image to 1.3 in test manifest, and removed the data download from the tf-wide-deep pod spec command
2022-10-23 10:13:13 -07:00