Commit Graph

23040 Commits

Author SHA1 Message Date
Anish Ramasekar
4e6d5dddfb [KMSv2] Add kind cluster and encryption config for e2e
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-02-13 06:42:54 +00:00
Patrick Ohly
3e760310b2 e2e: revise import restrictions
- test/e2e/framework/*.go should have very minimal dependencies.
  We can enforce that via import-boss.

- What each test/e2e/framework/* sub-package uses is less relevant,
  although ideally it also should be as minimal as possible in each case.

Enforcing this via import-boss ensures that new dependencies get flagged as
problem and thus will get additional scrutiny. It might be okay to add them,
but it needs to be considered.
2023-02-12 14:56:45 +01:00
Antonio Ojea
244d7449ce don't run loadbalancer tests on large environments
Change-Id: Id987e9469e563c0837c6437a44a65889cec2e202
2023-02-11 10:28:25 +00:00
David Porter
826472c99d test: e2e node shutdown test logging improvements
Since the pod names are reused across the test, searching the logs is
currently difficult.

Use a uuid for each pod name to make grepping the logs easier. Also,
always include the pod name and pod namespace in any logs or error
messages to make debugging easier.

Signed-off-by: David Porter <david@porter.me>
2023-02-10 16:54:31 -08:00
Kubernetes Prow Robot
0424a530a4 Merge pull request #115678 from pohly/e2e-full-reports
e2e: revise complete report creation
2023-02-10 15:07:29 -08:00
Kubernetes Prow Robot
1749bb2991 Merge pull request #115579 from ardaguclu/fix-wait-sh-timeout
flaky test wait.sh: Add deployment assertion before running wait
2023-02-10 13:59:29 -08:00
Patrick Ohly
3e2b26ce52 e2e: revise complete report creation
The previous approach was based on the observation that some Prow jobs use the
--report-dir parameter instead of the E2E_REPORT_DIR env variable. Parsing the
command line was necessary to use the --json-report and --junit-report
parameters.

But that is complex and can be avoided by triggering the creation of complete
reports in the E2E test suite. The paths are hard-coded and relative to the
report directory to keep the code simple.

There was a report that k8s-triage started processing more data after
6db4b741dd was merged. It's unclear whether
that was because of the new <report-dir>/ginkgo_report.xml file. To avoid
this potential problem, the reports are now in a "ginkgo" sub-directory.

While at it, error checking gets enhanced:
- Create directories at the start of
  the suite and bail out early if that fails.
- *All* e2e suites using the framework do this, not just test/e2e.
- Added missing error checking of truncated JUnit report writing.
2023-02-10 10:20:20 +01:00
Arda Güçlü
e0fedec69d (kubectl debug): Support debugging via files
Currently `kubectl debug` only supports passing names in command line.
However, users might want to pass resources in files by passing `-f` flag like
in all other kubectl commands.

This PR adds this ability.
2023-02-10 10:21:30 +03:00
Kubernetes Prow Robot
b2f8c8f00d Merge pull request #115635 from bobbypage/npd-time-fix
test: Simplify NPD start timestamp calculation
2023-02-09 18:37:31 -08:00
Kubernetes Prow Robot
0698d9eb82 Merge pull request #115649 from aramase/grpc-metrics
[KMSv2] Add metrics for grpc service
2023-02-09 15:50:45 -08:00
Kubernetes Prow Robot
6e2e61bb3c Merge pull request #115657 from saschagrunert/inject-base64
Allow SSH e2e node base64 key injection
2023-02-09 14:45:06 -08:00
Kubernetes Prow Robot
95c65ca3a0 Merge pull request #115454 from dgrisonnet/promote-pod-resource-metrics
Promote pod resource metrics to stable
2023-02-09 12:36:16 -08:00
Damien Grisonnet
49da8a1d4a scheduler: promote pod resource metrics to stable
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2023-02-09 20:30:45 +01:00
Anish Ramasekar
de3b2d525b [KMSv2] Add metrics for grpc service
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-02-09 18:51:37 +00:00
Sascha Grunert
85106dc327 Allow SSH e2e node base64 key injection
With the change of the CRI-O jobs to use butane, we now have a
verification for base64 data urls in place. This means that the
following URL is invalid:

```
data:text/plain;base64,GCE_SSH_PUBLIC_KEY_FILE_CONTENT
```

This means we have to pass valid base64 to the URL. To fix that, we now
allow to inject SSH key values with both, the
`GCE_SSH_PUBLIC_KEY_FILE_CONTENT` field and its base64 encoded variant.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-02-09 16:17:11 +01:00
vaibhav2107
6ab8a8fbec Updated the change in registry 2023-02-09 09:37:44 +05:30
David Porter
7fe371a974 test: Simplify NPD start timestamp calculation
The NPD test checks when NPD started to determine if it is needed to
check the kubelet start event. The current logic requires parsing the
journalctl logs which is quite fragile and is broken now because of
systemd changing the expected log format.

Newer versions of systemd do not print "end at" or "logs begin at" and
instead may print "No entries", which will result in the test panicking.

```
$ journalctl -u foo.service
-- No entries --
```

For units started, it will not print "end at" or "logs begin at":

```
root@ubuntu-jammy:~# journalctl -u foo.service
Feb 08 22:02:19 ubuntu-jammy systemd[1]: Started /usr/bin/sleep 1s.
Feb 08 22:02:20 ubuntu-jammy systemd[1]: foo.service: Deactivated successfully.
```

To avoid relying on log parsing which is fragile, let's instead directly
ask systemd when the NPD service started and parse the resulting
timestamp.

Signed-off-by: David Porter <david@porter.me>
2023-02-08 13:58:45 -08:00
Kubernetes Prow Robot
468ce59183 Merge pull request #115557 from MikeSpreitzer/cleanup-path-hack
Simplify construction of /metrics request
2023-02-08 09:28:58 -08:00
Matthew Cary
69808b74ec Remove obsolete GKE local SSD test
Change-Id: I156bd03ac740c2ebe394081d3106851f7182269f
2023-02-07 17:33:32 -08:00
Riaan Kleinhans
999e9f14f7 remove conformance tested endpoints 2023-02-08 11:55:44 +13:00
Kubernetes Prow Robot
090025f5e6 Merge pull request #115548 from pohly/e2e-wait-for-pods-with-gomega
e2e: wait for pods with gomega, II
2023-02-07 07:01:21 -08:00
Kubernetes Prow Robot
22b88dea36 Merge pull request #115315 from enj/enj/i/kas_kubelet_conn_close
kubelet/client: collapse transport wiring onto standard approach
2023-02-07 07:01:14 -08:00
Kubernetes Prow Robot
4f321041bd Merge pull request #115537 from MadhavJivrajani/bump-tools-deps-go120
*: Bump golangci-lint version and adapt to new linters
2023-02-07 05:53:12 -08:00
Pavel Beschetnov
456de495ef Calculate more precise usage for replicas 2023-02-07 12:41:36 +00:00
Arda Güçlü
60401d35d1 flaky test wait.sh: Add deployment assertion before running wait
There is a test in wait.sh integration suite which is checking the
given timeout value(passed by user) is equal to actual happened timeout value.

However, this test rarely gets `no matching resources found` error and
causes flakyiness. The reason is we are running wait command, immediately
after applying deployment. In reality, timeout test does not care about
deployment, since it is testing the timeout by passing invalid configurations.
But we need this deployment to not get `no matching resources found` error.

That's why, this PR adds deployment assertion before executing wait command.
2023-02-07 14:19:34 +03:00
Madhav Jivrajani
5e1f440d0a *: Fix linter warnings
Adapt to newly improved linters in golangci-lint v1.51.1

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2023-02-07 13:01:41 +05:30
Kubernetes Prow Robot
e944fc28ca Merge pull request #115443 from torredil/master
Add windows nodeSelector to e2e storage testing pods
2023-02-06 18:27:09 -08:00
Monis Khan
754cb3d601 kubelet/client: collapse transport wiring onto standard approach
Signed-off-by: Monis Khan <mok@microsoft.com>
2023-02-06 20:34:49 -05:00
Mike Spreitzer
e9973979d0 Simplify construction of /metrics request 2023-02-06 16:20:34 -05:00
torredil
25389ee0ee Add nodeSelector to e2e storage testing pods
Signed-off-by: torredil <torredil@amazon.com>
2023-02-06 16:00:51 +00:00
Patrick Ohly
136f89dfc5 e2e: use error wrapping with %w
The recently introduced failure handling in ExpectNoError depends on error
wrapping: if an error prefix gets added with `fmt.Errorf("foo: %v", err)`, then
ExpectNoError cannot detect that the root cause is an assertion failure and
then will add another useless "unexpected error" prefix and will not dump the
additional failure information (currently the backtrace inside the E2E
framework).

Instead of manually deciding on a case-by-case basis where %w is needed, all
error wrapping was updated automatically with

    sed -i "s/fmt.Errorf\(.*\): '*\(%s\|%v\)'*\",\(.* err)\)/fmt.Errorf\1: %w\",\3/" $(git grep -l 'fmt.Errorf' test/e2e*)

This may be unnecessary in some cases, but it's not wrong.
2023-02-06 15:39:13 +01:00
Patrick Ohly
9878e735dd e2e pod: unit test for pod status + API error
This covers new behavior in gomega.
2023-02-06 15:39:13 +01:00
Patrick Ohly
1bd1167d56 e2e pod: remove dead code 2023-02-06 15:39:13 +01:00
Patrick Ohly
3bb735e6fa e2e pod: use gomega.Eventually in WaitForRestartablePods 2023-02-06 15:39:13 +01:00
Patrick Ohly
1e346c4e4a e2e pod: convert ProxyResponseChecker into matcher
Instead of pod responses being printed to the log each time polling fails, we
get a consolidated failure message with all unexpected pod responses if (and
only if) the check times out or a progress report gets produced.
2023-02-06 15:39:13 +01:00
Patrick Ohly
c3266cde77 e2e: consolidate pod response checking
This renames PodsResponding to WaitForPodsResponding for the sake of
consistency and adds a timeout parameter. That is necessary because some other
users of NewProxyResponseChecker used a much lower timeout (2min vs. 15min).

Besides simplifying some code, it also makes it easier to rewrite
ProxyResponseChecker because it only gets used in WaitForPodsResponding.
2023-02-06 15:39:13 +01:00
Patrick Ohly
89a5d6d8af e2e pod: use gomega.Eventually in WaitForPodNotFoundInNamespace 2023-02-06 15:39:13 +01:00
Patrick Ohly
9df3e2a47a e2e: replace WaitForPodToDisappear with WaitForPodNotFoundInNamespace
WaitForPodToDisappear was always called such that it listed all pods, which
made it less efficient than trying to get just the one pod it was checking for.

Being able to customize the poll interval in practice wasn't useful, therefore
it can be replaced with WaitForPodNotFoundInNamespace.
2023-02-06 15:39:12 +01:00
Patrick Ohly
45d4631069 e2e: consolidate checking a pod list
WaitForPods is now a generic function which lists pods and then checks the pods
that it found against some provided condition. A parameter determines how many
pods must be found resp. match the condition for the check to succeed.
2023-02-06 15:39:12 +01:00
Patrick Ohly
d8428c6fb1 e2e pod: use gomega.Eventually in WaitTimeoutForPodReadyInNamespace/WaitForPodCondition
These get converted together because they relied on FinalErr which now isn't
needed anymore.
2023-02-06 15:39:12 +01:00
Patrick Ohly
01a40d9d6b e2e framework: support getting list of objects
This is similar to the previous support for getting a single object.
2023-02-06 15:39:12 +01:00
Patrick Ohly
3dd185aa40 e2e pod: use gomega.Eventually in WaitForPodsRunningReady
The code becomes simpler (78 insertions, 91 deletions), easier to read (all
code entirely inside WaitForPodsRunningReady, no need to declare and later
overwrite variables) and possibly more correct (if all API calls failed,
the resulting error was ignored when allowedNotReadyPods > 0).
2023-02-06 15:39:12 +01:00
Patrick Ohly
afbb2c5323 e2e framework: turn function into gomega.Matcher
The intention is to use this inside a helper function where the
corresponding Expect call is known.
2023-02-06 15:39:12 +01:00
Patrick Ohly
4d63e7d4d6 e2e: remove unused label filter from WaitForPodsRunningReady
None of the users of the functions passed anything other than nil or an empty
map and the implementation ignore the parameter - it seems like a candidate for
simplification.
2023-02-06 15:39:12 +01:00
Patrick Ohly
8181f97ecc e2e framework: include additional stack backtrace in failures
When a Gomega failure is converted to an error, the stack at the time when the
failure occurs may be useful: error wrapping provides some bread crumbs that
can be followed to determine where the failure really occurred, but error
wrapping may be missing or ambiguous.

To provide the additional information, a FailureError now includes a full stack
backtrace. The backtrace intentionally makes no attempt to exclude framework
functions besides the gomega support itself because helpers like
e2e/framework/pod may be relevant.

That backtrace is not included in the failure message for the sake of
brevity. Instead, it gets logged as part of the test's output.
2023-02-06 15:39:12 +01:00
Patrick Ohly
005a9da0cc e2e framework: implement pod polling with gomega.Eventually
gomega.Eventually provides better progress reports: instead of filling up the
log with rather useless one-line messages that are not enough to to understand
the current state, it integrates with Gingko's progress reporting (SIGUSR1,
--poll-progress-after) and then dumps the same complete failure message as
after a timeout. That makes it possible to understand why progress isn't
getting made without having to wait for the timeout.

The other advantage is that the failure message for some unexpected pod state
becomes more readable: instead of encapsulating it as "observed object" inside
an error, it directly gets rendered by gomega.
2023-02-06 15:39:12 +01:00
Patrick Ohly
71dc81ec89 e2e framework: gomega assertions as errors
Calling gomega.Expect/Eventually/Consistently deep inside a helper call chain
has several challenges:
- the stack offset must be tracked correctly, otherwise the callstack
  for the failure starts at some helper code, which is often not informative
- augmenting the failure message with additional information from each
  caller implies that each caller must pass down a string and/or format
  string plus arguments

Both challenges can be solved by returning errors:
- the stacktrace is taken at that level where the error is
  treated as a failure instead of passing back an error, i.e.
  inside the It callback
- traditional error wrapping can add additional information, if
  desirable

What was missing was some easy way to generate an error via a gomega
assertion. The new infrastructure achieves that by mirroring the
Gomega/Assertion/AsyncAssertion interfaces with errors as return values instead
of calling a fail handler.

It is intentionally less flexible than the gomega APIs:
- A context must be passed to Eventually/Consistently as first
  parameter because that is needed for proper timeout handling.
- No additional text can be added to the failure through this
  API because error wrapping is meant to be used for this.
- No need to adjust the callstack offset because no backtrace
  is recorded when a failure occurs.

To avoid the useless "unexpected error" log message when passing back a gomega
failure, ExpectNoError gets extended to recognize such errors and then skips
the logging.
2023-02-06 15:39:12 +01:00
Patrick Ohly
d17ce64ac5 e2e storage: remove WaitForPodTerminatedInNamespace
Calling WaitForPodTerminatedInNamespace after testFlexVolume is useless because
the client pod that it waits for always gets deleted by testVolumeClient:

0fcc3dbd55/test/e2e/framework/volume/fixtures.go (L541-L546)

Worse, because WaitForPodTerminatedInNamespace treats "not found" as "must keep
polling", these two tests always kept waiting for 5 minutes:

    Kubernetes e2e suite: [It] [sig-storage] Flexvolumes should be mountable
    when non-attachable 	6m4s

The only reason why these tests passed is that WaitForPodTerminatedInNamespace
used to return the "not found" API error. That is not guaranteed and about to
change.
2023-02-06 15:39:12 +01:00
Antonio Ojea
7f5ae1c0c1 Revert "e2e: wait for pods with gomega" 2023-02-06 12:08:22 +01:00
Kubernetes Prow Robot
cf14b50b0d Merge pull request #114502 from cpanato/go1.20
[go] Bump images, dependencies and versions to go 1.20
2023-02-04 13:40:28 -08:00