Commit Graph

1133 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
7c53c1eb91
Merge pull request #89819 from pohly/enhance-podlogs-master
tests: enhance podlogs
2020-04-16 22:19:07 -07:00
Kubernetes Prow Robot
ae8d30631d
Merge pull request #90214 from pohly/stop-pod-master
storage tests: really wait for pod to disappear
2020-04-16 20:49:07 -07:00
Kubernetes Prow Robot
aa0665dfee
Merge pull request #90147 from gnufied/use-random-node-zone-for-inline-e2e
Use random zone for inline volume e2e tests
2020-04-16 13:59:08 -07:00
Patrick Ohly
0cdd5365a1 storage tests: really wait for pod to disappear
As seen in one case (https://github.com/intel/pmem-csi/issues/587), a
pod can reach the "not running" state although its ephemeral volumes
are still being torn down by kubelet and the CSI driver. What happens
then is that the test returns too early and even deleting the
namespace and thus the pod succeeds before the NodeVolumeUnpublish
really finishes.

To avoid this, StopPod now waits for the pod to really disappear.
2020-04-16 21:10:56 +02:00
Hemant Kumar
7d6712632c Use random zone for inline volume e2e tests 2020-04-14 23:37:21 -04:00
Patrick Ohly
2ae6cf5984 mock tests: per-test timeout for ResourceExhausted
The timeouts for the two loops inside the test itself are now bounded
by an upper limit for the duration of the entire test instead of
having their own, rather arbitrary timeouts.
2020-04-14 09:11:42 +02:00
Patrick Ohly
48f8e398fb mock tests: remove redundant wrapping of error
The "error waiting for expected CSI calls" is redundant because it's
immediately followed by checking that error with:

   framework.ExpectNoError(err, "while waiting for all CSI calls")
2020-04-07 13:09:31 +02:00
Patrick Ohly
2550051f3b mock tests: add timeout
The for loop that waited for the signal to delete pod had no timeout,
so if something went wrong, it would wait for the entire test suite to
time out.
2020-04-07 13:09:31 +02:00
Patrick Ohly
f117849582 mock tests: ResourceExhausted error handling in external-provisioner
The mock driver gets instructed to return a ResourceExhausted error
for the first CreateVolume invocation via the storage class
parameters.

How this should be handled depends on the situation: for normal
volumes, we just want external-provisioner to retry. For late binding,
we want to reschedule the pod. It also depends on topology support.
2020-04-07 13:09:31 +02:00
Patrick Ohly
367a23e4d9 mock tests: remove redundant retrieval of log output
The code became obsolete with the introduction of parseMockLogs
because that will retrieve the log itself. For debugging of a running
test the normal pod output logging is sufficient.
2020-04-07 13:07:09 +02:00
Patrick Ohly
d06589e4b6 mock tests: less verbose log output checking
parseMockLogs is called potentially multiple times while waiting for
output. Dumping all CSI calls each time is quite verbose and
repetitive. To verify what the driver has done already, the normal
capturing of the container log can be used instead:

csi-mockplugin-0/mock@127.0.0.1: gRPCCall: {"Method":"/csi.v1.Node/NodePublishVolume","Request"...
2020-04-07 13:07:09 +02:00
Patrick Ohly
981aae35dd mock tests: do not give up immediately for pod output errors
As seen in some test
runs (https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/89041),
retrieving output can fail with "the server rejected our request for
an unknown reason (get pods csi-mockplugin-0)".

If this is truly an intermittent error, then the existing retry logic in
the callers can deal with this.
2020-04-06 15:03:44 +02:00
Jan Safranek
e23a26a380 Update to new javascript 2020-04-06 15:03:22 +02:00
Jan Safranek
a4f080861f Test NodeStage error cases
Especially related to "uncertain" global mounts. A large refactoring of CSI
mock tests was necessary:
- to be able to script the driver to return errors as required by the test
- to parse the CSI driver logs to check that kubelet made the right CSI calls
2020-04-06 15:03:22 +02:00
Patrick Ohly
b9c5c55c09 podlogs: avoid dumping a terminated container more than once
The original logic was that dumping can stop (for example, due to
losing the connection to the apiserver) and then will start again as
long as the container exists. That it duplicates output on restarts
is better than skipping output that might not have been dumped yet.

But that logic then also dumped the output of containers that have
terminated multiple times:
- logging is started, dumps all output and stops because the
  container has terminated
- next check finds the container again, sees no active logger,
  repeats

This wasn't a problem for short-lived logging in a custom
namespace (the way it is done for CSI drivers in Kubernetes E2E),
but other testsuites (like the one from PMEM-CSI) keep logging running
for the entire test suite duration: there duplicate output became a
problem when adding driver redeployment as part of the suite's run.

To avoid duplicated output for terminated containers, which containers
have been handled is now stored permanently. For terminated containers,
restarting of dumping is prevented. This comes with the risk that if
the previous dumping ended before capturing all output, some output
will get lost.

Marking the start and stop of the log was also useful when streaming
to a single writer and thus gets enabled.
2020-04-03 14:45:00 +02:00
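The bookkeeping described in the commit above can be sketched as follows. This is an illustration of the idea only, not the podlogs implementation: dumpTracker and its methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// dumpTracker remembers which containers have an active logger and which
// terminated containers have already been dumped.
type dumpTracker struct {
	mu       sync.Mutex
	active   map[string]bool // a logger is currently running
	finished map[string]bool // dumping started after termination; never repeat
}

func newDumpTracker() *dumpTracker {
	return &dumpTracker{active: map[string]bool{}, finished: map[string]bool{}}
}

// start reports whether dumping should begin for the container.
func (t *dumpTracker) start(id string, terminated bool) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.active[id] || t.finished[id] {
		return false
	}
	t.active[id] = true
	if terminated {
		// Stored permanently: a later check must not restart the dump.
		t.finished[id] = true
	}
	return true
}

// stop records that the logger for the container has ended.
func (t *dumpTracker) stop(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.active, id)
}

func main() {
	t := newDumpTracker()
	fmt.Println(t.start("pod/c", false)) // running container: start dumping
	t.stop("pod/c")                      // e.g. connection lost
	fmt.Println(t.start("pod/c", true))  // container terminated meanwhile: dump again
	t.stop("pod/c")
	fmt.Println(t.start("pod/c", true)) // already dumped after termination: skip
}
```

The trade-off named in the commit message shows up here as well: if the dump of a terminated container ends early, nothing restarts it.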
Patrick Ohly
dbac2a369a podlogs: adapt to modified error message
Commit 8a495cb5e4 changed the spelling of the error message that we
want to ignore. In case of version skew we suppress both the old and
new spelling.
2020-04-03 14:43:52 +02:00
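Suppressing both spellings under version skew amounts to matching against a small list of substrings. The sketch below assumes substring matching; the two message strings are placeholders, because the real spellings are not quoted in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ignoredMessages holds the old and the new spelling of the message to
// suppress. Both strings are hypothetical placeholders.
var ignoredMessages = []string{
	"example failure (old spelling)",
	"Example failure, new spelling",
}

// shouldIgnore reports whether msg contains any of the suppressed spellings.
func shouldIgnore(msg string) bool {
	for _, s := range ignoredMessages {
		if strings.Contains(msg, s) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldIgnore("log stream: example failure (old spelling)"))
	fmt.Println(shouldIgnore("log stream: Example failure, new spelling"))
	fmt.Println(shouldIgnore("log stream: some other error"))
}
```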
Kubernetes Prow Robot
7bd48eb3f6
Merge pull request #89784 from oomichi/sshPort
Add common SSHPort on e2essh
2020-04-02 21:40:40 -07:00
Kenichi Omichi
48fdb95a82 Add common SSHPort on e2essh
There were several sshPort values in e2e test packages because
we've migrated code from the e2e framework by copying and pasting.
This adds a common SSHPort to the e2essh package to reduce such
duplicated code.
2020-04-02 17:41:49 +00:00
Kubernetes Prow Robot
8d773421ee
Merge pull request #80973 from xiaoanyunfei/bugfix/orphan-volume
fix orphaned pod flexvolume can not be cleaned up
2020-04-01 20:50:23 -07:00
Kubernetes Prow Robot
ed00f42848
Merge pull request #89563 from oomichi/RestartControllerManager
Separate RestartControllerManager() as e2ekubesystem
2020-04-01 13:42:23 -07:00
Kubernetes Prow Robot
8d257ad315
Merge pull request #88118 from pohly/wait-for-persistent-volume-deleted-error
e2e/storage: check result of WaitForPersistentVolumeDeleted
2020-03-30 07:01:55 -07:00
Kubernetes Prow Robot
4e9dd8fd36
Merge pull request #89454 from gavinfish/import-aliases
Update .import-aliases for e2e test framework
2020-03-27 14:35:54 -07:00
Kenichi Omichi
42bb845f40 Separate RestartControllerManager() as e2ekubesystem
RestartControllerManager() is a kube-controller-specific function,
and it is better placed in a subpackage of the e2e test framework.
In addition, the function created an invalid dependency on e2essh.
So this moves the function into the e2ekubesystem subpackage.
2020-03-27 03:52:17 +00:00
Patrick Ohly
c9004e704d e2e/storage: check result of WaitForPersistentVolumeDeleted
When deleting fails, the tests should be considered as failed,
too. Ignoring the error caused a wrong return code in the CSI mock
driver to go unnoticed (see
https://github.com/kubernetes-csi/csi-test/pull/250). The v3.1.0
release of the CSI mock driver fixes that.
2020-03-26 13:22:04 +01:00
Kubernetes Prow Robot
3afbcad669
Merge pull request #89436 from oomichi/RestartKubelet
Move RestartKubelet() into e2e/storage/vsphere
2020-03-25 02:03:17 -07:00
drfish
dfab6b637f Update .import-aliases for e2e test framework 2020-03-25 11:40:02 +08:00
Kenichi Omichi
2158989d6f Move WaitForPersistentVolumeDeleted() to e2epv
The function is specific to persistent volumes and has no reason to
stay in the core test framework. So this moves the function into the
e2epv package to reduce the code in e2e/framework/util.go.
2020-03-24 22:54:07 +00:00
Kenichi Omichi
23066215f5 Move RestartKubelet() into e2e/storage/vsphere
Since 4e7c2f638d the function has been
called from the storage vsphere e2e test only. This moves the function
into the test file for
- Reducing test/e2e/framework/util.go, which is one of the huge files
- Removing an invalid dependency on the e2e test framework
- Removing an unnecessary TODO
2020-03-24 17:42:58 +00:00
Kenichi Omichi
d191660c25 Use e2epod.WaitForPodTerminatedInNamespace directly
WaitForPod*() are just wrapper functions for the e2epod package, and they
created an invalid dependency from the core framework on a sub e2e framework.
So this replaces WaitForPodTerminated() with the e2epod function.
2020-03-22 17:43:33 +00:00
Kubernetes Prow Robot
d23310a40e
Merge pull request #89324 from tanjunchen/remove-invalid-dependency-waitForPod-002
use e2epod.WaitForPodRunningInNamespaceSlow directly
2020-03-22 00:36:44 -07:00
tanjunchen
aa52bfe4d6 use e2epod.WaitForPodRunningInNamespaceSlow directly 2020-03-21 15:51:49 +08:00
tanjunchen
d18e6569e0 use e2epod.WaitForPodNotFoundInNamespace directly 2020-03-21 15:11:40 +08:00
Kubernetes Prow Robot
a26f50a52e
Merge pull request #86679 from oomichi/remove-invalid-dependency-27
Use e2epod.WaitForPodNameRunningInNamespace directly
2020-03-20 15:58:44 -07:00
Kubernetes Prow Robot
990a3802f6
Merge pull request #89180 from oomichi/LogOutput
Move podlogs into e2e/storage/testsuites
2020-03-19 20:31:22 -07:00
Kubernetes Prow Robot
b8a729b899
Merge pull request #89191 from misterikkit/vsphere-panic
Fix nil panic in vsphere tests
2020-03-19 06:06:29 -07:00
Kubernetes Prow Robot
df989a45f0
Merge pull request #89011 from oomichi/move-GetClusterZones
Move GetClusterZones() to e2enode
2020-03-18 22:23:43 -07:00
Kenichi Omichi
017eaf170a Move podlogs into e2e/storage/podlogs
The e2e framework package podlogs is used in e2e/storage/testsuites
only. In addition, we concluded that there should be a single e2e
framework package for pods, without podlogs. So this moves podlogs into
e2e/storage/podlogs for the e2e storage tests.
2020-03-18 17:44:12 +00:00
Jonathan Basseri
7f17ef28a8 Fix nil panic in vsphere tests
During test cleanup, we iterate over nodes.Items, but if the test fails
during setup, nodes may be nil.
2020-03-17 14:00:10 -07:00
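The failure mode in the commit above is a field access through a nil pointer. A minimal sketch of the guard, using hypothetical local types (node, nodeList, cleanupNames) rather than the real API types:

```go
package main

import "fmt"

// node and nodeList are hypothetical stand-ins for the API types.
type node struct{ Name string }
type nodeList struct{ Items []node }

// cleanupNames returns the names of the nodes to clean up. It tolerates a
// nil list, which is what cleanup sees when setup failed before the list
// was assigned.
func cleanupNames(nodes *nodeList) []string {
	if nodes == nil {
		return nil // nothing was set up, so there is nothing to clean up
	}
	names := make([]string, 0, len(nodes.Items))
	for _, n := range nodes.Items {
		names = append(names, n.Name)
	}
	return names
}

func main() {
	var nodes *nodeList // setup failed: never assigned
	fmt.Println(len(cleanupNames(nodes)))
	fmt.Println(cleanupNames(&nodeList{Items: []node{{Name: "node-1"}}}))
}
```

Without the nil check, evaluating nodes.Items on a nil pointer panics at runtime.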
Kenichi Omichi
2c8955fd4a Use e2epod.WaitForPodNameRunningInNamespace directly
WaitForPod*() are just wrapper functions for the e2epod package, and they
created an invalid dependency from the core framework on a sub e2e framework.
So this replaces WaitForPodRunning() with the e2epod function.
2020-03-17 00:13:14 +00:00
Somtochi Onyekwere
ee41c6b1a4 Refactors MakeSecPods function 2020-03-12 07:14:08 +01:00
Kenichi Omichi
c586d8837a Move GetClusterZones() to e2enode 2020-03-10 18:16:54 +00:00
Kubernetes Prow Robot
672aa55ee4
Merge pull request #87777 from dbenoit17/master
fix range copy issue
2020-03-07 08:25:34 -08:00
Jordan Liggitt
d8abacba40 client-go: update expansions callers 2020-03-06 16:50:41 -05:00
Jordan Liggitt
b7c2faf26c client-go dynamic client: add context to callers 2020-03-06 10:56:23 -05:00
Jordan Liggitt
b19dc3a474 client-go dynamic client: update DeleteOptions callers 2020-03-06 10:21:23 -05:00
Christian Huffman
c6fd25d100 Updated CSIDriver references 2020-03-06 08:21:26 -05:00
Mike Danese
76f8594378 more artisanal fixes
Most of these could have been refactored automatically but it would
have been uglier. The unsophisticated tooling left lots of unnecessary
struct -> pointer -> struct transitions.
2020-03-05 14:59:47 -08:00
Mike Danese
aaf855c1e6 deref all calls to metav1.NewDeleteOptions that are passed to clients.
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to clean up, but I want to minimize apiserver side
changes as much as possible in the client signature refactor.
2020-03-05 14:59:46 -08:00
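The dereference pattern from the commit above looks roughly like this. The types here are local stand-ins that only loosely mirror the shape of metav1.DeleteOptions and metav1.NewDeleteOptions; deletePod is a hypothetical client call that takes the options by value.

```go
package main

import "fmt"

// DeleteOptions is a local stand-in for the real options type.
type DeleteOptions struct {
	GracePeriodSeconds *int64
}

// NewDeleteOptions returns a pointer, like the helper named in the commit.
func NewDeleteOptions(grace int64) *DeleteOptions {
	return &DeleteOptions{GracePeriodSeconds: &grace}
}

// deletePod is a hypothetical client method that accepts options by value.
func deletePod(name string, opts DeleteOptions) string {
	return fmt.Sprintf("delete %s grace=%d", name, *opts.GracePeriodSeconds)
}

func main() {
	// The helper still returns a pointer, so the call site dereferences it.
	fmt.Println(deletePod("pod-a", *NewDeleteOptions(0)))
}
```

Changing the helper to return a value would remove the `*` at every call site, which is the larger refactoring the commit message defers.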
Mike Danese
c58e69ec79 automated refactor 2020-03-05 14:59:46 -08:00
Jan Safranek
98b9c7b5e8 Fix GCE PD snapshot flakiness
It takes more than 5 minutes to restore a GCE PD snapshot + run a pod with
it. Therefore TestVolumeClientSlow is introduced.
2020-03-04 12:39:13 +01:00