Extract TestSuite, TestDriver, TestPattern, TestConfig
and VolumeResource, SnapshotVolumeResource from testsuite
package and put them into a new package called api.
The ultimate goal here is to make the testsuites as clean
as possible. And only testsuites in the package.
In our current mock CSI driver e2e test, we are not waiting
for the CSI driver register successfully to perform test
including provision PVC. This can lead to timeout when the
csi driver takes longer to register the socket.
This change adds the waiting part so that the system will
wait for up to 10 minutes for the driver to be ready. This
normally won't take this long. However, under a resource
constraint environment it can take longer than expected time.
https://github.com/kubernetes/kubernetes/issues/93358
The way gingko handles interrupts is:
- It starts running AfterSuite hooks in a separate goroutine (this includes cleanupAction hooks)
- Once AfterSuite hook is done executing it calls
os.Exit(1) on test suite.
So how cleanupFunc() that runs via defer in test can be interrupted
is:
- cleanupFunc starts running via defer (or AfterEach hook) but first
thing that function does is to remove cleanupHandle from
framework.RemoveCleanupAction.
- Test suite receives interrupt from user and AfterSuite block
starts executing
- remember that while cleanupFunc is running in goroutine#1,
AfterSuite is running concurrently in goroutine#2.
- AfterSuite hook has bunch of CleanupActions it needs to run which
were registered via framework.AddCleanupAction(cleanupFunc) but
once cleanupFunc starts executing via defer in the test, it will
remove the cleanupHandle from framework's aftersuite hooks.
- So if AfterSuite did not had anything to run (because
those actions were removed via framework.RemoveCleanupAction
then it will simply go to the last framework.AfterEach action and call os.Exit(1)
- So if os.Exit(1) is called before cleanupFunc has a chance to finish in defer, it will not complete.
The mock driver gets instructed to return a ResourceExhausted error
for the first CreateVolume invocation via the storage class
parameters.
How this should be handled depends on the situation: for normal
volumes, we just want external-scheduler to retry. For late binding,
we want to reschedule the pod. It also depends on topology support.
Especially related to "uncertain" global mounts. A large refactoring of CSI
mock tests were necessary:
- to be able to script the driver to return errors as required by the test
- to parse the CSI driver logs to check kubelet called the right CSI calls
Many times an e2e test fails with an unexpected error,
"timed out waiting for the condition".
Useful information may be in the test logs, but debugging e2e test
failures will be much faster if we add context to errors when they
happen.
This change makes sure we add context to all errors returned from
helpers like wait.Poll().
- add csi pd driver manifests
- modify snapshottable test case
- fix tests of pod has to be created first for delay-binding PVC, otherwise PVC won't be bound
test/e2e/storage/testsuites creates volumes dynamically. Initially, the size of those volumes was
hard-coded in the test, which prevented using the tests with storage backends that couldn't support
that hard-coded size
The assumption so far was that all drivers support read/write
volumes. That might not necessarily be true, so we have to let the
test driver specify it and then test accordingly.
Another aspect that is worth testing is whether the driver correctly
creates a new volume for each pod even if the volume attributes are
the same. However, drivers are not required to do that, so again we
have to let the test driver specify that.
We need the 1.2.0 driver for that because that has support for
detecting the volume mode dynamically, and we need to deploy a
CSIDriver object which enables pod info (for the dynamic detection)
and both modes (to satisfy the new mode sanity check).
This ensures that the files are in sync with:
hostpath: v1.2.0-rc3
external-attacher: v2.0.1
external-provisioner: v1.3.0
external-resizer: v0.2.0
external-snapshotter: v1.2.0
driver-registrar/rbac.yaml is obsolete because only
node-driver-registrar is in use now and does not need RBAC rules.
mock/e2e-test-rbac.yaml was not used anywhere.
The README.md files were updated to indicate that these really are
files copied from elsewhere. To avoid the need to constantly edit
these files on each update, <version> is used as placeholder in the URL.