123 Commits

Author SHA1 Message Date
Amarnath Valluri
e68c9f3dec test/e2e/storage: replace mock driver with hostpath driver
This is a first step towards removing the mock CSI driver completely from
e2e testing in favor of the hostpath plugin. With the recent hostpath plugin
changes (PR #260, #269), it supports all the features supported by the mock
CSI driver.

Using the hostpath plugin for testing also covers CSI persistent feature
use cases.
2021-12-02 14:41:08 +01:00
Cheng Xing
bca1b79728 Delegate FSGroup CSI driver e2e: verify fsgroup is passed to CSI calls using mock driver tests 2021-11-22 17:00:39 -08:00
Patrick Ohly
5378b4d220 e2e: restore volume lifecycle check for most tests, II
Besides "subPath should unmount if pod is gracefully deleted while kubelet is
down" we also need a special case for "subPath should unmount if pod is force
deleted while kubelet is down".

This fixes a test failure in https://testgrid.k8s.io/sig-storage-kubernetes#gce-serial
2021-10-14 10:59:03 +02:00
Patrick Ohly
e99b945b17 e2e: restore volume lifecycle check for most tests
Commit f1e1f3a416 disabled the check to work around an issue in one test. It's better to keep the
check enabled by default and only disable it for that test.
2021-09-09 10:29:37 +02:00
Jing Xu
f1e1f3a416 Fix disruptive subPath test failures
This PR fixes two disruptive subPath test failures.

1. disable --check-volume-lifecycle check
2. skip hostpath driver tests on graceful pod deletion test too.

See details in
https://github.com/kubernetes/kubernetes/issues/103651#issuecomment-887227562

Change-Id: Ibecd051be865feea5f2a92d22ade848367400939
2021-07-27 02:17:31 -07:00
Kubernetes Prow Robot
756203fda0
Merge pull request #102576 from dobsonj/101911
kubelet: do not call RemoveAll on volumes directory for orphaned pods
2021-06-29 06:54:40 -07:00
Jordan Liggitt
1134456c89 Fix CSI mock driver to get marshaleable grpc error 2021-06-15 09:53:06 -04:00
Jonathan Dobson
484eb01822 kubelet: do not call RemoveAll on volumes directory for orphaned pods 2021-06-08 13:57:35 -06:00
Patrick Ohly
528baa09f6 e2e storage: disable health-monitor controller in hostpath deployment
This reverts commit c15fd76ee9. Most (all?) of the hostpath
tests and several other tests started to fail again in
gce-scale-master-correctness after re-enabling the controller. This
shows that it was not just the obsolete agent which causes scalability
problems, but also the controller.

It has to be disabled until the scalability problems are addressed.
2021-06-08 20:27:05 +02:00
Kubernetes Prow Robot
cc7721362c
Merge pull request #102665 from gnufied/add-online-expansion-cap
Add explicit capability for online volume expansion
2021-06-08 08:33:36 -07:00
Hemant Kumar
95c8b02096 Add explicit capability for online volume expansion 2021-06-07 13:43:18 -04:00
Patrick Ohly
c15fd76ee9 e2e storage: enable health-check controller in hostpath deployment
It was disabled together with the agent to avoid test failures in
gce-master-scale-correctness (https://github.com/kubernetes/kubernetes/issues/102452). That
solved the problem, but we still need to check whether the controller
alone works.
2021-06-05 18:16:19 +02:00
Patrick Ohly
c26c423b1c storage e2e: disable health check containers
They are not needed for any of the tests and may be causing too much
overhead (see
https://github.com/kubernetes/kubernetes/issues/102452#issuecomment-854452816).

We already disabled them earlier and then re-enabled them again
because it wasn't clear how much overhead they were causing. A recent
change in how the sidecars get
deployed (https://github.com/kubernetes/kubernetes/pull/102282) seems
to have made the situation worse again. There's no logical explanation
for that yet, though.

(cherry picked from commit 0c2cee5676e64976f9e767f40c4c4750a8eeb11f)
2021-06-04 09:57:02 +02:00
Patrick Ohly
4acb6a865c storage e2e: use csi-driver-host-path v1.7.2 in single pod
The new default deployment in that release puts sidecars into the same
pod as the driver. This is expected to reduce load during testing.
2021-05-26 09:07:46 +02:00
Kubernetes Prow Robot
17f3990ea1
Merge pull request #100484 from gavinfish/e2e-storage-suffix
Remove suffixes for VolumeSnapshotClasses in E2E tests
2021-04-26 17:37:03 -07:00
Patrick Ohly
3299469437 Revert "storage e2e: disable health check containers"
This reverts commit 0c2cee5676e64976f9e767f40c4c4750a8eeb11f.

The health check containers are not required for any test, but we want
to run them anyway to ensure that they cause no unexpected issues.
2021-04-22 08:20:39 +02:00
Patrick Ohly
c794b5c442 storage e2e: patch in RBAC rules for secrets
In one mock test, the snapshotter needs permission to read
secrets. That was disabled in the RBAC files of recent releases. We
need to patch it back in during deployment.
2021-04-21 09:57:54 +02:00
Patrick Ohly
7682e39a47 storage e2e: disable health check containers
They are not needed for any of the tests and in practice apparently
caused enough overhead that even unrelated tests timed out. For
example, in the pull-kubernetes-e2e-kind test, 43 out of 5771 tests
failed, including tests from sig-node, sig-cli, sig-api-machinery,
sig-network.
2021-04-20 08:07:15 +02:00
Patrick Ohly
446c1136dc storage e2e: automate hostpath YAML updates, hostpath v1.6.2
Mirroring the various YAML files by hand is tedious. The new
update-hostpath.sh does all the necessary steps automatically.

The result is now a bit more consistent with the upstream repos in the
sense that the original file names and paths for the RBAC YAML files
are used.

The csi-hostpath-testing.yaml is included for the sake of
completeness, but not used during E2E testing.

The new hostpath driver release is v1.6.2, which adds the
external-health-monitor for the first time.
2021-04-20 08:07:15 +02:00
shahra
34e4a5f22c Add e2e test to validate performance metrics of volume lifecycle operations.
This test currently validates latency and throughput of volume
provisioning against a high baseline.
2021-03-24 13:50:32 -07:00
drfish
244d7a5d67 Remove suffixes for VolumeSnapshotClasses in E2E tests 2021-03-23 21:24:28 +08:00
Kubernetes Prow Robot
42c1ccb38e
Merge pull request #99701 from wojtek-t/cleanup_describe_13
Tag storage windows tests with [Feature:Windows] instead of [sig-windows]
2021-03-05 10:00:47 -08:00
wojtekt
ca333d7a7a Tag storage windows tests with [Feature:Windows] instead of [sig-windows] 2021-03-03 12:08:59 +01:00
Patrick Ohly
baecaa8209 e2e test: log gRPC calls in embedded CSI driver
It is useful to see all calls as they occur. The output format is
the more readable JSON representation.
2021-03-01 19:22:37 +01:00
Patrick Ohly
3adcf11b45 e2e storage: use embedded mock CSI driver
This replaces embedding of JavaScript code into the mock driver that
runs inside the cluster with Go callbacks which run inside the
e2e.test suite itself. In contrast to the JavaScript hooks, they have
direct access to all parameters and can fabricate arbitrary responses,
not just error codes.

Because the callbacks run in the same process as the test itself, it
is possible to set up two-way communication via shared variables or
channels. This opens the door for writing better tests. Some of the
existing tests that poll mock driver output could be simplified, but
that can be addressed later.

For now, only tests using hooks use embedding. How gRPC calls are
retrieved is abstracted behind the CSIMockTestDriver interface, so
tests don't need to be modified when switching between embedding
and remote mock driver.
2021-03-01 19:22:37 +01:00
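The callback approach described in the commit above can be pictured with a minimal, self-contained sketch. The names here (Hooks, Driver, newRecordingDriver) are hypothetical and are not the types used in the Kubernetes e2e suite; the sketch only illustrates how in-process Go hooks can see request parameters, fabricate a reply or an error, and report calls back to the test over a channel.

```go
package main

import (
	"errors"
	"fmt"
)

// Hooks is a hypothetical stand-in for the callbacks described above.
// Pre runs before the real handler and may short-circuit it; Post may
// rewrite the reply or error afterwards.
type Hooks struct {
	Pre  func(method string, req any) (reply any, err error)
	Post func(method string, req, reply any, err error) (any, error)
}

// Driver is a toy in-process driver; handle stands in for the real CSI logic.
type Driver struct {
	hooks  Hooks
	handle func(method string, req any) (any, error)
}

// Call dispatches one request through the hooks and the handler.
func (d *Driver) Call(method string, req any) (any, error) {
	if d.hooks.Pre != nil {
		// A non-nil reply or error from Pre replaces the real call entirely.
		if reply, err := d.hooks.Pre(method, req); reply != nil || err != nil {
			return reply, err
		}
	}
	reply, err := d.handle(method, req)
	if d.hooks.Post != nil {
		reply, err = d.hooks.Post(method, req, reply, err)
	}
	return reply, err
}

// newRecordingDriver returns a driver whose Pre hook reports every method
// name on the calls channel and injects a failure for NodeStageVolume.
func newRecordingDriver(calls chan<- string) *Driver {
	return &Driver{
		hooks: Hooks{
			Pre: func(method string, req any) (any, error) {
				calls <- method
				if method == "NodeStageVolume" {
					return nil, errors.New("injected failure")
				}
				return nil, nil
			},
		},
		handle: func(method string, req any) (any, error) {
			return "ok:" + method, nil
		},
	}
}

func main() {
	calls := make(chan string, 2)
	d := newRecordingDriver(calls)

	reply, err := d.Call("CreateVolume", nil)
	fmt.Println(reply, err)
	reply, err = d.Call("NodeStageVolume", nil)
	fmt.Println(reply, err)

	// The test observes the calls directly instead of polling driver output.
	fmt.Println(<-calls, <-calls)
}
```

Because the hook and the test share a process, the channel gives the test the calls in order without any log scraping.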
Cheng Xing
5ed2ef67cb storage CSI e2e: Move csi driver cleanup functions into a common one 2021-02-17 17:51:21 -08:00
Kubernetes Prow Robot
4d8b2020f6
Merge pull request #98555 from verult/pdcsi-e2e-skip-gke
Storage e2e: Remove pd csi driver installation in GKE
2021-02-17 04:55:07 -08:00
Cheng Xing
6b3f74dafb Storage e2e: Remove pd csi driver installation in GKE 2021-02-09 13:14:58 -08:00
Manohar Reddy
a6ec62f76d add e2e tests for DeleteSnapshotsecrets 2021-02-04 10:46:00 +05:30
lala123912
165907f60a remove suffixes from generated StorageClasses and VolumeSnapshotClass 2020-12-14 16:48:41 +08:00
Jiawei Wang
356bea6c9f Add storage framework and address comments 2020-12-10 22:48:06 -08:00
Jiawei Wang
988563f8f5 Extract testsuite api to a separate package
Extract TestSuite, TestDriver, TestPattern, TestConfig
and VolumeResource, SnapshotVolumeResource from testsuite
package and put them into a new package called api.

The ultimate goal here is to make the testsuites package as clean
as possible, containing only test suites.
2020-12-10 11:12:51 -08:00
Christian Huffman
4d2d063635 Included e2e test for CSIDriver FSGroupPolicy 2020-11-12 16:30:38 -05:00
Shihang Zhang
d2859cd89b plumb service account token down to csi driver 2020-11-12 09:26:43 -08:00
Kubernetes Prow Robot
0bb732842a
Merge pull request #95971 from chrishenzie/e2e-stress-snapshots
Add E2E stress test suite for creation / deletion of VolumeSnapshot resources
2020-11-05 14:25:03 -08:00
Chris Henzie
fb6bc4f8b0 E2E stress test suite for VolumeSnapshots
Introduces a new test suite that creates and deletes many
VolumeSnapshots simultaneously to test snapshottable storage plugins
under load.
2020-11-05 08:58:13 -08:00
shahra
e95af138b5 Volume snapshot e2e test to validate
VolumeSnapshotContent and PVC finalizer
2020-11-04 14:08:24 -08:00
Kubernetes Prow Robot
ac6447c76f
Merge pull request #94318 from gnufied/fix-namespace-deletion
Prevent deletion of namespace again
2020-09-10 10:46:03 -07:00
Hemant Kumar
c4ce420667 Prevent deletion of namespace again 2020-09-09 11:22:08 -04:00
Kubernetes Prow Robot
f6d169c7ca
Merge pull request #93120 from msau42/e2e-export
Export WaitForCSIDriverRegistrationOnAllNodes
2020-08-28 06:34:53 -07:00
Jiawei Wang
76b4973b42 Wait for mock CSI Driver bringup to perform e2e test
In our current mock CSI driver e2e test, we do not wait
for the CSI driver to register successfully before running tests
that provision a PVC. This can lead to timeouts when the
CSI driver takes longer to register the socket.

This change adds the wait so that the system will
wait for up to 10 minutes for the driver to be ready. It
normally won't take this long. However, in a resource-constrained
environment it can take longer than expected.

https://github.com/kubernetes/kubernetes/issues/93358
2020-08-10 11:03:35 -07:00
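The wait described in the commit above amounts to polling a readiness condition with an upper bound. The following is a minimal sketch under stated assumptions: waitFor and the registered check are hypothetical names, and the real e2e code uses the framework's own helpers rather than this loop.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitFor polls check every interval until it reports true, returns an
// error, or the next poll would fall after the timeout.
func waitFor(interval, timeout time.Duration, check func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := check()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		if time.Now().Add(interval).After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Hypothetical check: pretend the driver registers on the third poll.
	registered := func() (bool, error) {
		attempts++
		return attempts >= 3, nil
	}
	// The commit uses a 10 minute bound; a short one keeps the demo fast.
	err := waitFor(10*time.Millisecond, time.Second, registered)
	fmt.Println("polls:", attempts, "err:", err)
}
```

A generous timeout costs nothing in the common case, because the loop returns as soon as the condition holds.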
Michelle Au
10a8c195a1 Export WaitForCSIDriverRegistrationOnAllNodes to be used by external csi driver repos
Change-Id: Ie61430b1050a778d8ba98177e0c995ff2553f9cd
2020-07-15 16:53:37 -07:00
Patrick Ohly
567ce87aee CSIStorageCapacity: E2E test with mock driver
We can create CSIStorageCapacity objects manually, therefore we don't
need the updated external-provisioner for these tests.
2020-07-08 08:02:26 +02:00
Hemant Kumar
74be9f04fa Ensure CleanupActionHandle always completes
The way Ginkgo handles interrupts is:
 - It starts running AfterSuite hooks in a separate goroutine (this includes cleanupAction hooks).
 - Once the AfterSuite hook is done executing, it calls
   os.Exit(1) on the test suite.

So cleanupFunc(), which runs via defer in a test, can be interrupted
as follows:
 - cleanupFunc starts running via defer (or an AfterEach hook), but the first
   thing that function does is remove its cleanupHandle via
   framework.RemoveCleanupAction.
 - The test suite receives an interrupt from the user and the AfterSuite block
   starts executing.
 - Remember that while cleanupFunc is running in goroutine#1,
   AfterSuite is running concurrently in goroutine#2.
 - The AfterSuite hook has a bunch of CleanupActions it needs to run, which
   were registered via framework.AddCleanupAction(cleanupFunc), but
   once cleanupFunc starts executing via defer in the test, it
   removes the cleanupHandle from the framework's AfterSuite hooks.
 - So if AfterSuite has nothing to run (because
   those actions were removed via framework.RemoveCleanupAction),
   it simply goes to the last framework.AfterEach action and calls os.Exit(1).
 - So if os.Exit(1) is called before cleanupFunc has a chance to finish in defer, it will not complete.
2020-06-02 12:40:32 -04:00
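One way to avoid the race described in the commit above is to keep the cleanup registered until it has finished, and to make it safe to invoke from both the test and the suite-level hook. The sketch below is an illustration only, not the change this commit made: registry, onceCleanup, and their methods are hypothetical stand-ins for the framework's cleanup-action bookkeeping.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for the framework's cleanup actions.
type registry struct {
	mu      sync.Mutex
	next    int
	actions map[int]func()
}

func newRegistry() *registry { return &registry{actions: map[int]func(){}} }

// Add registers fn and returns a handle for later removal.
func (r *registry) Add(fn func()) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.actions[r.next] = fn
	return r.next
}

// Remove deregisters the action behind handle.
func (r *registry) Remove(handle int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.actions, handle)
}

// RunAll is what an AfterSuite-style hook would call before the process exits.
func (r *registry) RunAll() {
	r.mu.Lock()
	pending := make([]func(), 0, len(r.actions))
	for _, fn := range r.actions {
		pending = append(pending, fn)
	}
	r.mu.Unlock()
	for _, fn := range pending {
		fn()
	}
}

// onceCleanup wraps fn so it runs a single time; sync.Once makes every
// concurrent caller block until that single run has returned.
func onceCleanup(fn func()) func() {
	var once sync.Once
	return func() { once.Do(fn) }
}

func main() {
	r := newRegistry()
	cleaned := 0
	cleanup := onceCleanup(func() { cleaned++ })
	handle := r.Add(cleanup)

	// Test-side teardown: run the cleanup first, deregister only afterwards.
	cleanup()
	r.Remove(handle)

	// Suite-side teardown finds nothing pending, and the work is already done.
	r.RunAll()
	fmt.Println("cleanup runs:", cleaned)
}
```

If the suite-level hook fires while the test's cleanup is still in progress, the action is still registered, so RunAll blocks inside once.Do until the cleanup has completed instead of racing ahead to exit.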
Hemant Kumar
da941d8d3e Create mock CSI driver resources in different namespace 2020-05-13 11:16:00 -04:00
Kubernetes Prow Robot
7f78048594
Merge pull request #90781 from msau42/increase-timeout
Increase timeout waiting for driver to start on nodes
2020-05-06 22:23:08 -07:00
Michelle Au
fc08f74157 Increase timeout waiting for driver to start on nodes to reduce test flakiness
Change-Id: Id553943e4473b387bf0ae14a18a90cb3a1bcd5c1
2020-05-05 18:10:10 -07:00
Michelle Au
6596e20b18 Make stress test parameters configurable
Change-Id: Ia062f3433b6043825a51a54c7c07eb4cdf809631
2020-04-16 14:18:21 -07:00
Patrick Ohly
f117849582 mock tests: ResourceExhausted error handling in external-provisioner
The mock driver gets instructed to return a ResourceExhausted error
for the first CreateVolume invocation via the storage class
parameters.

How this should be handled depends on the situation: for normal
volumes, we just want external-provisioner to retry. For late binding,
we want to reschedule the pod. It also depends on topology support.
2020-04-07 13:09:31 +02:00
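The "fail the first CreateVolume, then succeed" behavior from the commit above can be sketched without any gRPC dependency. Everything here is an assumption made for illustration: errResourceExhausted stands in for a gRPC status with code ResourceExhausted, and failFirst and provision are hypothetical helpers, not part of the mock driver or of external-provisioner.

```go
package main

import (
	"errors"
	"fmt"
)

// errResourceExhausted stands in for a gRPC ResourceExhausted status.
var errResourceExhausted = errors.New("ResourceExhausted: fake capacity exhausted")

// failFirst returns a CreateVolume stub that fails its first n calls
// with errResourceExhausted and succeeds afterwards.
func failFirst(n int) func(name string) (string, error) {
	calls := 0
	return func(name string) (string, error) {
		calls++
		if calls <= n {
			return "", errResourceExhausted
		}
		return "vol-" + name, nil
	}
}

// provision calls create up to attempts times, retrying only when the
// error is errResourceExhausted. It returns the volume ID and the number
// of calls made.
func provision(create func(string) (string, error), name string, attempts int) (string, int, error) {
	var err error
	for i := 1; i <= attempts; i++ {
		var id string
		id, err = create(name)
		if err == nil {
			return id, i, nil
		}
		if !errors.Is(err, errResourceExhausted) {
			return "", i, err
		}
	}
	return "", attempts, err
}

func main() {
	id, tries, err := provision(failFirst(1), "pvc-1", 3)
	fmt.Println(id, tries, err)
}
```

The sketch covers only the plain retry case; rescheduling the pod for late binding depends on the scheduler and topology support and is not modeled here.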
Jan Safranek
a4f080861f Test NodeStage error cases
Especially related to "uncertain" global mounts. A large refactoring of CSI
mock tests was necessary:
- to be able to script the driver to return errors as required by the test
- to parse the CSI driver logs to check kubelet called the right CSI calls
2020-04-06 15:03:22 +02:00
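Checking which CSI calls were made by parsing driver logs, as the commit above describes, can be sketched as follows. The log format is an assumption: one JSON object per call with hypothetical "method" and "error" fields, mixed with unrelated text lines. parseCalls and matches are illustrative helpers, not the functions used in the e2e suite.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// call is one logged CSI call in the assumed JSON log format.
type call struct {
	Method string `json:"method"`
	Error  string `json:"error,omitempty"`
}

// parseCalls extracts the JSON call records from a driver log and
// ignores any line that does not start with '{'.
func parseCalls(log string) ([]call, error) {
	var calls []call
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "{") {
			continue
		}
		var c call
		if err := json.Unmarshal([]byte(line), &c); err != nil {
			return nil, fmt.Errorf("parse %q: %w", line, err)
		}
		calls = append(calls, c)
	}
	return calls, sc.Err()
}

// matches reports whether the observed calls have exactly the expected
// method sequence.
func matches(calls []call, want ...string) bool {
	if len(calls) != len(want) {
		return false
	}
	for i, c := range calls {
		if c.Method != want[i] {
			return false
		}
	}
	return true
}

func main() {
	log := `driver starting
{"method":"NodeStageVolume","error":"injected"}
{"method":"NodeStageVolume"}
{"method":"NodePublishVolume"}`
	calls, err := parseCalls(log)
	if err != nil {
		panic(err)
	}
	fmt.Println(matches(calls, "NodeStageVolume", "NodeStageVolume", "NodePublishVolume"))
}
```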