Commit Graph

1176 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
ed67d43ea4 Merge pull request #92530 from mattcary/metricsload
Avoid grabbing metrics when they're not validated
2020-06-26 11:49:46 -07:00
Kubernetes Prow Robot
5ff5dbb07c Merge pull request #92497 from oomichi/vsphere
nit: Fix invokeStaleDummyVMTestWithStoragePolicy()
2020-06-26 00:08:04 -07:00
Kenichi Omichi
176c8e219f Avoid DeprecatedMightBeMasterNode() in e2e metrics
As its name suggests, DeprecatedMightBeMasterNode() is deprecated.
In e2e metrics, the function was used to find the master node name in
order to get metrics from the kube-scheduler and kube-controller-manager
pods. This makes e2e metrics get those metrics directly by looking up the
pod names, without calling DeprecatedMightBeMasterNode().
2020-06-25 23:08:24 +00:00
Kenichi Omichi
d964569e1e nit: Fix invokeStaleDummyVMTestWithStoragePolicy()
There were nits in invokeStaleDummyVMTestWithStoragePolicy():
- The error message was missing a necessary space
- IsVMPresent() can return an error, but the error was not handled
- IsVMPresent() returns true/false, but the result was not checked with
  ExpectEqual(), which made the code less readable

This fixes those things.
2020-06-25 04:49:39 +00:00
Matthew Cary
028176deb2 Avoid grabbing metrics when they're not validated
Change-Id: I0dd23b993b1bbc4908341d092c485566b9725c7a
2020-06-25 02:01:53 +00:00
Kenichi Omichi
5edf15ea97 Use worker nodes for WaitForStableCluster()
WaitForStableCluster() checks that all pods run on worker nodes, and the
function used to refer to master nodes to skip checking control plane
pods.
GetMasterAndWorkerNodes() was used for getting master nodes, but that
implementation is not good because it uses DeprecatedMightBeMasterNode().

This makes WaitForStableCluster() refer to worker nodes directly to avoid
using GetMasterAndWorkerNodes().
2020-06-24 15:21:12 +00:00
Kubernetes Prow Robot
f8705f22f8 Merge pull request #89705 from ggriffiths/add_snapshot_retainpolicy_e2e_test
Add VolumeSnapshot retain policy test and test for snapshot delete
2020-06-19 11:35:59 -07:00
Kubernetes Prow Robot
0bb640c25a Merge pull request #92205 from mrunalp/fix_host_path_socket_tests
test: Start a pod with nc instead of execing a background command
2020-06-18 06:03:31 -07:00
Grant Griffiths
e1f0e4cd9f Add retain policy test and refactor snapshottable tests
Signed-off-by: Grant Griffiths <grant@portworx.com>
2020-06-17 19:53:53 -07:00
Mrunal Patel
7643b64050 test: Start a pod with nc instead of execing a background command
The behavior for exec'ing a backgrounded command is not specified
with CRI so modify the test to run the command directly instead
of using exec.

Signed-off-by: Mrunal Patel <mpatel@redhat.com>
2020-06-16 17:30:31 -07:00
Fabio Bertinatto
8d644092ed Create pod to force volume provisioning in storage e2e test
Otherwise, tests can fail if the default StorageClass
is configured with late binding.
2020-06-10 08:45:41 +02:00
lixiaobing1
2d66e7ecd3 another:Replace framework.Failf with ExpectNoError 2020-06-05 16:43:22 +08:00
Kubernetes Prow Robot
64bba294ae Merge pull request #91741 from oomichi/nit-ExpectError
Replace framework.Failf with ExpectNoError
2020-06-04 13:53:05 -07:00
Kubernetes Prow Robot
1925eb81ac Merge pull request #91689 from gnufied/fix-after-suite-race
Ensure CleanupActionHandle always completes
2020-06-04 10:51:15 -07:00
Kenichi Omichi
0ebaae88b1 Replace framework.Failf with ExpectNoError 2020-06-03 20:16:12 +00:00
Kubernetes Prow Robot
f2e3154a14 Merge pull request #91642 from huffmanca/update-azure-e2e
Adjust Azure e2e binding mode
2020-06-03 05:44:32 -07:00
Hemant Kumar
74be9f04fa Ensure CleanupActionHandle always completes
The way ginkgo handles interrupts is:
 - It starts running AfterSuite hooks in a separate goroutine (this includes cleanupAction hooks)
 - Once the AfterSuite hook is done executing, it calls
   os.Exit(1) on the test suite.

So the way a cleanupFunc() that runs via defer in a test can be
interrupted is:
 - cleanupFunc starts running via defer (or an AfterEach hook), but the
   first thing that function does is remove its cleanupHandle via
   framework.RemoveCleanupAction.
 - The test suite receives an interrupt from the user and the AfterSuite
   block starts executing.
 - Remember that while cleanupFunc is running in goroutine#1,
   AfterSuite is running concurrently in goroutine#2.
 - The AfterSuite hook has a bunch of CleanupActions it needs to run, which
   were registered via framework.AddCleanupAction(cleanupFunc), but
   once cleanupFunc starts executing via defer in the test, it
   removes the cleanupHandle from the framework's AfterSuite hooks.
 - So if AfterSuite does not have anything to run (because
   those actions were removed via framework.RemoveCleanupAction),
   it simply goes to the last framework.AfterEach action and calls os.Exit(1).
 - So if os.Exit(1) is called before cleanupFunc has a chance to finish in defer, it will not complete.
2020-06-02 12:40:32 -04:00
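
The race described in the commit above can be illustrated with a small, self-contained sketch. This is not the framework's code: cleanupRegistry and registerCleanup are hypothetical names, and the approach shown (guard the cleanup with sync.Once and deregister only after it has finished) is one possible way to close the window, not necessarily what the commit does. It relies on sync.Once.Do blocking concurrent callers until the first call returns, so a suite-level runAll would wait for an in-flight cleanup instead of finding an empty registry and exiting.

```go
package main

import (
	"fmt"
	"sync"
)

// cleanupRegistry is a hypothetical stand-in for a suite-level list of
// cleanup actions (the role AddCleanupAction/RemoveCleanupAction play above).
type cleanupRegistry struct {
	mu      sync.Mutex
	nextID  int
	actions map[int]func()
}

func newCleanupRegistry() *cleanupRegistry {
	return &cleanupRegistry{actions: map[int]func(){}}
}

func (r *cleanupRegistry) add(fn func()) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nextID++
	r.actions[r.nextID] = fn
	return r.nextID
}

func (r *cleanupRegistry) remove(id int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.actions, id)
}

func (r *cleanupRegistry) pending() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.actions)
}

// runAll is what a suite-level hook would call before exiting. It snapshots
// the actions under the lock and runs them outside of it.
func (r *cleanupRegistry) runAll() {
	r.mu.Lock()
	snapshot := make([]func(), 0, len(r.actions))
	for _, fn := range r.actions {
		snapshot = append(snapshot, fn)
	}
	r.mu.Unlock()
	for _, fn := range snapshot {
		fn()
	}
}

// registerCleanup registers fn and returns the function a test would defer.
// fn runs at most once; the handle is removed only after fn has returned,
// so the registry never looks empty while the cleanup is still in progress.
func registerCleanup(r *cleanupRegistry, fn func()) func() {
	var once sync.Once
	guarded := func() { once.Do(fn) }
	id := r.add(guarded)
	return func() {
		guarded()
		r.remove(id)
	}
}

func main() {
	r := newCleanupRegistry()
	cleanup := registerCleanup(r, func() { fmt.Println("cleaning up") })
	defer cleanup()
	fmt.Println("pending actions:", r.pending())
}
```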
Christian Huffman
7a55d3978c Adjust Azure e2e binding mode 2020-06-01 14:55:46 -04:00
Jordan Liggitt
c9638d54d0 Defer ginkgo recovers 2020-06-01 11:02:41 -04:00
Davanum Srinivas
b1742f19ef Switch kube-controller-manager to distroless image
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-21 22:33:54 -04:00
Kubernetes Prow Robot
bded41a817 Merge pull request #90689 from aojea/nfsv6
add ipv6 support to the e2e nfs tests
2020-05-21 03:30:36 -07:00
Kubernetes Prow Robot
0e8a2d2244 Merge pull request #90793 from pohly/flaky-mount-volume-calls
mock e2e test: reduce flakiness by not testing all calls
2020-05-19 15:22:19 -07:00
Davanum Srinivas
07d88617e5 Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Kubernetes Prow Robot
9978c281ec Merge pull request #90773 from gnufied/fix-csi-e2e-orphans
Fix CSI e2e leaving pods in terminating state
2020-05-13 22:14:21 -07:00
Hemant Kumar
da941d8d3e Create mock CSI driver resources in different namespace 2020-05-13 11:16:00 -04:00
Hemant Kumar
708261e06c Make AfterSuite hooks ordered
ginkgo has a weird bug: AfterEach does not get called when the
test suite exits with certain kinds of interrupt (Ctrl-C, for example).
More info - https://github.com/onsi/ginkgo/issues/222

We work around this issue in Kubernetes by adding a special hook into the
AfterSuite call, but AfterSuite cannot be used to perform certain
kinds of cleanup because it can race with the AfterEach hook, and the
framework.AfterEach hook will set framework.ClientSet to nil.

This presents a problem in cleaning up the CSI driver and test pods. This
PR removes cleanup of the driver manifest via CleanupAction because that
is unsafe and racy (f.ClientSet may disappear, for example) and makes
AfterSuite hooks run in an ordered fashion.
2020-05-13 11:15:27 -04:00
Kubernetes Prow Robot
620b7720e6 Merge pull request #90828 from gaurav1086/fix_data_race_storage
Fix data race in storage tests
2020-05-13 00:18:40 -07:00
Gaurav Singh
af74fbabf4 Remove unused err variable 2020-05-08 14:20:35 -04:00
Saikat Roychowdhury
dcfaaefc60 Pickup Snapshot Provisioner from the snapshot class "driver" info.
When using FromFile or FromExisitingClass options, snapshot provisioner
should be picked up from the "driver" tag of VolumeSnapshotClass object.
2020-05-08 05:45:36 +00:00
Kubernetes Prow Robot
7f78048594 Merge pull request #90781 from msau42/increase-timeout
Increase timeout waiting for driver to start on nodes
2020-05-06 22:23:08 -07:00
Gaurav Singh
37458b350e Fix data race in storage 2020-05-06 22:57:08 -04:00
Patrick Ohly
5aa3805a5f mock e2e test: reduce flakiness by not testing all calls
kubelet sometimes calls NodeStageVolume and NodePublishVolume too
often, which breaks this test and leads to flakiness. The test isn't
about that, so we can relax the checking and it still covers what it
was meant to cover.
2020-05-06 11:43:16 +02:00
Michelle Au
fc08f74157 Increase timeout waiting for driver to start on nodes to reduce test flakiness
Change-Id: Id553943e4473b387bf0ae14a18a90cb3a1bcd5c1
2020-05-05 18:10:10 -07:00
Kubernetes Prow Robot
fbacb6e264 Merge pull request #90335 from pohly/cleanup-late-binding
e2e storage: wait for PV deletion also for late binding
2020-05-04 18:05:07 -07:00
Antonio Ojea
26a00f9032 add ipv6 support to the e2e nfs tests
The nfs mount command needs the IP enclosed in square brackets
if it is an IPv6 address.
2020-05-03 11:06:10 +02:00
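
As an illustration of the bracket rule in the commit above, here is a minimal Go sketch. The helper name nfsServerAddress and the example export path are hypothetical and not taken from the e2e code; the sketch only shows how an IPv6 literal can be detected with net.ParseIP and wrapped before it is joined with the export path.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// nfsServerAddress (hypothetical helper) returns host in a form usable as the
// server part of an NFS mount source. IPv6 literals are wrapped in square
// brackets; IPv4 addresses and host names are returned unchanged.
func nfsServerAddress(host string) string {
	// A valid IP literal that contains a colon is written in IPv6 notation.
	if ip := net.ParseIP(host); ip != nil && strings.Contains(host, ":") {
		return "[" + host + "]"
	}
	return host
}

func main() {
	for _, host := range []string{"10.0.0.5", "fd00::5", "nfs.example.com"} {
		// "/exports" is an example path only.
		fmt.Printf("%s:/exports\n", nfsServerAddress(host))
	}
}
```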
Patrick Ohly
e3d258d6ca e2e storage: wait for PV deletion also for late binding
When a test pattern or storage class uses late binding, the cleanup
code didn't know about the PV that may have been created for the PVC
after setup, and thus it also didn't wait for PV deletion.

This is problematic for test isolation because the next test was
allowed to start before cleanup had fully finished. Worse, if the driver
gets removed after the test, the volume might never get deleted.
2020-04-27 10:34:50 +02:00
Di Xu
3f5e09b6e2 add e2e tests for HostPathType and mark as slow 2020-04-23 10:52:40 +08:00
Kubernetes Prow Robot
fc9d174102 Merge pull request #88248 from claudiubelu/tests/reduce-to-agnhost-mounttest
tests: Replaces mounttest images used with agnhost (part 4)
2020-04-22 04:53:52 -07:00
Kubernetes Prow Robot
07179d0207 Merge pull request #87998 from msau42/e2e-reattach-stress
Add stress test to repeatedly restart Pods with PVCs in parallel
2020-04-21 03:04:57 -07:00
Kubernetes Prow Robot
7c53c1eb91 Merge pull request #89819 from pohly/enhance-podlogs-master
tests: enhance podlogs
2020-04-16 22:19:07 -07:00
Kubernetes Prow Robot
ae8d30631d Merge pull request #90214 from pohly/stop-pod-master
storage tests: really wait for pod to disappear
2020-04-16 20:49:07 -07:00
Michelle Au
6596e20b18 Make stress test parameters configurable
Change-Id: Ia062f3433b6043825a51a54c7c07eb4cdf809631
2020-04-16 14:18:21 -07:00
Kubernetes Prow Robot
aa0665dfee Merge pull request #90147 from gnufied/use-random-node-zone-for-inline-e2e
Use random zone for inline volume e2e tests
2020-04-16 13:59:08 -07:00
Patrick Ohly
0cdd5365a1 storage tests: really wait for pod to disappear
As seen in one case (https://github.com/intel/pmem-csi/issues/587), a
pod can reach the "not running" state although its ephemeral volumes
are still being torn down by kubelet and the CSI driver. What happens
then is that the test returns too early and even deleting the
namespace and thus the pod succeeds before the NodeVolumeUnpublish
really finishes.

To avoid this, StopPod now waits for the pod to really disappear.
2020-04-16 21:10:56 +02:00
Michelle Au
e132b77ae4 Add stress test to repeatedly restart Pods with PVCs in parallel
Change-Id: I499571cc86b1058d0e16d79e5e998d1dedfd9a4a
2020-04-15 18:10:35 -07:00
Hemant Kumar
7d6712632c Use random zone for inline volume e2e tests 2020-04-14 23:37:21 -04:00
Patrick Ohly
2ae6cf5984 mock tests: per-test timeout for ResourceExhausted
The two loops inside the test itself are now bounded
by an upper limit for the duration of the entire test instead of
having their own, rather arbitrary timeouts.
2020-04-14 09:11:42 +02:00
Patrick Ohly
48f8e398fb mock tests: remove redundant wrapping of error
The "error waiting for expected CSI calls" is redundant because it's
immediately followed by checking that error with:

   framework.ExpectNoError(err, "while waiting for all CSI calls")
2020-04-07 13:09:31 +02:00
Patrick Ohly
2550051f3b mock tests: add timeout
The for loop that waited for the signal to delete pod had no timeout,
so if something went wrong, it would wait for the entire test suite to
time out.
2020-04-07 13:09:31 +02:00