sig-storage tests that delete pods need to wait for owned resources to
also be cleaned up before returning in the case that resources such as
ephemeral inline volumes are being used. This was previously implemented
by modifying the pod delete call of the e2e framework, which negatively
impacted other tests. This was reverted and now the logic has been moved
to StopPodAndDependents, which is local to the sig-storage tests.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Adds check for index out of bounds error instead of panic when passing
container to kubectl exec.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
This extends the existing "ephemeral volume" tests to also cover
generic ephemeral inline volumes. They get instantiated for all
drivers (CSI and others) which support persistent volume provisioning,
for several different filesystems.
Configuring the number of inline volumes via a flag with a computed
name had been identified as problematic before and now gets removed
because re-using the tests as a stress test with a higher number of
volumes should be added and configured separately.
Windows test for subPath is failing due to an issue related to
removeUnusedContainers calls. After image is changed to Agnhost, it
automatically has a args by default. However, there are places to use
container commands instead of args and causing issues.
This is the first step to fix this issue. Next plan to replace
busybox used in Linux with Agnhost which can work for both linux and
windows.
I also mark two subPath tests as LinuxOnly. I think they are not ready
for windows yet. Before they were passing due to wrong reason. The tests
checks failed container status but the contain fails due to other
reasons than what we expected.
The test steps are as follows:
1. Write some data
2. Take a snapshot
3. Write more data
4. Create a new volume from snapshot
5. Validate data is the old data
1. Use ginkgo before each to do common setup
2. Use volume resource to create SC, PV, PVC and handle cleanup
3. Add SnapshotResource to handle creating and cleanup of VS, VSC, VSClass
4. Add test pattern for deletion policy: Delete vs Retain
5. Use test pattern to determine test behaviour
6. Add test pattern for preprovisioned snapshot (not implemented)
These changes are made to consolidate common setup steps and stop resource
leaks by waiting for objects to be deleted.
By creating CSIStorageCapacity objects in advance, we get the
FailedScheduling pod event if (and only if!) the test is expected to
fail because of insufficient or missing capacity. We can use that as
indicator that waiting for pod start can be stopped early. However,
because we might not get to see the event under load, we still need
the timeout.
Setting testParameters.scName had no effect because
StorageClassTest.StorageClassName isn't used anywhere. Instead, the
storage class name is generated dynamically.
When using the entire test name as file name, the name became too
long (> 256 characters, which wasn't supported by all file systems)
and the artifact directory got cluttered.
The original reason (a limitation in Gubernator) no longer applies
because Spyglass is used now for log viewing.
This drops testfiles.ReadOrDie and updated testfiles.Exists to return an
error, forcing the caller to decide whether to call framework.Fail or do
something else.
It makes for a slightly less friendly API, but also means the package is
decoupled from framework again, as per the comments at the top of the
file
As its name, DeprecatedMightBeMasterNode is deprecated.
In e2e metrics, the function was used for knowing master node name to
get metrics from kube-scheduler and kube-controller-manager pods.
This make e2e metrics get these metrics directly by getting those pod
names without calling DeprecatedMightBeMasterNode().
There were nits in invokeStaleDummyVMTestWithStoragePolicy() like
- The error message didn't contain necessary space
- IsVMPresent() can return an error, but lack of the error handling
- IsVMPresent() returns true/false, but didn't use ExpectEqual() and
less code readability
This fixes those things.
WaitForStableCluster() checks all pods run on worker nodes, and the
function used to refer master nodes to skip checking controller plane
pods.
GetMasterAndWorkerNodes() was used for getting master nodes, but the
implementation is not good because it usesDeprecatedMightBeMasterNode().
This makes WaitForStableCluster() refer worker nodes directly to avoid
using GetMasterAndWorkerNodes().
The behavior for exec'ing a backgrounded command is not specified
with CRI so modify the test to run the command directly instead
of using exec.
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
The way gingko handles interrupts is:
- It starts running AfterSuite hooks in a separate goroutine (this includes cleanupAction hooks)
- Once AfterSuite hook is done executing it calls
os.Exit(1) on test suite.
So how cleanupFunc() that runs via defer in test can be interrupted
is:
- cleanupFunc starts running via defer (or AfterEach hook) but first
thing that function does is to remove cleanupHandle from
framework.RemoveCleanupAction.
- Test suite receives interrupt from user and AfterSuite block
starts executing
- remember that while cleanupFunc is running in goroutine#1,
AfterSuite is running concurrently in goroutine#2.
- AfterSuite hook has bunch of CleanupActions it needs to run which
were registered via framework.AddCleanupAction(cleanupFunc) but
once cleanupFunc starts executing via defer in the test, it will
remove the cleanupHandle from framework's aftersuite hooks.
- So if AfterSuite did not had anything to run (because
those actions were removed via framework.RemoveCleanupAction
then it will simply go to the last framework.AfterEach action and call os.Exit(1)
- So if os.Exit(1) is called before cleanupFunc has a chance to finish in defer, it will not complete.