ginkgo has a weird bug that - AfterEach does not get called when
testsuite exits with certain kind of interrupt (Ctrl-C for example).
More info - https://github.com/onsi/ginkgo/issues/222
We workaround this issue in Kubernetes by adding a special hook into
AfterSuite call, but AfterSuite can not be used to peforms certain
kind of cleanup because it can race with AfterEach hook and
framework.AfterEach hook will set framework.ClientSet to nil.
This presents a problem in cleaning up CSI driver and testpods. This
PR removes cleanup of driver manifest via CleanupAction because that
is not safe and racy (such as f.ClientSet may disappear!) and makes
AfterSuite hooks run in a ordered fashion
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to cleanup, but I want to minimize apiserver side
changes as much as possible in the client signature refactor.
We don't want to set the name directly because then starting the pod
can fail when the node is temporarily out of resources
(https://github.com/kubernetes/kubernetes/issues/87855).
For CSI driver deployments, we have three options:
- modify the pod spec with custom code, similar
to how the NodeSelection utility code does it
- add variants of SetNodeSelection and SetNodeAffinity which
work with a pod spec instead of a pod
- change their parameter from pod to pod spec and then use
them also when patching a pod spec
The last approach is used here because it seems more general. There
might be other cases in the future where there's only a pod spec that
needs to be modified.
Many times an e2e test fails with an unexpected error,
"timed out waiting for the condition".
Useful information may be in the test logs, but debugging e2e test
failures will be much faster if we add context to errors when they
happen.
This change makes sure we add context to all errors returned from
helpers like wait.Poll().
Issue: https://github.com/kubernetes/kubernetes/issues/86884
This PR fixes the issue that err variable gets shadowed. Because of
this, it might get nil pointer error.
Change-Id: Ib7da918418a7c8148a6ca598db12b3744eb3b7c8
Use POST method instead of running local kubectl.
Use ExecCommandInContainerWithFullOutput() instead of RunKubectl().
PodExec() takes additional framework arg, passed down in call chain.
VerifyExecInPodFail uses different error code cast as original
one causes test code Panic if used with new call method.
The following functions are called at some specific places only,
so this moves these functions to the places and makes them local.
- WaitForPersistentVolumeClaimDeleted: Moved to e2e storage
- PrintSummaries: Moved to e2e framework.go
- GetHostExternalAddress: Moved to e2e node
- WaitForMasters: Moved to e2e cloud gcp
- WaitForApiserverUp: Moved to e2e network
- WaitForKubeletUp: Moved to e2e storage vsphere
A number of tests were using hardcoded image paths instead of
going through the imageutils package. The reason for centralizing
the logic there is to keep an eye on what images we use and where
they come from.
We need the 1.2.0 driver for that because that has support for
detecting the volume mode dynamically, and we need to deploy a
CSIDriver object which enables pod info (for the dynamic detection)
and both modes (to satisfy the new mode sanity check).
Using a "normal" CSI driver for an inline ephemeral volume may have
unexpected and potentially harmful effects when the driver gets a
NodePublishVolume call that it isn't expecting. To prevent that mistake,
driver deployments for a driver that supports such volumes must:
- deploy a CSIDriver object for the driver
- set CSIDriver.Spec.VolumeLifecycleModes such that it contains "ephemeral"
The default for that field is "persistent", so existing deployments
continue to work and are automatically protected against incorrect
usage.
For the E2E tests we need a way to specify the driver mode. The
existing cluster-driver-registrar doesn't support that and also was
deprecated, so we stop using it altogether and instead deploy and
patch a CSIDriver object.