Tests should accept a context from Ginkgo and pass it through to all functions
which may block for a longer period of time. In particular all Kubernetes API
calls through client-go should use that context. Then if a timeout occurs,
the test returns immediately because everything that it could block on will
return.
Cleanup code then needs to run in a separate Ginkgo node, typically
DeferCleanup, which ensures that it gets a separate context which has not timed
out yet.
Followup on https://github.com/kubernetes/kubernetes/pull/111846. This
particular test was left out from that PR because once it was enabled it
started failing. It was desired to merge
https://github.com/kubernetes/kubernetes/pull/111846 irrespective of
this particular test.
The failure in the test was caused due to the
`createFSGroupRequestPreHook` mock CSI driver hook function assuming
that the request object passed to it is an instance of the respective
struct, but it's actually a pointer instead. This resulted in the hook
function not fulfilling its purpose, and the so the test failed.
Fixes instances of #98213 (to ultimately complete #98213 linting is
required).
This commit fixes a few instances of a common mistake done when writing
parallel subtests or Ginkgo tests (basically any test in which the test
closure is dynamically created in a loop and the loop doesn't wait for
the test closure to complete).
I'm developing a very specific linter that detects this king of mistake
and these are the only violations of it it found in this repo (it's not
airtight so there may be more).
In the case of Ginkgo tests, without this fix, only the last entry in
the loop iteratee is actually tested. In the case of Parallel tests I
think it's the same problem but maybe a bit different, iiuc it depends
on the execution speed.
Waiting for the CI to confirm the tests are still passing, even after
this fix - since it's likely it's the first time those test cases are
executed - they may be buggy or testing code that is buggy.
Another instance of this is in `test/e2e/storage/csi_mock_volume.go` and
is still failing so it has been left out of this commit and will be
addressed in a separate one
- update all the import statements
- run hack/pin-dependency.sh to change pinned dependency versions
- run hack/update-vendor.sh to update go.mod files and the vendor directory
- update the method signatures for custom reporters
Signed-off-by: Dave Chen <dave.chen@arm.com>
When adding the ephemeral volume feature, the special case for
PersistentVolumeClaim volume sources in kubelet's host path and node
limits checks was overlooked. An ephemeral volume source is another
way of referencing a claim and has to be treated the same way.
All dependencies of VolumeBinding plugin from
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package moved to
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go
Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it only used in VolumeBinding plugin and (e2e) tests.
More described in issue #89930 and PR #102953.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
The previous approach with grabbing via a nginx proxy had some
drawbacks:
- it did not work when the pods only listened on localhost (as
configured by kubeadm) and the proxy got deployed on a different
node
- starting the proxy raced with starting the pods, causing
sporadic test failures because the proxy was not set up
properly unless it saw all pods when starting the e2e.test
- the proxy was always started, whether it is needed or not
- the proxy was left running after a test and then the next
test run triggered potentially confusing messages when
it failed to create objects for the proxy
The new approach is similar to "kubectl port-forward" + "kubectl get
--raw". It uses the port forwarding feature to establish a TCP
connection via a custom dialer, then lets client-go handle TLS and
credentials.
Somehow verifying the server certificate did not work. As this
shouldn't be a big concern for E2E testing, certificate checking gets
disabled on the client side instead of investigating this further.
This replaces embedding of JavaScript code into the mock driver that
runs inside the cluster with Go callbacks which run inside the
e2e.test suite itself. In contrast to the JavaScript hooks, they have
direct access to all parameters and can fabricate arbitrary responses,
not just error codes.
Because the callbacks run in the same process as the test itself, it
is possible to set up two-way communication via shared variables or
channels. This opens the door for writing better tests. Some of the
existing tests that poll mock driver output could be simplified, but
that can be addressed later.
For now, only tests using hooks use embedding. How gRPC calls are
retrieved is abstracted behind the CSIMockTestDriver interface, so
tests don't need to be modified when switching between embedding
and remote mock driver.
Extract TestSuite, TestDriver, TestPattern, TestConfig
and VolumeResource, SnapshotVolumeResource from testsuite
package and put them into a new package called api.
The ultimate goal here is to make the testsuites as clean
as possible. And only testsuites in the package.
As discussed in https://github.com/kubernetes/kubernetes/pull/93658,
relying on a watch to deliver all events is not acceptable, not even
for a test, because it can and did (at least for OpenShift testing) to
test flakes.
The solution in #93658 was to replace log flooding with a clear test
failure, but that didn't solve the flakiness.
A better solution is to use a RetryWatcher which "under normal
circumstances" (https://github.com/kubernetes/kubernetes/pull/93777#discussion_r467932080)
should always deliver all changes.
In our current mock CSI driver e2e test, we are not waiting
for the CSI driver register successfully to perform test
including provision PVC. This can lead to timeout when the
csi driver takes longer to register the socket.
This change adds the waiting part so that the system will
wait for up to 10 minutes for the driver to be ready. This
normally won't take this long. However, under a resource
constraint environment it can take longer than expected time.
https://github.com/kubernetes/kubernetes/issues/93358