When a resource gets deleted during migration, the SVM SSA patch
calls are interpreted as a logical create request. Since the object
from storage is nil, the merged result is just a type meta object,
which lacks a name in the body. This fails when the API server
checks that the name from the request URL and the body are the same.
Note that a create request is something that the SVM controller
should never do.
Once the UID is set on the patch, the API server will fail the
request at a slightly earlier point with a "uid mismatch" conflict
error, which the SVM controller can handle gracefully.
Setting UID by itself is not sufficient. When a resource gets
deleted and recreated, if RV is not set but UID is set, we would get
an immutable field validation error for attempting to update the
UID. To address this, we set the resource version on the SSA patch
as well. This will cause that update request to also fail with a
conflict error.
Added the create verb on all resources for SVM controller RBAC as
otherwise the API server will reject the request before it fails
with a conflict error.
The change addresses a host of other issues with the SVM controller:
1. Include failure message in SVM resource
2. Do not block forever on unsynced GC monitor
3. Do not immediately fail on GC monitor being missing, allow for
a grace period since discovery may be out of sync
4. Set higher QPS and burst to handle large migrations
Test changes:
1. Clean up CRD webhook converter logs
2. Allow SVM tests to be run multiple times to make finding flakes easier
3. Create and delete CRs during CRD test to force out any flakes
4. Add a stress test with multiple parallel migrations
5. Enable RBAC on KAS
6. Run KCM directly to exercise wiring and RBAC
7. Better logs during CRD migration
8. Scan audit logs to confirm SVM controller never creates
Signed-off-by: Monis Khan <mok@microsoft.com>
format.Object adds some white space in front of the value and a type identifier
in angle brackets. Both are distracting when printing simple values and can be
avoided by picking fmt.Sprintf for those types, plus trimming the result of
format.Object.
Before:
allocator.go:483: I0625 15:35:31.946980] Allocating one device currentClaim= <int>: 0 totalClaims= <int>: 1 currentRequest= <int>: 0 totalRequestsPerClaim= <int>: 1 currentDevice= <int>: 0 devicesPerRequest= <int>: 1 allDevices= <bool>: false adminAccess= <bool>: false
After:
allocator.go:483: I0625 15:35:04.371441] Allocating one device currentClaim=0 totalClaims=1 currentRequest=0 totalRequestsPerClaim=1 currentDevice=0 devicesPerRequest=1 allDevices=false adminAccess=false
testify is used throughout the codebase; this switches mocks from
gomock to testify with the help of mockery for code generation.
Handlers and mocks in test/utils/oidc are moved to a new package:
mockery operates package by package, and requires packages to build
correctly; test/utils/oidc/testserver.go relies on the mocks and fails
to build when they are removed. Moving the interface and mocks to a
different package allows mockery to process that package without
having to build testserver.go.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Several pods sharing the same claim is not common, but can be useful and thus
should get tested.
Before, createPods and createAny operations were not able to do this because
each generated object was the same. What we need are different, predictable
names of the claims (from createAny) and different references to those in the
pods (from createPods). Now text/template processing, with the index number of
the pod or claim as input, is used to inject these varying fields. A
"div" function is needed to use the same claim in several different pods.
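
A minimal sketch of this templating, using only text/template: the field name .Index, the helper name renderName, and the exact signature of "div" are assumptions for illustration and may differ from the real test code.

```go
package main

import (
	"bytes"
	"text/template"
)

// renderName expands a name template for the object with the given index.
// The "div" function does integer division, so consecutive pod indices can
// map to the same claim name.
// Hypothetical helper: names and template data layout are assumptions.
func renderName(tmplText string, index int) (string, error) {
	funcs := template.FuncMap{
		"div": func(a, b int) int { return a / b },
	}
	tmpl, err := template.New("name").Funcs(funcs).Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, map[string]int{"Index": index}); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

With the template `claim-{{div .Index 2}}`, pods 0 and 1 reference claim-0 and pods 2 and 3 reference claim-1, while `claim-{{.Index}}` gives each generated claim its own predictable name.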
While at it, some existing test cases get cleaned up a bit (removal of
incorrect comments, adding comments for testing with queuing hints).
See https://github.com/golang/mock#gomock: golang/mock is no longer
maintained, and should be replaced by go.uber.org/mock.
This allows golang/mock to be dropped from the status and vendored
fields in unwanted-dependencies.json.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
The return type of ktesting.NewTestContext is now a TContext. Code which
combined it with WithCancel often no longer compiled (cannot overwrite
ktesting.TContext with context.Context). This is a good thing because all of
that code can be simplified to let ktesting handle the cancelation.
Extending the duration and the allowed delta in f6682370b1 was still not enough
to make the unit test run reliably in pull-kubernetes-unit.
Now it uses the original, stricter timing again, but only when run locally. In
Prow (detected by checking the "CI" env variable), the duration check is
skipped.
Since v2.45, the `stress` subcommand has been added and the CI issue has been fixed:
- kubernetes/kubernetes PR 123258
- kubernetes/kubernetes PR 123284
- kubernetes/k8s.io PR 6422
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
The new TContext interface combines a normal context and the testing interface,
then adds some helper methods. The context gets canceled when the test is done,
but that can also be requested earlier via Cancel.
The intended usage is to pass a single `tCtx ktesting.TContext` parameter
around in all helper functions that get called by a unit or integration test.
Logging is also more useful: Error[f] and Fatal[f] output is prefixed with
"[FATAL] ERROR: " to make it stand out more from regular log output.
If this approach turns out to be useful, it could be extended further (for
example, with a per-test timeout) and might get moved to a staging repository
to enable usage of it in other staging repositories.
To allow other implementations besides testing.T and testing.B, a custom
ktesting.TB interface gets defined with the methods expected from the
actual implementation. One such implementation can be ginkgo.GinkgoT().
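
The overall shape can be illustrated with a much reduced sketch. It is not the ktesting implementation: the method set of TB is cut down to four methods, the Cancel signature is an assumption, and newTContext and recorder are hypothetical names. The recorder type shows that something other than testing.T or testing.B can satisfy the custom interface.

```go
package main

import (
	"context"
	"fmt"
)

// TB lists the methods this sketch expects from the underlying test object.
// testing.T and testing.B provide all of them.
type TB interface {
	Helper()
	Cleanup(func())
	Logf(format string, args ...any)
	Fatalf(format string, args ...any)
}

// TContext combines a normal context with the testing interface and adds
// a Cancel helper. The Cancel signature is an assumption.
type TContext interface {
	context.Context
	TB
	Cancel(cause string)
}

type tContext struct {
	context.Context
	TB
	cancel context.CancelFunc
}

// Cancel cancels the context before the test is done.
func (tc *tContext) Cancel(cause string) {
	tc.Logf("canceling context: %s", cause)
	tc.cancel()
}

// newTContext returns a TContext whose context is canceled when the test
// finishes, because cancel is registered as a cleanup function.
func newTContext(tb TB) TContext {
	ctx, cancel := context.WithCancel(context.Background())
	tb.Cleanup(cancel)
	return &tContext{Context: ctx, TB: tb, cancel: cancel}
}

// recorder is an alternative TB implementation that records log output.
type recorder struct {
	logs     []string
	cleanups []func()
}

func (r *recorder) Helper()          {}
func (r *recorder) Cleanup(f func()) { r.cleanups = append(r.cleanups, f) }
func (r *recorder) Logf(format string, args ...any) {
	r.logs = append(r.logs, fmt.Sprintf(format, args...))
}
func (r *recorder) Fatalf(format string, args ...any) {
	panic("FATAL ERROR: " + fmt.Sprintf(format, args...))
}

// finish runs the registered cleanups in reverse order, as testing.T does.
func (r *recorder) finish() {
	for i := len(r.cleanups) - 1; i >= 0; i-- {
		r.cleanups[i]()
	}
}
```

Helper functions then take a single tCtx parameter and use it both where a context.Context is expected and for logging or failing the test.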