They contain some nice-to-have improvements (for example, better printing of
errors with gomega/format.Object) but nothing that is critical right now.
"go mod tidy" was run manually in
staging/src/k8s.io/kms/internal/plugins/mock (https://github.com/kubernetes/kubernetes/pull/116613
not merged yet).
It is possible for a KMSv2 plugin to return a static value as Ciphertext
and store the actual encrypted DEK in the annotations. In this case,
using the encDEK will not work. Instead, we are now using a combination
of the encDEK, keyID and annotations to generate the cache key.
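
A minimal sketch of such a combined cache key follows. It is not the code
from this change: the helper name `cacheKey`, the use of SHA-256, the
length-prefixed encoding and the annotation name are all assumptions made
for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sort"
)

// cacheKey is a hypothetical helper, not the Kubernetes implementation.
// It hashes the encrypted DEK, the key ID and the annotations together.
// Every field is length-prefixed so field boundaries stay unambiguous, and
// annotation names are sorted so map iteration order does not matter.
func cacheKey(encDEK []byte, keyID string, annotations map[string][]byte) string {
	h := sha256.New()
	writeField := func(b []byte) {
		var n [8]byte
		binary.BigEndian.PutUint64(n[:], uint64(len(b)))
		h.Write(n[:])
		h.Write(b)
	}
	writeField(encDEK)
	writeField([]byte(keyID))
	names := make([]string, 0, len(annotations))
	for name := range annotations {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		writeField([]byte(name))
		writeField(annotations[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The plugin returns the same static ciphertext for two different DEKs
	// and keeps the real wrapped DEK in an (assumed) annotation.
	static := []byte("static-ciphertext")
	a := cacheKey(static, "key-1", map[string][]byte{"dek.example.com": []byte("wrapped-A")})
	b := cacheKey(static, "key-1", map[string][]byte{"dek.example.com": []byte("wrapped-B")})
	fmt.Println(a != b)
}
```

Keying on the ciphertext alone would make both entries collide in the
cache; including the annotations keeps them distinct.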
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
The Poll* methods predate context in Go, and the current implementation
will return ErrWaitTimeout even if the context is cancelled, which
prevents callers who are using Poll* from handling that error directly
(for instance, if you want to cancel a function in a controlled fashion
but still report cleanup errors to logs, you want to know the difference
between 'didn't cancel', 'cancelled cleanly', and 'hit an error').
This commit adds two new methods that reflect how modern Go uses
context in polling while preserving all Kubernetes-specific behavior:
PollUntilContextCancel
PollUntilContextTimeout
These methods can be used for infinite polling (normal context),
timed polling (deadline context), and cancellable polling (cancel context).
All other Poll/Wait methods are marked as deprecated for removal in
the future. The ErrWaitTimeout error will no longer be returned from the
Poll* methods, but will continue to be returned from ExponentialBackoff*.
Users updating to use these new methods are responsible for converting
their error handling as appropriate. A convenience helper
`Interrupted(err) bool` has been added that should be used instead of
checking `err == ErrWaitTimeout`. In a future release ErrWaitTimeout will
be made private to prevent incorrect use. The helper can be used with all
polling methods since context cancellation and deadline are semantically
equivalent to ErrWaitTimeout. A new `ErrorInterrupted(cause error)` method
should be used instead of returning ErrWaitTimeout in custom code.
The convenience method PollUntilContextTimeout is added because deadline
context creation is verbose and the cancel function must be called to
properly clean up the context; without it, many current poll users would
see their code grow. To reduce the overall method surface area, the
distinction between PollImmediate and Poll has been reduced to a single
boolean on PollUntilContextCancel so we do not need multiple helper methods.
The existing methods were not altered because ecosystem callers have been
observed to use ErrWaitTimeout to mean "any error that my condition func
did not return" which prevents cancellation errors from being returned
from the existing methods. Callers must make a deliberate migration.
Callers migrating to `PollUntilContextCancel` should:
1. Pass a context with a deadline or timeout if they were previously using
`Poll*Until*` and check `err` for `context.DeadlineExceeded` instead of
`ErrWaitTimeout` (more specific) or use `Interrupted(err)` for a generic
check.
2. Check for `context.Canceled` instead of `ErrWaitTimeout` to detect when
the poll was stopped early, if they were previously waiting forever or
for context cancellation.
Callers of `ExponentialBackoffWithContext` should use `Interrupted(err)`
instead of directly checking `err == ErrWaitTimeout`. No other changes are
needed.
Code that returns `ErrWaitTimeout` should instead define a local cause
and return `wait.ErrorInterrupted(cause)`, which will be recognized by
`wait.Interrupted()`. If nil is passed the previous message will be used
but clients are highly recommended to use typed checks vs message checks.
As a consequence of this change, the new methods are more efficient: Poll
uses one fewer goroutine.
Added EnableNodeLogQuery field to kubelet/apis/config/types.go and
staging/src/k8s.io/kubelet/config/v1beta1/types.go, then executed
`hack/update-codegen.sh`.
This new field will default to off and will need to be explicitly
enabled in addition to the NodeLogQuery gate to use the feature.
This change updates KMS v2 to not create a new DEK for every
encryption. Instead, we re-use the DEK while the key ID is stable.
Specifically:
We no longer use a random 12 byte nonce per encryption. Instead, we
use both a random 4 byte nonce and an 8 byte nonce set via an atomic
counter. Since each DEK is randomly generated and never re-used,
the combination of DEK and counter is always unique. Thus there
can never be a nonce collision. AES GCM strongly encourages the use
of a 12 byte nonce, hence the additional 4 byte random nonce. We
could leave those 4 bytes set to all zeros, but there is no harm in
setting them to random data (it may help in some edge cases such as
live VM migration).
If the plugin is not healthy, the last DEK will be used for
encryption for up to three minutes (there is no difference in the
behavior of reads, which have always used the DEK cache). This will
reduce the impact of a short plugin outage while making it easy to
perform storage migration after a key ID change (i.e. simply wait
ten minutes after the key ID change before starting the migration).
The DEK rotation cycle is performed in sync with the KMS v2 status
poll, so we always have the correct information to determine if a
read is stale with regard to storage migration.
Signed-off-by: Monis Khan <mok@microsoft.com>