When replacing the deterministic ResourceClaim name with a generated one this
particular piece of the original validation was incorrectly left in place.
It's not required anymore that "<pod name>-<claim name in pod spec>" is a valid
ResourceClaim name.
Currently, there are some unit tests that are failing on Windows due
to various reasons:
- IPVS proxy mode is not supported on Windows.
- pkg/kubelet/cri/remote was moved to cri-client.
This makes the API nicer:
resourceClaims:
- name: with-template
resourceClaimTemplateName: test-inline-claim-template
- name: with-claim
resourceClaimName: test-shared-claim
Previously, this was:
resourceClaims:
- name: with-template
source:
resourceClaimTemplateName: test-inline-claim-template
- name: with-claim
source:
resourceClaimName: test-shared-claim
A more long-term benefit is that other, future alternatives
might not make sense under the "source" umbrella.
This is a breaking change. It's justified because DRA is still
alpha and will have several other API breaks in 1.31.
The JSON patch approach works, but it is complex. A retry loop is easier to
understand (detect conflict, get new claim, try again). There is one additional
API call (the get), but in practice this scenario is unlikely.
There was a race caused by having to update claim finalizer and status in two
different operations:
- Resource claim controller removes allocation, does not yet
get to remove the finalizer.
- Scheduler prepares an allocation, without adding the finalizer
because it's there.
- Controller removes finalizer.
- Scheduler adds allocation.
This is an invalid state. Automatic checking found this during the execution of
the "with translated parameters on single node.*supports sharing a claim
sequentially" E2E test, but only when run stand-alone. When running in
parallel (as in the CI), the bad outcome of the race did not occur.
The fix is to check that the finalizer is still set when adding the
allocation. The apiserver doesn't check that because it doesn't know which
finalizer goes with the allocation result. It could check for "some finalizer",
but that is not guaranteed to be correct (could be some unrelated one).
Checking the finalizer can only be done with a JSON patch. Despite the
complications, having the ability to add multiple pods concurrently to
ReservedFor seems worth it (avoids expensive rescheduling or a local retry
loop).
The resource claim controller doesn't need this, it can do a normal update
which implicitly checks ResourceVersion.
MultiCIDRServiceAllocator implements a new ClusterIP allocator based on
IPAddress object to solve the problems and limitations caused by
existing bitmap allocators.
However, during the rollout of new versions, deployments need to support
a skew of one version between kube-apiservers. To avoid the possible
problem where there are multiple Services requests on the skewed
apiservers and that both allocate the same IP to different Services,
the new allocator will implement a dual-write strategy under the
feature gate DisableAllocatorDualWrite.
After the MultiCIDRServiceAllocator is GA, the DisableAllocatorDualWrite
can be enabled safely as all apiservers will run with the new
allocators. The graduation of DisableAllocatorDualWrite can also
be used to clean up the opaque API object that contains the old bitmaps.
If MultiCIDRServiceAllocator is enabled and DisableAllocatorDualWrite is disable
and is a new environment, there is no bitmap object created, hence, the
apiserver will initialize it to be able to write on it.
ServiceCIDRs are protected by finalizers and the CIDRs fields are
inmutable once set, only the readiness state impact the allocator
as it can only allocate IPs if any of the ServiceCIDR is ready.
The Add/Update events triggers a reconcilation of the current state
of the ServiceCIDR present in the informers with the existing IP
allocators.
The Delete events are handled directly to update or delete the
corresponing IP allocator.
The DRA plugin does that. It didn't actually work and only printed an error
message about NodeInfo not implementing klog.KMetata. That's not a compile-time
check due to limitations with Go generics and had been missed earlier.
The encoding/json package marshals []byte to a JSON string containing the base64 encoding of the
input slice's bytes, and unmarshals JSON strings to []byte by assuming the JSON string contains a
valid base64 text.
As a binary format, CBOR is capable of representing arbitrary byte sequences without converting them
to a text encoding, but it also needs to interoperate with the existing JSON serializer. It does
this using the "expected later encoding" tags defined in RFC 8949, which indicate a specific text
encoding to be used when interoperating with text-based protocols. The actual conversion to or from
a text encoding is deferred until necessary, so no conversion is performed during roundtrips of
[]byte to CBOR.