In the API, the effect of the feature gate is that alpha fields get dropped on
create. They get preserved during updates if already set. The
PodSchedulingContext registration is *not* restricted by the feature gate.
This enables deleting stale PodSchedulingContext objects after disabling
the feature gate.
The scheduler checks the new feature gate before setting up an informer for
PodSchedulingContext objects and when deciding whether it can schedule a
pod. If any claim depends on a control plane controller, the scheduler bails
out, leading to:
Status: Pending
...
Warning FailedScheduling 73s default-scheduler 0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
The rest of the changes prepare for testing the new feature separately from
"structured parameters". The goal is to have base "dra" jobs which just enable
and test those, then "classic-dra" jobs which add DRAControlPlaneController.
This is a complete revamp of the original API. Some of the key
differences:
- refocused on structured parameters and allocating devices
- support for constraints across devices
- support for allocating "all" or a fixed amount
of similar devices in a single request
- no class for ResourceClaims, instead individual
device requests are associated with a mandatory
DeviceClass
For the sake of simplicity, optional basic types (ints, strings) where the null
value is the default are represented as values in the API types. This makes Go
code simpler because it doesn't have to check for nil (consumers) and values
can be set directly (producers). The effect is that in protobuf, these fields
always get encoded because `opt` only has an effect for pointers.
The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new
"request" field. This is considered acceptable because the entire `claims`
field in the pod spec is still alpha.
The implementation is complete enough to bring up the apiserver.
Adapting other components follows.
As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate
allocation is one of those features which can be removed because it makes no
sense for structured parameters and the justification for classic DRA is weak.
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.
Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.
Only source code where the version really matters (like API registration)
retains the versioned import.
* Pod terminationGracePeriodSeconds is always valid
Validation of a pod spec will always use the pod's
TerminationGracePeriodSeconds value.
A set of pod test-helpers have been created to help construct Pods.
* remove unused func
* reduction
* reduce 2
* simplify test
* report invalid grace period
* update SupplementalGroupPolicy tests
MultiCIDRServiceAllocator implements a new ClusterIP allocator based on
IPAddress object to solve the problems and limitations caused by
existing bitmap allocators.
However, during the rollout of new versions, deployments need to support
a skew of one version between kube-apiservers. To avoid the possible
problem where there are multiple Services requests on the skewed
apiservers and that both allocate the same IP to different Services,
the new allocator will implement a dual-write strategy under the
feature gate DisableAllocatorDualWrite.
After the MultiCIDRServiceAllocator is GA, the DisableAllocatorDualWrite
can be enabled safely as all apiservers will run with the new
allocators. The graduation of DisableAllocatorDualWrite can also
be used to clean up the opaque API object that contains the old bitmaps.
If MultiCIDRServiceAllocator is enabled and DisableAllocatorDualWrite is disable
and is a new environment, there is no bitmap object created, hence, the
apiserver will initialize it to be able to write on it.
ServiceCIDRs are protected by finalizers and the CIDRs fields are
inmutable once set, only the readiness state impact the allocator
as it can only allocate IPs if any of the ServiceCIDR is ready.
The Add/Update events triggers a reconcilation of the current state
of the ServiceCIDR present in the informers with the existing IP
allocators.
The Delete events are handled directly to update or delete the
corresponing IP allocator.