Add a new cidrset named `multicidrset` which extends the current
cidrset mechanism to track allocatable Pod and Service CIDRs.
multicidrset stores information about allocated CIDRs in a map, as
opposed to the current cidrset implementation, which stores it in a
bitmap.
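For illustration only, here is a minimal sketch (the type and method
names are assumptions, not the actual multicidrset API) of tracking
allocated CIDRs in a map keyed by the CIDR string instead of marking
bits in a bitmap:

```go
// Hypothetical sketch of map-based CIDR tracking; names and fields are
// illustrative, not the real multicidrset implementation.
package main

import (
	"fmt"
	"net"
	"sync"
)

// cidrTracker records allocated CIDRs in a map keyed by their string
// form, instead of marking bits in a fixed-size bitmap of sub-ranges.
type cidrTracker struct {
	mu        sync.Mutex
	allocated map[string]*net.IPNet
}

func newCIDRTracker() *cidrTracker {
	return &cidrTracker{allocated: map[string]*net.IPNet{}}
}

// Occupy marks a CIDR as allocated; it returns false if it was already
// taken.
func (t *cidrTracker) Occupy(cidr *net.IPNet) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	key := cidr.String()
	if _, ok := t.allocated[key]; ok {
		return false
	}
	t.allocated[key] = cidr
	return true
}

// Release frees a previously allocated CIDR.
func (t *cidrTracker) Release(cidr *net.IPNet) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.allocated, cidr.String())
}

func main() {
	_, cidr, _ := net.ParseCIDR("10.0.0.0/24")
	tracker := newCIDRTracker()
	fmt.Println(tracker.Occupy(cidr)) // true
	fmt.Println(tracker.Occupy(cidr)) // false: already allocated
}
```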
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
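As a rough illustration only (these constant declarations are
assumptions, not the definitions in the Kubernetes API packages), such
reasons are plain string values identifying which component disrupted
the Pod:

```go
// Illustrative only: these constants mirror the reason strings listed
// above; they are not the actual Kubernetes API definitions.
package main

import "fmt"

const (
	ReasonPreemptionByKubeScheduler = "PreemptionByKubeScheduler"
	ReasonDeletionByTaintManager    = "DeletionByTaintManager"
	ReasonEvictionByEvictionAPI     = "EvictionByEvictionAPI"
	ReasonDeletionByPodGC           = "DeletionByPodGC"
)

func main() {
	// Each reason identifies which component disrupted the Pod.
	fmt.Println(ReasonPreemptionByKubeScheduler, ReasonDeletionByPodGC)
}
```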
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The evictorLock only protects zonePodEvictor and zoneNoExecuteTainter.
processTaintBaseEviction showed indications of increased lock contention
among goroutines (see issue 110341 for more details).
The refactor ensures that all code paths in that function that hold
the evictorLock and make API calls under the lock now make those API
calls outside the lock, so the lock is held only while accessing
zonePodEvictor and/or zoneNoExecuteTainter.
The same refactor is applied in two other places, the doEvictionPass
and doNoExecuteTaintingPass functions, which made multiple API calls
under the evictorLock.
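A minimal sketch of the pattern follows; evictorLock, zonePodEvictor,
and zoneNoExecuteTainter are names from the description above, while
everything else (the controller type, evictPodsForNode, the evict
callback) is hypothetical. The idea is to collect the work under the
lock, release it, then issue the API calls:

```go
// Hypothetical sketch of moving API calls out from under a mutex; only
// the shared queues are touched while the lock is held.
package main

import (
	"fmt"
	"sync"
)

type controller struct {
	evictorLock          sync.Mutex
	zonePodEvictor       map[string][]string // zone -> pods queued for eviction
	zoneNoExecuteTainter map[string][]string // zone -> nodes queued for tainting
}

// evictPodsForNode drains the queue for a zone under the lock, then
// performs the (potentially slow) API calls after the lock is released.
func (c *controller) evictPodsForNode(zone string, evict func(pod string) error) error {
	c.evictorLock.Lock()
	pods := c.zonePodEvictor[zone]
	c.zonePodEvictor[zone] = nil
	c.evictorLock.Unlock()

	// API calls happen outside the critical section, so other
	// goroutines are not blocked on the lock while requests are in
	// flight.
	for _, pod := range pods {
		if err := evict(pod); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	c := &controller{
		zonePodEvictor:       map[string][]string{"zone-a": {"pod-1", "pod-2"}},
		zoneNoExecuteTainter: map[string][]string{},
	}
	_ = c.evictPodsForNode("zone-a", func(pod string) error {
		fmt.Println("evict", pod) // stand-in for the real API call
		return nil
	})
}
```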
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
Fix a TODO by plumbing an update filter from above into the resource
quota monitor code that handles update events for quota-able objects,
instead of hard-coding that logic in the resource quota monitor.
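For illustration, a hedged sketch of the idea (the type, field, and
function names here are assumptions, not the actual quota monitor
code): the caller supplies the filter, so the monitor no longer
hard-codes which update events matter.

```go
// Hypothetical sketch: the caller plumbs in the update filter instead
// of the monitor hard-coding the decision.
package main

import "fmt"

// UpdateFilter decides whether an update from oldObj to newObj should
// be requeued for quota recalculation.
type UpdateFilter func(oldObj, newObj interface{}) bool

type quotaMonitor struct {
	updateFilter UpdateFilter
}

func newQuotaMonitor(filter UpdateFilter) *quotaMonitor {
	return &quotaMonitor{updateFilter: filter}
}

func (m *quotaMonitor) handleUpdate(oldObj, newObj interface{}) {
	// The decision is delegated to the plumbed-in filter.
	if m.updateFilter != nil && !m.updateFilter(oldObj, newObj) {
		return
	}
	fmt.Println("enqueue for quota recalculation")
}

func main() {
	m := newQuotaMonitor(func(oldObj, newObj interface{}) bool {
		// Example policy: only react when the object actually changed.
		return fmt.Sprint(oldObj) != fmt.Sprint(newObj)
	})
	m.handleUpdate("a", "b")
}
```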
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
The 6 minute force-detach timeout should be used only for nodes that
are not healthy.
If a CSI driver is being upgraded or is simply slow, NodeUnstage can
take more than 6 minutes. In that case, the Pod has already been
deleted from the API server and the A/D controller will force-detach
a mounted volume, possibly corrupting the volume and breaking CSI - a
CSI driver expects NodeUnstage to succeed before Kubernetes calls
ControllerUnpublish.
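A hedged sketch of the intended check (the helper and parameter names
are hypothetical, not the actual A/D controller code): the force-detach
timeout applies only when the node is unhealthy, so a healthy but slow
NodeUnstage keeps the volume attached.

```go
// Hypothetical sketch: only force-detach after the timeout when the
// node is known to be unhealthy.
package main

import (
	"fmt"
	"time"
)

const maxWaitForUnmount = 6 * time.Minute

func shouldForceDetach(nodeHealthy, stillMounted bool, waited time.Duration) bool {
	if stillMounted {
		// The volume is still mounted on the node; never force-detach
		// from a healthy node, even after the timeout.
		if nodeHealthy {
			return false
		}
		return waited >= maxWaitForUnmount
	}
	return true
}

func main() {
	fmt.Println(shouldForceDetach(true, true, 10*time.Minute))  // false: healthy node, slow NodeUnstage
	fmt.Println(shouldForceDetach(false, true, 10*time.Minute)) // true: unhealthy node, timeout elapsed
}
```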
When a Pod references a Node that doesn't exist in the local informer
cache, the current behavior was to return an error so the Service is
retried later, and to stop processing.
However, a missing Node can then leave an EndpointSlice stuck: it can
not reflect other changes, or even be created.
This also doesn't respect the publishNotReadyAddresses option on
Services, which considers it acceptable to publish Pod addresses that
are known to not be ready.
The new behavior keeps retrying the problematic Service, but keeps
processing updates so the EndpointSlice reflects the current state. If
publishNotReadyAddresses is set, a missing Node on a Pod is not
treated as an error.
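A hedged sketch of the new handling (the function, types, and fields
are hypothetical, not the actual EndpointSlice controller code): a
missing Node is recorded as an error so the Service is retried, but
processing continues, and the error is suppressed entirely when
publishNotReadyAddresses is set.

```go
// Hypothetical sketch of the new behavior: missing Nodes no longer
// abort processing, and publishNotReadyAddresses suppresses the error.
package main

import "fmt"

type pod struct {
	name     string
	nodeName string
}

// collectEndpoints returns the endpoints that could be built plus any
// errors that should cause the Service to be retried later.
func collectEndpoints(pods []pod, nodes map[string]bool, publishNotReadyAddresses bool) ([]string, []error) {
	var endpoints []string
	var errs []error
	for _, p := range pods {
		if !nodes[p.nodeName] && !publishNotReadyAddresses {
			// Remember the problem so the Service is retried, but
			// keep processing the remaining pods instead of bailing
			// out.
			errs = append(errs, fmt.Errorf("node %q for pod %q not found", p.nodeName, p.name))
			continue
		}
		endpoints = append(endpoints, p.name)
	}
	return endpoints, errs
}

func main() {
	pods := []pod{{"a", "node-1"}, {"b", "missing-node"}}
	nodes := map[string]bool{"node-1": true}
	eps, errs := collectEndpoints(pods, nodes, false)
	fmt.Println(eps, len(errs)) // [a] 1: keep going, retry later for pod b
}
```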
There is always a placeholder slice.
The ServicePortCache logic always assumed one EndpointSlice per
Endpoint, but if there are multiple empty Endpoints, we use just one
placeholder slice, not multiple placeholder slices.
Fixes Issue 108231 by checking `slicesToDelete` in the EndpointSlice
reconciler for a pre-existing placeholder slice.
Also adds a helper function for comparing the slices.
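A hedged sketch of the idea (the types and helpers below are
hypothetical, not the actual reconciler code): before creating a new
placeholder, check slicesToDelete for an existing one and keep it
instead of deleting and recreating an identical slice.

```go
// Hypothetical sketch: reuse a pre-existing placeholder slice found in
// slicesToDelete instead of deleting it and creating an identical one.
package main

import "fmt"

type endpointSlice struct {
	name      string
	endpoints []string
	ports     []int32
}

// isPlaceholder reports whether the slice carries no endpoints and no
// ports, i.e. it only exists to mark the Service.
func isPlaceholder(s *endpointSlice) bool {
	return len(s.endpoints) == 0 && len(s.ports) == 0
}

// keepExistingPlaceholder pulls one placeholder slice, if any, out of
// the deletion list so it can be kept as-is.
func keepExistingPlaceholder(slicesToDelete []*endpointSlice) (placeholder *endpointSlice, remaining []*endpointSlice) {
	for _, s := range slicesToDelete {
		if placeholder == nil && isPlaceholder(s) {
			placeholder = s
			continue
		}
		remaining = append(remaining, s)
	}
	return placeholder, remaining
}

func main() {
	toDelete := []*endpointSlice{
		{name: "svc-abc"},
		{name: "svc-xyz", endpoints: []string{"10.0.0.1"}},
	}
	keep, del := keepExistingPlaceholder(toDelete)
	fmt.Println(keep.name, len(del)) // svc-abc 1
}
```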