This was an oversight in the initial EndpointSlice release. This update
will ensure that Endpoints and EndpointSlices use the same logic to set
the Hostname attribute.
The Service spec includes a PublishNotReadyAddresses field which has
been used by Endpoints to report all matching resources ready. This may
or may not have been the initial purpose of the field, but given the
desire to provide backwards compatibility with the Endpoints API here,
it seems to make sense to continue to provide the same functionality.
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. Unlike other controllers, we
drop the replica set create error very late (in the queue handleErr)
in order to avoid changing the structure of the controller
substantially.
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large jobs,
we should wait for more evidence that is an issue before changing
that logic substantially.
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large daemon sets,
we should wait for more evidence that is an issue before changing
that logic substantially.
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large replica sets,
we should wait for more evidence that is an issue before changing
that logic substantially.
In some scenarios the service account and token controllers can
race with namespace deletion, causing a burst of errors as they
attempt to recreate secrets being deleted.
Instead, detect these errors and do not retry.
When scaling down a ReplicaSet, delete doubled up replicas first, where a
"doubled up replica" is defined as one that is on the same node as an
active replica belonging to a related ReplicaSet. ReplicaSets are
considered "related" if they have a common controller (typically a
Deployment).
The intention of this change is to make a rolling update of a Deployment
scale down the old ReplicaSet as it scales up the new ReplicaSet by
deleting pods from the old ReplicaSet that are colocated with ready pods of
the new ReplicaSet. This change in the behavior of rolling updates can be
combined with pod affinity rules to preserve the locality of a Deployment's
pods over rollout.
A specific scenario that benefits from this change is when a Deployment's
pods are exposed by a Service that has type "LoadBalancer" and external
traffic policy "Local". In this scenario, the load balancer uses health
checks to determine whether it should forward traffic for the Service to a
particular node. If the node has no local endpoints for the Service, the
health check will fail for that node. Eventually, the load balancer will
stop forwarding traffic to that node. In the meantime, the service proxy
drops traffic for that Service. Thus, in order to reduce risk of dropping
traffic during a rolling update, it is desirable preserve node locality of
endpoints.
* pkg/controller/controller_utils.go (ActivePodsWithRanks): New type to
sort pods using a given ranking.
* pkg/controller/controller_utils_test.go (TestSortingActivePodsWithRanks):
New test for ActivePodsWithRanks.
* pkg/controller/replicaset/replica_set.go
(getReplicaSetsWithSameController): New method. Given a ReplicaSet, return
all ReplicaSets that have the same owner.
(manageReplicas): Call getIndirectlyRelatedPods, and pass its result to
getPodsToDelete.
(getIndirectlyRelatedPods): New method. Given a ReplicaSet, return all
pods that are owned by any ReplicaSet with the same owner.
(getPodsToDelete): Add an argument for related pods. Use related pods and
the new getPodsRankedByRelatedPodsOnSameNode function to take into account
whether a pod is doubled up when sorting pods for deletion.
(getPodsRankedByRelatedPodsOnSameNode): New function. Return an
ActivePodsWithRanks value that wraps the given slice of pods and computes
ranks where each pod's rank is equal to the number of active related pods
that are colocated on the same node.
* pkg/controller/replicaset/replica_set_test.go (newReplicaSet): Set
OwnerReferences on the ReplicaSet.
(newPod): Set a unique UID on the pod.
(byName): New type to sort pods by name.
(TestGetReplicaSetsWithSameController): New test for
getReplicaSetsWithSameController.
(TestRelatedPodsLookup): New test for getIndirectlyRelatedPods.
(TestGetPodsToDelete): Augment the "various pod phases and conditions, diff
= len(pods)" test case to ensure that scale-down still selects doubled-up
pods if there are not enough other pods to scale down. Add a "various pod
phases and conditions, diff = len(pods), relatedPods empty" test case to
verify that getPodsToDelete works even if related pods could not be
determined. Add a "ready and colocated with another ready pod vs not
colocated, diff < len(pods)" test case to verify that a doubled-up pod gets
preferred for deletion. Augment the "various pod phases and conditions,
diff < len(pods)" test case to ensure that not-ready pods are preferred
over ready but doubled-up pods.
* pkg/controller/replicaset/BUILD: Regenerate.
* test/e2e/apps/deployment.go
(testRollingUpdateDeploymentWithLocalTrafficLoadBalancer): New end-to-end
test. Create a deployment with a rolling update strategy and affinity
rules and a load balancer with "Local" external traffic policy, and verify
that set of nodes with local endponts for the service remains unchanged
during rollouts.
(setAffinity): New helper, used by
testRollingUpdateDeploymentWithLocalTrafficLoadBalancer.
* test/e2e/framework/service/jig.go (GetEndpointNodes): Factor building the
set of node names out...
(GetEndpointNodeNames): ...into this new method.
In addition create a similar method that doesn't copy objects.
Benchmark for the new no-copy method vs the old one:
```
benchmark old ns/op new ns/op delta
BenchmarkGetControllerOf-12 214 14.8 -93.08%
benchmark old allocs new allocs delta
BenchmarkGetControllerOf-12 1 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkGetControllerOf-12 80 0 -100.00%
```
Benchamrk for the new (copy) method vs the old one:
```
benchmark old ns/op new ns/op delta
BenchmarkGetControllerOf-12 128 114 -10.94%
benchmark old allocs new allocs delta
BenchmarkGetControllerOf-12 1 1 +0.00%
benchmark old bytes new bytes delta
BenchmarkGetControllerOf-12 80 80 +0.00%
```
Overall there is a 10% improvement for the old vs new (copy) method and
huge improvent (x10) for the old vs new (no-copy).
I changed the IsControlledBy and a few other methods to use the new (no-copy) method.
syncService shouldn't return error if the service doesn't exist which
means it's triggered by service deletion, otherwise the service would be
enqueued repeatedly even its cleanup has been executed successfully.
This patch makes syncService return nil if the error is NotFound when
getting the service, like the other controllers do.