Commit Graph

4619 Commits

Author SHA1 Message Date
Clayton Coleman
c6e34e58c5
job: Ignore namespace termination errors when creating pods or jobs
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large jobs,
we should wait for more evidence that this is an issue before changing
that logic substantially.
2019-10-20 18:39:01 -04:00
Clayton Coleman
8f74c8970b
daemonset: Ignore namespace termination errors when creating pods
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large daemon sets,
we should wait for more evidence that this is an issue before changing
that logic substantially.
2019-10-20 18:39:00 -04:00
Clayton Coleman
2e8ace82eb
replicaset: Ignore namespace termination errors when creating pods
Instead of reporting an event or displaying an error, simply exit
when the namespace is being terminated. This reduces the amount of
controller churn on namespace shutdown. While we could technically
exit the entire processing loop early for very large replica sets,
we should wait for more evidence that this is an issue before changing
that logic substantially.
2019-10-20 18:39:00 -04:00
Clayton Coleman
dc0c21c7d7
serviceaccount: If namespace is terminating, ignore create errors
In some scenarios the service account and token controllers can
race with namespace deletion, causing a burst of errors as they
attempt to recreate secrets being deleted.

Instead, detect these errors and do not retry.
2019-10-20 18:39:00 -04:00
Clayton Coleman
937ef77257
endpoints: If namespace is terminating, drop item immediately
Avoid sending an event to the namespace that is being terminated,
since it will be rejected.
2019-10-20 18:38:59 -04:00
Kubernetes Prow Robot
aab740ffc2
Merge pull request #82703 from draveness/feature/graduate-taint-nodes-by-condition-to-ga
feat: update taint nodes by condition to GA
2019-10-18 20:01:37 -07:00
draveness
1163a1d51e feat: update taint nodes by condition to GA 2019-10-19 09:17:41 +08:00
Kubernetes Prow Robot
de9a7d863d
Merge pull request #83934 from wccsama/wcc-service-dev
Convert error messages to use event recorder
2019-10-18 12:37:46 -07:00
wccsama
18cf49e3df Convert error messages to use event recorder
remove mix protocol validation
remove check nil
2019-10-18 13:30:00 +08:00
Miciah Masters
980b6406b2 Prefer to delete doubled-up pods of a ReplicaSet
When scaling down a ReplicaSet, delete doubled up replicas first, where a
"doubled up replica" is defined as one that is on the same node as an
active replica belonging to a related ReplicaSet.  ReplicaSets are
considered "related" if they have a common controller (typically a
Deployment).

The intention of this change is to make a rolling update of a Deployment
scale down the old ReplicaSet as it scales up the new ReplicaSet by
deleting pods from the old ReplicaSet that are colocated with ready pods of
the new ReplicaSet.  This change in the behavior of rolling updates can be
combined with pod affinity rules to preserve the locality of a Deployment's
pods over rollout.

A specific scenario that benefits from this change is when a Deployment's
pods are exposed by a Service that has type "LoadBalancer" and external
traffic policy "Local".  In this scenario, the load balancer uses health
checks to determine whether it should forward traffic for the Service to a
particular node.  If the node has no local endpoints for the Service, the
health check will fail for that node.  Eventually, the load balancer will
stop forwarding traffic to that node.  In the meantime, the service proxy
drops traffic for that Service.  Thus, in order to reduce risk of dropping
traffic during a rolling update, it is desirable to preserve node locality of
endpoints.

* pkg/controller/controller_utils.go (ActivePodsWithRanks): New type to
sort pods using a given ranking.
* pkg/controller/controller_utils_test.go (TestSortingActivePodsWithRanks):
New test for ActivePodsWithRanks.
* pkg/controller/replicaset/replica_set.go
(getReplicaSetsWithSameController): New method.  Given a ReplicaSet, return
all ReplicaSets that have the same owner.
(manageReplicas): Call getIndirectlyRelatedPods, and pass its result to
getPodsToDelete.
(getIndirectlyRelatedPods): New method.  Given a ReplicaSet, return all
pods that are owned by any ReplicaSet with the same owner.
(getPodsToDelete): Add an argument for related pods.  Use related pods and
the new getPodsRankedByRelatedPodsOnSameNode function to take into account
whether a pod is doubled up when sorting pods for deletion.
(getPodsRankedByRelatedPodsOnSameNode): New function.  Return an
ActivePodsWithRanks value that wraps the given slice of pods and computes
ranks where each pod's rank is equal to the number of active related pods
that are colocated on the same node.
* pkg/controller/replicaset/replica_set_test.go (newReplicaSet): Set
OwnerReferences on the ReplicaSet.
(newPod): Set a unique UID on the pod.
(byName): New type to sort pods by name.
(TestGetReplicaSetsWithSameController): New test for
getReplicaSetsWithSameController.
(TestRelatedPodsLookup): New test for getIndirectlyRelatedPods.
(TestGetPodsToDelete): Augment the "various pod phases and conditions, diff
= len(pods)" test case to ensure that scale-down still selects doubled-up
pods if there are not enough other pods to scale down.  Add a "various pod
phases and conditions, diff = len(pods), relatedPods empty" test case to
verify that getPodsToDelete works even if related pods could not be
determined.  Add a "ready and colocated with another ready pod vs not
colocated, diff < len(pods)" test case to verify that a doubled-up pod gets
preferred for deletion.  Augment the "various pod phases and conditions,
diff < len(pods)" test case to ensure that not-ready pods are preferred
over ready but doubled-up pods.
* pkg/controller/replicaset/BUILD: Regenerate.
* test/e2e/apps/deployment.go
(testRollingUpdateDeploymentWithLocalTrafficLoadBalancer): New end-to-end
test.  Create a deployment with a rolling update strategy and affinity
rules and a load balancer with "Local" external traffic policy, and verify
that the set of nodes with local endpoints for the service remains unchanged
during rollouts.
(setAffinity): New helper, used by
testRollingUpdateDeploymentWithLocalTrafficLoadBalancer.
* test/e2e/framework/service/jig.go (GetEndpointNodes): Factor building the
set of node names out...
(GetEndpointNodeNames): ...into this new method.
2019-10-17 11:52:32 -04:00
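
The ranking described in the commit above can be illustrated with a short sketch. It is a simplification under stated assumptions, not the code from the commit: `Pod` and `podsToDelete` are hypothetical, only two sort criteria are modeled (not-ready pods before ready pods, then higher rank first), and the caller is assumed to supply the related pods. A pod's rank is the number of active related pods on its node.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, minimal pod record for this sketch.
type Pod struct {
	Name   string
	Node   string
	Ready  bool
	Active bool
}

// podsToDelete returns up to diff pods, in preferred deletion order.
// Not-ready pods come first; among pods of equal readiness, pods with more
// active related pods on the same node ("doubled up") come first.
func podsToDelete(pods, related []Pod, diff int) []Pod {
	// Count active related pods per node.
	perNode := make(map[string]int)
	for _, p := range related {
		if p.Active {
			perNode[p.Node]++
		}
	}

	type ranked struct {
		pod  Pod
		rank int
	}
	rs := make([]ranked, len(pods))
	for i, p := range pods {
		rs[i] = ranked{pod: p, rank: perNode[p.Node]}
	}

	// Stable sort keeps the input order for pods that compare equal.
	sort.SliceStable(rs, func(i, j int) bool {
		if rs[i].pod.Ready != rs[j].pod.Ready {
			return !rs[i].pod.Ready
		}
		return rs[i].rank > rs[j].rank
	})

	if diff < 0 {
		diff = 0
	}
	if diff > len(rs) {
		diff = len(rs)
	}
	out := make([]Pod, 0, diff)
	for _, r := range rs[:diff] {
		out = append(out, r.pod)
	}
	return out
}

func main() {
	old := []Pod{
		{Name: "old-a", Node: "node1", Ready: true, Active: true},
		{Name: "old-b", Node: "node2", Ready: true, Active: true},
		{Name: "old-c", Node: "node3", Ready: false, Active: true},
	}
	// One pod of the new ReplicaSet is already running on node1.
	related := []Pod{{Name: "new-a", Node: "node1", Ready: true, Active: true}}

	for _, p := range podsToDelete(old, related, 2) {
		fmt.Println(p.Name)
	}
}
```

With an empty `related` slice every rank is zero, so the order falls back to readiness alone, which mirrors the commit's test case for related pods that could not be determined.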
Miciah Masters
865c3c5670 TestGetPodsToDelete: Use field names in test cases
* pkg/controller/replicaset/replica_set_test.go (TestGetPodsToDelete): Use
explicit field names in declarations of test cases.
2019-10-17 11:50:09 -04:00
Kubernetes Prow Robot
bdc3f96838
Merge pull request #83989 from wojtek-t/remove_coordination_v1beta1
Switch nodelifecyclecontroller to coordination/v1
2019-10-17 01:47:29 -07:00
Kubernetes Prow Robot
6a5f0e6eda
Merge pull request #81348 from yastij/code-org-service-controller
move service helpers to k8s.io/cloud-provider
2019-10-17 00:20:38 -07:00
Kubernetes Prow Robot
064458de46
Merge pull request #83951 from zouyee/pdbtomeb
add tombstones handle for pdb
2019-10-16 09:36:36 -07:00
Kubernetes Prow Robot
78abdf5375
Merge pull request #83902 from gongguan/remove_duplicate_code
remove duplicate code
2019-10-16 09:34:22 -07:00
Yassine TIJANI
d796baea27 move service helpers to k8s.io/cloud-provider
Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
2019-10-16 14:12:11 +02:00
wojtekt
cf9203501e Switch nodelifecyclecontroller to coordination/v1 2019-10-16 10:59:02 +02:00
Kubernetes Prow Robot
112bdfb7c0
Merge pull request #83862 from mrbobbytables/update-controller-network-owners
Prune inactive owners from pkg/controller/* network related OWNERS files
2019-10-15 23:06:49 -07:00
zouyee
65ddf102ef add tombstones handle for pdb
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2019-10-15 22:02:18 +08:00
louisgong
9b7b50c9da remove duplicate function 2019-10-15 09:41:30 +08:00
Krzysztof Siedlecki
b1dfa83be6 using pod pointers in node lifecycle controller 2019-10-14 12:44:43 +02:00
Bob Killen
30053bb00f
Prune inactive owners from pkg/controller/* network related OWNERS files. 2019-10-13 08:50:18 -04:00
zouyee
a864fd2100 fix unsafe JSON construction
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2019-10-10 09:44:54 +08:00
Kubernetes Prow Robot
a5dc1fffb0
Merge pull request #83543 from yutedz/attach-resync-comment
Remove stale comment about resyncPeriod
2019-10-09 13:17:50 -07:00
Kubernetes Prow Robot
4b002b3baa
Merge pull request #82123 from xiaoanyunfei/cleanup/take-effect-stateofworld-hashmap
replace iteration with hashmap in *state_of_world
2019-10-09 02:17:50 -07:00
Kubernetes Prow Robot
ac9390627e
Merge pull request #83536 from yutedz/del-volume-err
Log the error return from store.Delete
2019-10-08 19:59:50 -07:00
Kubernetes Prow Robot
72d052a444
Merge pull request #81797 from yastij/move-metrics-util
move util/metrics to component-base
2019-10-08 17:08:05 -07:00
Kubernetes Prow Robot
b00f009316
Merge pull request #82996 from tnqn/endpointslice-deletion
Fix EndpointSliceController service deletion processing
2019-10-08 15:42:27 -07:00
Kubernetes Prow Robot
b4489d1709
Merge pull request #82865 from tnqn/endpointslice
Fix wrong comments and inaccurate logs in endpointslice_controller
2019-10-08 15:42:16 -07:00
Yassine TIJANI
c1487840bc move util/metrics to component-base
Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
2019-10-08 14:42:31 +02:00
Ted Yu
f0a6aa1e9b Log error from AddIndexers in NewAttachDetachController 2019-10-07 16:15:25 -07:00
Kubernetes Prow Robot
9f875de5d2
Merge pull request #83540 from cofyc/fix83343
Fix volume scheduling error handling
2019-10-07 01:53:09 -07:00
Kubernetes Prow Robot
48b90db9c3
Merge pull request #83495 from tanjunchen/fix-typo
remove the repeat word in documents
2019-10-06 15:05:08 -07:00
tanjunchen
de3cf23414 remove the repeat word in documents 2019-10-06 23:32:01 +08:00
Ted Yu
b81242b62e Remove stale comment about resyncPeriod 2019-10-06 05:02:07 -07:00
Yecheng Fu
b5889ee82c update internal error message 2019-10-06 14:37:31 +08:00
Ted Yu
56717a79ff Log the error return from store.Delete 2019-10-05 19:34:39 -07:00
Kubernetes Prow Robot
e0f651a0be
Merge pull request #83501 from yastij/remove-node-cond-dep
remove Get/Set node condition dependency for the ccm controllers
2019-10-04 16:31:26 -07:00
Kubernetes Prow Robot
74dc287490
Merge pull request #83420 from yutedz/sched-assume-cache
Check the return value from store.Update
2019-10-04 10:22:18 -07:00
Kubernetes Prow Robot
7fab683455
Merge pull request #83343 from yutedz/bind-vol-err
Return proper error message when BindPodVolumes fails
2019-10-04 08:50:30 -07:00
Yassine TIJANI
356e3d0d61 remove Get/Set node condition dependency for the ccm controllers
Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
2019-10-04 16:52:36 +02:00
Ted Yu
94d4bf1287 Return proper error message when BindPodVolumes fails 2019-10-04 04:36:48 -07:00
Ted Yu
c264338741 Check the return value from store.Update 2019-10-02 12:02:13 -07:00
Kubernetes Prow Robot
bd89dc462c
Merge pull request #83320 from krzysied/node_controller_delete_pods
Adding pods to DeletePods and MarkPodsReady methods parameters
2019-10-02 09:03:08 -07:00
Krzysztof Siedlecki
a07a3a6878 adding pods to MarkPodsNotReady parameters 2019-10-02 14:22:58 +02:00
Krzysztof Siedlecki
8f48896709 adding pods to DeletePods parameters 2019-10-02 13:11:23 +02:00
mengyang02
b116585b22 remove redundant quota.V1Equals 2019-10-02 01:02:09 +08:00
Kubernetes Prow Robot
c5440829d5
Merge pull request #83248 from krzysied/node_controller_test
Adding fakeGetPodsAssignedToNode to node lifecycle controller tests
2019-09-30 03:19:38 -07:00
Krzysztof Siedlecki
99eeab35a3 adding fakeGetPodsAssignedToNode 2019-09-30 11:03:36 +02:00
Kubernetes Prow Robot
14e5adfc85
Merge pull request #82683 from davidz627/fix/translationStruct
Refactor CSI Translation Library into a struct that is injected into various components to simplify unit testing
2019-09-29 10:11:37 -07:00