Commit Graph

11188 Commits

Author SHA1 Message Date
Miciah Masters
980b6406b2 Prefer to delete doubled-up pods of a ReplicaSet
When scaling down a ReplicaSet, delete doubled-up replicas first, where a
"doubled-up replica" is defined as one that is on the same node as an
active replica belonging to a related ReplicaSet.  ReplicaSets are
considered "related" if they have a common controller (typically a
Deployment).
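
The ranking described above can be sketched roughly as follows. This is a
simplified illustration, not the code from this commit: the pod type and the
functions rankByRelatedPodsOnSameNode and podsToDelete are hypothetical
stand-ins, the related pods are assumed to be pre-filtered to active ones, and
the ordering considers only readiness and rank.

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical, stripped-down stand-in for a Kubernetes Pod.
type pod struct {
	Name  string
	Node  string
	Ready bool
}

// rankByRelatedPodsOnSameNode returns, for each candidate, the number of
// related pods on the same node. The related slice is assumed to contain
// only active pods of related ReplicaSets.
func rankByRelatedPodsOnSameNode(candidates, related []pod) []int {
	perNode := make(map[string]int)
	for _, p := range related {
		perNode[p.Node]++
	}
	ranks := make([]int, len(candidates))
	for i, p := range candidates {
		ranks[i] = perNode[p.Node]
	}
	return ranks
}

// podsToDelete picks diff pods to delete: not-ready pods first, then
// ready pods with the highest rank (the most doubled-up ones).
func podsToDelete(candidates, related []pod, diff int) []pod {
	ranks := rankByRelatedPodsOnSameNode(candidates, related)
	idx := make([]int, len(candidates))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		pa, pb := candidates[idx[a]], candidates[idx[b]]
		if pa.Ready != pb.Ready {
			return !pa.Ready // not-ready pods sort first
		}
		return ranks[idx[a]] > ranks[idx[b]] // higher rank sorts first
	})
	if diff < 0 {
		diff = 0
	}
	if diff > len(idx) {
		diff = len(idx)
	}
	out := make([]pod, 0, diff)
	for _, i := range idx[:diff] {
		out = append(out, candidates[i])
	}
	return out
}

func main() {
	oldPods := []pod{
		{Name: "old-a", Node: "node-1", Ready: true},
		{Name: "old-b", Node: "node-2", Ready: true},
		{Name: "old-c", Node: "node-3", Ready: false},
	}
	newPods := []pod{{Name: "new-a", Node: "node-2", Ready: true}}
	for _, p := range podsToDelete(oldPods, newPods, 2) {
		fmt.Println(p.Name)
	}
}
```

In this sketch the not-ready pod is chosen first, and the ready pod that
shares a node with a pod of the new ReplicaSet is chosen before the ready pod
that has a node to itself.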

The intention of this change is to make a rolling update of a Deployment
scale down the old ReplicaSet as it scales up the new ReplicaSet by
deleting pods from the old ReplicaSet that are colocated with ready pods of
the new ReplicaSet.  This change in the behavior of rolling updates can be
combined with pod affinity rules to preserve the locality of a Deployment's
pods over rollout.

A specific scenario that benefits from this change is when a Deployment's
pods are exposed by a Service that has type "LoadBalancer" and external
traffic policy "Local".  In this scenario, the load balancer uses health
checks to determine whether it should forward traffic for the Service to a
particular node.  If the node has no local endpoints for the Service, the
health check will fail for that node.  Eventually, the load balancer will
stop forwarding traffic to that node.  In the meantime, the service proxy
drops traffic for that Service.  Thus, in order to reduce the risk of
dropping traffic during a rolling update, it is desirable to preserve the
node locality of endpoints.

* pkg/controller/controller_utils.go (ActivePodsWithRanks): New type to
sort pods using a given ranking.
* pkg/controller/controller_utils_test.go (TestSortingActivePodsWithRanks):
New test for ActivePodsWithRanks.
* pkg/controller/replicaset/replica_set.go
(getReplicaSetsWithSameController): New method.  Given a ReplicaSet, return
all ReplicaSets that have the same owner.
(manageReplicas): Call getIndirectlyRelatedPods, and pass its result to
getPodsToDelete.
(getIndirectlyRelatedPods): New method.  Given a ReplicaSet, return all
pods that are owned by any ReplicaSet with the same owner.
(getPodsToDelete): Add an argument for related pods.  Use related pods and
the new getPodsRankedByRelatedPodsOnSameNode function to take into account
whether a pod is doubled up when sorting pods for deletion.
(getPodsRankedByRelatedPodsOnSameNode): New function.  Return an
ActivePodsWithRanks value that wraps the given slice of pods and computes
ranks where each pod's rank is equal to the number of active related pods
that are colocated on the same node.
* pkg/controller/replicaset/replica_set_test.go (newReplicaSet): Set
OwnerReferences on the ReplicaSet.
(newPod): Set a unique UID on the pod.
(byName): New type to sort pods by name.
(TestGetReplicaSetsWithSameController): New test for
getReplicaSetsWithSameController.
(TestRelatedPodsLookup): New test for getIndirectlyRelatedPods.
(TestGetPodsToDelete): Augment the "various pod phases and conditions, diff
= len(pods)" test case to ensure that scale-down still selects doubled-up
pods if there are not enough other pods to scale down.  Add a "various pod
phases and conditions, diff = len(pods), relatedPods empty" test case to
verify that getPodsToDelete works even if related pods could not be
determined.  Add a "ready and colocated with another ready pod vs not
colocated, diff < len(pods)" test case to verify that a doubled-up pod gets
preferred for deletion.  Augment the "various pod phases and conditions,
diff < len(pods)" test case to ensure that not-ready pods are preferred
over ready but doubled-up pods.
* pkg/controller/replicaset/BUILD: Regenerate.
* test/e2e/apps/deployment.go
(testRollingUpdateDeploymentWithLocalTrafficLoadBalancer): New end-to-end
test.  Create a deployment with a rolling update strategy and affinity
rules and a load balancer with "Local" external traffic policy, and verify
that the set of nodes with local endpoints for the service remains unchanged
during rollouts.
(setAffinity): New helper, used by
testRollingUpdateDeploymentWithLocalTrafficLoadBalancer.
* test/e2e/framework/service/jig.go (GetEndpointNodes): Factor building the
set of node names out...
(GetEndpointNodeNames): ...into this new method.
2019-10-17 11:52:32 -04:00
Kubernetes Prow Robot
cedacc9cae Merge pull request #84025 from oomichi/move-CreateNginxPod
Move CreateNginxPod() to specific e2e
2019-10-17 01:47:48 -07:00
Kubernetes Prow Robot
c3d8ad06a5 Merge pull request #84002 from cofyc/fix74552-cleanup
e2e: remove duplicated test specs
2019-10-16 22:25:41 -07:00
Kubernetes Prow Robot
9aed79b585 Merge pull request #83812 from oomichi/move-Initialized
Move Initialized() to e2e framework util
2019-10-16 22:25:21 -07:00
Kubernetes Prow Robot
cae9bbd059 Merge pull request #81358 from bclau/tests/replace-redis-image
tests: Replaces Redis image with Agnhost
2019-10-16 22:24:51 -07:00
Howard Zhang
1c9da19bf5 Add kubectlPath flag to e2e_node.test
e2e_node.test does not set a default kubectlPath, which leads to test
errors such as the following:
[Fail] [sig-storage] EmptyDir volumes [It] pod should support
shared volumes between containers [Conformance]

When the test tries to read a file in the shared volume, it runs
"kubectl exec namespace -c container_name -- cat file_name".
However, because the variable framework.TestContext.KubectlPath is not set,
the kubectl binary cannot be found and the test fails.

This patch moves the kubectlPath flag from RegisterClusterFlags to
RegisterCommonFlags, so that a default value for
framework.TestContext.KubectlPath is set, and the
user can still use the --kubectl-path flag to set the kubectl path.
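
The effect of moving the flag registration can be illustrated with the
standard library flag package. This is a minimal sketch, not the framework
code: testContext, registerCommonFlags, registerClusterFlags and
parseNodeFlags are hypothetical names, and the default value "kubectl" and
the kubeconfig flag are assumptions made for the example.

```go
package main

import (
	"flag"
	"fmt"
)

// testContext is a hypothetical stand-in for framework.TestContext.
type testContext struct {
	KubectlPath string
	KubeConfig  string
}

// registerCommonFlags registers flags shared by every test binary.
// Registering kubectl-path here means its default is always applied.
func registerCommonFlags(fs *flag.FlagSet, ctx *testContext) {
	fs.StringVar(&ctx.KubectlPath, "kubectl-path", "kubectl",
		"Path to the kubectl binary (default value is an assumption).")
}

// registerClusterFlags registers flags that only cluster e2e tests use.
// A flag registered only here never gets a default in a binary that
// skips this function.
func registerClusterFlags(fs *flag.FlagSet, ctx *testContext) {
	fs.StringVar(&ctx.KubeConfig, "kubeconfig", "",
		"Path to a kubeconfig file (illustrative only).")
}

// parseNodeFlags mimics a node test binary that registers only the
// common flags before parsing its arguments.
func parseNodeFlags(args []string) (testContext, error) {
	var ctx testContext
	fs := flag.NewFlagSet("e2e_node.test", flag.ContinueOnError)
	registerCommonFlags(fs, &ctx)
	err := fs.Parse(args)
	return ctx, err
}

func main() {
	ctx, err := parseNodeFlags(nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("default:", ctx.KubectlPath)

	ctx, err = parseNodeFlags([]string{"--kubectl-path=/usr/local/bin/kubectl"})
	if err != nil {
		panic(err)
	}
	fmt.Println("override:", ctx.KubectlPath)
}
```

Because the default is applied at registration time, a binary that never
calls registerClusterFlags still sees a non-empty KubectlPath once the flag
lives in the common set.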

Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2019-10-17 11:29:44 +08:00
Kenichi Omichi
9e17a0e9f3 Move CreateNginxPod() to specific e2e
CreateNginxPod() is called only from flexvolume_online_resize, and
it appears to be a storage-specific function because it requires a PVC.
So this moves the function to the place that calls it, as code
cleanup.
2019-10-17 00:10:38 +00:00
Kubernetes Prow Robot
d1188a6802 Merge pull request #83946 from jsafrane/disable-local-reconstruction
Disable local block volume reconstruction test
2019-10-16 09:36:08 -07:00
Yecheng Fu
1ff13e0782 e2e: remove duplicated test suites 2019-10-16 19:42:11 +08:00
Kubernetes Prow Robot
63cf2e260b Merge pull request #83819 from mrbobbytables/emeritus-jbeda
Move jbeda to emeritus status.
2019-10-15 23:06:20 -07:00
Kubernetes Prow Robot
6b3d154787 Merge pull request #83816 from oomichi/remove-test_verify.go
Remove test_verify from e2e framework package
2019-10-15 23:06:09 -07:00
Boqin Qin
24dcd8eaac framework: Fix a goroutine leak bug in resource_usage_gatherer.go 2019-10-15 16:09:31 -04:00
Kubernetes Prow Robot
cb3b715de2 Merge pull request #83804 from jpbetz/etcd-3_3_17_server
Upgrade to etcd server 3.3.17
2019-10-15 12:50:09 -07:00
Jan Safranek
7c240a18b6 Disable local block volume reconstruction test
Quite hacky, hoping to fix the volume plugin soon.
2019-10-15 13:59:36 +02:00
Kubernetes Prow Robot
46a29a0cc3 Merge pull request #71674 from grayluck/firewall-event-msg
Change XPN firewall change msg. Should be required by security admin
2019-10-14 21:09:51 -07:00
Kubernetes Prow Robot
63bd1d7a5c Merge pull request #80725 from aramase/dualstack-phase2-e2e
E2E tests for dualstack phase2
2019-10-14 17:45:51 -07:00
Joe Betz
c92bd5e7b5 Upgrade to etcd server 3.3.17 2019-10-13 17:17:15 -07:00
Kubernetes Prow Robot
2e55cf01d1 Merge pull request #83854 from mrbobbytables/update-test-vsphere-owners
Prune inactive owners from test/e2e/framework/providers/vsphere/OWNERS.
2019-10-13 13:20:36 -07:00
Bob Killen
e37d702208 Prune inactive owners from autoscaling related OWNERS files. 2019-10-13 08:52:14 -04:00
Bob Killen
340eefe76b Prune inactive owners from test/e2e/framework/providers/vsphere/OWNERS. 2019-10-13 08:39:38 -04:00
Kubernetes Prow Robot
743031d793 Merge pull request #83817 from oomichi/rename-framework-funcs
Rename e2e framework functions used locally
2019-10-12 17:34:37 -07:00
Lubomir I. Ivanov
0b3d50b6dc test/e2e: move GKE/GCE tests from /lifecycle to /cloud/gcp
Move GKE/GCE tests from the sig-cluster-lifecycle
ownership to the sig-cloud-provider-gcp ownership
(ideally the GCP sub-project).
2019-10-12 21:40:07 +03:00
Kubernetes Prow Robot
fbcfabe8ae Merge pull request #83808 from oomichi/rename-volume-fixtures
Rename Generate[Read|Write]FileCmd()s on e2e framework
2019-10-11 22:40:38 -07:00
tanjunchen
33dda68788 fix staticcheck in test/e2e/common directory 2019-10-12 11:30:47 +08:00
Kubernetes Prow Robot
8553d50426 Merge pull request #83793 from oomichi/psp
Fix package name of psp on e2e framework
2019-10-11 18:05:04 -07:00
Kenichi Omichi
0126d35df1 Rename e2e framework functions used locally
The following functions are used locally in e2e framework subpackages.
 - RunSSHCommandViaBastion
 - MakeNginxPod
 - LogPodTerminationMessages
 - CheckPodsCondition
 - SetNodeAffinityRequirement

This renames them to make clear that they are local.
2019-10-12 00:06:49 +00:00
Kenichi Omichi
ab208e9063 Remove test_verify from e2e framework package
test_verify.go contained only the function TestPodSuccessOrFail(),
and the function is used only within the package.
This moves the function to create.go and removes test_verify.go.
2019-10-12 00:01:20 +00:00
Kenichi Omichi
06d41a485c Move Initialized() to e2e framework util
The function is used only in the e2e framework util module.
So this moves the function to that module, as a step toward removing
dependencies on subpackages from the core e2e framework.
2019-10-11 22:29:03 +00:00
Kenichi Omichi
e13fb0cbe5 Rename Generate[Read|Write]FileCmd()s
These functions are used only in the fixtures.go module,
so there is no need to export them.
This renames them to make clear that they are local functions.
2019-10-11 21:54:35 +00:00
Bob Killen
e65d8bb11f Move jbeda to emeritus status. 2019-10-11 17:46:18 -04:00
Jean Rouge
d17624ad82 Amending the GMSA e2e test to allow it to run against Windows-only clusters
e2e Windows tests can be run against Windows-only clusters, which
currently will cause the GMSA test to fail, as it needs to be able to
deploy pods to at least one Linux node, for the GMSA webhook; this patch leverages the new
`--tolerate-master` flag that was added to the GMSA webhook deploy
script in https://github.com/kubernetes-sigs/windows-gmsa/pull/18.

Signed-off-by: Jean Rouge <rougej+github@gmail.com>
2019-10-11 14:16:04 -07:00
Kubernetes Prow Robot
a847874655 Merge pull request #83792 from liggitt/flake-pre-stop
Mark 'wait until preStop hook completes the process' flaky
2019-10-11 13:42:38 -07:00
Kubernetes Prow Robot
2125c26a40 Merge pull request #83647 from BenTheElder/tainting-nodes-is-disruptive
tag test that taints a node as disruptive
2019-10-11 13:42:18 -07:00
Kenichi Omichi
c0430d3f8e Fix package name of psp on e2e framework
psp is imported as a separate package from the main framework, but its
package name was framework. This caused confusion, so this renames it to psp.
2019-10-11 18:21:28 +00:00
Jordan Liggitt
73dce3adec Mark 'wait until preStop hook completes the process' flaky 2019-10-11 13:09:32 -04:00
mattjmcnaughton
b92a51285b Address staticcheck failures for test/e2e/lifecycle/bootstrap
Make small, non-functional changes so that
`test/e2e/lifecycle/bootstrap` passes staticcheck.
2019-10-11 10:28:15 -04:00
Kubernetes Prow Robot
f985367ba4 Merge pull request #83755 from roycaihw/e2e-kubelet-resource-monitor
kubelet e2e: run resource monitor only if the actual number of nodes is small
2019-10-10 22:49:49 -07:00
Kubernetes Prow Robot
d69dfa7e13 Merge pull request #83729 from danwinship/drop-getreadyschedulablenodesordie
Drop framework.GetReadySchedulableNodesOrDie
2019-10-10 19:00:43 -07:00
Kubernetes Prow Robot
242d806672 Merge pull request #83587 from timothysc/testing-OWNERS
Audit of test/* OWNERS files
2019-10-10 19:00:00 -07:00
Haowei Cai
f5d6951c96 kubelet e2e: run resource monitor only if the actual number of nodes is small
2019-10-10 17:02:51 -07:00
Kubernetes Prow Robot
77f86630d4 Merge pull request #82491 from openSUSE/pod-status-check
Validate container status in e2e pod status checks
2019-10-10 16:27:20 -07:00
Anish Ramasekar
50e2182faf e2e test for dualstack phase2
dual-stack phase2 tests

update e2elog to framework

run update-bazel

update comment

fix go vet error

Review feedback

update method

Review feedback
2019-10-10 16:24:39 -07:00
Timothy St. Clair
97055841b1 Audit of test/* OWNERS files 2019-10-10 15:52:51 -05:00
Kubernetes Prow Robot
1bb7835f0a Merge pull request #83609 from avalluri/fix-storage-e2e-tests
Remove e2e/common package usage in volumemode testsuite
2019-10-10 13:41:52 -07:00
Kubernetes Prow Robot
46dd075bab Merge pull request #83718 from serathius/aliases
Introduce sig-instrumentation aliases in OWNERS_ALIASES and simplify OWNERS files
2019-10-10 09:04:05 -07:00
Kubernetes Prow Robot
4eb1ca46ed Merge pull request #83667 from k-toyoda-pi/use_log_e2e_storage_topology
Use log functions of core framework on testsuites/topology.go
2019-10-10 07:31:57 -07:00
Sascha Grunert
5a8b695fef Validate AgnhostPod readiness status in e2e tests
We now additionally check if the agnhost pods are ready before
marking the pod as running to increase the overall test stability.
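
A check of this kind might look like the sketch below. It is an illustration
under assumptions, not the code from this commit: podStatus, containerStatus
and podRunningAndReady are hypothetical, simplified types and names rather
than the Kubernetes API types.

```go
package main

import "fmt"

// containerStatus and podStatus are hypothetical, simplified stand-ins
// for the corresponding Kubernetes status types.
type containerStatus struct {
	Name  string
	Ready bool
}

type podStatus struct {
	Phase      string
	Containers []containerStatus
}

// podRunningAndReady reports whether the pod is in the Running phase and
// every listed container reports ready. A pod with no listed containers
// passes the container check in this sketch.
func podRunningAndReady(s podStatus) bool {
	if s.Phase != "Running" {
		return false
	}
	for _, c := range s.Containers {
		if !c.Ready {
			return false
		}
	}
	return true
}

func main() {
	starting := podStatus{
		Phase:      "Running",
		Containers: []containerStatus{{Name: "agnhost", Ready: false}},
	}
	ready := podStatus{
		Phase:      "Running",
		Containers: []containerStatus{{Name: "agnhost", Ready: true}},
	}
	fmt.Println(podRunningAndReady(starting), podRunningAndReady(ready))
}
```

The point of the extra condition is that a pod can be in the Running phase
while its container is not yet ready to serve, which a phase-only check
would miss.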

Relates to: https://github.com/kubernetes/kubernetes/pull/82420
Fixes: https://github.com/kubernetes/kubernetes/issues/82445

Signed-off-by: Sascha Grunert <sgrunert@suse.com>
2019-10-10 14:11:06 +02:00
Marek Siarkowicz
c601d34eba Introduce sig-instrumentation aliases in OWNERS_ALIASES and simplify OWNERS files 2019-10-10 14:04:20 +02:00
Kevin Taylor
cb8a7c1a4c Promote VolumeSubpathEnvExpansion feature gate to GA 2019-10-10 09:34:40 +01:00
Amarnath Valluri
3333806734 Remove e2e/common package usage in volumemode testsuite
Change 04300826fd introduced an
"e2e/common" package dependency in the volumemode testsuite. This results in
pulling in all tests defined in the common package when running storage e2e tests,
which is unnecessary.

The only part of the common package that is needed is WaitTimeoutForEvent().
2019-10-10 09:30:12 +03:00