2345 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
bc82d87f0a Merge pull request #43398 from enisoc/deletion-race-flake
Automatic merge from submit-queue

Deflake TestSyncDeploymentDeletionRace

**What this PR does / why we need it**:

The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```

I checked the other controllers, and none of them start informers for their deletion race tests. I also confirmed that the deletion race tests for the other controllers all pass with `-count 100000`.

**Which issue this PR fixes**:

Fixes #43390

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-03-20 13:26:03 -07:00
Anthony Yeh
0b9233648e Deflake TestSyncDeploymentDeletionRace
The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```
2017-03-20 11:13:26 -07:00
Anthony Yeh
f4ee44eb39 RC/RS: Check that ControllerRef UID matches found controller.
Otherwise, we may confuse a former controller by that name with a new
one that has the same name.
2017-03-20 08:57:42 -07:00
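The UID check described in f4ee44eb39 can be illustrated with a minimal sketch. The types and the `resolveControllerRef` helper below are hypothetical stand-ins, not the actual Kubernetes code; the sketch only assumes that a ControllerRef records both the owner's name and its UID, and that the lookup is done by name.

```go
package main

import "fmt"

// Controller is a hypothetical stand-in for a ReplicaSet or ReplicationController.
type Controller struct {
	Name string
	UID  string
}

// ControllerRef is a simplified stand-in for the owner reference stored on a Pod.
type ControllerRef struct {
	Name string
	UID  string
}

// resolveControllerRef looks the controller up by name, then rejects the
// result if its UID differs from the one recorded in the ref.
func resolveControllerRef(byName map[string]*Controller, ref ControllerRef) *Controller {
	c, ok := byName[ref.Name]
	if !ok {
		return nil
	}
	if c.UID != ref.UID {
		// Same name, different object: the ref points at a former controller.
		return nil
	}
	return c
}

func main() {
	// "frontend" was deleted and recreated, so it now has a new UID.
	cache := map[string]*Controller{"frontend": {Name: "frontend", UID: "uid-new"}}

	stale := ControllerRef{Name: "frontend", UID: "uid-old"}
	current := ControllerRef{Name: "frontend", UID: "uid-new"}

	fmt.Println("stale ref resolves:", resolveControllerRef(cache, stale) != nil)
	fmt.Println("current ref resolves:", resolveControllerRef(cache, current) != nil)
}
```

Without the UID comparison, the stale ref would resolve to the recreated controller purely because the names collide.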
Kubernetes Submit Queue
47320fd3f0 Merge pull request #42938 from enisoc/orphan-race
Automatic merge from submit-queue

GC: Fix re-adoption race when orphaning dependents.

**What this PR does / why we need it**:

The GC expects that once it sees a controller with a non-nil
DeletionTimestamp, that controller will not attempt any adoption.
There was a known race condition that could cause a controller to
re-adopt something orphaned by the GC, because the controller is using a
cached value of its own spec from before DeletionTimestamp was set.

This fixes that race by doing an uncached quorum read of the controller
spec just before the first adoption attempt. It's important that this
read occurs after listing potential orphans. Note that this uncached
read is skipped if no adoptions are attempted (i.e. at steady state).
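
As a rough illustration of the approach described above, the following Go sketch defers the uncached read until the first adoption attempt and performs it at most once per sync. The `adoptionGuard` type, the `freshGet` callback, and the `Deleting` field are all hypothetical simplifications and are not taken from the actual change.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Controller is a hypothetical, simplified controller object.
type Controller struct {
	Name     string
	Deleting bool // stands in for a non-nil DeletionTimestamp
}

// adoptionGuard performs the fresh read lazily and at most once per sync.
type adoptionGuard struct {
	once     sync.Once
	err      error
	freshGet func() (*Controller, error) // hypothetical uncached read
}

// canAdopt triggers the uncached read on the first adoption attempt only.
// If no adoption is ever attempted, freshGet is never called.
func (g *adoptionGuard) canAdopt() error {
	g.once.Do(func() {
		fresh, err := g.freshGet()
		if err != nil {
			g.err = err
			return
		}
		if fresh.Deleting {
			g.err = errors.New("controller is being deleted; refusing to adopt")
		}
	})
	return g.err
}

func main() {
	reads := 0
	g := &adoptionGuard{freshGet: func() (*Controller, error) {
		reads++
		// The live object is already being deleted, even if a cache says otherwise.
		return &Controller{Name: "web", Deleting: true}, nil
	}}

	// Potential orphans are listed first; the fresh read happens afterwards.
	orphans := []string{"pod-a", "pod-b"}
	for _, name := range orphans {
		if err := g.canAdopt(); err != nil {
			fmt.Printf("skip %s: %v\n", name, err)
		}
	}
	fmt.Println("uncached reads:", reads)
}
```

Using `sync.Once` here is just one way to express "read once, and only when needed"; at steady state the loop body never runs, so the extra read costs nothing.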

**Which issue this PR fixes**:

Fixes #42639

**Special notes for your reviewer**:

**Release note**:
```release-note
```

cc @kubernetes/sig-apps-pr-reviews
2017-03-20 01:30:11 -07:00
Anthony Yeh
b4b8fdbca3 GC: Fix re-adoption race when orphaning dependents.
The GC expects that once it sees a controller with a non-nil
DeletionTimestamp, that controller will not attempt any adoption.
There was a known race condition that could cause a controller to
re-adopt something orphaned by the GC, because the controller is using a
cached value of its own spec from before DeletionTimestamp was set.

This fixes that race by doing an uncached quorum read of the controller
spec just before the first adoption attempt. It's important that this
read occurs after listing potential orphans. Note that this uncached
read is skipped if no adoptions are attempted (i.e. at steady state).
2017-03-17 15:39:26 -07:00
Kubernetes Submit Queue
f37cffcf4e Merge pull request #43239 from enisoc/kubectl-controller-ref
Automatic merge from submit-queue

kubectl: Use v1.5-compatible ownership logic when listing dependents.

**What this PR does / why we need it**:

This restores compatibility between kubectl 1.6 and clusters running Kubernetes 1.5.x. It introduces transitional ownership logic in which the client considers ControllerRef when it exists, but does not require it to exist.

If we were to ignore ControllerRef altogether (pre-1.6 client behavior), we would introduce a new failure mode in v1.6 because controllers that used to get stuck due to selector overlap will now make progress. For example, that means when reaping ReplicaSets of an overlapping Deployment, we would risk deleting ReplicaSets belonging to a different Deployment that we aren't about to delete.

This transitional logic avoids such surprises in 1.6 clusters, and does no worse than kubectl 1.5 did in 1.5 clusters. To prevent the same failure mode when kubectl 1.5 is used against 1.6 clusters, we can cherry-pick this change.
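
A minimal sketch of what such transitional ownership logic might look like follows. The types and the `listDependents` helper are hypothetical, and the choice to match by selector first and then exclude ReplicaSets whose ControllerRef points elsewhere is an assumption made for illustration, not a description of the actual kubectl code.

```go
package main

import "fmt"

// Deployment and ReplicaSet are hypothetical, simplified API objects.
type Deployment struct {
	UID      string
	Selector map[string]string
}

type ReplicaSet struct {
	Name          string
	Labels        map[string]string
	ControllerUID string // empty means no ControllerRef is set (e.g. a 1.5 cluster)
}

// matches reports whether every selector key/value pair is present in labels.
// Assumption: an empty selector selects nothing in this sketch.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return len(selector) > 0
}

// listDependents considers ControllerRef when it exists, but does not
// require it to exist.
func listDependents(d Deployment, all []ReplicaSet) []string {
	var names []string
	for _, rs := range all {
		if !matches(d.Selector, rs.Labels) {
			continue
		}
		// A ControllerRef pointing at a different owner excludes the ReplicaSet.
		if rs.ControllerUID != "" && rs.ControllerUID != d.UID {
			continue
		}
		names = append(names, rs.Name)
	}
	return names
}

func main() {
	d := Deployment{UID: "d-1", Selector: map[string]string{"app": "web"}}
	all := []ReplicaSet{
		{Name: "legacy", Labels: map[string]string{"app": "web"}},                      // no ControllerRef
		{Name: "mine", Labels: map[string]string{"app": "web"}, ControllerUID: "d-1"},  // owned by d
		{Name: "other", Labels: map[string]string{"app": "web"}, ControllerUID: "d-2"}, // overlapping Deployment
	}
	fmt.Println(listDependents(d, all))
}
```

In this sketch the `other` ReplicaSet is skipped even though its labels overlap, which is the surprise the transitional logic is meant to avoid.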

**Which issue this PR fixes**:

Fixes #43159

**Special notes for your reviewer**:

**Release note**:
```release-note
```
2017-03-17 14:25:38 -07:00
Kubernetes Submit Queue
fe08925805 Merge pull request #42869 from better88/patch-1
Automatic merge from submit-queue

Fix revision when SetDeploymentRevision

When some oldRSs are deleted or cleared (e.g. revisionHistoryLimit is set to 0), the revision for SetDeploymentRevision is incorrect.
2017-03-17 10:55:29 -07:00
better88
6c13a02026 Fix revision when SetDeploymentRevision 2017-03-17 23:23:41 +08:00
Anthony Yeh
de92f90f12 Deployment: Clear obsolete OverlapAnnotation.
This ensures old clients will not assume the Deployment is blocked.
2017-03-16 14:52:01 -07:00
Chao Xu
33da82bc67 construction of GC should not fail for restmapper error caused by tpr 2017-03-16 14:19:17 -07:00
Anthony Yeh
fa23729a6d kubectl: Use v1.5-compatible ownership logic when listing dependents.
In particular, we should not assume ControllerRefs are necessarily set.
However, we can still use ControllerRefs that do exist to avoid
interfering with controllers that do use them.
2017-03-16 12:28:38 -07:00
Anthony Yeh
725ec0cc5e kubectl: Check for Deployment overlap annotation in reaper.
This effectively reverts the client-side changes in
cec3899b96.
We have to maintain the old behavior on the client side to support
version skew when talking to old servers that set the annotation.

However, the new server-side behavior is still to NOT set the
annotation.
2017-03-16 12:28:28 -07:00
Kubernetes Submit Queue
0afdcfcaf6 Merge pull request #43114 from enisoc/deployment-upgrade-test
Automatic merge from submit-queue

Fix Deployment upgrade test.

**What this PR does / why we need it**:

When the upgrade test operates on Deployments in a pre-1.6 cluster (i.e. during the Setup phase), it needs to use the v1.5 deployment/util logic. In particular, the v1.5 logic does not filter children to only those with a matching ControllerRef.

**Which issue this PR fixes**:

Fixes #42738

**Special notes for your reviewer**:

**Release note**:
```release-note
```
cc @kubernetes/sig-apps-pr-reviews
2017-03-15 07:03:19 -07:00
Anthony Yeh
0c015927c4 Fix Deployment upgrade test.
When the upgrade test operates on Deployments in a pre-1.6 cluster
(i.e. during the Setup phase), it needs to use the v1.5 deployment/util
logic. In particular, the v1.5 logic does not filter children to only
those with a matching ControllerRef.
2017-03-14 17:39:29 -07:00
Chao Xu
0605ba7a6d wait for garbagecollector to be synced in test 2017-03-14 16:19:33 -07:00
Klaus Ma
d0e04427d7 Fixed incorrect result of getMinTolerationTime. 2017-03-12 20:21:14 +08:00
Kubernetes Submit Queue
3f660a9779 Merge pull request #42913 from aveshagarwal/master-fix-taint-based-eviction-no-node-cidr
Automatic merge from submit-queue

Fix taint based pod eviction for clusters where controller manager is not running with allocate-node-cidrs set

Fixes https://github.com/kubernetes/kubernetes/issues/42733

In my cluster, I have not set `--allocate-node-cidrs`, and it is causing taint-based pod eviction to fail.

@gmarek @kubernetes/sig-scheduling-bugs @davidopp @derekwaynecarr
2017-03-11 14:02:45 -08:00
Kubernetes Submit Queue
486ec2b7c9 Merge pull request #42862 from caesarxuchao/sync-warning
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)

Let GC print specific message for RESTMapping failure

Make the error messages reported in https://github.com/kubernetes/kubernetes/issues/39816 more specific, and print the message only once.

I'll also update the garbage collector's doc to clearly state we don't support tpr yet.

We'll wait for the watchable discovery feature (@sttts are you going to work on that?) to land in 1.7, and then enable the garbage collector to handle TPR.

cc @hongchaodeng @MikaelCluseau @djMax
2017-03-10 14:01:23 -08:00
Avesh Agarwal
c3a80719a2 Fix taint based pod eviction for clusters where controller manager
is not running with --allocate-node-cidrs set.
2017-03-10 15:39:21 -05:00
Chao Xu
d7aef0a338 Let GC print specific message for RESTMapping failure 2017-03-10 11:38:57 -08:00
gmarek
fddac63c27 Remove unused functions and make logs slightly better 2017-03-10 11:57:51 +01:00
Kubernetes Submit Queue
1f5708d460 Merge pull request #42640 from lukaszo/ds-updates-fix
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)

kubectl: respect DaemonSet strategy parameters for rollout status

It handles "after-merge" comments from #41116

cc @kargakis @janetkuo 

I will add one more e2e test later. I need to handle some in-company matters first.
2017-03-09 16:41:54 -08:00
Kubernetes Submit Queue
7002c53a9c Merge pull request #42808 from ravisantoshgudimetla/nodecontroller_eviction_flake
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)

Node controller test flake 39975 with delay for try function

**Which issue this PR fixes**: fixes #39975

/cc @ncdc @gmarek @liggitt
2017-03-09 16:41:52 -08:00
ravisantoshgudimetla
7d444263a5 Change from Micro to Milli for introducing delay 2017-03-09 14:10:28 -05:00
Łukasz Oleś
b32afe1720 kubectl: respect DaemonSet strategy parameters for rollout status
It handles "after-merge" comments from #41116
2017-03-09 20:02:52 +01:00
gmarek
48d784272e Move taint eviction feature flag to feature-gates 2017-03-08 10:04:18 +01:00
Kubernetes Submit Queue
d306acca86 Merge pull request #42175 from enisoc/controller-ref-dep
Automatic merge from submit-queue

Deployment: Fully Respect ControllerRef

**What this PR does / why we need it**:

This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings Deployment into full compliance with ControllerRef. See the individual commit messages for details.

**Which issue this PR fixes**:

This ensures that Deployment does not fight with other controllers over control of Pods and ReplicaSets.

Ref: https://github.com/kubernetes/kubernetes/issues/24433

**Special notes for your reviewer**:

**Release note**:

```release-note
Deployment now fully respects ControllerRef to avoid fighting over Pods and ReplicaSets. At the time of upgrade, **you must not have Deployments with selectors that overlap**, or else [ownership of ReplicaSets may change](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md#upgrading).
```
cc @erictune @kubernetes/sig-apps-pr-reviews
2017-03-07 20:44:36 -08:00
Anthony Yeh
fac372d090 DaemonSet: Relist Pods before each phase of sync.
The design of DaemonSet requires a relist before each phase (manage,
update, status) because it does not short-circuit and requeue for each
action triggered.
2017-03-07 16:42:29 -08:00
Anthony Yeh
182753f841 DaemonSet: Check that ControllerRef UID matches. 2017-03-07 16:42:29 -08:00
Anthony Yeh
97c363a3e0 DaemonSet: Always set BlockOwnerDeletion in ControllerRef. 2017-03-07 16:42:29 -08:00
Anthony Yeh
ab5a82d6e6 DaemonSet: Don't log Pod events unless some DaemonSet cares. 2017-03-07 16:42:29 -08:00
Anthony Yeh
1099811833 DaemonSet: Use ControllerRef to route watch events.
This is part of the completion of ControllerRef, as described here:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md#watches
2017-03-07 16:42:28 -08:00
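The watch-event routing referenced in 1099811833 can be sketched roughly as follows, based on the general idea in the linked proposal: a Pod with a ControllerRef is routed only to that owner, while an orphan is offered to every controller whose selector matches. All types and the `setsToSync` helper are hypothetical and simplified for illustration.

```go
package main

import "fmt"

// DaemonSet, OwnerRef and Pod are hypothetical, simplified API objects.
type DaemonSet struct {
	Name     string
	UID      string
	Selector map[string]string
}

type OwnerRef struct {
	Name string
	UID  string
}

type Pod struct {
	Labels map[string]string
	Owner  *OwnerRef // nil means the Pod is an orphan
}

// matches reports whether every selector key/value pair is present in labels.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return len(selector) > 0
}

// setsToSync decides which DaemonSets should be queued for a Pod event.
func setsToSync(pod Pod, sets []DaemonSet) []string {
	if pod.Owner != nil {
		// Owned Pod: only the referenced controller cares, and only if the UID matches.
		for _, ds := range sets {
			if ds.Name == pod.Owner.Name && ds.UID == pod.Owner.UID {
				return []string{ds.Name}
			}
		}
		return nil
	}
	// Orphan: notify every matching controller so one of them may adopt it.
	var names []string
	for _, ds := range sets {
		if matches(ds.Selector, pod.Labels) {
			names = append(names, ds.Name)
		}
	}
	return names
}

func main() {
	sets := []DaemonSet{
		{Name: "logs-a", UID: "1", Selector: map[string]string{"app": "logs"}},
		{Name: "logs-b", UID: "2", Selector: map[string]string{"app": "logs"}},
	}
	owned := Pod{Labels: map[string]string{"app": "logs"}, Owner: &OwnerRef{Name: "logs-b", UID: "2"}}
	orphan := Pod{Labels: map[string]string{"app": "logs"}}

	fmt.Println("owned pod  ->", setsToSync(owned, sets))
	fmt.Println("orphan pod ->", setsToSync(orphan, sets))
}
```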
Anthony Yeh
421e0bbd83 DaemonSet: Use ControllerRefManager to adopt/orphan. 2017-03-07 16:42:28 -08:00
Anthony Yeh
8f3a56f582 DaemonSet: Add ControllerRef on all created Pods. 2017-03-07 16:42:28 -08:00
Kubernetes Submit Queue
c9d4e60131 Merge pull request #42634 from gmarek/nc_test_sleep
Automatic merge from submit-queue

Extend the sleep time in the NC unit test

Ref. https://github.com/kubernetes/kubernetes/issues/39975#issuecomment-284600278
2017-03-07 09:11:35 -08:00
Kubernetes Submit Queue
74c60fbd71 Merge pull request #42633 from gmarek/nc_logs
Automatic merge from submit-queue (batch tested with PRs 41890, 42593, 42633, 42626, 42609)

Improve NodeControllers logs
2017-03-07 08:10:43 -08:00
Kubernetes Submit Queue
ed04316828 Merge pull request #41890 from soltysh/issue37166
Automatic merge from submit-queue (batch tested with PRs 41890, 42593, 42633, 42626, 42609)

Remove everything that is not new from batch/v2alpha1

Fixes #37166.

@lavalamp you've asked for it 
@erictune this is a prereq for moving CronJobs to beta. I initially planned to put all in one PR, but after I did that I figured out it'll be easier to review separately. ptal 

@kubernetes/api-approvers @kubernetes/sig-api-machinery-pr-reviews ptal
2017-03-07 08:10:38 -08:00
gmarek
65f556788e Extend the sleep time in the NC unit test 2017-03-07 10:48:37 +01:00
gmarek
0db355a8ca Improve NodeControllers logs 2017-03-07 10:29:57 +01:00
Kubernetes Submit Queue
4f57c107df Merge pull request #42596 from enisoc/e2e-rc
Automatic merge from submit-queue (batch tested with PRs 42506, 42585, 42596, 42584)

RC/RS: Fix ignoring inactive Pods.

**What this PR does / why we need it**:

Fix typo that broke ignoring of inactive Pods in RC, and add unit test for that case.

**Which issue this PR fixes**:

Fixes #37479

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-03-06 22:20:13 -08:00
Kubernetes Submit Queue
d50a59ec66 Merge pull request #42080 from enisoc/controller-ref-ss
Automatic merge from submit-queue (batch tested with PRs 42080, 41653, 42598, 42555)

StatefulSet: Respect ControllerRef

**What this PR does / why we need it**:

This is part of the completion of the [ControllerRef](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md) proposal. It brings StatefulSet into full compliance with ControllerRef. See the individual commit messages for details.

**Which issue this PR fixes**:

Fixes #36859

**Special notes for your reviewer**:

**Release note**:

```release-note
StatefulSet now respects ControllerRef to avoid fighting over Pods. At the time of upgrade, **you must not have StatefulSets with selectors that overlap** with any other controllers (such as ReplicaSets), or else [ownership of Pods may change](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md#upgrading).
```
cc @erictune @kubernetes/sig-apps-pr-reviews
2017-03-06 17:16:10 -08:00
Anthony Yeh
e9e8fe6c32 RC/RS: Fix ignoring inactive Pods. 2017-03-06 15:51:53 -08:00
Anthony Yeh
8c4bcb38fb Deployment: Filter by ControllerRef in Reaper.
We don't want to delete ReplicaSets we don't own.
2017-03-06 15:12:08 -08:00
Anthony Yeh
cec3899b96 Deployment: Remove Overlap and SelectorUpdate annotations.
These are not used anymore since ControllerRef now protects against
fighting between controllers with overlapping selectors.
2017-03-06 15:12:08 -08:00
Anthony Yeh
94b3c216a1 Deployment: Consolidate Adopt/Release unit tests. 2017-03-06 15:12:08 -08:00
Anthony Yeh
f2a2895a78 Deployment: Check that ControllerRef UID matches. 2017-03-06 15:12:07 -08:00
Anthony Yeh
111b9ce9b5 Deployment: Fix data race in unit tests. 2017-03-06 15:12:07 -08:00
Anthony Yeh
d96c4847b6 Deployment: Filter Pods by Deployment selector in addition to ControllerRef.
Deployment should ignore Pods that don't match the selector, even if
they have a ControllerRef pointing to one of the ReplicaSets it owns.
The ReplicaSet itself will orphan the Pod as soon as it syncs.
2017-03-06 15:12:07 -08:00
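The double filter described in d96c4847b6 might look like the sketch below. The `Pod` type, the `ownedRS` set of ReplicaSet UIDs, and the `podsForDeployment` helper are hypothetical names chosen for illustration; the real controller code is structured differently.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified Pod.
type Pod struct {
	Name     string
	Labels   map[string]string
	OwnerUID string // UID from the Pod's ControllerRef; empty if none
}

// labelsMatch reports whether every selector key/value pair is present in labels.
func labelsMatch(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return len(selector) > 0
}

// podsForDeployment keeps a Pod only if it is controlled by one of the
// Deployment's ReplicaSets AND still matches the Deployment's selector.
func podsForDeployment(selector map[string]string, ownedRS map[string]bool, pods []Pod) []Pod {
	var out []Pod
	for _, p := range pods {
		if !ownedRS[p.OwnerUID] {
			continue // not controlled by any ReplicaSet this Deployment owns
		}
		if !labelsMatch(selector, p.Labels) {
			continue // ReplicaSet has not orphaned it yet; ignore it anyway
		}
		out = append(out, p)
	}
	return out
}

func main() {
	selector := map[string]string{"app": "web"}
	ownedRS := map[string]bool{"rs-1": true}
	pods := []Pod{
		{Name: "ok", Labels: map[string]string{"app": "web"}, OwnerUID: "rs-1"},
		{Name: "relabeled", Labels: map[string]string{"app": "debug"}, OwnerUID: "rs-1"},
		{Name: "foreign", Labels: map[string]string{"app": "web"}, OwnerUID: "rs-9"},
	}
	for _, p := range podsForDeployment(selector, ownedRS, pods) {
		fmt.Println(p.Name)
	}
}
```

The `relabeled` Pod still carries a ControllerRef to `rs-1`, but the Deployment ignores it because its labels no longer match.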
Anthony Yeh
37534b66df Deployment: Always set BlockOwnerDeletion in ControllerRef. 2017-03-06 15:12:07 -08:00
Anthony Yeh
0d9c9bfee0 Deployment: Use ControllerRef to route watch events.
This is part of the completion of ControllerRef, as described here:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/controller-ref.md#watches
2017-03-06 15:12:07 -08:00