Commit Graph

19845 Commits

Author SHA1 Message Date
Bowei Du
dc1e614a72 Split the GCE cloud provider into more managable chunks
Each major interface is now in its own file. Any package private
functions that are only referenced by a particular module was also moved
to the corresponding file. All common helper functions were moved to
gce_util.go.

This change is a pure movement of code; no semantic changes were made.
2017-03-23 14:40:16 -07:00
Jordan Liggitt
707f0fb131 Preserve API group order in discovery, prefer extensions over apps 2017-03-23 11:10:53 -04:00
Kubernetes Submit Queue
a84f100faa Merge pull request #42422 from vmware/fix-42399.kerneltime
Automatic merge from submit-queue

Fix adding disks to more than one scsi adapter. Fixes #42399

**What this PR does / why we need it**: Allows a single node to use more than 16 disks.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #42399

**Special notes for your reviewer**: 

**Release note**:

```release-note
Fix adding disks to more than one scsi adapter.
```
2017-03-22 19:23:19 -07:00
Kubernetes Submit Queue
7c24d1a665 Merge pull request #43539 from yujuhong/hostnet_ip
Automatic merge from submit-queue (batch tested with PRs 43533, 43539)

kuberuntime: don't override the pod IP for pods using host network

This fixes the issue of not passing pod IP via downward API for host network pods.
2017-03-22 15:07:18 -07:00
Yu-Ju Hong
ea868d6f7b kuberuntime: don't override the pod IP for pods using host network 2017-03-22 13:28:17 -07:00
Kubernetes Submit Queue
fb890dee06 Merge pull request #43474 from dcbw/cni-network-status
Automatic merge from submit-queue (batch tested with PRs 43465, 43529, 43474, 43521)

kubelet/cni: hook network plugin Status() up to CNI network discovery

Ensure that the plugin returns NotReady status until there is a
CNI network available which can be used to set up pods.

Fixes: https://github.com/kubernetes/kubernetes/issues/43014

I think the only reason it wasn't done like this in the first place was that the dynamic "reread /etc/cni/net.d every 10s forever" was added long after the Status() hook was.  What do you think?

@freehan @caseydavenport @luxas @jbeda
2017-03-22 12:35:11 -07:00
Kubernetes Submit Queue
0450c2925f Merge pull request #43465 from kargakis/update-validation
Automatic merge from submit-queue

Disable readyReplicas validation for Deployments

Because there is no field in 1.5, when we update to 1.6 and the
controller tries to update the Deployment, it will be denied by
validation because the pre-existing availableReplicas field is greater
than readyReplicas (normally readyReplicas should always be greater or
equal).

Fixes https://github.com/kubernetes/kubernetes/issues/43392

@kubernetes/sig-apps-bugs
2017-03-22 12:09:33 -07:00
Michail Kargakis
7f4670d622 Disable readyReplicas validation for Deployments
Because there is no field in 1.5, when we update to 1.6 and the
controller tries to update the Deployment, it will be denied by
validation because the pre-existing availableReplicas field is greater
than readyReplicas (normally readyReplicas should always be greater or
equal).
2017-03-22 08:42:34 -04:00
Kubernetes Submit Queue
ee255d09fa Merge pull request #43498 from aveshagarwal/master-issue-43228
Automatic merge from submit-queue

Add validation for affinities and taints/tolerations annotations.

It fixes annotations validation issues for pod/node affinities and taints/tolerations annotations for 1.5 to 1.6 upgrade tests as discussed in the issue https://github.com/kubernetes/kubernetes/issues/43228 .

@davidopp @derekwaynecarr @kubernetes/sig-scheduling-pr-reviews
2017-03-22 03:24:29 -07:00
Avesh Agarwal
eccbd992da Add validation for taints annotations. 2017-03-22 01:01:49 -04:00
Avesh Agarwal
ff4c1d80d2 Add validation for toleration annotations. 2017-03-22 00:53:34 -04:00
Avesh Agarwal
ab5b462d17 Add node affinity, pod affinity and pod antiaffinity validation for alpha annotations. 2017-03-22 00:28:00 -04:00
Kubernetes Submit Queue
00938eac64 Merge pull request #42741 from kargakis/avoid-ns-skew
Automatic merge from submit-queue (batch tested with PRs 43481, 43419, 42741, 43480)

controller: work around milliseconds skew in AddAfter

AddAfter is not requeueing precisely after the provided time and may
skew for some millieseconds. This is really important because controllers
don't relist often so a missed check because of ms difference is
essentially dropping the key. For example, in [1] the test requeues a
Deployment for a progress check after 10s[2] but the Deployment is synced
9ms earlier ending up in the controller not recognizing the Deployment as
failed thus dropping it from the queue w/o any error. The drop is fixed by
forcing the controller to resync the Deployment but we are going to resync
after the full duration.

@deads2k if you don't like this I am going to handle this on a case by case basis

[1] https://github.com/kubernetes/kubernetes/issues/39785#issuecomment-279959133
[2] c48b2cab0f/test/e2e/deployment.go (L1122)
2017-03-21 19:29:27 -07:00
Kubernetes Submit Queue
eb77144474 Merge pull request #42715 from DirectXMan12/bug/infinite-hpa
Automatic merge from submit-queue

Rate limit HPA controller to sync period

Since the HPA controller pulls information from an external source that
makes no guarantees about consistency, it's possible for the HPA
to get into an infinite update loop -- if the metrics change with
every query, the HPA controller will run it's normal reconcilation,
post a status update, see that status update itself, fetch new metrics,
and if those metrics are different, post another status update, and
repeat.  This can lead to continuously updating a single HPA.
    
By rate-limiting each HPA to once per sync interval, we prevent this
from happening.

**Release note**:
```release-note
NONE
```
2017-03-21 14:26:16 -07:00
Dan Williams
193abffdbe kubelet/cni: hook network plugin Status() up to CNI network discovery
Ensure that the plugin returns NotReady status until there is a
CNI network available which can be used to set up pods.

Fixes: https://github.com/kubernetes/kubernetes/issues/43014
2017-03-21 15:50:39 -05:00
Michail Kargakis
68b78282d7 controller: work around milliseconds skew in AddAfter 2017-03-21 16:39:32 -04:00
Kubernetes Submit Queue
51beb16ede Merge pull request #43422 from liggitt/convert-null-list
Automatic merge from submit-queue

Ensure slices are serialized as zero-length, not null

Fixes https://github.com/kubernetes/kubernetes/issues/43203 null serialization of slices to prevent NPE errors in clients that store and expect to receive non-null JSON values in these fields.

Ensures when we are converting to an external slice field that will be serialized even if empty (has `json` tag that does not include `omitempty`), we populate it with `[]`, not `nil`

Other places I considered putting this logic instead:

* When unmarshaling
  * Would have to be done for both protobuf and ugorji
  * Would still have to be done here (or on marshal) to handle cases where we construct objects to return
* When marshaling
  * Would have to switch to use custom json marshaler (currently we use stdlib)
* When defaulting
  * Defaulting isn't run on some fields, notably, pod template in rc/deployment spec
  * Would still have to be done here (or on marshal) to handle cases where we construct objects to return

```release-note
API fields that previously serialized null arrays as `null` and empty arrays as `[]` no longer distinguish between those values and always output `[]` when serializing to JSON.
```
2017-03-21 11:46:19 -07:00
Kubernetes Submit Queue
1e5fa8fed5 Merge pull request #43415 from thockin/fix-nodeport-close-wait
Automatic merge from submit-queue

Install a REJECT rule for nodeport with no backend

Rather than actually accepting the connection, REJECT.  This will avoid
CLOSE_WAIT.

Fixes #43212


@justinsb @felipejfc @spiddy
2017-03-20 23:22:21 -07:00
Tim Hockin
2ec87999a9 Install a REJECT rule for nodeport with no backend
Rather than actually accepting the connection, REJECT.  This will avoid
CLOSE_WAIT.
2017-03-20 21:37:00 -07:00
Jordan Liggitt
939ca532aa generated files 2017-03-20 23:57:38 -04:00
Jordan Liggitt
0e2f1b535d Ensure empty serialized slices are zero-length, not null 2017-03-20 23:56:39 -04:00
Kubernetes Submit Queue
e09fda63be Merge pull request #43414 from Random-Liu/use-uid-in-config
Automatic merge from submit-queue

Use uid in config.go instead of pod full name.

For https://github.com/kubernetes/kubernetes/issues/43397.

In config.go, use pod uid in pod cache.

Previously, if we update the static pod, even though a new UID is generated in file.go, config.go will only reference the pod with pod full name, and never update the pod UID in the internal cache. This causes:
1) If we change container spec, kubelet will restart the corresponding container because the container hash is changed.
2) If we change pod spec, kubelet will do nothing.

With this fix, kubelet will always restart pod whenever pod spec (including container spec) is changed.

@yujuhong @bowei @dchen1107 
/cc @kubernetes/sig-node-bugs
2017-03-20 18:18:36 -07:00
Kubernetes Submit Queue
4974a0589b Merge pull request #43337 from janetkuo/ds-template-semantic-deepequal
Automatic merge from submit-queue

Use Semantic.DeepEqual to compare DaemonSet template on updates

Switch to `Semantic.DeepEqual` when comparing templates on DaemonSet updates, since we can't distinguish between `null` and `[]` in protobuf. This avoids unnecessary DaemonSet pods restarts. 

I didn't touch `reflect.DeepEqual` used in controller because it's close to release date, and the DeepEqual in the controller doesn't cause serious issues (except for maybe causing more enqueues than needed). 

Fixes #43218 

@liggitt @kargakis @lukaszo @kubernetes/sig-apps-pr-reviews
2017-03-20 17:24:18 -07:00
Random-Liu
fbc320af28 Use uid in config.go instead of pod full name. 2017-03-20 15:52:29 -07:00
Kubernetes Submit Queue
a2d74cda38 Merge pull request #42452 from jingxu97/Mar/nodeNamePrefix
Automatic merge from submit-queue (batch tested with PRs 42452, 43399)

Modify getInstanceByName to avoid calling getInstancesByNames

This PR modify getInstanceByname to loop through all management zones
directly instead of calling getInstancesByNames. Currently
getInstancesByNames use a node name prefix as a filter to list the
instances. If the prefix does not match, it will return all instances
which is very wasteful since getInstanceByName only query one instance
with a specific name.

Partially fix issue #42445
2017-03-20 15:23:33 -07:00
Janet Kuo
f780f32c1e Use Semantic.DeepEqual to compare DaemonSet template on updates 2017-03-20 13:58:49 -07:00
Kubernetes Submit Queue
948e3754f8 Merge pull request #43368 from feiskyer/dns-policy
Automatic merge from submit-queue (batch tested with PRs 43398, 43368)

CRI: add support for dns cluster first policy

**What this PR does / why we need it**:

PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools was updated to support the feature. 

This PR updates kuberuntime to support it for all runtimes.


**Which issue this PR fixes** 

fixes #43352

**Special notes for your reviewer**:

Candidate for v1.6.

**Release note**:

```release-note
NONE
```

cc @thockin @luxas @vefimova @Random-Liu
2017-03-20 13:54:33 -07:00
Kubernetes Submit Queue
bc82d87f0a Merge pull request #43398 from enisoc/deletion-race-flake
Automatic merge from submit-queue

Deflake TestSyncDeploymentDeletionRace

**What this PR does / why we need it**:

The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```

I checked the other controllers, and they all were already not starting informers for the deletion race test. I also checked that the deletion race tests for other controllers all pass with `-count 100000`.

**Which issue this PR fixes**:

Fixes #43390

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-03-20 13:26:03 -07:00
Kubernetes Submit Queue
e668ee1182 Merge pull request #43370 from feiskyer/port-mapping
Automatic merge from submit-queue (batch tested with PRs 42659, 43370)

dockershim: process protocol correctly for port mapping

**What this PR does / why we need it**:

dockershim: process protocol correctly for port mapping.

**Which issue this PR fixes** 

Fixes #43365.

**Special notes for your reviewer**:

Should be included in v1.6.

**Release note**:

```release-note
NONE
```

cc/ @Random-Liu @justinsb @kubernetes/sig-node-pr-reviews
2017-03-20 12:40:40 -07:00
Anthony Yeh
0b9233648e Deflake TestSyncDeploymentDeletionRace
The cache was sometimes catching up while we were testing the case
where the cache is not yet caught up.

Before this fix, I could reproduce the failure with the following
command. After the fix, it passes.

```
go test -count 100000 -run TestSyncDeploymentDeletionRace
```
2017-03-20 11:13:26 -07:00
Anthony Yeh
c74aab649f RC/RS: Mark lookup-cache-size flags as deprecated. 2017-03-20 09:10:12 -07:00
Anthony Yeh
f4ee44eb39 RC/RS: Check that ControllerRef UID matches found controller.
Otherwise, we may confuse a former controller by that name with a new
one that has the same name.
2017-03-20 08:57:42 -07:00
Pengfei Ni
95c3782043 Rewrite resolv.conf for dockershim
PR #29378 introduces ClusterFirstWithHostNet, but docker doesn't support
setting dns options togather with hostnetwork. This commit rewrites
resolv.conf same as dockertools.
2017-03-20 18:45:39 +08:00
Pengfei Ni
079158fa08 CRI: add support for dns cluster first policy
PR #29378 introduces ClusterFirstWithHostNet policy but only dockertools
was updated to support the feature. This PR updates kuberuntime to
support it for all runtimes.

Also fixes #43352.
2017-03-20 17:50:38 +08:00
Pengfei Ni
99ed3202f3 Run hack/update-bazel.sh 2017-03-20 17:48:36 +08:00
Pengfei Ni
53b5f2df48 Add unit test for MakePortsAndBindings 2017-03-20 17:47:38 +08:00
Pengfei Ni
2ddaaec199 dockershim: process protocol correctly for port mapping 2017-03-20 16:52:24 +08:00
Kubernetes Submit Queue
47320fd3f0 Merge pull request #42938 from enisoc/orphan-race
Automatic merge from submit-queue

GC: Fix re-adoption race when orphaning dependents.

**What this PR does / why we need it**:

The GC expects that once it sees a controller with a non-nil
DeletionTimestamp, that controller will not attempt any adoption.
There was a known race condition that could cause a controller to
re-adopt something orphaned by the GC, because the controller is using a
cached value of its own spec from before DeletionTimestamp was set.

This fixes that race by doing an uncached quorum read of the controller
spec just before the first adoption attempt. It's important that this
read occurs after listing potential orphans. Note that this uncached
read is skipped if no adoptions are attempted (i.e. at steady state).

**Which issue this PR fixes**:

Fixes #42639

**Special notes for your reviewer**:

**Release note**:
```release-note
```

cc @kubernetes/sig-apps-pr-reviews
2017-03-20 01:30:11 -07:00
Kubernetes Submit Queue
7bc86d84c1 Merge pull request #43116 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 42828, 43116)

Apply taint tolerations for NoExecute for all static pods.

Fixed https://github.com/kubernetes/kubernetes/issues/42753


**Release note**:
```
Apply taint tolerations for NoExecute for all static pods.
```

cc/ @davidopp
2017-03-17 18:14:29 -07:00
Kubernetes Submit Queue
9497139cb6 Merge pull request #42828 from janetkuo/ds-types
Automatic merge from submit-queue

Update field descriptions of DaemonSet rolling udpate

@kargakis @lukaszo @kubernetes/sig-apps-bugs
2017-03-17 17:54:14 -07:00
Anthony Yeh
b4b8fdbca3 GC: Fix re-adoption race when orphaning dependents.
The GC expects that once it sees a controller with a non-nil
DeletionTimestamp, that controller will not attempt any adoption.
There was a known race condition that could cause a controller to
re-adopt something orphaned by the GC, because the controller is using a
cached value of its own spec from before DeletionTimestamp was set.

This fixes that race by doing an uncached quorum read of the controller
spec just before the first adoption attempt. It's important that this
read occurs after listing potential orphans. Note that this uncached
read is skipped if no adoptions are attempted (i.e. at steady state).
2017-03-17 15:39:26 -07:00
Kubernetes Submit Queue
4b00d5e42a Merge pull request #43307 from gnufied/fix-aws-legacy-tagging
Automatic merge from submit-queue (batch tested with PRs 43313, 43257, 43271, 43307)

Fix AWS untagged instances

To revert to 1.5 behaviour we need to consider untagged
instances if no clusterID has been specified or found.

Fixes https://github.com/kubernetes/kubernetes/issues/43063 

cc @justinsb
2017-03-17 15:12:35 -07:00
Kubernetes Submit Queue
eb43cd5eb3 Merge pull request #43271 from liggitt/affinity-namespace
Automatic merge from submit-queue (batch tested with PRs 43313, 43257, 43271, 43307)

Remove 'all namespaces' meaning of empty list in PodAffinityTerm

Removes the distinction between `null` and `[]` for the PodAffinityTerm#namespaces field (option 4 discussed in https://github.com/kubernetes/kubernetes/issues/43203#issuecomment-287237992), since we can't distinguish between them in protobuf (and it's a less than ideal API)

Leaves the door open to reintroducing "all namespaces" function via a dedicated field or a dedicated token in the list of namespaces

Wanted to get a PR open and tests green in case we went with this option.

Not sure what doc/release-note is needed if the "all namespaces" function is not present in 1.6
2017-03-17 15:12:33 -07:00
Janet Kuo
263d605112 Auto-generate 2017-03-17 14:42:37 -07:00
Kubernetes Submit Queue
f37cffcf4e Merge pull request #43239 from enisoc/kubectl-controller-ref
Automatic merge from submit-queue

kubectl: Use v1.5-compatible ownership logic when listing dependents.

**What this PR does / why we need it**:

This restores compatibility between kubectl 1.6 and clusters running Kubernetes 1.5.x. It introduces transitional ownership logic in which the client considers ControllerRef when it exists, but does not require it to exist.

If we were to ignore ControllerRef altogether (pre-1.6 client behavior), we would introduce a new failure mode in v1.6 because controllers that used to get stuck due to selector overlap will now make progress. For example, that means when reaping ReplicaSets of an overlapping Deployment, we would risk deleting ReplicaSets belonging to a different Deployment that we aren't about to delete.

This transitional logic avoids such surprises in 1.6 clusters, and does no worse than kubectl 1.5 did in 1.5 clusters. To prevent this when kubectl 1.5 is used against 1.6 clusters, we can cherrypick this change.

**Which issue this PR fixes**:

Fixes #43159

**Special notes for your reviewer**:

**Release note**:
```release-note
```
2017-03-17 14:25:38 -07:00
Janet Kuo
bca3691029 Use json field names instead of go field names 2017-03-17 14:24:21 -07:00
Janet Kuo
4cebc865dc Update description of fields for DaemonSet rolling udpate 2017-03-17 14:12:00 -07:00
Dawn Chen
d419efbe71 Fix unittest reflecting the default taint tolerations change for static
pods.
2017-03-17 14:06:34 -07:00
Hemant Kumar
1de4c5bbe0 Fix AWS untagged instances
To revert to 1.5 behaviour we need to consider untagged
instances if no clusterID has been specified or found.
2017-03-17 14:05:52 -04:00
Kubernetes Submit Queue
fe08925805 Merge pull request #42869 from better88/patch-1
Automatic merge from submit-queue

Fix revision when SetDeploymentRevision

When some oldRSs be deleted or cleared(eg. revisionHistoryLimit set 0), the revision for  SetDeploymentRevision is incorrect
2017-03-17 10:55:29 -07:00