23868 Commits

Author SHA1 Message Date
Manjunath A Kumatagi
5db5ef8501 Enhance message in cluster-info dump 2017-03-14 12:44:59 +05:30
wenlxie
33385214bc recycle pod can't get the event since the channel has been closed 2017-03-14 10:35:08 +08:00
Yu-Ju Hong
035afab901 dockershim: remove corrupted sandbox checkpoints
This is a workaround to ensure that kubelet doesn't block forever when
the checkpoint is corrupted.
2017-03-13 15:41:01 -07:00
Yifan Gu
a489bd2674 pkg/util/flock: Fix the flock so it actually locks.
With this PR, a second call to `Acquire()` blocks until the lock is released (that is, until the holding process exits).
Also removed the in-memory mutex from the previous code: we don't need `Release()` here, so there is no need to save and protect the local fd.

Fix #42929.
2017-03-13 14:24:59 -07:00
Random-Liu
e6341cc3c7 Fix kubelet panic in cgroup manager. 2017-03-13 12:06:08 -07:00
Vishnu kannan
ad743a922a remove dead code in gpu manager
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Vishnu kannan
ff158090b3 use active pods instead of runtime pods in gpu manager
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Vishnu Kannan
8ed9bff073 handle container restarts for GPUs
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Derek Carr
39b380c7bd Unit test quota for nodeport associated with loadbalancer 2017-03-13 11:21:56 -04:00
Klaus Ma
3f24d46564 Removed err from return value of AddOrUpdateTolerationInPod. 2017-03-13 22:37:41 +08:00
Jan Safranek
06feaccead Remove 'beta' from default storage class annotation 2017-03-13 12:53:41 +01:00
Kubernetes Submit Queue
e1248bcbbc Merge pull request #42962 from k82cn/fix_min_tolerant_time
Automatic merge from submit-queue

Fixed incorrect result of getMinTolerationTime.

For the following case, `getMinTolerationTime` should return 1, but it returned -1:
1. For tolerations[0], TolerationSeconds is nil, so minTolerationTime is not set.
2. For tolerations[1], its TolerationSeconds (1) is bigger than `minTolerationTime`, so minTolerationTime stays at -1, which means infinite.

```
+		{
+			tolerations: []v1.Toleration{
+				{
+					TolerationSeconds: nil,
+				},
+				{
+					TolerationSeconds: &one,
+				},
+			},
+		},
```
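
The intended behavior for the case above can be sketched as follows. This is a simplified illustration, not the code from the PR: `Toleration` is a hypothetical stand-in for `v1.Toleration`, and the function works in plain seconds.

```go
package main

import "fmt"

// Toleration is a hypothetical stand-in for v1.Toleration that keeps only
// the field relevant to this example.
type Toleration struct {
	TolerationSeconds *int64
}

// minTolerationSeconds returns the smallest TolerationSeconds that is set,
// or -1 (meaning "tolerate forever") when no toleration sets one.
func minTolerationSeconds(tolerations []Toleration) int64 {
	lowest := int64(-1)
	found := false
	for _, t := range tolerations {
		if t.TolerationSeconds == nil {
			continue // no time limit here, so it cannot lower the minimum
		}
		if !found || *t.TolerationSeconds < lowest {
			lowest = *t.TolerationSeconds
			found = true
		}
	}
	return lowest
}

func main() {
	one := int64(1)
	tolerations := []Toleration{
		{TolerationSeconds: nil},
		{TolerationSeconds: &one},
	}
	fmt.Println(minTolerationSeconds(tolerations))
}
```

Tracking `found` separately avoids comparing a real value against the -1 sentinel, which is the kind of comparison that produced the wrong result described above.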
2017-03-12 23:55:39 -07:00
Kubernetes Submit Queue
65ddace3ed Merge pull request #42702 from smarterclayton/printer_owners
Automatic merge from submit-queue

Add pkg/printers OWNERS

Should also include more sig-api-machinery members, as this will be moving to the server side
2017-03-12 21:04:57 -07:00
Hemant Kumar
a4a3d20934 Fix vsphere selinux support
Managed flag must be true for SELinux relabelling to work
for vsphere.
2017-03-12 23:21:07 -04:00
tanshanshan
26ab52a3cb fix 2017-03-13 10:00:19 +08:00
AdoHe
8ebc6e91f8 print warning when delete current context 2017-03-12 22:29:11 +08:00
Klaus Ma
d0e04427d7 Fixed incorrect result of getMinTolerationTime. 2017-03-12 20:21:14 +08:00
Kubernetes Submit Queue
e315c388b2 Merge pull request #42944 from liggitt/patch-defaulting
Automatic merge from submit-queue

Ensure patched objects are defaulted correctly

Restores defaulting behavior for patch API calls removed in e34e1abe33 (diff-517d1b81963bbc7c9b0a16e6eb3c0e2f)

Restores the unit test that ensures we get a defaulted result after applying a patch

Fixes https://github.com/kubernetes/kubernetes/issues/42764
Fixes #42834
2017-03-11 17:49:41 -08:00
Kubernetes Submit Queue
3f660a9779 Merge pull request #42913 from aveshagarwal/master-fix-taint-based-eviction-no-node-cidr
Automatic merge from submit-queue

Fix taint based pod eviction for clusters where controller manager is not running with allocate-node-cidrs set

Fixes https://github.com/kubernetes/kubernetes/issues/42733

In my cluster, I have not set allocate-node-cidrs, and this causes taint based pod eviction to fail.

@gmarek @kubernetes/sig-scheduling-bugs @davidopp @derekwaynecarr
2017-03-11 14:02:45 -08:00
Cao Shufeng
b2f530d756 [cli] fix Generator's error messages
Invalid variables are used when formatting error messages. This change
fixes them.
2017-03-11 02:09:52 -05:00
Kubernetes Submit Queue
8cb14a4f7f Merge pull request #42755 from aveshagarwal/master-fix-default-toleration-seconds
Automatic merge from submit-queue (batch tested with PRs 41794, 42349, 42755, 42901, 42933)

Fix DefaultTolerationSeconds admission plugin

DefaultTolerationSeconds is not working as expected. It is supposed to add default tolerations (for the unreachable and notready conditions), but no pod was getting these tolerations, and the API server was throwing this error:

```
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.769212   32070 admission.go:71] expected pod but got Pod
Mar 08 13:43:57 fedora25 hyperkube[32070]: E0308 13:43:57.789055   32070 admission.go:71] expected pod but got Pod
Mar 08 13:44:02 fedora25 hyperkube[32070]: E0308 13:44:02.006784   32070 admission.go:71] expected pod but got Pod
Mar 08 13:45:39 fedora25 hyperkube[32070]: E0308 13:45:39.754669   32070 admission.go:71] expected pod but got Pod
Mar 08 14:48:16 fedora25 hyperkube[32070]: E0308 14:48:16.673181   32070 admission.go:71] expected pod but got Pod
```

The reason for this error is that the input to admission plugins is internal API objects, not versioned objects, so expecting a versioned object is incorrect. Because of this, no pod got the desired tolerations, and the output always showed:

```
Tolerations: <none>
```

After this fix, the correct tolerations are assigned to pods as follows:

```
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
```
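
The type mismatch behind the `expected pod but got Pod` error can be sketched with a plain Go type assertion. `InternalPod`, `VersionedPod` and `admit` are hypothetical names used only for illustration; they are not the real Kubernetes types or the plugin's actual code.

```go
package main

import "fmt"

// InternalPod and VersionedPod are hypothetical stand-ins for the internal
// and versioned representations of a pod. They share a short name ("Pod" in
// the real code base) but are distinct Go types.
type InternalPod struct{ Tolerations []string }
type VersionedPod struct{ Tolerations []string }

// admit mimics an admission plugin that receives internal objects. Asserting
// on the versioned type here would fail for every request.
func admit(obj interface{}) error {
	pod, ok := obj.(*InternalPod)
	if !ok {
		return fmt.Errorf("expected *InternalPod but got %T", obj)
	}
	if len(pod.Tolerations) == 0 {
		// Illustrative default values only.
		pod.Tolerations = append(pod.Tolerations,
			"notReady:NoExecute for 300s",
			"unreachable:NoExecute for 300s")
	}
	return nil
}

func main() {
	fmt.Println(admit(&VersionedPod{})) // wrong type: rejected with an error

	pod := &InternalPod{}
	fmt.Println(admit(pod), pod.Tolerations)
}
```

Printing the dynamic type with `%T` makes the mismatch obvious, whereas a message that prints only the short type name can look self-contradictory.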

@davidopp @kevin-wangzefeng @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-scheduling-bugs @derekwaynecarr 

Fixes https://github.com/kubernetes/kubernetes/issues/42716
2017-03-10 22:02:18 -08:00
Jordan Liggitt
464db160b4 Ensure patched objects are defaulted correctly 2017-03-10 22:07:10 -05:00
Kubernetes Submit Queue
59aa924a9b Merge pull request #42642 from fraenkel/envfrom
Automatic merge from submit-queue

Invalid environment var names are reported and pod starts

When processing EnvFrom items, all invalid keys are collected and
reported as a single event.

The Pod is allowed to start.
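
The collect-and-report-once approach described above might look roughly like the sketch below. It assumes that a valid name is a C identifier; `buildEnv` and `envNameRE` are hypothetical names, and the real kubelet code and its validation rule may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// envNameRE is an assumed validity rule (C identifier) for this sketch.
var envNameRE = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)

// buildEnv copies every valid key into the environment and collects all
// invalid keys, so the caller can report them in a single event instead of
// failing on the first bad key.
func buildEnv(data map[string]string) (map[string]string, []string) {
	env := make(map[string]string)
	var invalid []string
	for k, v := range data {
		if !envNameRE.MatchString(k) {
			invalid = append(invalid, k)
			continue
		}
		env[k] = v
	}
	sort.Strings(invalid) // map iteration order is random; sort for stable output
	return env, invalid
}

func main() {
	env, invalid := buildEnv(map[string]string{
		"GOOD_KEY": "1",
		"1bad":     "2",
		"also-bad": "3",
	})
	if len(invalid) > 0 {
		// One event for all invalid keys; the pod still starts with env.
		fmt.Printf("skipped invalid environment variable names: %v\n", invalid)
	}
	fmt.Println(env)
}
```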

fixes #42583
2017-03-10 17:37:31 -08:00
Kubernetes Submit Queue
9590f694c8 Merge pull request #41830 from irfanurrehman/fed-rbac-1
Automatic merge from submit-queue

[Federation] Kubefed Init should use the right RBAC API version clientset

**What this PR does / why we need it**:
Implements the need as described in https://github.com/kubernetes/kubernetes/issues/41263
**Which issue this PR fixes**: fixes https://github.com/kubernetes/kubernetes/issues/41263

**Special notes for your reviewer**:
@madhusudancs @shashidharatd @marun 
cc @kubernetes/sig-federation-bugs

**Release note**:

```
NONE
```
2017-03-10 15:56:47 -08:00
Kubernetes Submit Queue
486ec2b7c9 Merge pull request #42862 from caesarxuchao/sync-warning
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)

Let GC print specific message for RESTMapping failure

Make the error messages reported in https://github.com/kubernetes/kubernetes/issues/39816 more specific, and print the message only once.

I'll also update the garbage collector's doc to clearly state that we don't support TPR yet.

We'll wait for the watchable discovery feature (@sttts are you going to work on that?) to land in 1.7, and then enable the garbage collector to handle TPR.

cc @hongchaodeng @MikaelCluseau @djMax
2017-03-10 14:01:23 -08:00
Kubernetes Submit Queue
d2d3884f83 Merge pull request #42362 from soltysh/deployment_generators
Automatic merge from submit-queue (batch tested with PRs 38805, 42362, 42862)

Fix deployment generator after introducing deployments in apps/v1beta1

This PR does two things:

1. Switches all generators to produce versioned objects, to bypass the problem of having an object in multiple versions, which otherwise prevents the generator from being stable (in other words, from producing exactly the same object).
2. Introduces new generator for `apps/v1beta1` deployments.

@kargakis @janetkuo ptal

@kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews ptal

This is a followup to https://github.com/kubernetes/kubernetes/pull/39683, so I'm adding 1.6 milestone.

```release-note
Introduce new generator for apps/v1beta1 deployments
```
2017-03-10 14:01:21 -08:00
Kubernetes Submit Queue
e2218290cf Merge pull request #42444 from jingxu97/Mar/deleteVolume
Automatic merge from submit-queue (batch tested with PRs 42608, 42444)

Return nil when deleting non-exist GCE PD

When the GCE cloud provider tries to delete a disk and the disk cannot be found
in the zones, the function should return a nil error. This modified behavior is also consistent with AWS.
2017-03-10 12:50:24 -08:00
Avesh Agarwal
c3a80719a2 Fix taint based pod eviction for clusters where controller manager
is not running with --allocate-node-cidrs set.
2017-03-10 15:39:21 -05:00
Chao Xu
d7aef0a338 Let GC print specific message for RESTMapping failure 2017-03-10 11:38:57 -08:00
Kris
ee4227f4bf Remove krousey from some OWNERS files 2017-03-10 11:12:29 -08:00
Kubernetes Submit Queue
e261cabb09 Merge pull request #42877 from gmarek/taint_cleanup
Automatic merge from submit-queue (batch tested with PRs 42877, 42853)

Remove unused functions and make logs slightly better

Zero-risk cleanup: removes functions that are no longer used and adds a few more logs to help debug problems.

cc @aveshagarwal
2017-03-10 09:54:21 -08:00
Kubernetes Submit Queue
18ffc95308 Merge pull request #36704 from fabxc/client-metrics2
Automatic merge from submit-queue

Use Prometheus instrumentation conventions

The `System` and `Subsystem` parameters are subject to removal.
(x-ref: https://github.com/prometheus/client_golang/issues/240)

All metrics should use base units, which is seconds in the duration
case.

Counters should always end in `_total` and metrics should avoid
referring to potential label dimensions. Those should rather be
mentioned in the documentation string.

@kubernetes/sig-instrumentation 

Reference docs:
https://prometheus.io/docs/practices/instrumentation/
https://prometheus.io/docs/practices/naming/

**Release note**:
```
Breaking change: Renamed REST client Prometheus metrics to follow the instrumentation conventions ("request_latency_microseconds" -> "rest_client_request_latency_seconds", "request_status_codes" -> "rest_client_requests_total"). Please update your alerting pipeline if you rely on them. 
```
2017-03-10 09:04:18 -08:00
Maciej Szulik
597a359c38 Error out when cronjob generator not specified, but cronjobs are not available 2017-03-10 12:08:01 +01:00
Maciej Szulik
aa4390750c Introduce new generator for apps/v1beta1 deployments 2017-03-10 12:08:01 +01:00
Maciej Szulik
1049dad0a4 Switch generators to use versioned objects 2017-03-10 12:08:01 +01:00
gmarek
fddac63c27 Remove unused functions and make logs slightly better 2017-03-10 11:57:51 +01:00
Kubernetes Submit Queue
c38717b73a Merge pull request #42843 from janetkuo/ds-status-kubectl
Automatic merge from submit-queue

Add new DaemonSetStatus to kubectl printer and describer

@kargakis @lukaszo @kubernetes/sig-apps-pr-reviews @kubernetes/sig-cli-pr-reviews 

```release-note
Add new DaemonSet status fields to kubectl printer and describer. 
```
2017-03-10 01:56:59 -08:00
timchenxiaoyu
c295514443 accurate hint 2017-03-10 16:41:51 +08:00
Fabian Reinartz
2b66b49a2f Use Histogram instead of Summary
A histogram allows aggregating by labels and calculating more
comprehensive quantiles.
2017-03-10 07:24:38 +01:00
Fabian Reinartz
49e2074f74 Use Prometheus instrumentation conventions
The `System` and `Subsystem` parameters are subject to removal.
(x-ref: https://github.com/prometheus/client_golang/issues/240)

All metrics should use base units, which is seconds in the duration
case.

Counters should always end in `_total` and metrics should avoid
referring to potential label dimensions. Those should rather be
mentioned in the documentation string.
2017-03-10 07:24:38 +01:00
Kubernetes Submit Queue
ab6fecfa3a Merge pull request #42811 from gnufied/validation-no-probe
Automatic merge from submit-queue (batch tested with PRs 42811, 42859)

Validation PVs for mount options

We are going to move the validation into its own package, and we will call validation for individual volume types as needed.

Fixes https://github.com/kubernetes/kubernetes/issues/42573
2017-03-09 18:47:52 -08:00
Avesh Agarwal
9f533de80d Fix DefaultTolerationSeconds admission plugin. It was using
versioned object whereas admission plugins operate on internal objects.
2017-03-09 20:24:43 -05:00
Kubernetes Submit Queue
1f5708d460 Merge pull request #42640 from lukaszo/ds-updates-fix
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)

kubectl: respect DaemonSet strategy parameters for rollout status

It handles "after-merge" comments from #41116

cc @kargakis @janetkuo 

I will add one more e2e test later. I need to handle some in-company work first.
2017-03-09 16:41:54 -08:00
Kubernetes Submit Queue
7002c53a9c Merge pull request #42808 from ravisantoshgudimetla/nodecontroller_eviction_flake
Automatic merge from submit-queue (batch tested with PRs 42024, 42780, 42808, 42640)

Node controller test flake 39975 with delay for try function

**Which issue this PR fixes**: fixes #39975

/cc @ncdc @gmarek @liggitt
2017-03-09 16:41:52 -08:00
Kubernetes Submit Queue
9498a1270f Merge pull request #42024 from luomiao/fix-vsphere-remove-port
Automatic merge from submit-queue

Remove VCenterPort from vsphere cloud provider.

**What this PR does / why we need it**:
Addresses a bug in the vSphere cloud provider that occurs when a port number other than 443 is specified in the config file.
The URL used for communicating with govmomi should not include a port number.
A port number other than 443 results in a 404 error.
VCenterPort stays in the VSphereConfig structure for backward compatibility.

**Which issue this PR fixes** : fixes https://github.com/kubernetes/kubernetes-anywhere/issues/338
2017-03-09 15:59:33 -08:00
Janet Kuo
39857f4865 Add new DaemonSetStatus to kubectl printer and describer 2017-03-09 15:45:17 -08:00
Hemant Kumar
12d6b87894 Validation PVs for mount options
We are going to move the validation into its own package
and we will call validation for individual volume types
as needed.
2017-03-09 18:24:37 -05:00
Kubernetes Submit Queue
d790851c8f Merge pull request #42694 from dchen1107/master
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

Dropped docker 1.9.x support. Changed the minimumDockerAPIVersion to 1.22

cc/ @Random-Liu @yujuhong 

We have talked about dropping docker 1.9.x support for a while. I just realized that we haven't actually done it yet.

```release-note
Dropped support for docker 1.9.x and below.
```
2017-03-09 15:07:00 -08:00
Kubernetes Submit Queue
5a47671614 Merge pull request #42814 from yujuhong/cri-kubemark-3rd-time-is-the-charm
Automatic merge from submit-queue (batch tested with PRs 42734, 42745, 42758, 42814, 42694)

kubemark: enable CRI in the hollow kubelet
2017-03-09 15:06:58 -08:00
Solly Ross
8337031bf5 Rate limit HPA controller to sync period
Since the HPA controller pulls information from an external source that
makes no guarantees about consistency, it's possible for the HPA
to get into an infinite update loop -- if the metrics change with
every query, the HPA controller will run its normal reconciliation,
post a status update, see that status update itself, fetch new metrics,
and if those metrics are different, post another status update, and
repeat.  This can lead to continuously updating a single HPA.

By rate-limiting each HPA to once per sync interval, we prevent this
from happening.
2017-03-09 16:32:01 -05:00