Commit Graph

5949 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
96ec318718
Merge pull request #59842 from ixdy/update-rules_go-02-2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

 Update bazelbuild/rules_go, kubernetes/repo-infra, and gazelle dependencies

**What this PR does / why we need it**: updates our bazelbuild/rules_go dependency in order to bump everything to go1.9.4. I'm separating this effort into two separate PRs, since updating rules_go requires a large cleanup, removing an attribute from most build rules.

**Release note**:

```release-note
NONE
```
2018-02-19 22:23:05 -08:00
David Ashpole
960856f4e8 collect metrics on the /kubepods cgroup on-demand 2018-02-17 12:32:40 -08:00
Kubernetes Submit Queue
270ed995f4
Merge pull request #59841 from dashpole/metrics_after_reclaim
Automatic merge from submit-queue (batch tested with PRs 59683, 59964, 59841, 59936, 59686). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reevaluate eviction thresholds after reclaim functions

**What this PR does / why we need it**:
When the node comes under `DiskPressure` due to inodes or disk space, the eviction manager runs garbage collection functions to clean up dead containers and unused images.
Currently, we use the strategy of trying to measure the disk space and inodes freed by garbage collection.  However, as #46789 and #56573 point out, there are gaps in the implementation that can cause extra evictions even when they are not required.  Furthermore, for nodes which frequently cycle through images, it results in a large number of evictions, as running out of inodes always causes an eviction.

This PR changes this strategy to call the garbage collection functions and ignore the results.  Then, it triggers another collection of node-level metrics, and sees if the node is still under DiskPressure.
This way, we can simply observe the decrease in disk or inode usage, rather than trying to measure how much is freed.

**Which issue(s) this PR fixes**:
Fixes #46789
Fixes #56573
Related PR #56575

**Special notes for your reviewer**:
This will look cleaner after #57802  removes arguments from [makeSignalObservations](https://github.com/kubernetes/kubernetes/pull/57802/files#diff-9e5246d8c78d50ce4ba440f98663f3e9R719).

**Release note**:
```release-note
NONE
```

/sig node
/kind bug
/priority important-soon
cc @kubernetes/sig-node-pr-reviews
2018-02-16 16:31:33 -08:00
Jeff Grafton
ef56a8d6bb Autogenerated: hack/update-bazel.sh 2018-02-16 13:43:01 -08:00
Kubernetes Submit Queue
930f86574f
Merge pull request #57885 from cimomo/kubelet-fixes
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve comments for kubelet

**What this PR does / why we need it**:
Improve comments and fix typos for kubelet.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-16 13:38:49 -08:00
Kubernetes Submit Queue
244549f02a
Merge pull request #59769 from dashpole/capacity_ephemeral_storage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Collect ephemeral storage capacity on initialization

**What this PR does / why we need it**:
We have had some node e2e flakes where a pod can be rejected if it requests ephemeral storage.  This is because we don't set capacity and allocatable for ephemeral storage on initialization.
This PR causes cAdvisor to do one round of stats collection during initialization, which will allow it to get the disk capacity when it first sets the node status.
It also sets the node to NotReady if capacities have not been initialized yet.

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
/assign @jingxu97 @Random-Liu 

/sig node
/kind bug
/priority important-soon
2018-02-16 11:17:02 -08:00
Kubernetes Submit Queue
eac5bc0035
Merge pull request #57136 from k82cn/k8s_54313
Automatic merge from submit-queue (batch tested with PRs 57136, 59920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updated PID pressure node condition.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
part of #54313 

**Release note**:

```release-note
Updated PID pressure node condition
```
2018-02-16 10:35:33 -08:00
David Ashpole
e0830d0b71 reevaluate eviction thresholds after reclaim functions 2018-02-16 08:35:24 -08:00
Kubernetes Submit Queue
c105796e4b
Merge pull request #59953 from Random-Liu/fix-pod-scheduled
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix pod scheduled.

Fix `PodScheduled` condition.

The test `[k8s.io] EquivalenceCache [Serial] validates pod affinity works properly when new replica pod is scheduled` for cri-containerd is flaky.
The reason is that it assume all existing pods should have `PodScheduled` condition, but it is not the case:
```
Feb 15 15:31:01.359: INFO: with-label-390d246e-1265-11e8-beb8-0a580a3c7b55       bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:31:00 +0000 UTC  } {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:30:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-7mzxc                                     bootstrap-e2e-minion-group-hztx  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:05 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 14:17:59 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-kvrsx                                     bootstrap-e2e-minion-group-l6qw  Running         [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:24:54 +0000 UTC  } {Ready True 0001-01-01 00:00:00 +0000 UTC 2018-02-15 15:25:20 +0000 UTC  }]
Feb 15 15:31:01.359: INFO: calico-node-llwjh        
```

I'm not sure why this doesn't happen to docker. One theory is that we don't prepull image in cri-containerd, and we do start pod a bit faster for cri-containerd, and that exposes the race condition.

/cc @kubernetes/sig-node-bugs 
Signed-off-by: Lantao Liu <lantaol@google.com>



**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
none
```
2018-02-15 20:16:44 -08:00
Kubernetes Submit Queue
99c87cf679
Merge pull request #59923 from jsafrane/volumemanager-logs
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rework volume manager log levels

- all normal logs to go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged similar message not too far away).

I checked that there is no excessive spam in the log - reconciler runs every 100ms, but it does not log anything if there is nothing to do.

**What this PR does / why we need it**:
This will help us debug flakes. E2e tests do not log levels 10-12 used in volume manager

**Release note**:

```release-note
NONE
```

/sig storage
/sig node
cc: @jingxu97 @sjenning
2018-02-15 20:16:38 -08:00
Kubernetes Submit Queue
c7c5d89e32
Merge pull request #59873 from jsafrane/fix-downward-flake
Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix DownwardAPI refresh race.

WaitForAttachAndMount should mark only pod in DesiredStateOfWorldPopulator (DSWP) and DSWP should mark the volume to be remounted only when the new pod has been processed.

Otherwise DSWP and reconciler race who gets the new pod first. If it's reconciler, then DownwardAPI and Projected volumes of the pod are not refreshed with new content and they are updated after the next periodic sync (60-90 seconds).

Fixes #59813 

/assign @jingxu97 @saad-ali 
/sig storage
/sig node

```release-note
None
```
2018-02-15 20:16:32 -08:00
Kubernetes Submit Queue
bfdd94c6a0
Merge pull request #59170 from cofyc/fix_kubelet_volume_metrics
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix kubelet PVC stale metrics

**What this PR does / why we need it**:

Volumes on each node changes, we should not only add PVC metrics into
gauge vector. It's better use a collector to collector metrics from internal
stats.

Currently, if a PV (bound to a PVC `testpv`)  is attached and used by node A, then migrated to node B or just deleted from node A later.  `testpvc` metrics will not disappear from kubelet on node A. After a long running time, `kubelet` process will keep a lot of stale volume metrics in memory.

For these dynamic metrics, it's better to use a collector to collect metrics from a data source (`StatsProvider` here), like [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) scraping metrics from kube-apiserver.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/57686

**Special notes for your reviewer**:

**Release note**:

```release-note
Fix kubelet PVC stale metrics
```
2018-02-15 18:44:08 -08:00
David Ashpole
b259543985 collect ephemeral storage capacity on initialization 2018-02-15 17:33:22 -08:00
Lantao Liu
f69b4e9262 Fix pod scheduled.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-02-16 00:51:20 +00:00
JulienBalestra
493f335830 kubelet: revert the get pod status 2018-02-15 22:24:35 +01:00
Kubernetes Submit Queue
c03edcc58e
Merge pull request #53833 from mtaufen/kubeletconfig-to-beta
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Graduate kubeletconfig API group to beta

Regarding https://github.com/kubernetes/features/issues/281, this PR moves the kubeletconfig API group to beta. 

After #53088, the KubeletConfiguration type should not contain any deprecated or experimental fields, and we should not have to remove any more fields from the type before graduating it to beta. 

We need the community to double check for two things, however:
1. Are there any fields currently in the KubeletConfiguration type that you were going to mark deprecated this quarter, but haven't yet?
2. Are there any fields currently in the KubeletConfiguration type that are experimental or alpha, but were not explicitly denoted as such?

Please comment on this PR if you can answer "yes" to either of those two questions. Please cc anyone with a stake in the kubeletconfig API, so we get as much coverage as possible.

/cc @thockin @dchen1107 @Random-Liu @yujuhong @dashpole @tallclair @vishh @abw @freehan @dnardo @bowei @MrHohn @luxas @liggitt @ncdc @derekwaynecarr @mikedanese 

@kubernetes/sig-network-pr-reviews, @kubernetes/sig-node-pr-reviews 

```release-note
action required: The `kubeletconfig` API group has graduated from alpha to beta, and the name has changed to `kubelet.config.k8s.io`. Please use `kubelet.config.k8s.io/v1beta1`, as `kubeletconfig/v1alpha1` is no longer available. 
```

**TODO:**
- [x] Move experimental/non-gated-alpha/soon-to-be-deprecated fields to `KubeletFlags`
  - [x] #53088
  - [x] #54154
  - [x] #54160
  - [x] #55562
  - [x] #55983
  - [x] #57851
- [x] Lift embedded structure out of strings
  - [x] #53025
  - [x] #54643
  - [x] #54823
  - [x] #55254
- [x] Resolve relative paths against the location config files are loaded from
  - [x] #55648 
- [x] Rename to `kubelet.config.k8s.io`
- [x] Comments
  - [x] Make sure existing comments at least read sensibly.
  - [x] Note default values in comments on the versioned struct.
  - [x] Remove any reference to default values in comments on the internal struct.
- [x] Most fields should be `+optional` and `omitempty`. Add where necessary. ~Where omitted, explicitly comment.~ Edit: We should not distinguish between nil and empty, see below items.
- [x] Ensure defaults are specified via `pkg/kubelet/apis/kubelet.config.k8s.io/v1beta1/defaults.go`, not `cmd/kubelet/app/options/options.go`.
  - [x] #57770
- [x] Ensure kubeadm does not persist v1alpha1 KubeletConfiguration objects (or feature-gates this functionality)
- [x] Don't make a distinction between empty and nil, because of #43203.
  - [x] #59515
  - [x] #59681
- [x] Take the opportunity to fix insecure Kubelet defaults @tallclair 
  - [x] #59666
- [x] Remove CAdvisorPort from KubeletConfiguration wrt #56523.
  - [x] #59580
- [x] Hide `ConfigTrialDuration` until we're more sure what to do with it.
   - [x] #59628
- [x] Fix `// default: x` comments after rebasing on recent changes.
2018-02-15 11:06:40 -08:00
Kubernetes Submit Queue
b099e91920
Merge pull request #59905 from mtaufen/dkcfg-config-ok-kubelet-config-ok
Automatic merge from submit-queue (batch tested with PRs 59353, 59905, 53833). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Rename ConfigOK to KubeletConfigOk

This is a more accurate name for the condition, as it describes the
status of the Kubelet's configuration.

Also cleans up capitalization of internal names.

```release-note
The ConfigOK node condition has been renamed to KubeletConfigOk.
```
2018-02-15 11:06:36 -08:00
Jan Safranek
e260096392 Rework volume manager log levels
- all normal logs to go to level 4
- too frequent / duplicate logs go to level 5 (e.g. when something else logged similar message not too far away).
2018-02-15 16:33:17 +01:00
Kubernetes Submit Queue
7377c5911a
Merge pull request #59892 from JulienBalestra/revert-host-ip
Automatic merge from submit-queue (batch tested with PRs 59877, 59886, 59892). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kubelet: revert the status HostIP behavior

**What this PR does / why we need it**:

This PR partially revert #57106 to fix #59889.

The PR #57106 changed the behavior of `generateAPIPodStatus` when a **kubeClient** is nil.

**Release note**:
```release-note
NONE
```
2018-02-15 00:01:35 -08:00
Kubernetes Submit Queue
4b147e0361
Merge pull request #59588 from jiayingz/v1beta1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Create pkg/kubelet/apis/deviceplugin/v1beta1 directory.

The proto stays the same as v1alpha. Only changes Version in
constants.go to "v1beta1" and the BUILD file to pick up the new dir.

```release-note
Adding pkg/kubelet/apis/deviceplugin/v1beta1 API.
```
2018-02-14 22:51:32 -08:00
Michael Taufen
d8cc440dd6 Rename ConfigOK to KubeletConfigOk
This is a more accurate name for the condition, as it describes the
status of the Kubelet's configuration.

Also cleans up capitalization of internal names.
2018-02-14 19:36:52 -08:00
Michael Taufen
9ebaf5e7d2 Move the kubeletconfig v1alpha1 API to beta, rename to kubelet.config.k8s.io 2018-02-14 17:30:22 -08:00
JulienBalestra
2130f5bc55 kubelet: revert the status HostIP behavior 2018-02-14 23:38:09 +01:00
Kai Chen
9ca0d32fbb Improve comments for kubelet 2018-02-14 12:03:46 -08:00
Kubernetes Submit Queue
63380d12db
Merge pull request #59666 from mtaufen/kc-secure-componentconfig-defaults
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Secure Kubelet's componentconfig defaults while maintaining CLI compatibility

This updates the Kubelet's componentconfig defaults, while applying the legacy defaults to values from options.NewKubeletConfiguration(). This keeps defaults the same for the command line and improves the security of defaults when you load config from a file.

See: https://github.com/kubernetes/kubernetes/issues/53618
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166669931

Also moves EnableServer to KubeletFlags, per @tallclair's comments on #53833.

We should find way of generating documentation for config file defaults, so that people can easily look up what's different from flags.

```release-note
Action required: Default values differ between the Kubelet's componentconfig (config file) API and the Kubelet's command line. Be sure to review the default values when migrating to using a config file.
```
2018-02-14 10:09:13 -08:00
Kubernetes Submit Queue
79b1589657
Merge pull request #59788 from Lihua93/fix/typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix typos

**What this PR does / why we need it**:
To fix some typos
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-14 08:51:31 -08:00
Jan Safranek
c232a0165a Fix DownwardAPI refresh race.
WaitForAttachAndMount should mark only pod in DesiredStateOfWorldPopulator (DSWP)
and DSWP should mark the volume to be remounted only when the new pod has been
processed.

Otherwise DSWP and reconciler race who gets the new pod first. If it's reconciler,
then DownwardAPI and Projected volumes of the pod are not refreshed with new
content and they are updated after the next periodic sync (60-90 seconds).
2018-02-14 16:54:25 +01:00
Kubernetes Submit Queue
a129c0f984
Merge pull request #59832 from shyamjvs/fix-fake-docker-client-ip-collision
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fake docker-client assigns random IPs to containers

Fixes https://github.com/kubernetes/kubernetes/issues/59823

/cc @wojtek-t @Random-Liu
2018-02-14 07:08:24 -08:00
Shyam Jeedigunta
517301df21
Fake docker-client assigns random IPs to containers 2018-02-14 14:28:52 +01:00
Kubernetes Submit Queue
58dcf3c533
Merge pull request #59489 from pohly/master-tmpdir
Automatic merge from submit-queue (batch tested with PRs 59489, 59716). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

devicemanager testing: dynamically choose tmp dir

This avoids the test issue #59488 that I was running into.

I believe I have a reasonable explanation for the race condition in that issue (TLDR: it's probably part of the gRPC API and k8s can only avoid the issue until a proper solution gets worked out together with gRPC), therefore I suggest to merge this PR now both because it avoids the issue and because using fixed tmp directories is something that should be avoided anyway.

/assign @jiayingz
2018-02-14 00:14:31 -08:00
Michael Taufen
c1e34bc725 Secure Kubelet's componentconfig defaults while maintaining CLI compatibility
This updates the Kubelet's componentconfig defaults, while applying the
legacy defaults to values from options.NewKubeletConfiguration().
This keeps defaults the same for the command line and improves the
security of defaults when you load config from a file.

See: https://github.com/kubernetes/kubernetes/issues/53618
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166669931
2018-02-13 18:10:15 -08:00
Kubernetes Submit Queue
7b678dc403
Merge pull request #57106 from JulienBalestra/kubelet-update-local-pods
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Kubelet status manager sync the status of local Pods

**What this PR does / why we need it**:

In the kubelet, when using `--pod-manifest-path` the kubelet creates static pods but doesn't update the status accordingly in the `PodList`.

This PR fixes the incorrect status of each Pod in the kubelet's `PodList`.

This is the setup used to reproduce the issue:

**manifest**:

```bash
cat ~/kube/staticpod.yaml
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: nginx
  name: nginx
  namespace: default
spec:
  hostNetwork: true
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: os-release
      mountPath: /usr/share/nginx/html/index.html
      readOnly: true

  volumes:
  - name: os-release
    hostPath:
      path: /etc/os-release
```


**kubelet**:

```bash
~/go/src/k8s.io/kubernetes/_output/bin/kubelet --pod-manifest-path ~/kube/ --cloud-provider="" --register-node --kubeconfig kubeconfig.yaml
```


You can observe this by querying the kubelet API `/pods`:

```bash
curl -s 127.0.0.1:10255/pods | jq .
```

```json
{
  "kind": "PodList",
  "apiVersion": "v1",
  "metadata": {},
  "items": [
    {
      "metadata": {
        "name": "nginx-nodeName",
        "namespace": "default",
        "selfLink": "/api/v1/namespaces/default/pods/nginx-nodeName",
        "uid": "0fdfa64c73d9de39a9e5c05ef7967e72",
        "creationTimestamp": null,
        "labels": {
          "app": "nginx"
        },
        "annotations": {
          "kubernetes.io/config.hash": "0fdfa64c73d9de39a9e5c05ef7967e72",
          "kubernetes.io/config.seen": "2017-12-12T18:42:46.088157195+01:00",
          "kubernetes.io/config.source": "file"
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "os-release",
            "hostPath": {
              "path": "/etc/os-release",
              "type": ""
            }
          }
        ],
        "containers": [
          {
            "name": "nginx",
            "image": "nginx:latest",
            "resources": {},
            "volumeMounts": [
              {
                "name": "os-release",
                "readOnly": true,
                "mountPath": "/usr/share/nginx/html/index.html"
              }
            ],
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "ClusterFirst",
        "nodeName": "nodeName",
        "hostNetwork": true,
        "securityContext": {},
        "schedulerName": "default-scheduler",
        "tolerations": [
          {
            "operator": "Exists",
            "effect": "NoExecute"
          }
        ]
      },
      "status": {
        "phase": "Pending",
        "conditions": [
          {
            "type": "PodScheduled",
            "status": "True",
            "lastProbeTime": null,
            "lastTransitionTime": "2017-12-12T17:42:51Z"
          }
        ]
      }
    }
  ]
}
```

The status of the nginx `Pod` will remain in **Pending** state phase.

```bash
curl -I 127.0.0.1
HTTP/1.1 200 OK
```

It's reported as expected on the apiserver side:

```bash
kubectl get po --all-namespaces -w

NAMESPACE   NAME                    READY     STATUS    RESTARTS   AGE
default     nginx-nodeName   0/1       Pending   0          0s
default     nginx-nodeName   1/1       Running   0          2s
```


It doesn't work either with a standalone kubelet:

```bash
~/go/src/k8s.io/kubernetes/_output/bin/kubelet --pod-manifest-path ~/kube/ --cloud-provider="" --register-node false
```

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2018-02-13 16:46:33 -08:00
Kubernetes Submit Queue
9de5839944
Merge pull request #59681 from mtaufen/kc-empty-eviction-hard
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Ignore 0% and 100% eviction thresholds

Primarily, this gives a way to explicitly disable eviction, which is
necessary to use omitempty on EvictionHard.
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166672137

As justification for this approach, neither 0% nor 100% make sense as
eviction thresholds; in the "less-than" case, you can't have less than
0% of a resource and 100% perpetually evicts; in the
"greater-than" case (assuming we ever add a resource with this
semantic), the reasoning is the reverse (not more than 100%, 0%
perpetually evicts).

```release-note
Eviction thresholds set to 0% or 100% are now ignored.
```
2018-02-13 09:48:11 -08:00
Lihua Tang
cad52f6576 Fix typos 2018-02-13 16:17:37 +08:00
Kubernetes Submit Queue
4afdc33c57
Merge pull request #56454 from mtanino/volumehandler-refactor
Automatic merge from submit-queue (batch tested with PRs 59767, 56454, 59237, 59730, 55479). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Block Volume: Refactor volumehandler in operationexecutor

**What this PR does / why we need it**:

Based on discussion with @saad-ali at #51494, we need refactor volumehandler in operationexecutor
for Block Volume feature. We don't need to add volumehandler as separated object.

```
VolumeHandler does not need to be a separate object that is constructed inline like this. 
You can create a new operation, e.g. UnmountOperation to which you pass the spec,
and it can return either a UnmountVolume or UnmapVolume.
```

**Which issue(s) this PR fixes** : no related issue.

**Special notes for your reviewer**:

@saad-ali @msau42 

**Release note**:

```release-note
NONE
```
2018-02-12 15:44:33 -08:00
Michael Taufen
21dbbe14f2 Ignore 0% and 100% eviction thresholds
Primarily, this gives a way to explicitly disable eviction, which is
necessary to use omitempty on EvictionHard.
See: https://github.com/kubernetes/kubernetes/pull/53833#discussion_r166672137

As justification for this approach, neither 0% nor 100% make sense as
eviction thresholds; in the "less-than" case, you can't have less than
0% of a resource and 100% perpetually evicts; in the
"greater-than" case (assuming we ever add a resource with this
semantic), the reasoning is the reverse (not more than 100%, 0%
perpetually evicts).
2018-02-12 14:13:00 -08:00
Seth Jennings
9ab9ddeb19 kubelet: check for illegal phase transition 2018-02-12 15:28:10 -06:00
Yecheng Fu
fecff55c59 Fix kubelet PVC metrics using a volume stats collector.
Volumes on each node changes, we should not only add PVC metrics into
gauge vector. It's better use a collector to collector metrics from
stats.
2018-02-11 23:48:06 +08:00
Kubernetes Submit Queue
317853c90c
Merge pull request #59464 from dixudx/fix_all_typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix all the typos across the project

**What this PR does / why we need it**:
There are lots of typos across the project. We should avoid small PRs on fixing those annoying typos, which is time-consuming and low efficient.

This PR does fix all the typos across the project currently. And with #59463, typos could be avoided when a new PR gets merged.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
/sig testing
/area test-infra
/sig release
/cc @ixdy 
/assign @fejta 

**Release note**:

```release-note
None
```
2018-02-10 22:12:45 -08:00
Di Xu
48388fec7e fix all the typos across the project 2018-02-11 11:04:14 +08:00
Kubernetes Submit Queue
9d33926d5c
Merge pull request #59628 from mtaufen/kc-hide-trial-duration
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bury KubeletConfiguration.ConfigTrialDuration for now

Based on discussion in https://github.com/kubernetes/kubernetes/pull/53833/files#r166669046, this PR chooses not to expose a knob for the trial duration yet. It is unclear exactly which shape this functionality should take in the API.

```release-note
The alpha KubeletConfiguration.ConfigTrialDuration field is no longer available.
```
2018-02-10 14:03:08 -08:00
mtanino
973583e781 Refactor volumehandler in operationexecutor 2018-02-09 15:39:55 -05:00
Patrick Ohly
0d828e061b devicemanager testing: time out sooner
Each individual step should not take longer than a second.
Suggest by Vikas Choudhary (https://github.com/kubernetes/kubernetes/pull/59489#discussion_r167205672).
2018-02-09 20:51:54 +01:00
Kubernetes Submit Queue
76e6da25fa
Merge pull request #59481 from rojkov/dm-unittests
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

devicemanager: increase code coverege of endpoint's unit test

Particularly cover the code path when an unhealthy device
becomes healthy.
2018-02-09 10:35:22 -08:00
Patrick Ohly
1325c2f8be devicemanager testing: dynamically choose tmp dir
Hard-coding the tests to use /tmp/device_plugin for sockets is
problematic because it prevents running tests in parallel on the same
machine (perhaps because there are multiple developers, perhaps
because testing is done independently on different code checkouts).
/tmp/device_plugin also was not removed after testing.

This is probably not that relevant. But more importantly, this change
also fixes https://github.com/kubernetes/kubernetes/issues/59488.
"make test" failed in TestDevicePluginReRegistration because something
removed /tmp/device_plugin/device-plugin.sock while something else
tried to connect to it:

2018/02/07 14:34:39 Starting to serve on /tmp/device_plugin/device-plugin.sock
[pid 29568] connect(14, {sa_family=AF_UNIX, sun_path="/tmp/device_plugin/server.sock"}, 33) = 0
[pid 29568] unlinkat(AT_FDCWD, "/tmp/device_plugin/server.sock", 0) = 0
[pid 29568] unlinkat(AT_FDCWD, "/tmp/device_plugin/device-plugin.sock", 0) = 0
[pid 29568] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=29568, si_uid=1000} ---
[pid 29568] connect(6, {sa_family=AF_UNIX, sun_path="/tmp/device_plugin/device-plugin.sock"}, 40) = -1 ENOENT (No such file or directory)
E0207 14:34:39.961321   29568 endpoint.go:117] listAndWatch ended unexpectedly for device plugin mock with error rpc error: code = Unavailable desc = transport is closing
strace: Process 29623 attached
[pid 29574] connect(3, {sa_family=AF_UNIX, sun_path="/tmp/device_plugin/device-plugin.sock"}, 40) = -1 ENOENT (No such file or directory)
[pid 29623] connect(3, {sa_family=AF_UNIX, sun_path="/tmp/device_plugin/device-plugin.sock"}, 40) = -1 ENOENT (No such file or directory)
[pid 29574] connect(3, {sa_family=AF_UNIX, sun_path="/tmp/device_plugin/device-plugin.sock"}, 40) = -1 ENOENT (No such file or directory)
E0207 14:34:49.961324   29568 endpoint.go:60] Can't create new endpoint with path /tmp/device_plugin/device-plugin.sock err failed to dial device plugin: context deadline exceeded
E0207 14:34:49.961390   29568 manager.go:340] Failed to dial device plugin with request &RegisterRequest{Version:v1alpha2,Endpoint:device-plugin.sock,ResourceName:fake-domain/resource,}: failed to dial device plugin: context deadline exceeded
panic: test timed out after 2m0s

It's not entirely certain which code was to blame for this unlinkat()
calls (perhaps some cleanup code from a previous test running in a
goroutine?) but this no longer happened after switching to per-test
socket directories.
2018-02-09 14:01:13 +01:00
Michael Taufen
9c8eab96d0 Bury KubeletConfiguration.ConfigTrialDuration for now 2018-02-08 21:41:21 -08:00
Kubernetes Submit Queue
b24bc2cfdc
Merge pull request #59475 from Random-Liu/use-mountpoint
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add mountpoint as CRI image filesystem storage identifier.

Fixes https://github.com/kubernetes/kubernetes/issues/57356.

This PR changes CRI to use mountpoint as storage identifier. See https://github.com/kubernetes/kubernetes/issues/57356#issuecomment-363608733.

Note that:
1) This doesn't work with devicemapper for now. Please feel free to propose change for device mapper, we can discuss more about this after this first version is merged. @mrunalp @runcom
2) `mountpoint` is added as new field in `StorageIdentifier` now. After https://github.com/kubernetes/kubernetes/pull/58973 is merged, we can remove the UUID field in `v1alpha2`. 

/cc @yujuhong @feiskyer @yguo0905 @dashpole @mikebrow @abhi @kubernetes/sig-node-api-reviews 

**Release note**:

```release-note
CRI starts using moutpoint as image filesystem identifier instead of UUID.
```
2018-02-08 19:41:57 -08:00
Kubernetes Submit Queue
d6625f857a
Merge pull request #58177 from jingxu97/Jan/reconstruct
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Redesign and implement volume reconstruction work

This PR is the first part of redesign of volume reconstruction work. The detailed design information is https://github.com/kubernetes/community/pull/1601

The changes include
1. Remove dependency on volume spec stored in actual state for volume
cleanup process (UnmountVolume and UnmountDevice)

Modify AttachedVolume struct to add DeviceMountPath so that volume
unmount operation can use this information instead of constructing from
volume spec

2. Modify reconciler's volume reconstruction process (syncState). Currently workflow
is when kubelet restarts, syncState() is only called once before
reconciler starts its loop.
a. If volume plugin supports reconstruction, it will use the
reconstructed volume spec information to update actual state as before.
b. If volume plugin cannot support reconstruction, it will use the
scanned mount path information to clean up the mounts.

In this PR, all the plugins still support reconstruction (except
glusterfs), so reconstruction of some plugins will still have issues.
The next PR will modify those plugins that cannot support reconstruction
well.

This PR addresses issue #52683
2018-02-08 18:21:34 -08:00
Kubernetes Submit Queue
98abac70ce
Merge pull request #59598 from Random-Liu/remove-unnecessary-summary-call
Automatic merge from submit-queue (batch tested with PRs 59344, 59595, 59598). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove unnecessary summary api call.

Summary API call is not as cheap as we think. Especially for CRI container runtime, it means:
1) Extra cgroups parsing (because cpu/memory are considered to be on demand);
2) Extra grpc encoding/decoding `ListPodSandboxes`, `ListContainers`, `ListContainerStats`;

I don't think it is necessary to call summary twice inside the same function.
/cc @kubernetes/sig-node-pr-reviews @dashpole @jingxu97 

Signed-off-by: Lantao Liu <lantaol@google.com>

**Release note**:

```release-note
none
```
2018-02-08 18:06:37 -08:00
Jiaying Zhang
90c2098103 Create pkg/kubelet/apis/deviceplugin/v1beta1 directory.
The proto stays the same as v1alpha. Only changes Version in
constants.go to "v1beta1" and the BUILD file to pick up the new dir.
2018-02-08 17:04:43 -08:00