Commit Graph

4252 Commits

Author SHA1 Message Date
Pengfei Ni
81b9064ca4 Fix typo of defualt 2017-02-11 22:28:24 +08:00
Random-Liu
8177030957 Change timeout for ExecSync, RunPodSandbox and PullImage. 2017-02-10 16:09:19 -08:00
Kubernetes Submit Queue
e9de1b0221 Merge pull request #40992 from k82cn/rm_empty_line
Automatic merge from submit-queue (batch tested with PRs 41236, 40992)

Removed unnecessary empty line.
2017-02-10 05:38:42 -08:00
Kubernetes Submit Queue
8188c3cca4 Merge pull request #40796 from wojtek-t/use_node_ttl_in_secret_manager
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)

Implement TTL controller and use the ttl annotation attached to node in secret manager

For every secret attached to a pod as a volume, the Kubelet tries to refresh it every sync period. Currently the Kubelet has a TTL cache of its pods' secrets, and the TTL is set to 1 minute. In the large clusters we are targeting (5k nodes, 30 pods/node), each pod has a secret associated with the ServiceAccount from its namespace. With a large enough number of namespaces (where on each node (almost) every pod is from a different namespace), that results in ~30 GETs from one node every minute to refresh all secrets, which gives ~2500 QPS of GET secrets to the apiserver.

Apiserver cannot keep up with it very easily.

The desired solution would be to watch for secret changes, but for security reasons we don't want a node watching all secrets, and it is not possible for now to watch only the secrets attached to pods on a given node.

So as a temporary solution, we are introducing an annotation that suggests to the kubelet what TTL to use for secrets in its cache, and a very simple controller that sets this annotation based on the cluster size (the larger the cluster, the bigger the TTL).
This workaround means that only very local changes are needed in the Kubelet, the new controller is simple and well separated, and once watching "my secrets" becomes possible it will be easy to remove the workaround and switch to that. It will also allow us to reach our scalability goals.

@dchen1107 @thockin @liggitt
2017-02-10 00:04:44 -08:00
Kubernetes Submit Queue
76b39431d3 Merge pull request #41147 from derekwaynecarr/improve-eviction-logs
Automatic merge from submit-queue (batch tested with PRs 41074, 41147, 40854, 41167, 40045)

Add debug logging to eviction manager

**What this PR does / why we need it**:
This PR adds debug logging to eviction manager.

It helps users understand when and why the eviction manager does or does not make decisions, which makes it easier to gather information during support.
2017-02-09 17:41:41 -08:00
Kubernetes Submit Queue
f5c07157a8 Merge pull request #41092 from yujuhong/cri-docker1_10
Automatic merge from submit-queue (batch tested with PRs 41037, 40118, 40959, 41084, 41092)

CRI node e2e: add tests for docker 1.10
2017-02-09 16:44:44 -08:00
David Ashpole
b224f83c37 Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" 2017-02-09 08:45:18 -08:00
Wojciech Tyczynski
6c0535a939 Use secret TTL annotation in secret manager 2017-02-09 13:53:32 +01:00
Kubernetes Submit Queue
42d8d4ca88 Merge pull request #40948 from freehan/cri-hostport
Automatic merge from submit-queue (batch tested with PRs 40873, 40948, 39580, 41065, 40815)

[CRI] Enable Hostport Feature for Dockershim

Commits:
1. Refactor common hostport utility logic and add more tests

2. Add HostportManager which can ADD/DEL hostports instead of a complete sync.

3. Add an interface for retrieving the portMappings information of a pod to the Network Host interface.
Implement the GetPodPortMappings interface in dockerService.

4. Teach kubenet to use HostportManager
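
The difference between per-pod ADD/DEL and a complete sync can be pictured with a small in-memory stand-in. Everything below is a hypothetical sketch: the types, method names, and bookkeeping are assumptions made for illustration, and the manager added by this PR programs real host ports rather than a Go map.

```go
package main

import "fmt"

// PortMapping is a hypothetical, simplified description of one host port.
type PortMapping struct {
	HostPort      int
	ContainerPort int
	Protocol      string // "tcp" or "udp"
}

// hostKey identifies a host port by number and protocol.
type hostKey struct {
	port     int
	protocol string
}

// HostportManager is an in-memory stand-in that tracks which pod holds
// which host port, so one pod can be added or removed without touching others.
type HostportManager struct {
	owners map[hostKey]string // host port -> name of the pod holding it
}

// NewHostportManager returns an empty manager.
func NewHostportManager() *HostportManager {
	return &HostportManager{owners: make(map[hostKey]string)}
}

// Add claims every mapping for pod, or claims nothing and returns an error
// when one of the ports is already held by a different pod.
func (m *HostportManager) Add(pod string, mappings []PortMapping) error {
	for _, pm := range mappings {
		k := hostKey{port: pm.HostPort, protocol: pm.Protocol}
		if owner, taken := m.owners[k]; taken && owner != pod {
			return fmt.Errorf("host port %d/%s already used by pod %s", pm.HostPort, pm.Protocol, owner)
		}
	}
	for _, pm := range mappings {
		m.owners[hostKey{port: pm.HostPort, protocol: pm.Protocol}] = pod
	}
	return nil
}

// Remove releases only the ports held by pod; other pods are untouched.
func (m *HostportManager) Remove(pod string) {
	for k, owner := range m.owners {
		if owner == pod {
			delete(m.owners, k)
		}
	}
}
```

The point of the sketch is that `Add` and `Remove` each touch only one pod's entries, whereas a complete sync would have to recompute the state for every pod on the node.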
2017-02-08 14:14:43 -08:00
Derek Carr
0171121486 Add debug logging to eviction manager 2017-02-08 15:01:12 -05:00
Yu-Ju Hong
f96611ac45 dockershim: set the default cgroup driver 2017-02-08 10:22:19 -08:00
Minhan Xia
be9eca6b51 teach kubenet to use hostport_manager 2017-02-08 09:35:04 -08:00
Minhan Xia
bd05e1af2b add portmapping getter into network host 2017-02-08 09:35:04 -08:00
Minhan Xia
8e7219cbb4 add hostport manager 2017-02-08 09:26:52 -08:00
Minhan Xia
aabdaa984f refactor hostport logic 2017-02-08 09:26:52 -08:00
David Ashpole
67cb2704c5 delete volumes before pod deletion 2017-02-08 07:34:49 -08:00
Kubernetes Submit Queue
c6f30d3750 Merge pull request #41105 from Random-Liu/fix-kuberuntime-log
Automatic merge from submit-queue (batch tested with PRs 38796, 40823, 40756, 41083, 41105)

Let ReadLogs return when there is a read error.

Fixes a bug in kuberuntime log.

Today, @yujuhong found that once we cancel `kubectl logs -f` with `Ctrl+C`, kuberuntime will keep complaining:
```
27939 kuberuntime_logs.go:192] Failed with err write tcp 10.240.0.4:10250->10.240.0.2:53913: write: broken pipe when writing log for log file "/var/log/pods/5bb76510-ed71-11e6-ad02-42010af00002/busybox_0.log": &{timestamp:{sec:63622095387 nsec:625309193 loc:0x484c440} stream:stdout log:[84 117 101 32 70 101 98 32 32 55 32 50 48 58 49 54 58 50 55 32 85 84 67 32 50 48 49 55 10]}
```

This is because kuberuntime keeps writing to the connection even though it is already closed. Instead, kuberuntime should return and report the error whenever a write fails.

Ref the [docker code](3a4ae1f661/pkg/stdcopy/stdcopy.go (L159-L167))

I'm still creating the cluster and verifying this fix. Will post the result here after that.

/cc @yujuhong @kubernetes/sig-node-bugs
2017-02-08 00:49:51 -08:00
Kubernetes Submit Queue
1b7bdde40b Merge pull request #38796 from chentao1596/kubelet-cni-log
Automatic merge from submit-queue (batch tested with PRs 38796, 40823, 40756, 41083, 41105)

kubelet/network-cni-plugin: modify the log's info

**What this PR does / why we need it**:

  Checking the startup logs of kubelet, I can always find an error like this:
    "E1215 10:19:24.891724    2752 cni.go:163] error updating cni config: No networks found in /etc/cni/net.d"

 It appears whether or not I use the cni network plugin.
After analyzing the code, I think it should be a warning log, because it does not trigger any action such as exit or abort, and is simply ignored when no valid plugins exist.

 thank you!
2017-02-08 00:49:44 -08:00
Kubernetes Submit Queue
843e6d1cc3 Merge pull request #40770 from apilloud/clientset_interface
Automatic merge from submit-queue (batch tested with PRs 41103, 41042, 41097, 40946, 40770)

Use Clientset interface in KubeletDeps

**What this PR does / why we need it**:
This replaces the Clientset struct with the equivalent interface for the KubeClient injected via KubeletDeps. This is useful for testing and for accessing the Node and Pod status event stream without an API server.
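
Why an interface helps here can be shown with a toy example. The names below are deliberately simplified assumptions: `NodeStatusClient` is a hypothetical one-method stand-in for the much larger clientset interface, and this `KubeletDeps` carries only the one field discussed above.

```go
package main

// NodeStatusClient is a hypothetical, narrowed stand-in for the clientset
// interface; the real interface exposes far more than one method.
type NodeStatusClient interface {
	UpdateNodeStatus(node, status string) error
}

// KubeletDeps here holds only the injected client, typed as an interface
// so that any implementation can be supplied.
type KubeletDeps struct {
	KubeClient NodeStatusClient
}

// fakeClient records status updates in memory instead of calling an API server.
type fakeClient struct {
	updates []string
}

func (f *fakeClient) UpdateNodeStatus(node, status string) error {
	f.updates = append(f.updates, node+"="+status)
	return nil
}

// reportReady depends only on the interface, so it works with the fake
// as well as with a real client.
func reportReady(deps KubeletDeps, node string) error {
	return deps.KubeClient.UpdateNodeStatus(node, "Ready")
}
```

A test can then pass `KubeletDeps{KubeClient: &fakeClient{}}` and inspect the recorded updates, which would not be possible if the field required the concrete struct.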

**Special notes for your reviewer**:
Follow up to #4907

**Release note**:

`NONE`
2017-02-07 22:12:39 -08:00
Kubernetes Submit Queue
51b8eb9424 Merge pull request #40946 from yujuhong/docker_sep
Automatic merge from submit-queue (batch tested with PRs 41103, 41042, 41097, 40946, 40770)

dockershim: set security option separators based on the docker version

Also add a version cache to avoid hitting the docker daemon frequently.

This is part of #38164
2017-02-07 22:12:37 -08:00
Random-Liu
65190e2a72 Let ReadLogs return when there is a read error. 2017-02-07 15:43:48 -08:00
Yu-Ju Hong
e66dd63b05 Add OWNERS to the dockertools package 2017-02-07 11:31:37 -08:00
Yu-Ju Hong
d8e29e782f dockershim: set security option separators based on the docker version
Also add a version cache to avoid hitting the docker daemon frequently.
2017-02-07 11:06:40 -08:00
Kubernetes Submit Queue
98f97496ef Merge pull request #40903 from yujuhong/security_opts
Automatic merge from submit-queue (batch tested with PRs 40971, 41027, 40709, 40903, 39369)

Set docker opt separator correctly for SELinux options

This is based on @pmorie's commit from #40179
2017-02-06 20:57:17 -08:00
Yu-Ju Hong
05c3b8c1cf Set docker opt separator correctly for SELinux options 2017-02-06 14:47:30 -08:00
Kubernetes Submit Queue
d4bcf3ede5 Merge pull request #40951 from yujuhong/fix_cri_portforward
Automatic merge from submit-queue (batch tested with PRs 40930, 40951)

Fix CRI port forwarding

Websocket support was introduced in #33684, which broke the CRI
implementation. This change fixes it.
2017-02-06 14:27:05 -08:00
Klaus Ma
cc26fe6ee9 Removed unnecessary empty line. 2017-02-06 11:10:34 +08:00
Kubernetes Submit Queue
a777a8e3ba Merge pull request #39972 from derekwaynecarr/pod-cgroups-default
Automatic merge from submit-queue (batch tested with PRs 40289, 40877, 40879, 39972, 40942)

Rename experimental-cgroups-per-pod flag

**What this PR does / why we need it**:
1. Rename `experimental-cgroups-per-qos` to `cgroups-per-qos`
1. Update hack/local-up-cluster to match `CGROUP_DRIVER` with docker runtime if used.

**Special notes for your reviewer**:
We plan to roll this feature out in the upcoming release.  Previous node e2e runs were running with this feature on by default.  We will default this feature on for all e2es next week.

**Release note**:
```release-note
Rename --experimental-cgroups-per-qos to --cgroups-per-qos
```
2017-02-04 04:43:08 -08:00
Kubernetes Submit Queue
4796c7b409 Merge pull request #40727 from Random-Liu/handle-cri-in-place-upgrade
Automatic merge from submit-queue

CRI: Handle cri in-place upgrade

Fixes https://github.com/kubernetes/kubernetes/issues/40051.

## How does this PR restart/remove legacy containers/sandboxes?
With this PR, dockershim will convert and return legacy containers and infra containers as regular containers/sandboxes. Then we can rely on the SyncPod logic to stop the legacy containers/sandboxes, and the garbage collector to remove the legacy containers/sandboxes.

To forcibly trigger restart:
* For infra containers, we manually set `hostNetwork` to opposite value to trigger a restart (See [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kuberuntime/kuberuntime_manager.go#L389))
* For application containers, they will be restarted with the infra container.
## How does this PR avoid extra overhead when there is no legacy container/sandbox?
Because legacy containers lack some labels, listing them needs an extra `docker ps`. We should not introduce a constant performance regression for legacy container cleanup. So we added the `legacyCleanupFlag`:
* In `ListContainers` and `ListPodSandbox`, only do extra `ListLegacyContainers` and `ListLegacyPodSandbox` when `legacyCleanupFlag` is `NotDone`.
* When dockershim starts, it will check whether there are legacy containers/sandboxes.
  * If there are none, it will mark `legacyCleanupFlag` as `Done`.
  * If there are any, it will leave `legacyCleanupFlag` as `NotDone`, and start a goroutine that periodically checks whether legacy cleanup is done.
This makes sure that there is overhead only when there are legacy containers/sandboxes not cleaned up yet.
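
The flag logic described above can be sketched as follows. This is an illustration under assumptions, not the dockershim implementation: the `lister` type, its function fields, and `checkLegacyCleanup` are hypothetical, and the periodic goroutine is reduced to a method that a ticker loop would call.

```go
package main

import "sync/atomic"

// legacyCleanupFlag is NotDone (0) until cleanup is observed to be complete.
type legacyCleanupFlag struct{ done int32 }

func (f *legacyCleanupFlag) Done() bool { return atomic.LoadInt32(&f.done) == 1 }
func (f *legacyCleanupFlag) MarkDone()  { atomic.StoreInt32(&f.done, 1) }

// lister is a hypothetical stand-in for the service that lists containers.
type lister struct {
	flag        legacyCleanupFlag
	listRegular func() []string // cheap, always needed
	listLegacy  func() []string // extra `docker ps`, only while NotDone
}

// ListContainers pays for the legacy listing only while cleanup is NotDone.
func (l *lister) ListContainers() []string {
	out := append([]string{}, l.listRegular()...)
	if !l.flag.Done() {
		out = append(out, l.listLegacy()...)
	}
	return out
}

// checkLegacyCleanup would run at startup and then periodically; once no
// legacy containers remain it flips the flag and the extra listing stops.
func (l *lister) checkLegacyCleanup() {
	if len(l.listLegacy()) == 0 {
		l.flag.MarkDone()
	}
}
```

An atomic flag is used because the periodic check and the list calls would run on different goroutines.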

## Caveats
* In-place upgrade will cause kubelet to restart all running containers.
* RestartNever container will not be restarted.
* The garbage collector sometimes keeps the legacy containers for a long time if there aren't too many containers on the node. In that case, dockershim will keep performing the extra `docker ps`, which introduces overhead.
  * Manually removing all legacy containers fixes this.
  * Should we garbage collect legacy containers/sandboxes in dockershim by ourselves? /cc @yujuhong 
* Host ports will not be reclaimed, because legacy sandboxes lack a checkpoint. https://github.com/kubernetes/kubernetes/pull/39903 /cc @freehan

/cc @yujuhong @feiskyer @dchen1107 @kubernetes/sig-node-api-reviews 
**Release note**:

```release-note
We should mention the caveats of in-place upgrade in release note.
```
2017-02-03 22:17:56 -08:00
Kubernetes Submit Queue
f20b4fc67f Merge pull request #40655 from vishh/flag-gate-critical-pod-annotation
Automatic merge from submit-queue

Optionally avoid evicting critical pods in kubelet

For #40573

```release-note
When feature gate "ExperimentalCriticalPodAnnotation" is set, Kubelet will avoid evicting pods in the "kube-system" namespace that contain a special annotation - `scheduler.alpha.kubernetes.io/critical-pod`
This feature should be used in conjunction with the rescheduler to guarantee availability for critical system pods - https://kubernetes.io/docs/admin/rescheduler/
```
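
The condition in the release note can be expressed as a small predicate. This is a sketch under assumptions: `isCriticalPod` is a hypothetical helper, the feature gate is passed in as a plain boolean, and only the presence of the annotation is checked, which may differ from what the kubelet actually requires of the annotation's value.

```go
package main

const (
	// Both values are taken from the release note above.
	criticalPodAnnotation = "scheduler.alpha.kubernetes.io/critical-pod"
	systemNamespace       = "kube-system"
)

// isCriticalPod reports whether a pod should be spared from eviction:
// the gate must be on, the pod must live in kube-system, and it must
// carry the critical-pod annotation (presence only; an assumption).
func isCriticalPod(gateEnabled bool, namespace string, annotations map[string]string) bool {
	if !gateEnabled || namespace != systemNamespace {
		return false
	}
	_, ok := annotations[criticalPodAnnotation]
	return ok
}
```

With the gate off, or for a pod outside `kube-system`, the predicate returns false even when the annotation is present.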
2017-02-03 16:22:26 -08:00
Yu-Ju Hong
bb0eb3c33e Fix CRI port forwarding
Websocket support was introduced in #33684, which broke the CRI
implementation. This change fixes it.
2017-02-03 15:29:49 -08:00
Derek Carr
04a909a257 Rename cgroups-per-qos flag to not be experimental 2017-02-03 17:10:53 -05:00
Andrew Pilloud
3f8505022c Use clientset.Interface for KubeClient 2017-02-03 07:36:16 -08:00
Kubernetes Submit Queue
2bb1e75815 Merge pull request #40863 from kubernetes/sttts-big-genericapiserver-move
Automatic merge from submit-queue (batch tested with PRs 40795, 40863)

Move pkg/genericapiserver and pkg/storage to k8s.io/apiserver

approved based on #40363

These must merge first:
- [x] genericvalidation https://github.com/kubernetes/kubernetes/pull/40810
- [x] openapi https://github.com/kubernetes/kubernetes/pull/40829
- [x] episode 7 https://github.com/kubernetes/kubernetes/pull/40853
2017-02-03 03:48:50 -08:00
Kubernetes Submit Queue
0dcc04d698 Merge pull request #40795 from wojtek-t/use_caching_manager
Automatic merge from submit-queue (batch tested with PRs 40795, 40863)

Use caching secret manager in kubelet

I just found that this is in my local branch I'm using for testing, but not in master :)
2017-02-03 03:48:48 -08:00
Dr. Stefan Schimanski
6af3210d6f Update generated files 2017-02-03 08:15:46 +01:00
Dr. Stefan Schimanski
80b96b441b Mechanical import fixup: pkg/storage 2017-02-03 07:33:43 +01:00
Random-Liu
b9cf8ebe77 Update bazel. 2017-02-02 15:36:24 -08:00
Random-Liu
626680d289 Add unit test for legacy container cleanup 2017-02-02 15:36:24 -08:00
Random-Liu
14940edaad Add legacy container cleanup 2017-02-02 15:36:24 -08:00
Kubernetes Submit Queue
c84f3abca1 Merge pull request #40571 from jcbsmpsn/close-watch
Automatic merge from submit-queue

Release API watch resources when done.
2017-02-02 14:10:10 -08:00
Wojciech Tyczynski
6f3acb801d Fix bug in secret manager. 2017-02-02 21:39:26 +01:00
Vishnu Kannan
c967ab7b99 Avoid evicting critical pods in Kubelet if a special feature gate is enabled
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-02 11:32:20 -08:00
Vishnu Kannan
6ddb528446 Revert "Sort critical pods before admission"
This reverts commit b7409e0038.
2017-02-02 10:41:24 -08:00
Vishnu Kannan
ffd7dda234 Revert "Kubelet admits critical pods even under memory pressure"
This reverts commit afd676d94c.
2017-02-02 10:41:24 -08:00
Vishnu Kannan
a3ae8c2b21 Revert "assign -998 as the oom_score_adj for critical pods."
This reverts commit 53931fbce4.
2017-02-02 10:41:21 -08:00
Vishnu Kannan
b8a63537dd Revert "Don't evict static pods"
This reverts commit 1743c6b6ab.
2017-02-02 10:29:12 -08:00
Minhan Xia
51526d3103 Add checkpointHandler to DockerService 2017-02-02 10:19:34 -08:00
Minhan Xia
344d2f591f add checkpoint structures for dockershim 2017-02-02 10:18:37 -08:00
Jacob Simpson
cf31d9413e Release API watch resources when done. 2017-02-02 10:17:08 -08:00