2657 Commits

Author SHA1 Message Date
k8s-merge-robot
879c1807c8 Merge pull request #24821 from freehan/kubenetmutex
Automatic merge from submit-queue

add mutex for kubenet

I saw a bunch of weird cases in the kubenet suite. For instance, SetUpPod returns successfully, but right after that, the kubelet cannot retrieve the podIP from the podCIDR map.
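
The symptom described above is typical of unsynchronized access to shared state. A minimal sketch of the kind of fix the title suggests, guarding a map with a mutex, follows. The names `podIPStore`, `SetUp`, and `Lookup` are hypothetical and are not taken from the kubenet source.

```go
package main

import (
	"fmt"
	"sync"
)

// podIPStore is a hypothetical stand-in for the plugin's pod IP bookkeeping.
type podIPStore struct {
	mu  sync.Mutex
	ips map[string]string // pod ID -> pod IP
}

func newPodIPStore() *podIPStore {
	return &podIPStore{ips: make(map[string]string)}
}

// SetUp records the IP for a pod while holding the lock.
func (s *podIPStore) SetUp(podID, ip string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ips[podID] = ip
}

// Lookup reads the IP for a pod under the same lock, so it never
// observes a concurrent write in progress.
func (s *podIPStore) Lookup(podID string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	ip, ok := s.ips[podID]
	return ip, ok
}

func main() {
	store := newPodIPStore()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			id := fmt.Sprintf("pod-%d", i)
			store.SetUp(id, fmt.Sprintf("10.0.0.%d", i+1))
			if _, ok := store.Lookup(id); !ok {
				panic("IP missing right after SetUp")
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("all lookups succeeded")
}
```

Running a sketch like this with `go run -race` is a cheap way to check that every access to the map goes through the lock.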


cc: @dcbw @thockin 

ref: #24211
2016-05-02 13:16:23 -07:00
Clayton Coleman
fdb110c859 Fix the rest of the code 2016-04-29 17:12:10 -04:00
k8s-merge-robot
ad67363c12 Merge pull request #24362 from ArtfulCoder/hostname-field
Automatic merge from submit-queue

Promote Pod Hostname & Subdomain to fields (were annotations)

Deprecates the podHostName, subdomain and PodHostnames annotations and creates corresponding new fields for them on the PodSpec and Endpoints types.

Annotation doc: #22564
Annotation code: #20688
2016-04-29 01:06:45 -07:00
k8s-merge-robot
492762d394 Merge pull request #24911 from pmorie/kubelet-godoc
Automatic merge from submit-queue

Add godoc for some kubelet funcs

Chipping away at that old boulder

@kubernetes/sig-node
2016-04-28 14:52:45 -07:00
Paul Morie
b9f0e8c610 Add godoc for some kubelet funcs 2016-04-28 17:03:37 -04:00
Abhishek Shah
8a3ed48808 Added Hostname and Subdomain field to Pod.Spec 2016-04-28 10:56:56 -07:00
k8s-merge-robot
7a725418af Merge pull request #24622 from derekwaynecarr/pod_qos_util
Automatic merge from submit-queue

Add utility for determining qos of a pod

@vishh - per slack chat.
2016-04-28 07:26:50 -07:00
k8s-merge-robot
00308f7a9f Merge pull request #24598 from wojtek-t/improve_scheduler_predicates
Automatic merge from submit-queue

Store node information in NodeInfo

This significantly improves scheduler throughput.

On 1000-node cluster:
- empty cluster: ~70pods/s
- full cluster: ~45pods/s
The drop in throughput is mostly related to priority functions, which I will look into next (I already have PR #24095, but more changes are needed first).

This is roughly a 40% increase.
However, we still need a better understanding of the predicate functions, because in my opinion they should be even faster than they are now. I'm going to look into it next week.
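
The message does not spell out the mechanism. One plausible reading is that per-node aggregates are maintained incrementally so that predicates read a cached total instead of re-summing every pod on each scheduling attempt. The sketch below illustrates only that general idea; the `nodeInfo` and `pod` types and their fields are hypothetical, not the scheduler's actual definitions.

```go
package main

import "fmt"

// pod is a hypothetical, minimal pod description.
type pod struct {
	name     string
	milliCPU int64
}

// nodeInfo is a hypothetical per-node aggregate kept up to date as pods are added.
type nodeInfo struct {
	capacityMilliCPU  int64
	requestedMilliCPU int64 // running total, updated in addPod
	pods              []pod
}

// addPod appends the pod and updates the cached total in O(1).
func (n *nodeInfo) addPod(p pod) {
	n.pods = append(n.pods, p)
	n.requestedMilliCPU += p.milliCPU
}

// fits is a predicate that reads the cached total rather than
// iterating over every pod already placed on the node.
func (n *nodeInfo) fits(p pod) bool {
	return n.requestedMilliCPU+p.milliCPU <= n.capacityMilliCPU
}

func main() {
	n := &nodeInfo{capacityMilliCPU: 1000}
	n.addPod(pod{name: "a", milliCPU: 400})
	n.addPod(pod{name: "b", milliCPU: 300})
	fmt.Println(n.fits(pod{name: "c", milliCPU: 300}))
	fmt.Println(n.fits(pod{name: "d", milliCPU: 301}))
}
```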

@gmarek @hongchaodeng @xiang90
2016-04-28 02:17:59 -07:00
k8s-merge-robot
d0b887e4e0 Merge pull request #24595 from zhouhaibing089/httpserverclose
Automatic merge from submit-queue

Uncomment the code that caused by #19254

Fix https://github.com/kubernetes/kubernetes/issues/24546.

@lavalamp
2016-04-28 01:41:16 -07:00
k8s-merge-robot
04b70bc6c7 Merge pull request #24376 from resouer/fix-cache
Automatic merge from submit-queue

Do not update cache with so much effort

Fixes: #24298
1. Remove automatic update
2. On every check, try to get a valid value from the cache; if there is none, get the value directly from the API
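
The two steps above can be sketched as a small expiring cache with no background refresh. This is only an illustration: `versionCache`, `newVersionCache`, and the `fetch` callback are hypothetical names, and the sketch is not safe for concurrent use without an added lock.

```go
package main

import (
	"fmt"
	"time"
)

// versionCache is a hypothetical single-value cache with a time-to-live.
// Nothing refreshes it in the background; callers trigger a fetch on demand.
type versionCache struct {
	ttl     time.Duration
	fetch   func() (string, error) // assumed stand-in for the direct API call
	value   string
	expires time.Time
}

func newVersionCache(ttl time.Duration, fetch func() (string, error)) *versionCache {
	return &versionCache{ttl: ttl, fetch: fetch}
}

// Get returns the cached value while it is still valid and otherwise
// falls back to fetch, storing the fresh result for later calls.
func (c *versionCache) Get() (string, error) {
	if time.Now().Before(c.expires) {
		return c.value, nil
	}
	v, err := c.fetch()
	if err != nil {
		return "", err
	}
	c.value = v
	c.expires = time.Now().Add(c.ttl)
	return v, nil
}

func main() {
	calls := 0
	c := newVersionCache(time.Minute, func() (string, error) {
		calls++
		return "1.9.1", nil
	})
	for i := 0; i < 3; i++ {
		v, err := c.Get()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("version:", v)
	}
	fmt.Println("fetches:", calls)
}
```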

cc @Random-Liu
2016-04-28 01:00:33 -07:00
k8s-merge-robot
4c7abddc1c Merge pull request #24567 from yifan-gu/post_start_hook
Automatic merge from submit-queue

rkt: Add post-start hook support.

This adds a poll-and-timeout procedure after the pod is
started, to make sure the post-start hooks execute when the
container is actually running.
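
A poll-and-timeout loop of the kind described above might look like the following sketch. The function name `waitThenRun`, the `isRunning` and `hook` callbacks, and the chosen intervals are all assumptions for illustration, not the rkt runtime's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errTimeout = errors.New("timed out waiting for container to run")

// waitThenRun polls isRunning every interval. As soon as it reports true,
// the hook is executed; if the timeout elapses first, errTimeout is returned
// and the hook never runs.
func waitThenRun(isRunning func() bool, hook func() error, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if isRunning() {
			return hook()
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	start := time.Now()
	// Pretend the container needs about 30ms to reach the running state.
	isRunning := func() bool { return time.Since(start) > 30*time.Millisecond }
	hook := func() error {
		fmt.Println("post-start hook ran")
		return nil
	}
	err := waitThenRun(isRunning, hook, 10*time.Millisecond, time.Second)
	fmt.Println("err:", err)
}
```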

This is a temporary workaround for implementing post-hooks;
a long-term solution is to use lifecycle events to trigger
those hooks, see https://github.com/kubernetes/kubernetes/issues/23084.

Also this fixes a bug of getting container ID for a non-running
container when running pre-stop hook.


cc @sjpotter @euank @kubernetes/sig-node
2016-04-27 11:14:35 -07:00
k8s-merge-robot
7e430f543b Merge pull request #24545 from swagiaal/rename-cleaner-tuple
Automatic merge from submit-queue

Rename cleanerTuple to cleaner

Rename cleanerTuple to cleaner.
This is a follow up to address: https://github.com/kubernetes/kubernetes/pull/19503#discussion_r49538769

@saad-ali
2016-04-27 09:51:26 -07:00
Harry Zhang
d6f26b68bc Use expiration cache for version check 2016-04-27 05:42:50 -04:00
Minhan Xia
c8470c49ac add mutex for kubenet 2016-04-26 13:58:10 -07:00
k8s-merge-robot
55cb7cceb3 Merge pull request #23632 from stefwalter/parse-repository-tag-removed
Automatic merge from submit-queue

Fix use of docker removed ParseRepositoryTag() function

Docker has removed the ParseRepositoryTag() function in
leading to failures using the kubernetes Go client API.

Failure:

```
../k8s.io/kubernetes/pkg/util/parsers/parsers.go:30: undefined: parsers.ParseRepositoryTag
```
2016-04-26 09:49:25 -07:00
k8s-merge-robot
a586177360 Merge pull request #23740 from dcbw/kubenet-shaper
Automatic merge from submit-queue

kubenet: hook pod bandwidth resources up to shaper

@bprashanth @thockin Last bit for shaping.
2016-04-25 22:15:42 -07:00
k8s-merge-robot
cf38d68734 Merge pull request #23595 from vishh/image-accounting
Automatic merge from submit-queue

Collect and expose runtime's image storage usage via Kubelet's /stats/summary endpoint

This information is useful to users since docker images are typically not stored on the root filesystem.

Kubelet will also consume this feature in the future to decide if evicting images will help with disk usage on the nodes.

cc @kubernetes/sig-node
2016-04-25 21:34:30 -07:00
Vishnu kannan
e566948a75 Track image storage usage for docker containers
add image fs info to summary stats API.
Adding node e2e test for image stats.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-04-25 16:00:34 -07:00
Euan Kemp
941caa1372 rkt: Pass through os argument
This was lost in a rebase in #24496 and, while not required to build, is
required to function correctly.
2016-04-25 12:56:25 -07:00
zhouhaibing089
bf1a3f99c0 Uncomment the code that cause by #19254 2016-04-25 23:21:31 +08:00
Stef Walter
481dbca8bc Fix use of docker removed ParseRepositoryTag() function
Docker has removed the ParseRepositoryTag() function in
leading to failures using the kubernetes Go client API.

Let's use github.com/docker/distribution reference.ParseNamed()
instead.

Failure:

../k8s.io/kubernetes/pkg/util/parsers/parsers.go:30: undefined: parsers.ParseRepositoryTag
2016-04-25 11:37:10 +02:00
Wojciech Tyczynski
1835c8528d Store node information in NodeInfo 2016-04-25 10:08:05 +02:00
k8s-merge-robot
4f9e8729bf Merge pull request #23800 from resouer/image-refactor
Automatic merge from submit-queue

Refactor image related functions to use docker engine-api

ref #23563 

Hope this helps, cc @Random-Liu

If it's OK, I will add more work here.
2016-04-23 20:01:41 -07:00
k8s-merge-robot
30891c7f3f Merge pull request #24496 from euank/rkt-finished-at
Automatic merge from submit-queue

rkt: Return `FinishedAt` for pod

This is implemented via touching a file on stop as a hook in the systemd
unit. The ctime of this file is then used to get the `finishedAt` time
in the future.
In addition, this changes the `startedAt` and `createdAt` to use the api
server's results rather than the annotations it previously used.
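
The marker-file idea can be sketched as follows. Note one deliberate substitution: the text above uses the file's ctime, which Go only exposes through platform-specific `syscall` structures, so this sketch reads the portable modification time via `os.FileInfo.ModTime` instead. The helper names and the file location are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// markFinished creates (or truncates) the marker file, which stamps it
// with the current time. A stop hook would call something like this.
func markFinished(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	return f.Close()
}

// finishedAt recovers the finish time from the marker file's timestamp.
// It returns an error if the marker does not exist.
func finishedAt(path string) (time.Time, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return time.Time{}, err
	}
	return fi.ModTime(), nil
}

// clearFinished removes the marker file.
func clearFinished(path string) error {
	return os.Remove(path)
}

func main() {
	path := filepath.Join(os.TempDir(), "finished-example")
	if err := markFinished(path); err != nil {
		fmt.Println("mark failed:", err)
		return
	}
	defer clearFinished(path)
	t, err := finishedAt(path)
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Println("finished at:", t.Format(time.RFC3339))
}
```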

It's possible we might want to move this into the api in the future.

Fixes #23887


I did the following manual testing:
```
$ cat ./examples/output/exit-output.yml 
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: exit
  name: exit-output
spec:
  restartPolicy: Never
  containers:
    - name: exit
      image: busybox
      command: ["sh", "-c", "echo Exiting in 60; sleep 60; echo goodbye"]
$ kubectl create -f ./examples/exit/exit-output.yaml
$ # wait
$ kubectl describe pod exit-output | grep State -A 4
    State:		Terminated
      Reason:		Completed
      Exit Code:	0
      Started:		Tue, 19 Apr 2016 13:23:13 -0700
      Finished:		Tue, 19 Apr 2016 13:24:13 -0700
$ kubectl logs exit-output
Exiting in 60
goodbye
```

I double checked as well that the file at `/var/lib/kubelet/pods/$id/finished-$id` existed and looked as expected.

This is related to https://github.com/coreos/rkt/issues/1789#issuecomment-207111814 and follows https://github.com/kubernetes/kubernetes/pull/24367 + https://github.com/coreos/rkt/issues/2445

cc @jonboulle @iaguis @yifan-gu @kubernetes/sig-node
2016-04-23 18:29:07 -07:00
Harry Zhang
a3939473d3 Refactor PullImage RemoveImage methods
Refactor image remove
2016-04-23 10:33:47 -04:00
Harry Zhang
3918eee5bf Refactor InspectImage method 2016-04-23 16:37:15 +08:00
Harry Zhang
7ecb44fe16 Refactor list image to use new api 2016-04-23 16:37:15 +08:00
Yifan Gu
a12a7c2a2c rkt: Add post-start hook support.
This adds a poll-and-timeout procedure after the pod is
started, to make sure the post-start hooks execute when the
container is actually running.

This is a temporary workaround for implementing post-hooks;
a long-term solution is to use lifecycle events to trigger
those hooks, see https://github.com/kubernetes/kubernetes/issues/23084.

Also this fixes a bug of getting container ID for a non-running
container when running pre-stop hook.
2016-04-22 15:38:05 -07:00
Euan Kemp
a6718f5969 rkt: Implement pod FinishedAt
This is implemented via touching a file on stop as a hook in the systemd
unit. The ctime of this file is then used to get the `finishedAt` time
in the future.
In addition, this changes the `startedAt` and `createdAt` to use the api
server's results rather than the annotations it previously used.

It's possible we might want to move this into the api in the future.

Fixes #23887
2016-04-22 15:34:55 -07:00
gmarek
e0712f7e57 Fix MaxPods feature in scheduler 2016-04-22 22:49:50 +02:00
k8s-merge-robot
06c2db4fe2 Merge pull request #23907 from Random-Liu/all-but-image-related-functions
Automatic merge from submit-queue

Kubelet: Refactor all but image related functions in DockerInterface

For #23563.
Based on #23699 and #23844.

Only the last 3 commits are new. This PR refactors all functions except image related functions, including:
* CreateExec
* StartExec
* InspectExec
* AttachToContainer
* Logs
* Info
* Version

@kubernetes/sig-node
2016-04-21 20:57:38 -07:00
derekwaynecarr
2b9cfd414d Add utility for determining qos of a pod 2016-04-21 17:15:17 -04:00
k8s-merge-robot
9d4eee63ab Merge pull request #24589 from derekwaynecarr/fix_shm
Automatic merge from submit-queue

docker daemon complains SHM size must be greater than 0

Fixes https://github.com/kubernetes/kubernetes/issues/24588

I am hitting this on Fedora 23 w/ docker 1.9.1 using systemd cgroup-driver.

```
$ docker version
Client:
 Version:         1.9.1
 API version:     1.21
 Package version: docker-1.9.1-9.gitee06d03.fc23.x86_64
 Go version:      go1.5.3
 Git commit:      ee06d03/1.9.1
 Built:           
 OS/Arch:         linux/amd64

Server:
 Version:         1.9.1
 API version:     1.21
 Package version: docker-1.9.1-9.gitee06d03.fc23.x86_64
 Go version:      go1.5.3
 Git commit:      ee06d03/1.9.1
 Built:           
 OS/Arch:         linux/amd64
```

Not sure why I am the only one hitting it right now, but putting this out here for comment.

/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @smarterclayton
2016-04-21 12:11:03 -07:00
Random-Liu
d981fee2ee Refactor Info and Version. 2016-04-21 12:02:50 -07:00
derekwaynecarr
cbf1cb81a9 SHM size must be greater than 0 2016-04-21 11:45:28 -04:00
Chao Xu
8537095415 use fully qualified resource in fake clients actions 2016-04-20 19:44:40 -07:00
Sami Wagiaalla
234d599763 Rename cleanerTuple to cleaner 2016-04-20 14:38:40 -04:00
goltermann
3fa6c6f6d9 Enable vet 2016-04-20 09:48:24 -07:00
Minhan Xia
a7783e5334 add log line before invoking network plugin 2016-04-19 15:34:06 -07:00
Dan Williams
8086d64131 kubenet: hook pod bandwidth resources up to shaper 2016-04-19 15:32:46 -05:00
k8s-merge-robot
d37e6ad332 Merge pull request #24126 from Random-Liu/fix-pull-image
Automatic merge from submit-queue

Fix PullImage and add corresponding node e2e test

Fixes #24101. This is a bug introduced by #23506, since ref #23563.

The root cause of #24101 is described [here](https://github.com/kubernetes/kubernetes/issues/24101#issuecomment-208547623).

This PR
1) Fixes #24101 by decoding the messages returned while pulling an image, and returning an error if any of the messages contains an error.
2) Adds a node e2e test to detect this kind of failure.
3) Moves the present check out of `ConformanceImage.Remove()` and `ConformanceImage.Pull()`, because sometimes we expect an error to occur in `PullImage()` or `RemoveImage()`; even if that error doesn't happen, the `Present()` check would still return an error and let the test pass.
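
Item 1 can be sketched as below: read the progress stream as a sequence of JSON objects and stop at the first one that carries an error. The `pullMessage` type and its `status` and `error` field names are assumptions modeled on a typical progress stream, not copied from the Docker client library, and `checkPullOutput` is only a convenience wrapper for string input.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// pullMessage is a hypothetical shape for one message in the pull stream.
type pullMessage struct {
	Status string `json:"status"`
	Error  string `json:"error"`
}

// checkPullStream decodes messages until EOF and returns the first
// embedded error it finds, so a failed pull is not reported as success.
func checkPullStream(r io.Reader) error {
	dec := json.NewDecoder(r)
	for {
		var msg pullMessage
		err := dec.Decode(&msg)
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return fmt.Errorf("decoding pull message: %w", err)
		}
		if msg.Error != "" {
			return errors.New(msg.Error)
		}
	}
}

// checkPullOutput applies checkPullStream to an in-memory string.
func checkPullOutput(s string) error {
	return checkPullStream(strings.NewReader(s))
}

func main() {
	ok := `{"status":"Pulling from library/busybox"}` + "\n" + `{"status":"Download complete"}`
	bad := `{"status":"Pulling from library/busybox"}` + "\n" + `{"error":"image not found"}`
	fmt.Println("ok stream:", checkPullOutput(ok))
	fmt.Println("bad stream:", checkPullOutput(bad))
}
```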

@yujuhong @freehan @liangchenye 

Also /cc @resouer, because he is doing the image related functions refactoring.
2016-04-18 07:05:44 -07:00
k8s-merge-robot
d0b52dd8b3 Merge pull request #24107 from yifan-gu/load_bridge
Automatic merge from submit-queue

kubenet: Load bridge netfilter module in Init().

This lets kubenet load the bridge netfilter module and set bridge-nf-call-iptables=1

Fix #24018 

Follow-up PRs that also load the module in the bridge plugin binary itself would be appreciated. Ref https://github.com/kubernetes/kubernetes/issues/24018#issuecomment-207682514

cc @kubernetes/sig-node @sjpotter @euank
2016-04-18 00:08:25 -07:00
k8s-merge-robot
9637b09f69 Merge pull request #24047 from derekwaynecarr/reuse_summary_provider
Automatic merge from submit-queue

Expose SummaryProvider for reuse by other parts of kubelet

To support out of resource killing in the kubelet, we will introduce a new top-level module that will ensure node stability by checking if eviction thresholds have been met for memory and file-system usage on the node.  In addition, it will then need information about pod memory and disk usage in order to make an eviction selection.  Currently, this information is collected in `SummaryProvider` but it's hidden away and not available for re-use by other top-level modules of the kubelet.  This initial refactor adds the ability to get summary stat information from the `ResourceAnalyzer` so it can be reused by other top-level modules.

I suspect we will further re-factor this area as code evolves, but this unblocks further progress on out-of-resource killing.

/cc @vishh @timothysc @kubernetes/sig-node @kubernetes/rh-cluster-infra
2016-04-17 20:22:57 -07:00
Random-Liu
d33b69a0de Refactor AttachToContainer and Logs. 2016-04-17 13:00:52 -07:00
Random-Liu
de5f407058 Refactor CreateExec, StartExec and InspectExec. 2016-04-17 12:58:47 -07:00
k8s-merge-robot
75b49f591a Merge pull request #23948 from derekwaynecarr/memory_available
Automatic merge from submit-queue

Add memory available to summary stats provider

To support out of resource killing when low on memory, we want to let operators specify eviction thresholds based on available memory instead of memory usage for ease of use when working with heterogeneous nodes.  

So for example, a valid eviction threshold would be the following: 
* If node.memory.available < 200Mi for 30s, then evict pod(s)

For the node, `memory.availableBytes` is always known since the `memory.limit_in_bytes` is always known for root cgroup.  For individual containers in pods, we only populate the `availableBytes` if the container was launched with a memory limit specified.  When no memory limit is specified, the cgroupfs sets a value of 1 << 63 in the `memory.limit_in_bytes` so we look for a similar max value to handle unbounded limits, and ignore setting `memory.availableBytes`.
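
The rule above can be sketched as a small helper: report available memory as limit minus usage, and report nothing when the limit looks unbounded. The cutoff `unboundedThreshold` is an assumed value chosen only for illustration; the text says only that a "similar max value" near 1 << 63 is checked.

```go
package main

import "fmt"

// unboundedThreshold is an assumed cutoff: any limit at or above it is
// treated as "no memory limit was specified".
const unboundedThreshold = uint64(1) << 62

// availableBytes returns limit minus usage and true when a real limit is
// set, or 0 and false when the limit is effectively unbounded.
func availableBytes(limit, usage uint64) (uint64, bool) {
	if limit >= unboundedThreshold {
		return 0, false
	}
	if usage > limit {
		// Guard against underflow if a usage sample exceeds the limit.
		return 0, true
	}
	return limit - usage, true
}

func main() {
	const mi = uint64(1) << 20
	if avail, ok := availableBytes(512*mi, 400*mi); ok {
		fmt.Printf("available: %dMi, below 200Mi threshold: %v\n", avail/mi, avail < 200*mi)
	}
	if _, ok := availableBytes(uint64(1)<<63, 400*mi); !ok {
		fmt.Println("no limit set, availableBytes not populated")
	}
}
```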

FYI @vishh @timstclair - as discussed on Slack.

/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra
2016-04-17 06:32:36 -07:00
Wojciech Tyczynski
495e274500 Merge pull request #24384 from Random-Liu/disable-version-cache
Disable the version cache to fix #24298.
2016-04-17 04:48:07 -07:00
Random-Liu
19249a8cbc Disable the version cache to fix #24298. 2016-04-17 03:14:03 -07:00
k8s-merge-robot
8990897ce6 Merge pull request #23940 from freehan/netinterface
Automatic merge from submit-queue

switch to use ContainerID instead of DockerID in network plugin interface

fix: #15663
2016-04-17 01:12:51 -07:00
k8s-merge-robot
2e87b0e363 Merge pull request #23699 from Random-Liu/container-related-functions
Automatic merge from submit-queue

Kubelet: Refactor container related functions in DockerInterface

For #23563.
Based on #23506, will rebase after #23506 is merged.

The last 4 commits of this PR are new.
This PR refactors all container lifecycle related functions in DockerInterface, including:
* ListContainers
* InspectContainer
* CreateContainer
* StartContainer
* StopContainer
* RemoveContainer

@kubernetes/sig-node
2016-04-16 21:41:19 -07:00