Commit Graph

1695 Commits

Author SHA1 Message Date
Michael Taufen
38aee0464d Providing kubeconfig file is now the switch for standalone mode
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
2017-07-24 11:03:00 -07:00
Jess Frazelle
e1493c9c88
allowPrivilegeEscalation: apply to correct docker api versions
Signed-off-by: Jess Frazelle <acidburn@google.com>
2017-07-24 12:52:43 -04:00
David Ashpole
7a23f8b018 remove deprecated flags LowDiskSpaceThresholdMB and OutOfDiskTransitionFrequency 2017-07-20 13:23:13 -07:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Kubernetes Submit Queue
c0287ce420 Merge pull request #47316 from k82cn/k8s_47315
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)

Added golint check for pkg/kubelet.

**What this PR does / why we need it**:
Added golint check for pkg/kubelet, and make golint happy.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #47315 

**Release note**:
```release-note-none
```
2017-07-19 11:21:25 -07:00
Klaus Ma
63b78a37e0 Added golint check for pkg/kubelet. 2017-07-19 11:33:06 +08:00
Mikhail Mazurskiy
d789615902
Shared Informer Run blocks until all goroutines finish
Fixes #45454
2017-07-18 14:05:08 +10:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
Jing Xu
bb1920edcc Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
2017-07-13 12:06:19 -07:00
Clayton Coleman
b8e662fcea
Move the kubelet certificate management code into a single package
Code is very similar and belongs together.
2017-07-05 18:11:49 -04:00
Dan Williams
5b8ad3f7c5 kubelet: remove unused bandwidth shaping teardown code
Since v1.5 and the removal of --configure-cbr0:

0800df74ab "Remove the legacy networking mode --configure-cbr0"

kubelet hasn't done any shaping operations internally.  They
have all been delegated to network plugins like kubenet or
external CNI plugins.  But some shaping code was still left
in kubelet, so remove it now that it's unused.
2017-06-30 11:51:22 -05:00
Vishnu kannan
82f7820066 Kubelet:
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.

This patch fixes a bug in container manager where it was writing to a map without synchronization.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-06-27 18:45:02 -07:00
Kubernetes Submit Queue
d95a8bf66b Merge pull request #47783 from NickrenREN/containerruntime
Automatic merge from submit-queue (batch tested with PRs 47694, 47772, 47783, 47803, 47673)

Make different container runtimes constant

Make different container runtimes constant to avoid hardcode

**Release note**:

```release-note
NONE
```
2017-06-23 08:29:28 -07:00
Kubernetes Submit Queue
dd126ae19c Merge pull request #38431 from NickrenREN/newVolumeMgr-return
Automatic merge from submit-queue

Modify NewVolumeManager() function return value
2017-06-22 16:43:29 -07:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f2d3220a11 run root-rewrite-import-client-go-api-types 2017-06-22 11:30:59 -07:00
Chao Xu
cde4772928 run ./root-rewrite-all-other-apis.sh, then run make all, pkg/... compiles 2017-06-22 11:30:52 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
NickrenREN
6de7e3f3dc Make different container runtimes constant 2017-06-20 19:58:39 +08:00
NickrenREN
312cd1bbe6 Modify NewVolumeManager() function return value
Since function NewVolumeManager() will always return vm and nil, we do not need the second return value, it will always be nil.
2017-06-17 23:33:12 +08:00
Jacob Simpson
334de1cbe1 Auto approve kubelet certificate signing requests. 2017-06-16 08:47:12 -07:00
Kubernetes Submit Queue
17244ea5d9 Merge pull request #47124 from andyxning/remove_sync_loop_health_check
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)

fix sync loop health check

This PR will do error logging about the fall behind sync for kubelet instead of sync loop healthz checking.

The reason is kubelet can not do sync loop and therefore can not update sync loop time when there is any runtime error, such as docker hung. 

When there is any runtime error, according to current implementation, kubelet will not do sync operation and thus kubelet's sync loop time will not be updated. This will make when there is any runtime error, kubelet will also return non 200 response status code when accessing healthz endpoint. This is contrary with #37865 which prevents kubelet from being killed when docker hangs.

**Release note**:
```release-note
fix sync loop health check with seperating runtime errors
```

/cc @yujuhong @Random-Liu @dchen1107
2017-06-12 18:19:51 -07:00
Andy Xie
96cb43993a fix sync loop health check 2017-06-10 11:25:59 +08:00
Zihong Zheng
d5c9d27ed7 Make kubelet touch iptables lock file during initialization 2017-06-09 09:34:48 -07:00
David Ashpole
889afa5e2d trigger aggressive container garbage collection when under disk pressure 2017-06-03 07:52:36 -07:00
Jing Xu
dd67e96c01 Add local storage (scratch space) allocatable support
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.
2017-06-01 15:57:50 -07:00
Shyam Jeedigunta
1cf6b339f6 Use TTL-based caching configmap manager in kubelet 2017-05-31 10:39:40 +02:00
Shyam Jeedigunta
4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Kubernetes Submit Queue
f2074ba8de Merge pull request #45059 from jcbsmpsn/rotate-server-certificate
Automatic merge from submit-queue (batch tested with PRs 46635, 45619, 46637, 45059, 46415)

Certificate rotation for kubelet server certs.

Replaces the current kubelet server side self signed certs with certs signed by
the Certificate Request Signing API on the API server. Also renews expiring
kubelet server certs as expiration approaches.

Two Points:
1. With `--feature-gates=RotateKubeletServerCertificate=true` set, the kubelet will
    request a certificate during the boot cycle and pause waiting for the request to
    be satisfied.
2. In order to have the kubelet's certificate signing request auto approved,
    `--insecure-experimental-approve-all-kubelet-csrs-for-group=` must be set on
    the cluster controller manager. There is an improved mechanism for auto
    approval [proposed](https://github.com/kubernetes/kubernetes/issues/45030).

**Release note**:
```release-note
With `--feature-gates=RotateKubeletServerCertificate=true` set, the kubelet will
request a server certificate from the API server during the boot cycle and pause
waiting for the request to be satisfied. It will continually refresh the certificate as
the certificates expiration approaches.
```
2017-05-30 19:49:02 -07:00
Yu-Ju Hong
c82350214e Group container-runtime-specific flags/options together
Do not store them in kubelet's configuration. Eventually, we would like
to deprecate all these flags as they should not be part of kubelet.
2017-05-30 08:10:39 -07:00
Jacob Simpson
4c22e6bc6a Certificate rotation for kubelet server certs.
Replaces the current kubelet server side self signed certs with certs
signed by the Certificate Request Signing API on the API server. Also
renews expiring kubelet server certs as expiration approaches.
2017-05-29 12:28:01 -07:00
Kubernetes Submit Queue
5e853709a7 Merge pull request #46089 from karataliu/wincri1
Automatic merge from submit-queue (batch tested with PRs 46124, 46434, 46089, 45589, 46045)

Support TCP type runtime endpoint for kubelet

**What this PR does / why we need it**:
Currently the grpc server for kubelet and dockershim has a hardcoded endpoint: unix socket '/var/run/dockershim.sock', which is not applicable on non-unix OS.

This PR is to support TCP endpoint type besides unix socket.

**Which issue this PR fixes** 
This is a first attempt to address issue https://github.com/kubernetes/kubernetes/issues/45927

**Special notes for your reviewer**:
Before this change, running on Windows node results in:
```
Container Manager is unsupported in this build
```

After adding the cm stub, error becomes:
```
listen unix /var/run/dockershim.sock: socket: An address incompatible with the requested protocol was used.
```

This PR is to fix those two issues.

After this change, still meets 'seccomp' related issue when running on Windows node, needs more updates later.

**Release note**:
2017-05-25 21:40:02 -07:00
Dong Liu
fb26c9100a Support TCP type runtime endpoint for kubelet. 2017-05-25 09:16:11 +08:00
Kubernetes Submit Queue
90250220a9 Merge pull request #44428 from qiujian16/commenttypo
Automatic merge from submit-queue

Fix some typo of comment in kubelet.go

**What this PR does / why we need it**:
The PR is to fix some typo in kubelet.go

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
N/A

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-05-23 18:45:34 -07:00
Kubernetes Submit Queue
99a8f7c303 Merge pull request #43590 from dashpole/eviction_complete_deletion
Automatic merge from submit-queue (batch tested with PRs 46022, 46055, 45308, 46209, 43590)

Eviction does not evict unless the previous pod has been cleaned up

Addresses #43166
This PR makes two main changes:
First, it makes the eviction loop re-trigger immediately if there may still be pressure.  This way, if we already waited 10 seconds to delete a pod, we dont need to wait another 10 seconds for the next synchronize call.
Second, it waits for the pod to be cleaned up (including volumes, cgroups, etc), before moving on to the next synchronize call.  It has a timeout for this operation currently set to 30 seconds.
2017-05-22 20:00:03 -07:00
Clayton Coleman
3e095d12b4
Refactor move of client-go/util/clock to apimachinery 2017-05-20 14:19:48 -04:00
David Ashpole
21fb487245 wait for previous evicted pod to be cleaned up 2017-05-16 14:23:42 -07:00
Xing Zhou
a2e68e96cb Fix typo.
Fixed typo.
2017-05-15 14:01:30 +08:00
Kubernetes Submit Queue
3619c33350 Merge pull request #42759 from mtaufen/kubelet-apis-reorg
Automatic merge from submit-queue

Reorganize kubelet tree so apis can be independently versioned

@yujuhong @lavalamp @thockin @bgrant0607 
This is an example of how we might reorganize `pkg/kubelet` so the apis it exposes can be independently versioned. This would also provide a logical place to put the `KubeletConfiguration` type, which currently lives in `pkg/apis/componentconfig`; it could live in e.g. `pkg/kubelet/apis/config` instead.

Take a look when you have a chance and let me know what you think. The most significant change in this PR is reorganizing `pkg/kubelet/api` to `pkg/kubelet/apis`, the rest is pretty much updating import paths and `BUILD` files.
2017-05-12 17:43:22 -07:00
Kubernetes Submit Queue
9c8287d629 Merge pull request #45624 from dashpole/kubelet_cleanup
Automatic merge from submit-queue (batch tested with PRs 45685, 45572, 45624, 45723, 45733)

Remove unused fields from Kubelet struct

Just a small attempt to clean up some unused fields in the kubelet struct.  This doesn't make any actual code changes.

/assign @mtaufen
2017-05-12 14:00:57 -07:00
Michael Taufen
cbad320205 Reorganize kubelet tree so apis can be independently versioned 2017-05-12 10:02:33 -07:00
David Ashpole
b69dacbd86 remove unused fields from Kubelet struct 2017-05-10 16:25:09 -07:00
Yu-Ju Hong
daa329c9ae Remove the deprecated --enable-cri flag
Except for rkt, CRI is the default and only integration point for
container runtimes.
2017-05-10 13:03:41 -07:00
Kubernetes Submit Queue
77b2e6302c Merge pull request #45236 from verb/sharedpid-2-default
Automatic merge from submit-queue

Enable shared PID namespace by default for docker pods

**What this PR does / why we need it**: This PR enables PID namespace sharing for docker pods by default, bringing the behavior of docker in line with the other CRI runtimes when used with docker >= 1.13.1.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref #1615

**Special notes for your reviewer**: cc @dchen1107 @yujuhong 

**Release note**:

```release-note
Kubernetes now shares a single PID namespace among all containers in a pod when running with docker >= 1.13.1. This means processes can now signal processes in other containers in a pod, but it also means that the `kubectl exec {pod} kill 1` pattern will cause the pod to be restarted rather than a single container.
```
2017-05-10 12:06:01 -07:00
Kubernetes Submit Queue
51a3413371 Merge pull request #45307 from yujuhong/mv-docker-client
Automatic merge from submit-queue (batch tested with PRs 45453, 45307, 44987)

Migrate the docker client code from dockertools to dockershim

Move docker client code from dockertools to dockershim/libdocker. This includes
DockerInterface (renamed to Interface), FakeDockerClient, etc.

This is part of #43234
2017-05-09 20:23:44 -07:00
Kubernetes Submit Queue
a062782524 Merge pull request #44258 from wlan0/master
Automatic merge from submit-queue (batch tested with PRs 45508, 44258, 44126, 45441, 45320)

cloud initialize node in external cloud controller

@thockin This PR adds support in the `cloud-controller-manager` to initialize nodes (instead of kubelet, which did it previously)

This also adds support in the kubelet to skip node cloud initialization when `--cloud-provider=external`

Specifically,

Kubelet

1. The kubelet has a new flag called `--provider-id` which uniquely identifies a node in an external DB
2. The kubelet sets a node taint - called "ExternalCloudProvider=true:NoSchedule" if cloudprovider == "external"

Cloud-Controller-Manager

1. The cloud-controller-manager listens on "AddNode" events, and then processes nodes that starts with that above taint. It performs the cloud node initialization steps that were previously being done by the kubelet.
2. On addition of node, it figures out the zone, region, instance-type, removes the above taint and updates the node.
3. Then periodically queries the cloudprovider for node addresses (which was previously done by the kubelet) and updates the node if there are new addresses

```release-note
NONE  
```
2017-05-08 16:34:43 -07:00
Kubernetes Submit Queue
f4fc4be805 Merge pull request #44727 from x1957/master
Automatic merge from submit-queue

adds log when gpuManager.start() failed

If gpuManager.start() returns error, there is no log.

We confused with scheduler do not schedule any pod(with gpu) to one node.
kubectl describe node xxx shows there is no gpu on that node, because the gpu driver do not work on that node, gpuManager.start() failed, but we can not see anything in log.
2017-05-08 14:27:48 -07:00
wlan0
45d2bc06b7 cloud initialize node in external cloud controller 2017-05-05 16:51:45 -07:00
Yu-Ju Hong
389c140eaf Move docker client code from dockertools to dockershim/dockerlib
The code affected include DockerInterface (renamed to Interface),
FakeDockerClient, etc.
2017-05-05 11:48:08 -07:00
Kubernetes Submit Queue
f6ec7bade1 Merge pull request #45316 from yujuhong/dockershim-plugin-settings
Automatic merge from submit-queue (batch tested with PRs 45316, 45341)

Pass NoOpLegacyHost to dockershim in --experimental-dockershim mode

This allows dockershim to use network plugins, if needed.

/cc @Random-Liu
2017-05-04 05:19:49 -07:00
Yu-Ju Hong
40b0474956 pass noopnetworkhost to dockershim 2017-05-03 16:32:01 -07:00
Yu-Ju Hong
78b2c3b4c2 kuberuntime: remove the unused network plugin
Network plugin is completely handled by the container runtimes. Remove
this unused field in the kuberuntime manager.
2017-05-03 16:21:46 -07:00
Lee Verberne
b668371a63 Enable shared PID namespace by default for docker 2017-05-03 17:12:08 +00:00
Jian Qiu
b0a415e453 Fix some typo of comment in kubelet.go 2017-05-03 10:40:28 +08:00
Yu-Ju Hong
93ecaf6812 Move exec.go from dockertools to dockershim 2017-05-01 16:00:46 -07:00
Yu-Ju Hong
9f3184c5a4 Remove DockerManager from kubelet
This commit deletes code in dockertools that is only used by
DockerManager. A follow-up change will rename and clean up the rest of
the files in this package.

The commit also sets EnableCRI to true if the container runtime is not
rkt. A follow-up change will remove the flag/field and all references to
it.
2017-05-01 12:14:50 -07:00
Lee Verberne
d22dd0fa35 Implement shared PID namespace in the dockershim 2017-04-27 23:43:53 +00:00
x1957
3db1127e72 adds log when gpuManager.start() failed 2017-04-20 23:09:25 +08:00
Klaus Ma
6d29cfc0cc Registered node before other initialization. 2017-04-18 10:43:56 +08:00
Yu-Ju Hong
1d3d12dfc2 Don't check runtime condition for rktnetes
rktnetes is not a CRI implementation, and does not provide runtime
conditions. This change fixes the issue where rkt will never be
considered running from kubelet's point of view.
2017-04-17 11:33:58 -07:00
Andy Goldstein
00e11566f2 Make the dockershim root directory configurable
Make the dockershim root directory configurable so things like
integration tests (e.g. in OpenShift) can run as non-root.
2017-04-12 09:06:21 -04:00
Andy Goldstein
010b71a5f7 kubelet: make dockershim.sock configurable
Make the location of dockershim.sock configurable, so downstream
projects (such as OpenShift) can place it in a location that does not
require root access (e.g. for integration tests).

Make the kubelet respect and use the values of
--container-runtime-endpoint and --image-service-endpoint, if set. If
unset, the default value of /var/run/dockershim.sock is used.
2017-04-06 12:01:21 -04:00
Kubernetes Submit Queue
faf2eca226 Merge pull request #42916 from dashpole/misleading_log
Automatic merge from submit-queue

Clearer ImageGC failure errors.  Fewer events.

Addresses #26000.  Kubelet often "fails" image garbage collection if cAdvisor has not completed the first round of stats collection.  Don't create events for a single failure, and make log messages more specific.

@kubernetes/sig-node-bugs
2017-04-04 11:23:32 -07:00
Michael Taufen
f5eed7e91d Add a separate flags struct for Kubelet flags
Kubelet flags are not necessarily appropriate for the KubeletConfiguration
object. For example, this PR also removes HostnameOverride and NodeIP
from KubeletConfiguration. This is a preleminary step to enabling Nodes
to share configurations, as part of the dynamic Kubelet configuration
feature (#29459). Fields that must be unique for each node inhibit
sharing, because their values, by definition, cannot be shared.
2017-04-03 13:28:29 -07:00
David Ashpole
2cd65ea863 only create event for multiple imagegc failures 2017-03-30 16:19:18 -07:00
NickrenREN
2f89a6bda6 optimize getPullSecretsForPod() and syncPod()
Since getPullSecretsForPod() will never return err,we do not need the second return value,and modify syncPod() function.
2017-03-25 11:05:13 +08:00
Vishnu kannan
ff158090b3 use active pods instead of runtime pods in gpu manager
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-03-13 10:58:26 -07:00
Andy Goldstein
b011529d8a Add pprof trace support
Add pprof trace support and --enable-contention-profiling to those
components that don't already have it.
2017-03-07 10:10:42 -05:00
Kubernetes Submit Queue
4bbf98850f Merge pull request #42500 from vishh/fix-gpu-init
Automatic merge from submit-queue

[Bug] Fix gpu initialization in Kubelet

Kubelet incorrectly fails if `AllAlpha=true` feature gate is enabled with container runtimes that are not `docker`.

Replaces #42407
2017-03-04 20:28:08 -08:00
Vishnu kannan
038585626d fix gpu initialization
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-03-03 12:13:01 -08:00
David Ashpole
ac612eab8e eviction manager changes for allocatable 2017-03-02 07:36:24 -08:00
Kubernetes Submit Queue
fa0387c9fe Merge pull request #42195 from Random-Liu/cri-support-non-json-logging
Automatic merge from submit-queue (batch tested with PRs 41931, 39821, 41841, 42197, 42195)

Use `docker logs` directly if the docker logging driver is not `json-file`

Fixes https://github.com/kubernetes/kubernetes/issues/41996.

Post the PR first, I still need to manually test this, because we don't have test coverage for journald logging pluggin.

@yujuhong @dchen1107 
/cc @kubernetes/sig-node-pr-reviews
2017-03-01 20:08:08 -08:00
Random-Liu
7c261bfed7 Use docker logs directly if the docker logging driver is not
supported.
2017-03-01 10:50:11 -08:00
vefimova
fc8a37ec86 Added ability for Docker containers to set usage of dns settings along with hostNetwork is true
Introduced chages:
   1. Re-writing of the resolv.conf file generated by docker.
      Cluster dns settings aren't passed anymore to docker api in all cases, not only for pods with host network:
      the resolver conf will be overwritten after infra-container creation to override docker's behaviour.

   2. Added new one dnsPolicy - 'ClusterFirstWithHostNet', so now there are:
      - ClusterFirstWithHostNet - use dns settings in all cases, i.e. with hostNet=true as well
      - ClusterFirst - use dns settings unless hostNetwork is true
      - Default

Fixes #17406
2017-03-01 17:10:00 +00:00
Kubernetes Submit Queue
ed479163fa Merge pull request #42116 from vishh/gpu-experimental-support
Automatic merge from submit-queue

Extend experimental support to multiple Nvidia GPUs

Extended from #28216

```release-note
`--experimental-nvidia-gpus` flag is **replaced** by `Accelerators` alpha feature gate along with  support for multiple Nvidia GPUs. 
To use GPUs, pass `Accelerators=true` as part of `--feature-gates` flag.
Works only with Docker runtime.
```

1. Automated testing for this PR is not possible since creation of clusters with GPUs isn't supported yet in GCP.
1. To test this PR locally, use the node e2e.
```shell
TEST_ARGS='--feature-gates=DynamicKubeletConfig=true' FOCUS=GPU SKIP="" make test-e2e-node
```

TODO:

- [x] Run manual tests
- [x] Add node e2e
- [x] Add unit tests for GPU manager (< 100% coverage)
- [ ] Add unit tests in kubelet package
2017-03-01 04:52:50 -08:00
Vishnu kannan
2554b95994 Map nvidia devices one to one.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-28 13:42:08 -08:00
Vishnu kannan
69acb02394 use feature gate instead of flag to control support for GPUs
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-28 13:42:07 -08:00
Vishnu kannan
3b0a408e3b improve gpu integration
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-02-28 11:27:53 -08:00
Hui-Zhi
57c77ffbdd Add support for multiple nvidia gpus 2017-02-28 11:24:48 -08:00
Seth Jennings
b9adb66426 kubelet: cm: refactor QoS logic into seperate interface 2017-02-28 09:19:29 -06:00
Vishnu Kannan
cc5f5474d5 add support for node allocatable phase 2 to kubelet
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2017-02-27 21:24:44 -08:00
Kubernetes Submit Queue
16f87fe7d8 Merge pull request #40952 from dashpole/premption
Automatic merge from submit-queue (batch tested with PRs 41994, 41969, 41997, 40952, 40576)

Guaranteed admission for Critical Pods

This is the first step in implementing node-level preemption for critical pods.
It defines the AdmissionFailureHandler interface, which allows callers, like the kubelet, to define how failed predicates are handled, and take steps to correct failures if necessary.
In the kubelet's implementation, it triggers preemption if the pod being admitted is critical, and if the only failed predicates are InsufficientResourceErrors, then it prempts (not yet implemented) other other pods to allow admission of the critical pod.

cc: @vishh
2017-02-26 12:57:59 -08:00
Kubernetes Submit Queue
067f92e789 Merge pull request #41801 from riverzhang/patch-1
Automatic merge from submit-queue (batch tested with PRs 41854, 41801, 40088, 41590, 41911)

Fix  some typos

**Release note**:

```release-note
```
2017-02-25 05:02:53 -08:00
David Ashpole
c58970e47c critical pods can preempt other pods to be admitted 2017-02-23 10:31:20 -08:00
Andy Goldstein
9d8d6ad16c Switch scheduler to use generated listers/informers
Where possible, switch the scheduler to use generated listers and
informers. There are still some places where it probably makes more
sense to use one-off reflectors/informers (listing/watching just a
single node, listing/watching scheduled & unscheduled pods using a field
selector).
2017-02-23 09:57:12 -05:00
riverzhang
5156b7f8cf Fix some typos 2017-02-21 07:15:40 -06:00
Kubernetes Submit Queue
05c05de798 Merge pull request #41569 from yujuhong/add_healthcheck
Automatic merge from submit-queue (batch tested with PRs 38101, 41431, 39606, 41569, 41509)

Report node not ready on failed PLEG health check

Report node not ready if PLEG health check fails.
2017-02-16 15:49:18 -08:00
Kubernetes Submit Queue
6376ad134d Merge pull request #39606 from NickrenREN/kubelet-pod
Automatic merge from submit-queue (batch tested with PRs 38101, 41431, 39606, 41569, 41509)

optimize killPod() and syncPod() functions

make sure that one of the two arguments must be non-nil: runningPod, status ,just like the function note says
and judge the return value in syncPod() function before setting podKilled
2017-02-16 15:49:17 -08:00
Kubernetes Submit Queue
3c606cdd20 Merge pull request #41456 from dashpole/pod_volume_cleanup
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)

Delay Deletion of a Pod until volumes are cleaned up

#41436 fixed the bug that caused #41095 and #40239 to have to be reverted.  Now that the bug is fixed, this shouldn't cause problems.

 @vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc
2017-02-16 10:14:05 -08:00
Yu-Ju Hong
5bb43a3a24 Report node not ready on failed PLEG health check 2017-02-16 09:00:22 -08:00
NickrenREN
b40e575076 optimize killPod() and syncPod() functions
make sure that one of the two arguments must be non-nil: runningPod, status ,just like the function note says
and judge the return value in syncPod() function before setting podKilled
2017-02-16 09:13:23 +08:00
Kubernetes Submit Queue
3bc575c91f Merge pull request #33550 from rtreffer/kubelet-allow-multiple-dns-server
Automatic merge from submit-queue

Allow multipe DNS servers as comma-seperated argument for kubelet --dns

This PR explores how kubectls "--dns" could be extended to specify multiple DNS servers for in-cluster PODs. Testing on the local libvirt-coreos cluster shows that multiple DNS server are injected without issues.

Specifying multiple DNS servers increases resilience against
- Packet drops
- Single server failure

I am debugging services that do 50+ DNS requests for a single incoming interactive request, thus highly increase the chance of a slowdown (+5s) due to a single packet drop. Switching to two DNS servers will reduce the impact of the issues (roughly +1s on glibc, 0s on musl, error-rate goes down to error-rate^2).

Note that there is no need to change any runtime related code as far as I know. In the case of "default" dns the /etc/resolv.conf is parsed and multiple DNS server are send to the backend anyway. This only adds the same capability for the clusterFirst case.

I've heard from @thockin that multiple DNS entries are somehow considered. I've no idea what was considered, though. This is what I would like to see for our production use, though.

```release-note
NONE
```
2017-02-15 12:45:32 -08:00
David Ashpole
1d38818326 Revert "Merge pull request #41202 from dashpole/revert-41095-deletion_pod_lifecycle"
This reverts commit ff87d13b2c, reversing
changes made to 46becf2c81.
2017-02-15 08:44:03 -08:00
Kubernetes Submit Queue
dd696683b7 Merge pull request #40647 from NickrenREN/secretManager
Automatic merge from submit-queue (batch tested with PRs 41360, 41423, 41430, 40647, 41352)

optimize NewSimpleSecretManager and cleanupOrphanedPodCgroups
2017-02-15 05:06:11 -08:00
Yu-Ju Hong
fb94f441ce Set EnableCRI to true by default
This change makes kubelet to use the CRI implementation by default,
unless the users opt out explicitly by using --enable-cri=false.
For the rkt integration, the --enable-cri flag will have no effect
since rktnetes does not use CRI.

Also, mark the original --experimental-cri flag hidden and deprecated,
so that we can remove it in the next release.
2017-02-14 16:15:51 -08:00
NickrenREN
31bfefca3c optimize NewSimpleSecretManager and cleanupOrphanedPodCgroups
remove NewSimpleSecretManager second return value and cleanupOrphanedPodCgroups's return since they will never return err
2017-02-14 09:47:05 +08:00
Kubernetes Submit Queue
e9de1b0221 Merge pull request #40992 from k82cn/rm_empty_line
Automatic merge from submit-queue (batch tested with PRs 41236, 40992)

Removed unnecessarly empty line.
2017-02-10 05:38:42 -08:00
Kubernetes Submit Queue
8188c3cca4 Merge pull request #40796 from wojtek-t/use_node_ttl_in_secret_manager
Automatic merge from submit-queue (batch tested with PRs 40796, 40878, 36033, 40838, 41210)

Implement TTL controller and use the ttl annotation attached to node in secret manager

For every secret attached to a pod as volume, Kubelet is trying to refresh it every sync period. Currently Kubelet has a ttl-cache of secrets of its pods and the ttl is set to 1 minute. That means that in large clusters we are targetting (5k nodes, 30pods/node), given that each pod has a secret associated with ServiceAccount from its namespaces, and with large enough number of namespaces (where on each node (almost) every pod is from a different namespace), that resource in ~30 GETs to refresh all secrets every minute from one node, which gives ~2500QPS for GET secrets to apiserver.

Apiserver cannot keep up with it very easily.

Desired solution would be to watch for secret changes, but because of security we don't want a node watching for all secrets, and it is not possible for now to watch only for secrets attached to pods from my node.

So as a temporary solution, we are introducing an annotation that would be a suggestion for kubelet for the TTL of secrets in the cache and a very simple controller that would be setting this annotation based on the cluster size (the large cluster is, the bigger ttl is). 
That workaround mean that only very local changes are needed in Kubelet, we are creating a well separated very simple controller, and once watching "my secrets" will be possible it will be easy to remove it and switch to that. And it will allow us to reach scalability goals.

@dchen1107 @thockin @liggitt
2017-02-10 00:04:44 -08:00
David Ashpole
b224f83c37 Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" 2017-02-09 08:45:18 -08:00
Wojciech Tyczynski
6c0535a939 Use secret TTL annotation in secret manager 2017-02-09 13:53:32 +01:00