Commit Graph

1469 Commits

Author SHA1 Message Date
Michael Taufen
cbebb61450 Kubelet flags take precedence over config from files/ConfigMaps
Changes the Kubelet configuration flag precedence order so that flags
take precedence over config from files/ConfigMaps.

See issue #56171 for more details.

Also modifies e2e node test suite to transform all relevant Kubelet
flags into a config file before starting tests when the
KubeletConfigFile feature gate is true, and turns on the
KubeletConfigFile gate for all e2e node tests. This allows the alpha
dynamic Kubelet config feature to continue to work in tests after
the precedence change.
2017-11-21 16:02:27 -08:00
Kubernetes Submit Queue
03b7d77be4 Merge pull request #54316 from dashpole/disk_request_eviction
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Take disk requests into account during evictions

fixes #54314

This PR is part of the local storage feature, and it makes the eviction manager take disk requests into account during disk evictions.
This uses the same eviction strategy as we do for memory.
Disk requests are only considered when the LocalStorageCapacityIsolation feature gate is enabled.  This is enforced by adding a check for the feature gate in getRequests().
I have added unit testing to ensure that previous behavior is preserved when the feature gate is disabled.
Most of the changes are testing.  Reviewers should focus on changes in **eviction/helpers.go**

/sig node
/assign @jingxu97  @vishh
2017-11-21 14:31:47 -08:00
Jiaying Zhang
048bafdd0b Adds device plugin registration count metric and allocation latency metric. 2017-11-21 13:44:10 -08:00
Balaji Subramaniam
16e0f12253 Enable cpu manager only if the test is not skipped.
- Also, if KubeReserved is nil, allocate a map.
2017-11-21 10:48:54 -08:00
David Ashpole
8b3bd5ae60 take disk requests into account during evictions 2017-11-21 10:21:30 -08:00
Jiaying Zhang
990113ce60 Extends gpu_device_plugin e2e_node test to verify that scheduled pods
can continue to run even after device plugin deletion and kubelet
restarts.
2017-11-20 23:40:27 -08:00
Kubernetes Submit Queue
9fe2a62b90 Merge pull request #55338 from dashpole/remove_disk_allocatable
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Remove Ephemeral Storage Allocatable Evictions

Issue #52336

Rationale and docs change: https://github.com/kubernetes/community/pull/1275

cc @kubernetes/sig-node-pr-reviews 
cc @derekwaynecarr @vishh 
/assign @jingxu97 
/assign @dchen1107
2017-11-20 21:43:24 -08:00
Jing Xu
75ef18c4d3 Add Pod-level local ephemeral storage metric in Summary API
This PR adds pod-level ephemeral storage metric into Summary API.
Pod-level ephemeral storage usage is the sum of all containers and local
ephemeral volume including EmptyDir (if not backed up by memory or
hugepages), configueMap, and downwardAPI.
2017-11-20 16:32:38 -08:00
Kubernetes Submit Queue
3679b54b19 Merge pull request #55898 from dashpole/fix_flaky_allocatable
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix Flaky Allocatable Setup Tests

**What this PR does / why we need it**:
Fixes a flaky node e2e serial test.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #55830

**Special notes for your reviewer**:
The test was flaking because we were reading the node status before the restarted kubelet had written it.
This fixes this by waiting until we see an updated node status (looking at the condition's heartbeat time).
This also fixes an incorrect error message.

**Release note**:
```release-note
NONE
```
2017-11-18 13:13:24 -08:00
Kubernetes Submit Queue
7d1085e122 Merge pull request #54837 from xiangpengzhao/conf-test
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use framework.ConformanceIt for node e2e conformance tests

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #54726 #53909

**Special notes for your reviewer**:
/cc @mml 

**Release note**:

```release-note
NONE
```
2017-11-18 13:13:17 -08:00
Kubernetes Submit Queue
ef3b27cbd4 Merge pull request #55642 from dashpole/disable_cadvisor_disk_for_cri
Automatic merge from submit-queue (batch tested with PRs 55642, 55897, 55835, 55496, 55313). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Disable container disk metrics when using the CRI stats integration

Issue: https://github.com/kubernetes/kubernetes/issues/51798

As explained in the issue, runtimes which make use of the CRI Stats API still have the performance overhead of collecting those same stats through cAdvisor.
The CRI Stats API has metrics for CPU, Memory, and Disk.  This PR significantly reduces the added overhead due to collecting these stats in both cAdvisor and in the runtime.
This PR disables container disk metrics, which are very expensive to collect.

This PR does not disable node-level disk stats, as the "Raw" container handler does not currently respect ignoring DiskUsageMetrics.
This PR factors out the logic for determining whether or not to use the CRI stats provider into a helper function, as cAdvisor is instantiated before it is passed to the kubelet as a dependency.

cc @kubernetes/sig-node-pr-reviews @derekwaynecarr  
/kind feature
/sig node

/assign @Random-Liu @derekwaynecarr
2017-11-18 10:46:30 -08:00
David Ashpole
527611ee41 remove disk allocatable evictions 2017-11-18 10:34:59 -08:00
Derek Carr
db89b46ce7 kubelet summary api test updates 2017-11-17 22:30:49 -05:00
Andy Xie
64a8edfbcf fix network value for stats summary 2017-11-18 10:17:59 +08:00
xiangpengzhao
7fdea2b0cf Use framework.ConformanceIt for node e2e conformance tests 2017-11-17 17:28:20 +08:00
Michael Taufen
1085b6f730 Lift embedded structure out of eviction-related KubeletConfiguration fields
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
  - `EvictionHard`
  - `EvictionSoft`
  - `EvictionSoftGracePeriod`
  - `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.

Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.

Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
2017-11-16 18:35:13 -08:00
David Ashpole
8f3e2f315e fix flaky allocatable test 2017-11-16 11:16:58 -08:00
Kubernetes Submit Queue
779105673a Merge pull request #55188 from mindprince/accelerator-monitoring
Automatic merge from submit-queue (batch tested with PRs 55798, 49579, 54862, 55188, 51990). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add monitoring support for hardware accelerators

Currently only NVIDIA GPU monitoring is implemented.

Feature repo issue: https://github.com/kubernetes/features/issues/369
cAdvisor PR: https://github.com/google/cadvisor/pull/1762

/kind feature
/sig node
/sig instrumentation
/area hw-accelerators

**Release note**:
```release-note
Kubelet now exposes metrics for NVIDIA GPUs attached to the containers.
```
2017-11-16 03:09:21 -08:00
Yang Guo
7eb7cfe3ef Add a cloud-init script to disable live-restore 2017-11-14 21:40:13 -08:00
David Ashpole
220edbc6e3 disable container disk metrics when using the CRI stats integration 2017-11-14 11:43:08 -08:00
Kubernetes Submit Queue
41fe3ed5bc Merge pull request #54405 from resouer/clean-docker-dep
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[Part 1] Remove docker dep in kubelet startup

**What this PR does / why we need it**:

Remove dependency of docker during kubelet start up.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Part 1 of #54090 

**Special notes for your reviewer**:
Changes include:

1. Move docker client initialization into dockershim pkg.
2. Pass a docker `ClientConfig` from kubelet to dockershim
3. Pass parameters needed by `FakeDockerClient` thru `ClientConfig` to dockershim

(TODO, the second part) Make dockershim tolerate when dockerd is down, otherwise it will still fail kubelet

Please note after this PR, kubelet will still fail if dockerd is down, this will be fixed in the subsequent PR by making dockershim tolerate dockerd failure (initializing docker client in a separate goroutine), and refactoring cgroup and log driver detection. 

**Release note**:

```release-note
Remove docker dependency during kubelet start up 
```
2017-11-13 03:59:53 -08:00
Rohit Agarwal
9c38abd482 Expose accelerator metrics in the summary API. 2017-11-10 14:59:43 -08:00
Yang Guo
ed8cd396dd Use whitelisted test image 2017-11-10 14:16:27 -08:00
Yang Guo
8ea9417a37 Adjust GKE spec to validate images with kernel version 4.10+ 2017-11-10 09:47:08 -08:00
Penghao Cen
22b04c828b Append --feature-gates option iff TestContext.FeatureGates is not nil 2017-11-10 19:42:22 +08:00
Dr. Stefan Schimanski
bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Kubernetes Submit Queue
f7dc3966a4 Merge pull request #47497 from mikedanese/binary
Automatic merge from submit-queue (batch tested with PRs 54773, 52523, 47497, 55356, 49429). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

don't check in mounter binary

```release-note
GCI mounter is moved from the manifests tarball to the server tarball.
```
2017-11-08 22:11:53 -08:00
Kubernetes Submit Queue
fdeeed1001 Merge pull request #54688 from yanxuean/besteffort
Automatic merge from submit-queue (batch tested with PRs 53645, 54734, 54586, 55015, 54688). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

e2e-node:the value of bestEffortCgroup is wrong

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>

**What this PR does / why we need it**:
The value of bestEffortCgroup is wrong in e2e-node. The test case is invalid actually.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-06 15:33:50 -08:00
zouyee
68c5ce19b8 [test/e2e_node]Redirect dl.k8s.io to the kubernetes-release GCS bucket 2017-11-02 12:18:50 +08:00
Harry Zhang
de1c305356 Remove docker dep in kubelet startup
Update bazel
2017-11-01 10:03:01 +08:00
Kubernetes Submit Queue
94935721d5 Merge pull request #54160 from mtaufen/runtime-config-to-flags
Automatic merge from submit-queue (batch tested with PRs 54160, 54016). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move runtime-related flags from KubeletConfiguration to KubeletFlags

With respect to https://github.com/kubernetes/kubernetes/pull/53833#issuecomment-336317287, move runtime-related flags out of KubeletConfiguration.

Broader issue: https://github.com/kubernetes/features/issues/281

```release-note
NONE
```
2017-10-31 01:23:15 -07:00
Mike Danese
bef68f7dbc cluster: build gci mounter like other go binaries 2017-10-30 13:56:09 -07:00
Kubernetes Submit Queue
eff1a84638 Merge pull request #52256 from feiskyer/credential-provider-test
Automatic merge from submit-queue (batch tested with PRs 49762, 52256). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add node e2e tests for pulling images from credential providers

**What this PR does / why we need it**:

Add node e2e tests for pulling images from credential providers.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

Refer https://github.com/kubernetes/kubernetes/pull/51870#issuecomment-328234010

**Special notes for your reviewer**:

/assign @yujuhong @Random-Liu 

1. We still need to add ResetDefaultDockerProviderExpiration for facilitating tests
2. Do we need a separate image for pulling private image from credential provider?
3. Any suggestion of also adding this for sandbox images? the pause image is a global config of kubelet, but we only need to set a private one for just one test case. 

**Release note**:

```release-note
NONE
```
2017-10-27 22:48:28 -07:00
yanxuean
f849ebdefa e2e-node:the value of bestEffortCgroup is wrong
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-27 17:10:53 +08:00
Kevin
4c8539cece use core client with explicit version globally 2017-10-27 15:48:32 +08:00
Kubernetes Submit Queue
931bc9edf4 Merge pull request #53730 from bsteciuk/kubeadm-windows-e2e_node
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add Windows support to the system verification check

**What this PR does / why we need it**:  This PR (in conjunction with https://github.com/kubernetes/kubernetes/pull/53553 ) adds initial support for adding a Windows worker node to a Kubernetes cluster using
 kubeadm.  It was suggested on that PR to open a separate PR for the changes in test/e2e_node for review by sig-node devs.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #364 in conjuction with #53553 

**Special notes for your reviewer**:

**Release note**:

```release-note
Add Windows support to the system verification check
```
2017-10-26 19:23:01 -07:00
Pengfei Ni
2bbba1f662 Add node e2e tests for pulling images from credential providers 2017-10-26 20:55:13 +08:00
Bob Steciuk
94db64fcb3 Add Windows support to the system verification check
Pulled SysSpecs out of types.go and created two os specific implementations with build tags

Similarly created conditionally compiled implementations of KernelValidationHelper to get Kernel version in os specific manner, as well as os specific docker endpoints (socket vs named pipes)
2017-10-25 18:55:47 -04:00
Kubernetes Submit Queue
1336cc0b05 Merge pull request #53051 from tanshanshan/test925
Automatic merge from submit-queue (batch tested with PRs 53051, 52489, 53920). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix todo

**What this PR does / why we need it**:
fix todo 
thanks
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-24 21:38:17 -07:00
Michael Taufen
f90b46c784 Move runtime-related flags from KubeletConfiguration to KubeletFlags 2017-10-23 11:15:48 -07:00
Sen Lu
0538b65421 Add a notice for node e2e config files 2017-10-20 16:10:13 -07:00
foxyriver
7d71129ff0 delete archive 2017-10-20 11:40:52 +08:00
Di Xu
f7f3577035 use multi-arch busybox for e2e 2017-10-19 10:36:31 +08:00
Dr. Stefan Schimanski
cad0364e73 Update bazel 2017-10-18 17:24:04 +02:00
Dr. Stefan Schimanski
7773a30f67 pkg/api/legacyscheme: fixup imports 2017-10-18 17:23:55 +02:00
Kubernetes Submit Queue
855551dc80 Merge pull request #51250 from dixudx/bump_cni_v0.6.0
Automatic merge from submit-queue (batch tested with PRs 53106, 52193, 51250, 52449, 53861). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

bump CNI to v0.6.0

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #49480

**Special notes for your reviewer**:
/assign @luxas @bboreham @feiskyer 

**Release note**:

```release-note
bump CNI to v0.6.0
```
2017-10-16 14:47:23 -07:00
Jeff Grafton
aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Di Xu
dba448c2a6 Update all binary download references to v0.6.0 2017-10-14 22:24:49 +08:00
David Ashpole
539fddb49d kubelet evictions take priority into account 2017-10-12 13:15:05 -07:00