Commit Graph

279 Commits

Author SHA1 Message Date
Jing Xu
a66ee2eb3f Add pod-level metric for CPU and memory stats
This PR adds the pod-level metrics for CPU and memory stats. cAdvisor
can get all pod cgroup information so we can add this pod-level CPU and
memory stats information from the corresponding pod cgroup
2017-11-22 09:25:23 -08:00
Jiaying Zhang
048bafdd0b Adds device plugin registration count metric and allocation latency metric. 2017-11-21 13:44:10 -08:00
Jiaying Zhang
1eb4e79453 Extends deviceplugin to gracefully handle full device plugin lifecycle.
- Instead of using cm.capacity field to communicate device plugin resource
capacity, this PR changes to use an explicit cm.GetDevicePluginResourceCapacity()
function that returns device plugin resource capacity as well as any inactive
device plugin resource. Kubelet syncNodeStatus call this function during its
periodic run to update node status capacity and allocatable. After this call,
device plugin can remove the inactive device plugin resource from its allDevices
field as the update is already pushed to API server.
- Extends device plugin checkpoint data to record registered resources
so that we can finish resource removing even upon kubelet restarts.
- Passes sourcesReady from kubelet to device plugin to avoid removing
inactive pods during grace period of kubelet restart.
2017-11-20 23:40:14 -08:00
Niklas Q. Nielsen
b16bfc768d Merging handler into manager API 2017-11-20 21:37:46 +00:00
Kubernetes Submit Queue
0b1d023aa7
Merge pull request #55884 from mpolednik/dpi-race-fix
Automatic merge from submit-queue (batch tested with PRs 55839, 54495, 55884, 55983, 56069). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

deviceplugin: fix race when multiple plugins are registered

**What this PR does / why we need it**:
When registering multiple device plugins to Kubelet concurrently, there exists a race that crashes the Kubelet.

Consider two plugins: D1 and D2. The call order method is roughly

D1 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D1).callback
D2 -> manager.go:register -> endpoint.go:listAndWatch -> device_plugin_handler.go:(*D2).callback

The callback function accesses HandlerImpl's allDevices map that maps (resourceName -> DeviceID). If both plugins reach these accesses at the same time, Kubelet crashes with "fatal error: concurrent map read and map write".

This can be solved by making sure handler is locked when allDevices are being updated. The functionality is needed to avoid Kubelet crashes when multiple device plugins are trying to register with Kubelet at the same moment. Occurs frequently when single binary tries to register itself as multiple plugins.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-11-20 13:08:09 -08:00
Kubernetes Submit Queue
869b5ab191
Merge pull request #55841 from ConnorDoyle/cpuman-file-state-for-none-policy
Automatic merge from submit-queue (batch tested with PRs 55841, 55948, 55945). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

CPU Manager: file state for all policies

**What this PR does / why we need it**:

Before this change, the new file-backed state was only enabled for the static CPU manager policy. This patch enables persistent state for all policies.

This PR fixes #55736 and the potential CPU resource leak described in that issue.

**Release note**:

```release-note
NONE
```

/kind bug
/sig node
/assign @balajismaniam
2017-11-18 14:10:12 -08:00
Kubernetes Submit Queue
c60b35bcd3
Merge pull request #52977 from yanxuean/improvecgroup
Automatic merge from submit-queue (batch tested with PRs 54837, 55970, 55912, 55898, 52977). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Improve kubelet cgroup

**What this PR does / why we need it**:
1.Use arg cgroupRoot,not nodeConfig.CgroupRoot
    Using both arg cgroupRoot and nodeConfig.CgroupRoot is confused in function NewQOSContainerManager
2.improve cgroupmanager in qosContainerManager
3. improve arg "cgroupRoot" type in NewQOSContainerManager

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-11-18 13:13:28 -08:00
Michael Taufen
1085b6f730 Lift embedded structure out of eviction-related KubeletConfiguration fields
- Changes the following KubeletConfiguration fields from `string` to
`map[string]string`:
  - `EvictionHard`
  - `EvictionSoft`
  - `EvictionSoftGracePeriod`
  - `EvictionMinimumReclaim`
- Adds flag parsing shims to maintain Kubelet's public flags API, while
enabling structured input in the file API.
- Also removes `kubeletconfig.ConfigurationMap`, which was an ad-hoc flag
parsing shim living in the kubeletconfig API group, and replaces it
with the `MapStringString` shim introduced in this PR. Flag parsing
shims belong in a common place, not in the kubeletconfig API.
I manually audited these to ensure that this wouldn't cause errors
parsing the command line for syntax that would have previously been
error free (`kubeletconfig.ConfigurationMap` was unique in that it
allowed keys to be provided on the CLI without values. I believe this was
done in `flags.ConfigurationMap` to facilitate the `--node-labels` flag,
which rightfully accepts value-free keys, and that this shim was then
just copied to `kubeletconfig`). Fortunately, the affected fields
(`ExperimentalQOSReserved`, `SystemReserved`, and `KubeReserved`) expect
non-empty strings in the values of the map, and as a result passing the
empty string is already an error. Thus requiring keys shouldn't break
anyone's scripts.
- Updates code and tests accordingly.

Regarding eviction operators, directionality is already implicit in the
signal type (for a given signal, the decision to evict will be made when
crossing the threshold from either above or below, never both). There is
no need to expose an operator, such as `<`, in the API. By changing
`EvictionHard` and `EvictionSoft` to `map[string]string`, this PR
simplifies the experience of working with these fields via the
`KubeletConfiguration` type. Again, flags stay the same.

Other things:
- There is another flag parsing shim, `flags.ConfigurationMap`, from the
shared flag utility. The `NodeLabels` field still uses
`flags.ConfigurationMap`. This PR moves the allocation of the
`map[string]string` for the `NodeLabels` field from
`AddKubeletConfigFlags` to the defaulter for the external
`KubeletConfiguration` type. Flags are layered on top of an internal
object that has undergone conversion from a defaulted external object,
which means that previously the mere registration of flags would have
overwritten any previously-defined defaults for `NodeLabels` (fortunately
there were none).
2017-11-16 18:35:13 -08:00
Martin Polednik
6e3f8f3890 deviceplugin: fix race when multiple plugins are registered
Signed-off-by: Martin Polednik <mpolednik@redhat.com>
2017-11-16 15:20:00 +01:00
Connor Doyle
c95ee34234 Use file-backed state for all cpumanager policies
- Add unit test to verify policy name mismatch behavior.
2017-11-15 22:38:11 -08:00
Kubernetes Submit Queue
e99544d018
Merge pull request #54409 from intelsdi-x/cpu-enable-state-file
Automatic merge from submit-queue (batch tested with PRs 55764, 55683, 55468, 54409, 55546). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable file back state in static policy

**What this PR does / why we need it**:
Enables file back `State` in `static policy` and cpu manager + tests.
Upon policy start, state read from file is validated whether it meets the policy assumption. In case of any error, state is cleared.

Previous PR: #54408
Next PR: #54409
2017-11-15 22:16:05 -08:00
Kubernetes Submit Queue
6f35d49079
Merge pull request #52149 from lichuqiang/combineListwatch
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Deviceplugin refactoring: merge func list and listwatch in endpoint into one

**What this PR does / why we need it**:
merge func list and listwatch in endpoint into one, since we won't call list func individually

**Which issue this PR fixes**
fixes #51993
Part2

**Special notes for your reviewer**:
/cc @jiayingz @RenaudWasTaken @vishh

**Release note**:

```release-note
NONE
```
2017-11-15 16:56:51 -08:00
Jiaying Zhang
93916242f7 Adds jiayingz@ and vish@ as approvers for pkg/kubelet/cm/deviceplugin/. 2017-11-14 15:27:02 -08:00
Michał Stachowski
809ac834a0 Cpu manager file state tests 2017-11-14 18:26:41 +01:00
Szymon Scharmach
7e7301ffaf Enable file state in static policy 2017-11-14 18:25:58 +01:00
lichuqiang
4fa0fa5ad1 pass devices of previous endpoint into re-registered one to avoid potential orphaned devices upon re-registration 2017-11-14 16:43:19 +08:00
Kubernetes Submit Queue
e2c02f425a
Merge pull request #53970 from ScorpioCPH/add-more-comments
Automatic merge from submit-queue (batch tested with PRs 55283, 55461, 55288, 53970, 55487). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more comments for DevicePluginHandlerImpl struct

**What this PR does / why we need it**:

Add more comments

**Special notes for your reviewer**:

@jiayingz PTAL.

**Release note**:

```
NONE
```
2017-11-13 12:32:27 -08:00
Dr. Stefan Schimanski
bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Clayton Coleman
66590d6f83
Container manager has a bad fake interface 2017-11-03 22:21:29 -04:00
Penghao Cen
1d4e1942d8 Add more comments for HandlerImpl struct 2017-11-03 18:24:32 +08:00
Kubernetes Submit Queue
2084f7f4f3
Merge pull request #54488 from lichuqiang/plugin_base
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add admission handler for device resources allocation

**What this PR does / why we need it**:
Add admission handler for device resources allocation to fail fast during pod creation

**Which issue this PR fixes** 
fixes #51592

**Special notes for your reviewer**:
@jiayingz Sorry, there is something wrong with my branch in #51895. And I think the existing comments in the PR might be too long for others to view. So I closed it and opened the new one, as we have basically reach an agreement on the implement :)
I have covered the functionality and unit test part here, and would set about the e2e part ASAP

/cc @jiayingz @vishh @RenaudWasTaken 

**Release note**:

```release-note
NONE
```
2017-11-02 17:24:06 -07:00
lichuqiang
0630896383 update unit test for plugin resources allocation reinforcement 2017-11-02 09:18:24 +08:00
lichuqiang
ebd445eb8c add admission handler for device resources allocation 2017-11-02 09:17:48 +08:00
Shawn Hsiao
f7a15cb751 set leveled logging (v=4) for 'updating container' message 2017-11-01 16:54:23 -04:00
Kubernetes Submit Queue
94e77bd4ca Merge pull request #54408 from intelsdi-x/cpu-state-file
Automatic merge from submit-queue (batch tested with PRs 54656, 54552, 54389, 53634, 54408). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add file backed state to cpu manager

**What this PR does / why we need it**:
Adds file backed `State` implementation to cpu manger with tests.
Reads from `State` are done from memory, while each write triggers state save to a file.

Any failure in reading the state file results in empty state

Next PR: #54409
2017-10-26 21:08:38 -07:00
Rohit Agarwal
092429be1c Better error messages and logging while registering device plugins. 2017-10-26 15:17:38 -07:00
Michał Stachowski
97e3f7bf86 State file test fixes 2017-10-26 20:03:35 +02:00
Szymon Scharmach
4ee0adc77a Added Cpu Manager file state 2017-10-26 20:03:17 +02:00
lichuqiang
6a39ac3874 merge func list and listwatch into one 2017-10-26 16:36:16 +08:00
Jiaying Zhang
e501f01d85 Move podDevices code into a separate file. 2017-10-24 17:48:59 -07:00
Jiaying Zhang
ff4e8d429e Device plugin code refactoring to cope with file move.
While moving device_plugin_handler_test.go from pkg/kubelet/cm/ to
pkg/kubelet/cm/deviceplugin/, we can no longer uses cm in its tests
because that would cause a cycle dependency. To solve this problem,
I moved the main cm GetResources functionality as well as part of the
current device plugin handler Allocate functionality into a new device
plugin handler function, GetDeviceRunContainerOptions(). This
refactoring is also needed by another PR 51895 that moves device
allocation into admission phase. Now device plugin handler Allocate()
first checks whether there is cached device runtime state and only
issues Allocate grpc call if there is no cached state available.
The new GetDeviceRunContainerOptions() function simply returns device
runtime config from the cached state. To support this change, extended the
podDevices struct and checkpoint data structure with device runtime state.
2017-10-24 14:38:15 -07:00
Jiaying Zhang
796f488789 Move device plugin related files under pkg/kubelet/cm/deviceplugin/. 2017-10-24 14:17:20 -07:00
lichuqiang
fd8b04649e unnecessary functions cleanup for deviceplugin 2017-10-20 09:37:59 +08:00
Vishnu kannan
16b0363b95 Disabling k8s.io/kubernetes/pkg/kubelet/cm TestPodContainerDeviceAllocation due to #54100
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan
e0032af916 bump device plugin version to v1alpha2 to reflect the change to AllocateResponce API
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:35:24 -07:00
Vishnu kannan
18eee1eaa0 Make AllocateResponse artifacts global across all devices per container in device plugin API
There is no use case known for passing artifacts per device as it currently exists. The current API is also
complex to use for simple clients. Hence this PR creates a flat namespace where artifacts like environment variables
and mount points apply globally to all devices returned as part of AllocateResponse proto.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2017-10-19 10:34:00 -07:00
Kubernetes Submit Queue
1d8f1e268f Merge pull request #47699 from supereagle/fix-typos
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix typos: remove duplicated word in comments

**What this PR does / why we need it**: Remove the duplicated word `the` in comments

**Which issue this PR fixes** : fixes #

**Special notes for your reviewer**:

```release-note
NONE
```
2017-10-17 02:35:52 -07:00
Jeff Grafton
aee5f457db update BUILD files 2017-10-15 18:18:13 -07:00
Kubernetes Submit Queue
3deab69d3b Merge pull request #53790 from yanxuean/cgroupredundancy
Automatic merge from submit-queue (batch tested with PRs 52959, 53790). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

remove redundancy code in setCPUCgroupConfig

fix #53925

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>



**What this PR does / why we need it**:

The check of burstableCPUShares is redundancy. We have done it in MilliCPUToShares. It is responsibility of MilliCPUToShares.
```
func (m *qosContainerManagerImpl) setCPUCgroupConfig(configs map[v1.PodQOSClass]*CgroupConfig) error {
        ........
	// set burstable shares based on current observe state
	burstableCPUShares := MilliCPUToShares(burstablePodCPURequest)
	if burstableCPUShares < uint64(MinShares) {
		burstableCPUShares = uint64(MinShares)
	}
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Improveing code.

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-13 19:19:32 -07:00
yanxuean
5d5fee8cab capitalize the first letter
capitalize the first letter for the field comment of containerManagerImpl

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-13 14:54:06 +08:00
Kubernetes Submit Queue
03adf92aa9 Merge pull request #53753 from derekwaynecarr/log-spam
Automatic merge from submit-queue (batch tested with PRs 53119, 53753, 53795, 52981). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce log spam in qos container manager

**What this PR does / why we need it**:
excessive log stmts make it hard to debug actual problems.

**Release note**:
```release-note
NONE
```
2017-10-12 08:28:36 -07:00
yanxuean
8adb2181eb remove redundancy code in setCPUCgroupConfig
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-10-12 18:42:18 +08:00
Derek Carr
328a12d160 Reduce log spam in qos container manager 2017-10-11 19:47:40 -04:00
Euan Kemp
7aa88b5103 kubelet/cm: remove unneeded fork of 'cat'
Reading a file in Go is perfectly possible without invoking cat.

I also removed an outdated comment.
2017-10-10 21:53:35 -07:00
Kubernetes Submit Queue
ec116fdc73 Merge pull request #53328 from intelsdi-x/lscpu_fix
Automatic merge from submit-queue (batch tested with PRs 53297, 53328). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Cpu Manager - make CoreID's platform unique

**What this PR does / why we need it**:
Cpu Manager uses topology from cAdvisor(`/proc/cpuinfo`) where coreID's are socket unique - not platform unique - this causes problems on multi-socket platforms.

All code assumes unique coreID's (on platform) -  `Discovery` function has been changed to assign CoreID as the lowest cpuID from all cpus belonging to the same core. This can be expressed as:
`CoreID=min(cpuID's on the same core)`

Since cpuID's are platform unique - above gives us guarantee that CoreID's will also be platform unique.



**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53323
2017-10-10 11:20:37 -07:00
Kubernetes Submit Queue
aaf14d4619 Merge pull request #53525 from sttts/sttts-scheme-copier-romoval
Automatic merge from submit-queue (batch tested with PRs 53525, 53652). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

apimachinery: remove ObjectCopier interface(s)

The big commit is a mechanical, transitive removal of the copier interfaces in all structs and function calls.
2017-10-10 08:31:41 -07:00
Szymon Scharmach
b86dc9c054 Make CoreID's platform unique 2017-10-10 10:45:44 +02:00
Kubernetes Submit Queue
c12dab37e7 Merge pull request #53547 from jiayingz/deviceplugin-fix
Automatic merge from submit-queue (batch tested with PRs 52662, 53547, 53588, 53573, 53599). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

In DevicePluginHandlerImpl.Allocate(), skips untracked extended resou…

…rces.

Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.



**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
https://github.com/kubernetes/kubernetes/issues/53548

**Special notes for your reviewer**:

**Release note**:

```release-note
Ignore extended resources that are not registered with kubelet
```
2017-10-09 12:51:17 -07:00
Kubernetes Submit Queue
85b252d47e Merge pull request #51771 from dixudx/refactor_nsenter
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Refactor nsenter

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #51273

**Special notes for your reviewer**:
/assign @jsafrane 

**Release note**:

```release-note
None
```
2017-10-08 23:27:32 -07:00
Dr. Stefan Schimanski
ecb65a6a71 Update generated files 2017-10-07 11:28:47 +02:00
Jiaying Zhang
ee1ffa619b In DevicePluginHandlerImpl.Allocate(), skips untracked extended resources.
Otherwise, we would fail a Pod allocation request that has an extended
resource not managed by any device plugin.
2017-10-06 13:57:53 -07:00
Dr. Stefan Schimanski
ed586da147 apimachinery: remove Scheme.DeepCopy 2017-10-06 14:59:17 +02:00
Kubernetes Submit Queue
5e2ce3aaf2 Merge pull request #53122 from resouer/fix-cpu
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Eliminate extra CRI call during processing cpu set

**What this PR does / why we need it**:

Encountered this during `kubernetes/frakti` node e2e test.

When cpuset is not set, there's still plenty of `runtime.UpdateContainerResources` been called, which seems unnecessary.

cc @ConnorDoyle Make sense? Fixes: #53304

**Special notes for your reviewer**:

**Release note**:

```release-note
Only do UpdateContainerResources when cpuset is set 
```
2017-10-01 15:30:56 -07:00
Harry Zhang
282973d87d Elimenate extra CRI call 2017-09-30 16:51:32 +08:00
Kubernetes Submit Queue
6fcf841d69 Merge pull request #52692 from wackxu/fbc
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix the bad code comment and make the format unify

**What this PR does / why we need it**:

Fix the bad code comment and make the format unify

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #


**Release note**:

```release-note
NONE
```
2017-09-28 21:15:43 -07:00
Di Xu
57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
NickrenREN
7f9696201e Fix --kube-reserved storage key name and add test cases for node allocatable reservation 2017-09-26 09:32:21 +08:00
yanxuean
f011c044d4 improve cgroupmanager in qosContainerManager
improve arg "cgroupRoot" type in NewQOSContainerManager

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-09-25 16:59:15 +08:00
yanxuean
45146cff4e Use arg cgroupRoot,not nodeConfig.CgroupRoot
Using both arg cgroupRoot and nodeConfig.CgroupRoot is confused in function NewQOSContainerManager

Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2017-09-25 15:19:20 +08:00
wackxu
d8aa0ca82a fix the bad code comment and make the format unify 2017-09-19 11:15:10 +08:00
supereagle
87c29a08e1 fix typos: remove duplicated word in comments 2017-09-16 14:38:10 +08:00
Balaji Subramaniam
e2cb80db4a Added large topology tests for static policy in CPU Manager.
- Added comments for tests cases.
2017-09-06 13:15:22 -07:00
Kubernetes Submit Queue
dcc1aa0628 Merge pull request #51928 from mindprince/pr-45724-fix-build
Automatic merge from submit-queue

Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.

This was broken in #45724

**Release note**:
```release-note
NONE
```
/sig storage
/sig node

/cc @jsafrane, @vishh
2017-09-05 19:44:54 -07:00
Kubernetes Submit Queue
99aa992ce8 Merge pull request #51751 from dashpole/update_cadvisor_godep
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)

Update Cadvisor Dependency

Fixes: https://github.com/kubernetes/kubernetes/issues/51832
This is the worst dependency update ever... 
The root of the problem is the [name change of Sirupsen -> sirupsen](https://github.com/sirupsen/logrus/issues/570#issuecomment-313933276).  This means that in order to update cadvisor, which venders the lowercase, we need to update all dependencies to use the lower-cased version.  With that being said, this PR updates the following packages:

`github.com/docker/docker`
- `github.com/docker/distribution`
  - `github.com/opencontainers/go-digest`
  - `github.com/opencontainers/image-spec`
  - `github.com/opencontainers/runtime-spec`
  - `github.com/opencontainers/selinux`
  - `github.com/opencontainers/runc`
    - `github.com/mrunalp/fileutils`
  - `golang.org/x/crypto`
    - `golang.org/x/sys`
- `github.com/docker/go-connections`
- `github.com/docker/go-units`
- `github.com/docker/libnetwork`
- `github.com/docker/libtrust`
- `github.com/sirupsen/logrus`
- `github.com/vishvananda/netlink`

`github.com/google/cadvisor`
- `github.com/euank/go-kmsg-parser`

`github.com/json-iterator/go`

Fixed https://github.com/kubernetes/kubernetes/issues/51832

```release-note
Fix journalctl leak on kubelet restart
Fix container memory rss
Add hugepages monitoring support
Fix incorrect CPU usage metrics with 4.7 kernel
Add tmpfs monitoring support
```
2017-09-05 17:30:06 -07:00
Kubernetes Submit Queue
8b9e8cf80a Merge pull request #51744 from jiayingz/deviceplugin-checkpoint
Automatic merge from submit-queue (batch tested with PRs 50072, 51744)

Deviceplugin checkpoint

**What this PR does / why we need it**:
Extends on top of PR 51209 to checkpoint device to pod allocation information on Kubelet to recover from Kubelet restarts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-09-05 13:33:01 -07:00
David Ashpole
e5a6a79fd7 update cadvisor, docker, and runc godeps 2017-09-05 12:38:57 -07:00
Jiaying Zhang
3b2bc58c11 Extends device_plugin_handler to checkpoint device to container allocation information. 2017-09-05 09:52:14 -07:00
Derek Carr
38d5dee677 Node validation restricts pre-allocated hugepages to single page size 2017-09-05 10:34:30 -04:00
Derek Carr
1ec2a69d9a Kubelet changes to support hugepages 2017-09-05 09:46:08 -04:00
Rohit Agarwal
08ea02b9a5 Make *fakeMountInterface in container_manager_unsupported_test.go implement mount.Interface again.
This was broken in #45724
2017-09-04 21:48:55 -07:00
Balaji Subramaniam
5b5958ecec Add tests for the static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle
d0bcbbb437 Added static cpumanager policy. 2017-09-04 07:24:59 -07:00
Connor Doyle
e03a6435bb Added cpu assignment helpers. 2017-09-04 07:24:59 -07:00
Szymon Scharmach
242439c9d7 Add topology helper and tests to cpumanager. 2017-09-04 07:24:59 -07:00
Connor Doyle
e4d5565228 Fix Start signature in container_manager_windows. 2017-09-04 07:24:59 -07:00
Connor Doyle
81ccd396d7 Fixed nil InternalContainerLifecycle in cm stubs. 2017-09-04 07:24:59 -07:00
Connor Doyle
ec706216e6 Un-revert "CPU manager wiring and none policy"
This reverts commit 8d2832021a.
2017-09-04 07:24:59 -07:00
Jiaying Zhang
29d178fbc3 Fixes a cross-build failure introduced in PR 51209. FYI, issue 51863. 2017-09-02 21:56:39 -07:00
Kubernetes Submit Queue
917f9f02ef Merge pull request #45724 from jsafrane/mount-propagation2
Automatic merge from submit-queue

Make /var/lib/kubelet as shared during startup

This is part of ~~https://github.com/kubernetes/community/pull/589~~ https://github.com/kubernetes/community/pull/659

We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that `/var/lib/kubelet` has shared mount propagation to be able to see these mounts. 

On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that `/var/lib/kubelet` is a directory on `/` which has private mount propagation and kubelet bind-mounts `/var/lib/kubelet` as rshared.

Both "regular" linux mounter and `NsenterMounter` are updated here.

@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews 
@vishh 

Release note:
```release-note
Kubelet re-binds /var/lib/kubelet directory with rshared mount propagation during startup if it is not shared yet.
```
2017-09-02 12:00:30 -07:00
Jiaying Zhang
02001af752 Kubelet side extension to support device allocation 2017-09-01 11:56:35 -07:00
Renaud Gaubert
c4a1c97329 Device Plugin Kubelet integration 2017-09-01 11:47:09 -07:00
Shyam JVS
8d2832021a Revert "CPU manager wiring and none policy" 2017-09-01 18:17:36 +02:00
Connor Doyle
50674ec614 Added cpu-manager-reconcile-period config.
- Defaults to sync-frequency.
2017-08-30 23:42:32 -07:00
Connor Doyle
7c6e31617d CPU Manager initialization and lifecycle calls. 2017-08-30 08:50:41 -07:00
Connor Doyle
5dee682796 CPU manager config and feature gate. 2017-08-30 08:27:23 -07:00
Balaji Subramaniam
7567f1765f Added CPU manager unit tests (none policy) 2017-08-30 08:26:22 -07:00
Seth Jennings
ff471913f9 Added none policy for CPU manager. 2017-08-30 08:26:21 -07:00
Connor Doyle
01d1d8f23f Added in-memory CPU manager state. 2017-08-30 08:26:21 -07:00
Jan Safranek
d9500105d8 Share /var/lib/kubernetes on startup
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
2017-08-30 16:45:04 +02:00
Connor Doyle
726bd8e27b Add CPU manager interfaces. 2017-08-29 03:42:17 -07:00
Kubernetes Submit Queue
98fb8cacf9 Merge pull request #50773 from huzhengchuan/bug/50770
Automatic merge from submit-queue (batch tested with PRs 51391, 51338, 51340, 50773, 49599)

Delete "hugetlb" from whitelistControllers

**What this PR does / why we need it**:
Delete "hugetlb" from whitelistControllers

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50770

**Special notes for your reviewer**:

**Release note**:

```
NONE
```
2017-08-26 08:49:26 -07:00
NickrenREN
27901ad5df Change eviction policy to manage one single local storage resource 2017-08-26 05:14:49 +08:00
Connor Doyle
515d86faa0 Add CPUSetBuilder, make CPUSet immutable. 2017-08-22 22:33:04 -07:00
Connor Doyle
e686ecb6ea Renamed CPUSet.AsSlice() => CPUSet.ToSlice() 2017-08-22 21:21:26 -07:00
Connor Doyle
8f38abb350 Add cpuset helper library. 2017-08-22 11:42:01 -07:00
Kubernetes Submit Queue
d2cf96d6ef Merge pull request #48057 from NickrenREN/fix-validateNodeAllocatable
Automatic merge from submit-queue (batch tested with PRs 50758, 48057)

Fix node allocatable resource validation

GetNodeAllocatableReservation gets all the reserved resource value
Allocatable resource = capacity - reservation


**Release note**:

```release-note
NONE
```
2017-08-16 07:57:24 -07:00
zhengchuan hu
938bffcb04 Delete "hugetlb" from whitelistControllers 2017-08-16 22:52:56 +08:00
Michael Taufen
24bab4c20f move KubeletConfiguration out of componentconfig API group 2017-08-15 08:12:42 -07:00
NickrenREN
eadb7ca8c0 Fix node allocatable resource validation
GetNodeAllocatableReservation gets all the reserved resource, and we need to compare it with capacity
2017-08-14 10:20:40 +08:00