433 Commits

Author SHA1 Message Date
danielqsj
79a3eb816c rename latency to duration in metrics 2019-02-18 17:40:04 +08:00
danielqsj
9fd99a48f5 Change kubelet metrics to conform guideline 2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot
c88dcee3e9 Merge pull request #73824 from jiayingz/reallocate
Checks whether we have cached runtime state before starting a container
2019-02-15 20:35:30 -08:00
Kubernetes Prow Robot
888ff4097a Merge pull request #73651 from RobertKrawitz/node_pids_limit
Support total process ID limiting for nodes
2019-02-13 17:31:18 -08:00
Robert Krawitz
2597a1d97e Implement SupportNodePidsLimit, hand-tested 2019-02-13 14:56:17 -05:00
Kubernetes Prow Robot
b50c643be0 Merge pull request #73540 from rlenferink/patch-5
Updated OWNERS files to include link to docs
2019-02-08 09:05:56 -08:00
Jiaying Zhang
00b88c14b0 Checks whether we have cached runtime state before starting a container
that requests any device plugin resource. If not, re-issue Allocate
grpc calls. This allows us to handle the edge case that a pod got
assigned to a node even before it populates its extended resource
capacity.
2019-02-07 11:12:36 -08:00
Kubernetes Prow Robot
dc1244c6cd Merge pull request #72785 from derekwaynecarr/hugepages-ga
Graduate HugePages feature to GA
2019-02-05 13:56:51 -08:00
Roy Lenferink
b43c04452f Updated OWNERS files to include link to docs 2019-02-04 22:33:12 +01:00
Kubernetes Prow Robot
03b434c9d4 Merge pull request #58122 from tianshapjq/nit-int-is-enough
Len() is already int
2019-02-03 12:02:24 -08:00
Derek Carr
deae071d78 Graduate HugePages feature to GA 2019-02-02 00:21:10 -05:00
Andrew Kim
84191eb99b replace pkg/util/file with k8s.io/utils/path 2019-01-29 15:20:13 -05:00
Bernhard Altendorfer
736f35ec29 Fix golint failures 2019-01-24 00:14:25 +01:00
David Ashpole
2b8bc85f75 fix panic in NodeAllocatable node e2e test 2019-01-17 10:57:09 -08:00
ailusazh
10995f661d clean containers in reconcileState of cpuManager 2019-01-15 16:09:28 +08:00
Kubernetes Prow Robot
0dbc99719a Merge pull request #72076 from derekwaynecarr/pid-limiting
SupportPodPidsLimit feature beta with tests
2019-01-10 01:18:30 -08:00
Kubernetes Prow Robot
d88994cf9f Merge pull request #71306 from ping035627/k8s-181121
fix some typos
2019-01-09 09:06:31 -08:00
Derek Carr
bce9d5f204 SupportPodPidsLimit feature beta with tests 2019-01-09 10:50:59 -05:00
Kubernetes Prow Robot
4e8bea4bb7 Merge pull request #71194 from yanghaichao12/dev1119-1
Fix comment error of 'cpuManagerStateFileName'
2018-12-17 20:28:19 -08:00
yuexiao-wang
7b6f60f085 modify BUILD
Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>
2018-12-11 11:22:06 +08:00
yuexiao-wang
f3353c358d [scheduler cleanup phase 2]: Rename to
Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>
2018-12-11 11:21:12 +08:00
k8s-ci-robot
79e5cb2cb7 Merge pull request #71302 from liggitt/verify-unit-test-feature-gates
Split mutable and read-only access to feature gates, limit tests to readonly access
2018-11-29 21:45:12 -08:00
saad-ali
a7c5582bba Permit use of deprecated dir in device plugin. 2018-11-21 18:37:31 -08:00
saad-ali
8f666d9e41 Modify kubelet watcher to support old versions
Modify kubelet plugin watcher to support older CSI drivers that use the
old plugins directory for socket registration.
Also modify CSI plugin registration to support multiple versions of CSI
registering with the same name.
2018-11-21 18:37:31 -08:00
PingWang
9d541911bb fix some typos
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

fix typo

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2018-11-22 08:27:14 +08:00
Jordan Liggitt
70ad4dff48 Fix unit tests calling SetFeatureGateDuringTest incorrectly 2018-11-21 11:51:33 -05:00
yanghaichao12
982d1778f8 Fix comment error of 'cpuManagerStateFileName' 2018-11-19 08:07:04 -05:00
Vladimir Vivien
b195396154 Kubelet Plugin Registration v1 update fix 2018-11-15 17:40:35 -05:00
David Ashpole
630cb53f82 add kubelet grpc server for pod-resources service 2018-11-15 09:43:20 -08:00
Davanum Srinivas
954996e231 Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
David Ashpole
d4f6ae3615 fix slice sharing bug in cgroup manager 2018-11-05 17:42:42 -08:00
Pengfei Ni
856c83e637 Enable allocatable support for Windows nodes 2018-10-30 11:17:23 +08:00
Christoph Blecker
97b2992dc1 Update gofmt for go1.11 2018-10-05 12:59:38 -07:00
k8s-ci-robot
3fe21e5433 Merge pull request #68922 from BenTheElder/version-staging
move pkg/util/version to staging
2018-09-26 22:59:42 -07:00
k8s-ci-robot
0ca25b8db7 Merge pull request #68816 from FengyunPan2/cgroup-info
Add helpful log for checking cgroup path
2018-09-26 18:10:46 -07:00
FengyunPan2
34a8b1fd9f Add helpful log for checking cgroup path
Currently I just get 'xxx cgroup does not exist', but I don't know
which path is missing. Let's add a log for it.
2018-09-25 10:10:12 +08:00
k8s-ci-robot
8346631860 Merge pull request #68053 from Pingan2017/rmifblock
clean up unneeded else block
2018-09-24 17:17:29 -07:00
Benjamin Elder
8b56eb8588 hack/update-gofmt.sh 2018-09-24 12:21:29 -07:00
Benjamin Elder
f828c6f662 hack/update-bazel.sh 2018-09-24 12:03:24 -07:00
Benjamin Elder
088cf3c37b find & replace version import 2018-09-24 12:03:24 -07:00
Renaud Gaubert
8dd1d27c03 Updated the device manager pluginwatcher handler 2018-09-06 15:34:46 +02:00
Sandor Szücs
588d2808b7 fix #51135 make CFS quota period configurable, adds a cli flag and config option to kubelet to be able to set cpu.cfs_period and defaults to 100ms as before.
It requires enabling the feature gate CustomCPUCFSQuotaPeriod.

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
2018-09-01 20:19:59 +02:00
Pingan2017
2f1284bc34 cleanup unneeded if block 2018-08-30 17:18:56 +08:00
Kubernetes Submit Queue
c491d48cde Merge pull request #67430 from choury/cpumanager
Automatic merge from submit-queue (batch tested with PRs 67430, 67550). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

cpumanager: rollback state if updateContainerCPUSet failed


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63018

If `updateContainerCPUSet` fails, the container fails to start. We should roll back the state to avoid leaking CPUs.

**Release note**:

```release-note
cpumanager: rollback state if updateContainerCPUSet failed
```
2018-08-21 23:20:58 -07:00
Ismo Puustinen
dd3eeb3f46 device manager: don't do operations on nil pointer.
If grpc.DialContext() fails, a nil connection is returned. Check the
error before calling conn.Close().
2018-08-21 15:20:36 +03:00
Kubernetes Submit Queue
d017bebf6b Merge pull request #67145 from jiayingz/reboot-fix
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fail container start if its requested device plugin resource is unknown.

With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container.




**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubernetes/issues/67107


**Release note**:

```release-note
Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.
```
2018-08-21 01:48:54 -07:00
choury
36b92b9b29 cpumanager: rollback state if updateContainerCPUSet failed 2018-08-17 18:08:58 +08:00
tianshapjq
81081dc9e7 nits in manager.go 2018-08-15 08:16:04 +08:00
Jiaying Zhang
7b1ae66432 Fail container start if its requested device plugin resource doesn't
have cached option state to make sure the device plugin resource is
in ready state when we start the container.
2018-08-08 13:11:36 -07:00
Kubernetes Submit Queue
60ac433922 Merge pull request #66946 from LinEricYang/unused-variable
Automatic merge from submit-queue (batch tested with PRs 66512, 66946, 66083). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

kubelet/cm/cpumanager: Fix unused variable "skipIfPermissionsError"

The variable "skipIfPermissionsError" is not needed, even when a
permission error occurs.
2018-08-06 19:44:04 -07:00