Kubernetes Prow Robot
017842d49d
Merge pull request #83492 from ConnorDoyle/topo-align-all-qos
...
Topology manager aligns pods of all QoS classes.
2019-10-11 03:03:40 -07:00
Kubernetes Prow Robot
4561b67971
Merge pull request #83697 from klueska/fix-single-numa-with-one-provider
...
Fixed bug in TopologyManager with SingleNUMANode Policy
2019-10-10 19:00:33 -07:00
Kubernetes Prow Robot
3db6d3abcf
Merge pull request #83551 from dims/move-external-facing-kubelet-apis-to-staging
...
Move external facing kubelet apis to staging
2019-10-10 13:41:36 -07:00
Connor Doyle
a598369e3c
Gofmt.
2019-10-10 12:16:21 -07:00
Connor Doyle
a9203ebdcf
Topology manager aligns pods of all QoS classes.
2019-10-10 12:16:21 -07:00
Kevin Klues
5501f542cd
Fixed bug in TopologyManager with SingleNUMANode Policy
...
This patch fixes an issue in the TopologyManager that wouldn't allow
pods to be admitted if pods were launched with the SingleNUMANode policy
and any of the hint providers had no NUMA preferences.
This is due to 2 factors:
1) Any hint provider that passes back `nil` as its hints has its hint
automatically transformed into a single {11 true} hint before merging
2) We added a special casing for the SingleNumaNodePolicy() in the
TopologyManager that essentially turns these hints into a
{11 false} anytime a {11 true} is seen.
The current patch reworks this logic so that the TopologyManager can
tell the difference between a "don't care" hint and a true "{11 true}"
hint returned by the hint provider. Only true "{11 true}" hints will be
converted by the special casing for the SingleNumaNodePolicy(), while
"don't care" hints will not.
This is a short term fix for this issue until we do a larger refactoring
of this code for the 1.17 release.
2019-10-09 17:41:08 -07:00
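A minimal sketch of the distinction described in commit 5501f542cd, assuming a two-NUMA-node machine. The `hint` type, its `dontCare` field, and the `normalize` and `applySingleNUMA` helpers are hypothetical names chosen for illustration; they are not the kubelet's actual types or functions. The rule that any explicit multi-node hint is downgraded is likewise an assumption that generalizes the `{11 true}` case.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hint is a simplified, hypothetical stand-in for a topology hint.
type hint struct {
	mask      uint64 // bit i set means NUMA node i is acceptable
	preferred bool
	dontCare  bool // assumption: marks hints synthesized from a nil response
}

// normalize turns a provider's nil response into a "don't care" hint that
// covers every NUMA node; hints the provider actually returned pass through.
func normalize(h *hint, numaNodes int) hint {
	if h == nil {
		all := (uint64(1) << uint(numaNodes)) - 1
		return hint{mask: all, preferred: true, dontCare: true}
	}
	return *h
}

// applySingleNUMA imitates the special casing: an explicit hint spanning more
// than one NUMA node is marked not preferred, while "don't care" hints are
// left untouched.
func applySingleNUMA(h hint) hint {
	if !h.dontCare && bits.OnesCount64(h.mask) > 1 {
		h.preferred = false
	}
	return h
}

func main() {
	fromNil := applySingleNUMA(normalize(nil, 2))
	explicit := applySingleNUMA(normalize(&hint{mask: 3, preferred: true}, 2)) // 3 is binary 11
	fmt.Printf("from nil: {%02b %v}\n", fromNil.mask, fromNil.preferred)
	fmt.Printf("explicit: {%02b %v}\n", explicit.mask, explicit.preferred)
}
```

In this sketch both hints carry the same mask, and only the provenance flag decides whether the special casing applies.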
mrobson
ad3dcb9fa0
Add podCgroup to process kill events to allow for correlation
2019-10-08 13:12:48 -04:00
Kubernetes Prow Robot
d70b2db1f2
Merge pull request #83296 from yutedz/kill-cgrp-proc
...
Only kill process where killing failed during previous iterations
2019-10-08 07:19:13 -07:00
Kubernetes Prow Robot
3f8f0a32fa
Merge pull request #83527 from odinuge/runc-rc9
...
Bump dependency opencontainers/runc@v1.0.0-rc9
2019-10-08 03:45:44 -07:00
Davanum Srinivas
f29d2272c8
fix gofmt and golint failures
...
Change-Id: I6535b506f50558b31663a13cd270b15023afa2c6
2019-10-06 18:43:17 -04:00
Kubernetes Prow Robot
48b90db9c3
Merge pull request #83495 from tanjunchen/fix-typo
...
remove the repeat word in documents
2019-10-06 15:05:08 -07:00
Davanum Srinivas
6ecc0f83af
update bazel BUILD files
...
Change-Id: Ia3917cec1453c0b22a958faf8c22bccd79242d14
2019-10-06 15:29:23 -04:00
Davanum Srinivas
d30c489c54
Move pkg/kubelet/pluginregistration and deviceplugin
...
Change-Id: I06adcb43bd278b430ffad2010869e1524c8cc4ff
2019-10-06 15:28:38 -04:00
tanjunchen
de3cf23414
remove the repeat word in documents
2019-10-06 23:32:01 +08:00
Odin Ugedal
b9cfb19321
Rename cgroupsystemd.Manager to LegacyManager
2019-10-05 14:22:35 +02:00
Kubernetes Prow Robot
d60bda1971
Merge pull request #83043 from ConnorDoyle/cleanup-cpumanger-topo-hints
...
Delegate topology hint gen to CPU manager policy
2019-10-05 00:59:39 -07:00
Kevin Klues
d2b53af7d7
Add klueska as reviewer for CPUManager and devicemanager
2019-10-03 13:01:41 -07:00
Ted Yu
6dbb533e3c
Only kill process where killing failed during previous iterations
2019-09-29 19:53:43 -07:00
Connor Doyle
389853894d
Delegate topology hint gen to CPU manager policy
...
- The previous implementation depended on a fixed set of policies.
2019-09-27 22:29:02 -07:00
zouyee
b1f6974f7b
Use online NUMA nodes instead to fix kubelet service failure caused by a wrong number of possible NUMA nodes
...
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2019-09-26 21:48:50 +08:00
Connor Doyle
e35301c19f
Rename package socketmask to bitmask.
...
- As discussed in reviews and other public channels,
this abstraction is used to represent numa nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
2019-09-23 17:08:45 -07:00
Kubernetes Prow Robot
07cc813956
Merge pull request #81793 from lmdaly/topology-manager-owners
...
Added OWNERS file for Topology Manager
2019-09-11 18:26:52 -07:00
Louise Daly
fbccf25e29
Added OWNERS file for Topology Manager
2019-09-11 06:40:24 +01:00
Kubernetes Prow Robot
887edd2273
Merge pull request #82099 from lmdaly/single-numa-node-policy
...
Topology Manager Policy: single-numa-node
2019-08-30 11:21:26 -07:00
Kubernetes Prow Robot
9165f7bf56
Merge pull request #82104 from klueska/upstream-fix-cpu-manager-topology-bug
...
Fix bug in CPUManager with setting topology for policies
2019-08-30 08:00:44 -07:00
Louise Daly
8ad1b5ba3b
Single-numa-node Topology Manager bug fix
...
Added one off fix for single-numa-node policy to correctly
reject pod admission on a resource allocation that spans
NUMA nodes
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:56 +01:00
Louise Daly
f6c085f60e
Added Single NUMA Node Policy which ensures resources are
...
aligned on a single NUMA node
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:17 +01:00
Kevin Klues
5ed80dadcf
Update CanAdmitPodResult() in TopologyManager to take a TopologyHint
...
Previously it only took a bool, which limited the logic it could perform
to determine if a pod should be admitted or not based on the merged hint
from the policy.
2019-08-30 07:17:17 +01:00
Kevin Klues
eb0216e54e
Update semantics to set Preferred field in TopologyHint generation
...
We now only set Preferred to true if resources can be allocated with a
size equal to the minimum _possible_ mask when all resources are
available.
2019-08-29 14:32:10 -05:00
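The Preferred semantics from commit eb0216e54e can be illustrated roughly as follows. This is a sketch under a strong simplifying assumption: every NUMA node has the same capacity (`perNode`), so the minimum possible mask size is a ceiling division. The function names are hypothetical and do not come from the kubelet source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// minNodesNeeded returns the smallest number of NUMA nodes that could satisfy
// the request if every resource on the machine were free. Assumption: all
// nodes have identical capacity perNode.
func minNodesNeeded(request, perNode int) int {
	return (request + perNode - 1) / perNode
}

// isPreferred reports whether a hint mask is as narrow as the minimum
// possible mask for this request.
func isPreferred(mask uint64, request, perNode int) bool {
	return bits.OnesCount64(mask) == minNodesNeeded(request, perNode)
}

func main() {
	// Two CPUs requested, four CPUs per node: one node is the minimum.
	fmt.Println(isPreferred(1, 2, 4)) // mask binary 01, a single node
	fmt.Println(isPreferred(3, 2, 4)) // mask binary 11, spans two nodes
	// Six CPUs requested: two nodes are unavoidable.
	fmt.Println(isPreferred(3, 6, 4))
}
```

The point of comparing against the minimum possible mask, rather than the narrowest mask currently available, is that a hint does not become preferred merely because other pods have already fragmented the free resources.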
Kevin Klues
e0e8b3e4fd
Update CPUManager topology helpers to accept multiple ids
2019-08-29 13:22:54 -05:00
Kevin Klues
dcc9f66311
Add devicemanager tests for TopologyHint consumption
2019-08-29 08:22:50 -05:00
Kevin Klues
cc567afaf0
Consume TopologyHints in the devicemanager
2019-08-29 08:22:50 -05:00
Kevin Klues
a3320f80d9
Add devicemanager tests for TopologyHint generation
2019-08-29 07:45:43 -05:00
Kevin Klues
d3d7a8f5d4
Generate TopologyHints from the devicemanager
2019-08-29 07:45:43 -05:00
Louise Daly
9a118ceac4
Added stub support for Topology Manager to Device Manager
...
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-29 07:45:43 -05:00
Kevin Klues
ddfd9ac0ca
Fix bug in CPUManager with setting topology for policies
...
Also add a check in the unit tests to avoid regressions
2019-08-28 17:32:25 -05:00
Kevin Klues
df1b54fc09
Fail fast with TopologyManager on machines with more than 8 NUMA Nodes
2019-08-28 11:04:52 -05:00
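A fail-fast check of the kind named in commit df1b54fc09 might look like the sketch below. The constant name, function name, and error text are assumptions made for illustration; only the limit of 8 NUMA nodes comes from the commit title.

```go
package main

import "fmt"

// maxNUMANodes is the limit stated in the commit title; the name is hypothetical.
const maxNUMANodes = 8

// checkNUMANodeCount returns an error when the machine has more NUMA nodes
// than the sketch supports, so the caller can refuse to start early instead
// of failing later during hint merging.
func checkNUMANodeCount(n int) error {
	if n > maxNUMANodes {
		return fmt.Errorf("unsupported on machines with more than %d NUMA nodes (found %d)", maxNUMANodes, n)
	}
	return nil
}

func main() {
	for _, n := range []int{2, 8, 16} {
		if err := checkNUMANodeCount(n); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", n, "NUMA nodes")
	}
}
```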
Kevin Klues
5660cd3cfb
Add NUMA Node awareness to the TopologyManager
2019-08-28 11:04:52 -05:00
Kubernetes Prow Robot
35867b160a
Merge pull request #81951 from klueska/upstream-update-cpu-amanger-numa-mapping
...
Update the CPUManager to include NUMANodeID in its topology information
2019-08-28 08:55:40 -07:00
Kubernetes Prow Robot
de1cfa9bc1
Merge pull request #81787 from lmdaly/topology-manager-rename-strict-policy
...
Renaming strict policy to restricted policy
2019-08-28 01:38:04 -07:00
Kevin Klues
f4dbd29cdb
Rename TopologyHint.SocketAffinity to TopologyHint.NUMANodeAffinity
...
As part of this, update the logic to use the NUMA information instead of
the Socket information when generating and consuming TopologyHints in
the CPUManager.
2019-08-27 16:51:05 -05:00
Kevin Klues
ecc14fe661
Update CPUManager to include NUMANodeID in CPUTopology
...
Unfortunately, the NUMA information is not readily available from
cadvisor, so we have to roll the logic to discover it by hand. In the
future, we should remove this custom code to use the information
provided by cadvisor once it is made available.
2019-08-27 16:51:05 -05:00
Kevin Klues
869962fa48
Cache the discovered topology in the CPUManager instead of MachineInfo
2019-08-27 16:23:07 -05:00
Kubernetes Prow Robot
a3488b4cee
Merge pull request #81206 from tallclair/staticcheck-kubelet-push
...
Cleanup Kubelet static analysis issues
2019-08-22 15:09:43 -07:00
Kubernetes Prow Robot
6b47754740
Merge pull request #81627 from tallclair/copy
...
Delete duplicate resource.Quantity.Copy()
2019-08-22 11:13:13 -07:00
Louise Daly
2fb94231d0
Renaming strict policy to restricted policy
...
Restricted policy will fail admission of guaranteed pods where
all requested resources are not available on a single NUMA Node
2019-08-22 07:57:55 +01:00
Tim Allclair
a2c51674cf
Cleanup more static check issues (S1*,ST*)
2019-08-21 10:40:21 -07:00
Tim Allclair
8a495cb5e4
Clean up error messages (ST1005)
2019-08-21 10:40:21 -07:00
Tim Allclair
6510d26b6a
Fix misc static check issues
2019-08-21 10:40:21 -07:00
Tim Allclair
3f510c69f6
Remove dead code from pkg/kubelet/...
2019-08-21 10:40:21 -07:00