Commit Graph

840 Commits

Author SHA1 Message Date
Alexey Perevalov
5e6aed4137 Fixes sigfault in case of empty TopologyInfo
Device plugin which implements v1beta interface can return nil in
Topology field

For example nvidia-gpu-deviceplugin
3520254b75/nvidia.go (L147)
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-11-13 11:51:47 +03:00
Krzysztof Wiatrzyk
b7714918db Run ./update-all.sh
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
7b0ccaa1e9 Add tests for getPodDeviceRequest() for devicemanager
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
ba1e8abce7 Add tests for GetPodTopologyHints() for devicemanager
* Add additional test cases returned by getPodScopeTestCases()

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
1c4a1ba6ae Update topology hints tests to use pod object for devicemanager
Pod object is more flexible to use and construct
* Update TestGetTopologyHints() to work according to new test cases
* Update topologyHintTestCase{} to include proper field

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
7ad65bf22d Add tests for GetPodTopologyHints() for cpumanager
* Add tests for getPodRequestedCPU()

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
35b1f28d0f Refactor topology hints tests for cpumanager
* Extract common tests cases that will be used for both GetTopologyHints()
and GetPodTopologyHints()
* Extract machineInfo as it will be used for both functions as well

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
656a08afdf Move scope specific tests from topologymanager under particular scopes
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
c786c9a533 Move common tests from topologymanager under scope
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
f5c0fe4ef6 Update topologymanager tests after adding scopes
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Byonggon Chun
9da0912a33 Implement devicemanager.GetPodLevelTopologyHints() function
* Add podDevices() func
* Add getPodDeviceRequest() func

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
27b7bcb41c Implement the cpumanager.GetPodTopologyHints() function
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
6db58b2e92 Update logging to use a format util
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
b2be584e5b Implement topology manager scopes
* Add topologyScopeName parameter to NewManager().
* Add scope interface and structure that implement common logic
* Add pod scope & container scopes
* Add pod lifecycle functions

Co-authored-by: sw.han <sw.han@samsung.com>

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:54 +01:00
sw.han
f5997fe537 Add GetPodTopologyHints() interface to Topology/CPU/Device Manager
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:54 +01:00
sw.han
d070bff273 Add kubelet configuration flag 'topology-manager-scope'
add kubelet config option.
* --topology-manager-scope=[ container | pod ]
* default=container

Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:54 +01:00
Alexey Perevalov
a8b8995ef2 Implement TopologyInfo and cpu_ids in podresources
It covers deviceplugin & cpumanager.

It has drawback, since cpuset and all other structs including cadvisor's keep
cpu as int, but for protobuf based interface is better to have fixed
int.
This patch also introduces additional interface CPUsProvider, while
DeviceProvider might have been extended too.

Checkpoint not covered by unit test.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-11-11 13:50:49 +03:00
Alexey Perevalov
62326a1846 Convert podDevices to struct
PodDevices will have its own guard

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-11-11 13:50:48 +03:00
Alexey Perevalov
9f54dccc92 Change GetDevices interface
This change is necessary for supporting Topology in the ContainerDevices.

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-11-11 12:41:31 +03:00
Kubernetes Prow Robot
bbc26ba7e6
Merge pull request #96048 from rphillips/fixes/device_plugin_stub_race
Deflake TestDevicePluginReRegistrationProbeMode: Devices of previous registered should be removed
2020-11-02 08:20:54 -08:00
Kubernetes Prow Robot
332d17c7f5
Merge pull request #95731 from farah/split-scheduler
Delete framework/v1alpha1 folder and change remaining import paths
2020-10-30 11:14:22 -07:00
Ryan Phillips
4fdfbc718c devicemanager: fix race in stub
There is a race when the server is coming up and the subsequent dial on
the socket. Fix the race with a PollImmediate retry.
2020-10-30 11:42:01 -05:00
Kubernetes Prow Robot
94cedd9f14
Merge pull request #95720 from draveness/feature/topology-manager-format
style: update comments in topology manager
2020-10-27 10:36:38 -07:00
draveness
60d3f99b1f style: update comments in topology manager 2020-10-23 18:20:50 +08:00
Ali
bfdeda58b7 Delete framework/v1alpha1 folder and change remaining import paths 2020-10-23 13:16:13 +11:00
chenyw1990
009d46f834 write checkpoint only when allocated devices updated. 2020-10-22 22:45:04 +08:00
Kubernetes Prow Robot
6ac2930ef0
Merge pull request #94574 from auxten/pkg-kubelet-staticchecks
Fix pkg/kubelet static checks
2020-09-21 21:22:47 -07:00
Srini Brahmaroutu
fbe5daed73 Change code to use staging/k8s.io/mount-utils 2020-09-16 21:51:24 -07:00
Kubernetes Prow Robot
09b3f6dbb3
Merge pull request #93214 from trashhalo/prefer-error
test: prefer NoError/Error over Nil/NotNil
2020-09-16 15:10:45 -07:00
Renaud Gaubert
4eadf40448 Run gofmt
Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>
2020-09-15 06:22:44 -07:00
Renaud Gaubert
ba95a8c641 run hack/update-vendor.sh
Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>
2020-09-15 05:13:33 -07:00
Renaud Gaubert
60304452ff Move podresources api to k8s.io/kubelet/pkg/apis
Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>
2020-09-15 05:13:33 -07:00
Kubernetes Prow Robot
119c94214c
Merge pull request #93931 from SataQiu/fix-kubelet-swap-20200812
kubelet: assume that swap is disabled when /proc/swaps does not exist
2020-09-11 04:20:14 -07:00
Kubernetes Prow Robot
293a53f2c0
Merge pull request #94140 from derekwaynecarr/pid-ga
Promote PidLimits to GA
2020-09-09 06:35:52 -07:00
auxten
a9c1acc044 Fix staticchecks ST1005,S1002,S1008,S1039 in pkg/kubelet 2020-09-07 10:53:43 +08:00
Kubernetes Prow Robot
60c421b6f6
Merge pull request #94541 from liggitt/deflake-cpucheckpoint
Deflake cpumanager checkpoint unit tests
2020-09-04 18:47:40 -07:00
Stephen Solka
203679cc61 prefer NoError/Error over Nil/NotNil 2020-09-04 18:35:52 -04:00
Jordan Liggitt
7268b1d557 Deflake cpumanager checkpoint unit tests 2020-09-04 15:06:04 -04:00
Jordan Liggitt
803da10d8b Use unique socket name per cm test 2020-09-04 14:55:23 -04:00
Kubernetes Prow Robot
81bf1f8789
Merge pull request #90980 from AlexeyPerevalov/GetNUMANodeInfo
Avoid using socket for hints in generateCPUTopologyHints
2020-09-02 03:41:06 -07:00
Kubernetes Prow Robot
f19118eea8
Merge pull request #94111 from giuseppe/fix-cgroup-v2-cgroupfs-path
kubelet, cgroupv2: do not create /sys/fs/cgroup/sys with cgroupfs
2020-09-01 19:41:33 -07:00
Kubernetes Prow Robot
2e59a17dc1
Merge pull request #92288 from zhijianli88/cleanup-tempfiles
Cleanup tempfiles
2020-08-27 17:56:54 -07:00
Derek Carr
6f2153986a Promote PidLimits to GA 2020-08-24 13:57:48 -04:00
Giuseppe Scrivano
49cbf91fce
kubelet, cgroupv2: do not create /sys/fs/cgroup/sys with cgroupfs
Closes: https://github.com/kubernetes/kubernetes/issues/94104

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-08-19 22:29:38 +02:00
SataQiu
ad1739f8bc kubelet: assume that swap is disabled when /proc/swaps does not exist 2020-08-12 22:43:58 +08:00
Jordan Liggitt
f33dc28094 generated: hack/update-hack-tools.sh && hack/update-vendor.sh 2020-07-25 16:45:02 -04:00
Alexey Perevalov
a047e8aa1b move to cadvisor.MachineInfo
This patch removes GetNUMANodeInfo, cadvisor.MachineInfo will be used
instead of it. GetNUMANodeInfo was introduced due to difference of meaning of
MachineInfo.Topology. On the arm it was NUMA nodes, but on the x86 it
represents sockets (since reading from /proc/cpuinfo). Now it unified
and MachineInfo.Topology represents NUMA node.

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-07-24 09:29:41 -04:00
Alexey Perevalov
e33ba9e974 Avoid using socket for hints
Sockets don't affect performance as NUMA node does, since NUMA
node has dedicated memory controller, but socket it's physical
extension point.
Socket it's only cpu specific thing and it's strange to merge bitmask of
deviceplugin's and cpu manager, when cpu manager takes into account
socket.

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-07-22 05:14:34 -04:00
Kubernetes Prow Robot
b6174e605f
Merge pull request #93189 from klueska/upstream-fix-bug-topology-manager
Fix a bug whereby reusable CPUs and devices were not being honored
2020-07-21 04:35:17 -07:00
Kubernetes Prow Robot
1fdd8fb213
Merge pull request #93263 from liggitt/windows
Fix windows kubelet startup
2020-07-20 19:51:57 -07:00