Commit Graph

1418 Commits

Author SHA1 Message Date
Kevin Klues
c45f1317eb Fix some whitespacing and comments in devicemanager 2020-07-02 15:15:44 +00:00
Giuseppe Scrivano
e94aebf4cb pkg/kubelet: adapt to new libcontainer API
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-06-24 18:39:51 +02:00
Li Zhijian
02eaa4f354 cleanup tempfiles in unit test
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
2020-06-23 11:47:18 +08:00
Kubernetes Prow Robot
86ad0df820 Merge pull request #92203 from sjenning/add-sjenning-node-approver
Add sjenning as kubelet approver
2020-06-19 21:52:02 -07:00
Seth Jennings
45d2b98aa8 add sjenning as kubelet approver 2020-06-19 13:00:55 -05:00
kadisi
a75323c76b fix unexpected append mutations about pkg/kubelet package
Signed-off-by: kadisi <iamkadisi@163.com>
Co-authored-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2020-06-03 13:36:57 +08:00
Kubernetes Prow Robot
0e37bcce2c Merge pull request #88385 from tallclair/node-reviews
Remove tallclair from some OWNERS files
2020-05-24 20:23:11 -07:00
Kubernetes Prow Robot
b170451caa Merge pull request #90183 from dims/update-kubernetes-to-klog-v2
Update kubernetes to klog v2
2020-05-16 18:59:51 -07:00
Kubernetes Prow Robot
f011430e85 Merge pull request #84599 from mrobson/log-destroy
Errors from cgroup destroy are swallowed. Log error at warning level.
2020-05-16 18:59:36 -07:00
Davanum Srinivas
07d88617e5 Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
caiweidong
5ed8fb690c Use klog to replace log to keep them in consistence 2020-05-12 13:49:10 +08:00
Tim Allclair
029a144ae9 Remove tallclair from some OWNERS files 2020-05-11 11:44:38 -07:00
Kubernetes Prow Robot
7fdc1275d9 Merge pull request #90377 from cbf123/container_cpuset_fixup_2
Fix exclusive CPU allocations being deleted at container restart
2020-04-27 13:40:04 -07:00
Chris Friesen
ab5870d808 Fix exclusive CPU allocations being deleted at container restart
The expectation is that exclusive CPU allocations happen at pod
creation time. When a container restarts, it should not have its
exclusive CPU allocations removed, and it should not need to
re-allocate CPUs.

There are a few places in the current code that look for containers
that have exited and call CpuManager.RemoveContainer() to clean up
the container.  This will end up deleting any exclusive CPU
allocations for that container, and if the container restarts within
the same pod it will end up using the default cpuset rather than
what should be exclusive CPUs.

Removing those calls and adding resource cleanup at allocation
time should get rid of the problem.

Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
2020-04-27 11:36:54 -06:00
Kevin Klues
751b9f3e13 Update strategy used to reuse CPUs from init containers in CPUManager
With the old strategy, it was possible for an init container to end up
running without some of its CPUs being exclusive if it requested more
guaranteed CPUs than the sum of all guaranteed CPUs requested by app
containers. Unfortunately, this case was not caught by our unit tests
because they didn't validate the state of the defaultCPUSet to ensure
there was no overlap with CPUs assigned to containers. This patch
updates the strategy to reuse the CPUs assigned to init containers
across into app containers, while avoiding this edge case. It also
updates the unit tests to now catch this type of error in the future.
2020-04-23 20:27:43 +00:00
Kubernetes Prow Robot
d92fdebd85 Merge pull request #89897 from giuseppe/test-e2e-node
kubelet: fix e2e-node cgroups test on cgroup v2
2020-04-20 15:54:12 -07:00
Kubernetes Prow Robot
d0183703cb Merge pull request #90059 from ahg-g/ahg-nodeinfo2
Cleanup obsolete NodeInfo methods
2020-04-14 17:32:04 -07:00
Abdullah Gharaibeh
d6522e0e74 rename framework pkg with schedulerframework for all instances under pkg/kubelet 2020-04-14 14:24:07 -04:00
Kubernetes Prow Robot
105c0c6951 Merge pull request #88970 from mysunshine92/correct-NodeAllocatableRoot
fix function NodeAllocatableRoot
2020-04-14 11:04:13 -07:00
Abdullah Gharaibeh
bed9b2f23b Cleanup obsolete NodeInfo methods 2020-04-12 18:13:46 -04:00
Giuseppe Scrivano
26d94ad628 kubelet: do not configure the device cgroup
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-09 16:18:06 +02:00
Giuseppe Scrivano
a9772b2290 kubelet: adapt cgroup_manager to cgroup v2
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-09 16:18:04 +02:00
Giuseppe Scrivano
6d16fee229 kubelet: cpu hard capping is supported on cgroup v2
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-04-09 16:18:03 +02:00
Francesco Romani
623587ec8b cpumanager: test: add missing helper
add back the missing AssertStateEqual helper;
it is needed by some tests we still want to run.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-04-07 16:59:07 +02:00
Francesco Romani
be0fe3df9b cpumanager: drop old custom file backend
The cpumanager file-based state backend was obsoleted since few
releases, aving the cpumanager moved to the checkpointmanager common
infrastructure.
The old test checking compatibility to/from the old format is
also no longer needed, because the checkpoint format is stable
(see
https://github.com/kubernetes/kubernetes/tree/master/pkg/kubelet/checkpointmanager).

Signed-off-by: Francesco Romani <fromani@redhat.com>
2020-04-07 13:24:48 +02:00
Kubernetes Prow Robot
b030be376b Merge pull request #89581 from Wenfeng-GAO/simplify
simplify code in topologymanager
2020-04-02 23:07:46 -07:00
Wenfeng-GAO
1aebbee7da simplify code in topologymanager 2020-03-28 00:04:51 +08:00
Giuseppe Scrivano
c4429d8bd4 kubelet: add tests for cgroup v2 conversions
follow-up for https://github.com/kubernetes/kubernetes/pull/85218

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-03-27 13:50:57 +01:00
Kubernetes Prow Robot
34c8b26c9f Merge pull request #85218 from giuseppe/cgroupv2
kubelet: add initial support for cgroupv2
2020-03-26 14:10:23 -07:00
Kubernetes Prow Robot
4488fd4749 Merge pull request #89053 from bg-chun/move_package
migration of re-usable package from pkg/kubelet/cm/cpumanager to pkg/kubelet/cm
2020-03-26 11:14:09 -07:00
yameiwang
6783f991c3 fix function NodeAllocatableRoot 2020-03-26 18:48:05 +08:00
Byonggon Chun
a3047672d0 move pkg/kubelet/cm/cpumanager/containermap to pkg/kubelet/cm/containermap for reusing
containerMap is used in CPU Manager to store all containers information in the node.
containerMap provides a mapping from (pod, container) -> containerID for all containers a pod
It is reusable in another component in pkg/kubelet/cm which needs to track changes of all containers in the node.

Signed-off-by: Byonggon Chun <bg.chun@samsung.com>
2020-03-14 02:38:51 +09:00
Giuseppe Scrivano
bb5ed1b797 kubelet: add initial support for cgroupv2
do a conversion from the cgroups v1 limits to cgroups v2.

e.g. cpu.shares on cgroups v1 has a range of [2-262144] while the
equivalent on cgroups v2 is cpu.weight that uses a range [1-10000].

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-03-12 08:50:19 +01:00
Adelina Tuvenie
a9f834d17d Implement noopWindowsResourceAllocator
On Windows, the podAdmitHandler returned by the GetAllocateResourcesPodAdmitHandler() func
and registered by the Kubelet is nil.

We implement a noopWindowsResourceAllocator that would admit any pod for Windows in order
to be consistent with the original implementation.
2020-03-10 21:32:23 +01:00
Kubernetes Prow Robot
cd0057c16a Merge pull request #88876 from nolancon/none-policy-fix
Topology Manager none policy bug fix
2020-03-05 21:40:33 -08:00
Kubernetes Prow Robot
ce01a9bad0 Merge pull request #88857 from nolancon/test-fix
Check for nil cpuManager in container manager
2020-03-05 20:05:14 -08:00
Kubernetes Prow Robot
48541a0b16 Merge pull request #87650 from nolancon/beta-feature-gate
Update TopologyManager Feature Gate
2020-03-05 20:03:04 -08:00
nolancon
0551d408ac Bug fix for TM none policy 2020-03-05 14:25:48 +00:00
nolancon
4baa1d967d Check for nil cpuManager 2020-03-05 07:54:33 +00:00
Kubernetes Prow Robot
ac32644d6e Merge pull request #87759 from klueska/upstream-move-cpu-allocation-to-pod-admit
Guarantee aligned resources across containers
2020-03-04 20:12:37 -08:00
nolancon
e8538d9b76 Add mutex to Topology Manager Add/RemoveContainer
This was exposed as a potential bug during e2e test debugging of this
PR.
2020-03-02 04:07:21 +00:00
Kevin Klues
2327934a86 Rename GetTopologyPodAmitHandler() as
GetAllocateResourcesPodAdmitHandler(). It is named as such to reflect its
new function. Also remove the Topology Manager feature gate check at higher level
kubelet.go, as it is now done in GetAllocateResourcesPodAdmitHandler().
2020-02-27 07:52:43 +00:00
nolancon
a9c6129577 Device Manager - Update unit tests
- Pass container to Allocate().
- Loop through containers to call Allocate() on container by container
basis.
2020-02-27 07:24:34 +00:00
nolancon
cb9fdc49db Device Manager - Refactor allocatePodResources
- allocatePodResources logic altered to allow for container by container
device allocation.
- New type PodReusableDevices
- New field in devicemanager devicesToReuse
2020-02-27 07:24:34 +00:00
nolancon
0a9bd0334d CPU Manager - Updates to unit tests:
- Where previously we called manager.AddContainer(), we now call both
manager.Allocate() and manager.AddContainer().
- Some test cases now have two expected errors. One each
from Allocate() and AddContainer(). Existing outcomes are unchanged.
2020-02-27 07:24:34 +00:00
nolancon
467f66580b CPU Manager - Add check to policy.Allocate() for init conatiners
If container allocated CPUs is an init container, release those CPUs
back into the shared pool for re-allocation to next container.
2020-02-27 07:24:33 +00:00
nolancon
709989efa2 CPU Manager - Rename policy.AddContainer() to policy.Allocate() 2020-02-27 07:24:33 +00:00
Kevin Klues
0d68bffd03 Change GetTopologyPodAdmitHandler() to be more general
GetTopologyPodAdmitHandler() now returns a lifecycle.PodAdmitHandler
type instead of the TopologyManager directly. The handler it returns
is generally responsible for attempting to allocate any resources that
require a pod admission check. When the TopologyManager feature gate
is on, this comes directly from the TopologyManager. When it is off,
we simply attempt the allocations ourselves and fail the admission
on an unexpected error. The higher level kubelet.go feature gate
check will be removed in an upcoming PR.
2020-02-27 07:24:26 +00:00
Oleg Chunikhin
b651178849 fix incorrect configuration of kubepods.slice unit by kubelet (issue #88197) 2020-02-17 13:22:45 -05:00