* Add topologyScopeName parameter to NewManager().
* Add scope interface and structure that implement common logic
* Add pod scope & container scopes
* Add pod lifecycle functions
Co-authored-by: sw.han <sw.han@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
It covers deviceplugin & cpumanager.
It has drawback, since cpuset and all other structs including cadvisor's keep
cpu as int, but for protobuf based interface is better to have fixed
int.
This patch also introduces additional interface CPUsProvider, while
DeviceProvider might have been extended too.
Checkpoint not covered by unit test.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
This patch removes GetNUMANodeInfo, cadvisor.MachineInfo will be used
instead of it. GetNUMANodeInfo was introduced due to difference of meaning of
MachineInfo.Topology. On the arm it was NUMA nodes, but on the x86 it
represents sockets (since reading from /proc/cpuinfo). Now it unified
and MachineInfo.Topology represents NUMA node.
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
use the new libcontainer feature of skipping setting the devices
cgroup. This is necessary on cgroup v2 to avoid leaking a eBPF
program every time the cgroup is re-configured.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
containerMap is used in CPU Manager to store all containers information in the node.
containerMap provides a mapping from (pod, container) -> containerID for all containers a pod
It is reusable in another component in pkg/kubelet/cm which needs to track changes of all containers in the node.
Signed-off-by: Byonggon Chun <bg.chun@samsung.com>
do a conversion from the cgroups v1 limits to cgroups v2.
e.g. cpu.shares on cgroups v1 has a range of [2-262144] while the
equivalent on cgroups v2 is cpu.weight that uses a range [1-10000].
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
GetAllocateResourcesPodAdmitHandler(). It is named as such to reflect its
new function. Also remove the Topology Manager feature gate check at higher level
kubelet.go, as it is now done in GetAllocateResourcesPodAdmitHandler().
GetTopologyPodAdmitHandler() now returns a lifecycle.PodAdmitHandler
type instead of the TopologyManager directly. The handler it returns
is generally responsible for attempting to allocate any resources that
require a pod admission check. When the TopologyManager feature gate
is on, this comes directly from the TopologyManager. When it is off,
we simply attempt the allocations ourselves and fail the admission
on an unexpected error. The higher level kubelet.go feature gate
check will be removed in an upcoming PR.
Instead of having a single call for Allocate(), we now split this into two
functions Allocate() and UpdatePluginResources().
The semantics split across them:
// Allocate configures and assigns devices to a pod. From the requested
// device resources, Allocate will communicate with the owning device
// plugin to allow setup procedures to take place, and for the device
// plugin to provide runtime settings to use the device (environment
// variables, mount points and device files).
Allocate(pod *v1.Pod) error
// UpdatePluginResources updates node resources based on devices already
// allocated to pods. The node object is provided for the device manager to
// update the node capacity to reflect the currently available devices.
UpdatePluginResources(
node *schedulernodeinfo.NodeInfo,
attrs *lifecycle.PodAdmitAttributes) error
As we move to a model in which the TopologyManager is able to ensure
aligned allocations from the CPUManager, devicemanger, and any
other TopologManager HintProviders in the same synchronous loop, we will
need to be able to call Allocate() independently from an
UpdatePluginResources(). This commit makes that possible.
These information associatedd with these containers is used to migrate
the CPUManager state from it's old format to its new (i.e. keyed off of
podUID and containerName instead of containerID).
This patch removes pkg/util/mount completely, and replaces it with the
mount package now located at k8s.io/utils/mount. The code found at
k8s.io/utils/mount was moved there from pkg/util/mount, so the code is
identical, just no longer in-tree to k/k.
Unfortunately, the NUMA information is not readily available from
cadvisor, so we have to roll the logic to discover it by hand. In the
future, we should remove this custiom code to use the information
provided by cadvisor once it is made available.
isKernelPid should explicitly check the error returned from os.Readlink and return true
only if the error value is ENOENT. Without this fix, if Readlink
returned say ENAMETOOLONG or EACESS, we would still count the process as
a kernel process (which is not true).
The container manager used in kubelet checks for docker daemon process either via pidfile
or process name. While the pidfile points to the docker daemon process PID, the
dockerProcessName constant stores a docker cli name ( docker ) instead of docker daemon
name ( dockerd ).