If the cpumanager feature gate is disabled, the corresponding field
of the containerManager will be nil.
A couple of functions don't check for this occurrence and happily
dereference the pointer unconditionally, leading to possible segfaults.
The relevant functions were introduced to support the podresources API,
so to trigger this segfault all the following are needed:
- the cpumanager feature gate has to be disabled explicitly
- any podresources API must be called
It is worth pointing out that when the new functions were introduced
(around Kubernetes 1.20), the cpumanager feature gate already defaulted
to true, so this bug is expected to be triggered only rarely.
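For illustration, a minimal sketch of the missing guard, using placeholder
names rather than the exact kubelet types:

package main

import "fmt"

// cpuProvider stands in for the kubelet's cpumanager.Manager.
type cpuProvider interface {
    GetCPUs(podUID, containerName string) []int64
}

// containerManager mirrors the field layout only as far as needed for the
// sketch: cpuManager is nil when the feature gate is disabled.
type containerManager struct {
    cpuManager cpuProvider
}

// GetCPUs checks for the nil manager before dereferencing it, which is the
// guard the affected podresources helpers were missing.
func (cm *containerManager) GetCPUs(podUID, containerName string) []int64 {
    if cm.cpuManager == nil {
        return []int64{}
    }
    return cm.cpuManager.GetCPUs(podUID, containerName)
}

func main() {
    cm := &containerManager{} // cpumanager feature gate disabled
    fmt.Println(cm.GetCPUs("pod-uid", "ctr")) // no segfault: prints []
}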
Signed-off-by: Francesco Romani <fromani@redhat.com>
During the review, we agreed that the manager types
(CPUSet, ResourceDeviceInstances) should not cross the
containermanager API boundary; thus, the ContainerManager layer
is the correct place to do the type conversion.
We push the type conversions back from the podresources server
layer into the container manager, fixing tests accordingly.
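Schematically, the conversion now lives in the container manager layer; a
minimal, self-contained sketch (the real code uses cpuset.CPUSet and the
podresources protobuf types):

package main

import (
    "fmt"
    "sort"
)

// cpuSet stands in for the cpumanager's cpuset.CPUSet.
type cpuSet map[int]struct{}

// toInt64Slice flattens the manager-internal set into the plain []int64 the
// podresources server layer expects, so cpuSet never crosses the
// ContainerManager API boundary.
func toInt64Slice(s cpuSet) []int64 {
    out := make([]int64, 0, len(s))
    for cpu := range s {
        out = append(out, int64(cpu))
    }
    sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
    return out
}

func main() {
    fmt.Println(toInt64Slice(cpuSet{0: {}, 2: {}, 3: {}})) // [0 2 3]
}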
Signed-off-by: Francesco Romani <fromani@redhat.com>
We want to make the return type of the GetDevices() method of the
podresources DevicesProvider interface consistent with
the newly added GetAllocatableDevices type.
This makes the code easier to read and reduces the coupling between
the podresourcesapi server and the devicemanager code.
No changes in behaviour are intended, but the different return types
now require some data massaging. Tests are updated accordingly.
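Roughly, the provider interface now looks like this (a sketch with a
simplified device-instances type, not the exact in-tree definitions):

package sketch

// resourceDevices is a simplified stand-in for the devices returned per
// resource (resource name -> device IDs).
type resourceDevices map[string][]string

// devicesProvider sketches the consistency goal: GetDevices() and the newly
// added GetAllocatableDevices() now return the same type, so the
// podresources server handles allocated and allocatable devices uniformly.
type devicesProvider interface {
    GetDevices(podUID, containerName string) resourceDevices
    GetAllocatableDevices() resourceDevices
}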
Signed-off-by: Francesco Romani <fromani@redhat.com>
An upcoming patch wants to add GetAllocatableCPUs(), returning a cpuset.
To make the code consistent and a bit more flexible, we change the
existing interface to also return a cpuset.
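The direction of the change, sketched with placeholder types (the real
interface uses cpuset.CPUSet from pkg/kubelet/cm/cpuset):

package sketch

// cpuSet stands in for cpuset.CPUSet.
type cpuSet struct{ elems map[int]struct{} }

// cpusAccessor sketches the updated interface: the existing accessor now
// returns a cpuSet, matching the upcoming GetAllocatableCPUs(), and any
// flattening to []int64 happens at the API boundary instead.
type cpusAccessor interface {
    GetCPUs(podUID, containerName string) cpuSet
    GetAllocatableCPUs() cpuSet
}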
Signed-off-by: Francesco Romani <fromani@redhat.com>
- libcontainer renamed
`github.com/opencontainers/runc/libcontainer/configs` to
`github.com/opencontainers/runc/libcontainer/devices`, so use the new
references (a sketch of the mechanical change follows after this list)
- Update `dockershim` `ContainerCreate` call after docker update to
v20.10.2
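The libcontainer part of the change is mechanical and amounts to something
like:

package sketch

// Old import, which no longer provides the device types:
//   "github.com/opencontainers/runc/libcontainer/configs"
// New import used instead:
import "github.com/opencontainers/runc/libcontainer/devices"

// References are updated accordingly, e.g. configs.Device -> devices.Device.
var _ = devices.Device{}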
Pass memory manager flags to the container manager and call all relevant memory manager
methods under the container manager.
Signed-off-by: Byonggon Chun <bg.chun@samsung.com>
* Add topologyScopeName parameter to NewManager().
* Add a scope interface and structure that implement the common logic (a rough sketch follows after this list)
* Add pod scope & container scopes
* Add pod lifecycle functions
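A rough sketch of the scope split, with placeholder types standing in for
*v1.Pod and lifecycle.PodAdmitResult:

package sketch

// pod and admitResult stand in for *v1.Pod and lifecycle.PodAdmitResult.
type pod struct{ UID string }
type admitResult struct{ Admit bool }

// scope captures the logic shared by the pod scope and the container scope;
// the real interface in pkg/kubelet/cm/topologymanager differs in detail.
type scope interface {
    Name() string             // "pod" or "container"
    Admit(p *pod) admitResult // run the alignment/admission at this scope
}

// podScope and containerScope would embed a common struct implementing the
// shared bookkeeping and differ only in how hints are gathered and merged.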
Co-authored-by: sw.han <sw.han@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
It covers deviceplugin & cpumanager.
It has a drawback: cpuset and all the other structs, including cadvisor's,
keep CPU IDs as int, but for a protobuf-based interface it is better to
have a fixed-size int.
This patch also introduces an additional CPUsProvider interface, although
the DevicesProvider could have been extended instead.
The checkpoint is not covered by a unit test.
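For reference, the additional provider looks roughly like this (signature
simplified):

package sketch

// cpusProvider sketches the extra provider added for the podresources API.
type cpusProvider interface {
    // GetCPUs returns the CPU IDs assigned to the given container. int64 is
    // used because the protobuf-based interface needs a fixed-size integer,
    // even though cpuset and cadvisor keep CPU IDs as plain int.
    GetCPUs(podUID, containerName string) []int64
}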
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
This patch removes GetNUMANodeInfo; cadvisor.MachineInfo will be used
instead. GetNUMANodeInfo was introduced because of a difference in the
meaning of MachineInfo.Topology: on ARM it listed NUMA nodes, but on x86
it represented sockets (since it was read from /proc/cpuinfo). This is now
unified, and MachineInfo.Topology represents NUMA nodes.
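With the unification, consumers can treat MachineInfo.Topology as the list
of NUMA nodes on every architecture; for example (illustrative helper, not
in-tree code):

package sketch

import (
    "fmt"

    cadvisorapi "github.com/google/cadvisor/info/v1"
)

// printNUMANodes walks MachineInfo.Topology, whose entries now represent
// NUMA nodes on x86 as well as on ARM.
func printNUMANodes(mi *cadvisorapi.MachineInfo) {
    for _, node := range mi.Topology {
        fmt.Printf("NUMA node %d: %d cores, %d bytes of memory\n",
            node.Id, len(node.Cores), node.Memory)
    }
}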
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
Use the new libcontainer feature of skipping setting the devices
cgroup. This is necessary on cgroup v2 to avoid leaking an eBPF
program every time the cgroup is re-configured.
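Schematically, the cgroup manager now asks libcontainer to leave the devices
cgroup alone (simplified sketch; only the relevant field is shown):

package sketch

import "github.com/opencontainers/runc/libcontainer/configs"

// toLibcontainerResources shows the relevant bit of the conversion: the
// kubelet does not manage the devices cgroup here, so libcontainer is told
// to skip it. Without this, every re-configuration on cgroup v2 would attach
// a new devices eBPF program and leak the previous one.
func toLibcontainerResources(cpuShares uint64) *configs.Resources {
    return &configs.Resources{
        CpuShares:   cpuShares,
        SkipDevices: true,
    }
}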
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
containerMap is used by the CPU Manager to store information about all containers in the node.
containerMap provides a mapping from (pod, container) -> containerID for all containers in a pod.
It is reusable by other components in pkg/kubelet/cm that need to track changes to all containers in the node.
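In essence (the in-tree type lives in pkg/kubelet/cm and its exact method
set may differ):

package sketch

// containerMap maps (pod UID, container name) -> containerID.
type containerMap map[string]string

func key(podUID, containerName string) string {
    return podUID + "/" + containerName
}

func (m containerMap) Add(podUID, containerName, containerID string) {
    m[key(podUID, containerName)] = containerID
}

func (m containerMap) GetContainerID(podUID, containerName string) (string, bool) {
    id, ok := m[key(podUID, containerName)]
    return id, ok
}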
Signed-off-by: Byonggon Chun <bg.chun@samsung.com>
Do a conversion from the cgroups v1 limits to cgroups v2.
For example, cpu.shares on cgroups v1 has a range of [2-262144], while the
equivalent on cgroups v2 is cpu.weight, which uses a range of [1-10000].
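The conversion is the usual linear mapping between the two ranges; a small
self-contained sketch:

package main

import "fmt"

// cpuSharesToCPUWeight maps cgroups v1 cpu.shares [2-262144] onto the
// cgroups v2 cpu.weight range [1-10000] with a linear conversion.
func cpuSharesToCPUWeight(shares uint64) uint64 {
    if shares < 2 {
        shares = 2 // cpu.shares cannot go below 2 on cgroups v1
    }
    return 1 + ((shares-2)*9999)/262142
}

func main() {
    fmt.Println(cpuSharesToCPUWeight(2))      // 1
    fmt.Println(cpuSharesToCPUWeight(1024))   // 39 (the default shares)
    fmt.Println(cpuSharesToCPUWeight(262144)) // 10000
}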
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Rename GetTopologyPodAdmitHandler() to GetAllocateResourcesPodAdmitHandler().
It is named as such to reflect its new function. Also remove the Topology
Manager feature gate check at the higher level in kubelet.go, as it is now
done in GetAllocateResourcesPodAdmitHandler().
GetTopologyPodAdmitHandler() now returns a lifecycle.PodAdmitHandler
type instead of the TopologyManager directly. The handler it returns
is generally responsible for attempting to allocate any resources that
require a pod admission check. When the TopologyManager feature gate
is on, this comes directly from the TopologyManager. When it is off,
we simply attempt the allocations ourselves and fail the admission
on an unexpected error. The higher level kubelet.go feature gate
check will be removed in an upcoming PR.
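The fallback handler is conceptually just this (placeholder types standing
in for lifecycle.PodAdmitHandler and the resource managers):

package sketch

// pod and admitResult stand in for *v1.Pod and lifecycle.PodAdmitResult.
type pod struct{ Name string }
type admitResult struct {
    Admit   bool
    Reason  string
    Message string
}

// allocator is any resource manager exposing an admission-time Allocate().
type allocator interface {
    Allocate(p *pod) error
}

// resourceAllocator is what the handler degenerates to when the
// TopologyManager feature gate is off: try every allocation and fail
// admission on an unexpected error.
type resourceAllocator struct {
    providers []allocator
}

func (r *resourceAllocator) Admit(p *pod) admitResult {
    for _, a := range r.providers {
        if err := a.Allocate(p); err != nil {
            return admitResult{Admit: false, Reason: "UnexpectedAdmissionError", Message: err.Error()}
        }
    }
    return admitResult{Admit: true}
}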
Instead of having a single call for Allocate(), we now split this into two
functions, Allocate() and UpdatePluginResources().
The semantics are split across them as follows:
// Allocate configures and assigns devices to a pod. From the requested
// device resources, Allocate will communicate with the owning device
// plugin to allow setup procedures to take place, and for the device
// plugin to provide runtime settings to use the device (environment
// variables, mount points and device files).
Allocate(pod *v1.Pod) error

// UpdatePluginResources updates node resources based on devices already
// allocated to pods. The node object is provided for the device manager to
// update the node capacity to reflect the currently available devices.
UpdatePluginResources(
    node *schedulernodeinfo.NodeInfo,
    attrs *lifecycle.PodAdmitAttributes) error
As we move to a model in which the TopologyManager is able to ensure
aligned allocations from the CPUManager, devicemanager, and any
other TopologyManager HintProviders in the same synchronous loop, we will
need to be able to call Allocate() independently from
UpdatePluginResources(). This commit makes that possible.
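With the split, the synchronous admission loop can look roughly like this
(placeholder types; the real flow lives in the TopologyManager and the
kubelet admit path):

package sketch

// Placeholder types standing in for *v1.Pod, the scheduler node info and
// the pod admission attributes.
type pod struct{ Name string }
type nodeInfo struct{}
type admitAttrs struct{ Pod *pod }

// deviceManager is the split interface described above.
type deviceManager interface {
    Allocate(p *pod) error
    UpdatePluginResources(n *nodeInfo, attrs *admitAttrs) error
}

// admit drives Allocate() on its own, in the same synchronous loop as the
// other hint providers, and only then updates the node's plugin resources.
func admit(dm deviceManager, n *nodeInfo, attrs *admitAttrs) error {
    if err := dm.Allocate(attrs.Pod); err != nil {
        return err
    }
    return dm.UpdatePluginResources(n, attrs)
}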