This was added by commit a9772b2290.
In the current codebase, the cgroup being updated was created using
runc/opencontainers' manager.Apply(), which already does controllers
propagation, so there is no need to repeat that on every update.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This sets cgroup config via libcontainer to make sure we apply the
correct values to the systemd slices and scopes.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
runc rc95 contains a fix for CVE-2021-30465.
runc rc94 provides fixes and improvements.
One notable change is cgroup manager's Set now accept Resources rather
than Cgroup (see https://github.com/opencontainers/runc/pull/2906).
Modify the code accordingly.
Also update runc dependencies (as hinted by hack/lint-depdendencies.sh):
github.com/cilium/ebpf v0.5.0
github.com/containerd/console v1.0.2
github.com/coreos/go-systemd/v22 v22.3.1
github.com/godbus/dbus/v5 v5.0.4
github.com/moby/sys/mountinfo v0.4.1
golang.org/x/sys v0.0.0-20210426230700-d19ff857e887
github.com/google/go-cmp v0.5.4
github.com/kr/pretty v0.2.1
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
One notable change is cgroup manager's Set now accept Resources rather
than Cgroup (see https://github.com/opencontainers/runc/pull/2906).
Modify the code accordingly.
Also update runc dependencies (as hinted by hack/lint-depdendencies.sh):
github.com/cilium/ebpf v0.5.0
github.com/containerd/console v1.0.2
github.com/coreos/go-systemd/v22 v22.3.1
github.com/godbus/dbus/v5 v5.0.4
github.com/moby/sys/mountinfo v0.4.1
golang.org/x/sys v0.0.0-20210426230700-d19ff857e887
github.com/google/go-cmp v0.5.4
github.com/kr/pretty v0.2.1
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
- libcontainer renamed
`github.com/opencontainers/runc/libcontainer/configs` to
`github.com/opencontainers/runc/libcontainer/devices` so use the new
references
- Update `dockershim` `ContainerCreate` call after docker update to
v20.10.2
This fixes issues where kubelet enforces qos and nodeAllocatable on the
worng hierarchy. Kublet will now create the files
/sys/fs/cgroup/kubepods/{burstable,besteffort,}/pod-xyz
when running with systemd as the driver, making it impossible to enforce
the limits on nodeAllocatable.
use the new libcontainer feature of skipping setting the devices
cgroup. This is necessary on cgroup v2 to avoid leaking a eBPF
program every time the cgroup is re-configured.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
do a conversion from the cgroups v1 limits to cgroups v2.
e.g. cpu.shares on cgroups v1 has a range of [2-262144] while the
equivalent on cgroups v2 is cpu.weight that uses a range [1-10000].
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Unit test for updating container hugepage limit
Add warning message about ignoring case.
Update error handling about hugepage size requirements
Signed-off-by: sewon.oh <sewon.oh@samsung.com>
cause by kubelet startup be interrupted on setting list of cgroups
In the 'cgroupManagerImpl.Exists' not check&recreate the hugetlb cgroup dir. Then setting the limits in non-exist cgroup dir will cause kubelet start failed.
Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>
Use the exported list from runc that uses "KB" and not "kB".
This issue breaks kubelet on AArch64 (arm 64).
var HugePageSizeUnitList = []string{"B", "KB", "MB", "GB", "TB", "PB"}
The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).
The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with huge
pages smaller than 1MiB. This is the case for AArch64.
As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.
Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"
And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"
Signed-off-by: Odin Ugedal <odin@ugedal.com>
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
The slice of strings more precisely captures the hierarchic nature of
the cgroup paths we use to represent pods and their groupings.
It also ensures we're reducing the chances of passing an incorrect path
format to a cgroup driver that requires a different path naming, since
now explicit conversions are always needed.
The new constructor NewCgroupName starts from an existing CgroupName,
which enforces a hierarchy where a root is always needed. It also
performs checking on the component names to ensure invalid characters
("/" and "_") are not in use.
A RootCgroupName for the top of the cgroup hierarchy tree is introduced.
This refactor results in a net reduction of around 30 lines of code,
mainly with the demise of ConvertCgroupNameToSystemd which had fairly
complicated logic in it and was doing just too many things.
There's a small TODO in a helper updateSystemdCgroupInfo that was
introduced to make this commit possible. That logic really belongs in
libcontainer, I'm planning to send a PR there to include it there.
(The API already takes a field with that information, only that field is
only processed in cgroupfs and not systemd driver, we should fix that.)
Tested by running the e2e-node tests on both Ubuntu 16.04 (with cgroupfs
driver) and CentOS 7 (with systemd driver.)
Add a new Alpha Feature to set a maximum number of pids per Pod.
This is to allow the use case where cluster administrators wish
to limit the pids consumed per pod (example when running a CI system).
By default, we do not set any maximum limit, If an administrator wants
to enable this, they should enable `SupportPodPidsLimit=true` in the
`--feature-gates=` parameter to kubelet and specify the limit using the
`--pod-max-pids` parameter.
The limit set is the total count of all processes running in all
containers in the pod.
This PR adds the pod-level metrics for CPU and memory stats. cAdvisor
can get all pod cgroup information so we can add this pod-level CPU and
memory stats information from the corresponding pod cgroup