1382 Commits

Author SHA1 Message Date
Kevin Klues
8278d1134c Consume TopologyHints in the CPUManager
Co-Authored-By: Conor Nolan <conor.nolan@intel.com>
2019-08-14 06:22:56 +02:00
Sreemanti Ghosh
7c626a2a00 Add CPUManager tests for TopologyHint generation
Co-Authored-By: Conor Nolan <conor.nolan@intel.com>
Co-Authored-By: Kevin Klues <kklues@nvidia.com>
2019-08-14 06:22:56 +02:00
Kevin Klues
156b3f6af8 Generate TopologyHints from the CPUManager 2019-08-14 06:22:56 +02:00
Kevin Klues
9a6788cb13 Add IterateSocketMasks() function to socketmask abstraction 2019-08-14 06:22:56 +02:00
Kubernetes Prow Robot
ac2295a24d Merge pull request #78587 from kad/socketmask-string
Use go standard library for common bit operations
2019-08-13 00:03:39 -07:00
Kubernetes Prow Robot
d47f9ff132 Merge pull request #81086 from dims/fix-incorrect-readlink-check-for-checking-kernel-pids
[TOB-K8S-027] Fix Incorrect isKernelPid check
2019-08-08 17:58:04 -07:00
Davanum Srinivas
bd925d6611 [TOB-K8S-027] Fix Incorrect isKernelPid check
isKernelPid should explicitly check the error returned from os.Readlink and return true
only if the error value is ENOENT. Without this fix, if Readlink
returned, say, ENAMETOOLONG or EACCES, we would still count the process as
a kernel process (which is not true).
2019-08-07 11:19:19 -04:00
Davanum Srinivas
bc71c23bee [TOB-K8S-025] Incorrect docker daemon process name in container manager
The container manager used in the kubelet looks up the docker daemon process either via a pidfile
or by process name. While the pidfile points to the docker daemon process PID, the
dockerProcessName constant stores the docker CLI name (docker) instead of the docker daemon
name (dockerd).
2019-08-07 10:59:37 -04:00
Conor Nolan
e33af11add Add stub support for TopologyManager to CPUManager
Co-Authored-By: Louise Daly <louise.m.daly@intel.com>
2019-08-07 15:56:05 +02:00
Jianfei Bai
5726b22fbc Move docker specific const to dockershim. 2019-08-05 10:28:08 +08:00
Kubernetes Prow Robot
c63000ef81 Merge pull request #78793 from mattjmcnaughton/mattjmcnaughton/78629-fix-reserved-cgroup-systemd
Fix reserved cgroup systemd
2019-08-02 17:23:52 -07:00
Kubernetes Prow Robot
93e6fb30f0 Merge pull request #74357 from lmdaly/topology-manager-container-manager
Updates to container manager and internal container lifecycle to accommodate TopologyManager
2019-08-01 11:52:17 -07:00
Kubernetes Prow Robot
1a8844cd03 Merge pull request #80683 from moshe010/rename_files
TopologyManager: Fix rename best-effort policy files
2019-07-31 00:25:00 -07:00
Kubernetes Prow Robot
320bc21dbe Merge pull request #78762 from klueska/upstream-inherit-cpus-from-init-containers
Proactively remove init Containers in CPUManager static policy
2019-07-30 03:35:18 -07:00
Moshe Levi
3b83c5c7c6 TopologyManager: Fix rename best-effort policy files
PR https://github.com/kubernetes/kubernetes/pull/80301 renamed
the preferred policy to best-effort, but the file names are
still policy_preferred.go and policy_preferred_test.go. This
PR fixes that.
2019-07-28 19:35:16 +03:00
Kevin Klues
9f36f1a173 Add tests for proactive init Container removal in the CPUManager static policy 2019-07-26 14:34:51 +02:00
Kevin Klues
6a7db380de Add tests for new containerMap type in the CPUManager 2019-07-26 14:34:51 +02:00
Kevin Klues
c6d9bbcb74 Proactively remove init Containers in CPUManager static policy
This patch fixes a bug in the CPUManager, whereby it doesn't honor the
"effective requests/limits" of a Pod as defined by:

    https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources

The rule states that a Pod’s "effective request/limit" for a resource
should be the larger of:
    * The highest of any particular resource request or limit
      defined on all init Containers
    * The sum of all app Containers request/limit for a
      resource

Moreover, the rule states that:
    * The effective QoS tier is the same for init Containers
      and app containers alike

This means that the resource requests of init Containers and app
Containers should be able to overlap, such that the larger of the two
becomes the "effective resource request/limit" for the Pod. Likewise,
if a QoS tier of "Guaranteed" is determined for the Pod, then both init
Containers and app Containers should run in this tier.

In its current implementation, the CPU manager honors the effective QoS
tier for both init and app containers, but doesn't honor the "effective
request/limit" correctly.

Instead, it treats the "effective request/limit" as:
    * The sum of all init Containers plus the sum of all app
      Containers request/limit for a resource

It does this by not proactively removing the CPUs given to previous init
containers when new containers are being created. In the worst case,
this causes the CPUManager to give non-overlapping CPUs to all
containers (whether init or app) in the "Guaranteed" QoS tier before any
of the containers in the Pod actually start.

This effectively blocks these Pods from running if the total number of
CPUs being requested across init and app Containers goes beyond the
limits of the system.

This patch fixes this problem by updating the CPUManager static policy
so that it proactively removes any guaranteed CPUs it has granted to
init Containers before allocating CPUs to app containers. Since all init
containers run sequentially, it also makes sure this proactive
removal happens for previous init containers when allocating CPUs to
later ones.
2019-07-26 14:34:51 +02:00
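The "effective request/limit" rule quoted in the commit above can be illustrated with a small sketch. The function names and the use of whole-CPU integer counts are assumptions made for illustration; this is not the CPUManager's code.

```go
package main

import "fmt"

// effectiveCPURequest returns the larger of:
//   - the highest request among the init containers, and
//   - the sum of the requests of all app containers.
func effectiveCPURequest(initCPUs, appCPUs []int) int {
	maxInit := 0
	for _, c := range initCPUs {
		if c > maxInit {
			maxInit = c
		}
	}
	sumApp := 0
	for _, c := range appCPUs {
		sumApp += c
	}
	if maxInit > sumApp {
		return maxInit
	}
	return sumApp
}

// summedCPURequest models the buggy behavior described in the commit:
// CPUs of earlier init containers are never released, so every
// container's request adds up.
func summedCPURequest(initCPUs, appCPUs []int) int {
	total := 0
	for _, c := range initCPUs {
		total += c
	}
	for _, c := range appCPUs {
		total += c
	}
	return total
}

func main() {
	initCPUs := []int{2, 4} // two init containers
	appCPUs := []int{1, 2}  // two app containers
	fmt.Println(effectiveCPURequest(initCPUs, appCPUs)) // 4
	fmt.Println(summedCPURequest(initCPUs, appCPUs))    // 9
}
```

On a node with, say, 8 allocatable CPUs, the Pod in this example fits under the effective rule (4 CPUs) but would be blocked under the summed behavior (9 CPUs).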
Kevin Klues
7eccc71c9e Rename 'preferred' TopologyManager policy to 'best-effort' 2019-07-25 10:44:36 +02:00
Louise Daly
9f0081cc36 Updates to container manager and internal container lifecycle to accommodate Topology Manager
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
2019-07-24 08:09:38 +01:00
Kubernetes Prow Robot
5b496fe8f5 Merge pull request #80315 from klueska/upstream-cleanup-socketmask
Cleanup the TopologyManager socketmask abstraction
2019-07-23 11:40:28 -07:00
Kevin Klues
65b07312b0 Cleanup comments in TopologyManager socketmask abstraction 2019-07-18 18:52:19 -07:00
Kevin Klues
0edfd4be35 Add package level And/Or calls to TopologyManager socketmask abstraction 2019-07-18 09:06:51 -07:00
Kevin Klues
434fd34e0b Add NewEmptySocketMask() call to TopologyManager socketmask abstraction 2019-07-18 09:05:55 -07:00
Kevin Klues
4ee5d5409e Update the topologymanager to error out if an invalid policy is given
Previously, the topologymanager would simply fall back to the None() policy
if an invalid policy was specified. This patch updates this to return an
error when an invalid policy is passed, forcing the kubelet to fail
fast when this occurs.

These semantics should be preferable because an invalid policy likely
indicates an operator error when setting the policy flag on the kubelet
(e.g. misspelling 'strict' as 'striict'). In this case it is
better to fail fast so the operator can detect and correct the
mistake than to mask the error and essentially disable the
topologymanager unexpectedly.
2019-07-18 13:24:09 +02:00
Kevin Klues
5dc5f1de06 Update the cpumanager to error out if an invalid policy is given
Previously, the cpumanager would simply fall back to the None() policy
if an invalid policy was specified. This patch updates this to return an
error when an invalid policy is passed, forcing the kubelet to fail
fast when this occurs.

These semantics should be preferable because an invalid policy likely
indicates an operator error when setting the policy flag on the kubelet
(e.g. misspelling 'static' as 'statiic'). In this case it is
better to fail fast so the operator can detect and correct the
mistake than to mask the error and essentially disable the cpumanager
unexpectedly.
2019-07-18 13:24:09 +02:00
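The fail-fast behavior described in the two commits above can be sketched like this. The type, constants, and function name are hypothetical and only model the idea of returning an error instead of silently falling back to the none policy.

```go
package main

import "fmt"

// policyName is a hypothetical type standing in for a manager policy.
type policyName string

const (
	policyNone   policyName = "none"
	policyStatic policyName = "static"
)

// newPolicy returns an error for unknown names rather than falling back
// to policyNone, so a typo in the kubelet flag surfaces at startup.
func newPolicy(name string) (policyName, error) {
	switch policyName(name) {
	case policyNone, policyStatic:
		return policyName(name), nil
	default:
		return "", fmt.Errorf("unknown policy: %q", name)
	}
}

func main() {
	if _, err := newPolicy("statiic"); err != nil {
		// The caller is expected to abort startup at this point.
		fmt.Println("startup error:", err)
	}
}
```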
Kubernetes Prow Robot
1125054612 Merge pull request #80235 from moshe010/remove_string
Remove unnecessary string() from policy_none
2019-07-17 19:34:49 -07:00
Louise Daly
9d7e31e66e Topology Manager Implementation based on Interfaces
Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
2019-07-17 02:30:21 +01:00
Moshe Levi
d52985d3a0 Remove unnecessary string() from policy_none
Signed-off-by: Moshe Levi <moshele@mellanox.com>
2019-07-17 01:58:43 +03:00
Kubernetes Prow Robot
4197adaf2d Merge pull request #79343 from nolancon/topology-manager-none
Add Policy None for Topology Manager
2019-07-16 13:22:47 -07:00
Kubernetes Prow Robot
80537a9c5f Merge pull request #77323 from tedyu/cgroup-mgr-linux
Check error return from Update
2019-07-15 14:53:24 -07:00
Kubernetes Prow Robot
923f08e29b Merge pull request #79900 from mikebrow/todo-cleanup-container-manager-linux
update code documentation to reflect change in status
2019-07-11 18:33:35 -07:00
Kubernetes Prow Robot
920ac08361 Merge pull request #76518 from haiyanmeng/limit
Limit the read length of ioutil.ReadAll in `pkg/kubelet` and `pkg/probe`
2019-07-11 17:01:07 -07:00
Kubernetes Prow Robot
f0d1b10092 Merge pull request #77429 from tedyu/container-linux-err
Avoid unnecessary concatenation of errors
2019-07-11 14:33:08 -07:00
Haiyan Meng
1f270ef4e2 Limit the read length of ioutil.ReadAll in pkg/kubelet and pkg/probe
Signed-off-by: Haiyan Meng <haiyanmeng@google.com>
2019-07-11 13:18:06 -07:00
Kubernetes Prow Robot
d4d8daea73 Merge pull request #78558 from tedyu/policy-str
Remove unnecessary string()
2019-07-11 13:13:06 -07:00
Kubernetes Prow Robot
858fce1634 Merge pull request #79531 from odinuge/kubelet-dead-code
Remove unnecessary variable declaration
2019-07-08 14:28:01 -07:00
Mike Brown
6da266784a update code documentation to reflect change in status
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2019-07-08 16:15:59 -05:00
Odin Ugedal
4ee5fe23e8 Fix cgroup hugetlb size prefix for kB
Use the exported list from runc that uses "KB" and not "kB".

This issue breaks kubelet on AArch64 (arm 64).

var HugePageSizeUnitList = []string{"B", "KB", "MB", "GB", "TB", "PB"}

The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).

The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with huge
pages smaller than 1MiB. This is the case for AArch64.

As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.

Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"

And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"

Signed-off-by: Odin Ugedal <odin@ugedal.com>
2019-06-28 21:28:26 +02:00
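The mapping from a huge page size to the cgroup file prefix shown in the commit above can be sketched as follows. The function name is hypothetical and the unit list is copied from the commit message; the actual fix relies on the list exported by runc rather than on code like this.

```go
package main

import "fmt"

// hugePageSizeUnitList mirrors the list quoted in the commit message.
// Note the upper-case "KB", matching the kernel's hugetlb cgroup files.
var hugePageSizeUnitList = []string{"B", "KB", "MB", "GB", "TB", "PB"}

// hugePageSizeName (hypothetical) formats a size in bytes using the
// largest unit that divides it evenly, with 1024 as the base.
func hugePageSizeName(sizeBytes uint64) string {
	i := 0
	for sizeBytes >= 1024 && sizeBytes%1024 == 0 && i < len(hugePageSizeUnitList)-1 {
		sizeBytes /= 1024
		i++
	}
	return fmt.Sprintf("%d%s", sizeBytes, hugePageSizeUnitList[i])
}

func main() {
	// Sizes taken from the /sys/kernel/mm/hugepages/ example, in kB.
	for _, kb := range []uint64{64, 2048, 32768, 1048576} {
		fmt.Printf("hugepages-%dkB -> hugetlb.%s\n", kb, hugePageSizeName(kb*1024))
	}
}
```

With a lower-case "kB" unit, the 64 kB page size would yield a file name that does not exist in the cgroup, which is the failure the commit describes for AArch64.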
Odin Ugedal
2bcdb944f0 Update dependency opencontainers/runc 2019-06-28 21:23:05 +02:00
Odin Ugedal
9c2aa843bd Remove unnecessary variable declaration 2019-06-28 18:03:23 +02:00
Kubernetes Prow Robot
c64f81d082 Merge pull request #78653 from sjenning/add-sjenning-owners
kubelet: add sjenning to kubelet subdirectory owners files
2019-06-25 14:47:15 -07:00
nolancon
2d7ac702d6 Add Policy None for Topology Manager
Update naming of test functions.
2019-06-25 03:24:31 +01:00
rafatio
08c258add9 Ignore cgroup pid support if related feature gates are disabled 2019-06-15 18:45:27 -03:00
Kubernetes Prow Robot
d30fbab4b8 Merge pull request #77915 from SataQiu/fix-golint-util-20190515
Fix golint failures of pkg/util/parsers pkg/util/sysctl pkg/util/system
2019-06-14 00:29:00 -07:00
mattjmcnaughton
5539e61032 Fix reserved cgroup systemd
Fix an issue in which, when trying to specify the `--kube-reserved-cgroup`
(or `--system-reserved-cgroup`) with `--cgroup-driver=systemd`, we will
not properly convert the `systemd` cgroup name into the internal cgroup
name that k8s expects. Without this change, specifying
`--kube-reserved-cgroup=/test.slice --cgroup-driver=systemd` will fail,
and only `--kube-reserved-cgroup=/test --cgroup-driver=systemd` will succeed,
even if the actual cgroup existing on the host is `/test.slice`.

Additionally, add light unit testing of our process for converting a
systemd cgroup name to the Kubernetes internal cgroup name.
2019-06-07 10:48:42 -04:00
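The conversion the commit above refers to can be approximated with a short sketch. The function name is hypothetical, and the handling is simplified (no unescaping of systemd names); it assumes the systemd convention that `a-b.slice` denotes the nested cgroup `/a/b`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// systemdToInternalName (hypothetical) converts a systemd slice name
// such as "/test.slice" into an internal cgroup path such as "/test".
// Dashes in the slice name are expanded into path components.
func systemdToInternalName(name string) string {
	base := path.Base(name)
	base = strings.TrimSuffix(base, ".slice")
	parts := strings.Split(base, "-")
	return "/" + strings.Join(parts, "/")
}

func main() {
	fmt.Println(systemdToInternalName("/test.slice"))
	fmt.Println(systemdToInternalName("/kubepods.slice/kubepods-burstable.slice"))
}
```

Under this sketch, an operator can pass the cgroup name as it exists on the host (`/test.slice`) and the internal name (`/test`) is derived from it.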
Seth Jennings
89dc2c65e4 kubelet: add sjenning to kubelet subdirectory owners files 2019-06-03 08:26:24 -05:00
Alexander Kanevskiy
89481f8c27 Use go standard library for common bit operations
PR#72913 introduced its own versions of the bit operations, which are
less efficient than the ones from the standard library.
2019-06-01 19:54:38 +03:00
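As an illustration of the commit above, the standard `math/bits` package already covers common bit operations. The `socketMask` type and its methods below are hypothetical stand-ins, not the TopologyManager's actual abstraction.

```go
package main

import (
	"fmt"
	"math/bits"
)

// socketMask is a hypothetical bitmask with one bit per socket.
type socketMask uint64

// Count returns the number of sockets set in the mask.
func (m socketMask) Count() int {
	return bits.OnesCount64(uint64(m))
}

// Lowest returns the index of the lowest set socket, or -1 if the mask
// is empty (bits.TrailingZeros64 returns 64 for a zero value).
func (m socketMask) Lowest() int {
	if m == 0 {
		return -1
	}
	return bits.TrailingZeros64(uint64(m))
}

func main() {
	m := socketMask(0x0A) // sockets 1 and 3
	fmt.Println(m.Count(), m.Lowest()) // 2 1
}
```

These library functions are typically compiled to dedicated CPU instructions where available, which is the efficiency argument the commit makes.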
Kubernetes Prow Robot
9ac58bae56 Merge pull request #78515 from klueska/upstream-socketmask-updates
Updates to the SocketMask abstraction for the TopologyManager
2019-06-01 09:50:16 -07:00
Kubernetes Prow Robot
46c74629cf Merge pull request #78516 from klueska/upstream-topology-manager-interface-updates
Update the TopologyManager interfaces
2019-06-01 08:00:19 -07:00