kubernetes

Author	SHA1	Message	Date
Louise Daly	2fb94231d0	Renaming strict policy to restricted policy Restricted policy will fail admission of guaranteed pods where all requested resources are not available on a single NUMA Node	2019-08-22 07:57:55 +01:00
Tim Allclair	a2c51674cf	Cleanup more static check issues (S1,ST)	2019-08-21 10:40:21 -07:00
Tim Allclair	8a495cb5e4	Clean up error messages (ST1005)	2019-08-21 10:40:21 -07:00
Tim Allclair	6510d26b6a	Fix misc static check issues	2019-08-21 10:40:21 -07:00
Tim Allclair	3f510c69f6	Remove dead code from pkg/kubelet/...	2019-08-21 10:40:21 -07:00
Tim Allclair	49f50484b8	Delete duplicate resource.Quantity.Copy()	2019-08-19 17:23:14 -07:00
Kevin Klues	4fdd52b058	Update GetTopologyHints() API to return a map At present, there is no way for a hint provider to return distinct hints for different resource types via a call to GetTopologyHints(). This means that hint providers that govern multiple resource types (e.g. the devicemanager) must do some sort of "pre-merge" on the hints it generates for each resource type before passing them back to the TopologyManager. This patch changes the GetTopologyHints() interface to allow a hint provider to pass back raw hints for each resource type, and allow the TopologyManager to merge them using a single unified strategy. This change also allows the TopologyManager to recognize which resource type a set of hints originated from, should this information become useful in the future.	2019-08-16 08:06:12 +02:00
Kubernetes Prow Robot	f2dd24820a	Merge pull request #73920 from nolancon/topology-manager-cpu-manager Changes to make CPU Manager a Hint Provider for Topology Manager	2019-08-15 05:44:33 -07:00
Kevin Klues	b3f4bed97f	Add CPUManager tests for TopologyHint consumption	2019-08-14 06:22:56 +02:00
Kevin Klues	8278d1134c	Consume TopologyHints in the CPUManager Co-Authored-By: Conor Nolan <conor.nolan@intel.com>	2019-08-14 06:22:56 +02:00
Sreemanti Ghosh	7c626a2a00	Add CPUManager tests for TopologyHint generation Co-Authored-By: Conor Nolan <conor.nolan@intel.com> Co-Authored-By: Kevin Klues <kklues@nvidia.com>	2019-08-14 06:22:56 +02:00
Kevin Klues	156b3f6af8	Generate TopologyHints from the CPUManager	2019-08-14 06:22:56 +02:00
Kevin Klues	9a6788cb13	Add IterateSocketMasks() function to socketmask abstraction	2019-08-14 06:22:56 +02:00
Kubernetes Prow Robot	ac2295a24d	Merge pull request #78587 from kad/socketmask-string Use go standard library for common bit operations	2019-08-13 00:03:39 -07:00
Kubernetes Prow Robot	d47f9ff132	Merge pull request #81086 from dims/fix-incorrect-readlink-check-for-checking-kernel-pids [TOB-K8S-027] Fix Incorrect isKernelPid check	2019-08-08 17:58:04 -07:00
Davanum Srinivas	bd925d6611	[TOB-K8S-027] Fix Incorrect isKernelPid check isKernelPid should explicitly check the error returned from os.Readlink and return true only if the error value is ENOENT. Without this fix, if Readlink returned, say, ENAMETOOLONG or EACCES, we would still count the process as a kernel process (which is not true).	2019-08-07 11:19:19 -04:00
Davanum Srinivas	bc71c23bee	[TOB-K8S-025] Incorrect docker daemon process name in container manager The container manager used in kubelet checks for docker daemon process either via pidfile or process name. While the pidfile points to the docker daemon process PID, the dockerProcessName constant stores a docker cli name ( docker ) instead of docker daemon name ( dockerd ).	2019-08-07 10:59:37 -04:00
Conor Nolan	e33af11add	Add stub support for TopologyManager to CPUManager Co-Authored-By: Louise Daly <louise.m.daly@intel.com>	2019-08-07 15:56:05 +02:00
Jianfei Bai	5726b22fbc	Move docker specific const to dockershim.	2019-08-05 10:28:08 +08:00
Kubernetes Prow Robot	c63000ef81	Merge pull request #78793 from mattjmcnaughton/mattjmcnaughton/78629-fix-reserved-cgroup-systemd Fix reserved cgroup systemd	2019-08-02 17:23:52 -07:00
Kubernetes Prow Robot	93e6fb30f0	Merge pull request #74357 from lmdaly/topology-manager-container-manager Updates to container manager and internal container lifecycle to accommodate TopologyManager	2019-08-01 11:52:17 -07:00
Kubernetes Prow Robot	1a8844cd03	Merge pull request #80683 from moshe010/rename_files TopologyManager: Fix rename best-effort policy files	2019-07-31 00:25:00 -07:00
Kubernetes Prow Robot	320bc21dbe	Merge pull request #78762 from klueska/upstream-inherit-cpus-from-init-containers Proactively remove init Containers in CPUManager static policy	2019-07-30 03:35:18 -07:00
Moshe Levi	3b83c5c7c6	TopologyManager: Fix rename best-effort policy files PR https://github.com/kubernetes/kubernetes/pull/80301 rename the preferred policy to best-effort, but the files names are still policy_preferred.go and policy_preferred_test.go. This PR fix that.	2019-07-28 19:35:16 +03:00
Kevin Klues	9f36f1a173	Add tests for proactive init Container removal in the CPUManager static policy	2019-07-26 14:34:51 +02:00
Kevin Klues	6a7db380de	Add tests for new containertMap type in the CPUManager	2019-07-26 14:34:51 +02:00
Kevin Klues	c6d9bbcb74	Proactively remove init Containers in CPUManager static policy This patch fixes a bug in the CPUManager, whereby it doesn't honor the "effective requests/limits" of a Pod as defined by: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources The rule states that a Pod’s "effective request/limit" for a resource should be the larger of: * The highest of any particular resource request or limit defined on all init Containers * The sum of all app Containers request/limit for a resource Moreover, the rule states that: * The effective QoS tier is the same for init Containers and app containers alike This means that the resource requests of init Containers and app Containers should be able to overlap, such that the larger of the two becomes the "effective resource request/limit" for the Pod. Likewise, if a QoS tier of "Guaranteed" is determined for the Pod, then both init Containers and app Containers should run in this tier. In its current implementation, the CPU manager honors the effective QoS tier for both init and app containers, but doesn't honor the "effective request/limit" correctly. Instead, it treats the "effective request/limit" as: * The sum of all init Containers plus the sum of all app Containers request/limit for a resource It does this by not proactively removing the CPUs given to previous init containers when new containers are being created. In the worst case, this causes the CPUManager to give non-overlapping CPUs to all containers (whether init or app) in the "Guaranteed" QoS tier before any of the containers in the Pod actually start. This effectively blocks these Pods from running if the total number of CPUs being requested across init and app Containers goes beyond the limits of the system. This patch fixes this problem by updating the CPUManager static policy so that it proactively removes any guaranteed CPUs it has granted to init Containers before allocating CPUs to app containers. Since all init containers are run sequentially, it also makes sure this proactive removal happens for previous init containers when allocating CPUs to later ones.	2019-07-26 14:34:51 +02:00
Kevin Klues	7eccc71c9e	Rename 'preferred' TopologyManager policy to 'best-effort'	2019-07-25 10:44:36 +02:00
Louise Daly	9f0081cc36	Updates to container manager and internal container lifecycle to accommodate Topology Manager Co-authored-by: Conor Nolan <conor.nolan@intel.com>	2019-07-24 08:09:38 +01:00
Kubernetes Prow Robot	5b496fe8f5	Merge pull request #80315 from klueska/upstream-cleanup-socketmask Cleanup the TopologyManager socketmask abstraction	2019-07-23 11:40:28 -07:00
Kevin Klues	65b07312b0	Cleanup comments in TopologyManager socketmask abstraction	2019-07-18 18:52:19 -07:00
Kevin Klues	0edfd4be35	Add package level And/Or calls to TopologyManager socketmask abstraction	2019-07-18 09:06:51 -07:00
Kevin Klues	434fd34e0b	Add NewEmtpySocketMask() call to TopologyManager socketmask abstraction	2019-07-18 09:05:55 -07:00
Kevin Klues	4ee5d5409e	Update the topologymanager to error out if an invalid policy is given Previously, the topologymanager would simply fall back to the None() policy if an invalid policy was specified. This patch updates this to return an error when an invalid policy is passed, forcing the kubelet to fail fast when this occurs. These semantics should be preferable because an invalid policy likely indicates operator error in setting the policy flag on the kubelet correctly (e.g. misspelling 'strict' as 'striict'). In this case it is better to fail fast so the operator can detect this and correct the mistake, than to mask the error and essentially disable the topologymanager unexpectedly.	2019-07-18 13:24:09 +02:00
Kevin Klues	5dc5f1de06	Update the cpumanager to error out if an invalid policy is given Previously, the cpumanager would simply fall back to the None() policy if an invalid policy was specified. This patch updates this to return an error when an invalid policy is passed, forcing the kubelet to fail fast when this occurs. These semantics should be preferable because an invalid policy likely indicates operator error in setting the policy flag on the kubelet correctly (e.g. misspelling 'static' as 'statiic'). In this case it is better to fail fast so the operator can detect this and correct the mistake, than to mask the error and essentially disable the cpumanager unexpectedly.	2019-07-18 13:24:09 +02:00
Kubernetes Prow Robot	1125054612	Merge pull request #80235 from moshe010/remove_string Remove unnecessary string() from policy_none	2019-07-17 19:34:49 -07:00
Louise Daly	9d7e31e66e	Topology Manager Implementation based on Interfaces Co-authored-by: Kevin Klues <kklues@nvidia.com> Co-authored-by: Conor Nolan <conor.nolan@intel.com> Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>	2019-07-17 02:30:21 +01:00
Moshe Levi	d52985d3a0	Remove unnecessary string() from policy_none Signed-off-by: Moshe Levi <moshele@mellanox.com>	2019-07-17 01:58:43 +03:00
Kubernetes Prow Robot	4197adaf2d	Merge pull request #79343 from nolancon/topology-manager-none Add Policy None for Topology Manager	2019-07-16 13:22:47 -07:00
Kubernetes Prow Robot	80537a9c5f	Merge pull request #77323 from tedyu/cgroup-mgr-linux Check error return from Update	2019-07-15 14:53:24 -07:00
Kubernetes Prow Robot	923f08e29b	Merge pull request #79900 from mikebrow/todo-cleanup-container-manager-linux update code documentation to reflect change in status	2019-07-11 18:33:35 -07:00
Kubernetes Prow Robot	920ac08361	Merge pull request #76518 from haiyanmeng/limit Limit the read length of ioutil.ReadAll in `pkg/kubelet` and `pkg/probe`	2019-07-11 17:01:07 -07:00
Kubernetes Prow Robot	f0d1b10092	Merge pull request #77429 from tedyu/container-linux-err Avoid unnecessary concatenation of errors	2019-07-11 14:33:08 -07:00
Haiyan Meng	1f270ef4e2	Limit the read length of ioutil.ReadAll in `pkg/kubelet` and `pkg/probe` Signed-off-by: Haiyan Meng <haiyanmeng@google.com>	2019-07-11 13:18:06 -07:00
Kubernetes Prow Robot	d4d8daea73	Merge pull request #78558 from tedyu/policy-str Remove unnecessary string()	2019-07-11 13:13:06 -07:00
Kubernetes Prow Robot	858fce1634	Merge pull request #79531 from odinuge/kubelet-dead-code Remove unnecessary variable declaration	2019-07-08 14:28:01 -07:00
Mike Brown	6da266784a	update code documentation to reflect change in status Signed-off-by: Mike Brown <brownwm@us.ibm.com>	2019-07-08 16:15:59 -05:00
Odin Ugedal	4ee5fe23e8	Fix cgroup hugetlb size prefix for kB Use the exported list from runc that uses "KB" and not "kB". This issue breaks kubelet on AArch64 (arm 64). var HugePageSizeUnitList = []string{"B", "KB", "MB", "GB", "TB", "PB"} The hugetlb cgroup control files (introduced here in 2012: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773) use "KB" and not "kB" (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349). The behavior in the kernel has not changed since the introduction, and the current code using "kB" will therefore fail on devices with huge pages smaller than 1MiB. This is the case for AArch64. As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB", "MB" and "GB" are used, so the others may be removed as well. Here is a real world example of the files inside the "/sys/kernel/mm/hugepages/" directory: - "hugepages-64kB" - "hugepages-2048kB" - "hugepages-32768kB" - "hugepages-1048576kB" And the corresponding cgroup files: - "hugetlb.64KB._____" - "hugetlb.2MB._____" - "hugetlb.32MB._____" - "hugetlb.1GB._____" Signed-off-by: Odin Ugedal <odin@ugedal.com>	2019-06-28 21:28:26 +02:00
Odin Ugedal	2bcdb944f0	Update dependency opencontainer/runc	2019-06-28 21:23:05 +02:00
Odin Ugedal	9c2aa843bd	Remove unnecessary variable declaration	2019-06-28 18:03:23 +02:00
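
The "effective request/limit" rule quoted in commit c6d9bbcb74 can be illustrated with a short Go sketch. This is not the CPUManager code: the function names `effectiveRequest` and `summedRequest` and the use of plain `int64` CPU counts are assumptions made for illustration only.

```go
package main

import "fmt"

// effectiveRequest follows the rule quoted in the commit message: the larger of
// (a) the highest request among init containers and
// (b) the sum of the requests of all app containers.
// Hypothetical helper; requests are plain CPU counts for simplicity.
func effectiveRequest(initRequests, appRequests []int64) int64 {
	var maxInit, sumApp int64
	for _, r := range initRequests {
		if r > maxInit {
			maxInit = r
		}
	}
	for _, r := range appRequests {
		sumApp += r
	}
	if maxInit > sumApp {
		return maxInit
	}
	return sumApp
}

// summedRequest models the behavior the commit describes as the bug:
// every init and app container is counted separately, with no overlap.
func summedRequest(initRequests, appRequests []int64) int64 {
	var total int64
	for _, r := range initRequests {
		total += r
	}
	for _, r := range appRequests {
		total += r
	}
	return total
}

func main() {
	initCPUs := []int64{4, 2} // two init containers, run sequentially
	appCPUs := []int64{1, 2}  // two app containers, run concurrently

	fmt.Println(effectiveRequest(initCPUs, appCPUs)) // prints 4
	fmt.Println(summedRequest(initCPUs, appCPUs))    // prints 9
}
```

With these example values, a node with fewer than 9 free exclusive CPUs would refuse the Pod under the summed accounting even though 4 CPUs are enough under the documented rule.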

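The naming mismatch described in commit 4ee5fe23e8 (sysfs directories use "kB", hugetlb cgroup files use "KB", "MB" and "GB") can be sketched as follows. The helper `hugetlbSizeName` is hypothetical and uses only the standard library; the actual fix relies on the unit list exported by runc, which is not reproduced here.

```go
package main

import "fmt"

// hugetlbSizeName converts a huge page size given in kB (as it appears in
// "/sys/kernel/mm/hugepages/hugepages-<N>kB") into the size string used in
// hugetlb cgroup file names, e.g. 2048 -> "2MB".
// Hypothetical helper for illustration; it only knows the units the commit
// message says the kernel uses: "KB", "MB" and "GB".
func hugetlbSizeName(sizeKB uint64) string {
	units := []string{"KB", "MB", "GB"}
	i := 0
	// Move to the next larger unit while the value is at least 1024.
	for sizeKB >= 1024 && i < len(units)-1 {
		sizeKB /= 1024
		i++
	}
	return fmt.Sprintf("%d%s", sizeKB, units[i])
}

func main() {
	// Sizes taken from the real-world example in the commit message.
	for _, kb := range []uint64{64, 2048, 32768, 1048576} {
		fmt.Printf("hugepages-%dkB -> hugetlb.%s\n", kb, hugetlbSizeName(kb))
	}
}
```

The 64 kB case is the one the commit highlights: a lowercase "kB" prefix yields a cgroup file name that does not exist, which is why only pages smaller than 1 MiB were affected.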

791 Commits