kubernetes

Author	SHA1	Message	Date
Francesco Romani	6dcec345df	smtalign: cm: factor out admission response Introduce a new `admission` subpackage to factor out the responsibility to create `PodAdmitResult` objects. This enables resource manager to report specific errors in Allocate() and to bubble them up in the relevant fields of the `PodAdmitResult`. To demonstrate the approach we refactor TopologyAffinityError as a proper error. Co-authored-by: Kevin Klues <kklues@nvidia.com> Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:37 +02:00
Krzysztof Wiatrzyk	656a08afdf	Move scope specific tests from topologymanager under particular scopes Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk	c786c9a533	Move common tests from topologymanager under scope Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk	f5c0fe4ef6	Update topologymanager tests after adding scopes Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:55 +01:00
sw.han	f5997fe537	Add GetPodTopologyHints() interface to Topology/CPU/Device Manager Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:54 +01:00
Kevin Klues	9e4ee5ecc3	Add Allocate() call to TopologyManager's HintProvider interface Having this interface allows us to perform a tight loop of: for each container { containerHints = {} for each provider { containerHints[provider] = provider.GatherHints(container) } containerHints.MergeAndPublish() for each provider { provider.Allocate(container) } } With this in place we can now be sure that the hints gathered in one iteration of the loop always consider the allocations made in the previous.	2020-02-10 03:27:47 +00:00
Kevin Klues	bc686ea27b	Update TopologyManager.GetTopologyHints() to take pointers Previously, this function was taking full Pod and Container objects unnecessarily. This commit updates this so that they will take pointers instead.	2020-02-03 17:13:28 +00:00
Kevin Klues	adaa58b6cb	Update TopologyManager.Policy.Merge() to return a simple bool Previously, the various Merge() policies of the TopologyManager all returned their own lifecycle.PodAdmitResult result. However, for consistency in any failed admits, this is better handled in the top-level Topology manager, with each policy only returning a boolean indicating whether or not they would like to admit the pod. This commit changes the semantics to match this logic.	2020-02-03 17:13:28 +00:00
Adrian Chiris	9f21f49493	Additional unit tests for Topology Manager methods	2020-01-16 08:13:05 +00:00
Kevin Klues	7ea1fc9be4	Move TopologyManager TestPolicyMerge() to shared test file	2019-11-04 18:43:07 +01:00
Kevin Klues	d7d7bfcda0	Abstract TopologyManager Policy Merge() tests into their own function	2019-11-04 18:43:07 +01:00
Adrian Chiris	d95464645c	Add Merge() API to TopologyManager Policy abstraction This abstraction moves the responsibility of merging topology hints to the individual policies themselves. As part of this, it removes the CanAdmitPodResult() API from the policy abstraction, and rolls it into a second return value from Merge()	2019-11-04 18:43:07 +01:00
Adrian Chiris	e72847676f	Pass a list of NUMA nodes to the various TopologyManager policies This is in preparation for a larger refactoring effort that will add a 'Merge()' API to the TopologyManager policy API.	2019-11-04 18:43:07 +01:00
Louise Daly	a353247d44	Fixed bug in TopologyManager with SingleNUMANode Policy This patch fixes an issue where best-effort pods were not admitted to the node if the single-numa-node policy was set. This was because the Admit policy in single-numa-node policy does not admit any pod where the hint is anything but single NUMA node. The 'best hint' in this case is {<set bits for num. Numa Nodes on machine>, true} So on a machine with 2 NUMA nodes the best hint for a best-effort pod is {11,true} as best-effort pods have no Topology preferences. The single-numa-node policy fails any pod with a not preferred hint OR a hint where > 1 bits are set, so the above example results in terminated pods with a Topology Affinity Error. This is a short term fix for the single-numa-node policy, as there will be code refactoring for the 1.17 release.	2019-10-11 07:00:37 +01:00
Kubernetes Prow Robot	017842d49d	Merge pull request #83492 from ConnorDoyle/topo-align-all-qos Topology manager aligns pods of all QoS classes.	2019-10-11 03:03:40 -07:00
Connor Doyle	a9203ebdcf	Topology manager aligns pods of all QoS classes.	2019-10-10 12:16:21 -07:00
Kevin Klues	5501f542cd	Fixed bug in TopologyManager with SingleNUMANode Policy This patch fixes an issue in the TopologyManager that wouldn't allow pods to be admitted if pods were launched with the SingleNUMANode policy and any of the hint providers had no NUMA preferences. This is due to 2 factors: 1) Any hint provider that passes back a `nil` as its hints, has its hint automatically transformed into a single {11 true} hint before merging 2) We added a special casing for the SingleNumaNodePolicy() in the TopologyManager that essentially turns these hints into a {11 false} anytime a {11 true} is seen. The current patch reworks this logic so that the TopologyManager can tell the difference between a "don't care" hint and a true "{11 true}" hint returned by the hint provider. Only true "{11 true}" hints will be converted by the special casing for the SingleNumaNodePolicy(), while "don't care" hints will not. This is a short term fix for this issue until we do a larger refactoring of this code for the 1.17 release.	2019-10-09 17:41:08 -07:00
Connor Doyle	e35301c19f	Rename package socketmask to bitmask. - As discussed in reviews and other public channels, this abstraction is used to represent numa nodes, not sockets. - There is nothing inherently related to sockets in this package anyway.	2019-09-23 17:08:45 -07:00
Louise Daly	8ad1b5ba3b	Single-numa-node Topology Manager bug fix Added one off fix for single-numa-node policy to correctly reject pod admission on a resource allocation that spans NUMA nodes Co-authored-by: Kevin Klues <kklues@nvidia.com>	2019-08-30 07:17:56 +01:00
Kevin Klues	5660cd3cfb	Add NUMA Node awareness to the TopologyManager	2019-08-28 11:04:52 -05:00
Kubernetes Prow Robot	35867b160a	Merge pull request #81951 from klueska/upstream-update-cpu-amanger-numa-mapping Update the CPUManager to include NUMANodeID in its topology information	2019-08-28 08:55:40 -07:00
Kevin Klues	f4dbd29cdb	Rename TopologyHint.SocketAffinity to TopologyHint.NUMANodeAffinity As part of this, update the logic to use the NUMA information instead of the Socket information when generating and consuming TopologyHints in the CPUManager.	2019-08-27 16:51:05 -05:00
Louise Daly	2fb94231d0	Renaming strict policy to restricted policy Restricted policy will fail admission of guaranteed pods where all requested resources are not available on a single NUMA Node	2019-08-22 07:57:55 +01:00
Kevin Klues	4fdd52b058	Update GetTopologyHints() API to return a map At present, there is no way for a hint provider to return distinct hints for different resource types via a call to GetTopologyHints(). This means that hint providers that govern multiple resource types (e.g. the devicemanager) must do some sort of "pre-merge" on the hints it generates for each resource type before passing them back to the TopologyManager. This patch changes the GetTopologyHints() interface to allow a hint provider to pass back raw hints for each resource type, and allow the TopologyManager to merge them using a single unified strategy. This change also allows the TopologyManager to recognize which resource type a set of hints originated from, should this information become useful in the future.	2019-08-16 08:06:12 +02:00
Kevin Klues	7eccc71c9e	Rename 'preferred' TopologyManager policy to 'best-effort'	2019-07-25 10:44:36 +02:00
Kevin Klues	4ee5d5409e	Update the topologymanager to error out if an invalid policy is given Previously, the topologymanager would simply fall back to the None() policy if an invalid policy was specified. This patch updates this to return an error when an invalid policy is passed, forcing the kubelet to fail fast when this occurs. These semantics should be preferable because an invalid policy likely indicates operator error in setting the policy flag on the kubelet correctly (e.g. misspelling 'strict' as 'striict'). In this case it is better to fail fast so the operator can detect this and correct the mistake, than to mask the error and essentially disable the topologymanager unexpectedly.	2019-07-18 13:24:09 +02:00
Louise Daly	9d7e31e66e	Topology Manager Implementation based on Interfaces Co-authored-by: Kevin Klues <kklues@nvidia.com> Co-authored-by: Conor Nolan <conor.nolan@intel.com> Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>	2019-07-17 02:30:21 +01:00

27 Commits