Commit Graph

9 Commits

Author SHA1 Message Date
Louise Daly
8ad1b5ba3b Single-numa-node Topology Manager bug fix
Added one off fix for single-numa-node policy to correctly
reject pod admission on a resource allocation that spans
NUMA nodes

Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:56 +01:00
Kevin Klues
5660cd3cfb Add NUMA Node awareness to the TopologyManager 2019-08-28 11:04:52 -05:00
Kubernetes Prow Robot
35867b160a
Merge pull request #81951 from klueska/upstream-update-cpu-amanger-numa-mapping
Update the CPUManager to include NUMANodeID in its topology information
2019-08-28 08:55:40 -07:00
Kevin Klues
f4dbd29cdb Rename TopologyHint.SocketAffinity to TopologyHint.NUMANodeAffinity
As part of this, update the logic to use the NUMA information instead of
the Socket information when generating and consuming TopologyHints in
the CPUManager.
2019-08-27 16:51:05 -05:00
Louise Daly
2fb94231d0 Renaming strict policy to restricted policy
Restricted policy will fail admission of guaranteed pods where
all requested resources are not available on a single NUMA Node
2019-08-22 07:57:55 +01:00
Kevin Klues
4fdd52b058 Update GetTopologyHints() API to return a map
At present, there is no way for a hint provider to return distinct hints
for different resource types via a call to GetTopologyHints(). This
means that hint providers that govern multiple resource types (e.g. the
devicemanager) must do some sort of "pre-merge" on the hints it
generates for each resource type before passing them back to the
TopologyManager.

This patch changes the GetTopologyHints() interface to allow a hint
provider to pass back raw hints for each resource type, and allow the
TopologyManager to merge them using a single unified strategy.

This change also allows the TopologyManager to recognize which
resource type a set of hints originated from, should this information
become useful in the future.
2019-08-16 08:06:12 +02:00
Kevin Klues
7eccc71c9e Rename 'preferred' TopologyManager policy to 'best-effort' 2019-07-25 10:44:36 +02:00
Kevin Klues
4ee5d5409e Update the topologymanager to error out if an invalid policy is given
Previously, the topologymanager would simply fall back to the None() policy
if an invalid policy was specified. This patch updates this to return an
error when an invalid policy is passed, forcing the kubelet to fail
fast when this occurs.

These semantics should be preferable because an invalid policy likely
indicates operator error in setting the policy flag on the kubelet
correctly (e.g. misspelling 'strict' as 'striict'). In this case it is
better to fail fast so the operator can detect this and correct the
mistake, than to mask the error and essentially disable the
topologymanager unexpectedly.
2019-07-18 13:24:09 +02:00
Louise Daly
9d7e31e66e Topology Manager Implementation based on Interfaces
Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
2019-07-17 02:30:21 +01:00