124 Commits

Author SHA1 Message Date
Adrian Chiris
dc36924c37 Update policy_none removing canAdmitPodResult
Update unit tests for none_policy
Add Name test for policy_restricted
2020-01-16 08:13:05 +00:00
Adrian Chiris
cf8b098dda Refactor policy-best-effort
- Modularize code with mergePermutation method
2020-01-16 08:13:05 +00:00
Kevin Klues
b5f52e6072 Modularize TopologyManager policy Merge() tests
These changes make it so that a set of common test cases can be used for
all merge strategies, with specific test cases being able to be
specified on a policy-by-policy basis.
2019-11-04 18:43:07 +01:00
Kevin Klues
7ea1fc9be4 Move TopologyManager TestPolicyMerge() to shared test file 2019-11-04 18:43:07 +01:00
Kevin Klues
d7d7bfcda0 Abstract TopologyManager Policy Merge() tests into their own function 2019-11-04 18:43:07 +01:00
Adrian Chiris
dee22d1fbc Fix comments in TopologyManager 2019-11-04 18:43:07 +01:00
Adrian Chiris
5f7db54d3c Move function from top-level TopologyManager to best-effort policy
This is in preparation for removing the special-case of the
SingleNumaNode policy in mergeProvidersHints() in favor of a custom
merging strategy with much less overhead.
2019-11-04 18:43:07 +01:00
Adrian Chiris
d95464645c Add Merge() API to TopologyManager Policy abstraction
This abstraction moves the responsibility of merging topology hints to
the individual policies themselves. As part of this, it removes the
CanAdmitPodResult() API from the policy abstraction, and rolls it into a
second return value from Merge()
2019-11-04 18:43:07 +01:00
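The shape of the change described in d95464645c can be sketched roughly as follows. This is a minimal illustration, not the actual kubelet code: the `topologyHint` type, the lower-case `policy` interface, and the use of a plain `bool` as the second return value are all assumptions made for the example.

```go
package main

import "fmt"

// topologyHint is a simplified, hypothetical stand-in for the real TopologyHint type.
type topologyHint struct {
	affinity  uint64 // bitmask with one bit per NUMA node
	preferred bool
}

// policy is a hypothetical, reduced version of the policy abstraction.
// Merge returns the merged hint and, as a second value, whether the pod
// may be admitted (the role CanAdmitPodResult() played before this commit).
type policy interface {
	Name() string
	Merge(providersHints []map[string][]topologyHint) (topologyHint, bool)
}

// nonePolicy ignores every hint and always admits the pod.
type nonePolicy struct{}

func (nonePolicy) Name() string { return "none" }

func (nonePolicy) Merge(_ []map[string][]topologyHint) (topologyHint, bool) {
	return topologyHint{}, true
}

func main() {
	var p policy = nonePolicy{}
	hint, admit := p.Merge(nil)
	fmt.Println(p.Name(), hint.affinity, hint.preferred, admit)
}
```

Returning the admit decision from the same call keeps the merge result and the admission verdict together, so a caller cannot pair a verdict with the wrong hint.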
Adrian Chiris
78d7856288 Globalize a few TopologyManager functions
This is in preparation for a larger refactoring effort that will add a
'Merge()' API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
e72847676f Pass a list of NUMA nodes to the various TopologyManager policies
This is in preparation for a larger refactoring effort that will add a
'Merge()' API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
6fd8a6eb69 Make restricted TopologyManager policy inherit from best-effort policy
These policies only differ on whether they admit the pod or not when a
TopologyHint is preferred or not. As such, the restricted policy should
simply inherit whatever it can from the best effort policy and only
overwrite what is necessary.

This does not matter for now, but will become important when we add a
new 'Merge()' abstraction to a Policy later on.
2019-11-04 18:43:07 +01:00
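In Go, the "inheritance" described in 6fd8a6eb69 is usually expressed with struct embedding. The sketch below assumes hypothetical type and method names (`bestEffortPolicy`, `restrictedPolicy`, `narrowest`, `canAdmit`); it only shows the pattern of reusing shared behaviour and overriding the admission decision, not the real policy code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bestEffortPolicy is a hypothetical policy that admits every pod.
type bestEffortPolicy struct{}

func (bestEffortPolicy) Name() string { return "best-effort" }

// narrowest returns the mask with the fewest bits set; ties keep the earlier mask.
// It stands in for whatever logic the two policies share.
func (bestEffortPolicy) narrowest(masks []uint64) uint64 {
	if len(masks) == 0 {
		return 0
	}
	best := masks[0]
	for _, m := range masks[1:] {
		if bits.OnesCount64(m) < bits.OnesCount64(best) {
			best = m
		}
	}
	return best
}

// canAdmit admits the pod whether or not the hint is preferred.
func (bestEffortPolicy) canAdmit(preferred bool) bool { return true }

// restrictedPolicy embeds bestEffortPolicy and overrides only what differs.
type restrictedPolicy struct{ bestEffortPolicy }

func (restrictedPolicy) Name() string { return "restricted" }

// canAdmit admits the pod only when the hint is preferred.
func (restrictedPolicy) canAdmit(preferred bool) bool { return preferred }

func main() {
	r := restrictedPolicy{}
	// narrowest is promoted from the embedded best-effort policy.
	fmt.Println(r.Name(), r.narrowest([]uint64{0b11, 0b10}), r.canAdmit(false))
}
```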
Adrian Chiris
3391daeb00 Break TopologyManager.calculateAffinity() into more modular functions
This modularization is in preparation for a larger refactoring effort
that will add a 'Merge()' API to the TopologyManager policy API.
2019-11-04 18:43:07 +01:00
Adrian Chiris
b17706b149 Added LessThan() and IsEqual() methods for TopologyHints 2019-11-04 18:43:07 +01:00
Kubernetes Prow Robot
002dbf6a4c
Merge pull request #83777 from lmdaly/fix-single-numa-node-with-best-effort-pods
Fixed bug in TopologyManager with SingleNUMANode Policy
2019-11-01 04:53:23 -07:00
nolancon
b0a85177d2 Clean-up and additional test cases for socket-mask unit test. 2019-10-18 04:16:06 +01:00
Kubernetes Prow Robot
017842d49d
Merge pull request #83492 from ConnorDoyle/topo-align-all-qos
Topology manager aligns pods of all QoS classes.
2019-10-11 03:03:40 -07:00
Louise Daly
a353247d44 Fixed bug in TopologyManager with SingleNUMANode Policy
This patch fixes an issue where best-effort pods were not admitted
to the node if the single-numa-node policy was set.

This was because the Admit policy in the single-numa-node policy does
not admit any pod where the hint is anything but a single NUMA node. The 'best hint' in this case is {<set bits for num. NUMA nodes on machine>, true}.
So on a machine with 2 NUMA nodes, the best hint for a best-effort pod is {11,true}, as best-effort pods have no topology preferences.

The single-numa-node policy fails any pod with a not-preferred hint OR a hint where more than 1 bit is set, so the example above results in terminated pods with a Topology Affinity Error.

This is a short term fix for the single-numa-node policy, as there will be code refactoring for the 1.17 release.
2019-10-11 07:00:37 +01:00
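The admission rule described in a353247d44 can be illustrated with a short sketch. The `hint` type and the `canAdmitSingleNUMA` function are hypothetical names chosen for the example; the rule itself (reject a hint that is not preferred or that has more than one bit set) is taken from the commit message.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hint is a simplified, hypothetical stand-in for a TopologyHint.
type hint struct {
	affinity  uint64 // one bit per NUMA node
	preferred bool
}

// canAdmitSingleNUMA applies the rule from the commit message: a pod fails
// if its hint is not preferred or if more than one NUMA bit is set.
func canAdmitSingleNUMA(h hint) bool {
	return h.preferred && bits.OnesCount64(h.affinity) <= 1
}

func main() {
	// On a 2-node machine a best-effort pod gets the "best hint" {11, true}.
	bestEffort := hint{affinity: 0b11, preferred: true}
	// A pod whose resources fit on NUMA node 0 gets {01, true}.
	pinned := hint{affinity: 0b01, preferred: true}

	fmt.Println(canAdmitSingleNUMA(bestEffort)) // false: two bits are set
	fmt.Println(canAdmitSingleNUMA(pinned))     // true
}
```

The first case is the bug the patch addresses: a pod with no topology preferences is rejected only because its default hint spans both nodes.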
Connor Doyle
a598369e3c Gofmt. 2019-10-10 12:16:21 -07:00
Connor Doyle
a9203ebdcf Topology manager aligns pods of all QoS classes. 2019-10-10 12:16:21 -07:00
Kevin Klues
5501f542cd Fixed bug in TopologyManager with SingleNUMANode Policy
This patch fixes an issue in the TopologyManager that wouldn't allow
pods to be admitted if pods were launched with the SingleNUMANode policy
and any of the hint providers had no NUMA preferences.

This is due to 2 factors:

1) Any hint provider that passes back a `nil` as its hints, has its hint
automatically transformed into a single {11 true} hint before merging

2) We added a special casing for the SingleNumaNodePolicy() in the
TopologyManager that essentially turns these hints into a
{11 false} anytime a {11 true} is seen.

The current patch reworks this logic so that the TopologyManager can
tell the difference between a "don't care" hint and a true "{11 true}"
hint returned by the hint provider. Only true "{11 true}" hints will be
converted by the special casing for the SingleNumaNodePolicy(), while
"don't care" hints will not.

This is a short term fix for this issue until we do a larger refactoring
of this code for the 1.17 release.
2019-10-09 17:41:08 -07:00
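One way to picture the distinction drawn in 5501f542cd is to tag hints that were synthesized from a `nil` provider response. The sketch below is an assumption about how that could look, not the actual patch: the `dontCare` field and both function names are hypothetical.

```go
package main

import "fmt"

// hint is a simplified, hypothetical stand-in for a TopologyHint.
type hint struct {
	affinity  uint64
	preferred bool
	dontCare  bool // hypothetical marker: set when the provider returned nil
}

// fromProvider converts a provider's answer into hints. A nil answer becomes
// an all-nodes preferred hint that is tagged as "don't care".
func fromProvider(hints []hint, allNodes uint64) []hint {
	if hints == nil {
		return []hint{{affinity: allNodes, preferred: true, dontCare: true}}
	}
	return hints
}

// singleNUMASpecialCase demotes a genuine all-nodes preferred hint to
// not preferred, but leaves "don't care" hints untouched.
func singleNUMASpecialCase(h hint, allNodes uint64) hint {
	if !h.dontCare && h.preferred && h.affinity == allNodes {
		h.preferred = false
	}
	return h
}

func main() {
	const allNodes = 0b11 // machine with 2 NUMA nodes

	noPreference := fromProvider(nil, allNodes)[0]
	genuine := fromProvider([]hint{{affinity: allNodes, preferred: true}}, allNodes)[0]

	fmt.Println(singleNUMASpecialCase(noPreference, allNodes).preferred) // true
	fmt.Println(singleNUMASpecialCase(genuine, allNodes).preferred)      // false
}
```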
Connor Doyle
389853894d Delegate topology hint gen to CPU manager policy
- The previous implementation depended on a fixed set of policies.
2019-09-27 22:29:02 -07:00
Connor Doyle
e35301c19f Rename package socketmask to bitmask.
- As discussed in reviews and other public channels,
  this abstraction is used to represent numa nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
2019-09-23 17:08:45 -07:00
Kubernetes Prow Robot
07cc813956
Merge pull request #81793 from lmdaly/topology-manager-owners
Added OWNERS file for Topology Manager
2019-09-11 18:26:52 -07:00
Louise Daly
fbccf25e29 Added OWNERS file for Topology Manager 2019-09-11 06:40:24 +01:00
Louise Daly
8ad1b5ba3b Single-numa-node Topology Manager bug fix
Added one off fix for single-numa-node policy to correctly
reject pod admission on a resource allocation that spans
NUMA nodes

Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:56 +01:00
Louise Daly
f6c085f60e Added Single NUMA Node Policy which ensures resources are
aligned on a single NUMA node

Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-30 07:17:17 +01:00
Kevin Klues
5ed80dadcf Update CanAdmitPodResult() in TopologyManager to take a TopologyHint
Previously it only took a bool, which limited the logic it could perform
to determine if a pod should be admitted or not based on the merged hint
from the policy.
2019-08-30 07:17:17 +01:00
Kevin Klues
df1b54fc09 Fail fast with TopologyManager on machines with more than 8 NUMA Nodes 2019-08-28 11:04:52 -05:00
Kevin Klues
5660cd3cfb Add NUMA Node awareness to the TopologyManager 2019-08-28 11:04:52 -05:00
Kubernetes Prow Robot
35867b160a
Merge pull request #81951 from klueska/upstream-update-cpu-amanger-numa-mapping
Update the CPUManager to include NUMANodeID in its topology information
2019-08-28 08:55:40 -07:00
Kubernetes Prow Robot
de1cfa9bc1
Merge pull request #81787 from lmdaly/topology-manager-rename-strict-policy
Renaming strict policy to restricted policy
2019-08-28 01:38:04 -07:00
Kevin Klues
f4dbd29cdb Rename TopologyHint.SocketAffinity to TopologyHint.NUMANodeAffinity
As part of this, update the logic to use the NUMA information instead of
the Socket information when generating and consuming TopologyHints in
the CPUManager.
2019-08-27 16:51:05 -05:00
Louise Daly
2fb94231d0 Renaming strict policy to restricted policy
Restricted policy will fail admission of guaranteed pods where
all requested resources are not available on a single NUMA Node
2019-08-22 07:57:55 +01:00
Tim Allclair
a2c51674cf Cleanup more static check issues (S1*,ST*) 2019-08-21 10:40:21 -07:00
Kevin Klues
4fdd52b058 Update GetTopologyHints() API to return a map
At present, there is no way for a hint provider to return distinct hints
for different resource types via a call to GetTopologyHints(). This
means that hint providers that govern multiple resource types (e.g. the
devicemanager) must do some sort of "pre-merge" on the hints it
generates for each resource type before passing them back to the
TopologyManager.

This patch changes the GetTopologyHints() interface to allow a hint
provider to pass back raw hints for each resource type, and allow the
TopologyManager to merge them using a single unified strategy.

This change also allows the TopologyManager to recognize which
resource type a set of hints originated from, should this information
become useful in the future.
2019-08-16 08:06:12 +02:00
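The interface change in 4fdd52b058 amounts to keying hints by resource name. The sketch below is a simplified illustration: the `hintProvider` interface omits the arguments the real method takes, and `fakeDeviceProvider` and the `example.com/...` resource names are invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// hint is a simplified, hypothetical stand-in for a TopologyHint.
type hint struct {
	affinity  uint64
	preferred bool
}

// hintProvider is a reduced, hypothetical form of the provider interface:
// hints come back per resource name instead of as one pre-merged list.
type hintProvider interface {
	GetTopologyHints() map[string][]hint
}

// fakeDeviceProvider governs two resource types and returns raw hints for each.
type fakeDeviceProvider struct{}

func (fakeDeviceProvider) GetTopologyHints() map[string][]hint {
	return map[string][]hint{
		"example.com/gpu": {{affinity: 0b01, preferred: true}},
		"example.com/nic": {{affinity: 0b01, preferred: true}, {affinity: 0b10, preferred: true}},
	}
}

// resourceNames lists the resources a provider reported hints for, sorted
// so the output does not depend on map iteration order.
func resourceNames(p hintProvider) []string {
	hints := p.GetTopologyHints()
	names := make([]string, 0, len(hints))
	for name := range hints {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	p := fakeDeviceProvider{}
	hints := p.GetTopologyHints()
	for _, name := range resourceNames(p) {
		fmt.Printf("%s: %d hint(s)\n", name, len(hints[name]))
	}
}
```

Because the provider no longer pre-merges, the caller sees which resource each hint came from and can apply one merge strategy to all of them.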
Kubernetes Prow Robot
f2dd24820a
Merge pull request #73920 from nolancon/topology-manager-cpu-manager
Changes to make CPU Manager a Hint Provider for Topology Manager
2019-08-15 05:44:33 -07:00
Kevin Klues
9a6788cb13 Add IterateSocketMasks() function to socketmask abstraction 2019-08-14 06:22:56 +02:00
Kubernetes Prow Robot
ac2295a24d
Merge pull request #78587 from kad/socketmask-string
Use go standard library for common bit operations
2019-08-13 00:03:39 -07:00
Moshe Levi
3b83c5c7c6 TopologyManager: Fix rename best-effort policy files
PR https://github.com/kubernetes/kubernetes/pull/80301 renamed
the preferred policy to best-effort, but the file names are
still policy_preferred.go and policy_preferred_test.go. This
PR fixes that.
2019-07-28 19:35:16 +03:00
Kevin Klues
7eccc71c9e Rename 'preferred' TopologyManager policy to 'best-effort' 2019-07-25 10:44:36 +02:00
Kubernetes Prow Robot
5b496fe8f5
Merge pull request #80315 from klueska/upstream-cleanup-socketmask
Cleanup the TopologyManager socketmask abstraction
2019-07-23 11:40:28 -07:00
Kevin Klues
65b07312b0 Cleanup comments in TopologyManager socketmask abstraction 2019-07-18 18:52:19 -07:00
Kevin Klues
0edfd4be35 Add package level And/Or calls to TopologyManager socketmask abstraction 2019-07-18 09:06:51 -07:00
Kevin Klues
434fd34e0b Add NewEmptySocketMask() call to TopologyManager socketmask abstraction 2019-07-18 09:05:55 -07:00
Kevin Klues
4ee5d5409e Update the topologymanager to error out if an invalid policy is given
Previously, the topologymanager would simply fall back to the None() policy
if an invalid policy was specified. This patch updates this to return an
error when an invalid policy is passed, forcing the kubelet to fail
fast when this occurs.

These semantics should be preferable because an invalid policy likely
indicates operator error in setting the policy flag on the kubelet
correctly (e.g. misspelling 'strict' as 'striict'). In this case it is
better to fail fast so the operator can detect this and correct the
mistake, than to mask the error and essentially disable the
topologymanager unexpectedly.
2019-07-18 13:24:09 +02:00
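The fail-fast behaviour described in 4ee5d5409e can be sketched as a constructor that returns an error instead of silently falling back. The function name `newPolicy`, its string return value, and the error text are assumptions for illustration; the policy names are the ones in use at the time of this commit.

```go
package main

import "fmt"

// newPolicy is a hypothetical constructor. It returns an error for an unknown
// policy name rather than falling back to the none policy.
func newPolicy(name string) (string, error) {
	switch name {
	case "none", "preferred", "strict":
		return name, nil
	default:
		return "", fmt.Errorf("unknown policy: %q", name)
	}
}

func main() {
	for _, name := range []string{"strict", "striict"} {
		if _, err := newPolicy(name); err != nil {
			// A real caller would abort startup here so the operator notices.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", name)
	}
}
```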
Kubernetes Prow Robot
1125054612
Merge pull request #80235 from moshe010/remove_string
Remove unnecessary string() from policy_none
2019-07-17 19:34:49 -07:00
Louise Daly
9d7e31e66e Topology Manager Implementation based on Interfaces
Co-authored-by: Kevin Klues <kklues@nvidia.com>
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
2019-07-17 02:30:21 +01:00
Moshe Levi
d52985d3a0 Remove unnecessary string() from policy_none
Signed-off-by: Moshe Levi <moshele@mellanox.com>
2019-07-17 01:58:43 +03:00
Kubernetes Prow Robot
4197adaf2d
Merge pull request #79343 from nolancon/topology-manager-none
Add Policy None for Topology Manager
2019-07-16 13:22:47 -07:00
Kubernetes Prow Robot
d4d8daea73
Merge pull request #78558 from tedyu/policy-str
Remove unnecessary string()
2019-07-11 13:13:06 -07:00
nolancon
2d7ac702d6 Add Policy None for Topology Manager
Update naming of test functions.
2019-06-25 03:24:31 +01:00
Alexander Kanevskiy
89481f8c27 Use go standard library for common bit operations
PR#72913 introduced own versions of the bit operations that are
less efficient than ones from standard library.
2019-06-01 19:54:38 +03:00
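For reference, the kind of standard-library calls 89481f8c27 refers to live in Go's `math/bits` package. The sketch below shows two of them on a plain `uint64` mask; the helper names `count` and `setBits` are hypothetical and are not the socketmask API.

```go
package main

import (
	"fmt"
	"math/bits"
)

// count returns the number of set bits in the mask.
func count(mask uint64) int {
	return bits.OnesCount64(mask)
}

// setBits returns the positions of all set bits, lowest first.
func setBits(mask uint64) []int {
	var out []int
	for mask != 0 {
		out = append(out, bits.TrailingZeros64(mask))
		mask &= mask - 1 // clear the lowest set bit
	}
	return out
}

func main() {
	mask := uint64(0b1011)
	fmt.Println(count(mask), setBits(mask)) // 3 [0 1 3]
}
```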
Kubernetes Prow Robot
9ac58bae56
Merge pull request #78515 from klueska/upstream-socketmask-updates
Updates to the SocketMask abstraction for the TopologyManager
2019-06-01 09:50:16 -07:00
Kubernetes Prow Robot
46c74629cf
Merge pull request #78516 from klueska/upstream-topology-manager-interface-updates
Update the TopologyManager interfaces
2019-06-01 08:00:19 -07:00
Ted Yu
1a755d13a6 Remove unnecessary string() 2019-05-30 19:48:26 -07:00
Kevin Klues
0a43d21c26 Add IsNarrowerThan() function to socketmask abstraction 2019-05-30 06:00:22 -07:00
Kevin Klues
617a1fa394 Update the TopologyManager interfaces
These updates are based on discussions had about the preferred semantics
of the TopologyManager and will be reflected in changes to an upcoming
PR that adds the actual TopologyManager implementation.
2019-05-30 05:52:11 -07:00
Kevin Klues
cdb59d3c7a Fix incorrect names for tests in socketmask 2019-05-30 04:16:53 -07:00
nolancon
0244c0e658 remove dependency on implementation from policy preferred and strict
update build
2019-05-30 05:57:39 +01:00
nolancon
ef9baf313d Update unit tests for TopologyHints - Topology Manager Policies 2019-05-30 05:44:01 +01:00
nolancon
e82fa41fb2 More Intuitive TopologyHints - topology manager policies 2019-05-30 05:44:01 +01:00
Sreemanti Ghosh
4e503597b8 Unit test for Topology Manager policy_strict and policy_preferred 2019-05-30 05:44:01 +01:00
nolancon
eff568e496 Add Policies Strict and Preferred for Topology Manager 2019-05-30 05:44:01 +01:00
lmdaly
c1a4457573 Update Bazel files to include SocketMask 2019-05-29 02:21:51 +01:00
Conor Nolan
d99bac12e6 Update Remove/AddPod to Container (#26)
More intuitive TopologyHints
2019-05-29 02:11:15 +01:00
lmdaly
e64c558a11 Added BUILD files and updates to Boilerplates 2019-05-29 02:11:15 +01:00
lmdaly
71bbc6d538 Add Topology Manager Interfaces
*Topology Manager
*Policy
2019-05-29 02:10:46 +01:00
nolancon
b7f6b8f8f1 Updated unit test for socketmask 2019-05-28 05:00:04 +01:00
nolancon
283dff9335 Update SocketMask based on feedback
TODO: Unit tests to be updated
2019-05-27 07:19:03 +01:00
nolancon
e8566caa3f Update to unit test and comment bug fixed 2019-05-13 06:41:44 +01:00
nolancon
7c525ffaa8 More intuitive TopologyHints - socketmask.go 2019-05-08 04:22:39 +01:00
Sreemanti-Ghosh
ce56956409 Socket mask unit test (#4) 2019-03-05 08:00:04 +00:00
nolancon
a273333f1f Add BUILD files and Boilerplates
Updates based on comments
* Export comments added
* glog changed to klog
* Other small edits
2019-03-05 07:59:51 +00:00
nolancon
f10e76962f Add Socket Mask for Topology Manager 2019-03-01 07:20:47 +00:00