These changes make it so that a set of common test cases can be used for
all merge strategies, with specific test cases being able to be
specified on a policy-by-policy basis.
This is in preparation for removing the special-case of the
SingleNumaNode policy in mergeProvidersHints() in favor of a custom
merging strategy with much less overhead.
This abstraction moves the responsibility of merging topology hints to
the individual policies themselves. As part of this, it removes the
CanAdmitPodResult() API from the policy abstraction, and rolls it into a
second return value from Merge()
These policies only differ on whether they admit the pod or not when a
TopologyHint is preferred or not. As such, the restricted policy should
simply inherit whatever it can from the best effort policy and only
overwrite what is necessary.
This does not matter for now, but will become important when we add a
new 'Merge()' abstraction to a Policy later on.
This patch fixes an issue where best-effort pods were not admitted
to the node if the single-numa-node policy was set.
This was because the Admit policy in single-numa-node policy does
not admit any pod where the hint is anything but single NUMA node. The 'best hint' in this case is {<set bits for num. Numa Nodes on machine>, true}
So on a machine with 2 NUMA nodes the best hint for a best-effort pod is {11,true} as best-effort pods have no Topology preferences.
The single-numa-node policy fails any pod with a not preferred hint OR a hint where > 1 bits are set, thus the above example resulting in termintaed pods with a Topology Affinity Error.
This is a short term fix for the single-numa-node policy, as there will be code refactoring for the 1.17 release.
This patch fixes an issue in the TopologyManager that wouldn't allow
pods to be admitted if pods were launched with the SingleNUMANode policy
and any of the hint providers had no NUMA preferences.
This is due to 2 factors:
1) Any hint provider that passes back a `nil` as its hints, has its hint
automatically transformed into a single {11 true} hint before merging
2) We added a special casing for the SingleNumaNodePolicy() in the
TopologyManager that essentially turns these hints into a
{11 false} anytime a {11 true} is seen.
The current patch reworks this logic so the that TopologyManager can
tell the difference between a "don't care" hint and a true "{11 true}"
hint returned by the hint provider. Only true "{11 true}" hints will be
converted by the special casing for the SingleNumaNodePolicy(), while
"don't care" hints will not.
This is a short term fix for this issue until we do a larger refactoring
of this code for the 1.17 release.
- As discussed in reviews and other public channels,
this abstraction is used to represent numa nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
Added one off fix for single-numa-node policy to correctly
reject pod admission on a resource allocation that spans
NUMA nodes
Co-authored-by: Kevin Klues <kklues@nvidia.com>
Previously it only took a bool, which limited the logic it could perform
to determine if a pod should be admitted or not based on the merged hint
from the policy.