Commit Graph

61 Commits

Author SHA1 Message Date
vinay kulkarni
01b96e7704 Rename ContainerStatus.ResourcesAllocated to ContainerStatus.AllocatedResources 2023-03-10 14:49:26 +00:00
Kubernetes Prow Robot
efe20f6c9b
Merge pull request #114114 from ffromani/full-pcpus-stricter-precheck-issue113537
node: cpumgr: stricter pre-check for  the policy option full-pcpus-only
2023-03-02 09:04:56 -08:00
Francesco Romani
0e9b92090c node: cpumgr: stricter precheck for full-pcpus-only
In order to implement the `full-pcpus-only` cpumanager policy option,
we leverage the implementation of the algorithm which picks CPUs.
By design, CPUs are taken from the biggest chunk available (socket
or NUMA zone) to physical cores, down to single cores.

Leveraging this, if the requested CPU count is a multiple of the SMT
level (commonly 2), we're guaranteed that only full physical cores
will be taken.

The hidden assumption here is this holds true by construction iff
the user reserved CPUs (if any) considering full physical CPUs.
IOW, if the user did intentionally or mistakely reserve single threads
which are no core siblings[1], then the simple check we implemented
is not sufficient.

A easy example can probably outline this better. With this setup:

cores: [(0, 4), (1, 5), (2, 6), (3, 8)] (in parens: thread siblings).
SMT level: 2 (each tuple is 2 elements)
Reserved CPUs: 0,1 (explicit pick using `--reserved-cpus`)

A container then requests 6 cpus. full-pcpus-only check: 6 % 2 == 0. Passed.
The CPU allocator will take first full cores, (2,6) and (3,8), and will
then pick the remaining single CPUs. The allocation will succeed, but
it's incorrect.

We can fix this case with a stricter precheck.
We need to additionally consider all the core siblings of the reserved
CPUs as unavailable when computing the free cpus, before to start the
actual allocation. Doing so, we fall back in the intended behavior, and
by construction all possible CPUs allocation whose number is multiple
of the SMT level are now correct again.

+++

[1] or thread siblings in the linux parlance, in any case:
hyperthread siblings of the same physical core

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-02 16:00:58 +01:00
Chen Wang
7db339dba2 This commit contains the following:
1. Scheduler bug-fix + scheduler-focussed E2E tests
2. Add cgroup v2 support for in-place pod resize
3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes.

Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>
2023-02-24 18:21:21 +00:00
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
Ian K. Coolidge
f3829c4be3 cpuset: Rename 'NewCPUSet' to 'New' 2023-01-06 23:32:51 +00:00
Ian K. Coolidge
e5143d16c2 cpuset: Make 'ToSlice*' methods look like 'set' methods
In 'set', conversions to slice are done also, but with different names:

ToSliceNoSort() -> UnsortedList()
ToSlice() -> List()

Reimplement List() in terms of UnsortedList to save some duplication.
2023-01-06 23:32:51 +00:00
Ian K. Coolidge
824bd57ad6 cpuset: Convert Union arguments to variadic
This allows Union to implement UnionAll easily.
2023-01-06 23:32:50 +00:00
Francesco Romani
5e12338a22 node: cpumgr: address golint complains
Add docstrings and trivial fixes.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-11-02 18:41:42 +01:00
Kubernetes Prow Robot
d0e86111ef
Merge pull request #112855 from fromanirh/cpumanager-metrics
node: metrics: cpumanager: add metrics about pinning
2022-10-31 03:12:56 -07:00
Francesco Romani
47d3299781 node: metrics: cpumanager: add pinning metrics
In order to improve the observability of the cpumanager,
add and populate metrics to track if the combination of
the kubelet configuration and podspec would trigger
exclusive core allocation and pinning.

We should avoid leaking any node/machine specific information
(e.g. core ids, even though this is admittedly an extreme example);
tracking these metrics seems to be a good first step, because
it allows us to get feedback without exposing details.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-10-27 14:40:40 +02:00
Garrybest
d446f5f90e fix GetAllocatableCPUs in cpumanager
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-10-27 19:57:06 +08:00
Arpit Singh
d92fd8392d Adding unit test for align-by-socket policy option
Also addressed MR comments as part of same commit.
2022-08-02 11:02:07 -07:00
Arpit Singh
06f347f645 Adding validity checks for topology manager align-by-socket 2022-08-02 11:02:07 -07:00
Arpit Singh
35849bf7fb KEP-3327: Add CPUManager policy option to align CPUs by Socket instead of by NUMA node 2022-08-02 11:02:07 -07:00
Davanum Srinivas
a9593d634c
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Kubernetes Prow Robot
03ee86c09c
Merge pull request #104837 from eggiter/fix-release-reused-cpus
fix(cpumanager): Do not release CPUs of init containers while they are being reused in app containers
2022-01-06 11:46:38 -08:00
Kevin Klues
70e0f47191 Support full-pcpus-only with the new NUMA distribution policy option
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-10-16 19:31:02 +00:00
Kevin Klues
462544d079 Split CPUManager takeByTopology() into two different algorithms
The first implements the original algorithm which packs CPUs onto NUMA nodes if
more than one NUMA node is required to satisfy the allocation. The second
disitributes CPUs across NUMA nodes if they can't all fit into one.

The "distributing" algorithm is currently a noop and just returns an error of
"unimplemented". A subsequent commit will add the logic to implement this
algorithm according to KEP 2902:

https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2902-cpumanager-distribute-cpus-policy-option

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2021-10-16 14:46:19 +00:00
eggiter
20d3bc32ac fix(cpumanager): Do not release cpus of init containers while they are reused in app containers 2021-09-10 10:01:35 +08:00
Francesco Romani
23abdab2b7 smtalign: propagate policy options to policies
Consume in the static policy the cpu manager policy options from
the cpumanager instance.
Validate in the none policy if any option is given, and fail if so -
this is almost surely a configuration mistake.

Add new cpumanager.Options type to hold the options and translate from
user arguments to flags.

Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-07-08 23:15:37 +02:00
pacoxu
8d24c8d0ab update structured log for cpumanager/cpu_manager.go 2021-03-16 09:40:53 +08:00
pacoxu
9e024e839b update structured log for policy_static.go 2021-03-12 16:26:20 +08:00
Francesco Romani
6d33354e4c node: podresources: implement GetAllocatableResources API
Extend the podresources API implementing the GetAllocatableResources endpoint,
as specified in the KEPs:

https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments
https://github.com/kubernetes/enhancements/pull/2404

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:13:36 +01:00
sw.han
27b7bcb41c Implement the cpumanager.GetPodTopologyHints() function
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
Krzysztof Wiatrzyk
6db58b2e92 Update logging to use a format util
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:55 +01:00
sw.han
f5997fe537 Add GetPodTopologyHints() interface to Topology/CPU/Device Manager
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
2020-11-12 12:25:54 +01:00
Alexey Perevalov
e33ba9e974 Avoid using socket for hints
Sockets don't affect performance as NUMA node does, since NUMA
node has dedicated memory controller, but socket it's physical
extension point.
Socket it's only cpu specific thing and it's strange to merge bitmask of
deviceplugin's and cpu manager, when cpu manager takes into account
socket.

Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-07-22 05:14:34 -04:00
Kevin Klues
00df26a985 Fix a bug whereby reusable CPUs and devices were not being honored
Previously, it was possible for reusable CPUs and reusable devices (i.e.
those previously consumed by init containers) to not be reused by
subsequent init containers or app containers if the TopologyManager was
enabled. This would happen because hint generation for the
TopologyManager was not considering the reusable devices when it made
its hint calculation.

As such, it would sometimes:
1) Generate a hint for a differnent NUMA node, causing the CPUs and
devices to be allocated from that node instead of the one where the
reusable devices live; or
2) End up thinking there were not enough CPUs or devices to allocate and
throw a TopologyAffinity admission error

This patch fixes this by ensuring that reusable CPUs and devices are
considered as part of TopologyHint generation. This frunctionality is
difficult to unit test since it spans multiple components, but an e2e
test will be added in a subsequent patch to test this functionality.
2020-07-20 11:41:13 +00:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Kevin Klues
751b9f3e13 Update strategy used to reuse CPUs from init containers in CPUManager
With the old strategy, it was possible for an init container to end up
running without some of its CPUs being exclusive if it requested more
guaranteed CPUs than the sum of all guaranteed CPUs requested by app
containers. Unfortunately, this case was not caught by our unit tests
because they didn't validate the state of the defaultCPUSet to ensure
there was no overlap with CPUs assigned to containers. This patch
updates the strategy to reuse the CPUs assigned to init containers
across into app containers, while avoiding this edge case. It also
updates the unit tests to now catch this type of error in the future.
2020-04-23 20:27:43 +00:00
nolancon
467f66580b CPU Manager - Add check to policy.Allocate() for init conatiners
If container allocated CPUs is an init container, release those CPUs
back into the shared pool for re-allocation to next container.
2020-02-27 07:24:33 +00:00
nolancon
709989efa2 CPU Manager - Rename policy.AddContainer() to policy.Allocate() 2020-02-27 07:24:33 +00:00
Kevin Klues
bc686ea27b Update TopologyManager.GetTopologyHints() to take pointers
Previously, this function was taking full Pod and Container objects
unnecessarily. This commit updates this so that they will take pointers
instead.
2020-02-03 17:13:28 +00:00
whypro
f4bd4e2e96 Return error instead of panic when cpu manager starts failed. 2019-12-19 21:56:23 +08:00
Kevin Klues
185e790f71 Update CPUManager policies to adhere to new state semantics 2019-12-11 23:02:51 +01:00
Kevin Klues
765aae93f8 Move containerMap out of static policy and into top-level CPUManager 2019-12-11 23:02:51 +01:00
Kevin Klues
1d995c98ef Update CPUmanager containerMap to allow removal by containerRef 2019-12-11 23:02:47 +01:00
Kevin Klues
0639bd0942 Change CPUManager containerMap to key off of (podUID, containerName)
Previously it keyed off of a pointer to the actual pod / container,
which was unnecessary, and hard to work with (especially on the
retrieval side).
2019-12-11 23:02:11 +01:00
Kevin Klues
3881e50cce Update CPUmanager containerMap to also return a containerRef 2019-12-11 23:01:01 +01:00
Kevin Klues
347d5f57ac Move CPUManager ContainerMap to its own package 2019-12-11 22:59:00 +01:00
Kubernetes Prow Robot
73b2c82b28
Merge pull request #83592 from jianzzha/opt-reserved-cpus
added --reserved-cpus kubelet command option
2019-11-06 22:14:42 -08:00
Jianzhu Zhang
89dfd24483 added --reserved-cpus kubelet command option 2019-11-06 07:33:52 -05:00
Kevin Klues
9dc116eb08 Ensure CPUManager TopologyHints are regenerated after kubelet restart
This patch also includes test to make sure the newly added logic works
as expected.
2019-11-05 15:48:51 +00:00
Connor Doyle
389853894d Delegate topology hint gen to CPU manager policy
- The previous implementation depended on a fixed set of policies.
2019-09-27 22:29:02 -07:00
Connor Doyle
e35301c19f Rename package socketmask to bitmask.
- As discussed in reviews and other public channels,
  this abstraction is used to represent numa nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
2019-09-23 17:08:45 -07:00
Kevin Klues
e0e8b3e4fd Update CPUManager topology helpers to accept multiple ids 2019-08-29 13:22:54 -05:00
Kevin Klues
f4dbd29cdb Rename TopologyHint.SocketAffinity to TopologyHint.NUMANodeAffinity
As part of this, update the logic to use the NUMA information instead of
the Socket information when generating and consuming TopologyHints in
the CPUManager.
2019-08-27 16:51:05 -05:00
Kevin Klues
8278d1134c Consume TopologyHints in the CPUManager
Co-Authored-By: Conor Nolan <conor.nolan@intel.com>
2019-08-14 06:22:56 +02:00
Kevin Klues
156b3f6af8 Generate TopologyHints from the CPUManager 2019-08-14 06:22:56 +02:00