kubernetes/pkg/kubelet
Francesco Romani 0e9b92090c node: cpumgr: stricter precheck for full-pcpus-only
In order to implement the `full-pcpus-only` cpumanager policy option,
we leverage the implementation of the algorithm which picks CPUs.
By design, CPUs are taken from the biggest chunk available (socket
or NUMA zone) to physical cores, down to single CPUs (hardware threads).

Leveraging this, if the requested CPU count is a multiple of the SMT
level (commonly 2), we're guaranteed that only full physical cores
will be taken.

The hidden assumption here is that this holds true by construction iff
the user reserved CPUs (if any) in units of full physical cores.
IOW, if the user intentionally or mistakenly reserved single threads
which are not core siblings[1], then the simple check we implemented
is not sufficient.

An easy example can probably outline this better. With this setup:

cores: [(0, 4), (1, 5), (2, 6), (3, 8)] (in parens: thread siblings).
SMT level: 2 (each tuple is 2 elements)
Reserved CPUs: 0,1 (explicit pick using `--reserved-cpus`)

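A container then requests 6 CPUs. full-pcpus-only check: 6 % 2 == 0. Passed.
The CPU allocator will first take the full cores, (2,6) and (3,8), and will
then pick the remaining single CPUs, 4 and 5. The allocation will succeed,
but it's incorrect: 4 and 5 are the siblings of the reserved CPUs 0 and 1,
so the container does not get full physical cores only.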

We can fix this case with a stricter precheck.
We need to additionally consider all the core siblings of the reserved
CPUs as unavailable when computing the free CPUs, before starting the
actual allocation. Doing so, we fall back to the intended behavior, and
by construction all possible CPU allocations whose count is a multiple
of the SMT level are now correct again.
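
The difference between the two prechecks can be sketched with the
example topology above. This is a minimal illustration only, not the
kubelet code: every name in it (`exampleSiblings`, `availableNaive`,
`availableStrict`, `precheck`) is hypothetical, and the precheck is
reduced to "multiple of the SMT level and enough free CPUs".

```go
package main

import (
	"fmt"
	"sort"
)

// exampleSiblings maps each logical CPU to the CPUs sharing its physical
// core (itself included). Topology taken from the example above.
func exampleSiblings() map[int][]int {
	cores := [][]int{{0, 4}, {1, 5}, {2, 6}, {3, 8}}
	m := make(map[int][]int)
	for _, core := range cores {
		for _, cpu := range core {
			m[cpu] = core
		}
	}
	return m
}

// freeCPUs returns, sorted, every CPU not marked as excluded.
func freeCPUs(siblings map[int][]int, excluded map[int]bool) []int {
	var out []int
	for cpu := range siblings {
		if !excluded[cpu] {
			out = append(out, cpu)
		}
	}
	sort.Ints(out)
	return out
}

// availableNaive excludes only the reserved CPUs themselves.
func availableNaive(siblings map[int][]int, reserved []int) []int {
	excluded := make(map[int]bool)
	for _, cpu := range reserved {
		excluded[cpu] = true
	}
	return freeCPUs(siblings, excluded)
}

// availableStrict also excludes every core sibling of a reserved CPU.
func availableStrict(siblings map[int][]int, reserved []int) []int {
	excluded := make(map[int]bool)
	for _, cpu := range reserved {
		for _, sib := range siblings[cpu] {
			excluded[sib] = true
		}
	}
	return freeCPUs(siblings, excluded)
}

// precheck is a simplified stand-in for the full-pcpus-only admission check.
func precheck(requested, smtLevel int, available []int) bool {
	return requested%smtLevel == 0 && requested <= len(available)
}

func main() {
	siblings := exampleSiblings()
	reserved := []int{0, 1}

	naive := availableNaive(siblings, reserved)
	strict := availableStrict(siblings, reserved)

	fmt.Println(naive, precheck(6, 2, naive))   // [2 3 4 5 6 8] true
	fmt.Println(strict, precheck(6, 2, strict)) // [2 3 6 8] false
}
```

With the naive free set the 6-CPU request is admitted even though CPUs
4 and 5 belong to partially reserved cores; with the strict free set it
is rejected, while a 4-CPU request would still pass.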

+++

[1] or thread siblings in Linux parlance; in any case:
hyperthread siblings of the same physical core

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-02 16:00:58 +01:00
..
apis kubelet podresource: fix GetAllocatableResources metrics 2023-01-04 10:58:55 +02:00
cadvisor Remove AcceleratorUsageMetrics from kubelet 2023-01-11 16:07:39 -08:00
certificate kubelet: add key encipherment usage only if it is rsa key 2022-12-27 16:04:25 +08:00
checkpointmanager
client Merge pull request #96004 from serathius/datapolicy-kubelet-pkg 2022-03-04 15:34:51 -08:00
cloudresource
cm node: cpumgr: stricter precheck for full-pcpus-only 2023-03-02 16:00:58 +01:00
config Merge pull request #112021 from mrunalp/test_host_path_pv_selinux_fix 2022-12-23 12:35:27 -08:00
configmap Generate and format files 2022-07-26 13:14:05 -04:00
container kubelet: wire ListPodSandboxMetrics 2022-11-08 14:47:08 -05:00
cri Add fake runtimes and CRI changes for KEP-2371 2022-11-08 14:47:08 -05:00
envvars
events
eviction remove GA featuregates: CSIInlineVolume, CSIMigration, DaemonSetUpdateSurge, EphemeralContainers, IdentifyPodOS, LocalStorageCapacityIsolation, NetworkPolicyEndPort, StatefulSetMinReadySeconds 2022-12-11 19:27:41 +08:00
images image pull event include duration with waiting 2022-11-06 13:42:44 +08:00
kubeletconfig unittests: Fixes TestReplaceFile for Windows 2022-12-07 11:36:13 +00:00
kuberuntime Merge pull request #113255 from claudiubelu/path-filepath-update-kubelet 2022-12-09 22:27:41 -08:00
leaky
lifecycle Fix indentation/spacing in comments to render correctly in godoc 2022-12-17 23:27:38 -05:00
logs Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
metrics kubelet/metrics: add cri_metrics 2022-11-08 14:47:08 -05:00
network pkg/kubelet/network/dns: omit unnecessary fmt.Sprintf 2022-11-29 14:44:14 +08:00
nodeshutdown Enable the feature into beta 2022-11-09 09:02:40 +01:00
nodestatus Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
oom linux: fix kubelet start unit test 2022-11-09 07:17:05 +08:00
pleg Add support for Evented PLEG 2022-11-08 20:06:16 +05:30
pluginmanager Merge pull request #114187 from claudiubelu/refactor-platform-deps-3 2023-01-10 15:25:26 -08:00
pod kubelet: cleanup secretManager and configManager in podManager 2022-11-14 23:05:32 +08:00
preemption feat: improve naming 2022-07-24 19:04:08 +09:00
prober kubelet: cleanup secretManager and configManager in podManager 2022-11-14 23:05:32 +08:00
qos
runtimeclass Generate and format files 2022-07-26 13:14:05 -04:00
secret Generate and format files 2022-07-26 13:14:05 -04:00
server kubelet: add cri metrics to server 2022-11-08 14:47:08 -05:00
stats Replaces path.Operation with filepath.Operation (kubelet) 2022-11-08 16:05:48 +00:00
status kubelet: cleanup secretManager and configManager in podManager 2022-11-14 23:05:32 +08:00
sysctl remove the unused constant AnnotationInvalidReason since sysctl annotations are deprecated and migrated to fields 2022-09-30 14:53:46 +08:00
token Merge pull request #99685 from yangjunmyfm192085/run-test24 2022-05-03 17:16:47 -07:00
types Enable the feature into beta 2022-11-09 09:02:40 +01:00
util Merge pull request #111930 from azylinski/new-histogram-pod_start_sli_duration_seconds 2022-11-04 07:28:14 -07:00
volumemanager fix: kubelet event about unattached volumes is incorrect (#112719) 2023-01-04 01:51:59 -08:00
winstats Merge pull request #111418 from muyangren2/winstats_assert 2022-07-29 19:29:29 -07:00
active_deadline_test.go Add comment for 0th case 2022-10-08 12:06:42 +03:00
active_deadline.go
doc.go
errors.go
kubelet_getters_test.go Add test case for getPodVolumeSubpathsDir 2022-04-27 16:33:28 +08:00
kubelet_getters.go Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
kubelet_network_linux.go Add IPTablesOwnershipCleanup feature to disable kubelet iptables setup 2022-07-27 13:33:09 -04:00
kubelet_network_others.go
kubelet_network_test.go
kubelet_network.go Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
kubelet_node_status_others.go
kubelet_node_status_test.go kubelet: Keep trying fast status update at startup until node is ready 2022-11-09 15:55:20 +00:00
kubelet_node_status_windows.go
kubelet_node_status.go kubelet: Keep trying fast status update at startup until node is ready 2022-11-09 15:55:20 +00:00
kubelet_pods_linux_test.go Promote Local storage capacity isolation feature to GA 2022-08-02 23:45:48 -07:00
kubelet_pods_test.go Enable the feature into beta 2022-11-09 09:02:40 +01:00
kubelet_pods_windows_test.go unittests: Fixes unit tests for Windows 2022-10-25 23:46:56 +03:00
kubelet_pods.go Merge pull request #113255 from claudiubelu/path-filepath-update-kubelet 2022-12-09 22:27:41 -08:00
kubelet_resources_test.go
kubelet_resources.go
kubelet_test.go kubelet: cleanup secretManager and configManager in podManager 2022-11-14 23:05:32 +08:00
kubelet_volumes_linux_test.go Remove ioutil in kubelet and its tests 2022-07-30 12:35:26 +09:00
kubelet_volumes_test.go Upgrade CSIMigrationGCE feature gate to GA 2022-08-02 09:14:27 -07:00
kubelet_volumes.go remove ioutil in kubelet 2022-04-27 21:08:42 +08:00
kubelet.go Merge pull request #112136 from pacoxu/migrate-runtime-endpoint-flags 2023-01-03 09:29:31 -08:00
OWNERS Check in OWNERS modified by update-yamlfmt.sh 2021-12-09 21:31:26 -05:00
pod_container_deletor_test.go
pod_container_deletor.go Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
pod_workers_test.go Merge pull request #110071 from gjkim42/deflake-TestStaticPodExclusion 2022-07-29 13:17:43 -07:00
pod_workers.go grammar: replace all occurrences of "the the" with "the" 2022-10-14 09:03:14 +02:00
reason_cache_test.go
reason_cache.go Generate and format files 2022-07-26 13:14:05 -04:00
runonce_test.go kubelet: cleanup secretManager and configManager in podManager 2022-11-14 23:05:32 +08:00
runonce.go Second attempt: Plumb context to Kubelet CRI calls (#113591) 2022-11-05 06:02:13 -07:00
runtime.go
userns_manager_test.go kubelet: drop bitArray implementation 2022-08-19 16:55:15 +02:00
userns_manager.go kubelet: drop bitArray implementation 2022-08-19 16:55:15 +02:00
volume_host.go linux: fix kubelet start unit test 2022-11-09 07:17:05 +08:00