kubernetes

Author	SHA1	Message	Date
Kubernetes Prow Robot	3bd422dc76	Merge pull request #107293 from dims/jan-1-owners-cleanup Cleanup OWNERS files - Jan 2021 Week 1	2022-01-13 10:30:30 -08:00
Kubernetes Prow Robot	cadbe8dfb5	Merge pull request #107250 from cndoit18/use-errors cleanup(kubelet): use errors.Is(err, os.ErrProcessDone)	2022-01-11 10:49:01 -08:00
Davanum Srinivas	9682b7248f	OWNERS cleanup - Jan 2021 Week 1 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2022-01-10 08:14:29 -05:00
Kubernetes Prow Robot	03ee86c09c	Merge pull request #104837 from eggiter/fix-release-reused-cpus fix(cpumanager): Do not release CPUs of init containers while they are being reused in app containers	2022-01-06 11:46:38 -08:00
cndoit18	601d02b90f	refactor(kubelet): use errors.Is(err, os.ErrProcessDone) use errors.Is(err, os.ErrProcessDone) here and remove "process already finished" string comparison. Signed-off-by: cndoit18 <cndoit18@outlook.com>	2021-12-29 18:10:06 +08:00
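A minimal sketch of why `errors.Is(err, os.ErrProcessDone)` is more robust than comparing error strings. The helper names (`isProcessDone`, `matchesByString`, `wrapDone`) are hypothetical and are not taken from the kubelet code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// isProcessDone reports whether err is, or wraps, os.ErrProcessDone.
func isProcessDone(err error) bool {
	return errors.Is(err, os.ErrProcessDone)
}

// matchesByString is the fragile alternative: an exact comparison of the
// error text, which stops matching as soon as the error is wrapped.
func matchesByString(err error) bool {
	return err != nil && err.Error() == os.ErrProcessDone.Error()
}

// wrapDone returns os.ErrProcessDone wrapped with extra context
// (hypothetical helper used only for this illustration).
func wrapDone() error {
	return fmt.Errorf("sending signal: %w", os.ErrProcessDone)
}

func main() {
	err := wrapDone()
	fmt.Println(isProcessDone(err))   // true
	fmt.Println(matchesByString(err)) // false
}
```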
yxxhero	a90b149be0	add more message for no PodSandbox container Signed-off-by: yxxhero <aiopsclub@163.com>	2021-12-18 09:52:03 +08:00
Davanum Srinivas	497e9c1971	Cleanup OWNERS files (No Activity in the last year) Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2021-12-15 10:34:02 -05:00
Kubernetes Prow Robot	1d66302c42	Merge pull request #106458 from dims/lint-yaml-in-owners-files Lint/Beautify yaml in OWNERS files	2021-12-10 06:39:12 -08:00
Davanum Srinivas	9405e9b55e	Check in OWNERS modified by update-yamlfmt.sh Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2021-12-09 21:31:26 -05:00
Kevin Klues	f8511877e2	Add regression test for CPUManager distribute NUMA algorithm We witnessed this exact allocation attempt in a live cluster and witnessed the algorithm fail with an accounting error. This test was added to verify that this case is now handled by the updates to the algorithm and that we don't regress from it in the future. "test" description="ensure previous failure encountered on live machine has been fixed (1/1)" "combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4 6] distribution=9 remainder=1 available=[14 2 4 4 0 3 4 1] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[2 4] distribution=9 remainder=1 available=[0 3 4 1 14 2 4 4] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[2 6] distribution=9 remainder=1 available=[1 14 2 4 4 0 3 4] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[4 6] distribution=9 remainder=1 available=[1 3 4 0 14 2 4 4] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[2] distribution=9 remainder=1 available=[4 0 3 4 1 14 2 4] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[4] distribution=9 remainder=1 available=[3 4 0 14 2 4 4 1] balance=4.031 "combo remainderSet balance" combo=[2 4 6] remainderSet=[6] distribution=9 remainder=1 available=[1 13 2 4 4 1 3 4] balance=3.606 "bestCombo found" distribution=9 bestCombo=[2 4 6] bestRemainder=[6] Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 20:49:58 +00:00
Kevin Klues	e284c74d93	Add unit test for CPUManager distribute NUMA algorithm verifying fixes Before Change: "test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request" "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 1] distribution=8 remainder=2 available=[-1 -1 0 6] balance=2.915 "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 2] distribution=8 remainder=2 available=[-1 0 -1 6] balance=2.915 "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[0 3] distribution=8 remainder=2 available=[5 -1 0 0] balance=2.345 "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 2] distribution=8 remainder=2 available=[0 -1 -1 6] balance=2.915 "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[1 3] distribution=8 remainder=2 available=[0 -1 0 5] balance=2.345 "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[2 3] distribution=8 remainder=2 available=[0 0 -1 5] balance=2.345 "bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[0 3] --- FAIL: TestTakeByTopologyNUMADistributed (0.01s) --- FAIL: TestTakeByTopologyNUMADistributed/ensure_bestRemainder_chosen_with_NUMA_nodes_that_have_enough_CPUs_to_satisfy_the_request (0.00s) cpu_assignment_test.go:867: unexpected error [accounting error, not enough CPUs allocated, remaining: 1] After Change: "test" description="ensure bestRemainder chosen with NUMA nodes that have enough CPUs to satisfy the request" "combo remainderSet balance" combo=[0 1 2 3] remainderSet=[3] distribution=8 remainder=2 available=[0 0 0 4] balance=1.732 "bestCombo found" distribution=8 bestCombo=[0 1 2 3] bestRemainder=[3] SUCCESS Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 20:45:37 +00:00
Kevin Klues	031f11513d	Fix accounting bug in CPUManager distribute NUMA policy Without this fix, the algorithm may decide to allocate "remainder" CPUs from a NUMA node that has no more CPUs to allocate. Moreover, it was only considering allocation of remainder CPUs from NUMA nodes such that each NUMA node in the remainderSet could only allocate 1 (i.e. 'cpuGroupSize') more CPUs. With these two issues in play, one could end up with an accounting error where not enough CPUs were allocated by the time the algorithm runs to completion. The updated algorithm will now omit any NUMA nodes that have 0 CPUs left from the set of NUMA nodes considered for allocating remainder CPUs. Additionally, we now consider all combinations of nodes from the remainder set of size 1..len(remainderSet). This allows us to find a better solution if allocating CPUs from a smaller set leads to a more balanced allocation. Finally, we loop through all NUMA nodes 1-by-1 in the remainderSet until all remainder CPUs have been accounted for and allocated. This ensures that we will not hit an accounting error later on because we explicitly remove CPUs from the remainder set until there are none left. A follow-on commit adds a set of unit tests that will fail before these changes, but succeed after them. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 19:18:11 +00:00
Kevin Klues	5317a2e2ac	Fix error handling in CPUManager distribute NUMA tests Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:31 +00:00
Kevin Klues	dc4430b663	Add a sum() helper to the CPUManager cpuassignment logic Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:29 +00:00
Kevin Klues	cfacc22459	Allow the map.Values() function in the CPUManager to take a set of keys Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:28 +00:00
Kevin Klues	a160d9a8cd	Fix CPUManager algo to calculate min NUMA nodes needed for distribution Previously the algorithm was too restrictive because it tried to calculate the minimum based on the number of available NUMA nodes and the number of available CPUs on those NUMA nodes. Since there was no (easy) way to tell how many CPUs an individual NUMA node happened to have, the average across them was used. Using this value, however, could result in thinking you need more NUMA nodes to satisfy a request than you actually do. By using the total number of NUMA nodes and CPUs per NUMA node, we can get the true minimum number of nodes required to satisfy a request. For a given "current" allocation this may not be the true minimum, but it's better to start with fewer and move up than to start with too many and miss out on a better option. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:26 +00:00
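As a rough illustration of the lower bound described above, the sketch below computes how many NUMA nodes are needed when each node holds a fixed number of CPUs. The function name `minNUMANodes` and the ceiling-division formulation are assumptions for illustration, not the CPUManager implementation.

```go
package main

import "fmt"

// minNUMANodes returns the smallest number of NUMA nodes that could hold
// `request` CPUs if every node offers `cpusPerNode` CPUs in total.
// Hypothetical helper: it ignores the current allocation, so it is only
// a lower bound to start searching from.
func minNUMANodes(request, cpusPerNode int) int {
	if request <= 0 || cpusPerNode <= 0 {
		return 0
	}
	// Ceiling division without floating point.
	return (request + cpusPerNode - 1) / cpusPerNode
}

func main() {
	fmt.Println(minNUMANodes(48, 20)) // 3
	fmt.Println(minNUMANodes(40, 20)) // 2
}
```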
Kevin Klues	209cd20548	Fix unit tests following bug fix in CPUManager for map functions (2/2) Now that the algorithm for balancing CPU distributions across NUMA nodes is correct, this test actually behaves differently for the "packed" vs. "distributed" allocation algorithms (as it should). In the "packed" case we need to ensure that CPUs are allocated such that they are packed onto cores. Since one CPU is already allocated from a core on NUMA node 0, we want the next CPU to be its hyperthreaded pair (even though the first available CPU id is on Socket 1). In the "distributed" case, however, we want to ensure CPUs are allocated such that we have a balanced distribution of CPUs across all NUMA nodes. This points to allocating from Socket 1 if the only other CPU allocated has been done on Socket 0. To allow CPU allocations to be packed onto full cores, one can allocate them from the "distributed" algorithm with a 'cpuGroupSize' equal to the number of hyperthreads per core (in this case 2). We added an explicit test case for this, demonstrating that we get the same result as the "packed" algorithm does, even though the "distributed" algorithm is in use. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:24 +00:00
Kevin Klues	67f719cb1d	Fix unit tests following bug fix in CPUManager for map functions (1/2) This fixes two related tests to better test our "balanced" distribution algorithm. The first test originally provided an input with the following number of CPUs available on each NUMA node: Node 0: 16 Node 1: 20 Node 2: 20 Node 3: 20 It then attempted to distribute 48 CPUs across them with an expectation that each of the first 3 NUMA nodes would have 16 CPUs taken from them (leaving Node 0 with no more CPUs in the end). This would have resulted in the following amount of CPUs on each node: Node 0: 0 Node 1: 4 Node 2: 4 Node 3: 20 Which results in a standard deviation of 7.6811 However, a more balanced solution would actually be to pull 16 CPUs from NUMA nodes 1, 2, and 3, and leave 0 untouched, i.e.: Node 0: 16 Node 1: 4 Node 2: 4 Node 3: 4 Which results in a standard deviation of 5.1961524227066 To fix this test we changed the original number of available CPUs to start with 4 less CPUs on NUMA node 3, and 2 more CPUs on NUMA node 0, i.e.: Node 0: 18 Node 1: 20 Node 2: 20 Node 3: 16 So that we end up with a result of: Node 0: 2 Node 1: 4 Node 2: 4 Node 3: 16 Which pulls the CPUs from where we want and results in a standard deviation of 5.5452 For the second test, we simply reverse the number of CPUs available for Nodes 0 and 3 as: Node 0: 16 Node 1: 20 Node 2: 20 Node 3: 18 Which forces the allocation to happen just as it did for the first test, except now on NUMA nodes 1, 2, and 3 instead of NUMA nodes 0,1, and 2. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:23 +00:00
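The standard deviations quoted in the commit above can be reproduced with a population standard deviation over the CPUs left on each NUMA node. This is a sketch: the function name `stddev` is hypothetical, and the rounding to the nearest thousandth follows the rounding commit listed further down rather than the actual CPUManager code.

```go
package main

import (
	"fmt"
	"math"
)

// stddev returns the population standard deviation of xs,
// rounded to the nearest thousandth.
func stddev(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	var sum float64
	for _, x := range xs {
		sum += x
	}
	mean := sum / float64(len(xs))

	var sq float64
	for _, x := range xs {
		sq += (x - mean) * (x - mean)
	}
	return math.Round(math.Sqrt(sq/float64(len(xs)))*1000) / 1000
}

func main() {
	fmt.Println(stddev([]float64{0, 4, 4, 20})) // 7.681 (take from nodes 0, 1, 2)
	fmt.Println(stddev([]float64{16, 4, 4, 4})) // 5.196 (take from nodes 1, 2, 3)
	fmt.Println(stddev([]float64{2, 4, 4, 16})) // 5.545 (adjusted first test)
}
```

The lower value for the second distribution is what makes it the more balanced choice.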
Kevin Klues	4008ea0b4c	Fix bug in CPUManager map.Keys() and map.Values() implementations Previously these would return lists that were too long because we appended to pre-initialized lists with a specific size. Since the primary place these functions are used is in the mean and standard deviation calculations for the NUMA distribution algorithm, it meant that the results of these calculations were often incorrect. As a result, some of the unit tests we have are actually incorrect (because the results we expect do not actually produce the best balanced distribution of CPUs across all NUMA nodes for the input provided). These tests will be patched up in subsequent commits. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:21 +00:00
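The class of bug described in the commit above can be sketched as follows: appending to a slice that was created with a non-zero length leaves the zero values in front of the real entries. The names `keysBuggy` and `keysFixed` are hypothetical and only illustrate the pattern; they are not the CPUManager code.

```go
package main

import "fmt"

// keysBuggy creates a slice of length len(m) and then appends to it, so
// the result holds len(m) zero values followed by the real keys.
func keysBuggy(m map[int]int) []int {
	keys := make([]int, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

// keysFixed starts from length 0 (with capacity len(m)), so the result
// contains exactly one entry per key.
func keysFixed(m map[int]int) []int {
	keys := make([]int, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	m := map[int]int{0: 16, 1: 20, 2: 20, 3: 20}
	fmt.Println(len(keysBuggy(m))) // 8
	fmt.Println(len(keysFixed(m))) // 4
}
```

The extra zero entries would skew any mean or standard deviation computed over the returned slice.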
Kevin Klues	446c58e0e7	Ensure we balance across all NUMA nodes in NUMA distribution algo Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:19 +00:00
Kevin Klues	c8559bc43e	Short-circuit CPUManager distribute NUMA algo for unusable cpuGroupSize Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:16 +00:00
Kevin Klues	b28c1392d7	Round the CPUManager mean and stddev calculations to the nearest 1000th Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-11-24 16:51:13 +00:00
Sascha Grunert	de37b9d293	Make CRI `v1` the default and allow a fallback to `v1alpha2` This patch makes the CRI `v1` API the new project-wide default version. To allow backwards compatibility, a fallback to `v1alpha2` has been added as well. This fallback can be determined automatically by the kubelet. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-11-17 11:05:05 -08:00
Antonio Ojea	d126b14838	migrate nolint comments to golangci-lint	2021-11-17 13:58:53 +01:00
Neha Lohia	fa1b6765d5	move pkg/util/node to component-helpers/node/util (#105347 ) Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>	2021-11-12 07:52:27 -08:00
Kubernetes Prow Robot	3ca3daac76	Merge pull request #103415 from tiloso/staticcheck-kubelet Fix staticcheck failure in pkg/kubelet/cm/cpuset	2021-11-11 15:15:13 -08:00
Kubernetes Prow Robot	359b722c19	Merge pull request #102882 from fromanirh/device-manager-checkpoints devicemanager: checkpoint: support pre-1.20 data	2021-11-02 16:56:57 -07:00
Kubernetes Prow Robot	08bf54678e	Merge pull request #101909 from nolancon/cpu-mgr-testing Additional cases for reconcileState testing	2021-10-30 00:01:17 -07:00
Francesco Romani	2f426fdba6	devicemanager: checkpoint: support pre-1.20 data The commit `a8b8995ef2` changed the content of the data kubelet writes in the checkpoint. Unfortunately, the checkpoint restore code was not updated, so if we upgrade kubelet from pre-1.20 to 1.20+, the device manager can no longer restore its state correctly. The only trace of this misbehaviour is this line in the kubelet logs: ``` W0615 07:31:49.744770 4852 manager.go:244] Continue after failing to read checkpoint file. Device allocation info may NOT be up-to-date. Err: json: cannot unmarshal array into Go struct field PodDevicesEntry.Data.PodDeviceEntries.DeviceIDs of type checkpoint.DevicesPerNUMA ``` If we hit this bug, the device allocation info is indeed NOT up-to-date until the device plugins register themselves again. This can take up to a few minutes, depending on the specific device plugin. While the device manager state is inconsistent: 1. the kubelet will NOT update the device availability to zero, so the scheduler will send pods towards the inconsistent kubelet. 2. at pod admission time, the device manager allocation will not trigger, so pods will be admitted without devices actually being allocated to them. To fix these issues, we add support to the device manager to read pre-1.20 checkpoint data. We retroactively call this format "v1". Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-26 09:54:11 +02:00
Kevin Klues	86f9c266bc	Add optimizations to reduce iterations in distributed NUMA algorithm Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-18 08:53:25 +00:00
Kevin Klues	70e0f47191	Support full-pcpus-only with the new NUMA distribution policy option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	d54445a84d	Generalize the NUMA distribution algorithm to take cpuGroupSize This parameter ensures that CPUs are always allocated in groups of size 'cpuGroupSize'. This is important, for example, to ensure that all CPUs (i.e. hyperthreads) from the same core are handed out together. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	1436e33642	Add more extensive testing for NUMA distribution algorithm in CPUManager Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	cf3afb8602	Add 2 distinguishing test cases between the 2 takeByTopology algorithms Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	eb78e2406b	Add a new TestTakeByTopologyNUMADistributed() test to the CPUManager As part of this, pull out all of the existing "TakeByTopology" tests and have them be called by the original TestTakeByTopologyNUMAPacked() as well as the new TestTakeByTopologyNUMADistributed() test. In a subsequent commit, we will add some tests that should differ between these two algorithms. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	876dd9b078	Added algorithm to CPUManager to distribute CPUs across NUMA nodes Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	462544d079	Split CPUManager takeByTopology() into two different algorithms The first implements the original algorithm which packs CPUs onto NUMA nodes if more than one NUMA node is required to satisfy the allocation. The second distributes CPUs across NUMA nodes if they can't all fit into one. The "distributing" algorithm is currently a noop and just returns an error of "unimplemented". A subsequent commit will add the logic to implement this algorithm according to KEP 2902: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2902-cpumanager-distribute-cpus-policy-option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 14:46:19 +00:00
Kevin Klues	0e7928edce	Add new CPUManager policy option for "distribute-cpus-across-numa" This commit only adds the option to the policy options framework. A subsequent commit will add the logic to utilize it. The KEP describing this new option can be found here: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2902-cpumanager-distribute-cpus-policy-option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 14:46:19 +00:00
Francesco Romani	4bae656835	cpumanager: test NUMA node support for CPU assign (2) This batch of tests adds a fake topology on which each NUMA node has multiple sockets. We have not yet found a real HW topology like this in the wild, but we need one to fully exercise the code. So, until we find a HW topology, we add a fake one flipping the NUMA/socket config of the existing xeon dual gold 6320. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	547996f3f6	cpumanager: test NUMA node support for CPU assign (1) This batch of tests adds a real topology on which each physical socket has multiple NUMA zones. Taken by a real dual xeon 6320 gold. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	f6ccc4426a	cpumanager: test: use proper subtests The existing unit tests were performing subtests without actually using the full features of the testing package (https://pkg.go.dev/testing#hdr-Subtests_and_Sub_benchmarks). Update them with fairly minimal changes. The patch is deceptively large because we need to move the code inside a new block. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	15caa134b2	cpumanager: topology: use rich cmp package Use `cmp.Diff` from the `cmp` package in the unit tests, moving away from `reflect.DeepEqual`. This gives us a clearer picture of the differences when the tests fail. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Kevin Klues	aff54a0914	Abstract out whether NUMA or Sockets come first in the memory hierarchy This allows us to get rid of the check for determining which one is higher all throughout the code. Now we just check once and instantiate an interface of the appropriate type that makes sure the ordering in the hierarchy is preserved through the appropriate calls. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-15 10:29:15 +00:00
Kevin Klues	17c7e86c6d	Add NUMA support to the CPU assignment algorithm in the CPUManager Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-15 08:35:59 +00:00
nolancon	6bbb36df10	Additional cases for reconcileState testing	2021-10-11 16:17:21 +00:00
Kubernetes Prow Robot	63f66e6c99	Merge pull request #105012 from fromanirh/cpumanager-policy-options-beta node: graduate CPUManagerPolicyOptions to beta	2021-10-08 07:32:59 -07:00
Alexey Perevalov	5d9032007a	Return only isolated cpus in podresources interface Co-Authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2021-10-07 15:34:08 +01:00
Kubernetes Prow Robot	c4d802b0b5	Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology podresources: do not export empty NUMA topology	2021-10-07 07:11:46 -07:00
Kubernetes Prow Robot	c91f9bdc60	Merge pull request #104689 from cynepco3hahue/memory_manager_restricted_policy_fix kubelet: memory manager: fix preferred topology hints calculation	2021-10-05 06:47:08 -07:00
Kubernetes Prow Robot	883250145c	Merge pull request #104788 from 249043822/memorymanager-br Fix initContainersReusableMemory delete bug in MemoryManager	2021-10-01 05:27:22 -07:00
Francesco Romani	077c0aa1be	node: graduate CPUManagerPolicyOptions to beta We graduate the `CPUManagerPolicyOptions` feature to beta in the 1.23 cycle, and we add new experimental feature gates to guard new options which are planned in the 1.23 and in the following cycles. We introduce additional feature gates called `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions`. The basic idea is to avoid the cumbersome process of adding a feature gate for each option, and to have feature gates which track the maturity level of _groups_ of options. Besides this change, the graduation process, and the process in general, for adding new policy options is still unchanged. The `full-pcpus-only` option added in the 1.22 cycle is intentionally moved into the beta policy options. For more details: - KEP: https://github.com/kubernetes/enhancements/pull/2933 - sig-arch discussion: https://groups.google.com/u/1/g/kubernetes-sig-architecture/c/Nxsc7pfe5rw Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-29 11:40:03 +02:00
Kubernetes Prow Robot	2541fcf256	Merge pull request #104123 from fromanirh/podresources-not-report-unhealthy-devices devicemanager: skip unhealthy devices in GetAllocatable	2021-09-23 05:39:21 -07:00
Francesco Romani	1b6efa5e21	devicemanager: skip unhealthy devs in GetAllocatable The GetAllocatableDevices, needed to support the podresources API, doesn't take into account the device health when computing its output. In this PR we address this gap and add unit tests along the way to prevent regressions. This gives us good initial coverage. E2E tests to cover this case are much harder to write, because we would need to inject faults to trigger the unhealthy status. We will evaluate adding these tests in later PRs. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-22 19:20:04 +02:00
Ricardo Pchevuzinske Katz	37d11bcdaf	Move node and networking related helpers from pkg/util to component helpers Signed-off-by: Ricardo Katz <rkatz@vmware.com>	2021-09-16 17:00:19 -03:00
KeZhang	a629ceeb58	Fix initContainersReusableMemory delete bug	2021-09-15 10:04:49 +08:00
eggiter	20d3bc32ac	fix(cpumanager): Do not release cpus of init containers while they are reused in app containers	2021-09-10 10:01:35 +08:00
Shiming Zhang	7706d3d281	pkg/kubelet/cm/memorymanager: Fix ErrorS key/value pair	2021-09-06 17:37:04 +08:00
Artyom Lukianov	9ea9798759	kubelet: memory manager: fix preferred topology hints calculation Prevent starting pods with resources satisfied by a single NUMA node on multiple NUMA nodes. The code returned before it updated the minimal amount of NUMA nodes that can satisfy the container requests. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-31 17:46:59 +03:00
tiloso	2b86541313	Fix staticcheck failure in pkg/kubelet/cm/cpuset	2021-08-26 08:50:08 +02:00
Kubernetes Prow Robot	cbd0611d49	Merge pull request #104528 from kolyshkin/runc-1.0.2 vendor: bump runc to 1.0.2	2021-08-25 18:17:23 -07:00
Stephen Augustus	481cf6fbe7	generated: Run hack/update-gofmt.sh Signed-off-by: Stephen Augustus <foo@auggie.dev>	2021-08-24 15:47:49 -04:00
Alexey Perevalov	bb81101570	podresource: do not export NUMA topology if it's empty If a device plugin returns a device without topology, keep it internally as NUMA node -1. This helps at the podresources level to not export NUMA topology; otherwise topology is exported with NUMA node id 0, which is not accurate. It's impossible to uncover this bug just by tracing json.Marshal(resp) in the podresource client, because the NUMANodes ID field has the json property omitempty, so ID=0 is shown as an empty NUMANode. To reproduce it, it is better to iterate over the devices and trace dev.Topology.Nodes[0].ID. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2021-08-24 15:38:21 +00:00
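The `omitempty` effect mentioned in the commit above can be seen with a small sketch. The `numaNode` type and `marshalNode` helper are hypothetical stand-ins, not the podresources API types; only the `encoding/json` behaviour is the point here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// numaNode is a hypothetical stand-in for a message whose ID field is
// tagged with omitempty.
type numaNode struct {
	ID int64 `json:"ID,omitempty"`
}

// marshalNode returns the JSON encoding of a node with the given ID.
func marshalNode(id int64) string {
	b, err := json.Marshal(numaNode{ID: id})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(marshalNode(0)) // {}
	fmt.Println(marshalNode(1)) // {"ID":1}
}
```

Because the zero value is dropped from the output, a node with ID 0 looks the same as an empty node in the marshalled JSON, which is why inspecting the field directly is more reliable.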
Kir Kolyshkin	c06a851042	pkg/kubelet/cm: use SkipFreezeOnSet This is a knob added by runc 1.0.2 specifically for kubernetes, which tells runc/libcontainer/cgroups/systemd v1 manager to not freeze the cgroup in Set(). We set this knob here because this code is only used for pods (rather than containers) management, and in this place we create or update the pod cgroup with no device limits set, so we can skip the freeze. If this knob is not set, libcontainer's cgroup v1 manager tries to figure out whether the freeze is needed or not, but it's a somewhat expensive check to perform, thus the knob is a shortcut. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-08-23 13:41:51 -07:00
Kubernetes Prow Robot	a9aad7e034	Merge pull request #103107 from pacoxu/fix-93300 ResourceConfigForPod: check initContainers as other QoS func	2021-08-17 11:41:37 -07:00
Artyom Lukianov	73a5cce3e6	device manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
Artyom Lukianov	93a237abd8	memory manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
Artyom Lukianov	66babd1a90	cpu manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
Wesley Williams	ff165c8823	Replace usage of Whitelist with Allowlist within Kubelet's sysctl package (#102298 ) * Change uses of whitelist to allowlist in kubelet sysctl * Rename whitelist files to allowlist in Kubelet sysctl * Further renames of whitelist to allowlist in Kubelet * Rename podsecuritypolicy uses of whitelist to allowlist * Update pkg/kubelet/kubelet.go Co-authored-by: Danielle <dani@builds.terrible.systems> Co-authored-by: Danielle <dani@builds.terrible.systems>	2021-08-04 18:59:35 -07:00
Kir Kolyshkin	e5b434e990	kubelet/cm: don't set Devices Since runc 1.0.0 it is now sufficient to have SkipDevices: true. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-07-16 12:45:35 -07:00
Francesco Romani	23abdab2b7	smtalign: propagate policy options to policies Consume in the static policy the cpu manager policy options from the cpumanager instance. Validate in the none policy if any option is given, and fail if so - this is almost surely a configuration mistake. Add new cpumanager.Options type to hold the options and translate from user arguments to flags. Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:37 +02:00
Francesco Romani	6dcec345df	smtalign: cm: factor out admission response Introduce a new `admission` subpackage to factor out the responsibility to create `PodAdmitResult` objects. This enables resource managers to report specific errors in Allocate() and to bubble them up in the relevant fields of the `PodAdmitResult`. To demonstrate the approach we refactor TopologyAffinityError as a proper error. Co-authored-by: Kevin Klues <kklues@nvidia.com> Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:37 +02:00
Francesco Romani	c5cb263dcf	smtalign: propagate policy options to cpumanager The CPUManagerPolicyOptions received from the kubelet config/command line args is propagated to the Container Manager. We defer the consumption of the options to a later patch(set). Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:35 +02:00
Li Bo	c3d9b10ca8	feature: support Memory QoS for cgroups v2	2021-07-08 09:26:46 +08:00
Akihiro Suda	dbe0155139	kubelet/cm: ignore sysctl error when running in userns Errors during setting the following sysctl values are ignored: - vm.overcommit_memory - vm.panic_on_oom - kernel.panic - kernel.panic_on_oops - kernel.keys.root_maxkeys - kernel.keys.root_maxbytes Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2021-07-07 14:23:29 +09:00
Kubernetes Prow Robot	eae87bfe7e	Merge pull request #103483 from odinuge/revert-102508-runc-1.0 Revert "Update runc to 1.0.0"	2021-07-06 10:42:56 -07:00
Artyom Lukianov	bb6d5b1f95	memory manager: provide unittests for init containers re-use - provide tests for static policy allocation, when init containers requested memory bigger than the memory requested by app containers - provide tests for static policy allocation, when init containers requested memory smaller than the memory requested by app containers - provide tests to verify that init containers removed from the state file once the app container started Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-07-05 20:52:25 +03:00
Artyom Lukianov	960da7895c	memory manager: remove init containers once app container started Remove init containers from the state file once the app container has started. This releases the memory allocated for the init container and can increase the density of containers on the NUMA node in cases when the memory allocated for init containers is bigger than the memory allocated for app containers. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-07-05 20:52:25 +03:00
Artyom Lukianov	b965502c49	memory manager: re-use the memory allocated for init containers The idea is that during calls to `Allocate` and `GetTopologyHints` we take into account the init containers' reusable memory, which means that we re-use the memory and update container memory blocks accordingly. For example, for a pod with two init containers that requested 1Gi and 2Gi, and an app container that requested 4Gi, we can re-use 2Gi of memory. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-07-05 20:52:25 +03:00
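The arithmetic in the example above (1Gi and 2Gi init containers, 4Gi app container, 2Gi re-used) can be sketched as follows. This is a simplified model for a single app container, assuming the largest init request is what can be re-used because init containers run one at a time; the helper names are hypothetical and the real memory manager tracks memory blocks per NUMA node.

```go
package main

import "fmt"

// gi is one gibibyte in bytes.
const gi = int64(1) << 30

// reusableInitMemory returns the memory that can be re-used once all init
// containers have finished: the largest single init request.
func reusableInitMemory(initRequests []int64) int64 {
	var reusable int64
	for _, r := range initRequests {
		if r > reusable {
			reusable = r
		}
	}
	return reusable
}

// additionalAppMemory returns how much memory the app container still
// needs on top of what it can re-use from the init containers.
func additionalAppMemory(initRequests []int64, appRequest int64) int64 {
	extra := appRequest - reusableInitMemory(initRequests)
	if extra < 0 {
		return 0
	}
	return extra
}

func main() {
	inits := []int64{1 * gi, 2 * gi}
	fmt.Println(reusableInitMemory(inits) / gi)        // 2
	fmt.Println(additionalAppMemory(inits, 4*gi) / gi) // 2
}
```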
Odin Ugedal	61d88af9e4	Revert "Update runc to 1.0.0"	2021-07-05 14:03:04 +02:00
Kir Kolyshkin	ab5b77944e	kubelet/cm: don't set Devices Since runc 1.0.0 it is now sufficient to have SkipDevices: true. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-06-30 16:17:35 -07:00
pacoxu	f2eec0a816	ResourceConfigForPod: check initContainers as other QoS func Signed-off-by: pacoxu <paco.xu@daocloud.io>	2021-06-28 19:22:42 +08:00
Kubernetes Prow Robot	07358f1663	Merge pull request #103146 from tech-geek29/fix-95380 Change log level to Debug	2021-06-25 07:44:45 -07:00
Rishabh Jain	8f08db9164	Change log level to Debug	2021-06-24 14:23:06 +05:30
Kenta Tada	89a4d4b071	kubelet: modify the function of getCgroupSubsystemsV2 to use libcontainer API	2021-06-24 16:58:05 +09:00
Kubernetes Prow Robot	985ac8ae50	Merge pull request #101030 from cynepco3hahue/pod_resources_memory_interface Extend pod resource API response to return the information from memory manager	2021-06-22 06:35:58 -07:00
Artyom Lukianov	03830db82d	Implement all necessary methods to provide memory manager data under pod resources metrics Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-06-22 13:06:32 +03:00
Kubernetes Prow Robot	3bd29bc53d	Merge pull request #102829 from snowplayfire/update-devicemanager Add resource capacity to ListAndWatch grpc logging	2021-06-21 16:28:09 -07:00
jingxueli	45d18acbcc	add info for possible failed listAndWatch grpc call	2021-06-17 16:25:20 +08:00
Kubernetes Prow Robot	85f0931ab9	Merge pull request #102772 from saintube/patch-1 cleanup: fix kubelet cpuset typo	2021-06-14 19:00:13 -07:00
Francesco Romani	369416b763	cm: handle nil cpumanager avoiding segfault If the cpumanager feature gate is disabled, the corresponding field of the containerManager will be nil. A couple of functions don't check for this occurrence and happily dereference the pointer unconditionally, leading to possible segfaults. The relevant functions were introduced to support the podresources API, so to trigger this segfault all the following are needed: - cpumanager feature gate has to be disabled explicitly - any podresources API must be called Worth pointing out that when the new functions were introduced (around kubernetes 1.20) the default feature gate for cpumanager was already set to true, hence this bug is expected to be triggered rarely. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-06-10 16:22:43 +02:00
Frame	9255f2ccf3	Fix kubelet cpuset typo	2021-06-10 18:17:04 +08:00
Kubernetes Prow Robot	1795a98eeb	Merge pull request #102221 from kikimo/add-hint-to-fake-topology-manager Add hint to fake topology manager.	2021-06-02 03:40:05 -07:00
kikimo	86d68effc2	clean code	2021-06-02 09:07:53 +08:00
Kubernetes Prow Robot	7c7a0865cd	Merge pull request #102218 from kolyshkin/cgroup-cleanups pkg/kubelet/cm: cgroup-related cleanups	2021-06-01 13:45:51 -07:00
kikimo	9d2135f703	reuse fake topology manager	2021-06-02 01:35:00 +08:00
kikimo	8b3162d67b	clean code	2021-06-02 01:17:04 +08:00
sanwishe	9e257ec194	Optimization logging format for pkg/kubelet Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>	2021-05-25 08:52:08 +08:00
Kubernetes Prow Robot	cf59c68e15	Merge pull request #102088 from wzshiming/fix/pod-devices-has-pod-lock Add the missing RLock	2021-05-24 15:16:20 -07:00
Kir Kolyshkin	f1aee7e049	kubelet/cm: GetResourceStats -> MemoryUsage Commit `cc50aa9dfb` introduced GetResourceStats, a method which collected all the statistics from various cgroup controllers, only to discard all of the info collected except a single value (memory usage). While one may argue that this method can potentially be used from other places, this did not happen since it was added 4+ years ago. Let's streamline this code and only collect what we need, i.e. memory usage. Rename the method accordingly. While at it, fix pkg/kubelet/cm/cgroup_manager_unsupported.go to not instantiate a new error every time a method is called. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-05-23 20:43:52 -07:00
kikimo	20c02357ca	Add hint to fake topology manager.	2021-05-22 15:29:08 +08:00

1118 Commits