kubernetes

Author	SHA1	Message	Date
Jan Safranek	77aa06d0c8	Remove util/selinux package The package says: > the libcontainer SELinux package is only built for Linux, so it is > necessary to have a NOP wrapper which is built for non-Linux platforms This is not true, Kubernetes now imports github.com/opencontainers/selinux/go-selinux and it has proper multiplatform support (i.e. NOOP on non-Linux platforms). Removing the whole package and calling go-selinux directly.	2022-02-11 15:20:35 +01:00
Francesco Romani	2f426fdba6	devicemanager: checkpoint: support pre-1.20 data The commit `a8b8995ef2` changed the content of the data kubelet writes in the checkpoint. Unfortunately, the checkpoint restore code was not updated, so if we upgrade kubelet from pre-1.20 to 1.20+, the device manager can no longer restore its state correctly. The only trace of this misbehaviour is this line in the kubelet logs: ``` W0615 07:31:49.744770 4852 manager.go:244] Continue after failing to read checkpoint file. Device allocation info may NOT be up-to-date. Err: json: cannot unmarshal array into Go struct field PodDevicesEntry.Data.PodDeviceEntries.DeviceIDs of type checkpoint.DevicesPerNUMA ``` If we hit this bug, the device allocation info is indeed NOT up-to-date until the device plugins register themselves again. This can take up to a few minutes, depending on the specific device plugin. While the device manager state is inconsistent: 1. the kubelet will NOT update the device availability to zero, so the scheduler will send pods towards the inconsistent kubelet. 2. at pod admission time, the device manager allocation will not trigger, so pods will be admitted without devices actually being allocated to them. To fix these issues, we add support to the device manager to read pre-1.20 checkpoint data. We retroactively call this format "v1". Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-26 09:54:11 +02:00
Kubernetes Prow Robot	c4d802b0b5	Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology podresources: do not export empty NUMA topology	2021-10-07 07:11:46 -07:00
Francesco Romani	1b6efa5e21	devicemanager: skip unhealthy devs in GetAllocatable The GetAllocatableDevices, needed to support the podresources API, doesn't take into account the device health when computing its output. In this PR we address this gap and add unit tests along the way to prevent regressions. This gives us a good initial coverage. E2E tests to cover this case are much harder to write, because we would need to inject faults to trigger the unhealthy status. We will evaluate adding these tests in later PRs. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-22 19:20:04 +02:00
Alexey Perevalov	bb81101570	podresource: do not export NUMA topology if it's empty If a device plugin returns a device without topology, keep it internally as NUMA node -1. This helps at the podresources level to not export NUMA topology; otherwise topology is exported with NUMA node id 0, which is not accurate. It's impossible to unveil this bug just by tracing json.Marshal(resp) in the podresource client, because the NUMANodes field ID has the json property omitempty, so in this case ID=0 is shown as an empty NUMANode. To reproduce it, it is better to iterate on devices and just trace dev.Topology.Nodes[0].ID. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2021-08-24 15:38:21 +00:00
Artyom Lukianov	73a5cce3e6	device manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
sanwishe	9e257ec194	Optimization logging format for pkg/kubelet Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>	2021-05-25 08:52:08 +08:00
kikimo	c0a7939cbb	remove redundant test branch in sorting algorithm	2021-05-20 20:31:47 +08:00
kikimo	445b9c0762	minor tweak on numa node sorting algorithm	2021-05-20 08:21:20 +08:00
kikimo	ecfa609b71	simplify sorting comparator of numa nodes	2021-05-19 21:19:47 +08:00
kikimo	7d30bfecd5	simplify sorting comparator of numa nodes	2021-05-19 10:07:37 +08:00
kikimo	2ef1f81076	Avoid undesirable allocation when device is associated with multiple NUMA Nodes Suppose there are two devices, dev1 and dev2, with NUMA nodes associated as below: dev1: numa1; dev2: numa1, numa2. If we request a device from numa2, filterByAffinity() currently returns [], [dev1, dev2], [] when the loop over available devices produces the sequence [dev1, dev2]. That is not desirable, as what we truly expect is an allocation of dev2 from numa2.	2021-05-19 10:07:37 +08:00
Elana Hashman	6af7eb6d49	Migrate missed log entries in kubelet Co-Authored-By: pacoxu <paco.xu@daocloud.io>	2021-03-18 14:26:26 -07:00
Amim Knabben	c1d24c87bb	Migrate devicemanager to structured logging	2021-03-14 11:57:06 -04:00
Francesco Romani	ad68f9588c	node: podresources: make GetDevices() consistent We want to make the return type of the GetDevices() method of the podresources DevicesProvider interface consistent with the newly added GetAllocatableDevices type. This makes the code easier to read and reduces the coupling between the podresourcesapi server and the devicemanager code. No intended changes in behaviour, but the different return types now requires some data massaging. Tests are updated accordingly. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Francesco Romani	6d33354e4c	node: podresources: implement GetAllocatableResources API Extend the podresources API implementing the GetAllocatableResources endpoint, as specified in the KEPs: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments https://github.com/kubernetes/enhancements/pull/2404 Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Kubernetes Prow Robot	06a7e2bacf	Merge pull request #96781 from fighterhit/fix-kukelet-device-plugin-bug Fix: kubelet return error when device plugin sets PreStartRequired true while creating pods with 0 resource	2021-01-25 17:59:00 -08:00
Anthony ARNAUD	6013aaa370	use Lstat instead of Stat for unix socket on windows	2020-12-29 15:14:29 -05:00
Anthony ARNAUD	8bdc3d8970	Port deviceManager in windows container manager	2020-12-16 00:25:26 -05:00
fighterhit	0eaceb7eb5	Fix: kubelet return error when device plugin sets PreStartRequired true while creating pods with 0 resource	2020-11-21 22:44:27 +08:00
Alexey Perevalov	5e6aed4137	Fixes sigfault in case of empty TopologyInfo Device plugin which implements v1beta interface can return nil in Topology field For example nvidia-gpu-deviceplugin `3520254b75/nvidia.go (L147)` Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-13 11:51:47 +03:00
Krzysztof Wiatrzyk	6db58b2e92	Update logging to use a format util Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:55 +01:00
Alexey Perevalov	a8b8995ef2	Implement TopologyInfo and cpu_ids in podresources It covers deviceplugin & cpumanager. It has drawback, since cpuset and all other structs including cadvisor's keep cpu as int, but for protobuf based interface is better to have fixed int. This patch also introduces additional interface CPUsProvider, while DeviceProvider might have been extended too. Checkpoint not covered by unit test. Signed-off-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 13:50:49 +03:00
Alexey Perevalov	62326a1846	Convert podDevices to struct PodDevices will have its own guard Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 13:50:48 +03:00
Alexey Perevalov	9f54dccc92	Change GetDevices interface This change is necessary for supporting Topology in the ContainerDevices. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 12:41:31 +03:00
Kubernetes Prow Robot	332d17c7f5	Merge pull request #95731 from farah/split-scheduler Delete framework/v1alpha1 folder and change remaining import paths	2020-10-30 11:14:22 -07:00
Ali	bfdeda58b7	Delete framework/v1alpha1 folder and change remaining import paths	2020-10-23 13:16:13 +11:00
chenyw1990	009d46f834	write checkpoint only when allocated devices updated.	2020-10-22 22:45:04 +08:00
Renaud Gaubert	4eadf40448	Run gofmt Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>	2020-09-15 06:22:44 -07:00
Renaud Gaubert	60304452ff	Move podresources api to k8s.io/kubelet/pkg/apis Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>	2020-09-15 05:13:33 -07:00
Alexey Perevalov	a047e8aa1b	move to cadvisor.MachineInfo This patch removes GetNUMANodeInfo; cadvisor.MachineInfo will be used instead. GetNUMANodeInfo was introduced because the meaning of MachineInfo.Topology differed between architectures: on arm it represented NUMA nodes, but on x86 it represented sockets (since it was read from /proc/cpuinfo). Now it is unified and MachineInfo.Topology represents NUMA nodes. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-07-24 09:29:41 -04:00
Kevin Klues	26cb650655	Remove unnecessary union after call to GetPreferredAllocation() There is no need to try and allocate already-allocated devices again.	2020-07-07 06:35:57 +00:00
Kevin Klues	67ecc11c44	Harden callGetPreferredAllocationIfAvailable() return value Previously, we didn't check the contents of the result after calling out to the plugin endpoint. This could have resulted in errors if the plugin returned either 'nil' or an empty result. This patch fixes this.	2020-07-07 06:35:57 +00:00
Kevin Klues	d87365494a	Fix bug in call to callGetPreferredAllocationIfAvailable() Previously, we were passing the variable 'devices' to this function, when we should have been passing 'allocated'. This bug crept in due to a variable name change that didn't propagate its way through the entire function. The tests added in the previous commit would have caught this.	2020-07-07 06:35:57 +00:00
Kevin Klues	a780ccff5b	Updates logic in devicesToAllocate() to call GetPreferredAllocation()	2020-07-02 22:07:27 +00:00
Kevin Klues	bb56a09133	Add callGetPreferredAllocationIfAvailable() function in devicemanager This function mimics what is already done for the conditional call to PreStartContainer() via the callPreStartContainerIfNeeded() function.	2020-07-02 22:07:27 +00:00
Kevin Klues	c45f1317eb	Fix some whitespacing and comments in devicemanager	2020-07-02 15:15:44 +00:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Abdullah Gharaibeh	d6522e0e74	rename framework pkg with schedulerframework for all instances under pkg/kubelet	2020-04-14 14:24:07 -04:00
Abdullah Gharaibeh	bed9b2f23b	Cleanup obsolete NodeInfo methods	2020-04-12 18:13:46 -04:00
nolancon	cb9fdc49db	Device Manager - Refactor allocatePodResources - allocatePodResources logic altered to allow for container by container device allocation. - New type PodReusableDevices - New field in devicemanager devicesToReuse	2020-02-27 07:24:34 +00:00
Kevin Klues	0b168f0243	Change devicemanager to implement HintProvider.Allocate() This change will not work on its own. Higher level code needs to make sure and call Allocate() before AddContainer is called. This is already being done in cases when the TopologyManager feature gate is enabled (in the PodAdmitHandler of the TopologyManager). However, we need to make sure we add proper logic to call it in cases when the TopologyManager feature gate is disabled.	2020-02-10 03:27:47 +00:00
Kevin Klues	a3f099ea4d	Split devicemanager Allocate into two functions Instead of having a single call for Allocate(), we now split this into two functions Allocate() and UpdatePluginResources(). The semantics split across them: // Allocate configures and assigns devices to a pod. From the requested // device resources, Allocate will communicate with the owning device // plugin to allow setup procedures to take place, and for the device // plugin to provide runtime settings to use the device (environment // variables, mount points and device files). Allocate(pod v1.Pod) error // UpdatePluginResources updates node resources based on devices already // allocated to pods. The node object is provided for the device manager to // update the node capacity to reflect the currently available devices. UpdatePluginResources( node schedulernodeinfo.NodeInfo, attrs *lifecycle.PodAdmitAttributes) error As we move to a model in which the TopologyManager is able to ensure aligned allocations from the CPUManager, devicemanager, and any other TopologyManager HintProviders in the same synchronous loop, we will need to be able to call Allocate() independently from an UpdatePluginResources(). This commit makes that possible.	2020-02-10 03:27:47 +00:00
Takeaki Matsumoto	785fac6826	Make updateAllocatedDevices() as a public method and call it in podresources api	2020-02-07 13:26:56 +09:00
danielqsj	1a9b121764	remove deprecated metrics of kubelet	2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot	ad5d4c4705	Merge pull request #85706 from yutedz/per-node-dev Remove nodes slice in loop of takeByTopology	2019-12-05 13:50:30 -08:00
Kubernetes Prow Robot	57b6b287d4	Merge pull request #85688 from yutedz/pods-to-rm Reduce unnecessary Set in updateAllocatedDevices	2019-12-02 17:07:26 -08:00
Ted Yu	84a9803741	Log error when writing checkpoint fails	2019-11-29 19:47:17 -08:00
Ted Yu	6415fa765e	Remove nodes slice in loop of takeByTopology	2019-11-29 12:12:22 -08:00
Ted Yu	86f3bc25e1	Reduce unnecessary Set in updateAllocatedDevices	2019-11-27 08:48:06 -08:00
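
Note on commit bb81101570: its message explains that a NUMA node with ID 0 looks empty once the response is passed through json.Marshal, because the ID field is tagged omitempty. The sketch below illustrates only that encoding behaviour. The `numaNode` and `topology` types and the `marshalTopology` helper are hypothetical stand-ins written for this note, not the actual podresources API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// numaNode is a hypothetical stand-in for a NUMA node message.
// Only the omitempty tag on ID matters for this illustration.
type numaNode struct {
	ID int64 `json:"ID,omitempty"`
}

// topology is a hypothetical stand-in for a device topology message.
type topology struct {
	Nodes []numaNode `json:"nodes,omitempty"`
}

// marshalTopology returns the JSON encoding of t as a string.
func marshalTopology(t topology) string {
	b, err := json.Marshal(t)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// A device reported on NUMA node 0: the zero ID is dropped by omitempty,
	// so the node is encoded as an empty object.
	onNodeZero := topology{Nodes: []numaNode{{ID: 0}}}
	fmt.Println(marshalTopology(onNodeZero))

	// A device reported on NUMA node 1 keeps its ID in the output.
	onNodeOne := topology{Nodes: []numaNode{{ID: 1}}}
	fmt.Println(marshalTopology(onNodeOne))

	// Reading the field directly, as the commit message suggests,
	// still shows the real value.
	fmt.Println(onNodeZero.Nodes[0].ID)
}
```

In the JSON output, node 0 and a node with no ID set cannot be told apart, which is consistent with the commit's advice to trace dev.Topology.Nodes[0].ID rather than the marshalled response.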

104 Commits