kubernetes

Author	SHA1	Message	Date
David Porter	9c20cee504	Revert "node: device-mgr: Handle recovery flow by checking if healthy devices exist"	2023-03-07 11:50:52 -08:00
Swati Sehgal	5b2a3dbbdc	node: device-mgr: explicitly check if pre-allocated devices are healthy Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 11:52:23 +00:00
Swati Sehgal	7ac399c205	node: device-mgr: Handle recovery by checking if healthy devices exist In case of node reboot/kubelet restart, the flow of events involves obtaining the state from the checkpoint file followed by setting the `healthDevices`/`unhealthyDevices` to its zero value. This is done to allow the device plugin to re-register itself so that capacity can be updated appropriately. During the allocation phase, we need to check if the resources requested by the pod have been registered AND healthy devices are present on the node to be allocated. Also we need to move this check above `needed==0` where needed is required - devices allocated to the container (which is obtained from the checkpoint file) because even in cases where no additional devices have to be allocated (as they were pre-allocated), we still need to make sure the devices that were previously allocated are healthy. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 11:52:23 +00:00
Kubernetes Prow Robot	25dc4c4f32	Merge pull request #112980 from swatisehgal/devicemanager-ga-graduation node: devicemgr: Graduate Kubelet DeviceManager to GA	2022-11-02 13:17:01 -07:00
Swati Sehgal	40741681a2	node: devicemgr: Address warnings from golint Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-11-02 11:05:20 +00:00
Swati Sehgal	752fa093e0	node: devicemgr: GA graduation implies Feature Gate is ON by default Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2022-11-02 11:05:20 +00:00
Ryan Phillips	2514486d80	kubelet: fix nil crash in allocateRemainingFrom	2022-10-12 12:51:17 -05:00
Kevin Klues	db88676c20	Refactor all device plugin logic into separate 'plugin' package This is the first step towards being able to support a new plugin API version in parallel with the existing one. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2022-04-29 10:52:37 +00:00
waynepeking348	6157d3cc4a	skip deleted activePods and return nil	2022-03-27 20:35:09 +08:00
waynepeking348	35a456b0c6	skip reallocate logic if pod is already removed	2022-03-20 21:09:47 +08:00
Jan Safranek	77aa06d0c8	Remove util/selinux package The package says: > the libcontainer SELinux package is only built for Linux, so it is > necessary to have a NOP wrapper which is built for non-Linux platforms This is not true, Kubernetes now imports github.com/opencontainers/selinux/go-selinux and it has proper multiplatform support (i.e. NOOP on non-Linux platforms). Removing the whole package and calling go-selinux directly.	2022-02-11 15:20:35 +01:00
Francesco Romani	2f426fdba6	devicemanager: checkpoint: support pre-1.20 data The commit `a8b8995ef2` changed the content of the data kubelet writes in the checkpoint. Unfortunately, the checkpoint restore code was not updated, so if we upgrade kubelet from pre-1.20 to 1.20+, the device manager cannot anymore restore its state correctly. The only trace of this misbehaviour is this line in the kubelet logs: ``` W0615 07:31:49.744770 4852 manager.go:244] Continue after failing to read checkpoint file. Device allocation info may NOT be up-to-date. Err: json: cannot unmarshal array into Go struct field PodDevicesEntry.Data.PodDeviceEntries.DeviceIDs of type checkpoint.DevicesPerNUMA ``` If we hit this bug, the device allocation info is indeed NOT up-to-date up until the device plugins register themselves again. This can take up to few minutes, depending on the specific device plugin. While the device manager state is inconsistent: 1. the kubelet will NOT update the device availability to zero, so the scheduler will send pods towards the inconsistent kubelet. 2. at pod admission time, the device manager allocation will not trigger, so pods will be admitted without devices actually being allocated to them. To fix these issues, we add support to the device manager to read pre-1.20 checkpoint data. We retroactively call this format "v1". Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-26 09:54:11 +02:00
Kubernetes Prow Robot	c4d802b0b5	Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology podresources: do not export empty NUMA topology	2021-10-07 07:11:46 -07:00
Francesco Romani	1b6efa5e21	devicemanager: skip unhealthy devs in GetAllocatable The GetAllocatableDevices, needed to support the podresources API, doesn't take into account the device health when computing its output. In this PR we address this gap and add unit tests along the way to prevent regressions. This gives us a good initial coverage, E2E tests to cover this case are much harder to write, because we would need to inject faults to trigger the unhealthy status. We will evaluate if adding these tests into later PRs. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-22 19:20:04 +02:00
Alexey Perevalov	bb81101570	podresource: do not export NUMA topology if it's empty If device plugin returns device without topology, keep it internally as NUMA node -1, it helps at podresources level to not export NUMA topology, otherwise topology is exported with NUMA node id 0, which is not accurate. It's impossible to unveil this bug just by tracing json.Marshal(resp) in podresource client, because NUMANodes field ID has json property omitempty, in this case when ID=0 shown as empty NUMANode. To reproduce it, better to iterate on devices and just trace dev.Topology.Nodes[0].ID. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2021-08-24 15:38:21 +00:00
Artyom Lukianov	73a5cce3e6	device manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
sanwishe	9e257ec194	Optimization logging format for pkg/kubelet Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>	2021-05-25 08:52:08 +08:00
kikimo	c0a7939cbb	remove redundant test branch in sorting algorithm	2021-05-20 20:31:47 +08:00
kikimo	445b9c0762	minor tweak on numa node sorting algorithm	2021-05-20 08:21:20 +08:00
kikimo	ecfa609b71	simplify sorting comparator of numa nodes	2021-05-19 21:19:47 +08:00
kikimo	7d30bfecd5	simplify sorting comparator of numa nodes	2021-05-19 10:07:37 +08:00
kikimo	2ef1f81076	Avoid undesirable allocation when device is associated with multiple NUMA Nodes suppose there are two devices dev1 and dev2, each has NUMA Nodes associated as below: dev1: numa1 dev2: numa1, numa2 and we request a device from numa2, currently filterByAffinity() will return [], [dev1, dev2], [] if loop of available devices produce a sequence of [dev1, dev2], that is not desirable as what we truly expect is an allocation of dev2 from numa2.	2021-05-19 10:07:37 +08:00
Elana Hashman	6af7eb6d49	Migrate missed log entries in kubelet Co-Authored-By: pacoxu <paco.xu@daocloud.io>	2021-03-18 14:26:26 -07:00
Amim Knabben	c1d24c87bb	Migrate devicemanager to structured logging	2021-03-14 11:57:06 -04:00
Francesco Romani	ad68f9588c	node: podresources: make GetDevices() consistent We want to make the return type of the GetDevices() method of the podresources DevicesProvider interface consistent with the newly added GetAllocatableDevices type. This makes the code easier to read and reduces the coupling between the podresourcesapi server and the devicemanager code. No intended changes in behaviour, but the different return types now requires some data massaging. Tests are updated accordingly. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Francesco Romani	6d33354e4c	node: podresources: implement GetAllocatableResources API Extend the podresources API implementing the GetAllocatableResources endpoint, as specified in the KEPs: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments https://github.com/kubernetes/enhancements/pull/2404 Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Kubernetes Prow Robot	06a7e2bacf	Merge pull request #96781 from fighterhit/fix-kukelet-device-plugin-bug Fix: kubelet return error when device plugin sets PreStartRequired true while creating pods with 0 resource	2021-01-25 17:59:00 -08:00
Anthony ARNAUD	6013aaa370	use Lstat instead of Stat for unix socket on windows	2020-12-29 15:14:29 -05:00
Anthony ARNAUD	8bdc3d8970	Port deviceManager in windows container manager	2020-12-16 00:25:26 -05:00
fighterhit	0eaceb7eb5	Fix: kubelet return error when device plugin sets PreStartRequired true while creating pods with 0 resource	2020-11-21 22:44:27 +08:00
Alexey Perevalov	5e6aed4137	Fixes sigfault in case of empty TopologyInfo Device plugin which implements v1beta interface can return nil in Topology field For example nvidia-gpu-deviceplugin `3520254b75/nvidia.go (L147)` Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-13 11:51:47 +03:00
Krzysztof Wiatrzyk	6db58b2e92	Update logging to use a format util Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:55 +01:00
Alexey Perevalov	a8b8995ef2	Implement TopologyInfo and cpu_ids in podresources It covers deviceplugin & cpumanager. It has drawback, since cpuset and all other structs including cadvisor's keep cpu as int, but for protobuf based interface is better to have fixed int. This patch also introduces additional interface CPUsProvider, while DeviceProvider might have been extended too. Checkpoint not covered by unit test. Signed-off-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 13:50:49 +03:00
Alexey Perevalov	62326a1846	Convert podDevices to struct PodDevices will have its own guard Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 13:50:48 +03:00
Alexey Perevalov	9f54dccc92	Change GetDevices interface This change is necessary for supporting Topology in the ContainerDevices. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 12:41:31 +03:00
Kubernetes Prow Robot	332d17c7f5	Merge pull request #95731 from farah/split-scheduler Delete framework/v1alpha1 folder and change remaining import paths	2020-10-30 11:14:22 -07:00
Ali	bfdeda58b7	Delete framework/v1alpha1 folder and change remaining import paths	2020-10-23 13:16:13 +11:00
chenyw1990	009d46f834	write checkpoint only when allocated devices updated.	2020-10-22 22:45:04 +08:00
Renaud Gaubert	4eadf40448	Run gofmt Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>	2020-09-15 06:22:44 -07:00
Renaud Gaubert	60304452ff	Move podresources api to k8s.io/kubelet/pkg/apis Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>	2020-09-15 05:13:33 -07:00
Alexey Perevalov	a047e8aa1b	move to cadvisor.MachineInfo This patch removes GetNUMANodeInfo, cadvisor.MachineInfo will be used instead of it. GetNUMANodeInfo was introduced due to difference of meaning of MachineInfo.Topology. On the arm it was NUMA nodes, but on the x86 it represents sockets (since reading from /proc/cpuinfo). Now it unified and MachineInfo.Topology represents NUMA node. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-07-24 09:29:41 -04:00
Kevin Klues	26cb650655	Remove unnecessary union after call to GetPreferredAllocation() There is no need to try and allocate already-allocated devices again.	2020-07-07 06:35:57 +00:00
Kevin Klues	67ecc11c44	Harden callGetPreferredAllocationIfAvailable() return value Previously, we didn't check the contents of the result after calling out to the plugin endpoint. This could have resulted in errors if the plugin returned either 'nil' or an empty result. This patch fixes this.	2020-07-07 06:35:57 +00:00
Kevin Klues	d87365494a	Fix bug in call to callGetPreferredAllocationIfAvailable() Previously, we were passing the variable 'devices' to this function, when we should have been passing 'allocated'. This bug crept in due to a variable name change that didn't propagate its way through the entire function. The tests added in the previous commit would have caught this.	2020-07-07 06:35:57 +00:00
Kevin Klues	a780ccff5b	Updates logic in devicesToAllocate() to call GetPreferredAllocation()	2020-07-02 22:07:27 +00:00
Kevin Klues	bb56a09133	Add callGetPreferredAllocationIfAvailable() function in devicemanager This function mimics what is already done for the conditional call to PreStartContainer() via the callPreStartContainerIfNeeded() function.	2020-07-02 22:07:27 +00:00
Kevin Klues	c45f1317eb	Fix some whitespacing and comments in devicemanager	2020-07-02 15:15:44 +00:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Abdullah Gharaibeh	d6522e0e74	rename framework pkg with schedulerframework for all instances under pkg/kubelet	2020-04-14 14:24:07 -04:00
Abdullah Gharaibeh	bed9b2f23b	Cleanup obsolete NodeInfo methods	2020-04-12 18:13:46 -04:00
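
The `omitempty` pitfall described in commit bb81101570 can be reproduced in isolation. The sketch below is not the podresources API itself: `NUMANode` and `Topology` are hypothetical stand-ins whose JSON tags are assumed to mirror the `omitempty` property the commit message mentions. It shows why a node with ID 0 prints as an empty object when the response is traced through `json.Marshal`, while reading the field directly still reveals the value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NUMANode is a hypothetical stand-in for the podresources NUMA node message.
// The omitempty tag is an assumption taken from the commit message.
type NUMANode struct {
	ID int64 `json:"ID,omitempty"`
}

// Topology is a hypothetical stand-in for a device's topology info.
type Topology struct {
	Nodes []NUMANode `json:"nodes,omitempty"`
}

// marshal renders a Topology as JSON, the way a client might trace a response.
func marshal(t Topology) string {
	b, err := json.Marshal(t)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	zero := Topology{Nodes: []NUMANode{{ID: 0}}}
	one := Topology{Nodes: []NUMANode{{ID: 1}}}

	// ID 0 is the zero value of int64, so omitempty drops the field.
	fmt.Println(marshal(zero)) // {"nodes":[{}]}
	fmt.Println(marshal(one))  // {"nodes":[{"ID":1}]}

	// Reading the field directly is what exposes the misleading 0.
	fmt.Println(zero.Nodes[0].ID) // 0
}
```

In this sketch the JSON trace cannot distinguish "NUMA node 0" from "no ID set", which is consistent with the commit's advice to inspect `dev.Topology.Nodes[0].ID` directly.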


114 Commits