Commit Graph

7 Commits

Author SHA1 Message Date
Kevin Klues
0449cef8fd Increase timeout for DRA kubelet plugin client
The 10 second timeout was too low. Given that the retry loop for the
kubelet itself is 90s, increasing the timeout to half of this seems
reasonable. Ideally we would pull in the variable that sets the retry
timeout to 90s and then just set our local timeout to half of that.
Unfortunately, this is not exported, so we settle (for now with just
explicitly setting it to 45s.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-07-18 22:45:01 +01:00
Patrick Ohly
d743c50bb9 kubelet: support batched prepare/unprepare in v1alpha3 DRA plugin API
Combining all prepare/unprepare operations for a pod enables plugins to
optimize the execution. Plugins can continue to use the v1beta2 API for now,
but should switch. The new API is designed so that plugins which want to work
on each claim one-by-one can do so and then report errors for each claim
separately, i.e. partial success is supported.
2023-07-12 14:50:30 +02:00
Kevin Klues
579295e727 Update kubeletplugin API for DynamicResourceAllocation to v1alpha2
This PR makes the NodePrepareResources() and NodeUnprepareResource()
calls of the kubeletplugin API for DynamicResourceAllocation
symmetrical. It wasn't clear how one would use the set of CDIDevices
passed back in the NodeUnprepareResource() of the v1alpha1 API, and the
new API now passes back the full ResourceHandle that was originally
passed to the Prepare() call. Passing the ResourceHandle is strictly
more informative and a plugin could always (re)derive the set of
CDIDevice from it.

This is a breaking change, but this release is scheduled to break
multiple APIs for DynamicResourceAllocation, so it makes sense to do
this now instead of later.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2023-03-14 23:09:44 +00:00
Ed Bartosh
50cb3268b6 DRA: add constant PluginClientTimeout 2023-03-14 00:37:43 +02:00
Saza
d34b0275a3
dynamic resource allocation: add timeouts for communiction with plugin (#114844)
* add timeouts for communication with dra plugin

* move timeout constant to k8s.io/kubernetes/pkg/kubelet/cm/util

* move settings of timeout to pkg/kubelet/plugin/dra/plugin/client.go

* remove timeout constant
2023-03-13 04:34:56 -07:00
Patrick Ohly
bc6c7fa912 logging: fix names of keys
The stricter checking with the upcoming logcheck v0.4.1 pointed out these names
which don't comply with our recommendations in
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md#name-arguments.
2023-01-23 14:24:29 +01:00
Ed Bartosh
ae0f38437c kubelet: add support for dynamic resource allocation
Dependencies need to be updated to use
github.com/container-orchestrated-devices/container-device-interface.

It's not decided yet whether we will implement Topology support
for DRA or not. Not having any toppology-related code
will help to avoid wrong impression that DRA is used as a hint
provider for the Topology Manager.
2022-11-11 21:58:03 +01:00