kubernetes/pkg
Swati Sehgal 7ac399c205 node: device-mgr: Handle recovery by checking if healthy devices exist
In case of a node reboot or kubelet restart, the flow of events involves
obtaining the state from the checkpoint file, followed by setting
`healthyDevices`/`unhealthyDevices` to their zero values. This is
done to allow the device plugin to re-register itself so that
capacity can be updated appropriately.

During the allocation phase, we need to check that the resources requested
by the pod have been registered AND that healthy devices are present on
the node to be allocated.

Also, we need to move this check above the `needed==0` check, where `needed`
is the required count minus the devices already allocated to the container
(obtained from the checkpoint file). Even in cases where no additional
devices have to be allocated (as they were pre-allocated), we still need to
make sure the devices that were previously allocated are healthy.
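
The ordering described above can be illustrated with a minimal Go sketch.
This is not the kubelet code: the `manager` type, its fields, the
`devicesToAllocate` function and the error strings are all hypothetical
stand-ins, assumed here only to show why the registration and health
checks have to run before the `needed == 0` early return.

```go
package main

import "fmt"

// manager is a hypothetical, stripped-down stand-in for a device manager.
// It is an assumption for illustration, not the real kubelet type.
type manager struct {
	// resource name -> set of healthy device IDs; after a restart the
	// set exists but is empty until the plugin re-registers.
	healthyDevices map[string]map[string]bool
	// resource name -> number of devices already allocated to the
	// container, as restored from the checkpoint file.
	allocated map[string]int
}

// devicesToAllocate returns how many additional devices are needed.
// The registration and health checks run first, so a container whose
// devices were pre-allocated still fails while no healthy device exists.
func (m *manager) devicesToAllocate(resource string, required int) (int, error) {
	devs, registered := m.healthyDevices[resource]
	if !registered {
		return 0, fmt.Errorf("resource %q is not registered", resource)
	}
	if len(devs) == 0 {
		return 0, fmt.Errorf("no healthy devices present for %q", resource)
	}

	// needed = required - devices already allocated to the container.
	needed := required - m.allocated[resource]
	if needed == 0 {
		// Nothing more to allocate, and the checks above have passed.
		return 0, nil
	}
	return needed, nil
}

func main() {
	// State right after a restart: the resource is known from the
	// checkpoint, two devices are pre-allocated, none is healthy yet.
	m := &manager{
		healthyDevices: map[string]map[string]bool{"example.com/gpu": {}},
		allocated:      map[string]int{"example.com/gpu": 2},
	}

	_, err := m.devicesToAllocate("example.com/gpu", 2)
	fmt.Println("before re-registration:", err)

	// The plugin re-registers and reports a healthy device.
	m.healthyDevices["example.com/gpu"]["dev0"] = true
	needed, err := m.devicesToAllocate("example.com/gpu", 2)
	fmt.Println("after re-registration:", needed, err)
}
```

If the `needed == 0` return came first, the first call in `main` would
succeed even though the plugin had not yet re-registered, which is the
situation the commit message sets out to avoid.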

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 11:52:23 +00:00
api In-place Pod Vertical Scaling - API changes 2023-02-24 17:18:04 +00:00
apis Merge pull request #115463 from SergeyKanzhelev/containerStatusDocs 2023-03-03 20:17:06 -08:00
auth
capabilities
client delete unused functions in pkg directory 2023-01-16 21:43:36 +08:00
cloudprovider archived design proposals are now moved to Design Proposals Archive Repo. 2023-02-08 11:12:22 +08:00
cluster/ports e2e_node/{service,util}: use kubelet healthz port. 2022-04-22 16:14:31 -07:00
controller Merge pull request #113270 from rrangith/fix/create-pvc-for-pending-pod 2023-03-03 10:24:58 -08:00
controlplane update lease controller 2023-03-02 15:06:00 +01:00
credentialprovider delete unused functions in pkg directory 2023-01-16 21:43:36 +08:00
features GRPCContainerProbe is GA 2023-03-02 22:07:59 +00:00
fieldpath Improved FormatMap: Improves performance by about 4x, or nearly 2x in the worst case (#112661) 2023-03-01 22:26:55 -08:00
generated Merge pull request #115463 from SergeyKanzhelev/containerStatusDocs 2023-03-03 20:17:06 -08:00
kubeapiserver authenticator config: use static CA reader for OIDC CA 2023-02-14 13:43:58 +01:00
kubectl Refactor to simplify factory Validator 2022-12-11 18:20:28 -08:00
kubelet node: device-mgr: Handle recovery by checking if healthy devices exist 2023-03-06 11:52:23 +00:00
kubemark Merge pull request #114725 from danwinship/kube-proxy-startup-cleanup 2023-01-05 13:57:59 -08:00
printers Merge pull request #114759 from my-git9/chore/k8staint 2023-01-31 21:01:17 -08:00
probe Document risk of HTTP response body in probe failure msg 2023-02-09 16:37:32 -08:00
proxy proxier: track metrics before conntrack cleaning 2023-03-02 20:56:05 +05:30
quota/v1 In-place Pod Vertical Scaling - API changes 2023-02-24 17:18:04 +00:00
registry update documentation on generateSelector for manual selector case 2023-03-02 19:47:58 +00:00
routes unittests: Fixes unit tests for Windows (part 3) 2022-10-21 19:25:48 +03:00
scheduler Merge pull request #102884 from vinaykul/restart-free-pod-vertical-scaling 2023-02-27 22:53:15 -08:00
security changes in NewValidator 2023-02-21 13:02:30 +05:30
securitycontext Merge pull request #112037 from mingweishih/update_default_proc_mount 2023-02-14 23:28:24 -08:00
serviceaccount handle new error where sa jwt issued in the future 2023-03-02 03:15:13 +01:00
util Merge pull request #115527 from sondinht/ipvs_sh 2023-02-14 04:25:30 -08:00
volume Remove check for CSI driver running on node for CSI migration attach operations 2023-02-09 02:45:02 +00:00
windows/service Fix typo at pkg/windows/service/service.go:94 2022-03-24 07:25:33 -04:00
.import-restrictions
OWNERS Move root approvers to subdirs 2022-10-10 13:43:03 -04:00