Commit Graph

73 Commits

Author SHA1 Message Date
Kevin Klues
26cb650655 Remove unnecessary union after call to GetPreferredAllocation()
There is no need to try and allocate already-allocated devices again.
2020-07-07 06:35:57 +00:00
Kevin Klues
67ecc11c44 Harden callGetPreferredAllocationIfAvailable() return value
Previously, we didn't check the contents of the result after calling out
to the plugin endpoint. This could have resulted in errors if the plugin
returned either 'nil' or an empty result. This patch fixes this.
2020-07-07 06:35:57 +00:00
Kevin Klues
d87365494a Fix bug in call to callGetPreferredAllocationIfAvailable()
Previously, we were passing the variable 'devices' to this function,
when we should have been passing 'allocated'. This bug crept in due to a
variable name change that didn't propogate its way through the entire
function. The tests added in the previous commit would have caught this.
2020-07-07 06:35:57 +00:00
Kevin Klues
a780ccff5b Updates logic in devicesToAllocate() to call GetPreferredAllocation() 2020-07-02 22:07:27 +00:00
Kevin Klues
bb56a09133 Add callGetPreferredAllocationIfAvailable() function in devicemanager
This function mimics what is already done for the conditional call to
PreStartContainer() via the callPreStartContainerIfNeeded() function.
2020-07-02 22:07:27 +00:00
Kevin Klues
c45f1317eb Fix some whitespacing and comments in devicemanager 2020-07-02 15:15:44 +00:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Abdullah Gharaibeh
d6522e0e74 rename framework pkg with schedulerframework for all instances under pkg/kubelet 2020-04-14 14:24:07 -04:00
Abdullah Gharaibeh
bed9b2f23b Cleanup obsolete NodeInfo methods 2020-04-12 18:13:46 -04:00
nolancon
cb9fdc49db Device Manager - Refactor allocatePodResources
- allocatePodResources logic altered to allow for container by container
device allocation.
- New type PodReusableDevices
- New field in devicemanager devicesToReuse
2020-02-27 07:24:34 +00:00
Kevin Klues
0b168f0243 Change devicemanager to implement HintProvider.Allocate()
This change will not work on its own. Higher level code needs to make
sure and call Allocate() before AddContainer is called. This is already
being done in cases when the TopologyManager feature gate is enabled (in
the PodAdmitHandler of the TopologyManager). However, we need to make
sure we add proper logic to call it in cases when the TopologyManager
feature gate is disabled.
2020-02-10 03:27:47 +00:00
Kevin Klues
a3f099ea4d Split devicemanager Allocate into two functions
Instead of having a single call for Allocate(), we now split this into two
functions Allocate() and UpdatePluginResources().

The semantics split across them:

// Allocate configures and assigns devices to a pod. From the requested
// device resources, Allocate will communicate with the owning device
// plugin to allow setup procedures to take place, and for the device
// plugin to provide runtime settings to use the device (environment
// variables, mount points and device files).
Allocate(pod *v1.Pod) error

// UpdatePluginResources updates node resources based on devices already
// allocated to pods. The node object is provided for the device manager to
// update the node capacity to reflect the currently available devices.
UpdatePluginResources(
    node *schedulernodeinfo.NodeInfo,
    attrs *lifecycle.PodAdmitAttributes) error

As we move to a model in which the TopologyManager is able to ensure
aligned allocations from the CPUManager, devicemanger, and any
other TopologManager HintProviders in the same synchronous loop, we will
need to be able to call Allocate() independently from an
UpdatePluginResources(). This commit makes that possible.
2020-02-10 03:27:47 +00:00
Takeaki Matsumoto
785fac6826 Make updateAllocatedDevices() as a public method and call it in
podresources api
2020-02-07 13:26:56 +09:00
danielqsj
1a9b121764 remove deprecated metrics of kubelet 2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot
ad5d4c4705
Merge pull request #85706 from yutedz/per-node-dev
Remove nodes slice in loop of takeByTopology
2019-12-05 13:50:30 -08:00
Kubernetes Prow Robot
57b6b287d4
Merge pull request #85688 from yutedz/pods-to-rm
Reduce unnecessary Set in updateAllocatedDevices
2019-12-02 17:07:26 -08:00
Ted Yu
84a9803741 Log error when writing checkpoint fails 2019-11-29 19:47:17 -08:00
Ted Yu
6415fa765e Remove nodes slice in loop of takeByTopology 2019-11-29 12:12:22 -08:00
Ted Yu
86f3bc25e1 Reduce unnecessary Set in updateAllocatedDevices 2019-11-27 08:48:06 -08:00
Kubernetes Prow Robot
30e6238795
Merge pull request #85147 from yutedz/devmgr-rm-contents
Continue removing file in ManagerImpl#removeContents
2019-11-14 16:38:28 -08:00
Ted Yu
fb046f7787 Continue removing file in ManagerImpl#removeContents 2019-11-13 06:00:34 -08:00
Kubernetes Prow Robot
ed10b5b17f
Merge pull request #85047 from yutedz/dev-mgr-err-handling
Handle error return from allocatePodResources
2019-11-12 11:51:27 -08:00
Kubernetes Prow Robot
897ce3073c
Merge pull request #84533 from davidz627/fix/deprecatedPath
Remove plugin watching of deprecated directory and CSI v0 support in accordance with deprecation policy
2019-11-12 04:48:20 -08:00
David Zhu
802fe12803 Remove plugin watching of deprecated directory {kubelet_root_dir}/plugins and support for CSI V0 in accordance with deprecation announcement in https://v1-13.docs.kubernetes.io/docs/setup/release/notes/ 2019-11-11 11:42:58 -08:00
Ted Yu
db0f616974 Handle error return from allocatePodResources 2019-11-09 16:25:15 -08:00
yanghaichao12
5cbafba457 change directory permissions from 0755 to 0750 2019-11-04 17:04:37 +08:00
Kubernetes Prow Robot
3db6d3abcf
Merge pull request #83551 from dims/move-external-facing-kubelet-apis-to-staging
Move external facing kubelet apis to staging
2019-10-10 13:41:36 -07:00
Davanum Srinivas
f29d2272c8
fix gofmt and golint failures
Change-Id: I6535b506f50558b31663a13cd270b15023afa2c6
2019-10-06 18:43:17 -04:00
Davanum Srinivas
d30c489c54
Move pkg/kubelet/pluginregistration and deviceplugin
Change-Id: I06adcb43bd278b430ffad2010869e1524c8cc4ff
2019-10-06 15:28:38 -04:00
tanjunchen
de3cf23414 remove the repeat word in documents 2019-10-06 23:32:01 +08:00
Connor Doyle
e35301c19f Rename package socketmask to bitmask.
- As discussed in reviews and other public channels,
  this abstraction is used to represent numa nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
2019-09-23 17:08:45 -07:00
Kevin Klues
cc567afaf0 Consume TopologyHints in the devicemanager 2019-08-29 08:22:50 -05:00
Louise Daly
9a118ceac4 Added stub support for Topology Manager to Device Manager
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-29 07:45:43 -05:00
Tim Allclair
8a495cb5e4 Clean up error messages (ST1005) 2019-08-21 10:40:21 -07:00
Tara Gu
5e18554442 Implement plugin manager - a controller that manages plugin registration/unregistration 2019-05-30 19:00:59 -04:00
Richard Chen
c9f1b57b5b Reset extended resources only when node is recreated. 2019-05-21 14:16:54 -07:00
Kubernetes Prow Robot
e476a60ccb
Merge pull request #73241 from vikaschoudhary16/selinux-label
Add correct selinux label at plugin socket directory
2019-05-20 11:07:17 -07:00
vikaschoudhary16
58d1b4d564 Add correct selinux label at plugin socket directory 2019-05-18 12:35:17 +05:30
danielqsj
79a3eb816c rename latency to duration in metrics 2019-02-18 17:40:04 +08:00
danielqsj
9fd99a48f5 Change kubelet metrics to conform guideline 2019-02-18 14:01:58 +08:00
Jiaying Zhang
00b88c14b0 Checks whether we have cached runtime state before starting a container
that requests any device plugin resource. If not, re-issue Allocate
grpc calls. This allows us to handle the edge case that a pod got
assigned to a node even before it populates its extended resource
capacity.
2019-02-07 11:12:36 -08:00
Kubernetes Prow Robot
03b434c9d4
Merge pull request #58122 from tianshapjq/nit-int-is-enough
Len() is already int
2019-02-03 12:02:24 -08:00
Kubernetes Prow Robot
d88994cf9f
Merge pull request #71306 from ping035627/k8s-181121
fix some typos
2019-01-09 09:06:31 -08:00
yuexiao-wang
f3353c358d [scheduler cleanup phase 2]: Rename to
Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>
2018-12-11 11:21:12 +08:00
saad-ali
a7c5582bba Permit use of deprecated dir in device plugin. 2018-11-21 18:37:31 -08:00
saad-ali
8f666d9e41 Modify kubelet watcher to support old versions
Modify kubelet plugin watcher to support older CSI drivers that use an
the old plugins directory for socket registration.
Also modify CSI plugin registration to support multiple versions of CSI
registering with the same name.
2018-11-21 18:37:31 -08:00
PingWang
9d541911bb fix some typos
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

fix typo

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2018-11-22 08:27:14 +08:00
David Ashpole
630cb53f82 add kubelet grpc server for pod-resources service 2018-11-15 09:43:20 -08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
Renaud Gaubert
8dd1d27c03 Updated the device manager pluginwatcher handler 2018-09-06 15:34:46 +02:00