kubernetes

Author	SHA1	Message	Date
Abdullah Gharaibeh	d6522e0e74	rename framework pkg with schedulerframework for all instances under pkg/kubelet	2020-04-14 14:24:07 -04:00
Abdullah Gharaibeh	bed9b2f23b	Cleanup obsolete NodeInfo methods	2020-04-12 18:13:46 -04:00
nolancon	cb9fdc49db	Device Manager - Refactor allocatePodResources - allocatePodResources logic altered to allow for container by container device allocation. - New type PodReusableDevices - New field in devicemanager devicesToReuse	2020-02-27 07:24:34 +00:00
Kevin Klues	0b168f0243	Change devicemanager to implement HintProvider.Allocate() This change will not work on its own. Higher level code needs to make sure and call Allocate() before AddContainer is called. This is already being done in cases when the TopologyManager feature gate is enabled (in the PodAdmitHandler of the TopologyManager). However, we need to make sure we add proper logic to call it in cases when the TopologyManager feature gate is disabled.	2020-02-10 03:27:47 +00:00
Kevin Klues	a3f099ea4d	Split devicemanager Allocate into two functions Instead of having a single call for Allocate(), we now split this into two functions Allocate() and UpdatePluginResources(). The semantics split across them: // Allocate configures and assigns devices to a pod. From the requested // device resources, Allocate will communicate with the owning device // plugin to allow setup procedures to take place, and for the device // plugin to provide runtime settings to use the device (environment // variables, mount points and device files). Allocate(pod v1.Pod) error // UpdatePluginResources updates node resources based on devices already // allocated to pods. The node object is provided for the device manager to // update the node capacity to reflect the currently available devices. UpdatePluginResources( node schedulernodeinfo.NodeInfo, attrs *lifecycle.PodAdmitAttributes) error As we move to a model in which the TopologyManager is able to ensure aligned allocations from the CPUManager, devicemanger, and any other TopologManager HintProviders in the same synchronous loop, we will need to be able to call Allocate() independently from an UpdatePluginResources(). This commit makes that possible.	2020-02-10 03:27:47 +00:00
Takeaki Matsumoto	785fac6826	Make updateAllocatedDevices() as a public method and call it in podresources api	2020-02-07 13:26:56 +09:00
danielqsj	1a9b121764	remove deprecated metrics of kubelet	2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot	ad5d4c4705	Merge pull request #85706 from yutedz/per-node-dev Remove nodes slice in loop of takeByTopology	2019-12-05 13:50:30 -08:00
Kubernetes Prow Robot	57b6b287d4	Merge pull request #85688 from yutedz/pods-to-rm Reduce unnecessary Set in updateAllocatedDevices	2019-12-02 17:07:26 -08:00
Ted Yu	84a9803741	Log error when writing checkpoint fails	2019-11-29 19:47:17 -08:00
Ted Yu	6415fa765e	Remove nodes slice in loop of takeByTopology	2019-11-29 12:12:22 -08:00
Ted Yu	86f3bc25e1	Reduce unnecessary Set in updateAllocatedDevices	2019-11-27 08:48:06 -08:00
Kubernetes Prow Robot	30e6238795	Merge pull request #85147 from yutedz/devmgr-rm-contents Continue removing file in ManagerImpl#removeContents	2019-11-14 16:38:28 -08:00
Ted Yu	fb046f7787	Continue removing file in ManagerImpl#removeContents	2019-11-13 06:00:34 -08:00
Kubernetes Prow Robot	ed10b5b17f	Merge pull request #85047 from yutedz/dev-mgr-err-handling Handle error return from allocatePodResources	2019-11-12 11:51:27 -08:00
Kubernetes Prow Robot	897ce3073c	Merge pull request #84533 from davidz627/fix/deprecatedPath Remove plugin watching of deprecated directory and CSI v0 support in accordance with deprecation policy	2019-11-12 04:48:20 -08:00
David Zhu	802fe12803	Remove plugin watching of deprecated directory {kubelet_root_dir}/plugins and support for CSI V0 in accordance with deprecation announcement in https://v1-13.docs.kubernetes.io/docs/setup/release/notes/	2019-11-11 11:42:58 -08:00
Ted Yu	db0f616974	Handle error return from allocatePodResources	2019-11-09 16:25:15 -08:00
yanghaichao12	5cbafba457	change directory permissions from 0755 to 0750	2019-11-04 17:04:37 +08:00
Kubernetes Prow Robot	3db6d3abcf	Merge pull request #83551 from dims/move-external-facing-kubelet-apis-to-staging Move external facing kubelet apis to staging	2019-10-10 13:41:36 -07:00
Davanum Srinivas	f29d2272c8	fix gofmt and golint failures Change-Id: I6535b506f50558b31663a13cd270b15023afa2c6	2019-10-06 18:43:17 -04:00
Davanum Srinivas	d30c489c54	Move pkg/kubelet/pluginregistration and deviceplugin Change-Id: I06adcb43bd278b430ffad2010869e1524c8cc4ff	2019-10-06 15:28:38 -04:00
tanjunchen	de3cf23414	remove the repeat word in documents	2019-10-06 23:32:01 +08:00
Connor Doyle	e35301c19f	Rename package socketmask to bitmask. - As discussed in reviews and other public channels, this abstraction is used to represent numa nodes, not sockets. - There is nothing inherently related to sockets in this package anyway.	2019-09-23 17:08:45 -07:00
Kevin Klues	cc567afaf0	Consume TopologyHints in the devicemanager	2019-08-29 08:22:50 -05:00
Louise Daly	9a118ceac4	Added stub support for Topology Manager to Device Manager Co-authored-by: Conor Nolan <conor.nolan@intel.com> Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com> Co-authored-by: Kevin Klues <kklues@nvidia.com>	2019-08-29 07:45:43 -05:00
Tim Allclair	8a495cb5e4	Clean up error messages (ST1005)	2019-08-21 10:40:21 -07:00
Tara Gu	5e18554442	Implement plugin manager - a controller that manages plugin registration/unregistration	2019-05-30 19:00:59 -04:00
Richard Chen	c9f1b57b5b	Reset extended resources only when node is recreated.	2019-05-21 14:16:54 -07:00
Kubernetes Prow Robot	e476a60ccb	Merge pull request #73241 from vikaschoudhary16/selinux-label Add correct selinux label at plugin socket directory	2019-05-20 11:07:17 -07:00
vikaschoudhary16	58d1b4d564	Add correct selinux label at plugin socket directory	2019-05-18 12:35:17 +05:30
danielqsj	79a3eb816c	rename latency to duration in metrics	2019-02-18 17:40:04 +08:00
danielqsj	9fd99a48f5	Change kubelet metrics to conform guideline	2019-02-18 14:01:58 +08:00
Jiaying Zhang	00b88c14b0	Checks whether we have cached runtime state before starting a container that requests any device plugin resource. If not, re-issue Allocate grpc calls. This allows us to handle the edge case that a pod got assigned to a node even before it populates its extended resource capacity.	2019-02-07 11:12:36 -08:00
Kubernetes Prow Robot	03b434c9d4	Merge pull request #58122 from tianshapjq/nit-int-is-enough Len() is already int	2019-02-03 12:02:24 -08:00
Kubernetes Prow Robot	d88994cf9f	Merge pull request #71306 from ping035627/k8s-181121 fix some typos	2019-01-09 09:06:31 -08:00
yuexiao-wang	f3353c358d	[scheduler cleanup phase 2]: Rename to Signed-off-by: yuexiao-wang <wang.yuexiao@zte.com.cn>	2018-12-11 11:21:12 +08:00
saad-ali	a7c5582bba	Permit use of deprecated dir in device plugin.	2018-11-21 18:37:31 -08:00
saad-ali	8f666d9e41	Modify kubelet watcher to support old versions Modify kubelet plugin watcher to support older CSI drivers that use an the old plugins directory for socket registration. Also modify CSI plugin registration to support multiple versions of CSI registering with the same name.	2018-11-21 18:37:31 -08:00
PingWang	9d541911bb	fix some typos Signed-off-by: PingWang <wang.ping5@zte.com.cn> fix typo Signed-off-by: PingWang <wang.ping5@zte.com.cn>	2018-11-22 08:27:14 +08:00
David Ashpole	630cb53f82	add kubelet grpc server for pod-resources service	2018-11-15 09:43:20 -08:00
Davanum Srinivas	954996e231	Move from glog to klog - Move from the old github.com/golang/glog to k8s.io/klog - klog as explicit InitFlags() so we add them as necessary - we update the other repositories that we vendor that made a similar change from glog to klog * github.com/kubernetes/repo-infra * k8s.io/gengo/ * k8s.io/kube-openapi/ * github.com/google/cadvisor - Entirely remove all references to glog - Fix some tests by explicit InitFlags in their init() methods Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135	2018-11-10 07:50:31 -05:00
Renaud Gaubert	8dd1d27c03	Updated the device manager pluginwatcher handler	2018-09-06 15:34:46 +02:00
Kubernetes Submit Queue	d017bebf6b	Merge pull request #67145 from jiayingz/reboot-fix Automatic merge from submit-queue. Fail container start if its requested device plugin resource is unknown. With the change, Kubelet device manager now checks whether it has cached option state for the requested device plugin resource to make sure the resource is in ready state when we start the container. Fixes https://github.com/kubernetes/kubernetes/issues/67107 Release note: Fail container start if its requested device plugin resource hasn't registered after Kubelet restart.	2018-08-21 01:48:54 -07:00
tianshapjq	81081dc9e7	nits in manager.go	2018-08-15 08:16:04 +08:00
Jiaying Zhang	7b1ae66432	Fail container start if its requested device plugin resource doesn't have cached option state to make sure the device plugin resource is in ready state when we start the container.	2018-08-08 13:11:36 -07:00
hui luo	7101c17498	While reviewing devicemanager code, found the caching layer on endpoint is redundant. Here are the 3 related objects in picture: devicemanager <-> endpoint <-> plugin Plugin is the source of truth for devices and device health status. devicemanager maintain healthyDevices, unhealthyDevices, allocatedDevices based on updates from plugin. So there is no point for endpoint caching devices, this patch is removing this caching layer on endpoint, Also removing the Manager.Devices() since i didn't find any caller of this other than test, i am adding a notification channel to facilitate testing, If we need to get all devices from manager in future, it just need to return healthyDevices + unhealthyDevices, we don't have to call endpoint after all. This patch makes code more readable, data model been simplified.	2018-07-29 21:07:14 -07:00
Kubernetes Submit Queue	32e38b6659	Merge pull request #58755 from vikaschoudhary16/probing-mode Automatic merge from submit-queue (batch tested with PRs 58755, 66414). Use probe based plugin watcher mechanism in Device Manager What this PR does / why we need it: Uses this probe based utility in the device plugin manager. Fixes #56944 Notes For Reviewers: Changes are backward compatible and existing device plugins will continue to work. At the same time, any new plugins that has required support for probing model (Identity service implementation), will also work. Release note: Add support kubelet plugin watcher in device manager.	2018-07-27 15:20:06 -07:00
bingshen.wbs	b1bdd043c4	fix kubelet npe on device plugin return zero container Signed-off-by: bingshen.wbs <bingshen.wbs@alibaba-inc.com>	2018-07-25 10:15:30 +08:00
vikaschoudhary16	a5842503eb	Use probe based plugin discovery mechanism in device manager	2018-07-17 04:02:31 -04:00

66 Commits