Abdullah Gharaibeh
d6522e0e74
rename framework pkg with schedulerframework for all instances under pkg/kubelet
2020-04-14 14:24:07 -04:00
Abdullah Gharaibeh
bed9b2f23b
Cleanup obsolete NodeInfo methods
2020-04-12 18:13:46 -04:00
nolancon
a9c6129577
Device Manager - Update unit tests
...
- Pass container to Allocate().
- Loop through containers to call Allocate() on a container-by-container
basis.
2020-02-27 07:24:34 +00:00
nolancon
cb9fdc49db
Device Manager - Refactor allocatePodResources
...
- allocatePodResources logic altered to allow for container-by-container
device allocation.
- New type PodReusableDevices
- New field in devicemanager devicesToReuse
2020-02-27 07:24:34 +00:00
Kevin Klues
0b168f0243
Change devicemanager to implement HintProvider.Allocate()
...
This change will not work on its own. Higher level code needs to make
sure to call Allocate() before AddContainer is called. This is already
being done in cases when the TopologyManager feature gate is enabled (in
the PodAdmitHandler of the TopologyManager). However, we need to make
sure we add proper logic to call it in cases when the TopologyManager
feature gate is disabled.
2020-02-10 03:27:47 +00:00
Kevin Klues
a3f099ea4d
Split devicemanager Allocate into two functions
...
Instead of having a single call for Allocate(), we now split this into two
functions Allocate() and UpdatePluginResources().
The semantics are split across them as follows:
    // Allocate configures and assigns devices to a pod. From the requested
    // device resources, Allocate will communicate with the owning device
    // plugin to allow setup procedures to take place, and for the device
    // plugin to provide runtime settings to use the device (environment
    // variables, mount points and device files).
    Allocate(pod *v1.Pod) error
    // UpdatePluginResources updates node resources based on devices already
    // allocated to pods. The node object is provided for the device manager to
    // update the node capacity to reflect the currently available devices.
    UpdatePluginResources(
        node *schedulernodeinfo.NodeInfo,
        attrs *lifecycle.PodAdmitAttributes) error
As we move to a model in which the TopologyManager is able to ensure
aligned allocations from the CPUManager, devicemanager, and any
other TopologyManager HintProviders in the same synchronous loop, we will
need to be able to call Allocate() independently of
UpdatePluginResources(). This commit makes that possible.
2020-02-10 03:27:47 +00:00
Takeaki Matsumoto
785fac6826
Make updateAllocatedDevices() a public method and call it in
...
podresources api
2020-02-07 13:26:56 +09:00
Kevin Klues
bc686ea27b
Update TopologyManager.GetTopologyHints() to take pointers
...
Previously, this function took full Pod and Container objects
unnecessarily. This commit updates it to take pointers
instead.
2020-02-03 17:13:28 +00:00
danielqsj
1a9b121764
remove deprecated metrics of kubelet
2020-01-10 16:46:52 +08:00
danielqsj
19fe9f8d94
replace grpc.WithDialer which is deprecated
2019-12-26 17:46:59 +08:00
Kubernetes Prow Robot
ad5d4c4705
Merge pull request #85706 from yutedz/per-node-dev
...
Remove nodes slice in loop of takeByTopology
2019-12-05 13:50:30 -08:00
Kubernetes Prow Robot
57b6b287d4
Merge pull request #85688 from yutedz/pods-to-rm
...
Reduce unnecessary Set in updateAllocatedDevices
2019-12-02 17:07:26 -08:00
Ted Yu
84a9803741
Log error when writing checkpoint fails
2019-11-29 19:47:17 -08:00
Ted Yu
6415fa765e
Remove nodes slice in loop of takeByTopology
2019-11-29 12:12:22 -08:00
Ted Yu
86f3bc25e1
Reduce unnecessary Set in updateAllocatedDevices
2019-11-27 08:48:06 -08:00
Kubernetes Prow Robot
30e6238795
Merge pull request #85147 from yutedz/devmgr-rm-contents
...
Continue removing file in ManagerImpl#removeContents
2019-11-14 16:38:28 -08:00
Ted Yu
fb046f7787
Continue removing file in ManagerImpl#removeContents
2019-11-13 06:00:34 -08:00
Kubernetes Prow Robot
ed10b5b17f
Merge pull request #85047 from yutedz/dev-mgr-err-handling
...
Handle error return from allocatePodResources
2019-11-12 11:51:27 -08:00
Kubernetes Prow Robot
897ce3073c
Merge pull request #84533 from davidz627/fix/deprecatedPath
...
Remove plugin watching of deprecated directory and CSI v0 support in accordance with deprecation policy
2019-11-12 04:48:20 -08:00
David Zhu
802fe12803
Remove plugin watching of deprecated directory {kubelet_root_dir}/plugins and support for CSI V0 in accordance with deprecation announcement in https://v1-13.docs.kubernetes.io/docs/setup/release/notes/
2019-11-11 11:42:58 -08:00
Ted Yu
db0f616974
Handle error return from allocatePodResources
2019-11-09 16:25:15 -08:00
Kubernetes Prow Robot
08e5781b41
Merge pull request #84525 from klueska/upstream-fix-hint-generation-after-kubelet-restart
...
Fix bug in TopologyManager hint generation after kubelet restart
2019-11-06 15:33:50 -08:00
Kevin Klues
4d4d4bdd61
Ensure devicemanager TopologyHints are regenerated after kubelet restart
...
This patch also includes a test to make sure the newly added logic works
as expected.
2019-11-06 15:01:34 +00:00
Kubernetes Prow Robot
0c0408c790
Merge pull request #76407 from yanghaichao12/dev0411
...
change directory permissions from 0755 to 0750
2019-11-05 19:30:59 -08:00
Kevin Klues
a338c8f7fd
Add some more comments to GetTopologyHints() in the devicemanager
2019-11-05 13:06:23 +00:00
Kevin Klues
58f3554ebe
Sync all CPU and device state before generating TopologyHints for them
...
This ensures that we have the most up-to-date state when generating
topology hints for a container. Without this, it's possible that some
resources will be seen as allocated, when they are actually free.
2019-11-05 13:00:20 +00:00
Adrian Chiris
b17706b149
Added LessThan() and IsEqual() methods for TopologyHints
2019-11-04 18:43:07 +01:00
yanghaichao12
5cbafba457
change directory permissions from 0755 to 0750
2019-11-04 17:04:37 +08:00
Kubernetes Prow Robot
3db6d3abcf
Merge pull request #83551 from dims/move-external-facing-kubelet-apis-to-staging
...
Move external facing kubelet apis to staging
2019-10-10 13:41:36 -07:00
Davanum Srinivas
f29d2272c8
fix gofmt and golint failures
...
Change-Id: I6535b506f50558b31663a13cd270b15023afa2c6
2019-10-06 18:43:17 -04:00
Kubernetes Prow Robot
48b90db9c3
Merge pull request #83495 from tanjunchen/fix-typo
...
remove the repeat word in documents
2019-10-06 15:05:08 -07:00
Davanum Srinivas
6ecc0f83af
update bazel BUILD files
...
Change-Id: Ia3917cec1453c0b22a958faf8c22bccd79242d14
2019-10-06 15:29:23 -04:00
Davanum Srinivas
d30c489c54
Move pkg/kubelet/pluginregistration and deviceplugin
...
Change-Id: I06adcb43bd278b430ffad2010869e1524c8cc4ff
2019-10-06 15:28:38 -04:00
tanjunchen
de3cf23414
remove the repeat word in documents
2019-10-06 23:32:01 +08:00
Kevin Klues
d2b53af7d7
Add klueska as reviewer for CPUManager and devicemanager
2019-10-03 13:01:41 -07:00
Connor Doyle
e35301c19f
Rename package socketmask to bitmask.
...
- As discussed in reviews and other public channels,
this abstraction is used to represent NUMA nodes, not sockets.
- There is nothing inherently related to sockets in this package anyway.
2019-09-23 17:08:45 -07:00
Kevin Klues
eb0216e54e
Update semantics to set Preferred field in TopologyHint generation
...
We now only set Preferred to true if resources can be allocated with a
size equal to the minimum _possible_ mask when all resources are
available.
2019-08-29 14:32:10 -05:00
Kevin Klues
dcc9f66311
Add devicemanager tests for TopologyHint consumption
2019-08-29 08:22:50 -05:00
Kevin Klues
cc567afaf0
Consume TopologyHints in the devicemanager
2019-08-29 08:22:50 -05:00
Kevin Klues
a3320f80d9
Add devicemanager tests for TopologyHint generation
2019-08-29 07:45:43 -05:00
Kevin Klues
d3d7a8f5d4
Generate TopologyHints from the devicemanager
2019-08-29 07:45:43 -05:00
Louise Daly
9a118ceac4
Added stub support for Topology Manager to Device Manager
...
Co-authored-by: Conor Nolan <conor.nolan@intel.com>
Co-authored-by: Sreemanti Ghosh <sreemanti.ghosh@intel.com>
Co-authored-by: Kevin Klues <kklues@nvidia.com>
2019-08-29 07:45:43 -05:00
Tim Allclair
a2c51674cf
Cleanup more static check issues (S1*,ST*)
2019-08-21 10:40:21 -07:00
Tim Allclair
8a495cb5e4
Clean up error messages (ST1005)
2019-08-21 10:40:21 -07:00
Tim Allclair
6510d26b6a
Fix misc static check issues
2019-08-21 10:40:21 -07:00
Tara Gu
5e18554442
Implement plugin manager - a controller that manages plugin registration/unregistration
2019-05-30 19:00:59 -04:00
Richard Chen
c9f1b57b5b
Reset extended resources only when node is recreated.
2019-05-21 14:16:54 -07:00
Kubernetes Prow Robot
e476a60ccb
Merge pull request #73241 from vikaschoudhary16/selinux-label
...
Add correct selinux label at plugin socket directory
2019-05-20 11:07:17 -07:00
vikaschoudhary16
58d1b4d564
Add correct selinux label at plugin socket directory
2019-05-18 12:35:17 +05:30
danielqsj
79a3eb816c
rename latency to duration in metrics
2019-02-18 17:40:04 +08:00