kubernetes

Author	SHA1	Message	Date
chaunceyjiang	d2b372e029	Remove redundant type conversion Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2022-10-18 14:37:40 +08:00
Vinay Kulkarni	0ef263c3b0	CRI changes to support implementation of in-place pod resize. KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources	2022-08-02 15:08:25 -07:00
DingShujie	fb3636da40	When the cpu manager policy is set to none, nothing removes the container ID from the container map, leading to a memory leak	2022-03-30 23:25:05 +08:00
Sascha Grunert	de37b9d293	Make CRI `v1` the default and allow a fallback to `v1alpha2` This patch makes the CRI `v1` API the new project-wide default version. To allow backwards compatibility, a fallback to `v1alpha2` has been added as well. This fallback can be determined automatically by the kubelet. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-11-17 11:05:05 -08:00
Alexey Perevalov	5d9032007a	Return only isolated cpus in podresources interface Co-Authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2021-10-07 15:34:08 +01:00
Artyom Lukianov	66babd1a90	cpu manager: do not clean admitted pods from the state Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-08-08 16:46:06 +03:00
Francesco Romani	23abdab2b7	smtalign: propagate policy options to policies Consume in the static policy the cpu manager policy options from the cpumanager instance. Validate in the none policy if any option is given, and fail if so - this is almost surely a configuration mistake. Add new cpumanager.Options type to hold the options and translate from user arguments to flags. Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:37 +02:00
Francesco Romani	c5cb263dcf	smtalign: propagate policy options to cpumanager The CPUManagerPolicyOptions received from the kubelet config/command line args are propagated to the Container Manager. We defer the consumption of the options to a later patch(set). Co-authored-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-07-08 23:15:35 +02:00
Rishabh Jain	8f08db9164	Change log level to Debug	2021-06-24 14:23:06 +05:30
Kevin Klues	6646039481	Add logic to only call Update() if state different than last Update() Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-05-06 23:38:08 +02:00
pacoxu	8d24c8d0ab	update structured log for cpumanager/cpu_manager.go	2021-03-16 09:40:53 +08:00
Francesco Romani	6d33354e4c	node: podresources: implement GetAllocatableResources API Extend the podresources API implementing the GetAllocatableResources endpoint, as specified in the KEPs: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2043-pod-resource-concrete-assigments https://github.com/kubernetes/enhancements/pull/2404 Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:13:36 +01:00
Kubernetes Prow Robot	e6e079aac3	Merge pull request #97748 from heqg/collides-state Fix variable 'state' collides with imported package name	2021-01-28 17:51:40 -08:00
Artyom Lukianov	38dc7509f8	cpu manager: specify the container CPU set during the creation We can set the container cpuset.cpus during the creation, so there is no need to call update resources after the container creation. An additional side effect of the change is that the runc process that is responsible for creating the container will run with the same CPU affinity, because runc runs on the cpuset provided in the config.json arg. This helps prevent undesirable interrupts on isolated CPUs. Signed-off-by: Artyom Lukianov <alukiano@redhat.com>	2021-01-20 17:53:33 +02:00
Artyom Lukianov	60678a24ca	Update CPU manager GetCPUs method to return pointer to CPUSet	2021-01-20 13:21:57 +02:00
he.qingguo	8249cd611d	Fix variable 'state' collides with imported package name Signed-off-by: he.qingguo <he.qingguo@zte.com.cn>	2021-01-06 10:31:47 +08:00
Kevin Klues	2fcbd2206d	Fix bug in CPUManager with race on map access Signed-off-by: Kevin Klues <kklues@nvidia.com>	2020-12-21 19:11:53 +00:00
sw.han	f5997fe537	Add GetPodTopologyHints() interface to Topology/CPU/Device Manager Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>	2020-11-12 12:25:54 +01:00
Alexey Perevalov	a8b8995ef2	Implement TopologyInfo and cpu_ids in podresources It covers deviceplugin & cpumanager. It has a drawback: cpuset and all other structs, including cadvisor's, keep cpu as int, but for a protobuf-based interface it is better to have a fixed-size int. This patch also introduces an additional interface, CPUsProvider, while DeviceProvider might have been extended too. Checkpoint is not covered by unit tests. Signed-off-by: Swati Sehgal <swsehgal@redhat.com> Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-11-11 13:50:49 +03:00
Alexey Perevalov	a047e8aa1b	move to cadvisor.MachineInfo This patch removes GetNUMANodeInfo; cadvisor.MachineInfo will be used instead. GetNUMANodeInfo was introduced because the meaning of MachineInfo.Topology differed between architectures. On ARM it represented NUMA nodes, but on x86 it represented sockets (since it was read from /proc/cpuinfo). Now it is unified and MachineInfo.Topology represents NUMA nodes. Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>	2020-07-24 09:29:41 -04:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Chris Friesen	ab5870d808	Fix exclusive CPU allocations being deleted at container restart The expectation is that exclusive CPU allocations happen at pod creation time. When a container restarts, it should not have its exclusive CPU allocations removed, and it should not need to re-allocate CPUs. There are a few places in the current code that look for containers that have exited and call CpuManager.RemoveContainer() to clean up the container. This will end up deleting any exclusive CPU allocations for that container, and if the container restarts within the same pod it will end up using the default cpuset rather than what should be exclusive CPUs. Removing those calls and adding resource cleanup at allocation time should get rid of the problem. Signed-off-by: Chris Friesen <chris.friesen@windriver.com>	2020-04-27 11:36:54 -06:00
Byonggon Chun	a3047672d0	move pkg/kubelet/cm/cpumanager/containermap to pkg/kubelet/cm/containermap for reusing containerMap is used in CPU Manager to store information about all containers in the node. containerMap provides a mapping from (pod, container) -> containerID for all containers in a pod. It is reusable in another component in pkg/kubelet/cm which needs to track changes of all containers in the node. Signed-off-by: Byonggon Chun <bg.chun@samsung.com>	2020-03-14 02:38:51 +09:00
nolancon	467f66580b	CPU Manager - Add check to policy.Allocate() for init containers If the container allocated CPUs is an init container, release those CPUs back into the shared pool for re-allocation to the next container.	2020-02-27 07:24:33 +00:00
nolancon	709989efa2	CPU Manager - Rename policy.AddContainer() to policy.Allocate()	2020-02-27 07:24:33 +00:00
Kevin Klues	91f91858a5	Change CPUManager to implement HintProvider.Allocate() This change will not work on its own. Higher level code needs to make sure and call Allocate() before AddContainer is called. This is already being done in cases when the TopologyManager feature gate is enabled (in the PodAdmitHandler of the TopologyManager). However, we need to make sure we add proper logic to call it in cases when the TopologyManager feature gate is disabled.	2020-02-10 03:27:47 +00:00
Kevin Klues	bc686ea27b	Update TopologyManager.GetTopologyHints() to take pointers Previously, this function was taking full Pod and Container objects unnecessarily. This commit updates this so that they will take pointers instead.	2020-02-03 17:13:28 +00:00
Kubernetes Prow Robot	9822016bf8	Merge pull request #87397 from klueska/upstream-cpu-manager-set-initial-containers Initialize CPUManager containerMap to set of initial containers	2020-01-20 17:39:50 -08:00
Kubernetes Prow Robot	e6b5194ec1	Merge pull request #84300 from klueska/upstream-cpu-manager-reconcile-on-container-state Update logic in `CPUManager` `reconcileState()`	2020-01-20 12:27:37 -08:00
Kevin Klues	bd9d8fa42f	Initialize CPUManager containerMap to set of initial containers A recent change made it so that the CPUManager receives a list of initial containers that exist on the system at startup. This list can be non-empty, for example, after a kubelet restart. This commit ensures that the CPUManager's containerMap structure is initialized with the containers from this list.	2020-01-20 20:42:29 +01:00
Kubernetes Prow Robot	37ee6425ef	Merge pull request #87255 from klueska/upstream-remove-redundant-active-pods-check Remove check for empty activePods list in CPUManager removeStaleState	2020-01-20 09:05:50 -08:00
Kevin Klues	7be9b0fe55	Update comments and error messages in the CPUManager	2020-01-20 15:31:01 +01:00
Kevin Klues	f2acbf6607	Base CPUManager state reconciliation on container state, not pod state	2020-01-20 13:57:30 +00:00
Kevin Klues	f6cf9b8ce9	Move CPUManager Pod Status logic before container loop	2020-01-20 13:57:30 +00:00
Kevin Klues	34b942a41d	Remove check for empty activePods list in CPUManager removeStaleState This check is redundant since we protect this call with a call to `m.sourcesReady.AllReady()` earlier on. Moreover, having this check in place means that we will leave some stale state around in cases where there are actually no active pods in the system and this loop hasn't cleaned them up yet. This can happen, for example, if a pod exits while the kubelet is down for some reason. We see this exact case being triggered in our e2e tests, where a test has been failing since October when this change was first introduced.	2020-01-15 20:09:24 +01:00
whypro	f4bd4e2e96	Return error instead of panic when cpu manager fails to start.	2019-12-19 21:56:23 +08:00
Kevin Klues	f553286156	Pass initial set of runtime containers to the CPUManager at startup The information associated with these containers is used to migrate the CPUManager state from its old format to its new (i.e. keyed off of podUID and containerName instead of containerID).	2019-12-11 23:02:51 +01:00
Kevin Klues	6441e1ef43	Move CPUManager Checkpoint restoration to Start() instead of New()	2019-12-11 23:02:51 +01:00
Kevin Klues	69f8053850	Update top-level CPUManager to adhere to new state semantics For now, we just pass 'nil' as the set of 'initialContainers' for migrating from old state semantics to new ones. In a subsequent commit we will pull this information from higher layers so that we can pass it down at this stage properly.	2019-12-11 23:02:51 +01:00
Kevin Klues	765aae93f8	Move containerMap out of static policy and into top-level CPUManager	2019-12-11 23:02:51 +01:00
Kubernetes Prow Robot	73b2c82b28	Merge pull request #83592 from jianzzha/opt-reserved-cpus added --reserved-cpus kubelet command option	2019-11-06 22:14:42 -08:00
Jianzhu Zhang	89dfd24483	added --reserved-cpus kubelet command option	2019-11-06 07:33:52 -05:00
Kevin Klues	58f3554ebe	Sync all CPU and device state before generating TopologyHints for them This ensures that we have the most up-to-date state when generating topology hints for a container. Without this, it's possible that some resources will be seen as allocated, when they are actually free.	2019-11-05 13:00:20 +00:00
Kevin Klues	d9adf20360	Abstract removeStaleState from reconcileState in CPUManager This will become especially important as we move to a model where exclusive CPUs are assigned at pod admission time rather than at pod creation time. Having this function will allow us to do garbage collection on these CPUs anytime we are about to allocate CPUs to a new set of containers, in addition to reclaiming state periodically in the reconcileState() loop.	2019-11-05 12:45:11 +00:00
Kubernetes Prow Robot	17a57f99d5	Merge pull request #81344 from zouyee/cpm fix cpumanager reconcileState without sourceready	2019-10-30 23:33:36 -07:00
Connor Doyle	389853894d	Delegate topology hint gen to CPU manager policy - The previous implementation depended on a fixed set of policies.	2019-09-27 22:29:02 -07:00
zouyee	594fc0f4b9	fix cpumanager reconcileState without sourceready Signed-off-by: Zou Nengren <zouyee1989@gmail.com>	2019-09-25 10:39:06 +08:00
Kevin Klues	ddfd9ac0ca	Fix bug in CPUManager with setting topology for policies Also add a check in the unit tests to avoid regressions	2019-08-28 17:32:25 -05:00
Kevin Klues	ecc14fe661	Update CPUManager to include NUMANodeID in CPUTopology Unfortunately, the NUMA information is not readily available from cadvisor, so we have to roll the logic to discover it by hand. In the future, we should remove this custom code to use the information provided by cadvisor once it is made available.	2019-08-27 16:51:05 -05:00
Kevin Klues	869962fa48	Cache the discovered topology in the CPUManager instead of MachineInfo	2019-08-27 16:23:07 -05:00

79 Commits