kubernetes

Author	SHA1	Message	Date
Kubernetes Prow Robot	8d5c96fed2	Merge pull request #116093 from swatisehgal/topologymanager-ga-graduation node: topologymgr: Graduate Kubelet Topology Manager to GA	2023-03-08 16:56:06 -08:00
David Porter	9c20cee504	Revert "node: device-mgr: Handle recovery flow by checking if healthy devices exist"	2023-03-07 11:50:52 -08:00
Claudiu Belu	5ba74c81ca	unit tests: Skip flaky tests on Windows Some of the unit tests are currently flaky on Windows. This commit skips them until they are resolved.	2023-03-06 20:46:05 +00:00
Kubernetes Prow Robot	890d39f976	Merge pull request #114640 from swatisehgal/handle-device-mgr-recovery node: device-mgr: Handle recovery flow by checking if healthy devices exist	2023-03-06 07:10:28 -08:00
Kubernetes Prow Robot	68eea2468c	Merge pull request #114572 from huyinhou/fix-concurrent-map-access kubelet/deviceplugin: fix concurrent map iteration and map write	2023-03-06 06:06:29 -08:00
Swati Sehgal	937d330393	node: topologymgr: Remove ResourceAllocator as TM is always enabled With Topology Manager enabled by default, we no longer need `resourceAllocator` as Topology Manager serves as the main PodAdmitHandler completely responsible for admission check based on hints received from the hintProviders and the subsequent allocation of the corresponding resources to a pod as can be seen here: https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/cm/topologymanager/scope.go#L150 With regard to DRA, the passing of `cm.draManager` into resourceAllocator seems redundant as no admission checks (or allocation of resources handled by DRA) are taking place in the `Admit` method of resourceAllocator. DRA has a completely different model to the rest of the resource managers, where a pod is only scheduled on a node once resources are reserved for it. Because of this, admission checks or waiting for resources to be provisioned after the pod has been scheduled on the node are not required. Before making the above change, it was verified that DRA Manager is instantiated in `NewContainerManager`: https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/cm/container_manager_linux.go#L318 Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 12:51:11 +00:00
Swati Sehgal	6a62f0236a	node: topologymgr: trivial internal variable renaming Since Topology manager is graduating to GA, we remove internal configuration variable names with `Experimental` prefix. There is no expected change in behavior, only trivial variable renaming. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 12:51:11 +00:00
Swati Sehgal	d536a342b4	node: topologymgr: GA graduation implies Feature Gate is ON by default Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 12:51:05 +00:00
Swati Sehgal	5b2a3dbbdc	node: device-mgr: explicitly check if pre-allocated devices are healthy Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 11:52:23 +00:00
Swati Sehgal	a799ffb571	node: device-mgr: unit-tests: admission failure due to unhealthy devices Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 11:52:23 +00:00
Swati Sehgal	7ac399c205	node: device-mgr: Handle recovery by checking if healthy devices exist In case of node reboot/kubelet restart, the flow of events involves obtaining the state from the checkpoint file followed by setting the `healthyDevices`/`unhealthyDevices` to their zero value. This is done to allow the device plugin to re-register itself so that capacity can be updated appropriately. During the allocation phase, we need to check if the resources requested by the pod have been registered AND healthy devices are present on the node to be allocated. Also we need to move this check above `needed==0` where needed is required - devices allocated to the container (which is obtained from the checkpoint file) because even in cases where no additional devices have to be allocated (as they were pre-allocated), we still need to make sure the devices that were previously allocated are healthy. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-03-06 11:52:23 +00:00
huyinhou	88274d96fc	update code style Signed-off-by: huyinhou <huyinhou@bytedance.com>	2023-03-06 14:23:14 +08:00
Sergey Kanzhelev	04189b1fc4	rename ExperimentalPodPidsLimit to PodPidsLimit	2023-03-04 01:48:16 +00:00
Kubernetes Prow Robot	efe20f6c9b	Merge pull request #114114 from ffromani/full-pcpus-stricter-precheck-issue113537 node: cpumgr: stricter pre-check for the policy option full-pcpus-only	2023-03-02 09:04:56 -08:00
Francesco Romani	0e9b92090c	node: cpumgr: stricter precheck for full-pcpus-only In order to implement the `full-pcpus-only` cpumanager policy option, we leverage the implementation of the algorithm which picks CPUs. By design, CPUs are taken from the biggest chunk available (socket or NUMA zone) to physical cores, down to single cores. Leveraging this, if the requested CPU count is a multiple of the SMT level (commonly 2), we're guaranteed that only full physical cores will be taken. The hidden assumption here is this holds true by construction iff the user reserved CPUs (if any) considering full physical CPUs. IOW, if the user did intentionally or mistakenly reserve single threads which are not core siblings[1], then the simple check we implemented is not sufficient. An easy example can probably outline this better. With this setup: cores: [(0, 4), (1, 5), (2, 6), (3, 8)] (in parens: thread siblings). SMT level: 2 (each tuple is 2 elements) Reserved CPUs: 0,1 (explicit pick using `--reserved-cpus`) A container then requests 6 cpus. full-pcpus-only check: 6 % 2 == 0. Passed. The CPU allocator will take first full cores, (2,6) and (3,8), and will then pick the remaining single CPUs. The allocation will succeed, but it's incorrect. We can fix this case with a stricter precheck. We need to additionally consider all the core siblings of the reserved CPUs as unavailable when computing the free cpus, before starting the actual allocation. Doing so, we fall back in the intended behavior, and by construction all possible CPU allocations whose number is a multiple of the SMT level are now correct again. +++ [1] or thread siblings in the linux parlance, in any case: hyperthread siblings of the same physical core Signed-off-by: Francesco Romani <fromani@redhat.com>	2023-03-02 16:00:58 +01:00
Kubernetes Prow Robot	6a25c528bb	Merge pull request #115891 from bart0sh/PR103-CRI-add-CDI-devices DRA: Pass CDI devices with a new CRI field	2023-02-28 14:53:28 -08:00
Kubernetes Prow Robot	18eea58ac2	Merge pull request #115359 from iancoolidge/devel-cpuset More code-review changes from k/utils cpuset review	2023-02-28 10:55:16 -08:00
Ed Bartosh	5a86895070	DRA: pass CDI devices through CRI CDIDevice field	2023-02-28 19:21:20 +02:00
Chen Wang	7db339dba2	This commit contains the following: 1. Scheduler bug-fix + scheduler-focussed E2E tests 2. Add cgroup v2 support for in-place pod resize 3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes. Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>	2023-02-24 18:21:21 +00:00
Vinay Kulkarni	f2bd94a0de	In-place Pod Vertical Scaling - core implementation 1. Core Kubelet changes to implement In-place Pod Vertical Scaling. 2. E2E tests for In-place Pod Vertical Scaling. 3. Refactor kubelet code and add missing tests (Derek's kubelet review) 4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature. 5. Fix corner-case where resize A->B->A gets ignored 6. Add cgroup v2 support to pod resize E2E test. KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>	2023-02-24 18:21:21 +00:00
Ian K. Coolidge	d4a1bf83c1	cpuset: Convert Fatalf to Errorf in tests Use of Fatalf is not appropriate in any of these cases: None of these failures are prerequisites.	2023-02-21 05:41:16 +00:00
Ian K. Coolidge	b536851fc7	cpuset: Add a few more test cases Feedback from https://github.com/kubernetes/utils/pull/267 and related reviews. * Equality when insertion order is different * UnsortedList contents * Not-Subset cases * Clone coverage	2023-02-21 05:40:54 +00:00
Ian K. Coolidge	22d3f67850	cpuset: Fix Parse() error message for n-k s.t. k<n This case is tested extensively in cpuset_test.go, but the error message needs a small adjustment.	2023-02-21 04:51:14 +00:00
huyinhou	32495ae3f1	add lock in generate topology hints function	2023-02-20 10:56:53 +08:00
Kubernetes Prow Robot	e18fa74551	Merge pull request #115590 from swatisehgal/topology-mgr-duration-metrics node: topology-mgr: Add metric to measure topology manager admission latency	2023-02-15 07:12:25 -08:00
Swati Sehgal	8442b450e5	node: topology-mgr: code optimization Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-02-15 14:04:10 +00:00
Swati Sehgal	bc941633c1	node: topology-mgr: add metric to measure topology mgr admission latency Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-02-15 13:59:47 +00:00
Ed Bartosh	4f88332ab4	kubelet: prepare DRA resources before CNI setup	2023-02-06 20:40:11 +02:00
Kubernetes Prow Robot	4df945853e	Merge pull request #115137 from swatisehgal/topologymgr-metrics node: topologymgr: add metrics about admission requests and errors	2023-01-30 18:43:00 -08:00
Patrick Ohly	bc6c7fa912	logging: fix names of keys The stricter checking with the upcoming logcheck v0.4.1 pointed out these names which don't comply with our recommendations in https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md#name-arguments.	2023-01-23 14:24:29 +01:00
Swati Sehgal	172c55d310	node: topologymgr: add metrics about admission requests and errors Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2023-01-17 17:50:29 +00:00
Ian K. Coolidge	5533e49e2c	cpuset: Add package comment Describe use cases (node IDs, HT siblings, etc) Call out novelty (Linux CPU list parse/dump) Describe future work (relax immutable, refactor to use 'set')	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	cbb985a310	cpuset: Delete 'builder' methods All usage of builder pattern is convertible to cpuset.New() with the same or fewer lines of code. Migrate Builder.Add to a private method of CPUSet, with a comment that it is only intended for internal use to preserve immutable property of the exported interface. This also removes 'require' library dependency, which avoids non-standard library usage.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	f3829c4be3	cpuset: Rename 'NewCPUSet' to 'New'	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	768b1ecfb6	cpuset: hide 'Filter' API FilterNot is only used in this file, and is trivially converted to a 'filter' call site by inverting the predicate. Filter is only used in this file, so don't export it.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	e5143d16c2	cpuset: Make 'ToSlice*' methods look like 'set' methods In 'set', conversions to slice are done also, but with different names: ToSliceNoSort() -> UnsortedList() ToSlice() -> List() Reimplement List() in terms of UnsortedList to save some duplication.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	a0c989b99a	cpuset: Remove *Int64 methods These are rarely used and can be accommodated with a trivial helper.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	67a057d4f2	cpuset: Remove 'MustParse' method Removes exit/fatal from cpuset library. Usage in podresources test was not necessary. Library reference in cpu_manager_test was moved to a local function, and converted to use e2e test framework error catching.	2023-01-06 23:32:51 +00:00
Ian K. Coolidge	824bd57ad6	cpuset: Convert Union arguments to variadic This allows Union to implement UnionAll easily.	2023-01-06 23:32:50 +00:00
huyinhou	4702503d15	update test case Signed-off-by: huyinhou <huyinhou@bytedance.com>	2023-01-03 15:00:12 +08:00
huyinhou	b9987eeb6c	fix allDevices map data race	2022-12-29 18:27:08 +08:00
huyinhou	997cefc9da	add unit test	2022-12-29 14:50:18 +08:00
huyinhou	692f8aab27	fix kubelet crash, concurrent map iteration and map write When kubelet starts a Pod that requires device resources, if the device plug-in updates the device at the same time, it may cause kubelet to crash. Signed-off-by: huyinhou <huyinhou@bytedance.com>	2022-12-19 12:45:17 +08:00
Kubernetes Prow Robot	68f808e6db	Merge pull request #111371 from sivchari/improve-naming feat: improve naming	2022-12-14 02:23:37 -08:00
Kubernetes Prow Robot	7754f007d6	Merge pull request #114169 from jpbetz/improve-kubelet-flag-errors Improve error messages of flags that parse quantities and percentages	2022-12-10 06:05:11 -08:00
Kubernetes Prow Robot	a668924cb6	Merge pull request #113255 from claudiubelu/path-filepath-update-kubelet Replaces path.Operation with filepath.Operation (kubelet)	2022-12-09 22:27:41 -08:00
Joe Betz	ab3c353227	Improve error messages for parse errors of --kube-reserved, --system-reserved and --qos-reserved	2022-11-28 16:35:26 -05:00
Ed Bartosh	abcb56defb	kubelet: do not enter termination status if pod might need to unprepare resources	2022-11-11 21:58:03 +01:00
Ed Bartosh	ae0f38437c	kubelet: add support for dynamic resource allocation Dependencies need to be updated to use github.com/container-orchestrated-devices/container-device-interface. It's not decided yet whether we will implement Topology support for DRA or not. Not having any topology-related code will help to avoid the wrong impression that DRA is used as a hint provider for the Topology Manager.	2022-11-11 21:58:03 +01:00
PiotrProkop	540b5bd308	[topologymanager] rely on Cadvisor to calculate NUMA distance Signed-off-by: PiotrProkop <pprokop@nvidia.com>	2022-11-09 17:52:14 +01:00
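
The stricter `full-pcpus-only` precheck described in commit 0e9b92090c can be illustrated with a small standalone sketch. This is not the kubelet's implementation: the names `freeCPUs` and `precheck` and the slice-of-slices core layout are hypothetical and chosen only for illustration. The topology and reserved CPUs are taken verbatim from the example in the commit message.

```go
package main

import "fmt"

// freeCPUs returns the CPU IDs considered available for allocation.
// Each element of cores lists the thread siblings of one physical core.
// With strict=false only the reserved IDs themselves are excluded.
// With strict=true every sibling of a reserved CPU is excluded as well,
// which is the idea behind the stricter precheck (hypothetical helper).
func freeCPUs(cores [][]int, reserved []int, strict bool) []int {
	isReserved := make(map[int]bool, len(reserved))
	for _, id := range reserved {
		isReserved[id] = true
	}
	var free []int
	for _, core := range cores {
		// A core is "touched" if any of its threads is reserved.
		touched := false
		for _, id := range core {
			if isReserved[id] {
				touched = true
			}
		}
		for _, id := range core {
			if isReserved[id] || (strict && touched) {
				continue
			}
			free = append(free, id)
		}
	}
	return free
}

// precheck models the admission check: the request must be a multiple of
// the SMT level and must fit in the CPUs considered free. The SMT level is
// assumed to be uniform and is read from the first core.
func precheck(cores [][]int, reserved []int, request int, strict bool) error {
	smt := len(cores[0])
	if request%smt != 0 {
		return fmt.Errorf("requested %d CPUs, not a multiple of SMT level %d", request, smt)
	}
	if free := freeCPUs(cores, reserved, strict); request > len(free) {
		return fmt.Errorf("requested %d CPUs, only %d available", request, len(free))
	}
	return nil
}

func main() {
	// Topology and reservation from the commit message example.
	cores := [][]int{{0, 4}, {1, 5}, {2, 6}, {3, 8}}
	reserved := []int{0, 1}

	fmt.Println("simple check:", precheck(cores, reserved, 6, false))
	fmt.Println("strict check:", precheck(cores, reserved, 6, true))
}
```

In this sketch the simple check admits the 6-CPU request, because CPUs 4 and 5 still count as free even though their siblings are reserved. The strict check rejects it, since only the four CPUs of the two untouched cores remain.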

1219 Commits