kubernetes

Author	SHA1	Message	Date
Kubernetes Prow Robot	e48d42d81d	Merge pull request #122627 from sanposhiho/remove-AssignedPodUpdated take PodTopologySpread into consideration when requeueing Pods based on Pod related events	2024-07-08 16:21:11 -07:00
Kensei Nakada	41f7607c04	cleanup: remove non-necessary ifs	2024-07-06 13:19:24 +00:00
Kensei Nakada	e16aa35865	address review suggestions	2024-07-06 13:17:17 +00:00
Kensei Nakada	533140f065	take PodTopologySpread into consideration when requeueing Pods based on Pod related events	2024-07-06 13:17:14 +00:00
Kubernetes Prow Robot	59673f0f37	Merge pull request #125578 from nayihz/fix_sche_queue_update skip update pod that exist in scheduling cycle	2024-06-25 14:18:19 -07:00
nayihz	26dcab1146	skip update pod that exist in scheduling cycle	2024-06-24 17:11:09 +08:00
Kensei Nakada	98a3182398	correct comment	2024-06-20 23:48:42 +00:00
Kensei Nakada	2304806cbe	elaborate comment more	2024-06-20 23:43:41 +00:00
Kensei Nakada	2c4dc6b65b	elaborate comments	2024-06-20 23:36:05 +00:00
Kensei Nakada	dd3af9a85b	fix: skip isPodWorthRequeuing only when SchedulingGates gates the pod	2024-06-17 01:14:34 +00:00
AxeZhan	d66f8f9413	schedulingQueue update pod by queueHint	2024-06-12 21:26:09 +08:00
Gabe	4e99ada05f	Filter gated pods before calling isPodWorthRequeueing	2024-04-29 16:54:40 +00:00
Kensei Nakada	2b56de43e5	register Node/UpdateNodeTaint event to plugins which has Node/Add only, doesn't have Node/UpdateNodeTaint	2024-03-16 14:13:06 +00:00
Kensei Nakada	18ba3b388e	fix(scheduling queue): ignore events that interest no registered plugin	2024-02-24 06:42:19 +00:00
kerthcet	d81023db30	When matching clusterEvent, we should consider the "*" additionally Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-02-04 14:59:26 +08:00
Kubernetes Prow Robot	ce80b7752a	Merge pull request #122081 from colin404/fix/fix-incorrect-comment fix incorrect function comment	2023-12-14 06:17:13 +01:00
孔令飞	917027b42e	fix incorrect function comment Change-Id: I7d5e908f979026faa467fdd77049b6aa3087fd7c	2023-12-12 17:38:03 +08:00
kerthcet	fade7463cd	Add String() to framework status Signed-off-by: kerthcet <kerthcet@gmail.com>	2023-11-01 17:01:36 +08:00
Kubernetes Prow Robot	fd5c406112	Merge pull request #120933 from mengjiao-liu/contextual-logging-scheduler-remaining-part kube-scheduler: convert the remaining part to use contextual logging	2023-10-27 10:30:58 +02:00
Kensei Nakada	27bb66fd7b	cleanup: rename failedPlugin to plugin in framework.Status	2023-10-25 12:03:56 +00:00
Mengjiao Liu	b0a73213d6	kube-scheduler: convert the remaining part to use contextual logging	2023-10-24 17:56:48 +08:00
Kensei Nakada	4f5bc7e8d7	fix based on reviews	2023-10-20 02:53:06 +00:00
Kensei Nakada	cb5dc46edf	feature(scheduler): simplify QueueingHint by introducing new statuses	2023-10-19 11:02:11 +00:00
carlory	0105a002bc	when the hint fn returns error, the scheduling queue logs the error and treats it as QueueAfterBackoff. Co-authored-by: Kensei Nakada <handbomusic@gmail.com> Co-authored-by: Kante Yin <kerthcet@gmail.com> Co-authored-by: XsWack <xushiwei5@huawei.com>	2023-09-21 09:40:44 +08:00
Kensei Nakada	0d3eafdfa3	fix(scheduling_queue): always put Pods with no unschedulable plugins into activeQ/backoffQ (#119105) * always put Pods with no unschedulable plugins into activeQ/backoffQ * address review comments	2023-09-11 09:30:11 -07:00
Patrick Ohly	4e73634b53	scheduler: start scheduling attempt with clean UnschedulablePlugins When some plugin was registered as "unschedulable" in some previous scheduling attempt, it kept that attribute for a pod forever. When that plugin then later failed with an error that requires backoff, the pod was incorrectly moved to the "unschedulable" queue where it got stuck until the periodic flushing because there was no event that the plugin was waiting for. Here's an example where that happened: framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c" schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c" ... scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip" Pop still needs the information about unschedulable plugins to update the UnschedulableReason metric. It can reset that information before returning the PodInfo for the next scheduling attempt.	2023-09-08 16:52:36 +02:00
Patrick Ohly	cd943dd95e	scheduler: fix tracking of concurrent events The previous approach was based on the assumption that an in-flight pod can use the head of the received event list as marker for identifying all events that occur while the pod is in flight. That assumption is incorrect: when that existing element gets removed from the list because all pods that were in-flight when it was received are done, that marker's Next method returns nil and the code which should have seen several concurrent events (if there were any) missed all of those. As a result, a pod with concurrent events could incorrectly get moved to the unschedulable queue where it could got stuck until the next periodic purging after 5 minutes if there was no other event for it. The approach with maintaining a single list of concurrent events can be fixed by inserting each in-flight pod into the list and using that element to identify "more recent" events for the pod.	2023-09-05 19:58:38 +02:00
Kensei Nakada	cf3f0bd778	fix: register the plugin rejects Pods in WaitOnPermit to UnschedulablePlugins	2023-08-12 07:18:01 +00:00
Kensei Nakada	050c0437e6	fix: broadcast when pod is pushed back to activeQ directly in AddUnschedulableIfNotPresent	2023-08-09 03:32:14 +00:00
Kensei Nakada	c7e7eee554	feature(scheduling_queue): track events per Pods (#118438) * feature(sscheduling_queue): track events per Pods * fix typos * record events in one slice and make each in-flight Pod to refer it * fix: use Pop() in test before AddUnschedulableIfNotPresent to register in-flight Pods * eliminate MakeNextPodFuncs * call Done inside the scheduling queue * fix comment * implement done() not to require lock in it * fix UTs * improve the receivedEvents implementation based on suggestions * call DonePod when we don't call AddUnschedulableIfNotPresent * fix UT * use queuehint to filter out events for in-flight Pods * fix based on suggestion from aldo * fix based on suggestion from Wei * rename lastEventBefore → previousEvent * fix based on suggestion * address comments from aldo * fix based on the suggestion from Abdullah * gate in-flight Pods logic by the SchedulingQueueHints feature gate	2023-07-17 15:53:07 -07:00
carlory	0599b3caa0	change the QueueingHintFn to pass a logger	2023-07-13 00:56:41 +08:00
Patrick Ohly	444d23bd2f	dra: generated name for ResourceClaim from template Generating the name avoids all potential name collisions. It's not clear how much of a problem that was because users can avoid them and the deterministic names for generic ephemeral volumes have not led to reports from users. But using generated names is not too hard either. What makes it relatively easy is that the new pod.status.resourceClaimStatus map stores the generated name for kubelet and node authorizer, i.e. the information in the pod is sufficient to determine the name of the ResourceClaim. The resource claim controller becomes a bit more complex and now needs permission to modify the pod status. The new failure scenario of "ResourceClaim created, updating pod status fails" is handled with the help of a new special "resource.kubernetes.io/pod-claim-name" annotation that together with the owner reference identifies exactly for what a ResourceClaim was generated, so updating the pod status can be retried for existing ResourceClaims. The transition from deterministic names is handled with a special case for that recovery code path: a ResourceClaim with no annotation and a name that follows the Kubernetes <= 1.27 naming pattern is assumed to be generated for that pod claim and gets added to the pod status. There's no immediate need for it, but just in case that it may become relevant, the name of the generated ResourceClaim may also be left unset to record that no claim was needed. Components processing such a pod can skip whatever they normally would do for the claim. To ensure that they do and also cover other cases properly ("no known field is set", "must check ownership"), resourceclaim.Name gets extended.	2023-07-11 14:23:48 +02:00
Kensei Nakada	be0db3f93d	clean up the implementation around QueueingHintFn	2023-07-06 16:07:39 +00:00
Heba Elayoty	d548983dbb	Use table-driven table for TestPerPodSchedulingMetrics Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>	2023-06-29 14:51:55 -07:00
Kubernetes Prow Robot	d9714078f8	Merge pull request #118551 from sanposhiho/event-to-register feature(scheduler): implement ClusterEventWithHint to filter out useless events	2023-06-26 06:41:45 -07:00
Kensei Nakada	6f8d38406a	feature(scheduler): implement ClusterEventWithHint to filter out useless events	2023-06-22 13:36:19 +00:00
Heba Elayoty	902c711fb4	Unset gated pod info timestamp in addToActiveQ Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com>	2023-06-21 14:16:08 -07:00
Kubernetes Prow Robot	4483bf66fe	Merge pull request #116635 from mengjiao-liu/contextual-logging-plugin-interpodaffinity Migrated `pkg/scheduler/framework/plugins/interpodaffinity` to contextual logging	2023-06-09 08:14:13 -07:00
Yuan Chen	9eaa50cc82	Rename scheduler queue variables for consistency	2023-06-05 09:02:06 -07:00
Mengjiao Liu	6d23da045f	Migrated pkg/scheduler/framework/plugins/interpodaffinity to use contextual logging	2023-06-01 18:24:54 +08:00
likakuli	5a14573258	clean: use info instead of error to log queue closed message when scheduler exit Signed-off-by: likakuli <1154584512@qq.com>	2023-05-31 11:07:24 +08:00
Mengjiao Liu	074900e81b	scheduler: update the scheduler interface and cache methods to use contextual logging	2023-05-29 13:26:32 +08:00
Kubernetes Prow Robot	29c8fb678c	Merge pull request #117194 from sanposhiho/revert-preenqueue Revert "Optimization on running prePreEnqueuePlugins before adding pods into activeQ"	2023-04-13 16:00:50 -07:00
Kensei Nakada	2bed67d0f1	Revert "Optimization on running prePreEnqueuePlugins before adding pods into activeQ" This reverts commit `c01fa8279d`.	2023-04-11 22:28:42 +00:00
sarab	8d18ae6fc2	Use the generic Set in scheduler	2023-04-09 11:34:17 +05:30
Kensei Nakada	6697467062	add(scheduler): implement "plugin_execution_duration_seconds" metric in PreEnqueue	2023-03-12 04:45:52 +00:00
Aldo Culquicondor	07a73bb2e1	One lock among PodNominator and SchedulingQueue Change-Id: I17fe5da40250e42c04124c25b530ce6c8dea4154	2023-03-08 16:18:36 -05:00
Chen Wang	7db339dba2	This commit contains the following: 1. Scheduler bug-fix + scheduler-focussed E2E tests 2. Add cgroup v2 support for in-place pod resize 3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes. Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>	2023-02-24 18:21:21 +00:00
lianghao208	c01fa8279d	Optimization on running prePreEnqueuePlugins before adding pods into activeQ	2023-02-15 11:13:21 +08:00
Wei Huang	a731a44596	Fix an accuracy issue of `scheduler_pending_pods` metric	2022-11-21 21:33:16 -08:00

159 Commits