This makes the API nicer:
resourceClaims:
- name: with-template
  resourceClaimTemplateName: test-inline-claim-template
- name: with-claim
  resourceClaimName: test-shared-claim
Previously, this was:
resourceClaims:
- name: with-template
  source:
    resourceClaimTemplateName: test-inline-claim-template
- name: with-claim
  source:
    resourceClaimName: test-shared-claim
A longer-term benefit is that future alternatives might not fit under the
"source" umbrella.
This is a breaking change. It's justified because DRA is still
alpha and will have several other API breaks in 1.31.
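Expressed as a Go type, the flattened entry looks roughly like this (a sketch
derived from the YAML above, not the exact generated API types):

    // PodResourceClaim is a sketch of one entry in the pod spec's
    // resourceClaims list after the flattening; the real generated type
    // may differ in details such as validation markers.
    type PodResourceClaim struct {
        // Name is how containers reference this claim.
        Name string `json:"name"`

        // At most one of the following may be set.
        ResourceClaimName         *string `json:"resourceClaimName,omitempty"`
        ResourceClaimTemplateName *string `json:"resourceClaimTemplateName,omitempty"`
    }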
The JSON patch approach works, but it is complex. A retry loop is easier to
understand (detect the conflict, get the latest claim, try again). It costs one
additional API call (the Get), but in practice conflicts are rare.
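A minimal sketch of such a loop using client-go's conflict helper; the
v1alpha2 claim client and the mutate callback are assumptions, not the actual
controller code:

    import (
        "context"

        resourceapi "k8s.io/api/resource/v1alpha2"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/util/retry"
    )

    // updateClaimWithRetry re-reads the claim and retries the status update
    // whenever the apiserver reports a conflict.
    func updateClaimWithRetry(ctx context.Context, clientset kubernetes.Interface, namespace, name string, mutate func(*resourceapi.ResourceClaim)) error {
        return retry.RetryOnConflict(retry.DefaultRetry, func() error {
            claim, err := clientset.ResourceV1alpha2().ResourceClaims(namespace).Get(ctx, name, metav1.GetOptions{})
            if err != nil {
                return err
            }
            mutate(claim)
            _, err = clientset.ResourceV1alpha2().ResourceClaims(namespace).UpdateStatus(ctx, claim, metav1.UpdateOptions{})
            return err
        })
    }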
There was a race caused by having to update claim finalizer and status in two
different operations:
- Resource claim controller removes the allocation but has not yet
  removed the finalizer.
- Scheduler prepares an allocation without adding the finalizer,
  because it is already present.
- Controller removes finalizer.
- Scheduler adds allocation.
This is an invalid state. Automatic checking found this during the execution of
the "with translated parameters on single node.*supports sharing a claim
sequentially" E2E test, but only when run stand-alone. When running in
parallel (as in the CI), the bad outcome of the race did not occur.
The fix is to check that the finalizer is still set when adding the
allocation. The apiserver doesn't check that because it doesn't know which
finalizer goes with the allocation result. It could check for "some finalizer",
but that is not guaranteed to be correct (could be some unrelated one).
Checking the finalizer can only be done with a JSON patch. Despite the
complications, having the ability to add multiple pods concurrently to
ReservedFor seems worth it (avoids expensive rescheduling or a local retry
loop).
The resource claim controller doesn't need this; it can do a normal update,
which implicitly checks the ResourceVersion.
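For illustration, a JSON patch can combine a "test" operation on the finalizer
with the actual change, so the apiserver rejects the whole patch if the
finalizer is gone. The finalizer name, array index, and client interface below
are assumptions:

    import (
        "context"
        "fmt"

        resourceapi "k8s.io/api/resource/v1alpha2"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/types"
    )

    // claimPatcher is the subset of the generated claim client used here.
    type claimPatcher interface {
        Patch(ctx context.Context, name string, pt types.PatchType, data []byte, opts metav1.PatchOptions, subresources ...string) (*resourceapi.ResourceClaim, error)
    }

    // reserveForPod appends a pod to status.reservedFor, but only if the
    // finalizer is still present: the "test" op fails the whole patch when it
    // isn't, so nothing gets written to a claim whose finalizer was removed.
    func reserveForPod(ctx context.Context, claims claimPatcher, claimName, podName string, podUID types.UID) error {
        patch := fmt.Sprintf(`[
      {"op": "test", "path": "/metadata/finalizers/0", "value": "resource.kubernetes.io/delete-protection"},
      {"op": "add", "path": "/status/reservedFor/-", "value": {"resource": "pods", "name": %q, "uid": %q}}
    ]`, podName, podUID)
        _, err := claims.Patch(ctx, claimName, types.JSONPatchType, []byte(patch), metav1.PatchOptions{}, "status")
        return err
    }

Because each writer only appends its own entry, several pods can be added to
ReservedFor concurrently without the updates overwriting each other.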
The DRA plugin does that. It didn't actually work and only printed an error
message about NodeInfo not implementing klog.KMetadata. This isn't caught at
compile time because of limitations with Go generics, and it had been missed
earlier.
This finishes the transition to the assume cache as source of truth for the
current set of claims.
The tests have to be adapted. It's not enough anymore to directly put objects
into the informer store because that doesn't change the assume cache
content. Instead, normal Create/Update calls and waiting for the cache update
are needed.
This enables connecting the event handler for ResourceClaim to the assume
cache, which addresses a theoretical race condition.
It may also be useful for implementing autoscaler support, because the
autoscaler can now modify the content of the cache.
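Roughly, the adapted tests go through the normal client and then wait until
the assume cache has caught up; the claim client group/version and the cache
lookup callback here stand in for the real helpers:

    import (
        "context"
        "time"

        resourceapi "k8s.io/api/resource/v1alpha2"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        "k8s.io/client-go/kubernetes"
    )

    // createClaimAndWait creates a claim through the API and polls until the
    // assume cache has observed it. cacheGet stands in for the cache's lookup
    // method, whose real name and signature may differ.
    func createClaimAndWait(ctx context.Context, clientset kubernetes.Interface, claim *resourceapi.ResourceClaim, cacheGet func(key string) (interface{}, error)) (*resourceapi.ResourceClaim, error) {
        created, err := clientset.ResourceV1alpha2().ResourceClaims(claim.Namespace).Create(ctx, claim, metav1.CreateOptions{})
        if err != nil {
            return nil, err
        }
        key := created.Namespace + "/" + created.Name
        err = wait.PollUntilContextTimeout(ctx, 10*time.Millisecond, 5*time.Second, true, func(ctx context.Context) (bool, error) {
            obj, err := cacheGet(key)
            if err != nil {
                return false, nil // not in the cache yet
            }
            cached, ok := obj.(*resourceapi.ResourceClaim)
            return ok && cached.ResourceVersion == created.ResourceVersion, nil
        })
        return created, err
    }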
This is a basic implementation of a first-in-first-out queue with unbounded
size. It's useful for cases where a channel with fixed size might deadlock.
The caller is responsible for locking.
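A minimal sketch of such a queue; the actual implementation in the tree may
manage its backing buffer differently:

    // FIFO is an unbounded first-in-first-out queue. It is not safe for
    // concurrent use; the caller must provide its own locking.
    type FIFO[T any] struct {
        items []T
    }

    // Push appends an element at the tail of the queue.
    func (q *FIFO[T]) Push(item T) {
        q.items = append(q.items, item)
    }

    // Pop removes and returns the element at the head of the queue.
    // The second return value is false if the queue is empty.
    func (q *FIFO[T]) Pop() (T, bool) {
        if len(q.items) == 0 {
            var zero T
            return zero, false
        }
        item := q.items[0]
        q.items = q.items[1:]
        return item, true
    }

    // Len returns the number of queued elements.
    func (q *FIFO[T]) Len() int {
        return len(q.items)
    }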
The claim parameter key didn't include the namespace of the claim. When two
namespaces used exactly the same parameter reference, the "too many generated
parameters" check was triggered incorrectly, and the lookup could have returned
an object from the wrong namespace.
Found while running the E2E tests in parallel:
message: 'running PreFilter plugin "DynamicResources": multiple generated claim
parameters for ConfigMap. dra-8794/parameters-3 found: [dra-4729/parameters-4
dra-7328/parameters-4 dra-8794/parameters-4 dra-3402/parameters-4 dra-6156/parameters-4
dra-1839/parameters-4 dra-7434/parameters-4 dra-6504/parameters-4]'
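The fix boils down to including the namespace in the key, along the lines of
this hypothetical helper:

    // generatedParametersKey builds the lookup key for generated claim
    // parameters. Including the claim's namespace keeps identical parameter
    // references in different namespaces apart.
    func generatedParametersKey(namespace, kind, name string) string {
        return namespace + "/" + kind + "/" + name
    }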
Any error result from PreBind was treated as a pod scheduling failure. This was
overlooked when moving blocking API calls in the DRA plugin into a PreBind
implementation, leading to:
E0604 15:45:50.980929 306340 schedule_one.go:1048] "Error scheduling pod; retrying" err="waiting for resource driver" pod="test/test-draqld28"
That's because DRA's PreBind does some updates in the apiserver, then returns
Pending to wait for the outcome.
The fix is to allow PreBind to return the same special status codes as other
extension points.
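As a rough sketch, assuming the scheduler framework's Pending status code and
PreBind interface, a plugin can now signal "park the pod and retry on an event"
from PreBind instead of having that result logged as an error. The plugin type
and the triggering work are hypothetical; only the shape of the returned status
matters:

    import (
        "context"

        v1 "k8s.io/api/core/v1"
        "k8s.io/kubernetes/pkg/scheduler/framework"
    )

    // examplePreBind is a hypothetical plugin that kicks off blocking work
    // in the apiserver and then waits for the outcome.
    type examplePreBind struct{}

    func (pl *examplePreBind) Name() string { return "ExamplePreBind" }

    func (pl *examplePreBind) PreBind(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) *framework.Status {
        // ... trigger the blocking API updates here ...

        // With the fix, Pending from PreBind is handled like it is in other
        // extension points (the pod goes back to the queue and is retried on
        // a relevant event) instead of being treated as a scheduling error.
        return framework.NewStatus(framework.Pending, "waiting for resource driver")
    }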