kubernetes

Author	SHA1	Message	Date
Filip Křepinský	747ffe785d	improve message, log level and testing for unmanaged pods in disruption controller - set higher severity and log level when unmanaged pods found and improve testing - do not mention unsupported controller when triggering event for unmanaged pods (this is covered by CalculateExpectedPodCountFailed event) - test unsupported controller - make testing for events non blocking when event not found	2023-03-03 23:03:06 +01:00
Kubernetes Prow Robot	6fd488a4e6	Merge pull request #115861 from JayKayy/inform-unsupported-pdb Add a warning event when pdb has found an unmanaged pod	2023-03-03 03:16:58 -08:00
Kubernetes Prow Robot	3835c7aecd	Merge pull request #115882 from binacs/binacs/controller-use-issuperset cleanup(controller): use IsSuperset to avoid interim slice	2023-03-02 17:00:57 -08:00
John Kwiatkoski	1f42ebc013	Add a warning event when pdb has found an unmanaged pod	2023-03-01 20:14:10 -05:00
weizhichen	4d6be42c1a	add unit test	2023-03-01 06:48:37 +00:00
weizhichen	d06c0995cb	fix 116028	2023-02-27 12:49:44 +00:00
Kubernetes Prow Robot	a34f8423a7	Merge pull request #115907 from qinqon/svc-same-address-different-pod svc: Support pods with same address	2023-02-24 19:00:05 -08:00
Enrique Llorente	697ea476e2	svc: Support pods with same address If different pods with the same address are exposed by the same service, some of the endpointslices endpoints are overwritten. This change adds the pod name to the hash function to ensure that all the endpoints are in place. Signed-off-by: Enrique Llorente <ellorent@redhat.com>	2023-02-23 11:37:57 +01:00
Daniel Vega-Myhre	d41302312e	update validation logic so completions is mutable iff completions is modified in tandem with parallelism so completions == parallelism	2023-02-23 03:25:16 +00:00
kannon92	32ac4a9581	left over uncounted from tracking cleanup	2023-02-22 16:45:53 +00:00
binacs	84ff621309	cleanup(controller): use IsSuperset to avoid interim slice	2023-02-19 21:49:58 +08:00
Kubernetes Prow Robot	d9ed2ff4b0	Merge pull request #114687 from freddie400/migrate-hpa Migrate pkg/controller/podautoscaler to contextual logging	2023-02-17 05:44:03 -08:00
Freddie	dee494ece1	squashing without rebase	2023-02-17 01:47:52 +05:30
Patrick Ohly	0e1139d027	dra: avoid goroutine leaks from event broadcaster When using these controllers in test/integration/scheduler_perf, the goroutine leak check there pointed out that broadcaster.Shutdown function wasn't called and thus goroutines leaked during a test.	2023-02-15 15:14:27 +01:00
Andy Goldstein	71ec5ed81d	resourcequota: use contextual logging (#113315) Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>	2023-02-14 07:19:31 -08:00
Kubernetes Prow Robot	49babf218a	Merge pull request #115464 from sunnylovestiramisu/fixCSIMigrationBug Remove check for CSI driver running on node for CSI migration attach operations	2023-02-13 12:49:30 -08:00
Kubernetes Prow Robot	2c37b470b3	Merge pull request #113794 from littlejiancc/feature_stateful_cleanup Simplify case conditions	2023-02-09 20:37:39 -08:00
Sunny Song	98f944f55d	Remove check for CSI driver running on node for CSI migration attach operations	2023-02-09 02:45:02 +00:00
Antonio Ojea	3bb203e7eb	replace nodeipam custom logic by a workqueue Change-Id: I242174b9d92606b1225a4af29a0730b7cd7d3c03	2023-02-06 19:34:29 +00:00
Sarvesh Rangnekar	c791d69b3e	Fix the nodeSelector key creation mechanism Fixes the issue caused when multiple ClusterCIDR objects have the same nodeSelector values: the order of the requirements in the nodeSelector is not preserved when the nodeSelector is marshalled and converted to a string.	2023-02-01 13:48:07 +00:00
Kubernetes Prow Robot	bd63a912d6	Merge pull request #115349 from danielvegamyhre/job-controller-changes Update previous succeeded indexes for Indexed jobs unconditionally	2023-01-31 15:51:04 -08:00
Daniel Vega-Myhre	2a81337e7c	update prev succeeded indexes for indexed jobs unconditionally	2023-01-31 19:15:53 +00:00
Kubernetes Prow Robot	fb9884577e	Merge pull request #115345 from gnufied/ignore-error-when-unable-find-plugin Ignore error when we can't find plugin capable of expanding the volum…	2023-01-31 05:24:50 -08:00
Sarvesh Rangnekar	a8f120b76c	Fix the delete flow for ClusterCIDR objects Fixes the deletion of a ClusterCIDR object when a Node is associated with it (has Pod CIDRs allocated from this ClusterCIDR). Currently the ClusterCIDR finalizer is never cleaned up as there is no reconciliation happening after the associated Node has been deleted. This commit fixes the issue by adding workitems from all events to a worker queue and reconciling until the delete is successful.	2023-01-30 19:35:41 +00:00
Kubernetes Prow Robot	ad2a9f2f33	Merge pull request #113863 from msau42/owners update sig-storage owners	2023-01-30 10:10:50 -08:00
Hemant Kumar	c9fc35c496	reword the warning that gets printed on external expansion	2023-01-30 11:37:30 -05:00
Hemant Kumar	1e57dae5ec	Ignore error when we can't find plugin capable of expanding the volume intre	2023-01-26 14:39:05 -05:00
Patrick Ohly	bc6c7fa912	logging: fix names of keys The stricter checking with the upcoming logcheck v0.4.1 pointed out these names which don't comply with our recommendations in https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/migration-to-structured-logging.md#name-arguments.	2023-01-23 14:24:29 +01:00
Kubernetes Prow Robot	c63434aaff	Merge pull request #110838 from soltysh/cronjob_improvements CronJob controller cleanups	2023-01-18 09:44:34 -08:00
Maciej Szulik	be44d67566	Re-use common parts between getNextScheduleTime and nextScheduledTimeDuration The two methods nextScheduledTimeDuration and getNextScheduleTime have a lot of similarities, so this commit squashes the common parts together along with getMostRecentScheduleTime to avoid code duplication.	2023-01-18 16:52:45 +01:00
Maciej Szulik	cb491a8d0f	Cleanups in controller utils 1. Squash two identical sorters byTime 2. Move helper for searching active jobs into utils to exist next to its counterpart	2023-01-18 13:40:23 +01:00
Viacheslav Panasovets	6adf60fdf4	Do not create endpoints if service of type ExternalName (#114814)	2023-01-18 03:12:34 -08:00
Kubernetes Prow Robot	46f3821bf4	Merge pull request #114586 from andrewsykim/apiserver-lease-rename Rename apiserver identity lease labels to apiserver.kubernetes.io/identity	2023-01-17 21:36:34 -08:00
Kubernetes Prow Robot	5550064bc2	Merge pull request #115063 from kannon92/tracking-remove-comments tracking with finalizers is the default way for the job controller so comments are not needed that say we are tracking with finalizers	2023-01-17 07:56:44 -08:00
Kubernetes Prow Robot	7b01daba71	Merge pull request #115074 from yangjunmyfm192085/deleteklogv0-controller use klog instead of klog.V(0)--controller manager part	2023-01-16 09:58:50 -08:00
Kubernetes Prow Robot	ed8cad1e80	Merge pull request #115056 from mimowo/podgc-do-not-add-condition-for-terminated-pods PodGC should not add DisruptionTarget condition for pods which are in terminal phase	2023-01-16 03:04:50 -08:00
JunYang	29086e2b04	use klog instead of klog.V(0)	2023-01-14 21:15:50 +08:00
Andrew Sy Kim	3da0f1809c	apiserver: update lease label key to apiserver.kubernetes.io/identity Signed-off-by: Andrew Sy Kim <andrewsy@google.com>	2023-01-13 15:37:22 -05:00
Kubernetes Prow Robot	9af5ae0365	Merge pull request #115030 from kannon92/remove-pod-error-job-tracking Update SyncJob with PodControllerError updates in job unit tests	2023-01-13 12:08:14 -08:00
Kubernetes Prow Robot	70217a4083	Merge pull request #114944 from mimowo/fix-active-deadline-test Fix the job controller unit test for enforcing ActiveDeadlineSeconds	2023-01-13 10:46:26 -08:00
Michal Wozniak	3833c0c349	PodGC should not add DisruptionTarget condition for pods which are in terminal phase	2023-01-13 18:28:44 +01:00
kannon92	4890928b78	tracking with finalizers is the default way for the job controller	2023-01-13 16:48:35 +00:00
kannon92	3a838033f8	Update SyncJob with PodControllerError updates in job unit tests	2023-01-13 16:39:18 +00:00
Michal Wozniak	7065b42bb2	Fix the job controller unit test for enforcing ActiveDeadlineSeconds	2023-01-13 16:48:15 +01:00
Kubernetes Prow Robot	c0c386b9c9	Merge pull request #114516 from nikhita/job-backoff-fix pkg/controller/job: re-honor exponential backoff delay	2023-01-13 07:36:40 -08:00
Kubernetes Prow Robot	1b8692ce46	Merge pull request #114296 from cbroglie/concurrent-monitor-node-health controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently	2023-01-12 12:42:54 -08:00
Nikhita Raghunath	fd8d92a29d	pkg/controller/job: re-honor exponential backoff This commit makes the job controller re-honor exponential backoff for failed pods. Before this commit, the controller created pods without any backoff. This is a regression because the controller used to create pods with an exponential backoff delay before (10s, 20s, 40s ...). The issue occurs only when the JobTrackingWithFinalizers feature is enabled (which is enabled by default right now). With this feature, we get an extra pod update event when the finalizer of a failed pod is removed. Note that the pod failure detection and new pod creation happen in the same reconcile loop so the 2nd pod is created immediately after the 1st pod fails. The backoff is only applied on 2nd pod failure, which means that the 3rd pod created 10s after the 2nd pod, 4th pod is created 20s after the 3rd pod and so on. This commit fixes a few bugs: 1. Right now, each time `uncounted != nil` and the job does not see a _new_ failure, `forget` is set to true and the job is removed from the queue. Which means that this condition is also triggered each time the finalizer for a failed pod is removed and `NumRequeues` is reset, which results in a backoff of 0s. 2. Updates `updatePod` to only apply backoff when we see a particular pod failed for the first time. This is necessary to ensure that the controller does not apply backoff when it sees a pod update event for finalizer removal of a failed pod. 3. If `JobsReadyPods` feature is enabled and backoff is 0s, the job is now enqueued after `podUpdateBatchPeriod` seconds, instead of 0s. The unit test for this check also had a few bugs: - `DefaultJobBackOff` is overwritten to 0 in certain unit tests, which meant that `DefaultJobBackOff` was considered to be 0, effectively not running any meaningful checks. - `JobsReadyPods` was not enabled for test cases that ran tests which required the feature gate to be enabled. - The check for expected and actual backoff had incorrect calculations.	2023-01-12 20:34:10 +05:30
Christopher Broglie	3c88de52c8	controller/nodelifecycle: Make monitorNodeHealth process nodes concurrently Marking the pods not ready on a node requires looping over them and updating each pod's status one at a time. This is performed serially, and can take a while if we're processing each node serially as well. Since the time is spent waiting on io, there's an opportunity to go faster by processing multiple nodes concurrently. This change modifies the loop to process nodes in parallel, using the same number of workers as doNodeProcessingPassWorker. This change also introduces histogram metrics to better observe monitorNodeHealth.	2023-01-11 12:34:39 -08:00
kannon92	6dfaeff33c	Remove Legacy Job Tracking	2023-01-10 14:52:54 +00:00
Kubernetes Prow Robot	e7549eae87	Merge pull request #114905 from kannon92/sync-job-test-fix Fix SyncPastDeadlineJobFinished for enabling finalizer path	2023-01-09 12:47:28 -08:00


6018 Commits
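
Commit fd8d92a29d describes an exponential backoff delay between pod re-creations (10s, 20s, 40s ...). The doubling schedule it mentions can be sketched roughly as follows. This is an illustration only, not the job controller's code (which is written in Go): the function name `backoff_delay`, the `max_seconds` cap, and the choice to count failures from 1 are assumptions made for the example; only the 10s starting value and the doubling pattern come from the commit message.

```python
def backoff_delay(failures: int,
                  base_seconds: float = 10.0,
                  max_seconds: float = 360.0) -> float:
    """Return the delay before the next retry after `failures` failures.

    base_seconds=10 mirrors the 10s, 20s, 40s ... sequence in the commit
    message; max_seconds is a hypothetical cap chosen for this sketch.
    """
    if failures <= 0:
        # No failure recorded yet, so no delay is applied.
        return 0.0
    # Double the delay for every failure after the first, up to the cap.
    return min(base_seconds * 2 ** (failures - 1), max_seconds)


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"failure {n}: wait {backoff_delay(n):.0f}s")
```

The regression described in the commit corresponds to the failure count being reset on every finalizer-removal event, which in this sketch would mean calling `backoff_delay(0)` each time and never waiting.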