Once the feature gate is locked, the tests that run with this feature
disabled will crash. So we should remove them together with locking the
feature.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
The feature gate gets locked to "true", with the goal of removing it in two
releases.
All code can now assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.
Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
The name concatenation and ownership check were originally considered small
enough not to warrant dedicated functions, but the intent of the code is more
readable with them.
There was also a missing owner check in the attach controller.
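A sketch of what such helpers could look like, assuming the pod-owned-claim
pattern that the owner check implies; `claimName` and `claimIsForPod` are
illustrative names, not the actual ones in the tree:

```go
package helpers

import (
	corev1 "k8s.io/api/core/v1"
)

// claimName wraps the name concatenation: the claim created for a pod's
// volume is named after both, so call sites read as intent.
func claimName(pod *corev1.Pod, volumeName string) string {
	return pod.Name + "-" + volumeName
}

// claimIsForPod wraps the ownership check: the claim must be controlled by
// the pod it is named after before the controller acts on it.
func claimIsForPod(pod *corev1.Pod, pvc *corev1.PersistentVolumeClaim) bool {
	for _, ref := range pvc.OwnerReferences {
		if ref.Controller != nil && *ref.Controller && ref.UID == pod.UID {
			return true
		}
	}
	return false
}
```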
During volume detach, the following might happen in the reconciler:
1. Pod is being deleted.
2. The volume is removed from reportedAsAttached, so the node status updater
   will update the volumesAttached list.
3. Detach fails due to some issue.
4. The volume is added back to reportedAsAttached.
5. The reconciler loops over the volume again and removes it from
   reportedAsAttached.
6. Detach is not triggered because of exponential backoff; the detach call
   fails with an exponential backoff error.
7. Another pod is added which uses the same volume on the same node.
8. The reconciler loops again and will NOT try to trigger detach anymore.
At this point, the volume is still attached and in the actual state of the
world, but the volumesAttached list in the node status no longer has this
volume, which will block volume mount from kubelet.
The first-round fix was to add the volume back into the list of volumes to
be reported as attached at step 6, when the detach call failed with an
exponential backoff error. However, this might cause performance issues if
detach keeps failing for a while: during that time, the volume would keep
being removed from and added back to the node status, causing a surge of
API calls.
So we changed the logic to first check whether the operation is safe to
retry, meaning there is no pending operation and we are not within the
exponential backoff period, before calling detach. This way we avoid
repeatedly removing and re-adding the volume in the node status, as
sketched below.
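A self-contained sketch of that decision, with illustrative names; the real
reconciler consults its pending-operations tracker, but the shape of the
check is the same:

```go
package main

import (
	"fmt"
	"time"
)

// detachTracker remembers, per volume/node pair, when the next detach
// attempt is allowed, growing the wait exponentially on every failure.
type detachTracker struct {
	nextAttempt map[string]time.Time
	backoff     map[string]time.Duration
}

func newDetachTracker() *detachTracker {
	return &detachTracker{
		nextAttempt: map[string]time.Time{},
		backoff:     map[string]time.Duration{},
	}
}

// safeToRetry reports whether we are outside the backoff window. Only when
// it returns true should the reconciler remove the volume from
// reportedAsAttached and call detach; otherwise the node status is left
// untouched and kubelet can still mount the volume.
func (t *detachTracker) safeToRetry(key string) bool {
	return time.Now().After(t.nextAttempt[key])
}

// recordFailure doubles the wait (with a cap) after a failed detach.
func (t *detachTracker) recordFailure(key string) {
	d := t.backoff[key]
	switch {
	case d == 0:
		d = 500 * time.Millisecond
	case d < 2*time.Minute:
		d *= 2
	}
	t.backoff[key] = d
	t.nextAttempt[key] = time.Now().Add(d)
}

func main() {
	t := newDetachTracker()
	key := "pd-1/node-a"
	fmt.Println(t.safeToRetry(key)) // true: first attempt may proceed
	t.recordFailure(key)
	fmt.Println(t.safeToRetry(key)) // false: inside the backoff window
}
```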
Change-Id: I5d4e760c880d72937d34b9d3e904ecad125f802e
Add the UIDs of Pods for which we are removing finalizers to an in-memory cache.
The controller removes UIDs from the cache as Pod updates or deletes come in.
This avoids double counting finished Pods when Pod updates arrive after Job status updates.
https://github.com/kubernetes/kubernetes/issues/105200
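A minimal sketch of such a cache, with illustrative names; the real
controller keys by Pod UID and wires the add/forget calls into the sync loop
and the event handlers:

```go
package main

import (
	"fmt"
	"sync"
)

// uidCache holds the UIDs of Pods whose finalizer removal has been sent.
// Event handlers delete the UID again when the resulting Pod update or
// delete arrives, so a stale update cannot be counted a second time.
type uidCache struct {
	mu   sync.Mutex
	uids map[string]struct{} // keyed by Pod UID
}

func newUIDCache() *uidCache {
	return &uidCache{uids: make(map[string]struct{})}
}

func (c *uidCache) Add(uid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.uids[uid] = struct{}{}
}

// Forget removes the UID and reports whether it was present; "true" tells
// the caller this Pod was already accounted for in the Job status.
func (c *uidCache) Forget(uid string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.uids[uid]
	delete(c.uids, uid)
	return ok
}

func main() {
	c := newUIDCache()
	c.Add("pod-uid-1")
	fmt.Println(c.Forget("pod-uid-1")) // true: skip double counting
	fmt.Println(c.Forget("pod-uid-1")) // false: not tracked
}
```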
The bug could result in the EndpointSlice controller unnecessarily updating
EndpointSlices associated with a Service that had Topology Aware Hints enabled.
Doing a GET right before retrying has 2 problems:
- It can mask conflicts.
- It adds an additional delay.
As for retries, we are better off going through the sync backoff.
In the case of a conflict, we know that there was a Job update that will trigger another sync, so there is no need to do a rate-limited requeue.
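A sketch of this retry policy, assuming client-go's workqueue and the
apierrors helpers; `handleSyncResult` is an illustrative name:

```go
package jobcontroller

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/client-go/util/workqueue"
)

// handleSyncResult decides what to do with a Job key after a sync attempt:
// conflicts are simply dropped, because the conflicting Job update queues
// another sync by itself; all other errors take the normal rate-limited
// sync backoff instead of a GET-and-retry.
func handleSyncResult(queue workqueue.RateLimitingInterface, key string, err error) {
	switch {
	case err == nil:
		queue.Forget(key)
	case apierrors.IsConflict(err):
		queue.Forget(key) // another sync is already on its way
	default:
		queue.AddRateLimited(key)
	}
}
```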
The `root_ca_cert_publisher_sync_duration_seconds` metric tracks the sync
duration in the root CA cert publisher per code and namespace. In
clusters with a high namespace turnover (like CI clusters), this may
cause the kube-controller-manager to expose over 100k series to
Prometheus, which may cause degradation of that service.
Drop the `namespace` label to reduce the metric's cardinality; tracking
this metric by namespace does not justify the impact of keeping it.
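For illustration, the reduced-cardinality definition could look like the
following, sketched here with client_golang rather than the component-base
metrics wrappers; help text and buckets are illustrative:

```go
package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
)

// syncDuration keeps only `code` as a label, so the series count is bounded
// by the number of distinct status codes instead of growing with namespace
// turnover.
var syncDuration = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "root_ca_cert_publisher_sync_duration_seconds",
		Help:    "Sync duration of the root CA cert publisher, by status code.",
		Buckets: prometheus.ExponentialBuckets(0.001, 2, 15),
	},
	[]string{"code"}, // previously []string{"code", "namespace"}
)
```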
When doing partial updates for uncountedTerminatedPods, the controller might have removed UIDs for Pods that still had finalizers.
Also, make more space by removing UIDs that no longer have finalizers at the beginning of the sync.
This PR adds the GA AnnStorageProvisioner annotation to
a PVC if the PVC requires dynamic provisioning. It
also deprecates the beta AnnStorageProvisioner annotation,
which will be removed in a later release.
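A sketch of how both keys could be set during the deprecation window;
`setProvisionerAnnotations` is an illustrative name, and the provisioner
value in main is just an example:

```go
package main

import "fmt"

// Annotation keys as described above; the GA key drops "beta" from the path.
const (
	annBetaStorageProvisioner = "volume.beta.kubernetes.io/storage-provisioner" // deprecated
	annStorageProvisioner     = "volume.kubernetes.io/storage-provisioner"      // GA
)

// setProvisionerAnnotations puts both keys on a PVC that needs dynamic
// provisioning: the GA key is authoritative, and the beta key is kept so
// existing provisioners keep working until it is removed.
func setProvisionerAnnotations(annotations map[string]string, provisioner string) {
	annotations[annStorageProvisioner] = provisioner
	annotations[annBetaStorageProvisioner] = provisioner
}

func main() {
	ann := map[string]string{}
	setProvisionerAnnotations(ann, "ebs.csi.aws.com")
	fmt.Println(ann)
}
```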
All dependencies of the VolumeBinding plugin were moved from the
"k8s.io/kubernetes/pkg/controller/volume/scheduling" package to the
"k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding" package:
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache.go
- whole file pkg/controller/volume/scheduling/scheduler_assume_cache_test.go
- whole file pkg/controller/volume/scheduling/scheduler_binder.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_fake.go
- whole file pkg/controller/volume/scheduling/scheduler_binder_test.go
Package "k8s.io/kubernetes/pkg/controller/volume/scheduling/metrics" moved
to "k8s.io/kubernetes/pkg/scheduler/framework/plugins/volumebinding/metrics"
because it only used in VolumeBinding plugin and (e2e) tests.
More described in issue #89930 and PR #102953.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
Job completion is tracked through Job.status.uncountedPodUIDs and a Pod finalizer.
An annotation marks whether a Job should be tracked with the new behavior.
A separate work queue is used to remove finalizers from orphan Pods, as sketched below.
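A sketch of what the orphan-pod worker could do; `removeTrackingFinalizer`
is an illustrative name, the finalizer string shown is assumed, and the real
controller has its own error handling and wiring:

```go
package jobcontroller

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// removeTrackingFinalizer strips the job-tracking finalizer from a Pod whose
// owning Job no longer exists, so the orphan can be garbage collected.
func removeTrackingFinalizer(ctx context.Context, client kubernetes.Interface, pod *corev1.Pod) error {
	const finalizer = "batch.kubernetes.io/job-tracking" // assumed finalizer name
	updated := pod.DeepCopy()
	kept := updated.Finalizers[:0]
	for _, f := range updated.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	if len(kept) == len(pod.Finalizers) {
		return nil // nothing to do
	}
	updated.Finalizers = kept
	_, err := client.CoreV1().Pods(pod.Namespace).Update(ctx, updated, metav1.UpdateOptions{})
	return err
}
```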
Change-Id: I1862e930257a9d1f7f1b2b0a526ed15bc8c248ad
* set `endpoints.kubernetes.io/over-capacity` to "truncated" when the
  number of addresses has been truncated to 1000
* ready addresses are prioritized over non-ready addresses
* addresses are proportionally truncated across subsets (see the sketch
  below)
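A self-contained sketch of this budget split, not the exact algorithm: each
subset keeps a share of the 1000-address budget proportional to its size,
and ready addresses are kept first. The `subset` type is a stripped-down
stand-in holding only counts:

```go
package main

import (
	"fmt"
	"math"
)

type subset struct{ ready, notReady int }

const maxAddresses = 1000

// truncate returns the per-subset counts after truncation and whether any
// truncation happened (which is when the over-capacity annotation is set).
func truncate(subsets []subset) ([]subset, bool) {
	total := 0
	for _, s := range subsets {
		total += s.ready + s.notReady
	}
	if total <= maxAddresses {
		return subsets, false
	}
	out := make([]subset, len(subsets))
	for i, s := range subsets {
		size := s.ready + s.notReady
		// Each subset keeps a share of the budget proportional to its size.
		share := int(math.Ceil(float64(size) / float64(total) * maxAddresses))
		if share > size {
			share = size
		}
		// Ready addresses fill the share first; not-ready take the rest.
		keptReady := share
		if s.ready < keptReady {
			keptReady = s.ready
		}
		out[i] = subset{ready: keptReady, notReady: share - keptReady}
	}
	return out, true
}

func main() {
	got, truncated := truncate([]subset{{ready: 1500, notReady: 500}, {ready: 100, notReady: 400}})
	fmt.Println(got, truncated) // [{800 0} {100 100}] true
}
```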
As of now, we allow PDBs to be applied to pods via
selectors, so there can be unmanaged pods (pods that
don't have backing controllers) that still have PDBs associated with them.
Such pods are logged instead of immediately raising
a sync error. This ensures the disruption controller is
not frequently updating the status subresource, thus
preventing excessive and expensive writes to etcd.
This promotes the LogarithmicScaleDown feature gate to Beta, enabling it
by default. It also introduces a new metric, `sorting_deletion_age_ratio`,
intended to measure the efficacy of this new replica set scaledown behavior.
This change updates the CSR API to add a new, optional field called
expirationSeconds. This field is a request to the signer for the
maximum duration the client wishes the cert to have. The signer is
free to ignore this request based on its own internal policy. The
signers built into KCM will honor this field if it is not set to a
value greater than --cluster-signing-duration. The minimum allowed
value for this field is 600 seconds (ten minutes).
This change will help enforce safer durations for certificates in
the Kube ecosystem and will help related projects such as
cert-manager with their migration to the Kube CSR API.
Future enhancements may update the Kubelet to take advantage of this
field when it is configured in a way that can tolerate shorter
certificate lifespans with regular rotation.
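For illustration, a CSR requesting a one-hour certificate through the new
field; the name, signer, and usages below are just example values:

```go
package main

import (
	"fmt"

	certificatesv1 "k8s.io/api/certificates/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newCSR builds a CSR that asks the signer for a one-hour certificate via
// the new optional field. The signer may still shorten this based on its
// own policy; for the built-in KCM signers it is capped by
// --cluster-signing-duration.
func newCSR(request []byte) *certificatesv1.CertificateSigningRequest {
	expiration := int32(3600) // seconds; must be at least 600 (ten minutes)
	return &certificatesv1.CertificateSigningRequest{
		ObjectMeta: metav1.ObjectMeta{Name: "example-csr"},
		Spec: certificatesv1.CertificateSigningRequestSpec{
			Request:           request, // PEM-encoded certificate request
			SignerName:        "kubernetes.io/kube-apiserver-client",
			ExpirationSeconds: &expiration,
			Usages:            []certificatesv1.KeyUsage{certificatesv1.UsageClientAuth},
		},
	}
}

func main() {
	csr := newCSR(nil) // a real caller would pass PEM-encoded CSR bytes
	fmt.Println(*csr.Spec.ExpirationSeconds)
}
```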
Signed-off-by: Monis Khan <mok@vmware.com>
This change updates the backdating logic so that it is applied only to the
NotBefore date, and not the NotAfter date, when the certificate is
short-lived. Thus, when such a certificate is issued, it will not be
immediately expired. Long-lived certificates continue to have the
same lifetime as before.
Consolidated all certificate lifetime logic into the
PermissiveSigningPolicy.policy method.
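A minimal sketch of the corrected computation; the backdate amount and the
short-lived cutoff here are illustrative, and the real logic lives in the
PermissiveSigningPolicy.policy method:

```go
package main

import (
	"fmt"
	"time"
)

const backdate = 5 * time.Minute // tolerate clock skew between signer and clients

// certLifetime always backdates NotBefore, but for short-lived certificates
// it counts the full requested duration from "now", so the certificate is
// not issued already expired.
func certLifetime(now time.Time, duration time.Duration) (notBefore, notAfter time.Time) {
	notBefore = now.Add(-backdate)
	if duration < 6*backdate { // illustrative cutoff for "short lived"
		notAfter = now.Add(duration)
	} else {
		// Long-lived certificates keep the same lifetime as before.
		notAfter = notBefore.Add(duration)
	}
	return notBefore, notAfter
}

func main() {
	nb, na := certLifetime(time.Now(), 10*time.Minute)
	fmt.Println(na.Sub(nb)) // 15m0s: still valid for the requested 10m from now
}
```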
Signed-off-by: Monis Khan <mok@vmware.com>
Handles incorrect mirroring of Endpoints annotations to the created
EndpointSlices, specifically the last-applied-configuration annotation.
Also updates tests and adds test cases for the same; the core of the fix is
sketched below.
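A sketch of the filtering this implies when copying annotations from an
Endpoints object to its mirrored EndpointSlices; `mirroredAnnotations` is an
illustrative name:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// mirroredAnnotations copies annotations from the Endpoints object, dropping
// control-plane bookkeeping such as kubectl's last-applied-configuration,
// which belongs to the Endpoints object and not to the slice.
func mirroredAnnotations(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		if k == corev1.LastAppliedConfigAnnotation {
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	fmt.Println(mirroredAnnotations(map[string]string{
		corev1.LastAppliedConfigAnnotation: "{...}",
		"example.com/team":                 "storage",
	}))
}
```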