kubernetes

Author	SHA1	Message	Date
cyclinder	ab47b8b94b	fix some lint error Signed-off-by: cyclinder <qifeng.guo@daocloud.io>	2021-10-26 13:56:29 +08:00
Konstantin Misyutin	dbc9d7b71a	Remove tests when StorageObjectInUseProtection feature is disabled Because the feature gate is locked, the tests that run with this feature disabled will crash. So we should remove them together with locking the feature. Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>	2021-10-15 19:39:37 +08:00
Kubernetes Prow Robot	0bfa37dfcc	Merge pull request #105676 from alculquicondor/job-name Fix name for Pods of NonIndexed Jobs	2021-10-14 10:50:12 -07:00
Aldo Culquicondor	4ef9d18abe	Fix name for Pods of NonIndexed Jobs Change-Id: I0ea4685a82f4cdec0caab362d52144476652f95a	2021-10-14 10:55:46 -04:00
Kubernetes Prow Robot	3aafe75698	Merge pull request #105461 from damemi/wire-contexts-autoscaling Wire contexts to Autoscaling controllers	2021-10-14 06:59:33 -07:00
Kubernetes Prow Robot	f27e4714ba	Merge pull request #105377 from damemi/wire-contexts-apps Wire contexts to Apps controllers	2021-10-14 06:59:19 -07:00
Kubernetes Prow Robot	baaa53db64	Merge pull request #105211 from xiaopingrubyist/fix-pv-controller-claim-cache-issue fix:claim cached in pvcontroller is not the newest may cause unexpected issue	2021-10-14 05:47:18 -07:00
Mike Dame	41fcb95f2f	Wire contexts to Apps controllers	2021-10-13 16:32:13 -04:00
torubylist	f28a8d7f2b	fix:cached claim is not the newest will cause unexpected issue	2021-10-13 20:03:00 +08:00
Mike Dame	7780024916	Wire contexts to Autoscaling controllers	2021-10-12 14:34:05 -04:00
Maciej Szulik	8322121434	Move test-related utils to test/utils	2021-10-12 14:52:19 +02:00
Maciej Szulik	1fb6bf8a14	Wire context instead of TODO	2021-10-12 13:21:45 +02:00
Kubernetes Prow Robot	b0eac84937	Merge pull request #105345 from pohly/generic-ephemeral-volume-util generic ephemeral volume util, base code and controller	2021-10-07 08:19:47 -07:00
Jordan Liggitt	81fa9855c1	Fix quota controller hotloop in integration tests	2021-10-06 11:34:32 -04:00
Patrick Ohly	4ae0eecb34	controller: use generic ephemeral volume helper functions The name concatenation and ownership check were originally considered small enough to not warrant dedicated functions, but the intent of the code is more readable with them. There also was a missing owner check in the attach controller.	2021-10-06 14:01:44 +02:00
Kubernetes Prow Robot	debd6c1e9e	Merge pull request #104526 from jingxu97/aug/volumeattach Fix issue in node status updating VolumeAttached list	2021-10-05 17:30:32 -07:00
Jing Xu	69b9f9b1f0	Fix issue in node status updating VolumeAttached list During volume detach, the following might happen in the reconciler: 1. Pod is deleting. 2. The volume is removed from reportedAsAttached, so the node status updater will update the volumeAttached list. 3. Detach fails due to some issue. 4. The volume is added back to reportedAsAttached. 5. The reconciler loops over the volume again and removes it from reportedAsAttached. 6. Detach is not triggered because of exponential backoff; the detach call fails with an exponential backoff error. 7. Another pod using the same volume is added on the same node. 8. The reconciler loops and will NOT try to trigger detach anymore. At this point, the volume is still attached and in the actual state, but the volumeAttached list in node status no longer has this volume, which blocks the volume mount from kubelet. The first-round fix was to add the volume back into the list of volumes that need to be reported as attached at step 6, when the detach call failed with an error (exponential backoff). However, this might cause performance issues if detach keeps failing for a while: during that time, the volume keeps being removed from and added back to node status, which causes a surge of API calls. So we changed the logic to first check whether the operation is safe to retry, meaning there is no pending operation and it is not within the exponential backoff period, before calling detach. This way we avoid repeatedly removing and adding the volume in node status. Change-Id: I5d4e760c880d72937d34b9d3e904ecad125f802e	2021-10-05 09:44:35 -07:00
Kubernetes Prow Robot	519b164db1	Merge pull request #105222 from cyclinder/remove_node_lease_GA remove nodeLease feature GA	2021-10-05 05:41:21 -07:00
cyclinder	e61b901628	remove nodeLease feature GA Signed-off-by: cyclinder <qifeng.guo@daocloud.io>	2021-10-05 12:23:27 +08:00
Kubernetes Prow Robot	0a29e2a73a	Merge pull request #105197 from alculquicondor/job-tracking Roll-forward: Beta requirements for JobTrackingWithFinalizers	2021-10-04 18:57:49 -07:00
Aldo Culquicondor	5929ccd391	Track expected removals of Pod finalizers Add the UIDs of Pods for which we are removing finalizers to an in-memory cache. The controller removes UIDs from the cache as Pod updates or deletes come in. This avoids double counting finished Pods when Pod updates arrive after Job status updates. https://github.com/kubernetes/kubernetes/issues/105200	2021-10-04 16:09:58 -04:00
Kubernetes Prow Robot	c724abae81	Merge pull request #105375 from jonyhy96/fix-clock-util node: test file use k8s.io/utils/clock instead	2021-10-04 04:19:31 -07:00
Rob Scott	d7a640d831	Excluding Control Plane Nodes from Topology Hints calculations	2021-10-01 15:12:38 -07:00
haoyun	8d9fa1decc	fix: use k8s.io/utils/clock instead Signed-off-by: haoyun <yun.hao@daocloud.io>	2021-10-02 00:08:53 +08:00
Aldo Culquicondor	95c2a8024c	Parallelize pod updates in job test To potentially reduce the number of job controller syncs. Also reduce the maximum number of pods to sync in tests.	2021-10-01 09:55:53 -04:00
Kubernetes Prow Robot	598a8829c1	Merge pull request #105267 from llhuii/fix-topology-hint-bug TopologyAwareHints: fix getHintsByZone bug	2021-09-28 20:54:48 -07:00
llhuii	e7350d126e	TopologyAwareHints: add getHintsByZone unit test	2021-09-29 10:39:03 +08:00
llhuii	528cd30145	TopologyAwareHints: fix getHintsByZone bug The bug could result in the EndpointSlice controller unnecessarily updating EndpointSlices associated with a Service that had Topology Aware Hints enabled.	2021-09-28 09:02:24 +08:00
Kubernetes Prow Robot	aec9acda68	Merge pull request #105214 from alculquicondor/job-update Remove GET job and retries for status updates	2021-09-27 09:05:47 -07:00
Khaled Henidak (Kal)	a53e2eaeab	move IPv6DualStack feature to stable. (#104691) * kube-proxy * endpoints controller * app: kube-controller-manager * app: cloud-controller-manager * kubelet * app: api-server * node utils + registry/strategy * api: validation (comment removal) * api:pod strategy (util pkg) * api: docs * core: integration testing * kubeadm: change feature gate to GA * service registry and rest stack * move feature to GA * generated	2021-09-24 16:30:22 -07:00
Kubernetes Prow Robot	b6924839ca	Merge pull request #101987 from sky-philipalmeida/patch-1 Log if PV is still in use trying to delete it	2021-09-23 14:30:54 -07:00
Aldo Culquicondor	a438f16741	Revert "Revert "Add metric job_pod_finished"" This reverts commit `7868fbbe64`.	2021-09-23 12:56:29 -04:00
Aldo Culquicondor	47a957d163	Revert "Revert "Limit number of Pods counted in a single Job sync"" This reverts commit `8bcb780808`.	2021-09-23 12:56:29 -04:00
Aldo Culquicondor	01f27cd93e	Fix log line for target number of running pods	2021-09-23 12:56:29 -04:00
Aldo Culquicondor	eebd678cda	Remove GET job and retries for status updates. Doing a GET right before retrying has 2 problems: - It can mask conflicts - It adds an additional delay As for retries, we are better off going through the sync backoff. In the case of conflict, we know that there was a Job update that would trigger another sync, so there is no need to do a rate limited requeue.	2021-09-23 11:48:34 -04:00
Kubernetes Prow Robot	372103f4b8	Merge pull request #100672 from wangyx1992/structured-log Structured Logging migration: modify logs of controller-manager	2021-09-22 20:27:10 -07:00
Kubernetes Prow Robot	76c0573ff4	Merge pull request #105181 from alculquicondor/revert Revert #104739	2021-09-21 16:54:00 -07:00
Aldo Culquicondor	7868fbbe64	Revert "Add metric job_pod_finished" This reverts commit `a0e7a567c5`.	2021-09-21 15:16:54 -04:00
Aldo Culquicondor	8bcb780808	Revert "Limit number of Pods counted in a single Job sync" This reverts commit `7d9cb88fed`.	2021-09-21 15:16:50 -04:00
Phil	f1a9402082	Log if PV is still in use trying to delete it Similar to what we have in: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/pvcprotection/pvc_protection_controller.go#L181 The objective is to have an easy way to monitor if a PV will enter the Terminating state due to a failed removal when still in use. This way we can capture the PV log and alert accordingly. The code is not tested. Update pv_protection_controller.go Change call to Infof	2021-09-21 18:05:16 +01:00
Kubernetes Prow Robot	b34a735bbe	Merge pull request #102523 from stlaz/rootca_metrics_cleanup rootcacertpublisher: drop the namespace label from metrics to reduce its cardinality	2021-09-20 13:54:24 -07:00
Kubernetes Prow Robot	353f0a5eab	Merge pull request #105095 from wojtek-t/migrate_clock_3 Unify towards k8s.io/utils/clock - part 3	2021-09-20 12:46:45 -07:00
Kubernetes Prow Robot	f55101913f	Merge pull request #105098 from Karthik-K-N/fix-error-format Fix incorrect format specifier in test files	2021-09-20 08:56:09 -07:00
Shivanshu Raj Shrivastava	bbd809cbd0	Fixing incorrectly migrated structured logs (#105122) * added keys for structured logging * used KObj	2021-09-19 12:28:08 -07:00
wojtekt	d9b08c611d	Migrate to k8s.io/utils/clock	2021-09-17 15:19:08 +02:00
Kubernetes Prow Robot	399656369f	Merge pull request #104739 from alculquicondor/job-tracking Beta requirements for JobTrackingWithFinalizers	2021-09-17 04:57:00 -07:00
Karthik K N	c651d50202	Fix incorrect format specifier in test files	2021-09-17 16:27:53 +05:30
Stanislav Laznicka	b67bd722a9	rootcacertpublisher: drop the namespace label from metrics to reduce its cardinality The `root_ca_cert_publisher_sync_duration_seconds` metric tracks the sync duration in the root CA cert publisher per code and namespace. In clusters with a high namespace turnover (like CI clusters), this may cause the kube-controller-manager to expose over 100k series to Prometheus, which may cause degradation of that service. Drop the `namespace` label to reduce the metric's cardinality; tracking this metric by namespace does not justify the impact of keeping it.	2021-09-16 14:05:32 +02:00
Aldo Culquicondor	a0e7a567c5	Add metric job_pod_finished To count the number of pods that the job controller successfully tracked with the JobTrackingWithFinalizers feature gate.	2021-09-15 11:19:47 -04:00
Kubernetes Prow Robot	047a6b9f86	Merge pull request #104874 from wojtek-t/migrate_clock_1 Unify towards k8s.io/utils/clock - part 1	2021-09-13 19:09:20 -07:00

5509 Commits