kubernetes

Author	SHA1	Message	Date
Mark Rossetti	498d065cc5	Promoting WindowsHostProcessContainers to stable Signed-off-by: Mark Rossetti <marosset@microsoft.com>	2022-11-01 14:06:25 -07:00
Francesco Romani	47d3299781	node: metrics: cpumanager: add pinning metrics In order to improve the observability of the cpumanager, add and populate metrics to track if the combination of the kubelet configuration and podspec would trigger exclusive core allocation and pinning. We should avoid leaking any node/machine specific information (e.g. core ids, even though this is admittedly an extreme example); tracking these metrics seems to be a good first step, because it allows us to get feedback without exposing details. Signed-off-by: Francesco Romani <fromani@redhat.com>	2022-10-27 14:40:40 +02:00
Kubernetes Prow Robot	9bcb81e13f	Merge pull request #113175 from liggitt/pr_normalize_probes_lifecycle_handlers Record event and metric for lifecycle fallback to http	2022-10-20 02:31:08 -07:00
Jordan Liggitt	a5d785fae8	Record metric for lifecycle fallback to http	2022-10-19 14:45:25 -04:00
Francesco Romani	ba6b468982	node: metrics: register podresources metrics Because of a bug in the commit `1e7bb20c52`, podresources metrics were added, they are updated in the right places, but they are never exported, so they cannot be consumed. Fix trivially registering the metrics. Signed-off-by: Francesco Romani <fromani@redhat.com>	2022-10-06 15:14:56 +02:00
Clayton Coleman	e9a5fb7372	kubelet: Record a metric for latency of pod status update Track how long it takes for pod updates to propagate from detection to successful change on API server. Will guide future improvements in pod start and shutdown latency. Metric is `kubelet_pod_status_sync_duration_seconds` and is ALPHA stability. Histogram buckets are chosen based on distribution of observed status delays in practice.	2022-09-08 12:17:44 -04:00
Kubernetes Prow Robot	b0254c8a0b	Merge pull request #108758 from fengzixu/improvement-volume-health re-push "add volume kubelet_volume_stats_health_abnormal to kubelet #105585"	2022-03-29 17:35:34 -07:00
Kubernetes Prow Robot	5cb6fab8f6	Merge pull request #105585 from fengzixu/improvement-volume-health add volume kubelet_volume_stats_health_abnormal to kubelet	2022-03-17 01:32:38 +00:00
Maciej Borsz	aa95513982	Revert "add volume kubelet_volume_stats_health_abnormal to kubelet"	2022-03-16 13:44:09 +01:00
Kubernetes Prow Robot	1a5abe5d1f	Merge pull request #105585 from fengzixu/improvement-volume-health add volume kubelet_volume_stats_health_abnormal to kubelet	2022-03-15 05:58:11 -07:00
Shiming Zhang	5eb3e88f6b	Support metrics for node shutdown	2022-03-11 17:31:10 +08:00
fengzixu	9808ae48a0	change the volume health status metrics name	2022-01-23 02:44:10 +00:00
Sergey Kanzhelev	7e7bc6d53b	remove DynamicKubeletConfig logic from kubelet	2022-01-19 22:38:04 +00:00
fengzixu	bab1755274	fix: correct metrics expression	2022-01-11 13:50:17 +00:00
fengzixu	d71e21e01e	add volume kubelet_volume_stats_health_abnormal to kubelet	2022-01-11 13:50:17 +00:00
Kubernetes Prow Robot	19591a1324	Merge pull request #105829 from yuanchen8911/master Fix and improve comments on kubelet metrics	2022-01-04 23:02:32 -08:00
Elana Hashman	b35c500541	Revert "Bump DynamicKubeConfig metric deprecation to 1.23"	2021-11-17 11:48:49 -08:00
Mark Rossetti	ef324d6bbd	Adding kubelet metrics for started and failed to start HostProcess containers Signed-off-by: Mark Rossetti <marosset@microsoft.com>	2021-11-04 14:39:57 -07:00
Yuan Chen	b99495d1d9	Fix and improve comments on kubelet metrics	2021-10-21 17:38:25 -07:00
yxxhero	35df409a7e	remove StartedPodsErrorsTotal metrice message Signed-off-by: yxxhero <aiopsclub@163.com>	2021-09-23 22:18:56 +08:00
Elana Hashman	d2ed3b28b7	Revert "revert Bump DynamicKubeConfig metric deprecation to 1.23 by delta update"	2021-08-06 08:38:56 -07:00
kerthcet	980cf85439	revert Bump DynamicKubeConfig metric deprecation to 1.23 by delta update Signed-off-by: kerthcet <kerthcet@gmail.com>	2021-08-02 23:15:10 +08:00
Elana Hashman	b5f24c334e	Bump DynamicKubeConfig metric deprecation to 1.23	2021-07-28 09:29:57 -07:00
Kubernetes Prow Robot	4d78db54a5	Merge pull request #103580 from tkestack/fix-version-format fix kubelet panic when DynamicKubeletConfig enabled	2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot	7c84064a4f	Merge pull request #99000 from verb/1.21-kubelet-metrics Add kubelet metrics for ephemeral containers	2021-07-08 14:00:55 -07:00
Li Bo	79e230ea21	fix kubelet panic when DynamicKubeletConfig enabled	2021-07-08 16:20:51 +08:00
Sergey Kanzhelev	dffc2a60a2	deprecate and disable by default DynamicKubeletConfig feature flag	2021-07-02 23:53:11 +00:00
Lee Verberne	30d2ad576a	Remove ManagedPod,ManagedContainer metrics This replaces the generic ManagedPod and ManagedContainer kubelet metrics with a gauge to track only ephemeral container usage.	2021-06-15 19:02:07 +02:00
pacoxu	650666406e	update kubelet_running_pods metrics comments: pods that have a running pod sandbox Signed-off-by: pacoxu <paco.xu@daocloud.io> Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>	2021-04-29 11:05:52 +08:00
Lee Verberne	29178fff1c	Add kubelet managed pod metrics	2021-04-13 14:13:30 +02:00
Francesco Romani	1e7bb20c52	kubelet: podresources: per-endpoint metrics Before the addition of GetAllocatableResources, the podresources API had just one endpoint `List()`, thus we could just account for the total of the calls to have a good pulse of the API usage. Now that we extend the API with more endpoints (`GetAlloctableResources`), in order to improve the observability we add per-endpoint counters, in addition to the existing counter of the total API calls. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:14:58 +01:00
jialaijun	15612338e5	Migrate pkg/kubelet/metrics logs to structured logging.	2021-02-14 09:41:35 +08:00
wawa0210	f28f0953e6	Adjust kubelet_cgroup_manager_duration_seconds bucket	2021-01-19 16:23:14 +08:00
00041544	f2b8fdb265	Define const for metric name	2020-11-30 14:40:26 +08:00
Alvaro Aleman	801a52c06d	Allow debugging kubelet image pull times This PR changes the buckets of the kubelet_runtime_operation_duration_seconds metric to be metrics.ExponentialBuckets(.005, 2.5, 14) in order to allow debugging image pull times. Right now the biggest bucket is 10 seconds, which is an ordinary time frame to pull an image, making the metric useless for the aforementioned usecase.	2020-11-11 20:18:36 -05:00
Renaud Gaubert	969e45f49f	Add the pod_resources_endpoint_requests_total metric	2020-10-27 11:23:39 -07:00
RyderXia	2214117cd1	clean up unused var containerCache	2020-07-21 16:57:36 +08:00
RainbowMango	168c695e1a	Update two metrics name to make promlint happy.	2020-06-23 15:16:18 +08:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Tim Allclair	43c7f3be29	Register RunPodSandbox* metrics	2020-01-28 13:26:11 -08:00
danielqsj	ab182552b4	clean SinceInMicroseconds, convert to SinceInSeconds	2020-01-10 17:05:38 +08:00
danielqsj	1a9b121764	remove deprecated metrics of kubelet	2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot	49bc696614	Merge pull request #86251 from bboreham/pleg-last-seen-metric Kubelet: add a metric to observe time since PLEG last seen	2020-01-06 18:06:18 -08:00
Bryan Boreham	cc0b3e82eb	Kubelet: add a metric to observe time since PLEG last seen Expose the measurement that kubelet uses to judge that "PLEG is unhealthy". If we can observe the measurement growing then we can alert before the node goes unhealthy. Note that the existing metrics PLEGRelistInterval and PLEGRelistDuration are poor for this, because when relist() gets stuck they are never updated. Signed-off-by: Bryan Boreham <bryan@weave.works>	2020-01-03 10:01:27 +00:00
yiyang5055	0f410d625a	change CounterVec to use Counter in the Kubelet's Pod Lifecycle Event Generator	2019-12-11 23:51:28 +08:00
RainbowMango	30bf1f47dd	Hide kubelet metrics that have been deprecated in 1.14	2019-11-13 19:17:38 +08:00
Kubernetes Prow Robot	b3dde20411	Merge pull request #84907 from RainbowMango/pr_migrate_custom_collector_kubelet migrate kubelet custom metrics to stability framework part 1	2019-11-10 19:43:56 -08:00
Kubernetes Prow Robot	9646bd9736	Merge pull request #83664 from RainbowMango/pr_refactor_kubelet_ut_with_metrics_testutil Refactor kubelet ut with metrics testutil	2019-11-10 19:43:42 -08:00
RainbowMango	ee4394a306	Migrate custom collector for kubelet	2019-11-08 09:16:57 +08:00
Clayton Coleman	3c44e11cfa	kubelet: Record preemptions similarly to evictions A preemption is a disruption event that should have a metric so that the rate of preemption can be assessed. Nodes that are under heavy preemption may have conflicting workloads or otherwise need attention. A sudden burst of preemption on a cluster in steady state could indicate pathological conditions within the scheduler or workload controllers.	2019-10-19 19:07:37 -04:00

1 2 3

109 Commits