kubernetes

Author	SHA1	Message	Date
Elana Hashman	b5f24c334e	Bump DynamicKubeConfig metric deprecation to 1.23	2021-07-28 09:29:57 -07:00
Kubernetes Prow Robot	4d78db54a5	Merge pull request #103580 from tkestack/fix-version-format fix kubelet panic when DynamicKubeletConfig enabled	2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot	7c84064a4f	Merge pull request #99000 from verb/1.21-kubelet-metrics Add kubelet metrics for ephemeral containers	2021-07-08 14:00:55 -07:00
Li Bo	79e230ea21	fix kubelet panic when DynamicKubeletConfig enabled	2021-07-08 16:20:51 +08:00
Sergey Kanzhelev	dffc2a60a2	deprecate and disable by default DynamicKubeletConfig feature flag	2021-07-02 23:53:11 +00:00
Lee Verberne	30d2ad576a	Remove ManagedPod,ManagedContainer metrics This replaces the generic ManagedPod and ManagedContainer kubelet metrics with a gauge to track only ephemeral container usage.	2021-06-15 19:02:07 +02:00
pacoxu	650666406e	update kubelet_running_pods metrics comments: pods that have a running pod sandbox Signed-off-by: pacoxu <paco.xu@daocloud.io> Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>	2021-04-29 11:05:52 +08:00
Lee Verberne	29178fff1c	Add kubelet managed pod metrics	2021-04-13 14:13:30 +02:00
Francesco Romani	1e7bb20c52	kubelet: podresources: per-endpoint metrics Before the addition of GetAllocatableResources, the podresources API had just one endpoint `List()`, thus we could just account for the total of the calls to have a good pulse of the API usage. Now that we extend the API with more endpoints (`GetAllocatableResources`), in order to improve the observability we add per-endpoint counters, in addition to the existing counter of the total API calls. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-03-09 13:14:58 +01:00
jialaijun	15612338e5	Migrate pkg/kubelet/metrics logs to structured logging.	2021-02-14 09:41:35 +08:00
wawa0210	f28f0953e6	Adjust kubelet_cgroup_manager_duration_seconds bucket	2021-01-19 16:23:14 +08:00
00041544	f2b8fdb265	Define const for metric name	2020-11-30 14:40:26 +08:00
Alvaro Aleman	801a52c06d	Allow debugging kubelet image pull times This PR changes the buckets of the kubelet_runtime_operation_duration_seconds metric to be metrics.ExponentialBuckets(.005, 2.5, 14) in order to allow debugging image pull times. Right now the biggest bucket is 10 seconds, which is an ordinary time frame to pull an image, making the metric useless for the aforementioned usecase.	2020-11-11 20:18:36 -05:00
Renaud Gaubert	969e45f49f	Add the pod_resources_endpoint_requests_total metric	2020-10-27 11:23:39 -07:00
RyderXia	2214117cd1	clean up unused var containerCache	2020-07-21 16:57:36 +08:00
RainbowMango	168c695e1a	Update two metrics name to make promlint happy.	2020-06-23 15:16:18 +08:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Tim Allclair	43c7f3be29	Register RunPodSandbox* metrics	2020-01-28 13:26:11 -08:00
danielqsj	ab182552b4	clean SinceInMicroseconds, convert to SinceInSeconds	2020-01-10 17:05:38 +08:00
danielqsj	1a9b121764	remove deprecated metrics of kubelet	2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot	49bc696614	Merge pull request #86251 from bboreham/pleg-last-seen-metric Kubelet: add a metric to observe time since PLEG last seen	2020-01-06 18:06:18 -08:00
Bryan Boreham	cc0b3e82eb	Kubelet: add a metric to observe time since PLEG last seen Expose the measurement that kubelet uses to judge that "PLEG is unhealthy". If we can observe the measurement growing then we can alert before the node goes unhealthy. Note that the existing metrics PLEGRelistInterval and PLEGRelistDuration are poor for this, because when relist() gets stuck they are never updated. Signed-off-by: Bryan Boreham <bryan@weave.works>	2020-01-03 10:01:27 +00:00
yiyang5055	0f410d625a	change CounterVec to use Counter in the Kubelet's Pod Lifecycle Event Generator	2019-12-11 23:51:28 +08:00
RainbowMango	30bf1f47dd	Hide kubelet metrics that have been deprecated in 1.14	2019-11-13 19:17:38 +08:00
Kubernetes Prow Robot	b3dde20411	Merge pull request #84907 from RainbowMango/pr_migrate_custom_collector_kubelet migrate kubelet custom metrics to stability framework part 1	2019-11-10 19:43:56 -08:00
Kubernetes Prow Robot	9646bd9736	Merge pull request #83664 from RainbowMango/pr_refactor_kubelet_ut_with_metrics_testutil Refactor kubelet ut with metrics testutil	2019-11-10 19:43:42 -08:00
RainbowMango	ee4394a306	Migrate custom collector for kubelet	2019-11-08 09:16:57 +08:00
Clayton Coleman	3c44e11cfa	kubelet: Record preemptions similarly to evictions A preemption is a disruption event that should have a metric so that the rate of preemption can be assessed. Nodes that are under heavy preemption may have conflicting workloads or otherwise need attention. A sudden burst of preemption on a cluster in steady state could indicate pathological conditions within the scheduler or workload controllers.	2019-10-19 19:07:37 -04:00
RainbowMango	debe2f7b43	Refactor TestRunningPodAndContainerCount with metrics testutil	2019-10-09 15:09:23 +08:00
SataQiu	77f42c8108	eliminate direct references to prometheus	2019-10-04 21:33:34 +08:00
Rajdeep Das	c02d49d775	Update running_pod_count and running_container_count metric. As already mentioned in https://github.com/kubernetes/kubernetes/issues/79286, some metrics like "running_pod_count" and "running_container_count" use non-standard prometheus metrics; this change converts them to standard prometheus gauges. Minor refactor in kubelet/pleg/generic.go and added some tests for running container and running pod metrics. Fixed issues related to github CI pipeline failure: * Updated bazel for new deps * Add comment for exported metrics variables, RunningContainerCount and RunningPodCount * Specify keys explicitly in Gauge metric instantiation. Fix go lint errors: replace "+=1" with "++", as reported by go lint. Set container state as a label for the metrics "running_container_count": as per the metrics name "running_container_count", it should "ideally" show the number of containers in the "running" state, but it was showing the count of all containers, irrespective of their state. This commit adds a new label "container_running_state" to the metrics "running_container_count", which doesn't change the base metrics but adds the option to query the metrics with "container_state" such as "running"/"unknown"/... Remove unused methods reported by staticcheck. Remove variables while instantiating gauge(vec) which are default set to nil. Convert kubelet metrics (running_pod_count and running_container_count) to standard gauges and add a label to the running_container_count metric. Currently the kubelet metrics (running_pod_count and running_container_count) use non-standard prometheus collectors; this change converts them to standard prometheus gauges. It also adds a new label (container_state) to running_container_count, which breaks down the containers tracked by the kubelet by state (running/unknown/created/exited). Set stability explicitly for running_pod_count and running_container_count and reformat test. Register metrics explicitly in test, so that they don't become no-op.	2019-08-29 17:23:04 +02:00
Han Kang	3a50917795	migrate kubelet's metrics/probes & metrics endpoint to metrics stability framework	2019-08-28 11:16:38 -07:00
Seth Jennings	23b69cf02d	kubelet: add eviction counter to metrics	2019-08-13 15:21:38 -05:00
obitech	a5bc997aa9	Fixed pull-kubernetes-verify issues	2019-08-03 21:07:12 +02:00
obitech	457972f1a4	Fix suggestions, track removed library in bazel	2019-08-03 21:07:12 +02:00
obitech	898c40a484	Fix golint failures in some pkg/kubelet packages Fixed: - pkg/kubelet/pod - pkg/kubelet/metrics - pkg/kubelet/configmap - pkg/kubelet/config	2019-08-03 21:07:12 +02:00
Ted Yu	5d1bb99fcd	Log warning if config labels deletion returns false	2019-07-16 09:46:12 -07:00
haiyanmeng	ec18200f8b	Fit RuntimeClass metrics to prometheus conventions 1) Add suffix (`seconds` or `total`) to metric name 2) Switch Summary metric to Histogram metric (Summary metrics are not supported completely by prometheus-to-sd and can't be aggregated.)	2019-02-19 12:46:37 -08:00
danielqsj	79a3eb816c	rename latency to duration in metrics	2019-02-18 17:40:04 +08:00
danielqsj	0bfe4c26b1	add default buckets for histogram metrics	2019-02-18 14:07:30 +08:00
danielqsj	4fa0ee7805	Mark deprecated in related kubelet metrics	2019-02-18 14:03:44 +08:00
danielqsj	0e9515c709	Move kubelet metrics to histogram metrics	2019-02-18 14:03:44 +08:00
danielqsj	9fd99a48f5	Change kubelet metrics to conform guideline	2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot	289a60ad71	Merge pull request #72709 from changyaowei/pleg_relist When pleg channel is full, discard events and record its count	2019-02-13 01:44:48 -08:00
Kubernetes Prow Robot	459e509f94	Merge pull request #73549 from haiyanmeng/runtimeclass Add monitoring for RuntimeClass	2019-02-05 15:14:38 -08:00
haiyanmeng	18bcdcecce	Add monitoring for RuntimeClass	2019-02-04 16:01:29 -08:00
changyaowei	b52afc350f	when pleg channel is full, discard events and record how many events are discarded	2019-01-30 20:43:54 +08:00
danielqsj	1d73c7daed	Add kubelet_node_name metrics	2019-01-15 18:01:04 +08:00
Davanum Srinivas	954996e231	Move from glog to klog - Move from the old github.com/golang/glog to k8s.io/klog - klog as explicit InitFlags() so we add them as necessary - we update the other repositories that we vendor that made a similar change from glog to klog * github.com/kubernetes/repo-infra * k8s.io/gengo/ * k8s.io/kube-openapi/ * github.com/google/cadvisor - Entirely remove all references to glog - Fix some tests by explicit InitFlags in their init() methods Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135	2018-11-10 07:50:31 -05:00
Joonyoung Park	e6d02e9410	fix metrics help comment pod_start_latency_microseconds is not broken down by podname.	2018-07-13 10:26:35 +09:00

87 Commits