89 Commits

Author SHA1 Message Date
Elana Hashman
d2ed3b28b7
Revert "revert Bump DynamicKubeConfig metric deprecation to 1.23 by delta update" 2021-08-06 08:38:56 -07:00
kerthcet
980cf85439 revert Bump DynamicKubeConfig metric deprecation to 1.23 by delta update
Signed-off-by: kerthcet <kerthcet@gmail.com>
2021-08-02 23:15:10 +08:00
Elana Hashman
b5f24c334e
Bump DynamicKubeConfig metric deprecation to 1.23 2021-07-28 09:29:57 -07:00
Kubernetes Prow Robot
4d78db54a5
Merge pull request #103580 from tkestack/fix-version-format
fix kubelet panic when DynamicKubeletConfig enabled
2021-07-08 14:02:24 -07:00
Kubernetes Prow Robot
7c84064a4f
Merge pull request #99000 from verb/1.21-kubelet-metrics
Add kubelet metrics for ephemeral containers
2021-07-08 14:00:55 -07:00
Li Bo
79e230ea21 fix kubelet panic when DynamicKubeletConfig enabled 2021-07-08 16:20:51 +08:00
Sergey Kanzhelev
dffc2a60a2 deprecate and disable by default DynamicKubeletConfig feature flag 2021-07-02 23:53:11 +00:00
Lee Verberne
30d2ad576a Remove ManagedPod,ManagedContainer metrics
This replaces the generic ManagedPod and ManagedContainer kubelet
metrics with a gauge to track only ephemeral container usage.
2021-06-15 19:02:07 +02:00
pacoxu
650666406e update kubelet_running_pods metrics comments: pods that have a running pod sandbox
Signed-off-by: pacoxu <paco.xu@daocloud.io>
Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>
2021-04-29 11:05:52 +08:00
Lee Verberne
29178fff1c Add kubelet managed pod metrics 2021-04-13 14:13:30 +02:00
Francesco Romani
1e7bb20c52 kubelet: podresources: per-endpoint metrics
Before the addition of GetAllocatableResources, the
podresources API had just one endpoint `List()`, thus we could just
account for the total of the calls to have a good pulse of the API usage.
Now that we extend the API with more endpoints
(`GetAllocatableResources`), in order to improve the observability we add
per-endpoint counters, in addition to the existing counter of the total
API calls.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-03-09 13:14:58 +01:00
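The per-endpoint accounting described in commit 1e7bb20c52 can be sketched as follows. This is a minimal illustration using only the standard library, not the kubelet's code: `endpointCounters` and its methods are hypothetical names standing in for a labeled counter plus a total counter.

```go
package main

import (
	"fmt"
	"sync"
)

// endpointCounters is a hypothetical stand-in for a labeled counter:
// one count per endpoint name, plus a running total across all endpoints.
type endpointCounters struct {
	mu          sync.Mutex
	total       int
	perEndpoint map[string]int
}

func newEndpointCounters() *endpointCounters {
	return &endpointCounters{perEndpoint: make(map[string]int)}
}

// observe records one call to the named endpoint and bumps the total.
func (c *endpointCounters) observe(endpoint string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.total++
	c.perEndpoint[endpoint]++
}

func main() {
	c := newEndpointCounters()
	c.observe("List")
	c.observe("List")
	c.observe("GetAllocatableResources")
	// The total alone cannot tell which endpoint is busy; the per-endpoint
	// counts can.
	fmt.Println(c.total, c.perEndpoint["List"], c.perEndpoint["GetAllocatableResources"])
}
```

Keeping the total alongside the per-endpoint counts matches the message's intent of adding counters without removing the existing one.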
jialaijun
15612338e5 Migrate pkg/kubelet/metrics logs to structured logging. 2021-02-14 09:41:35 +08:00
wawa0210
f28f0953e6
Adjust kubelet_cgroup_manager_duration_seconds bucket 2021-01-19 16:23:14 +08:00
00041544
f2b8fdb265 Define const for metric name 2020-11-30 14:40:26 +08:00
Alvaro Aleman
801a52c06d
Allow debugging kubelet image pull times
This PR changes the buckets of the
kubelet_runtime_operation_duration_seconds metric to be
metrics.ExponentialBuckets(.005, 2.5, 14) in order to
allow debugging image pull times. Right now the biggest bucket is 10
seconds, which is an ordinary time frame to pull an image, making the
metric useless for the aforementioned use case.
2020-11-11 20:18:36 -05:00
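To see what the bucket change in commit 801a52c06d covers, the bounds can be computed by hand. The sketch below assumes the usual meaning of exponential buckets (each upper bound is the previous one multiplied by the factor); `exponentialBuckets` is a local re-implementation for illustration, not the function from the metrics package.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor at each step. This is an illustrative
// re-implementation, assumed to match the semantics named in the commit.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	b := exponentialBuckets(0.005, 2.5, 14)
	// Print every upper bound in seconds, smallest first.
	for i, upper := range b {
		fmt.Printf("bucket %2d: %.3fs\n", i, upper)
	}
}
```

Under that assumption the largest bound is 0.005 × 2.5^13, roughly 745 seconds, which leaves room for image pulls that take several minutes.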
Renaud Gaubert
969e45f49f Add the pod_resources_endpoint_requests_total metric 2020-10-27 11:23:39 -07:00
RyderXia
2214117cd1 clean up unused var containerCache 2020-07-21 16:57:36 +08:00
RainbowMango
168c695e1a Update two metrics name to make promlint happy. 2020-06-23 15:16:18 +08:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Tim Allclair
43c7f3be29 Register RunPodSandbox* metrics 2020-01-28 13:26:11 -08:00
danielqsj
ab182552b4 clean SinceInMicroseconds, convert to SinceInSeconds 2020-01-10 17:05:38 +08:00
danielqsj
1a9b121764 remove deprecated metrics of kubelet 2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot
49bc696614
Merge pull request #86251 from bboreham/pleg-last-seen-metric
Kubelet: add a metric to observe time since PLEG last seen
2020-01-06 18:06:18 -08:00
Bryan Boreham
cc0b3e82eb Kubelet: add a metric to observe time since PLEG last seen
Expose the measurement that kubelet uses to judge that "PLEG is
unhealthy". If we can observe the measurement growing then we can
alert before the node goes unhealthy.

Note that the existing metrics PLEGRelistInterval and
PLEGRelistDuration are poor for this, because when relist() gets
stuck they are never updated.

Signed-off-by: Bryan Boreham <bryan@weave.works>
2020-01-03 10:01:27 +00:00
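The reasoning in commit cc0b3e82eb can be illustrated with a small sketch. It assumes a gauge whose value is computed at read time from the timestamp of the last completed relist; `relistTracker`, `markRelist`, `lastSeenSeconds`, and `at` are hypothetical names, not the kubelet's.

```go
package main

import (
	"fmt"
	"time"
)

// relistTracker is a hypothetical stand-in for the state kept about the
// last completed relist.
type relistTracker struct {
	lastRelist time.Time
}

// markRelist records that a relist completed at the given time.
func (t *relistTracker) markRelist(now time.Time) { t.lastRelist = now }

// lastSeenSeconds is the value a "time since last seen" gauge would report.
// It is derived from the current time, so it keeps growing even when
// relist is stuck and markRelist is never called again.
func (t *relistTracker) lastSeenSeconds(now time.Time) float64 {
	return now.Sub(t.lastRelist).Seconds()
}

// at builds a timestamp from a number of seconds, to keep the demo
// deterministic.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	t := &relistTracker{}
	t.markRelist(at(100))
	// Simulate a stuck relist: no further markRelist calls, only reads.
	fmt.Println(t.lastSeenSeconds(at(101)))
	fmt.Println(t.lastSeenSeconds(at(300)))
}
```

A metric that is only written when relist finishes would stay frozen in the same scenario, which is the weakness the message attributes to PLEGRelistInterval and PLEGRelistDuration.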
yiyang5055
0f410d625a change CounterVec to use Counter in the Kubelet's Pod Lifecycle Event Generator 2019-12-11 23:51:28 +08:00
RainbowMango
30bf1f47dd Hide kubelet metrics that have been deprecated in 1.14 2019-11-13 19:17:38 +08:00
Kubernetes Prow Robot
b3dde20411
Merge pull request #84907 from RainbowMango/pr_migrate_custom_collector_kubelet
migrate kubelet custom metrics to stability framework part 1
2019-11-10 19:43:56 -08:00
Kubernetes Prow Robot
9646bd9736
Merge pull request #83664 from RainbowMango/pr_refactor_kubelet_ut_with_metrics_testutil
Refactor kubelet ut with metrics testutil
2019-11-10 19:43:42 -08:00
RainbowMango
ee4394a306 Migrate custom collector for kubelet 2019-11-08 09:16:57 +08:00
Clayton Coleman
3c44e11cfa
kubelet: Record preemptions similarly to evictions
A preemption is a disruption event that should have a metric so that
the rate of preemption can be assessed. Nodes that are under heavy
preemption may have conflicting workloads or otherwise need attention.
A sudden burst of preemption on a cluster in steady state could
indicate pathological conditions within the scheduler or workload
controllers.
2019-10-19 19:07:37 -04:00
RainbowMango
debe2f7b43 Refactor TestRunningPodAndContainerCount with metrics testutil 2019-10-09 15:09:23 +08:00
SataQiu
77f42c8108 eliminate direct references to prometheus 2019-10-04 21:33:34 +08:00
Rajdeep Das
c02d49d775 Update running_pod_count and running_container_count metric
As already mentioned in this issue https://github.com/kubernetes/kubernetes/issues/79286, some metrics like
"running_pod_count" and "running_container_count" use non-standard Prometheus metrics. This change converts them to
standard Prometheus gauges.

Minor refactor in kubelet/pleg/generic.go and added some tests for the running container and running pod metrics

Fixed issues related to github CI pipeline failure

* Updated bazel for new deps
* Add comments for exported metrics variables, RunningContainerCount and RunningPodCount
* Specify keys explicitly in Gauge metric instantiation

Fix go lint errors

Replace "+=1" with "++", as reported by go lint

Set container state as a label for the metrics "running_container_count"

As per the metric name "running_container_count", it should ideally show
the number of containers in the "running" state, but it was showing the count of all containers, irrespective of their state.
This commit adds a new label "container_running_state" to the metric "running_container_count", which doesn't change the base metric but adds the
option to query the metric by "container_state", such as "running", "unknown", ...

remove unused methods reported by staticcheck

Remove variables while instantiating gauge(vec) which are default set to nil

Convert kubelet metrics (running_pod_count and running_container_count) to standard gauges and add a label to the running_container_count metric.

Currently the kubelet metrics running_pod_count and running_container_count use non-standard Prometheus collectors; this change
converts them to standard Prometheus gauges. It also adds a new label (container_state) to running_container_count, which breaks down the
containers tracked by the kubelet by state (running/unknown/created/exited).

Set stability explicitly for running_pod_count and running_container_count and reformat test

register metrics explicitly in test, so that they don't become no-op
2019-08-29 17:23:04 +02:00
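The per-state breakdown described in commit c02d49d775 amounts to grouping containers by state before setting one gauge value per label. A minimal sketch, assuming states are plain strings; `countByState` is a hypothetical helper, not code from the commit.

```go
package main

import "fmt"

// countByState groups container states into per-state counts, which is
// the shape a gauge with a container_state label would expose.
func countByState(states []string) map[string]int {
	counts := make(map[string]int)
	for _, s := range states {
		counts[s]++
	}
	return counts
}

func main() {
	// Hypothetical snapshot of the containers tracked on a node.
	states := []string{"running", "running", "exited", "created", "unknown"}
	counts := countByState(states)
	// Summing over all labels gives the old "all containers" number;
	// selecting "running" gives what the metric name suggests.
	fmt.Println(len(states), counts["running"])
}
```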
Han Kang
3a50917795 migrate kubelet's metrics/probes & metrics endpoint to metrics stability framework 2019-08-28 11:16:38 -07:00
Seth Jennings
23b69cf02d kubelet: add eviction counter to metrics 2019-08-13 15:21:38 -05:00
obitech
a5bc997aa9 Fixed pull-kubernetes-verify issues 2019-08-03 21:07:12 +02:00
obitech
457972f1a4 Fix suggestions, track removed library in bazel 2019-08-03 21:07:12 +02:00
obitech
898c40a484 Fix golint failures in some pkg/kubelet packages
Fixed:
- pkg/kubelet/pod
- pkg/kubelet/metrics
- pkg/kubelet/configmap
- pkg/kubelet/config
2019-08-03 21:07:12 +02:00
Ted Yu
5d1bb99fcd Log warning if config labels deletion returns false 2019-07-16 09:46:12 -07:00
haiyanmeng
ec18200f8b Fit RuntimeClass metrics to prometheus conventions
1) Add suffix (`seconds` or `total`) to metric name
2) Switch Summary metric to Histogram metric (Summary metrics are not
supported completely by prometheus-to-sd and can't be aggregated.)
2019-02-19 12:46:37 -08:00
danielqsj
79a3eb816c rename latency to duration in metrics 2019-02-18 17:40:04 +08:00
danielqsj
0bfe4c26b1 add default buckets for histogram metrics 2019-02-18 14:07:30 +08:00
danielqsj
4fa0ee7805 Mark deprecated in related kubelet metrics 2019-02-18 14:03:44 +08:00
danielqsj
0e9515c709 Move kubelet metrics to histogram metrics 2019-02-18 14:03:44 +08:00
danielqsj
9fd99a48f5 Change kubelet metrics to conform to guidelines 2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot
289a60ad71
Merge pull request #72709 from changyaowei/pleg_relist
When pleg channel is full, discard events and record its count
2019-02-13 01:44:48 -08:00
Kubernetes Prow Robot
459e509f94
Merge pull request #73549 from haiyanmeng/runtimeclass
Add monitoring for RuntimeClass
2019-02-05 15:14:38 -08:00
haiyanmeng
18bcdcecce Add monitoring for RuntimeClass 2019-02-04 16:01:29 -08:00
changyaowei
b52afc350f when pleg channel is full, discard events and record how many events are discarded 2019-01-30 20:43:54 +08:00
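The behaviour named in commit b52afc350f (drop events when the channel is full and count the drops) is commonly written in Go as a non-blocking send. The sketch below is an illustration under that assumption; `event` and `sendOrDiscard` are hypothetical names, and a plain integer stands in for the discard counter metric.

```go
package main

import "fmt"

// event is a hypothetical placeholder for a pod lifecycle event.
type event struct{ podID string }

// sendOrDiscard tries to put ev on ch without blocking. If the channel
// buffer is full, the event is dropped and the discard count is bumped.
// It reports whether the event was delivered.
func sendOrDiscard(ch chan event, ev event, discarded *int) bool {
	select {
	case ch <- ev:
		return true
	default:
		*discarded++
		return false
	}
}

func main() {
	ch := make(chan event, 2) // small buffer so the demo overflows quickly
	discarded := 0
	for i := 0; i < 5; i++ {
		sendOrDiscard(ch, event{podID: fmt.Sprintf("pod-%d", i)}, &discarded)
	}
	// Nothing reads from ch, so only the first two events fit.
	fmt.Println(len(ch), discarded)
}
```

The producer never blocks on a slow consumer, at the cost of lost events; counting the drops makes that loss visible.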
danielqsj
1d73c7daed Add kubelet_node_name metrics 2019-01-15 18:01:04 +08:00