This adds a metric, kubeproxy_sync_proxy_rules_last_queued_timestamp,
that captures the last time a change was queued to be applied to the
proxy. This matches the healthz logic, which fails if a pending change
is stale.
This allows us to write alerts that mirror healthz.
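Below is a minimal sketch of how such a gauge could be registered and updated with client_golang. The exact metric/subsystem split, the `_seconds` suffix, and the `onChangeQueued` hook are illustrative assumptions; kube-proxy defines its metrics through its own metrics package.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// syncProxyRulesLastQueuedTimestamp records the last time a change was
// queued to be applied by the proxy, as a Unix timestamp in seconds.
// Names here are illustrative, not the authoritative definitions.
var syncProxyRulesLastQueuedTimestamp = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Subsystem: "kubeproxy",
		Name:      "sync_proxy_rules_last_queued_timestamp_seconds",
		Help:      "The last time a change was queued to be applied by the proxy.",
	},
)

func init() {
	prometheus.MustRegister(syncProxyRulesLastQueuedTimestamp)
}

// onChangeQueued is a hypothetical hook called whenever a Service or
// Endpoints change is queued for the next sync, which is the same point
// at which the healthz staleness clock starts.
func onChangeQueued() {
	syncProxyRulesLastQueuedTimestamp.SetToCurrentTime()
}
```

An alert that mirrors healthz could then compare this timestamp against the last successful sync timestamp and fire when the gap exceeds the healthz threshold.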
Signed-off-by: Casey Callendrello <cdc@redhat.com>
As mentioned in issue #80061, under iptables lock contention we can see an
increasing rate of iptables restore failures, because each restore call needs
to grab the iptables file lock.
The failure metric gives administrators more insight into these failures.
The metric is collected in both the kube-proxy iptables and ipvs modes.
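A rough sketch of the counter and the point where it would be incremented is below. The metric name and the `recordRestoreResult` helper are assumptions for illustration only; the real wiring lives in the iptables and ipvs proxier sync loops.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// iptablesRestoreFailuresTotal counts failed iptables-restore calls, which
// tend to rise under iptables file-lock contention. The name is illustrative.
var iptablesRestoreFailuresTotal = prometheus.NewCounter(
	prometheus.CounterOpts{
		Subsystem: "kubeproxy",
		Name:      "sync_proxy_rules_iptables_restore_failures_total",
		Help:      "Cumulative number of iptables restore call failures.",
	},
)

func init() {
	prometheus.MustRegister(iptablesRestoreFailuresTotal)
}

// recordRestoreResult is a hypothetical helper showing where the counter
// would be bumped: once per failed restore attempt during a rules sync.
func recordRestoreResult(err error) {
	if err != nil {
		iptablesRestoreFailuresTotal.Inc()
	}
}
```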
Signed-off-by: Hui Luo <luoh@vmware.com>
refs https://github.com/kubernetes/perf-tests/issues/640
The bucket granularity is currently too fine for lower latencies, at the cost
of the higher latencies (7+ minutes). This causes spikes in the SLIs
calculated from these metrics.
I don't have a strong opinion about the exact values - these simply seemed to
match our needs better. Let's discuss them (a sketch expressing them as
exponential buckets follows the list below).
Values:
0.015 s
0.030 s
0.060 s
0.120 s
0.240 s
0.480 s
0.960 s
1.920 s
3.840 s
7.680 s
15.360 s
30.720 s
61.440 s
122.880 s
245.760 s
491.520 s
983.040 s
1966.080 s
3932.160 s
7864.320 s
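The list above is a geometric sequence doubling from 15ms, so it can be expressed with client_golang's exponential bucket helper. The histogram name and opts below are a sketch under that assumption, not the actual kube-proxy definition.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// syncProxyRulesLatency uses 20 exponential buckets doubling from 0.015s up
// to ~7864s, keeping resolution for multi-minute syncs instead of crowding
// all buckets below a few seconds. The histogram name is illustrative.
var syncProxyRulesLatency = prometheus.NewHistogram(
	prometheus.HistogramOpts{
		Subsystem: "kubeproxy",
		Name:      "sync_proxy_rules_duration_seconds",
		Help:      "SyncProxyRules latency in seconds.",
		// Equivalent to the bucket list above: 0.015 * 2^n for n = 0..19.
		Buckets: prometheus.ExponentialBuckets(0.015, 2, 20),
	},
)

func init() {
	prometheus.MustRegister(syncProxyRulesLatency)
}
```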
This adds some useful metrics around pending changes and last successful
sync time.
The goal is for administrators to be able to alert on proxies whose rules
have, for whatever reason, become significantly stale.
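A minimal sketch of what such metrics could look like is below. The gauge names and the `markSyncSuccess` helper are assumptions for illustration; the authoritative definitions are in kube-proxy's metrics package.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

var (
	// lastSyncTimestamp is set after every successful rule sync.
	lastSyncTimestamp = prometheus.NewGauge(prometheus.GaugeOpts{
		Subsystem: "kubeproxy",
		Name:      "sync_proxy_rules_last_timestamp_seconds",
		Help:      "The last time proxy rules were successfully synced.",
	})

	// pendingChanges tracks changes that have been queued but not yet
	// applied to the dataplane.
	pendingChanges = prometheus.NewGauge(prometheus.GaugeOpts{
		Subsystem: "kubeproxy",
		Name:      "sync_proxy_rules_pending_changes",
		Help:      "Number of pending changes that have not yet been synced.",
	})
)

func init() {
	prometheus.MustRegister(lastSyncTimestamp, pendingChanges)
}

// markSyncSuccess is a hypothetical hook called at the end of a successful
// sync: the pending count drops to zero and the last-sync timestamp advances.
// A staleness alert can then compare this timestamp (or the pending count)
// against the last-queued timestamp.
func markSyncSuccess() {
	pendingChanges.Set(0)
	lastSyncTimestamp.SetToCurrentTime()
}
```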
Signed-off-by: Casey Callendrello <cdc@redhat.com>