The repair loop is great at protecting us from leaks, but the side effect
is that bugs can go unnoticed for a long time, so we need some
signal that lets us identify those errors proactively.
Add two new metrics to identify:
- errors in the reconcile loop
- errors per ClusterIP