Fix some warnings from go-staticcheck.
"should use buffer.String() instead of string(buffer.Bytes()) (S1030)"
This warning is explained at this link.
https://staticcheck.io/docs/checks#S1030
the iptables restore function, if it considers that the --wait flag
is not supported, creates a lock file to mimic the iptables behaviour.
The test should take this into account and remove the file.
1. For iptables mode, add KUBE-NODEPORTS chain in filter table. Add
rules to allow healthcheck node port traffic.
2. For ipvs mode, add KUBE-NODE-PORT chain in filter table. Add
KUBE-HEALTH-CHECK-NODE-PORT ipset to allow traffic to healthcheck
node port.
the iptables monitor was using iptables -L to list the chains,
without the -n option, so it was trying to do reverse DNS lookups.
A side effect is that it was holding the lock, so other components
could not use it.
We can use -S instead of -L -n to avoid this, since we only want
to check the chain exists.
iptables has two options to modify the behaviour trying to
acquire the lock.
--wait -w [seconds] maximum wait to acquire xtables lock
before give up
--wait-interval -W [usecs] wait time to try to acquire xtables
lock
interval to wait for xtables lock
default is 1 second
Kubernetes uses -w 5 that means that wait 5 seconds to try to
acquire the lock. If we are not able to acquire it, kube-proxy
fails and retries in 30 seconds, that is an important penalty
on sensitive applications.
We can be a bit more aggresive and try to acquire the lock every
100 msec, that means that we have to fail 50 times to not being
able to succeed.
Kubelet and kube-proxy both had loops to ensure that their iptables
rules didn't get deleted, by repeatedly recreating them. But on
systems with lots of iptables rules (ie, thousands of services), this
can be very slow (and thus might end up holding the iptables lock for
several seconds, blocking other operations, etc).
The specific threat that they need to worry about is
firewall-management commands that flush *all* dynamic iptables rules.
So add a new iptables.Monitor() function that handles this by creating
iptables-flush canaries and only triggering a full rule reload after
noticing that someone has deleted those chains.
The firewalld monitoring code was not well tested (and not easily
testable), would never be triggered on most platforms, and was only
being taken advantage of from one place (kube-proxy), which didn't
need it anyway since it already has its own resync loop.
Since the firewalld monitoring was the only consumer of pkg/util/dbus,
we can also now delete that.
Work around Linux kernel bug that sometimes causes multiple flows to
get mapped to the same IP:PORT and consequently some suffer packet
drops.
Also made the same update in kubelet.
Also added cross-pointers between the two bodies of code, in comments.
Some day we should eliminate the duplicate code. But today is not
that day.
The iptables code was doing version detection on the iptables binary
but feature detection on the iptables-restore binary, to try to
support the version of iptables in RHEL 7, which claims to be 1.4.21
but has certain features from iptables 1.6.
The problem is that this particular set of versions and checks
resulted in the code passing "-w" ("wait forever for the lock") to
iptables, but "-w 5" ("wait at most 5 seconds for the lock") to
iptables-restore. On systems with very very many iptables rules, this
could result in the kubelet periodic resyncs (which use "iptables")
blocking kube-proxy (which uses "iptables-restore") and causing it to
time out.
We already have code to grab the lock file by hand when using a
version of iptables-restore that doesn't support "-w", and it works
fine. So just use that instead, and only pass "-w 5" to
iptables-restore when iptables reports a version that actually
supports it.