Kubelet and kube-proxy both had loops to ensure that their iptables
rules didn't get deleted, by repeatedly recreating them. But on
systems with lots of iptables rules (ie, thousands of services), this
can be very slow (and thus might end up holding the iptables lock for
several seconds, blocking other operations, etc).
The specific threat that they need to worry about is
firewall-management commands that flush *all* dynamic iptables rules.
So add a new iptables.Monitor() function that handles this by creating
iptables-flush canaries and only triggering a full rule reload after
noticing that someone has deleted those chains.
The firewalld monitoring code was not well tested (and not easily
testable), would never be triggered on most platforms, and was only
being taken advantage of from one place (kube-proxy), which didn't
need it anyway since it already has its own resync loop.
Since the firewalld monitoring was the only consumer of pkg/util/dbus,
we can also now delete that.
Work around Linux kernel bug that sometimes causes multiple flows to
get mapped to the same IP:PORT and consequently some suffer packet
drops.
Also made the same update in kubelet.
Also added cross-pointers between the two bodies of code, in comments.
Some day we should eliminate the duplicate code. But today is not
that day.
This patch moves the HostUtil functionality from the util/mount package
to the volume/util/hostutil package.
All `*NewHostUtil*` calls are changed to return concrete types instead
of interfaces.
All callers are changed to use the `*NewHostUtil*` methods instead of
directly instantiating the concrete types.
The MakeFile and MakeDir methods in the HostUtil interface only had one
caller -- the Host Path volume plugin. This patch relocates MakeFile and
MakeDir to the Host Path plugin itself.
Increased the number of tries in pkg/util/node/node.go::GetNodeIP by
1, because the kube-proxy was giving up too early.
This is meant to address #81879
This patch takes all the HostUtil functionality currently found in
mount*.go files and copies it into hostutil*.go files. Care was taken to
preserve git history to the fullest extent.
As part of doing this, some common functionality was moved into
mount_helper files in preperation for HostUtils to stay in k/k and Mount
to move out. THe tests for each relevant function were moved to test
files to match the appropriate location.
This patch renames GetFSGroup (a process property) to GetOwner (a file
property), returning both the uid and gid of the given pathname. This
method is only used in one place in the k/k codebase, but having
"GetOwner" instead of "GetGroup" seems to have more utility.
This patch adds comments to exported items that were missing them in
order to make the linter happy. Only code changes that were limited to
the scope of this package were made. There are other linting issues that
will effect callers, and that will be done a seperate patch.
The iptables code was doing version detection on the iptables binary
but feature detection on the iptables-restore binary, to try to
support the version of iptables in RHEL 7, which claims to be 1.4.21
but has certain features from iptables 1.6.
The problem is that this particular set of versions and checks
resulted in the code passing "-w" ("wait forever for the lock") to
iptables, but "-w 5" ("wait at most 5 seconds for the lock") to
iptables-restore. On systems with very very many iptables rules, this
could result in the kubelet periodic resyncs (which use "iptables")
blocking kube-proxy (which uses "iptables-restore") and causing it to
time out.
We already have code to grab the lock file by hand when using a
version of iptables-restore that doesn't support "-w", and it works
fine. So just use that instead, and only pass "-w 5" to
iptables-restore when iptables reports a version that actually
supports it.
ipvs `getProxyMode` test fails on mac as `utilipvs.GetRequiredIPVSMods`
try to reach `/proc/sys/kernel/osrelease` to find version of the running
linux kernel. Linux kernel version is used to determine the list of required
kernel modules for ipvs.
Logic to determine kernel version is moved to GetKernelVersion
method in LinuxKernelHandler which implements ipvs.KernelHandler.
Mock KernelHandler is used in the test cases.
Read and parse file is converted to go function instead of execing cut.
This patch refactors pkg/util/mount to be more usable outside of
Kubernetes. This is done by refactoring mount.Interface to only contain
methods that are not K8s specific. Methods that are not relevant to
basic mount activities but still have OS-specific implementations are
now found in a mount.HostUtils interface.