Commit Graph

619 Commits

Author SHA1 Message Date
Dan Winship
7696bcd10c Remove some now-obviously-unnecessary checks
Now that the endpoint update fields have names that make it clear that
they only contain UDP objects, it's obvious that the "protocol == UDP"
checks in the iptables and ipvs proxiers were no-ops, so remove them.
2023-03-14 12:18:58 -04:00
Dan Winship
c5c0d9f5bd Make deleteEndpointConnection test use syncProxyRules
Rather than calling fp.deleteEndpointConnection() directly, set up the
proxy to have syncProxyRules() call it, so that we are testing it in
the way that it actually gets called.

Squash the IPv4 and IPv6 unit tests together so we don't need to
duplicate all that code. Fix a tiny bug in NewFakeProxier() found
while doing this...
2023-03-14 12:18:58 -04:00
Dan Winship
dea8e34ea7 Improve the naming of the stale-conntrack-entry-tracking fields
The APIs talked about "stale services" and "stale endpoints", but the
thing that is actually "stale" is the conntrack entries, not the
services/endpoints. Fix the names to indicate what they actual keep
track of.

Also, all three fields (2 in the endpoints update object and 1 in the
service update object) are currently UDP-specific, but only the
service one made that clear. Fix that too.
2023-03-14 12:18:58 -04:00
Dan Winship
4381973a44 Revert (most of) "Issue 70020; Flush Conntrack entities for SCTP"
This commit did not actually work; in between when it was first
written and tested, and when it merged, the code in
pkg/proxy/endpoints.go was changed to only add UDP endpoints to the
"stale endpoints"/"stale services" lists, and so checking for "either
UDP or SCTP" rather than just UDP when processing those lists had no
effect.

This reverts most of commit aa8521df66
(but leaves the changes related to
ipvs.IsRsGracefulTerminationNeeded() since that actually did have the
effect it meant to have).
2023-03-14 12:18:58 -04:00
Kubernetes Prow Robot
611273a5bb
Merge pull request #115253 from danwinship/proxy-update-healthchecknodeport
Split out HealthCheckNodePort stuff from service/endpoint map Update()
2023-03-13 15:22:48 -07:00
Alexander Constantinescu
ec917850af Add proxy healthz result to ETP=local health check
Today, the health check response to the load balancers asking Kube-proxy for
the status of ETP:Local services does not include the healthz state of Kube-
proxy. This means that Kube-proxy might indicate to load balancers that they
should forward traffic to the node in question, simply because the endpoint
is running on the node - this overlooks the fact that Kube-proxy might be
not-healthy and hasn't successfully written the rules enabling traffic to
reach the endpoint.
2023-03-06 10:53:17 +01:00
Dan Winship
0c2711bf24 Make NodePortAddresses abstraction around GetNodeAddresses/ContainsIPv4Loopback 2023-02-22 08:32:19 -05:00
Dan Winship
d43878f970 Put all iptables nodeport address handling in one place
For some reason we were calculating the available nodeport IPs at the
top of syncProxyRules even though we didn't use them until the end.
(Well, the previous code avoided generating KUBE-NODEPORTS chain rules
if there were no node IPs available, but that case is considered an
error anyway, so there's no need to optimize it.)

(Also fix a stale `err` reference exposed by this move.)
2023-02-22 08:30:36 -05:00
Kubernetes Prow Robot
c94f708ce4
Merge pull request #114470 from danwinship/kep-3178-fixups
KEP-3178-related iptables rule fixups
2023-02-21 14:24:08 -08:00
Artem Minyaylov
f573e14942 Update k8s.io/utils to latest version
Update all usages of FakeExec to pointer to avoid copying the mutex
2023-02-04 11:05:22 -08:00
Dan Winship
d901992eae Split out HealthCheckNodePort stuff from service/endpoint map Update()
In addition to actually updating their data from the provided list of
changes, EndpointsMap.Update() and ServicePortMap.Update() return a
struct with some information about things that changed because of that
update (eg services with stale conntrack entries).

For some reason, they were also returning information about
HealthCheckNodePorts, but they were returning *static* information
based on the current (post-Update) state of the map, not information
about what had *changed* in the update. Since this doesn't match how
the other data in the struct is used (and since there's no reason to
have the data only be returned when you call Update() anyway) , split
it out.
2023-01-22 10:33:33 -05:00
Dan Winship
b9bc0e5ac8 Ensure needFullSync is set at iptables proxy startup
The unit tests were broken with MinimizeIPTablesRestore enabled
because syncProxyRules() assumed that needFullSync would be set on the
first (post-setInitialized()) run, but the unit tests didn't ensure
that.

(In fact, there was a race condition in the real Proxier case as well;
theoretically syncProxyRules() could be run by the
BoundedFrequencyRunner after OnServiceSynced() called setInitialized()
but before it called forceSyncProxyRules(), thus causing the first
real sync to try to do a partial sync and fail. This is now fixed as
well.)
2023-01-18 10:50:12 -05:00
Dan Winship
169604d906 Validate single-stack --nodeport-addresses sooner
In the dual-stack case, iptables.NewDualStackProxier and
ipvs.NewDualStackProxier filtered the nodeport addresses values by IP
family before creating the single-stack proxiers. But in the
single-stack case, the kube-proxy startup code just passed the value
to the single-stack proxiers without validation, so they had to
re-check it themselves. Fix that.
2023-01-03 09:01:45 -05:00
Dan Winship
e7ed7220eb Explicitly pass IP family to proxier
Rather than re-determining it from the iptables object in both proxies.
2023-01-03 09:01:45 -05:00
Dan Winship
0ea0295965 Duplicate the "anti-martian-packet" rule in kube-proxy
This rule was mistakenly added to kubelet even though it only applies
to kube-proxy's traffic. We do not want to remove it from kubelet yet
because other components may be depending on it for security, but we
should make kube-proxy output its own rule rather than depending on
kubelet.
2022-12-29 16:24:58 -05:00
Dan Winship
305641bd4c Add iptablesKubeletJumpChains to iptables proxier
Some of the chains kube-proxy creates are also created by kubelet; we
need to ensure that those chains exist but we should not delete them
in CleanupLeftovers().
2022-12-29 16:24:58 -05:00
Dan Winship
37a8a2bdaf fix indentation of iptables dumps in some test cases
(Especially, use tabs rather than spaces.)
2022-12-29 16:24:58 -05:00
Dan Winship
1870c4cdd7 Add a comment-only rule to the end of KUBE-FW-* chains
With the removal of the "-j KUBE-MARK-DROP" rules, the firewall chains
end rather ambiguously. Add a comment-only rule explaining what will
happen.
2022-12-29 16:24:58 -05:00
Dan Winship
c6cc056675 Replace iptables-proxy-specific SCTP e2e test with a unit test
We had a test that creating a Service with an SCTP port would create
an iptables rule with "-p sctp" in it, which let us test that
kube-proxy was doing vaguely the right thing with SCTP even if the e2e
environment didn't have SCTP support. But this would really make much
more sense as a unit test.
2022-12-13 16:21:12 -05:00
Tim Hockin
dd0a50336e
ServiceInternalTrafficPolicyType: s/Type//
Rename ServiceInternalTrafficPolicyType => ServiceInternalTrafficPolicy
2022-12-11 13:48:31 -08:00
Tim Hockin
d0e2b06850
ServiceExternalTrafficPolicyType: s/Type//
Rename ServiceExternalTrafficPolicyType => ServiceExternalTrafficPolicy
2022-12-11 13:48:27 -08:00
Dan Winship
bfa4948bb6 Don't re-run EnsureChain/EnsureRules on partial syncs
We currently invoke /sbin/iptables 24 times on each syncProxyRules
before calling iptables-restore. Since even trivial iptables
invocations are slow on hosts with lots of iptables rules, this adds a
lot of time to each sync. Since these checks are expected to be a
no-op 99% of the time, skip them on partial syncs.
2022-11-29 09:42:49 -05:00
cyclinder
4aff0dba0d
kube-proxy ipatbles: update log message 2022-11-04 10:07:15 +08:00
Kubernetes Prow Robot
d86c013b0d
Merge pull request #108250 from cyclinder/add_flag_in_proxy
kube-proxy:  add a flag  to  disable nodePortOnLocalhost
2022-11-03 17:10:13 -07:00
Andy Voltz
29f4862ed8 Promote ServiceInternalTrafficPolicy to GA 2022-11-03 13:17:03 -04:00
Manav Agarwal
3320e50e24 If applied, this commit will refactor variable names in kube-proxy 2022-11-03 03:45:57 +05:30
cyclinder
bef2070031
kube-proxy: add a flag to disables the allowing NodePort services to be accessed via localhost 2022-11-02 16:17:52 +08:00
Dan Winship
818de5a545 proxy/iptables: Add metric for partial sync failures, add test 2022-09-26 16:31:42 -04:00
Dan Winship
ab326d2f4e proxy/iptables: Don't rewrite chains that haven't changed
iptables-restore requires that if you change any rule in a chain, you
have to rewrite the entire chain. But if you avoid mentioning a chain
at all, it will leave it untouched. Take advantage of this by not
rewriting the SVC, SVL, EXT, FW, and SEP chains for services that have
not changed since the last sync, which should drastically cut down on
the size of each iptables-restore in large clusters.
2022-09-26 16:30:42 -04:00
Dan Winship
c437b15441 proxy/iptables: make part of the unit test sanity-checking optional 2022-08-24 09:02:48 -04:00
Kubernetes Prow Robot
da112dda68
Merge pull request #111806 from danwinship/kube-proxy-no-mode-fallback
remove kube-proxy mode fallback
2022-08-24 05:52:03 -07:00
Dan Winship
9f69a3a9d4 kube-proxy: remove iptables-to-userspace fallback
Back when iptables was first made the default, there were
theoretically some users who wouldn't have been able to support it due
to having an old /sbin/iptables. But kube-proxy no longer does the
things that didn't work with old iptables, and we removed that check a
long time ago. There is also a check for a new-enough kernel version,
but it's checking for a feature which was added in kernel 3.6, and no
one could possibly be running Kubernetes with a kernel that old. So
the fallback code now never actually falls back, so it should just be
removed.
2022-08-16 09:21:34 -04:00
ialidzhikov
f2bc2ed2da pkg/proxy: Replace deprecated func usage from the k8s.io/utils/pointer pkg 2022-08-14 18:27:33 +03:00
Dan Winship
f65fbc877b proxy/iptables: remove last references to KUBE-MARK-DROP 2022-07-28 09:03:49 -04:00
Dan Winship
9313188909 proxy/iptables: Don't use KUBE-MARK-DROP for LoadBalancerSourceRanges 2022-07-28 09:03:46 -04:00
Kubernetes Prow Robot
ce433f87b4
Merge pull request #110266 from danwinship/minimize-prep-reorg
iptables proxy reorg in preparation for minimizing iptables-restore
2022-07-27 04:06:30 -07:00
Dan Williams
f197509879 proxy: queue syncs on node events rather than syncing immediately
The proxies watch node labels for topology changes, but node labels
can change in bursts especially in larger clusters. This causes
pressure on all proxies because they can't filter the events, since
the topology could match on any label.

Change node event handling to queue the request rather than immediately
syncing. The sync runner can already handle short bursts which shouldn't
change behavior for most cases.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2022-07-18 09:21:52 -05:00
Dan Winship
367f18c49b proxy/iptables: move firewall chain setup
Part of reorganizing the syncProxyRules loop to do:
  1. figure out what chains are needed, mark them in activeNATChains
  2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
  3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)

This moves the FW chain creation to the end (rather than having it in
the middle of adding the jump rules for the LB IPs).
2022-07-09 07:08:42 -04:00
Dan Winship
2030591ce7 proxy/iptables: move internal traffic setup code
Part of reorganizing the syncProxyRules loop to do:
  1. figure out what chains are needed, mark them in activeNATChains
  2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
  3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)

This fixes the jump rules for internal traffic. Previously we were
handling "jumping from kubeServices to internalTrafficChain" and
"adding masquerade rules to internalTrafficChain" in the same place.
2022-07-09 07:07:48 -04:00
Dan Winship
00f789cd8d proxy/iptables: move EXT chain rule creation to the end
Part of reorganizing the syncProxyRules loop to do:
  1. figure out what chains are needed, mark them in activeNATChains
  2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
  3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)

This fixes the handling of the EXT chain.
2022-07-09 07:07:47 -04:00
Dan Winship
8906ab390e proxy/iptables: reorganize cluster/local chain creation
Part of reorganizing the syncProxyRules loop to do:
  1. figure out what chains are needed, mark them in activeNATChains
  2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
  3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)

This fixes the handling of the SVC and SVL chains. We were already
filling them in at the end of the loop; this fixes it to create them
at the bottom of the loop as well.
2022-07-09 07:05:05 -04:00
Dan Winship
da14a12fe5 proxy/iptables: move endpoint chain rule creation to the end
Part of reorganizing the syncProxyRules loop to do:
  1. figure out what chains are needed, mark them in activeNATChains
  2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
  3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)

This fixes the handling of the endpoint chains. Previously they were
handled entirely at the top of the loop. Now we record which ones are
in use at the top but don't create them and fill them in until the
bottom.
2022-07-09 06:51:47 -04:00
Dan Winship
8a5801996b proxy/iptables: belatedly simplify local traffic policy metrics
We figure out early on whether we're going to end up outputting no
endpoints, so update the metrics then.

(Also remove a redundant feature gate check; svcInfo already checks
the ServiceInternalTrafficPolicy feature gate itself and so
svcInfo.InternalPolicyLocal() will always return false if the gate is
not enabled.)
2022-07-09 06:50:16 -04:00
Dan Winship
95705350d5 proxy/iptables: Don't use KUBE-MARK-DROP for "no local endpoints"
Rather than marking packets to be dropped in the "nat" table and then
dropping them from the "filter" table later, just use rules in
"filter" to drop the packets we don't like directly.
2022-06-29 16:37:24 -04:00
Dan Winship
283218bd4c proxy/iptables: update TestTracePackets
Re-sync the rules from TestOverallIPTablesRulesWithMultipleServices to
make sure we're testing all the right kinds of rules. Remove a
duplicate copy of the KUBE-MARK-MASQ and KUBE-POSTROUTING rules.

Update the "REJECT" test to use the new svc6 from
TestOverallIPTablesRulesWithMultipleServices. (Previously it had used
a modified version of TOIPTRWMS's svc3.)
2022-06-29 16:33:13 -04:00
Dan Winship
59b7f969e8 proxy/iptables: fix up TestOverallIPTablesRulesWithMultipleServices
svc2b was using the same ClusterIP as svc3; change it and rename the
service to svc5 to make everything clearer.

Move the test of LoadBalancerSourceRanges from svc2 to svc5, so that
svc2 tests the rules for dropping packets due to
externalTrafficPolicy, and svc5 tests the rules for dropping packets
due to LoadBalancerSourceRanges, rather than having them both mixed
together in svc2.

Add svc6 with no endpoints.
2022-06-29 16:33:13 -04:00
Kubernetes Prow Robot
f045fb688f
Merge pull request #110334 from danwinship/iptables-fewer-saves
only clean up iptables chains periodically in large clusters
2022-06-29 09:48:06 -07:00
Dan Winship
7d3ba837f5 proxy/iptables: only clean up chains periodically in large clusters
"iptables-save" takes several seconds to run on machines with lots of
iptables rules, and we only use its result to figure out which chains
are no longer referenced by any rules. While it makes things less
confusing if we delete unused chains immediately, it's not actually
_necessary_ since they never get called during packet processing. So
in large clusters, make it so we only clean up chains periodically
rather than on every sync.
2022-06-29 11:14:38 -04:00
Dan Winship
1cd461bd24 proxy/iptables: abstract the "endpointChainsNumberThreshold" a bit
Turn this into a generic "large cluster mode" that determines whether
we optimize for performance or debuggability.
2022-06-29 11:14:38 -04:00
Dan Winship
c12da17838 proxy/iptables: Add a unit test with multiple resyncs 2022-06-29 11:14:38 -04:00