When I first wrote TestInternalExternalMasquerade, I put "FIXME"
comments in all of the cases that seemed wrong to me, most of which
got removed as we fixed the corner cases. But there were two cases
where we decided that the implemented behavior, though odd, was
correct, and those FIXMEs never got removed.
All the code to deal with enabling/disabling the feature gate is gone,
but some of the tests were still specifying "this test case assumes
PTE is enabled".
Remove "EndpointSlice" from some unit test names, because they don't
need to clarify that they use EndpointSlices now, because all of the
tests use EndpointSlices now.
Likewise, remove TestEndpointSliceE2E entirely; it was originally an
EndpointSlice version of one of the other tests, but the other test
uses EndpointSlices now too.
Historically, IptablesRulesTotal could have been intepreted as either
"the total number of iptables rules kube-proxy is responsible for" or
"the number of iptables rules kube-proxy rewrote on the last sync".
Post-MinimizeIPTablesRestore, these are very different things (and
IptablesRulesTotal unintentionally became the latter).
Fix IptablesRulesTotal (sync_proxy_rules_iptables_total) to be "the
total number of iptables rules kube-proxy is responsible for" and add
IptablesRulesLastSync (sync_proxy_rules_iptables_last) to be "the
number of iptables rules kube-proxy rewrote on the last sync".
This required fixing a small bug in the metric, where it had
previously been counting the "-X" lines that had been passed to
iptables-restore to delete stale chains, rather than only counting the
actual rules.
getLocalDetector() used to pass a utiliptables.Interface to
NewDetectLocalByCIDR() so that NewDetectLocalByCIDR() could verify
that the passed-in CIDR was of the same family as the iptables
interface. It would make more sense for getLocalDetector() to verify
this itself and just *not call NewDetectLocalByCIDR* if the families
don't match, and that's what the code does now. So there's no longer
any need to pass the utiliptables.Interface to the local detector.
Both proxies handle IPv4 and IPv6 nodeport addresses separately, but
GetNodeAddresses went out of its way to make that difficult. Fix that.
This commit does not change any externally-visible semantics, but it
makes the existing weird semantics more obvious. Specifically, if you
say "--nodeport-addresses 10.0.0.0/8,192.168.0.0/16", then the
dual-stack proxy code would have split that into a list of IPv4 CIDRs
(["10.0.0.0/8", "192.168.0.0/16"]) to pass to the IPv4 proxier, and a
list of IPv6 CIDRs ([]) to pass to the IPv6 proxier, and then the IPv6
proxier would say "well since the list of nodeport addresses is empty,
I'll listen on all IPv6 addresses", which probably isn't what you
meant, but that's what it did.
This touches cases where FromInt() is used on numeric constants, or
values which are already int32s, or int variables which are defined
close by and can be changed to int32s with little impact.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Rather than calling fp.deleteEndpointConnection() directly, set up the
proxy to have syncProxyRules() call it, so that we are testing it in
the way that it actually gets called.
Squash the IPv4 and IPv6 unit tests together so we don't need to
duplicate all that code. Fix a tiny bug in NewFakeProxier() found
while doing this...
The APIs talked about "stale services" and "stale endpoints", but the
thing that is actually "stale" is the conntrack entries, not the
services/endpoints. Fix the names to indicate what they actual keep
track of.
Also, all three fields (2 in the endpoints update object and 1 in the
service update object) are currently UDP-specific, but only the
service one made that clear. Fix that too.
This commit did not actually work; in between when it was first
written and tested, and when it merged, the code in
pkg/proxy/endpoints.go was changed to only add UDP endpoints to the
"stale endpoints"/"stale services" lists, and so checking for "either
UDP or SCTP" rather than just UDP when processing those lists had no
effect.
This reverts most of commit aa8521df66
(but leaves the changes related to
ipvs.IsRsGracefulTerminationNeeded() since that actually did have the
effect it meant to have).
In addition to actually updating their data from the provided list of
changes, EndpointsMap.Update() and ServicePortMap.Update() return a
struct with some information about things that changed because of that
update (eg services with stale conntrack entries).
For some reason, they were also returning information about
HealthCheckNodePorts, but they were returning *static* information
based on the current (post-Update) state of the map, not information
about what had *changed* in the update. Since this doesn't match how
the other data in the struct is used (and since there's no reason to
have the data only be returned when you call Update() anyway) , split
it out.
The unit tests were broken with MinimizeIPTablesRestore enabled
because syncProxyRules() assumed that needFullSync would be set on the
first (post-setInitialized()) run, but the unit tests didn't ensure
that.
(In fact, there was a race condition in the real Proxier case as well;
theoretically syncProxyRules() could be run by the
BoundedFrequencyRunner after OnServiceSynced() called setInitialized()
but before it called forceSyncProxyRules(), thus causing the first
real sync to try to do a partial sync and fail. This is now fixed as
well.)
This rule was mistakenly added to kubelet even though it only applies
to kube-proxy's traffic. We do not want to remove it from kubelet yet
because other components may be depending on it for security, but we
should make kube-proxy output its own rule rather than depending on
kubelet.
We had a test that creating a Service with an SCTP port would create
an iptables rule with "-p sctp" in it, which let us test that
kube-proxy was doing vaguely the right thing with SCTP even if the e2e
environment didn't have SCTP support. But this would really make much
more sense as a unit test.
iptables-restore requires that if you change any rule in a chain, you
have to rewrite the entire chain. But if you avoid mentioning a chain
at all, it will leave it untouched. Take advantage of this by not
rewriting the SVC, SVL, EXT, FW, and SEP chains for services that have
not changed since the last sync, which should drastically cut down on
the size of each iptables-restore in large clusters.
Rather than marking packets to be dropped in the "nat" table and then
dropping them from the "filter" table later, just use rules in
"filter" to drop the packets we don't like directly.
Re-sync the rules from TestOverallIPTablesRulesWithMultipleServices to
make sure we're testing all the right kinds of rules. Remove a
duplicate copy of the KUBE-MARK-MASQ and KUBE-POSTROUTING rules.
Update the "REJECT" test to use the new svc6 from
TestOverallIPTablesRulesWithMultipleServices. (Previously it had used
a modified version of TOIPTRWMS's svc3.)
svc2b was using the same ClusterIP as svc3; change it and rename the
service to svc5 to make everything clearer.
Move the test of LoadBalancerSourceRanges from svc2 to svc5, so that
svc2 tests the rules for dropping packets due to
externalTrafficPolicy, and svc5 tests the rules for dropping packets
due to LoadBalancerSourceRanges, rather than having them both mixed
together in svc2.
Add svc6 with no endpoints.
"iptables-save" takes several seconds to run on machines with lots of
iptables rules, and we only use its result to figure out which chains
are no longer referenced by any rules. While it makes things less
confusing if we delete unused chains immediately, it's not actually
_necessary_ since they never get called during packet processing. So
in large clusters, make it so we only clean up chains periodically
rather than on every sync.