Commit Graph

446 Commits

Author SHA1 Message Date
Max Renaud
f0dfac5d07 Add sync_proxy_rules_no_local_endpoints_total metric 2022-03-31 18:54:23 +00:00
Tim Hockin
40e21e310f Elide the -FW- chain when possible
This makes it epsilon harder to reason about but saves one chain
declaration and one rule per service-port usually.
2022-03-30 09:55:34 -07:00
Tim Hockin
7726b5f9fc kube-proxy: inline args in most cases 2022-03-30 09:55:34 -07:00
Tim Hockin
9ed6b73495 kube-proxy: comment endpoint in SEP jumps 2022-03-30 09:55:34 -07:00
Tim Hockin
0e47dc3a65 kube-proxy: remove old TODO 2022-03-30 09:55:33 -07:00
Tim Hockin
30c1523708 kube-proxy: Renames for readability 2022-03-30 09:55:32 -07:00
Tim Hockin
f1553f58c5 kube-proxy: Remove now unneeded rule
Now that NodePorts jump to EXT, we don't need a specific rule for
loopback source detection.
2022-03-30 09:54:40 -07:00
Tim Hockin
db932a0ab1 kube-proxy: Rework LB VIP capture logic
* Comments
* If there are multiple VIPs, don't declare the fwChain multiple times.
* Don't emit the last -j DROP if there's no source ranges
2022-03-30 09:54:40 -07:00
Tim Hockin
07b2585927 kube-proxy: Rename XLB -> EXT
This changes the "XLB" chain into the "EXT" chain - the "external
destinations" chain.
2022-03-30 09:54:38 -07:00
Tim Hockin
482f3bc4bf kube-proxy: all external jumps to XLB chain
This makes the "destination" policy model clearer.  All external
destination captures now jump to the "XLB chain, which is the main place
that masquerade is done (removing it from most other places).

This is simpler to trace - XLB *always* exists (as long as you have an
external exposure) and never gets bypassed.
2022-03-30 09:52:18 -07:00
Tim Hockin
99330d407a kube-proxy: internal renames 2022-03-29 18:48:27 -07:00
Dan Winship
b9141e5c0d proxy/iptables: rename chain variables 2022-03-26 11:14:18 -04:00
Dan Winship
548cf9d5de proxy/iptables: fix internal-vs-external traffic policy handling
Fix internal and external traffic policy to be handled separately (so
that, in particular, services with Local internal traffic policy and
Cluster external traffic policy do not behave as though they had Local
external traffic policy as well.

Additionally, traffic to an `internalTrafficPolicy: Local` service on
a node with no endpoints is now dropped rather than being rejected
(which, as in the external case, may prevent traffic from being lost
when endpoints are in flux).
2022-03-26 11:06:34 -04:00
Dan Winship
2e780ecd99 proxy/iptables: Split KUBE-SVL-XXX chain out of KUBE-XLB-XXX
Now the XLB chain _only_ implements the "short-circuit local
connections to the SVC chain" rule, and the actual endpoint selection
happens in the SVL chain.

Though not quite implemented yet, this will eventually also mean that
"SVC" = "Service, Cluster traffic policy" as opposed to "SVL" =
"Service, Local traffic policy"
2022-03-26 11:06:34 -04:00
Dan Winship
87dcf8b914 proxy/iptables: move XLB chain initial rule setup 2022-03-26 11:06:34 -04:00
Dan Winship
2b872a990d proxy/iptables: clean up / clarify iptables chain names a bit 2022-03-26 11:06:34 -04:00
Kubernetes Prow Robot
475f7af1c1
Merge pull request #108812 from danwinship/endpoint-chain-names
proxy/iptables: fix up endpoint chain name computation
2022-03-19 02:15:09 -07:00
Dan Winship
dd4d88398c proxy/iptables: fix up endpoint chain name computation
Rather than lazily computing and then caching the endpoint chain name
because we don't have the right information at construct time, just
pass the right information at construct time and compute the chain
name then.
2022-03-18 16:10:33 -04:00
Dan Winship
e3549646ec pkg/proxy: Simplify LocalTrafficDetector
Now that we don't have to always append all of the iptables args into
a single array, there's no reason to have LocalTrafficDetector take in
a set of args to prepend to its own output, and also not much point in
having it write out the "-j CHAIN" by itself either.
2022-03-18 16:09:04 -04:00
Khaled (Kal) Henidak
407dcf5164 iptables: remove port opener 2022-03-03 20:04:08 +00:00
Kubernetes Prow Robot
8f3636e8ac
Merge pull request #108224 from danwinship/kube-proxy-logging
Only log full iptables-restore input at V(9)
2022-02-22 16:42:18 -08:00
Dan Winship
9483c272f4 Log metadata about kube-proxy iptables-restore calls
For each iptables-restore call, log the number of services, endpoints,
filter chains, filter rules, NAT chains, and NAT rules in the update
at V(2), in addition to logging the actual rules if V(9).
2022-02-22 08:29:25 -05:00
Dan Winship
37ada4b04f proxy/iptables: Don't create unused chains, and enable the unit test for that 2022-02-21 09:16:22 -05:00
Dan Winship
f5ad58b57b Only log full iptables-restore input at V(9)
In large clusters, the iptables-restore input will be tens of
thousands of lines long, and logging it at V(5) essentially means that
"kube-proxy -v=5" cannot be used in such clusters to see _other_
things that get logged at V(5), because logs will get rolled over far
too quickly. So bump the full-rules logging output down to V(9).
2022-02-21 09:02:36 -05:00
Dan Winship
e7bae9df81 Count iptables lines as we write them 2022-02-19 11:56:14 -05:00
Antonio Ojea
8b5fa408e0 kube-proxy: only set route_localnet if required
kube-proxy sets the sysctl net.ipv4.conf.all.route_localnet=1
so NodePort services can be accessed on the loopback addresses in
IPv4, but this may present security issues.

Leverage the --nodeport-addresses flag to opt-out of this feature,
if the list is not empty and none of the IP ranges contains an IPv4
loopback address this sysctl is not set.

In addition, add a warning to inform users about this behavior.
2022-02-17 20:20:31 +01:00
Quan Tian
6ce612ef65 kube-proxy: fix duplicate port opening
When nodePortAddresses is not specified for kube-proxy, it tried to open
the node port for a NodePort service twice, triggered by IPv4ZeroCIDR
and IPv6ZeroCIDR separately. The first attempt would succeed and the
second one would always generate an error log like below:

"listen tcp4 :30522: bind: address already in use"

This patch fixes it by ensuring nodeAddresses of a proxier only contain
the addresses for its IP family.
2022-01-08 02:35:35 +08:00
cyclinder
97bd6e977d kube-proxy should log the payload when iptables-restore fails
Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2021-12-23 09:50:56 +08:00
Neha Lohia
fa1b6765d5
move pkg/util/node to component-helpers/node/util (#105347)
Signed-off-by: Neha Lohia <nehapithadiya444@gmail.com>
2021-11-12 07:52:27 -08:00
Quan Tian
95a706ba7c Remove redundant forwarding rule in filter table 2021-11-11 10:27:53 +08:00
Dan Winship
8ef1255cdd proxy/iptables: Abstract out code for writing service-chain-to-endpoint-chain rules
The same code appeared twice, once for the SVC chain and once for the
XLB chain, with the only difference being that the XLB version had
more verbose comments.
2021-11-09 20:59:33 -05:00
Dan Winship
4c64008181 proxy/iptables: Abstract out shared OpenLocalPort code
Also, in the NodePort code, fix it to properly take advantage of the
fact that GetNodeAddresses() guarantees that if it returns a
"match-all" CIDR, then it doesn't return anything else. That also
makes it unnecessary to loop over the node addresses twice.
2021-11-09 20:59:30 -05:00
Dan Winship
9cd0552ddd proxy/iptables: Remove unnecessary /32 and /128 in iptables rules
If you pass just an IP address to "-s" or "-d", the iptables command
will fill in the correct mask automatically.

Originally, the proxier was just hardcoding "/32" for all of these,
which was unnecessary but simple. But when IPv6 support was added, the
code was made more complicated to deal with the fact that the "/32"
needed to be "/128" in the IPv6 case, so it would parse the IPs to
figure out which family they were, which in turn involved adding some
checks in case the parsing fails (even though that "can't happen" and
the old code didn't check for invalid IPs, even though that would
break the iptables-restore if there had been any).

Anyway, all of that is unnecessary because we can just pass the IP
strings to iptables directly rather than parsing and unparsing them
first.

(The diff to proxier_test.go is just deleting "/32" everywhere.)
2021-11-09 09:32:50 -05:00
Dan Winship
62672d06e6 proxy/iptables: fix a bug in node address error handling
If GetNodeAddresses() fails (eg, because you passed the wrong CIDR to
`--nodeport-addresses`), then any NodePort services would end up with
only half a set of iptables rules. Fix it to just not output the
NodePort-specific parts in that case (in addition to logging an error
about the GetNodeAddresses() failure).
2021-11-09 09:32:50 -05:00
Dan Winship
ab67a942ca proxy/iptables, proxy/ipvs: Remove an unnecessary check
The iptables and ipvs proxiers both had a check that none of the
elements of svcInfo.LoadBalancerIPStrings() were "", but that was
already guaranteed by the svcInfo code. Drop the unnecessary checks
and remove a level of indentation.
2021-11-09 09:32:50 -05:00
Tim Hockin
731dc8cf74
Fix regression in kube-proxy (#106214)
* Fix regression in kube-proxy

Don't use a prepend() - that allocates.  Instead, make Write() take
either strings or slices (I wish we could express that better).

* WIP: switch to intf

* WIP: less appends

* tests and ipvs
2021-11-08 15:14:49 -08:00
Kubernetes Prow Robot
0940dd6fc4
Merge pull request #106163 from aojea/conntrack_readiness
kube-proxy consider endpoint readiness to delete UDP stale conntrack entries
2021-11-08 13:11:44 -08:00
Tim Hockin
f662170ff7 kube-proxy: make iptables buffer-writing cleaner 2021-11-05 12:28:19 -07:00
Tim Hockin
f558554ce0 kube-proxy: minor cleanup
Get rid of overlapping helper functions.
2021-11-05 12:28:19 -07:00
Antonio Ojea
909925b492 kube-proxy: fix stale detection logic
The logic to detect stale endpoints was not assuming the endpoint
readiness.

We can have stale entries on UDP services for 2 reasons:
- an endpoint was receiving traffic and is removed or replaced
- a service was receiving traffic but not forwarding it, and starts
to forward it.

Add an e2e test to cover the regression
2021-11-05 20:14:56 +01:00
Dan Winship
229ae58520 proxy/iptables: fix all-vs-ready endpoints a bit
Filter the allEndpoints list into readyEndpoints sooner, and set
"hasEndpoints" based (mostly) on readyEndpoints, not allEndpoints (so
that, eg, we correctly generate REJECT rules for services with no
_functioning_ endpoints, even if they have unusable terminating
endpoints).

Also, write out the endpoint chains at the top of the loop when we
iterate the endpoints for the first time, rather than copying some of
the data to another set of variables and then writing them out later.
And don't write out endpoint chains that won't be used

Also, generate affinity rules only for readyEndpoints rather than
allEndpoints, so affinity gets broken correctly when an endpoint
becomes unready.
2021-11-04 16:32:08 -04:00
Dan Winship
3679639cf1 proxy/iptables: Remove a no-op check
There was code to deal with endpoints that have invalid/empty IP
addresses, but EndpointSlice validation already ensures that these
can't exist.
2021-11-04 16:32:08 -04:00
Dan Winship
08680192fb proxy/iptables: Fix sync_proxy_rules_iptables_total metric
It was counting the number of lines including the "COMMIT" line at the
end, so it was off by one.
2021-11-04 16:30:12 -04:00
Shivanshu Raj Shrivastava
81636f2158
Fixed improperly migrated logs (#105763)
* fixed improperly migrated logs

* small fixes

* small fix

* Update pkg/proxy/iptables/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/healthcheck/service_health.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/iptables/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/iptables/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/iptables/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/iptables/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/ipvs/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/ipvs/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/ipvs/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* Update pkg/proxy/winkernel/proxier.go

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>

* refactoring

* refactoring

* refactoring

* reverted some files back to master

Co-authored-by: Marek Siarkowicz <marek.siarkowicz@protonmail.com>
2021-10-20 03:55:58 -07:00
Ricardo Pchevuzinske Katz
37d11bcdaf Move node and networking related helpers from pkg/util to component helpers
Signed-off-by: Ricardo Katz <rkatz@vmware.com>
2021-09-16 17:00:19 -03:00
Kubernetes Prow Robot
648559b63e
Merge pull request #104742 from khenidak/health-check-port
change health-check port to listen to node port addresses
2021-09-13 15:43:52 -07:00
Khaled (Kal) Henidak
acdf50fbed change proxiers to pass nodePortAddresses 2021-09-13 18:27:07 +00:00
Dan Winship
7f6fbc4482 Drop broken/no-op proxyconfig.EndpointsHandler implementations
Because the proxy.Provider interface included
proxyconfig.EndpointsHandler, all the backends needed to
implement its methods. But iptables, ipvs, and winkernel implemented
them as no-ops, and metaproxier had an implementation that wouldn't
actually work (because it couldn't handle Services with no active
Endpoints).

Since Endpoints processing in kube-proxy is deprecated (and can't be
re-enabled unless you're using a backend that doesn't support
EndpointSlice), remove proxyconfig.EndpointsHandler from the
definition of proxy.Provider and drop all the useless implementations.
2021-09-13 09:32:38 -04:00
Antonio Ojea
0cd75e8fec run hack/update-netparse-cve.sh 2021-08-20 10:42:09 +02:00
Antonio Ojea
a2a22903bc delete stale UDP conntrack entries for loadbalancer IPs 2021-07-29 17:35:07 +02:00