Commit Graph

1681 Commits

Author SHA1 Message Date
Max Renaud
ba4f5c4e7b use sets.String for tracking IPVS no local endpoint metric 2022-03-31 18:54:27 +00:00
Max Renaud
f0dfac5d07 Add sync_proxy_rules_no_local_endpoints_total metric 2022-03-31 18:54:23 +00:00
Kubernetes Prow Robot
f2e5c16545
Merge pull request #109060 from thockin/kube-proxy-rule-cleanups-after-106497
Kube proxy rule reorg XLB->EXT
2022-03-31 00:11:01 -07:00
Kubernetes Prow Robot
5223c1efef
Merge pull request #97081 from Nordix/issue-93456
Ipvs: non-local access to externalTrafficPolicy:Local
2022-03-30 13:37:56 -07:00
Tim Hockin
40e21e310f Elide the -FW- chain when possible
This makes it epsilon harder to reason about but saves one chain
declaration and one rule per service-port usually.
2022-03-30 09:55:34 -07:00
Tim Hockin
7726b5f9fc kube-proxy: inline args in most cases 2022-03-30 09:55:34 -07:00
Tim Hockin
c4271c9a6f Rename tests to avoid underscores 2022-03-30 09:55:34 -07:00
Tim Hockin
9ed6b73495 kube-proxy: comment endpoint in SEP jumps 2022-03-30 09:55:34 -07:00
Tim Hockin
0e47dc3a65 kube-proxy: remove old TODO 2022-03-30 09:55:33 -07:00
Tim Hockin
30c1523708 kube-proxy: Renames for readability 2022-03-30 09:55:32 -07:00
Tim Hockin
f1553f58c5 kube-proxy: Remove now unneeded rule
Now that NodePorts jump to EXT, we don't need a specific rule for
loopback source detection.
2022-03-30 09:54:40 -07:00
Tim Hockin
db932a0ab1 kube-proxy: Rework LB VIP capture logic
* Comments
* If there are multiple VIPs, don't declare the fwChain multiple times.
* Don't emit the last -j DROP if there's no source ranges
2022-03-30 09:54:40 -07:00
Tim Hockin
07b2585927 kube-proxy: Rename XLB -> EXT
This changes the "XLB" chain into the "EXT" chain - the "external
destinations" chain.
2022-03-30 09:54:38 -07:00
Tim Hockin
482f3bc4bf kube-proxy: all external jumps to XLB chain
This makes the "destination" policy model clearer.  All external
destination captures now jump to the "XLB chain, which is the main place
that masquerade is done (removing it from most other places).

This is simpler to trace - XLB *always* exists (as long as you have an
external exposure) and never gets bypassed.
2022-03-30 09:52:18 -07:00
Tim Hockin
dd0fc6b354 kube-proxy: print line number for test failures 2022-03-29 18:48:27 -07:00
Tim Hockin
ef959f00af kube-proxy: clean up tests
No functional changes, much whitespace.

Make assertIPTablesRulesEqual() *not* sort the `expected` value - make
the test cases all be pre-sorted.  This will make followup commits
cleaner.

Make the test output cleaner when this fails.

Use dedent everywhere for easier reading.
2022-03-29 18:48:27 -07:00
Tim Hockin
99330d407a kube-proxy: internal renames 2022-03-29 18:48:27 -07:00
Lars Ekman
61085a7589 Ipvs: non-local access to externalTrafficPolicy:Local
Allow access to externalTrafficPolicy:Local services from PODs
not on a node where a server executes. Problem described in #93456
2022-03-29 21:42:39 +02:00
Andrew Sy Kim
53439020a4 pkg/proxy/ipvs: add unit tests Test_EndpointSliceOnlyReadyAndTerminatingCluster and Test_EndpointSliceReadyAndTerminatingCluster for validating ProxyTerminatingEndpoints when the traffic policy is 'Cluster'
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2022-03-29 11:37:15 -04:00
Andrew Sy Kim
718a655e42 pkg/proxy/iptables: add and fix existing unit tests based on changes to ProxyTermintingEndpoints
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2022-03-29 11:37:15 -04:00
Andrew Sy Kim
e2e0b6fca8 pkg/proxy: update CategorizeEndpoints to apply ProxyTerminatingEndpoints to all traffic policies
Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
2022-03-29 11:06:58 -04:00
Kubernetes Prow Robot
9f213370cc
Merge pull request #106497 from danwinship/traffic-policy-fixes
fix internalTrafficPolicy
2022-03-28 14:19:54 -07:00
Dan Winship
b9141e5c0d proxy/iptables: rename chain variables 2022-03-26 11:14:18 -04:00
Dan Winship
548cf9d5de proxy/iptables: fix internal-vs-external traffic policy handling
Fix internal and external traffic policy to be handled separately (so
that, in particular, services with Local internal traffic policy and
Cluster external traffic policy do not behave as though they had Local
external traffic policy as well.

Additionally, traffic to an `internalTrafficPolicy: Local` service on
a node with no endpoints is now dropped rather than being rejected
(which, as in the external case, may prevent traffic from being lost
when endpoints are in flux).
2022-03-26 11:06:34 -04:00
Dan Winship
2e780ecd99 proxy/iptables: Split KUBE-SVL-XXX chain out of KUBE-XLB-XXX
Now the XLB chain _only_ implements the "short-circuit local
connections to the SVC chain" rule, and the actual endpoint selection
happens in the SVL chain.

Though not quite implemented yet, this will eventually also mean that
"SVC" = "Service, Cluster traffic policy" as opposed to "SVL" =
"Service, Local traffic policy"
2022-03-26 11:06:34 -04:00
Dan Winship
87dcf8b914 proxy/iptables: move XLB chain initial rule setup 2022-03-26 11:06:34 -04:00
Dan Winship
2b872a990d proxy/iptables: clean up / clarify iptables chain names a bit 2022-03-26 11:06:34 -04:00
Surya Seetharaman
1ea5f9432c Add validation for bridge-interface and interface-name-prefix
Co-authored-by: Will Daly <widaly@microsoft.com>
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
2022-03-25 20:06:12 +01:00
Surya Seetharaman
7d480d8ac8 Enable local traffic detection using the interface options
This commit adds the framework for the new local detection
modes BridgeInterface and InterfaceNamePrefix to work.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
2022-03-25 20:06:12 +01:00
Surya Seetharaman
5632991115 Local Traffic Detector: Add two new modes
This PR introduces two new modes for detecting
local traffic in a cluster.
1) detectLocalByBridgeInterface: This takes a bridge name
as argument and decides all traffic that match on their
originating interface being that of this bridge, shall be
considered as local pod traffic.
2) detectLocalByInterfaceNamePrefix: This takes an interface prefix
name as argument and decides all traffic that match on their
originating interface names having a prefix that matches this
argument shall be considered as local pod traffic.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
2022-03-25 20:06:06 +01:00
Kubernetes Prow Robot
475f7af1c1
Merge pull request #108812 from danwinship/endpoint-chain-names
proxy/iptables: fix up endpoint chain name computation
2022-03-19 02:15:09 -07:00
Kubernetes Prow Robot
2bda940add
Merge pull request #108811 from danwinship/simplify-local-traffic-detector
pkg/proxy: Simplify LocalTrafficDetector
2022-03-18 20:59:12 -07:00
Dan Winship
dd4d88398c proxy/iptables: fix up endpoint chain name computation
Rather than lazily computing and then caching the endpoint chain name
because we don't have the right information at construct time, just
pass the right information at construct time and compute the chain
name then.
2022-03-18 16:10:33 -04:00
Dan Winship
e3549646ec pkg/proxy: Simplify LocalTrafficDetector
Now that we don't have to always append all of the iptables args into
a single array, there's no reason to have LocalTrafficDetector take in
a set of args to prepend to its own output, and also not much point in
having it write out the "-j CHAIN" by itself either.
2022-03-18 16:09:04 -04:00
Kubernetes Prow Robot
41b29e6542
Merge pull request #99287 from anfernee/clientip
Add HNS Load Balancer Healthchecks for ExternalTrafficPolicy: Local
2022-03-16 22:57:18 -07:00
Yongkun Gui
78a507b256 Fix health check from Google's Load Balancer
This change adds 2 options for windows:
--forward-healthcheck-vip: If true forward service VIP for health check
port
--root-hnsendpoint-name: The name of the hns endpoint name for root
namespace attached to l2bridge, default is cbr0

When --forward-healthcheck-vip is set as true and winkernel is used,
kube-proxy will add an hns load balancer to forward health check request
that was sent to lb_vip:healthcheck_port to the node_ip:healthcheck_port.
Without this forwarding, the health check from google load balancer will
fail, and it will stop forwarding traffic to the windows node.

This change fixes the following 2 cases for service:
- `externalTrafficPolicy: Cluster` (default option): healthcheck_port is
10256 for all services. Without this fix, all traffic won't be directly
forwarded to windows node. It will always go through a linux node and
get forwarded to windows from there.
- `externalTrafficPolicy: Local`: different healthcheck_port for each
service that is configured as local. Without this fix, this feature
won't work on windows node at all. This feature preserves client ip
that tries to connect to their application running in windows pod.

Change-Id: If4513e72900101ef70d86b91155e56a1f8c79719
2022-03-11 22:34:59 -08:00
Khaled (Kal) Henidak
c4a00b7d90 ipvs: remove port opener 2022-03-04 21:10:55 +00:00
Khaled (Kal) Henidak
407dcf5164 iptables: remove port opener 2022-03-03 20:04:08 +00:00
Kubernetes Prow Robot
8f3636e8ac
Merge pull request #108224 from danwinship/kube-proxy-logging
Only log full iptables-restore input at V(9)
2022-02-22 16:42:18 -08:00
Kubernetes Prow Robot
108e8136e2
Merge pull request #107393 from danwinship/filter-endpoints
kube-proxy endpoint filtering unit test refactoring
2022-02-22 08:55:15 -08:00
Dan Winship
9483c272f4 Log metadata about kube-proxy iptables-restore calls
For each iptables-restore call, log the number of services, endpoints,
filter chains, filter rules, NAT chains, and NAT rules in the update
at V(2), in addition to logging the actual rules if V(9).
2022-02-22 08:29:25 -05:00
Dan Winship
d830ef6112 proxy/iptables: add HealthCheckNodePorts to unit tests that need them
To avoid spurious errors in the test output:

  E0114 08:43:27.453974 3718376 service.go:221] "Service has no healthcheck nodeport" service="ns1/svc1"
2022-02-21 09:16:23 -05:00
Dan Winship
d74df127e9 proxy/iptables: Fix up IPs and ports in unit tests
All of the tests used a localDetector that considered the pod IP range
to be 10.0.0.0/24, but lots of the tests used pod IPs in 10.180.0.0/16
or 10.0.1.0/24, meaning the generated iptables rules were somewhat
inconsistent. Fix this by expanding the localDetector's pod IP range
to 10.0.0.0/8. (Changing the pod IPs to all be in 10.0.0.0/24 instead
would be a much larger change since it would result in the SEP chain
names changing.)

Meanwhile, the different tests were also horribly inconsistent about
what values they used for other IPs, and some of them even used the
same IPs (or ports) for different things in the same test case. Fix
these all up and create a consistent set of IP assignments:

// Pod IPs:             10.0.0.0/8
// Service ClusterIPs:  172.30.0.0/16
// Node IPs:            192.168.0.0/24
// Local Node IP:       192.168.0.2
// Service ExternalIPs: 192.168.99.0/24
// LoadBalancer IPs:    1.2.3.4, 5.6.7.8, 9.10.11.12
// Non-cluster IPs:     203.0.113.0/24
// LB Source Range:     203.0.113.0/25
2022-02-21 09:16:22 -05:00
Dan Winship
37ada4b04f proxy/iptables: Don't create unused chains, and enable the unit test for that 2022-02-21 09:16:22 -05:00
Dan Winship
ef4324eaf5 proxy/iptables: refactor unit test code / fix error reporting
Only run assertIPTablesRuleJumps() on the expected output, not on the
actual output, since if there's a problem with the actual output, we'd
rather see it as the diff from the expected output.
2022-02-21 09:16:22 -05:00
Dan Winship
4af471f8be proxy/iptables: move GetChainLines unit tests to the right package
GetChainLines is a utiliptables method, so it should be part of the
unit tests there.
2022-02-21 09:16:22 -05:00
Dan Winship
f5ad58b57b Only log full iptables-restore input at V(9)
In large clusters, the iptables-restore input will be tens of
thousands of lines long, and logging it at V(5) essentially means that
"kube-proxy -v=5" cannot be used in such clusters to see _other_
things that get logged at V(5), because logs will get rolled over far
too quickly. So bump the full-rules logging output down to V(9).
2022-02-21 09:02:36 -05:00
Dan Winship
e7bae9df81 Count iptables lines as we write them 2022-02-19 11:56:14 -05:00
Kubernetes Prow Robot
2134e971a6
Merge pull request #107684 from aojea/nodePortsOnLocalhost
kube-proxy: only set route_localnet if required
2022-02-17 16:14:48 -08:00
Antonio Ojea
8b5fa408e0 kube-proxy: only set route_localnet if required
kube-proxy sets the sysctl net.ipv4.conf.all.route_localnet=1
so NodePort services can be accessed on the loopback addresses in
IPv4, but this may present security issues.

Leverage the --nodeport-addresses flag to opt-out of this feature,
if the list is not empty and none of the IP ranges contains an IPv4
loopback address this sysctl is not set.

In addition, add a warning to inform users about this behavior.
2022-02-17 20:20:31 +01:00