Commit Graph

384 Commits

Author SHA1 Message Date
Dan Winship
0c2711bf24 Make NodePortAddresses abstraction around GetNodeAddresses/ContainsIPv4Loopback 2023-02-22 08:32:19 -05:00
Lars Ekman
a05b04ad96 Remove un-used function 2023-02-20 07:26:45 +01:00
Lars Ekman
32f8066119 Simplification and cleanup 2023-02-19 18:25:13 +01:00
Lars Ekman
8d63750c35 Generic sets in netlink and utils 2023-02-19 18:25:07 +01:00
Lars Ekman
17e2c7d535 Move variable closer to it's use 2023-02-19 18:25:02 +01:00
Lars Ekman
3325c7031d Generic sets in ipset.go 2023-02-19 18:24:56 +01:00
Lars Ekman
fbe671d3f0 Use generic sets 2023-02-19 18:24:51 +01:00
Lars Ekman
547db63bdf Drop the IPGetter 2023-02-19 18:24:45 +01:00
Son Dinh
4f75949bcb Ipvs: Add a new FlagSourceHash to "mh" distribution method.
With the flag, ipvs uses both source IP and source port (instead of
only source IP) to distribute new connections evently to endpoints
that avoids sending all connections from the same client (i.e. same
source IP) to one single endpoint.

User can explicitly set sessionAffinity in service spec to keep all
connections from a source IP to end up on the same endpoint if needed.

Change-Id: I42f950c0840ac06a4ee68a7bbdeab0fc5505c71f
2023-02-11 20:51:02 +11:00
Hao Ruan
7f3de6e53a fix a typo in pkg/proxy/ipvs/proxier.go 2023-01-09 09:29:22 +08:00
Dan Winship
169604d906 Validate single-stack --nodeport-addresses sooner
In the dual-stack case, iptables.NewDualStackProxier and
ipvs.NewDualStackProxier filtered the nodeport addresses values by IP
family before creating the single-stack proxiers. But in the
single-stack case, the kube-proxy startup code just passed the value
to the single-stack proxiers without validation, so they had to
re-check it themselves. Fix that.
2023-01-03 09:01:45 -05:00
Dan Winship
e7ed7220eb Explicitly pass IP family to proxier
Rather than re-determining it from the iptables object in both proxies.
2023-01-03 09:01:45 -05:00
Lars Ekman
5ff705fd77 proxy/ipvs: Describe and handle a bug in moby/ipvs
Handle https://github.com/moby/ipvs/issues/27
A work-around was already in place, but a segv would occur
when the bug is fixed. That will not happen now.
2022-12-24 10:21:27 +01:00
Lars Ekman
68d78c89ec use netutils.ParseIPSloppy 2022-12-23 14:19:28 +01:00
Lars Ekman
dc86bdc3aa Handle an empty scheduler ("") 2022-12-23 13:23:02 +01:00
Lars Ekman
4adc687275 Fixed typo 2022-12-23 11:13:55 +01:00
Lars Ekman
cf214d0738 Clean-up un-used code 2022-12-23 10:54:51 +01:00
Lars Ekman
3bd3759424 gofmt 2022-12-23 10:14:47 +01:00
Lars Ekman
b169f22eb8 Updates after rewiew
To update the scheduler without node reboot now works.
The address for the probe VS is now 198.51.100.0 which is
reseved for documentation, please see rfc5737. The comment
about this is extended.
2022-12-23 10:11:15 +01:00
Lars Ekman
cd15ca0548 proxy/ipvs: Check that a dummy virtual server can be added
This tests both ipvs and the configured scheduler
2022-12-22 20:36:53 +01:00
Lars Ekman
4f02671b23 proxy/ipvs: Remove kernel module tests
To check kernel modules is a bad way to check functionality.
This commit just removes the checks and makes it possible to
use a statically linked kernel.

Minimal updates to unit-tests are made.
2022-12-22 17:13:41 +01:00
Kubernetes Prow Robot
53906cbe89
Merge pull request #114360 from dengyufeng2206/120802pull
Reduce redundant conversions
2022-12-16 11:06:09 -08:00
Kubernetes Prow Robot
d1879644a6
Merge pull request #113463 from wzshiming/clean/ioutil-ipvs
Replace the ioutil by the os and io for the pkg/proxy/ipvs
2022-12-16 11:05:58 -08:00
dengyufeng2206
ba6afe213a Reduce redundant conversions 2022-12-08 15:21:24 +08:00
Andy Voltz
29f4862ed8 Promote ServiceInternalTrafficPolicy to GA 2022-11-03 13:17:03 -04:00
Manav Agarwal
3320e50e24 If applied, this commit will refactor variable names in kube-proxy 2022-11-03 03:45:57 +05:30
Shiming Zhang
1c8d3226a9 Replace the ioutil by the os and io for the pkg/proxy/ipvs 2022-10-31 16:29:19 +08:00
DingShujie
e1f0b85334 Dismiss connects to localhost early in the service chain
Signed-off-by: DingShujie <dingshujie@huawei.com>
2022-10-11 13:57:35 +08:00
Lars Ekman
639b9bca5d Corrects target in the KUBE-IPVS-FILTER chain
The target was "ACCEPT" which disabled any other check like
loadBalancerSourceRanges in the KUBE-PROXY-FIREWALL chain.
The target is now "RETURN".
2022-09-15 07:49:12 +02:00
Kubernetes Prow Robot
9924814270
Merge pull request #108460 from Nordix/issue-72236
Prevent host access on VIP addresses in proxy-mode=ipvs
2022-09-01 12:59:18 -07:00
Sanskar Jaiswal
b670656a09 update ipvs proxier to update realserver weights at startup
Update the IPVS proxier to have a bool `initialSync` which is set to
true when a new proxier is initialized and then set to false on all
syncs. This lets us run startup-only logic, which subsequently lets us
update the realserver only when needed and avoiding any expensive
operations.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-08-27 07:42:07 +00:00
Dan Winship
1609017f2b kube-proxy: remove ipvs-to-iptables fallback
If the user passes "--proxy-mode ipvs", and it is not possible to use
IPVS, then error out rather than falling back to iptables.

There was never any good reason to be doing fallback; this was
presumably erroneously added to parallel the iptables-to-userspace
fallback (which only existed because we had wanted iptables to be the
default but not all systems could support it).

In particular, if the user passed configuration options for ipvs, then
they presumably *didn't* pass configuration options for iptables, and
so even if the iptables proxy is able to run, it is likely to be
misconfigured.
2022-08-16 09:30:08 -04:00
Davanum Srinivas
a9593d634c
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Dan Williams
f197509879 proxy: queue syncs on node events rather than syncing immediately
The proxies watch node labels for topology changes, but node labels
can change in bursts especially in larger clusters. This causes
pressure on all proxies because they can't filter the events, since
the topology could match on any label.

Change node event handling to queue the request rather than immediately
syncing. The sync runner can already handle short bursts which shouldn't
change behavior for most cases.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2022-07-18 09:21:52 -05:00
Dan Winship
a3556edba1 Stop trying to "preserve" iptables counters that are always 0
The iptables and ipvs proxies have code to try to preserve certain
iptables counters when modifying chains via iptables-restore, but the
counters in question only actually exist for the built-in chains (eg
INPUT, FORWARD, PREROUTING, etc), which we never modify via
iptables-restore (and in fact, *can't* safely modify via
iptables-restore), so we are really just doing a lot of unnecessary
work to copy the constant string "[0:0]" over from iptables-save
output to iptables-restore input. So stop doing that.

Also fix a confused error message when iptables-save fails.
2022-06-28 08:39:32 -04:00
Lars Ekman
c1e5a9e6f0 Prevent host access on VIP addresses in proxy-mode=ipvs 2022-06-24 08:33:58 +02:00
Dan Winship
28253f6030 proxy/ipvs: Use DROP directly rather than KUBE-MARK-DROP
The ipvs proxier was figuring out LoadBalancerSourceRanges matches in
the nat table and using KUBE-MARK-DROP to mark unmatched packets to be
dropped later. But with ipvs, unlike with iptables, DNAT happens after
the packet is "delivered" to the dummy interface, so the packet will
still be unmodified when it reaches the filter table (the first time)
so there's no reason to split the work between the nat and filter
tables; we can just do it all from the filter table and call DROP
directly.

Before:

  - KUBE-LOAD-BALANCER (in nat) uses kubeLoadBalancerFWSet to match LB
    traffic for services using LoadBalancerSourceRanges, and sends it
    to KUBE-FIREWALL.

  - KUBE-FIREWALL uses kubeLoadBalancerSourceCIDRSet and
    kubeLoadBalancerSourceIPSet to match allowed source/dest combos
    and calls "-j RETURN".

  - All remaining traffic that doesn't escape KUBE-FIREWALL is sent to
    KUBE-MARK-DROP.

  - Traffic sent to KUBE-MARK-DROP later gets dropped by chains in
    filter created by kubelet.

After:

  - All INPUT and FORWARD traffic gets routed to KUBE-PROXY-FIREWALL
    (in filter). (We don't use "KUBE-FIREWALL" any more because
    there's already a chain in filter by that name that belongs to
    kubelet.)

  - KUBE-PROXY-FIREWALL sends traffic matching kubeLoadbalancerFWSet
    to KUBE-SOURCE-RANGES-FIREWALL

  - KUBE-SOURCE-RANGES-FIREWALL uses kubeLoadBalancerSourceCIDRSet and
    kubeLoadBalancerSourceIPSet to match allowed source/dest combos
    and calls "-j RETURN".

  - All remaining traffic that doesn't escape
    KUBE-SOURCE-RANGES-FIREWALL is dropped (directly via "-j DROP").

  - (KUBE-LOAD-BALANCER in nat is now used only to set up masquerading)
2022-06-22 13:02:22 -04:00
Dan Winship
a9cd57fa40 proxy/ipvs: add filter table support to ipsetWithIptablesChain 2022-06-22 12:53:18 -04:00
Dan Winship
400d474bac proxy/ipvs: fix some identifiers
kubeLoadbalancerFWSet was the only LoadBalancer-related identifier
with a lowercase "b", so fix that.

rename TestLoadBalanceSourceRanges to TestLoadBalancerSourceRanges to
match the field name (and the iptables proxier test).
2022-06-13 09:13:15 -04:00
Dan Winship
0b1e364814 proxy/ipvs: fix a few comments 2022-06-12 20:30:47 -04:00
gkarthiks
1fd959e256 refactor: serviceNameString to svcptNameString
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

refactor: svc port name variable #108806

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

refactor: rename struct for service port information to servicePortInfo and fields for more redability

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

fix: drop chain rule

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
2022-05-22 03:31:00 -07:00
Dan Winship
b0d9c063a8 unexport mistakenly-exported constants 2022-05-06 07:33:29 -04:00
Dan Winship
84ad54f0e5 Don't increment "no local endpoints" metric when there are no remote endpoints
A service having no _local_ endpoints when it does have remote
endpoints is different from a service having no endpoints at all.
2022-05-04 12:38:17 -04:00
Johannes Scheerer
a3b7f219a1
Cleanup KUBE-NODE-PORT chain in filter table.
When cleaning up iptables rules and ipsets used by kube-proxy in IPVS mode
iptables chain KUBE-NODE-PORT needs to be deleted before ipset
KUBE-HEALTH-CHECK-NODE-PORT can be removed. Therefore, deletion of
iptables chain KUBE-NODE-PORT is added in this change.
2022-04-04 16:10:06 +02:00
Max Renaud
6454248b6b Moved counting logic to accommodate rebase 2022-04-01 15:52:21 +00:00
Max Renaud
61b7e6c49c Changed usage of NodeLocal* to *PolicyLocal 2022-03-31 18:55:47 +00:00
Max Renaud
198367a486 Added test where both policies are set 2022-03-31 18:54:28 +00:00
Max Renaud
ba4f5c4e7b use sets.String for tracking IPVS no local endpoint metric 2022-03-31 18:54:27 +00:00
Max Renaud
f0dfac5d07 Add sync_proxy_rules_no_local_endpoints_total metric 2022-03-31 18:54:23 +00:00
Kubernetes Prow Robot
f2e5c16545
Merge pull request #109060 from thockin/kube-proxy-rule-cleanups-after-106497
Kube proxy rule reorg XLB->EXT
2022-03-31 00:11:01 -07:00