Commit Graph

391 Commits

Author SHA1 Message Date
Dan Winship
4381973a44 Revert (most of) "Issue 70020; Flush Conntrack entities for SCTP"
This commit did not actually work; in between when it was first
written and tested, and when it merged, the code in
pkg/proxy/endpoints.go was changed to only add UDP endpoints to the
"stale endpoints"/"stale services" lists, and so checking for "either
UDP or SCTP" rather than just UDP when processing those lists had no
effect.

This reverts most of commit aa8521df66
(but leaves the changes related to
ipvs.IsRsGracefulTerminationNeeded() since that actually did have the
effect it meant to have).
2023-03-14 12:18:58 -04:00
Kubernetes Prow Robot
611273a5bb
Merge pull request #115253 from danwinship/proxy-update-healthchecknodeport
Split out HealthCheckNodePort stuff from service/endpoint map Update()
2023-03-13 15:22:48 -07:00
Kubernetes Prow Robot
86bf570711
Merge pull request #111661 from alexanderConstantinescu/etp-local-svc-hc-kube-proxy
[Proxy]: add `healthz` verification when determining HC response for eTP:Local
2023-03-07 05:34:36 -08:00
Alexander Constantinescu
ec917850af Add proxy healthz result to ETP=local health check
Today, the health check response to the load balancers asking Kube-proxy for
the status of ETP:Local services does not include the healthz state of Kube-
proxy. This means that Kube-proxy might indicate to load balancers that they
should forward traffic to the node in question, simply because the endpoint
is running on the node - this overlooks the fact that Kube-proxy might be
not-healthy and hasn't successfully written the rules enabling traffic to
reach the endpoint.
2023-03-06 10:53:17 +01:00
Daman
42a91c29e5 proxier: track metrics before conntrack cleaning 2023-03-02 20:56:05 +05:30
Daman
b23cb97704 proxier: syncing ipvs conntrack cleaning with iptables. 2023-03-02 20:54:34 +05:30
Dan Winship
0c2711bf24 Make NodePortAddresses abstraction around GetNodeAddresses/ContainsIPv4Loopback 2023-02-22 08:32:19 -05:00
Lars Ekman
a05b04ad96 Remove un-used function 2023-02-20 07:26:45 +01:00
Lars Ekman
32f8066119 Simplification and cleanup 2023-02-19 18:25:13 +01:00
Lars Ekman
8d63750c35 Generic sets in netlink and utils 2023-02-19 18:25:07 +01:00
Lars Ekman
17e2c7d535 Move variable closer to it's use 2023-02-19 18:25:02 +01:00
Lars Ekman
3325c7031d Generic sets in ipset.go 2023-02-19 18:24:56 +01:00
Lars Ekman
fbe671d3f0 Use generic sets 2023-02-19 18:24:51 +01:00
Lars Ekman
547db63bdf Drop the IPGetter 2023-02-19 18:24:45 +01:00
Son Dinh
4f75949bcb Ipvs: Add a new FlagSourceHash to "mh" distribution method.
With the flag, ipvs uses both source IP and source port (instead of
only source IP) to distribute new connections evently to endpoints
that avoids sending all connections from the same client (i.e. same
source IP) to one single endpoint.

User can explicitly set sessionAffinity in service spec to keep all
connections from a source IP to end up on the same endpoint if needed.

Change-Id: I42f950c0840ac06a4ee68a7bbdeab0fc5505c71f
2023-02-11 20:51:02 +11:00
Dan Winship
d901992eae Split out HealthCheckNodePort stuff from service/endpoint map Update()
In addition to actually updating their data from the provided list of
changes, EndpointsMap.Update() and ServicePortMap.Update() return a
struct with some information about things that changed because of that
update (eg services with stale conntrack entries).

For some reason, they were also returning information about
HealthCheckNodePorts, but they were returning *static* information
based on the current (post-Update) state of the map, not information
about what had *changed* in the update. Since this doesn't match how
the other data in the struct is used (and since there's no reason to
have the data only be returned when you call Update() anyway) , split
it out.
2023-01-22 10:33:33 -05:00
Hao Ruan
7f3de6e53a fix a typo in pkg/proxy/ipvs/proxier.go 2023-01-09 09:29:22 +08:00
Dan Winship
169604d906 Validate single-stack --nodeport-addresses sooner
In the dual-stack case, iptables.NewDualStackProxier and
ipvs.NewDualStackProxier filtered the nodeport addresses values by IP
family before creating the single-stack proxiers. But in the
single-stack case, the kube-proxy startup code just passed the value
to the single-stack proxiers without validation, so they had to
re-check it themselves. Fix that.
2023-01-03 09:01:45 -05:00
Dan Winship
e7ed7220eb Explicitly pass IP family to proxier
Rather than re-determining it from the iptables object in both proxies.
2023-01-03 09:01:45 -05:00
Lars Ekman
5ff705fd77 proxy/ipvs: Describe and handle a bug in moby/ipvs
Handle https://github.com/moby/ipvs/issues/27
A work-around was already in place, but a segv would occur
when the bug is fixed. That will not happen now.
2022-12-24 10:21:27 +01:00
Lars Ekman
68d78c89ec use netutils.ParseIPSloppy 2022-12-23 14:19:28 +01:00
Lars Ekman
dc86bdc3aa Handle an empty scheduler ("") 2022-12-23 13:23:02 +01:00
Lars Ekman
4adc687275 Fixed typo 2022-12-23 11:13:55 +01:00
Lars Ekman
cf214d0738 Clean-up un-used code 2022-12-23 10:54:51 +01:00
Lars Ekman
3bd3759424 gofmt 2022-12-23 10:14:47 +01:00
Lars Ekman
b169f22eb8 Updates after rewiew
To update the scheduler without node reboot now works.
The address for the probe VS is now 198.51.100.0 which is
reseved for documentation, please see rfc5737. The comment
about this is extended.
2022-12-23 10:11:15 +01:00
Lars Ekman
cd15ca0548 proxy/ipvs: Check that a dummy virtual server can be added
This tests both ipvs and the configured scheduler
2022-12-22 20:36:53 +01:00
Lars Ekman
4f02671b23 proxy/ipvs: Remove kernel module tests
To check kernel modules is a bad way to check functionality.
This commit just removes the checks and makes it possible to
use a statically linked kernel.

Minimal updates to unit-tests are made.
2022-12-22 17:13:41 +01:00
Kubernetes Prow Robot
53906cbe89
Merge pull request #114360 from dengyufeng2206/120802pull
Reduce redundant conversions
2022-12-16 11:06:09 -08:00
Kubernetes Prow Robot
d1879644a6
Merge pull request #113463 from wzshiming/clean/ioutil-ipvs
Replace the ioutil by the os and io for the pkg/proxy/ipvs
2022-12-16 11:05:58 -08:00
dengyufeng2206
ba6afe213a Reduce redundant conversions 2022-12-08 15:21:24 +08:00
Andy Voltz
29f4862ed8 Promote ServiceInternalTrafficPolicy to GA 2022-11-03 13:17:03 -04:00
Manav Agarwal
3320e50e24 If applied, this commit will refactor variable names in kube-proxy 2022-11-03 03:45:57 +05:30
Shiming Zhang
1c8d3226a9 Replace the ioutil by the os and io for the pkg/proxy/ipvs 2022-10-31 16:29:19 +08:00
DingShujie
e1f0b85334 Dismiss connects to localhost early in the service chain
Signed-off-by: DingShujie <dingshujie@huawei.com>
2022-10-11 13:57:35 +08:00
Lars Ekman
639b9bca5d Corrects target in the KUBE-IPVS-FILTER chain
The target was "ACCEPT" which disabled any other check like
loadBalancerSourceRanges in the KUBE-PROXY-FIREWALL chain.
The target is now "RETURN".
2022-09-15 07:49:12 +02:00
Kubernetes Prow Robot
9924814270
Merge pull request #108460 from Nordix/issue-72236
Prevent host access on VIP addresses in proxy-mode=ipvs
2022-09-01 12:59:18 -07:00
Sanskar Jaiswal
b670656a09 update ipvs proxier to update realserver weights at startup
Update the IPVS proxier to have a bool `initialSync` which is set to
true when a new proxier is initialized and then set to false on all
syncs. This lets us run startup-only logic, which subsequently lets us
update the realserver only when needed and avoiding any expensive
operations.

Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
2022-08-27 07:42:07 +00:00
Dan Winship
1609017f2b kube-proxy: remove ipvs-to-iptables fallback
If the user passes "--proxy-mode ipvs", and it is not possible to use
IPVS, then error out rather than falling back to iptables.

There was never any good reason to be doing fallback; this was
presumably erroneously added to parallel the iptables-to-userspace
fallback (which only existed because we had wanted iptables to be the
default but not all systems could support it).

In particular, if the user passed configuration options for ipvs, then
they presumably *didn't* pass configuration options for iptables, and
so even if the iptables proxy is able to run, it is likely to be
misconfigured.
2022-08-16 09:30:08 -04:00
Davanum Srinivas
a9593d634c
Generate and format files
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Dan Williams
f197509879 proxy: queue syncs on node events rather than syncing immediately
The proxies watch node labels for topology changes, but node labels
can change in bursts especially in larger clusters. This causes
pressure on all proxies because they can't filter the events, since
the topology could match on any label.

Change node event handling to queue the request rather than immediately
syncing. The sync runner can already handle short bursts which shouldn't
change behavior for most cases.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2022-07-18 09:21:52 -05:00
Dan Winship
a3556edba1 Stop trying to "preserve" iptables counters that are always 0
The iptables and ipvs proxies have code to try to preserve certain
iptables counters when modifying chains via iptables-restore, but the
counters in question only actually exist for the built-in chains (eg
INPUT, FORWARD, PREROUTING, etc), which we never modify via
iptables-restore (and in fact, *can't* safely modify via
iptables-restore), so we are really just doing a lot of unnecessary
work to copy the constant string "[0:0]" over from iptables-save
output to iptables-restore input. So stop doing that.

Also fix a confused error message when iptables-save fails.
2022-06-28 08:39:32 -04:00
Lars Ekman
c1e5a9e6f0 Prevent host access on VIP addresses in proxy-mode=ipvs 2022-06-24 08:33:58 +02:00
Dan Winship
28253f6030 proxy/ipvs: Use DROP directly rather than KUBE-MARK-DROP
The ipvs proxier was figuring out LoadBalancerSourceRanges matches in
the nat table and using KUBE-MARK-DROP to mark unmatched packets to be
dropped later. But with ipvs, unlike with iptables, DNAT happens after
the packet is "delivered" to the dummy interface, so the packet will
still be unmodified when it reaches the filter table (the first time)
so there's no reason to split the work between the nat and filter
tables; we can just do it all from the filter table and call DROP
directly.

Before:

  - KUBE-LOAD-BALANCER (in nat) uses kubeLoadBalancerFWSet to match LB
    traffic for services using LoadBalancerSourceRanges, and sends it
    to KUBE-FIREWALL.

  - KUBE-FIREWALL uses kubeLoadBalancerSourceCIDRSet and
    kubeLoadBalancerSourceIPSet to match allowed source/dest combos
    and calls "-j RETURN".

  - All remaining traffic that doesn't escape KUBE-FIREWALL is sent to
    KUBE-MARK-DROP.

  - Traffic sent to KUBE-MARK-DROP later gets dropped by chains in
    filter created by kubelet.

After:

  - All INPUT and FORWARD traffic gets routed to KUBE-PROXY-FIREWALL
    (in filter). (We don't use "KUBE-FIREWALL" any more because
    there's already a chain in filter by that name that belongs to
    kubelet.)

  - KUBE-PROXY-FIREWALL sends traffic matching kubeLoadbalancerFWSet
    to KUBE-SOURCE-RANGES-FIREWALL

  - KUBE-SOURCE-RANGES-FIREWALL uses kubeLoadBalancerSourceCIDRSet and
    kubeLoadBalancerSourceIPSet to match allowed source/dest combos
    and calls "-j RETURN".

  - All remaining traffic that doesn't escape
    KUBE-SOURCE-RANGES-FIREWALL is dropped (directly via "-j DROP").

  - (KUBE-LOAD-BALANCER in nat is now used only to set up masquerading)
2022-06-22 13:02:22 -04:00
Dan Winship
a9cd57fa40 proxy/ipvs: add filter table support to ipsetWithIptablesChain 2022-06-22 12:53:18 -04:00
Dan Winship
400d474bac proxy/ipvs: fix some identifiers
kubeLoadbalancerFWSet was the only LoadBalancer-related identifier
with a lowercase "b", so fix that.

rename TestLoadBalanceSourceRanges to TestLoadBalancerSourceRanges to
match the field name (and the iptables proxier test).
2022-06-13 09:13:15 -04:00
Dan Winship
0b1e364814 proxy/ipvs: fix a few comments 2022-06-12 20:30:47 -04:00
gkarthiks
1fd959e256 refactor: serviceNameString to svcptNameString
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

refactor: svc port name variable #108806

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

refactor: rename struct for service port information to servicePortInfo and fields for more redability

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>

fix: drop chain rule

Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
2022-05-22 03:31:00 -07:00
Dan Winship
b0d9c063a8 unexport mistakenly-exported constants 2022-05-06 07:33:29 -04:00
Dan Winship
84ad54f0e5 Don't increment "no local endpoints" metric when there are no remote endpoints
A service having no _local_ endpoints when it does have remote
endpoints is different from a service having no endpoints at all.
2022-05-04 12:38:17 -04:00