This commit did not actually work; in between when it was first
written and tested, and when it merged, the code in
pkg/proxy/endpoints.go was changed to only add UDP endpoints to the
"stale endpoints"/"stale services" lists, and so checking for "either
UDP or SCTP" rather than just UDP when processing those lists had no
effect.
This reverts most of commit aa8521df66
(but leaves the changes related to
ipvs.IsRsGracefulTerminationNeeded(), since that change actually did
have its intended effect).
Today, the health check response to load balancers asking kube-proxy for
the status of ETP:Local services does not include kube-proxy's own healthz
state. This means that kube-proxy might tell load balancers to forward
traffic to the node in question simply because an endpoint is running on
the node, overlooking the fact that kube-proxy itself might be unhealthy
and might not have successfully written the rules enabling traffic to
reach the endpoint.
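A minimal sketch of the intended behavior, written as a simplified standalone server rather than kube-proxy's real healthz code; the names localEndpoints and lastProxierSync and the two-minute staleness bound are hypothetical. The per-service health check answers 200 only when the node has local endpoints for the service and the proxier itself has synced its rules recently.

```go
// Hypothetical, simplified standalone sketch; not kube-proxy's real healthz code.
package main

import (
    "fmt"
    "net/http"
    "sync/atomic"
    "time"
)

var (
    // localEndpoints is the number of ready endpoints for the service that
    // are running on this node (hypothetical bookkeeping).
    localEndpoints atomic.Int64
    // lastProxierSync is the time (unix nanoseconds) of the last successful
    // rule sync by the proxier (hypothetical bookkeeping).
    lastProxierSync atomic.Int64
)

func healthzHandler(w http.ResponseWriter, r *http.Request) {
    hasEndpoints := localEndpoints.Load() > 0
    // The 2-minute staleness bound is illustrative; a real implementation
    // would derive it from the configured sync period.
    proxierHealthy := time.Since(time.Unix(0, lastProxierSync.Load())) < 2*time.Minute

    if hasEndpoints && proxierHealthy {
        w.WriteHeader(http.StatusOK)
        fmt.Fprintf(w, "%d local endpoints, proxier healthy\n", localEndpoints.Load())
        return
    }
    // 503 tells the load balancer not to send traffic here: either there are
    // no local endpoints, or the rules that would deliver the traffic may not
    // have been written successfully yet.
    w.WriteHeader(http.StatusServiceUnavailable)
}

func main() {
    http.HandleFunc("/healthz", healthzHandler)
    _ = http.ListenAndServe(":10256", nil)
}
```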
For some reason we were calculating the available nodeport IPs at the
top of syncProxyRules even though we didn't use them until the end.
(Well, the previous code avoided generating KUBE-NODEPORTS chain rules
if there were no node IPs available, but that case is considered an
error anyway, so there's no need to optimize it.)
(Also fix a stale `err` reference exposed by this move.)
In addition to actually updating their data from the provided list of
changes, EndpointsMap.Update() and ServicePortMap.Update() return a
struct with some information about things that changed because of that
update (eg services with stale conntrack entries).
For some reason, they were also returning information about
HealthCheckNodePorts, but they were returning *static* information
based on the current (post-Update) state of the map, not information
about what had *changed* in the update. Since this doesn't match how
the other data in the struct is used (and since there's no reason to
have the data only be returned when you call Update() anyway), split
it out.
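A minimal sketch of the split, using simplified hypothetical types rather than the real proxy package: the "what changed" data stays in the Update() result, while the static health-check node port information becomes a method that can be called at any time.

```go
package proxy

// ServicePortMap is a heavily simplified, hypothetical stand-in:
// service name -> healthCheckNodePort (0 if the service has none).
type ServicePortMap map[string]int

// UpdateResult now carries only information about what *changed* in this
// update, matching how its other fields are used.
type UpdateResult struct {
    DeletedUDPClusterIPs []string // services whose conntrack entries may now be stale
}

// Update applies the pending changes and reports what changed.
func (sm ServicePortMap) Update(changes map[string]int) UpdateResult {
    for name, port := range changes {
        if port < 0 {
            delete(sm, name)
        } else {
            sm[name] = port
        }
    }
    return UpdateResult{} // stale-conntrack bookkeeping elided
}

// HealthCheckNodePorts returns a static snapshot of the current map state;
// callers no longer have to capture it from an Update() result.
func (sm ServicePortMap) HealthCheckNodePorts() map[string]int {
    ports := make(map[string]int)
    for name, port := range sm {
        if port != 0 {
            ports[name] = port
        }
    }
    return ports
}
```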
The unit tests were broken with MinimizeIPTablesRestore enabled
because syncProxyRules() assumed that needFullSync would be set on the
first (post-setInitialized()) run, but the unit tests didn't ensure
that.
(In fact, there was a race condition in the real Proxier case as well;
theoretically syncProxyRules() could be run by the
BoundedFrequencyRunner after OnServiceSynced() called setInitialized()
but before it called forceSyncProxyRules(), thus causing the first
real sync to try to do a partial sync and fail. This is now fixed as
well.)
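One way to close that race, sketched with a heavily simplified, hypothetical Proxier (the real fix may differ in detail): start with needFullSync set to true, so the first sync is a full sync no matter how the sync runner interleaves with OnServiceSynced().

```go
package proxy

import "sync"

// Proxier is a heavily simplified, hypothetical stand-in.
type Proxier struct {
    mu           sync.Mutex
    initialized  bool
    needFullSync bool
}

func NewProxier() *Proxier {
    // The first sync must always be a full sync, so start with the flag set
    // rather than relying on forceSyncProxyRules() being called first.
    return &Proxier{needFullSync: true}
}

func (p *Proxier) OnServiceSynced() {
    p.mu.Lock()
    p.initialized = true
    p.mu.Unlock()
    // Even if the BoundedFrequencyRunner fires between these two calls, the
    // first sync is still full, because needFullSync was already true.
    p.forceSyncProxyRules()
}

func (p *Proxier) forceSyncProxyRules() {
    p.mu.Lock()
    p.needFullSync = true
    p.mu.Unlock()
    p.syncProxyRules()
}

func (p *Proxier) syncProxyRules() {
    p.mu.Lock()
    defer p.mu.Unlock()
    if !p.initialized {
        return
    }
    if p.needFullSync {
        // ... rewrite all chains (elided) ...
    } else {
        // ... partial sync of only the changed chains (elided) ...
    }
    p.needFullSync = false
}
```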
In the dual-stack case, iptables.NewDualStackProxier and
ipvs.NewDualStackProxier filtered the nodeport addresses by IP
family before creating the single-stack proxiers. But in the
single-stack case, the kube-proxy startup code just passed the value
to the single-stack proxiers without validation, so they had to
re-check it themselves. Fix that.
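A minimal sketch of the kind of per-family filtering the dual-stack constructors were already doing, written against the standard library only; the real code uses helpers from k8s.io/utils/net, and the function name here is hypothetical.

```go
package main

import (
    "fmt"
    "net"
)

// filterByFamily keeps only the CIDR strings of the requested IP family.
func filterByFamily(cidrs []string, wantIPv6 bool) []string {
    var out []string
    for _, s := range cidrs {
        ip, _, err := net.ParseCIDR(s)
        if err != nil {
            continue // real code would surface an error for malformed entries
        }
        if (ip.To4() == nil) == wantIPv6 {
            out = append(out, s)
        }
    }
    return out
}

func main() {
    nodePortAddresses := []string{"10.0.0.0/8", "fd00::/64", "192.168.1.0/24"}
    fmt.Println(filterByFamily(nodePortAddresses, false)) // [10.0.0.0/8 192.168.1.0/24]
    fmt.Println(filterByFamily(nodePortAddresses, true))  // [fd00::/64]
}
```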
This rule was mistakenly added to kubelet even though it only applies
to kube-proxy's traffic. We do not want to remove it from kubelet yet
because other components may be depending on it for security, but we
should make kube-proxy output its own rule rather than depending on
kubelet.
Some of the chains kube-proxy creates are also created by kubelet; we
need to ensure that those chains exist but we should not delete them
in CleanupLeftovers().
We currently invoke /sbin/iptables 24 times during each syncProxyRules call
before calling iptables-restore. Since even trivial iptables
invocations are slow on hosts with lots of iptables rules, this adds a
lot of time to each sync. Since these checks are expected to be a
no-op 99% of the time, skip them on partial syncs.
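A minimal sketch of the idea, with hypothetical helper names: gate the individual per-chain checks on needFullSync so that partial syncs skip them entirely.

```go
package proxy

// Proxier is a simplified, hypothetical stand-in.
type Proxier struct {
    needFullSync bool
}

func (p *Proxier) syncProxyRules() {
    if p.needFullSync {
        // Full syncs still run the individual `iptables -N/-C/-I` invocations
        // that make sure the top-level chains and jump rules exist.
        p.ensureBaseChainsAndJumpRules()
    }
    // Partial syncs skip those checks, since they are expected to be a
    // no-op ~99% of the time.
    // ... build and apply the iptables-restore payload (elided) ...
}

func (p *Proxier) ensureBaseChainsAndJumpRules() {
    // ~24 individual /sbin/iptables executions (elided), each of which is
    // slow on hosts with large rulesets.
}
```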
iptables-restore requires that if you change any rule in a chain, you
have to rewrite the entire chain. But if you avoid mentioning a chain
at all, it will leave it untouched. Take advantage of this by not
rewriting the SVC, SVL, EXT, FW, and SEP chains for services that have
not changed since the last sync, which should drastically cut down on
the size of each iptables-restore in large clusters.
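A minimal illustration of the iptables-restore behavior being exploited (the chain names and addresses are made up): when a payload is applied with --noflush, only the chains declared in it are flushed and rewritten; every KUBE-* chain it doesn't mention is left exactly as it was.

```go
package main

import "fmt"

func main() {
    // Only the chains for the service that changed appear in the payload.
    // All other KUBE-SVC-*/KUBE-SEP-* chains stay untouched, so the payload
    // stays small even in large clusters.
    const partialRestore = `*nat
:KUBE-SERVICES - [0:0]
:KUBE-SVC-CHANGED - [0:0]
:KUBE-SEP-CHANGED - [0:0]
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m tcp --dport 80 -j KUBE-SVC-CHANGED
-A KUBE-SVC-CHANGED -j KUBE-SEP-CHANGED
-A KUBE-SEP-CHANGED -p tcp -m tcp -j DNAT --to-destination 10.244.1.5:8080
COMMIT
`
    // Feeding this to `iptables-restore --noflush` rewrites only the three
    // declared chains; every unmentioned chain keeps its existing rules.
    fmt.Print(partialRestore)
}
```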
Back when iptables was first made the default, there were
theoretically some users who wouldn't have been able to support it due
to having an old /sbin/iptables. But kube-proxy no longer does the
things that didn't work with old iptables, and we removed that check a
long time ago. There is also a check for a new-enough kernel version,
but it's checking for a feature which was added in kernel 3.6, and no
one could possibly be running Kubernetes with a kernel that old. So
the fallback code never actually falls back any more, and it should
just be removed.
The proxies watch node labels for topology changes, but node labels
can change in bursts especially in larger clusters. This causes
pressure on all proxies because they can't filter the events, since
the topology could match on any label.
Change node event handling to queue the request rather than immediately
syncing. The sync runner can already handle short bursts, so this
shouldn't change behavior in most cases.
Signed-off-by: Dan Williams <dcbw@redhat.com>
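A minimal sketch of the change above, with simplified, hypothetical types: the node event handler updates the cached labels and then just asks the rate-limited sync runner to schedule a run instead of syncing inline.

```go
package proxy

// syncRunner and Proxier are simplified, hypothetical stand-ins.
type syncRunner interface {
    // Run schedules an asynchronous, rate-limited call to syncProxyRules.
    Run()
}

type Proxier struct {
    nodeLabels map[string]string
    runner     syncRunner
}

// OnNodeUpdate is called for every node object update, which can arrive in
// bursts when labels churn in large clusters.
func (p *Proxier) OnNodeUpdate(labels map[string]string) {
    p.nodeLabels = labels
    // Before: p.syncProxyRules() ran immediately on every label change.
    // After: just queue a sync; the runner coalesces short bursts of events.
    p.runner.Run()
}
```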
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This moves the FW chain creation to the end (rather than having it in
the middle of adding the jump rules for the LB IPs).
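A compact, hypothetical skeleton of the loop structure this series of commits is working toward; every name below is illustrative rather than the real code.

```go
package proxy

// Proxier and servicePort are hypothetical stand-ins for the real types.
type Proxier struct{}

type servicePort struct {
    name   string
    chains []string // the SVC/SVL/EXT/FW/SEP chains this service needs
}

func (p *Proxier) writeServiceRules(services []servicePort) {
    activeNATChains := map[string]bool{}

    for _, svc := range services {
        // 1. figure out which chains this service needs; mark them active.
        for _, chain := range svc.chains {
            activeNATChains[chain] = true
        }
        // 2. write this service's jump rules into KUBE-SERVICES /
        //    KUBE-NODEPORTS (elided).
        // 3. write the servicePort-specific SVC/SVL/EXT/FW/SEP chains
        //    themselves (elided).
    }

    _ = activeNATChains // chains not marked active are later cleaned up
}
```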
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the jump rules for internal traffic. Previously we were
handling "jumping from kubeServices to internalTrafficChain" and
"adding masquerade rules to internalTrafficChain" in the same place.
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the handling of the EXT chain.
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the handling of the SVC and SVL chains. We were already
filling them in at the end of the loop; this fixes it to create them
at the bottom of the loop as well.
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the handling of the endpoint chains. Previously they were
handled entirely at the top of the loop. Now we record which ones are
in use at the top but don't create them and fill them in until the
bottom.
We figure out early on whether we're going to end up outputting no
endpoints, so update the metrics then.
(Also remove a redundant feature gate check; svcInfo already checks
the ServiceInternalTrafficPolicy feature gate itself and so
svcInfo.InternalPolicyLocal() will always return false if the gate is
not enabled.)
Rather than marking packets to be dropped in the "nat" table and then
dropping them from the "filter" table later, just use rules in
"filter" to drop the packets we don't like directly.
"iptables-save" takes several seconds to run on machines with lots of
iptables rules, and we only use its result to figure out which chains
are no longer referenced by any rules. While it makes things less
confusing if we delete unused chains immediately, it's not actually
_necessary_ since they never get called during packet processing. So
in large clusters, make it so we only clean up chains periodically
rather than on every sync.
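A minimal sketch of the gating, with hypothetical fields and an illustrative interval: the iptables-save scan and stale-chain deletion only happen when enough time has passed since the last cleanup.

```go
package proxy

import "time"

// Proxier is a simplified, hypothetical stand-in.
type Proxier struct {
    lastChainCleanup time.Time
}

// chainCleanupInterval is an illustrative value, not the real setting.
const chainCleanupInterval = time.Hour

func (p *Proxier) maybeCleanupStaleChains() {
    if time.Since(p.lastChainCleanup) < chainCleanupInterval {
        // Stale chains are no longer referenced by any rule, so they are
        // never hit during packet processing; leaving them around for a
        // while is harmless.
        return
    }
    // ... run iptables-save, find KUBE-* chains that are no longer
    // referenced, and delete them (elided) ...
    p.lastChainCleanup = time.Now()
}
```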
We don't need to parse out the counter values from the iptables-save
output (since they are always 0 for the chains we care about). Just
parse the chain names themselves.
Also, all of the callers of GetChainLines() pass it input that
contains only a single table, so just assume that, rather than
carefully parsing only a single table's worth of the input.
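A minimal sketch of the simplified parsing, assuming single-table input as the callers now guarantee; the function name is hypothetical. Only the ":CHAIN" declaration lines matter, and the trailing "[packets:bytes]" counters are ignored entirely.

```go
package main

import (
    "bufio"
    "fmt"
    "strings"
)

// getChainNames extracts chain names from single-table iptables-save output.
func getChainNames(save string) []string {
    var chains []string
    scanner := bufio.NewScanner(strings.NewReader(save))
    for scanner.Scan() {
        line := scanner.Text()
        if !strings.HasPrefix(line, ":") {
            continue // only ":CHAIN POLICY [pkts:bytes]" lines declare chains
        }
        fields := strings.Fields(line[1:])
        if len(fields) > 0 {
            chains = append(chains, fields[0])
        }
    }
    return chains
}

func main() {
    const save = `*nat
:PREROUTING ACCEPT [12:720]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-XYZ - [0:0]
-A KUBE-SERVICES -j KUBE-SVC-XYZ
COMMIT
`
    fmt.Println(getChainNames(save)) // [PREROUTING KUBE-SERVICES KUBE-SVC-XYZ]
}
```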
The iptables and ipvs proxies have code that tries to preserve certain
iptables counters when modifying chains via iptables-restore. But the
counters in question only actually exist for the built-in chains (eg
INPUT, FORWARD, PREROUTING, etc), which we never modify via
iptables-restore (and in fact *can't* safely modify via
iptables-restore). So we are really just doing a lot of unnecessary
work copying the constant string "[0:0]" from iptables-save output to
iptables-restore input. Stop doing that.
Also fix a confused error message when iptables-save fails.
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
refactor: svc port name variable #108806
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
refactor: rename the struct for service port information to servicePortInfo and its fields for more readability
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
fix: drop chain rule
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
The various loops in the LoadBalancer rule section were mis-nested
such that if a service had multiple LoadBalancer IPs, we would write
out the firewall rules multiple times (and the allowFromNode rule for
the second and later IPs would end up being written after the "else
DROP" rule from the first IP).