Commit Graph

93 Commits

Author SHA1 Message Date
wojtekt
6e50f39dbd Avoid allocations when parsing iptables 2018-07-08 10:55:19 +02:00
wojtekt
d073b2097f Optimize iptables 2018-07-06 14:25:56 +02:00
xujieasd
368cb99d0b fix iptables_test typo 2018-06-13 15:12:40 +08:00
Zihong Zheng
f6eed81f21 [kube-proxy] Mass service/endpoint info functions rename and comments 2018-02-27 11:14:02 -08:00
Zihong Zheng
b485f7b5b4 [kube-proxy] Move Service/EndpointInfo common codes to change tracker 2018-02-27 11:05:59 -08:00
m1093782566
ddfa04e8f4 iptables part implementation 2018-02-26 23:48:47 +08:00
Pavithra Ramesh
098a4467fe Remove conntrack entry on udp rule add.
Moved conntrack util outside of proxy pkg
Added warning message if conntrack binary is not found
Addressed review comments.
ran gofmt
2018-02-22 23:34:42 -08:00
Kubernetes Submit Queue
f0ca996274
Merge pull request #56164 from danwinship/proxier-chain-split
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Split KUBE-SERVICES chain to re-shrink the INPUT chain

**What this PR does / why we need it**:
#43972 added an iptables rule "`-A INPUT -j KUBE-SERVICES`" to make NodePort ICMP rejection work. (Previously the KUBE-SERVICES chain was only run from OUTPUT, not INPUT.) #44547 extended that patch for ExternalIP rejection as well.

However, the KUBE-SERVICES chain may potentially have a very large number of ICMP reject rules for plain ClusterIP services (the ones that get run from OUTPUT), and it seems that for some reason the kernel is much more sensitive to the length of the INPUT chain than it is to the length of the OUTPUT chain. So a node that worked fine with kube 1.6 (when KUBE-SERVICES was only run from OUTPUT) might fall over with kube 1.7 (with KUBE-SERVICES being run from both INPUT and OUTPUT).

(Specifically, a node with about 5000 ClusterIP reject rules that ran fine with OpenShift 3.6 [kube 1.6] slowed almost to a complete halt with OpenShift 3.7 [kube 1.7].)

This PR fixes things by splitting out the "new" part of KUBE-SERVICES (NodePort and ExternalIP reject rules) into a separate KUBE-EXTERNAL-SERVICES chain run from INPUT, and moves KUBE-SERVICES back to being only run from OUTPUT. (So, yes, this assumes that you don't have 5000 NodePort/ExternalIP services, but, if you do, there's not much we can do, since those rules *have* to be run on the INPUT side.)

Oh, and I left in the code to clean up the "`-A INPUT -j KUBE-SERVICES`" rule even though we don't generate it any more, so it gets fixed on upgrade.

**Release note**:
```release-note
Reorganized iptables rules to fix a performance regression on clusters with thousands of services.
```

@kubernetes/sig-network-bugs @kubernetes/rh-networking
2018-02-22 18:52:53 -08:00
m1093782566
f3512cbbb9 iptables proxier part changes 2018-02-09 17:20:51 +08:00
Dan Winship
780d5954e0 Split out a KUBE-EXTERNAL-SERVICES chain so we don't have to run KUBE-SERVICES from INPUT 2018-02-07 10:20:52 -05:00
m1093782566
b015f1f567 add ut for localhost nodeport 2018-01-15 11:05:21 +08:00
Kubernetes Submit Queue
02ca5cac01
Merge pull request #53555 from leblancd/v6_del_endpoint_proxier
Automatic merge from submit-queue (batch tested with PRs 55988, 53555, 55858). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add IPv6 and negative UT test cases for proxier's deleteEndpointConnections

This change adds IPv6 and negative UT test cases for the proxier's deleteEndpointConnections.

Changes include:
- Add IPv6 UT test cases to TestDeleteEndpointConnections.
- Add negative UT test case to TestDeleteEndpointConnections for
  handling case where no connections need clearing (benign error).
- Add negative UT test case to test unexpected error.
- Reorganize UT in TestDeleteEndpointConnections so that the fake
  command executor's command and scripted responses are generated on
  the fly based on the test case table (rather than using a fixed
  set of commands/responses that will need to be updated every time
  test cases are added/deleted).
- Create the proxier service map in real time, based on the test case
  table (rather than using a fixed service map that will need to be updated
  every time test cases are added/deleted).

fixes #53554



**What this PR does / why we need it**:
This change adds IPv6 and negative UT test cases for the proxier's
deleteEndpointConnections.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53554

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-11-18 20:31:23 -08:00
Kubernetes Submit Queue
cae7240cf9
Merge pull request #55601 from m1093782566/getlocalips
Automatic merge from submit-queue (batch tested with PRs 55009, 55532, 55601, 52569, 55533). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix ipvs/proxy getLocalIPs inconsistency with iptables/proxy

**What this PR does / why we need it**:

* Fix ipvs/proxy `getLocalIPs()` inconsistency with iptables/proxy

* validate the ip address before pkg/proxy/util IPPart() return ip string.

**Which issue(s) this PR fixes** :
Fixes #55612

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-11-14 00:09:52 -08:00
m1093782566
42832e7666 fix ipvs proxier getLocalIPs() error 2017-11-13 17:55:53 +08:00
Kubernetes Submit Queue
d6cabaf706
Merge pull request #55568 from m1093782566/unsortlist
Automatic merge from submit-queue (batch tested with PRs 53580, 55568). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Replace sets.List() with sets.UnsortedList() in pkg/proxy

**What this PR does / why we need it**:

Replace sets.List() with sets.UnsortedList() in pkg/proxy - sets.List() will sort the result array, we don't need sorted array in pkg/proxy. Using sets.UnsortedList() can reduce the unnecessary overhead spending.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

@wojtek-t wdyt ^_^

**Release note**:

```release-note
NONE
```

/sig network
2017-11-12 21:07:37 -08:00
m1093782566
83ada5c7bf replace sets.List() with sets.UnsortedList() 2017-11-13 10:20:54 +08:00
Zihong Zheng
f7ed9cf09a [kube-proxy] Fix session affinity with local endpoints traffic 2017-11-10 18:42:07 -08:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Dane LeBlanc
799341f2dc Add IPv6 and negative UT test cases for proxier's deleteEndpointConnections
This change adds IPv6 and negative UT test cases for the proxier's
deleteEndpointConnections.

Changes include:
- Add IPv6 UT test cases to TestDeleteEndpointConnections.
- Add negative UT test case to TestDeleteEndpointConnections for
  handling case where no connections need clearing (benign error).
- Add negative UT test case to test unexpected error.
- Reorganize UT in TestDeleteEndpointConnections so that the fake
  command executor's command and scripted responses are generated on
  the fly based on the test case table (rather than using a fixed
  set of commands/responses that will need to be updated every time
  test cases are added/deleted).
- Create the proxier service map in real time, based on the test case
  table (rather than using a fixed service map that will need to be updated
  every time test cases are added/deleted).

fixes #53554
2017-10-12 20:07:19 -04:00
m1093782566
d96409178b consume endpoints IPPart function in util 2017-10-11 09:51:58 +08:00
Dane LeBlanc
5fbc9e45cc Add IPv6 support to iptables proxier
The following changes are proposed for the iptables proxier:

* There are three places where a string specifying IP:port is parsed
  using something like this:

      if index := strings.Index(e.endpoint, ":"); index != -1 {

  This will fail for IPv6 since V6 addresses contain colons. Also,
  the V6 address is expected to be surrounded by square brackets
  (i.e. []:). Fix this by replacing call to Index with
  call to LastIndex() and stripping out square brackets.
* The String() method for the localPort struct should put square brackets
  around IPv6 addresses.
* The logging in the merge() method for proxyServiceMap should put brackets
  around IPv6 addresses.
* There are several places where filterRules destination is hardcoded to
  /32. This should be a /128 for IPv6 case.
* Add IPv6 unit test cases

fixes #48550
2017-09-16 09:16:12 -04:00
Kubernetes Submit Queue
b65f3cc8dd Merge pull request #49850 from m1093782566/service-session-timeout
Automatic merge from submit-queue (batch tested with PRs 49850, 47782, 50595, 50730, 51341)

Paramaterize `stickyMaxAgeMinutes` for service in API

**What this PR does / why we need it**:

Currently I find `stickyMaxAgeMinutes` for a session affinity type service is hard code to 180min. There is a TODO comment, see

https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/iptables/proxier.go#L205

I think the seesion sticky max time varies from service to service and users may not aware of it since it's hard coded in all proxier.go - iptables, userspace and winuserspace.

Once we parameterize it in API, users can set/get the values for their different services.

Perhaps, we can introduce a new field `api.ClientIPAffinityConfig` in `api.ServiceSpec`.

There is an initial discussion about it in sig-network group. See,

https://groups.google.com/forum/#!topic/kubernetes-sig-network/i-LkeHrjs80

**Which issue this PR fixes**: 

fixes #49831

**Special notes for your reviewer**:

**Release note**:

```release-note
Paramaterize session affinity timeout seconds in service API for Client IP based session affinity.
```
2017-08-25 20:43:30 -07:00
m1093782566
c355a2ac96 Paramaterize stickyMaxAgeMinutes for service in API 2017-08-25 17:44:47 +08:00
m1093782566
a7fd545d49 clean up LocalPort in proxier.go 2017-08-24 11:16:38 +08:00
xiangpengzhao
ebe21ee4c1 Remove deprecated ESIPP beta annotations 2017-08-05 15:00:58 +08:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Minhan Xia
68a2749b28 fix unit tests 2017-07-06 16:01:03 -07:00
xiangpengzhao
9e31eb280a Populate endpoints and allow ports with headless service 2017-06-28 11:15:51 +08:00
Wojciech Tyczynski
03c255d7c5 Store chain names to avoid recomputing them multiple times 2017-05-30 10:50:10 +02:00
Wojciech Tyczynski
c4d51f12a2 Store port endpoint chain names to avoid recomputing it multiple times 2017-05-30 10:49:36 +02:00
Wojciech Tyczynski
070f393bc8 Precompute probabilities in iptables kube-proxy. 2017-05-30 10:49:34 +02:00
Tim Hockin
2856fde23b Use BoundedFrequencyRunner in kube-proxy 2017-05-24 20:33:15 -07:00
Wojciech Tyczynski
758c9666e5 Call syncProxyRules when really needed and remove reasons 2017-05-20 18:46:28 +02:00
Wojciech Tyczynski
05ffcccdc1 Check whether endpoints change 2017-05-20 14:22:07 +02:00
Wojciech Tyczynski
a3da8d7300 Fix naming and comments in kube-proxy. 2017-05-19 21:34:05 +02:00
Wojciech Tyczynski
e3bb755270 Reuse buffers for generated iptables rules 2017-05-19 20:44:26 +02:00
Wojciech Tyczynski
5464c39333 Reuse buffer for getting iptables contents 2017-05-19 20:44:25 +02:00
Zihong Zheng
aca4d469b2 Revert "Remove reasons from iptables syncProxyRules"
This reverts commit 77624a12d3.
2017-05-17 16:33:13 -07:00
Zihong Zheng
c0920f75cf Move API annotations into annotation_key_constants and remove api/annotations package 2017-05-16 21:55:23 -07:00
Wojciech Tyczynski
77624a12d3 Remove reasons from iptables syncProxyRules 2017-05-12 13:32:02 +02:00
Wojciech Tyczynski
eb6949a53e Change locking mechanism in kube-proxy 2017-04-28 09:40:39 +02:00
Jeff Grafton
6a0c06926a gofmt proxier_test for go1.8.1 2017-04-25 11:23:59 -07:00
Ketan Kulkarni
ac7c026ee7 Reject Rules for ExternalIP and svc port if no ep
- Install ICMP Reject Rules for externalIP and svc port
  if no endpoints are present
- Includes Unit Test case
- Fixes #44516
2017-04-21 16:48:24 -07:00
Wojciech Tyczynski
c7353432df Don't rebuild service map in iptables kube-proxy all the time 2017-04-21 09:41:27 +02:00
Wojciech Tyczynski
2f250435fd Don't rebuild endpoints map in iptables kube-proxy all the time. 2017-04-20 08:34:46 +02:00
Wojciech Tyczynski
7a647f9d1a Event-based iptables proxy for services 2017-04-18 13:30:59 +02:00
Kubernetes Submit Queue
284615d79d Merge pull request #43702 from wojtek-t/edge_based_proxy
Automatic merge from submit-queue

Edge-based userspace LB in kube-proxy

@thockin @bowei - if one of you could take a look if that PR doesn't break some basic kube-proxy assumptions. The similar change for winuserproxy should be pretty trivial.

And we should also do that for iptables, but that requires splitting the iptables code to syncProxyRules (which from what I know @thockin already started working on so we should probably wait for it to be done).
2017-04-12 00:30:53 -07:00
Dan Williams
70e53ace17 proxy/iptables: precompute svcPortName strings
With many services, the calls to svcPortName.String() show up as a
somewhat significant CPU user under syncProxyRules().
2017-04-10 12:49:06 -05:00
Wojciech Tyczynski
b1475565e6 Edge-based iptables proxy 2017-04-10 13:12:45 +02:00
Tim Hockin
9bfb88d2d7 Fix a couple nits from previous reviews. 2017-04-07 20:47:11 -07:00