Commit Graph

114 Commits

Author SHA1 Message Date
Rob Scott
3924364585
Making iptables probability more granular in kube-proxy.
Until now, iptables probabilities had 5 decimal places of granularity.
That meant that probabilities would start to repeat once a Service
had 319 or more endpoints.

This doubles the granularity to 10 decimal places, ensuring that
probabilities will not repeat until a Service reaches 100,223 endpoints.
2019-10-07 17:37:33 -07:00
Rob Scott
af56f25797
Only detecting stale connections for UDP ports in kube-proxy.
The detectStaleConnections function in kube-proxy is very expensive in
terms of CPU utilization. The results of this function are only actually
used for UDP ports. This adds a protocol attribute to ServicePortName to
make it simple to only run this function for UDP connections. For
clusters with primarily TCP connections this can improve kube-proxy
performance by 2x.
2019-09-25 17:48:54 -07:00
Kubernetes Prow Robot
61ecdba9ca
Merge pull request #82289 from robscott/endpointslice-fixes
Fixing bugs related to Endpoint Slices
2019-09-05 09:03:10 -07:00
Rob Scott
8f9483d827
Fixing bugs related to Endpoint Slices
This should fix a bug that could break masters when the EndpointSlice
feature gate was enabled. This was all tied to how the apiserver creates
and manages it's own services and endpoints (or in this case endpoint
slices). Consumers of endpoint slices also need to know about the
corresponding service. Previously we were trying to set an owner
reference here for this purpose, but that came with potential downsides
and increased complexity. This commit changes behavior of the apiserver
endpointslice integration to set the service name label instead of owner
references, and simplifies consumer logic to reference that (both are
set by the EndpointSlice controller).

Additionally, this should fix a bug with the EndpointSlice GenerateName
value that had previously been set with a "." as a suffix.
2019-09-04 09:09:32 -07:00
Mike Spreitzer
d86d1defa1 Made IPVS and iptables modes of kube-proxy fully randomize masquerading if possible
Work around Linux kernel bug that sometimes causes multiple flows to
get mapped to the same IP:PORT and consequently some suffer packet
drops.

Also made the same update in kubelet.

Also added cross-pointers between the two bodies of code, in comments.

Some day we should eliminate the duplicate code.  But today is not
that day.
2019-09-01 22:07:30 -04:00
Rob Scott
9665c590c7
Adding EndpointSlice support for kube-proxy ipvs and iptables proxiers 2019-08-29 01:06:52 -07:00
Kubernetes Prow Robot
0c9964fac3
Merge pull request #76160 from JacobTanenbaum/BaseServiceInfo-cleanup
enforce the interface relationship between ServicePort and BaseServiceInfo
2019-06-13 20:37:13 -07:00
Jacob Tanenbaum
c0392d72e9 enforce the interface relationship between ServicePort and BaseServiceInfo
Currently the BaseServiceInfo struct implements the ServicePort interface, but
only uses that interface sometimes. All the elements of BaseServiceInfo are exported
and sometimes the interface is used to access them and othertimes not

I extended the ServicePort interface so that all relevent values can be accessed through
it and unexported all the elements of BaseServiceInfo
2019-06-05 14:50:24 -04:00
Kubernetes Prow Robot
bdf3d248eb
Merge pull request #77523 from andrewsykim/fix-xlb-from-local
iptables proxier: route local traffic to LB IPs to service chain
2019-05-31 12:22:53 -07:00
Andrew Sy Kim
8dfd4def99 add unit tests for -src-type=LOCAL from LB chain
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
2019-05-07 15:22:46 -04:00
Andrew Sy Kim
b926fb9d2b iptables proxier: route local traffic to LB IPs to service chain
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
2019-05-07 15:22:46 -04:00
Jacob Tanenbaum
9d4693a70f changing UpdateEndpointsMap to Update
changing UpdateEndpointsMap to be a function of the EndpointsMap object
2019-05-07 14:41:15 -04:00
Davanum Srinivas
7b8c9acc09
remove unused code
Change-Id: If821920ec8872e326b7d85437ad8d2620807799d
2019-04-19 08:36:31 -04:00
WanLinghao
d0138ca3fe This commit does two things in pkg package:
1. Remove unused ptr functions.
2. Replace ptr functions with k8s.io/utils/pointer
2019-04-09 10:56:35 +08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
Kubernetes Submit Queue
11c47e1872
Merge pull request #67948 from wojtek-t/use_buffers_in_kube_proxy
Automatic merge from submit-queue (batch tested with PRs 66577, 67948, 68001, 67982). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

Reduce amount of allocations in kube-proxy

Follow up from https://github.com/kubernetes/kubernetes/pull/65902
2018-08-29 16:33:34 -07:00
wojtekt
8fb365df32 Reduce amount of allocations in kube-proxy 2018-08-28 15:18:58 +02:00
Laszlo Janosi
cbe94df8c6 gofmt update 2018-08-27 05:59:50 +00:00
Laszlo Janosi
e466bdc67e Changes according to the approved KEP. SCTP is supported for HostPort and LoadBalancer. Alpha feature flag SCTPSupport controls the support of SCTP. Kube-proxy config parameter is removed. 2018-08-27 05:58:36 +00:00
Laszlo Janosi
a6da2b1472 K8s SCTP support implementation for the first pull request
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.

SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.

SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter

changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.

netexec.go compilation fixed. Various test cases fixed

SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)

SCTP related e2e test cases are removed as the e2e test systems do not support SCTP

sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go

cluster/gce/util.sh is copied from master
2018-08-27 05:56:27 +00:00
x00416946 fisherxu
79e17e6cd7 use versioned api in kube-proxy 2018-08-16 09:59:33 +08:00
wojtekt
6e50f39dbd Avoid allocations when parsing iptables 2018-07-08 10:55:19 +02:00
wojtekt
d073b2097f Optimize iptables 2018-07-06 14:25:56 +02:00
xujieasd
368cb99d0b fix iptables_test typo 2018-06-13 15:12:40 +08:00
Zihong Zheng
f6eed81f21 [kube-proxy] Mass service/endpoint info functions rename and comments 2018-02-27 11:14:02 -08:00
Zihong Zheng
b485f7b5b4 [kube-proxy] Move Service/EndpointInfo common codes to change tracker 2018-02-27 11:05:59 -08:00
m1093782566
ddfa04e8f4 iptables part implementation 2018-02-26 23:48:47 +08:00
Pavithra Ramesh
098a4467fe Remove conntrack entry on udp rule add.
Moved conntrack util outside of proxy pkg
Added warning message if conntrack binary is not found
Addressed review comments.
ran gofmt
2018-02-22 23:34:42 -08:00
Kubernetes Submit Queue
f0ca996274
Merge pull request #56164 from danwinship/proxier-chain-split
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Split KUBE-SERVICES chain to re-shrink the INPUT chain

**What this PR does / why we need it**:
#43972 added an iptables rule "`-A INPUT -j KUBE-SERVICES`" to make NodePort ICMP rejection work. (Previously the KUBE-SERVICES chain was only run from OUTPUT, not INPUT.) #44547 extended that patch for ExternalIP rejection as well.

However, the KUBE-SERVICES chain may potentially have a very large number of ICMP reject rules for plain ClusterIP services (the ones that get run from OUTPUT), and it seems that for some reason the kernel is much more sensitive to the length of the INPUT chain than it is to the length of the OUTPUT chain. So a node that worked fine with kube 1.6 (when KUBE-SERVICES was only run from OUTPUT) might fall over with kube 1.7 (with KUBE-SERVICES being run from both INPUT and OUTPUT).

(Specifically, a node with about 5000 ClusterIP reject rules that ran fine with OpenShift 3.6 [kube 1.6] slowed almost to a complete halt with OpenShift 3.7 [kube 1.7].)

This PR fixes things by splitting out the "new" part of KUBE-SERVICES (NodePort and ExternalIP reject rules) into a separate KUBE-EXTERNAL-SERVICES chain run from INPUT, and moves KUBE-SERVICES back to being only run from OUTPUT. (So, yes, this assumes that you don't have 5000 NodePort/ExternalIP services, but, if you do, there's not much we can do, since those rules *have* to be run on the INPUT side.)

Oh, and I left in the code to clean up the "`-A INPUT -j KUBE-SERVICES`" rule even though we don't generate it any more, so it gets fixed on upgrade.

**Release note**:
```release-note
Reorganized iptables rules to fix a performance regression on clusters with thousands of services.
```

@kubernetes/sig-network-bugs @kubernetes/rh-networking
2018-02-22 18:52:53 -08:00
m1093782566
f3512cbbb9 iptables proxier part changes 2018-02-09 17:20:51 +08:00
Dan Winship
780d5954e0 Split out a KUBE-EXTERNAL-SERVICES chain so we don't have to run KUBE-SERVICES from INPUT 2018-02-07 10:20:52 -05:00
m1093782566
b015f1f567 add ut for localhost nodeport 2018-01-15 11:05:21 +08:00
Kubernetes Submit Queue
02ca5cac01
Merge pull request #53555 from leblancd/v6_del_endpoint_proxier
Automatic merge from submit-queue (batch tested with PRs 55988, 53555, 55858). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add IPv6 and negative UT test cases for proxier's deleteEndpointConnections

This change adds IPv6 and negative UT test cases for the proxier's deleteEndpointConnections.

Changes include:
- Add IPv6 UT test cases to TestDeleteEndpointConnections.
- Add negative UT test case to TestDeleteEndpointConnections for
  handling case where no connections need clearing (benign error).
- Add negative UT test case to test unexpected error.
- Reorganize UT in TestDeleteEndpointConnections so that the fake
  command executor's command and scripted responses are generated on
  the fly based on the test case table (rather than using a fixed
  set of commands/responses that will need to be updated every time
  test cases are added/deleted).
- Create the proxier service map in real time, based on the test case
  table (rather than using a fixed service map that will need to be updated
  every time test cases are added/deleted).

fixes #53554



**What this PR does / why we need it**:
This change adds IPv6 and negative UT test cases for the proxier's
deleteEndpointConnections.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #53554

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-11-18 20:31:23 -08:00
Kubernetes Submit Queue
cae7240cf9
Merge pull request #55601 from m1093782566/getlocalips
Automatic merge from submit-queue (batch tested with PRs 55009, 55532, 55601, 52569, 55533). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix ipvs/proxy getLocalIPs inconsistency with iptables/proxy

**What this PR does / why we need it**:

* Fix ipvs/proxy `getLocalIPs()` inconsistency with iptables/proxy

* validate the ip address before pkg/proxy/util IPPart() return ip string.

**Which issue(s) this PR fixes** :
Fixes #55612

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-11-14 00:09:52 -08:00
m1093782566
42832e7666 fix ipvs proxier getLocalIPs() error 2017-11-13 17:55:53 +08:00
Kubernetes Submit Queue
d6cabaf706
Merge pull request #55568 from m1093782566/unsortlist
Automatic merge from submit-queue (batch tested with PRs 53580, 55568). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Replace sets.List() with sets.UnsortedList() in pkg/proxy

**What this PR does / why we need it**:

Replace sets.List() with sets.UnsortedList() in pkg/proxy - sets.List() will sort the result array, we don't need sorted array in pkg/proxy. Using sets.UnsortedList() can reduce the unnecessary overhead spending.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

@wojtek-t wdyt ^_^

**Release note**:

```release-note
NONE
```

/sig network
2017-11-12 21:07:37 -08:00
m1093782566
83ada5c7bf replace sets.List() with sets.UnsortedList() 2017-11-13 10:20:54 +08:00
Zihong Zheng
f7ed9cf09a [kube-proxy] Fix session affinity with local endpoints traffic 2017-11-10 18:42:07 -08:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Dane LeBlanc
799341f2dc Add IPv6 and negative UT test cases for proxier's deleteEndpointConnections
This change adds IPv6 and negative UT test cases for the proxier's
deleteEndpointConnections.

Changes include:
- Add IPv6 UT test cases to TestDeleteEndpointConnections.
- Add negative UT test case to TestDeleteEndpointConnections for
  handling case where no connections need clearing (benign error).
- Add negative UT test case to test unexpected error.
- Reorganize UT in TestDeleteEndpointConnections so that the fake
  command executor's command and scripted responses are generated on
  the fly based on the test case table (rather than using a fixed
  set of commands/responses that will need to be updated every time
  test cases are added/deleted).
- Create the proxier service map in real time, based on the test case
  table (rather than using a fixed service map that will need to be updated
  every time test cases are added/deleted).

fixes #53554
2017-10-12 20:07:19 -04:00
m1093782566
d96409178b consume endpoints IPPart function in util 2017-10-11 09:51:58 +08:00
Dane LeBlanc
5fbc9e45cc Add IPv6 support to iptables proxier
The following changes are proposed for the iptables proxier:

* There are three places where a string specifying IP:port is parsed
  using something like this:

      if index := strings.Index(e.endpoint, ":"); index != -1 {

  This will fail for IPv6 since V6 addresses contain colons. Also,
  the V6 address is expected to be surrounded by square brackets
  (i.e. []:). Fix this by replacing call to Index with
  call to LastIndex() and stripping out square brackets.
* The String() method for the localPort struct should put square brackets
  around IPv6 addresses.
* The logging in the merge() method for proxyServiceMap should put brackets
  around IPv6 addresses.
* There are several places where filterRules destination is hardcoded to
  /32. This should be a /128 for IPv6 case.
* Add IPv6 unit test cases

fixes #48550
2017-09-16 09:16:12 -04:00
Kubernetes Submit Queue
b65f3cc8dd Merge pull request #49850 from m1093782566/service-session-timeout
Automatic merge from submit-queue (batch tested with PRs 49850, 47782, 50595, 50730, 51341)

Paramaterize `stickyMaxAgeMinutes` for service in API

**What this PR does / why we need it**:

Currently I find `stickyMaxAgeMinutes` for a session affinity type service is hard code to 180min. There is a TODO comment, see

https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/iptables/proxier.go#L205

I think the seesion sticky max time varies from service to service and users may not aware of it since it's hard coded in all proxier.go - iptables, userspace and winuserspace.

Once we parameterize it in API, users can set/get the values for their different services.

Perhaps, we can introduce a new field `api.ClientIPAffinityConfig` in `api.ServiceSpec`.

There is an initial discussion about it in sig-network group. See,

https://groups.google.com/forum/#!topic/kubernetes-sig-network/i-LkeHrjs80

**Which issue this PR fixes**: 

fixes #49831

**Special notes for your reviewer**:

**Release note**:

```release-note
Paramaterize session affinity timeout seconds in service API for Client IP based session affinity.
```
2017-08-25 20:43:30 -07:00
m1093782566
c355a2ac96 Paramaterize stickyMaxAgeMinutes for service in API 2017-08-25 17:44:47 +08:00
m1093782566
a7fd545d49 clean up LocalPort in proxier.go 2017-08-24 11:16:38 +08:00
xiangpengzhao
ebe21ee4c1 Remove deprecated ESIPP beta annotations 2017-08-05 15:00:58 +08:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Minhan Xia
68a2749b28 fix unit tests 2017-07-06 16:01:03 -07:00
xiangpengzhao
9e31eb280a Populate endpoints and allow ports with headless service 2017-06-28 11:15:51 +08:00
Wojciech Tyczynski
03c255d7c5 Store chain names to avoid recomputing them multiple times 2017-05-30 10:50:10 +02:00