Kube-proxy will add ipset entries for all node ips for an SCTP nodeport service. This will solve the problem 'SCTP nodeport service is not working for all IPs present in the node when ipvs is enabled. It is working only for node's InternalIP.'
As mentioned in issue #80061, in iptables lock contention case,
we can see increasing rate of iptables restore failures because it
need to grab iptables file lock.
The failure metric can provide administrators more insight
Metrics will be collected in kube-proxy iptables and ipvs modes
Signed-off-by: Hui Luo <luoh@vmware.com>
ipvs `getProxyMode` test fails on mac as `utilipvs.GetRequiredIPVSMods`
try to reach `/proc/sys/kernel/osrelease` to find version of the running
linux kernel. Linux kernel version is used to determine the list of required
kernel modules for ipvs.
Logic to determine kernel version is moved to GetKernelVersion
method in LinuxKernelHandler which implements ipvs.KernelHandler.
Mock KernelHandler is used in the test cases.
Read and parse file is converted to go function instead of execing cut.
Computing all node ips twice would always happen when no node port
addresses were explicitly set. The GetNodeAddresses call would return
two zero cidrs (ipv4 and ipv6) and we would then retrieve all node IPs
twice because the loop wouldn't break after the first time.
Also, it is possible for the user to set explicit node port addresses
including both a zero and a non-zero cidr, but this wouldn't make sense
for nodeIPs since the zero cidr would already cause nodeIPs to include
all IPs on the node.
Previously the same ip addresses would be computed for each nodePort
service and this could be CPU intensive for a large number of nodePort
services with a large number of ipaddresses on the node.
Currently the BaseServiceInfo struct implements the ServicePort interface, but
only uses that interface sometimes. All the elements of BaseServiceInfo are exported
and sometimes the interface is used to access them and othertimes not
I extended the ServicePort interface so that all relevent values can be accessed through
it and unexported all the elements of BaseServiceInfo
This adds some useful metrics around pending changes and last successful
sync time.
The goal is for administrators to be able to alert on proxies that, for
whatever reason, are quite stale.
Signed-off-by: Casey Callendrello <cdc@redhat.com>
As part of the endpoint creation process when going from 0 -> 1 conntrack entries
are cleared. This is to prevent an existing conntrack entry from preventing traffic
to the service. Currently the system ignores the existance of the services external IP
addresses, which exposes that errant behavior
This adds the externalIP addresses of udp services to the list of conntrack entries that
get cleared. Allowing traffic to flow
Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
while cleaning up ipvs mode. flushing iptable chains first and then
remove the chains. this avoids trying to remove chains that are still
referenced by rules in other chains.
fixes#70615
This will allow for kube-proxy to be run without `privileged` and
with only adding the capability `NET_ADMIN`.
Signed-off-by: Jess Frazelle <acidburn@microsoft.com>
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.
SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.
SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter
changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.
netexec.go compilation fixed. Various test cases fixed
SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)
SCTP related e2e test cases are removed as the e2e test systems do not support SCTP
sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go
cluster/gce/util.sh is copied from master
Automatic merge from submit-queue (batch tested with PRs 66491, 66587, 66856, 66657, 66923). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
optimize ipvs get nodeIP
**What this PR does / why we need it**:
Optimize ipvs get nodeIP.
The original ipvs `NodeIPs` need first get all local type address to set1, then get address of dummy device `kube-ipvs0` to set2, then do diff of set1 and set2 to get local addresses we need.
This work gonna result in unnecessary resource consumption, especially for large cluster, will have lots address in dummy device `kube-ipvs0`.
This pr optimized the workaround.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
ipvs: add addrtype match for nodeport
**What this PR does / why we need it**:
before this PR:
```
-A KUBE-SERVICES -m comment --comment "Kubernetes nodeport TCP port for masquerade purpose" -m set --match-set KUBE-NODE-PORT-TCP dst -j KUBE-NODE-PORT
-A KUBE-SERVICES -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A KUBE-NODE-PORT -p tcp -m comment --comment "Kubernetes nodeport TCP port with externalTrafficPolicy=local" -m set --match-set KUBE-NODE-PORT-LOCAL-TCP dst -j RETURN
-A KUBE-NODE-PORT -j KUBE-MARK-MASQ
```
after this PR:
```
-A KUBE-NODE-PORT -p tcp -m comment --comment "Kubernetes nodeport TCP port with externalTrafficPolicy=local" -m set --match-set KUBE-NODE-PORT-LOCAL-TCP dst -j RETURN
-A KUBE-NODE-PORT -p tcp -m comment --comment "Kubernetes nodeport TCP port for masquerade purpose" -m set --match-set KUBE-NODE-PORT-TCP dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
-A KUBE-SERVICES -m set --match-set KUBE-CLUSTER-IP dst,dst -j ACCEPT
-A KUBE-SERVICES -m addrtype --dst-type LOCAL -j KUBE-NODE-PORT
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#65459
**Special notes for your reviewer**:
manually tested cases:
- ClusterIP distributed to pod on same node
- ClusterIP distributed to pod on other node
- NodePort distributed to pod on same node
- NodePort distributed to pod on other node
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 66136, 64999, 65425, 66120, 66074). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Not step into ipvs.CleanupLeftovers() if canUseIPVS's false
**What this PR does / why we need it**:
Earlier we decide whether we should clean up the left-over ipvs rules inside `ipvs.CleanupLeftovers()`, therefore we call function `ipvs.CanUseIPVSProxier()` two times (and `GetModules()` two times). Actually no need to step into `ipvs.CleanupLeftovers()` if `canUseIPVS` is false.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66064, 66040). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix Local externalTrafficPolicy is not respected for ipvs NodePort
**What this PR does / why we need it**:
Local externalTrafficPolicy is not respected for ipvs NodePort.
This PR fixes it.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#66062
**Special notes for your reviewer**:
Manually tested accessing NodePort with externalTrafficPolicy=Local and externalTrafficPolicy=Cluster.
**Release note**:
```release-note
```
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.
Lubomir I. Ivanov <neolit123@gmail.com>
applied ammend for:
pkg/cloudprovider/provivers/vsphere/nodemanager.go
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
add cleanLegacyBindAddr
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#65263
**Special notes for your reviewer**:
To fix the issue,
use `activeBindAddrs` map which represents ip address successfully bind to DefaultDummyDevice in the round of sync
use `currentBindAddrs` map which represents ip addresses bind to DefaultDummyDevice from the system
create a function `cleanLegacyBindAddr` to unbind address which is in `currentBindAddrs` map but not in `activeBindAddrs` map
**Release note**:
```release-note
NONE
```
/sig network
/area kube-proxy
Duplicated masq rules are created by current implementation:
-A KUBE-NODE-PORT -m comment --comment "mark MASQ for
externaltrafficpolicy=cluster" -j KUBE-MARK-MASQ
-A KUBE-NODE-PORT -j KUBE-MARK-MASQ
The last one is always there. So the one inside if statement could
just be removed.