Currently, there are some unit tests that are failing on Windows due to
various reasons:
- paths not properly joined (filepath.Join should be used).
- Proxy Mode IPVS not supported on Windows.
- DeadlineExceeded can occur when trying to read data from an UDP
socket. This can be used to detect whether the port was closed or not.
- In Windows, with long file name support enabled, file names can have
up to 32,767 characters. In this case, the error
windows.ERROR_FILENAME_EXCED_RANGE will be encountered instead.
- files not closed, which means that they cannot be removed / renamed.
- time.Now() is not as precise on Windows, which means that 2
consecutive calls may return the same timestamp.
- path.Base() will return the same path. filepath.Base() should be used
instead.
- path.Join() will always join the paths with a / instead of the OS
specific separator. filepath.Join() should be used instead.
Kube/proxy, in NodeCIDR local detector mode, uses the node.Spec.PodCIDRs
field to build the Services iptables rules.
The Node object depends on the kubelet, but if kube-proxy runs as a
static pods or as a standalone binary, it is not possible to guarantee
that the values obtained at bootsrap are valid, causing traffic outages.
Kube-proxy has to react on node changes to avoid this problems, it
simply restarts if detect that the node PodCIDRs have changed.
In case that the Node has been deleted, kube-proxy will only log an
error and keep working, since it may break graceful shutdowns of the
node.
Some of the unit tests cannot pass on Windows due to various reasons:
- fsnotify does not have a Windows implementation.
- Proxy Mode IPVS not supported on Windows.
- Seccomp not supported on Windows.
- VolumeMode=Block is not supported on Windows.
- iSCSI volumes are mounted differently on Windows, and iscsiadm is a
Linux utility.
iptables-restore requires that if you change any rule in a chain, you
have to rewrite the entire chain. But if you avoid mentioning a chain
at all, it will leave it untouched. Take advantage of this by not
rewriting the SVC, SVL, EXT, FW, and SEP chains for services that have
not changed since the last sync, which should drastically cut down on
the size of each iptables-restore in large clusters.
Level 4 is mean for debug operations.
The default level use to be level 2, on clusters with a lot of
Services this means that the kube-proxy will generate a lot of
noise on the logs, with the performance penalty associated to
it.
Update the IPVS proxier to have a bool `initialSync` which is set to
true when a new proxier is initialized and then set to false on all
syncs. This lets us run startup-only logic, which subsequently lets us
update the realserver only when needed and avoiding any expensive
operations.
Signed-off-by: Sanskar Jaiswal <jaiswalsanskar078@gmail.com>
If the user passes "--proxy-mode ipvs", and it is not possible to use
IPVS, then error out rather than falling back to iptables.
There was never any good reason to be doing fallback; this was
presumably erroneously added to parallel the iptables-to-userspace
fallback (which only existed because we had wanted iptables to be the
default but not all systems could support it).
In particular, if the user passed configuration options for ipvs, then
they presumably *didn't* pass configuration options for iptables, and
so even if the iptables proxy is able to run, it is likely to be
misconfigured.
Back when iptables was first made the default, there were
theoretically some users who wouldn't have been able to support it due
to having an old /sbin/iptables. But kube-proxy no longer does the
things that didn't work with old iptables, and we removed that check a
long time ago. There is also a check for a new-enough kernel version,
but it's checking for a feature which was added in kernel 3.6, and no
one could possibly be running Kubernetes with a kernel that old. So
the fallback code now never actually falls back, so it should just be
removed.
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The proxies watch node labels for topology changes, but node labels
can change in bursts especially in larger clusters. This causes
pressure on all proxies because they can't filter the events, since
the topology could match on any label.
Change node event handling to queue the request rather than immediately
syncing. The sync runner can already handle short bursts which shouldn't
change behavior for most cases.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This moves the FW chain creation to the end (rather than having it in
the middle of adding the jump rules for the LB IPs).
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the jump rules for internal traffic. Previously we were
handling "jumping from kubeServices to internalTrafficChain" and
"adding masquerade rules to internalTrafficChain" in the same place.
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the handling of the EXT chain.
Part of reorganizing the syncProxyRules loop to do:
1. figure out what chains are needed, mark them in activeNATChains
2. write servicePort jump rules to KUBE-SERVICES/KUBE-NODEPORTS
3. write servicePort-specific chains (SVC, SVL, EXT, FW, SEP)
This fixes the handling of the SVC and SVL chains. We were already
filling them in at the end of the loop; this fixes it to create them
at the bottom of the loop as well.