* Add FeatureGate PodHostIPs
* Add HostIPs field and update PodIPs field
* Types conversion
* Add dropDisabledStatusFields
* Add HostIPs for kubelet
* Add fuzzer for PodStatus
* Add status.hostIPs in ConvertDownwardAPIFieldLabel
* Add status.hostIPs in validEnvDownwardAPIFieldPathExpressions
* Downward API support for status.hostIPs
* Add DownwardAPI validation for status.hostIPs
* Add e2e to check that hostIPs works
* Add e2e to check that Downward API works
* Regenerate
host-network pods IPs are obtained from the reported kubelet nodeIPs.
Historically, host-network podIPs are immutable once set, but when
we've added dual-stack support, we didn't consider that the secondary
IP address may not be present at the same time that the primary nodeIP.
If a secondary IP address is added to a node after the host-network pods
IPs are set, we can add the secondary host-network pod IP address
maintaining the current behavior of not updating the current podIPs on
host-network pods.
The feature gate gets locked to "true", with the goal to remove it in two
releases.
All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.
Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
When adding the ephemeral volume feature, the special case for
PersistentVolumeClaim volume sources in kubelet's host path and node
limits checks was overlooked. An ephemeral volume source is another
way of referencing a claim and has to be treated the same way.
Remove the VolumeSubpath feature gate.
Feature gate convention has been updated since this was introduced to
indicate that they "are intended to be deprecated and removed after a
feature becomes GA or is dropped.".
The Kubelet always clears reason and message in generateAPIPodStatus
even when the phase is unchanged. It is reasonable that we preserve
the previous values when the phase does not change, and clear it
when the phase does change.
When a pod is evicted, this ensurse that the eviction message and
reason are propagated even in the face of subsequent updates. It also
preserves the message and reason if components beyond the Kubelet
choose to set that value.
To preserve the value we need to know the old phase, which requires
a change to convertStatusToAPIStatus so that both methods have
access to it.
runtimes may return an arbitrary number of Pod IPs, however, kubernetes
only takes into consideration the first one of each IP family.
The order of the IPs are the one defined by the Kubelet:
- default prefer IPv4
- if NodeIPs are defined, matching the first nodeIP family
PodIP is always the first IP of PodIPs.
The downward API must expose the same IPs and in the same order than
the pod.Status API object.
If Containerd is used on Windows, then we can also mount individual
files into containers (e.g.: /etc/hosts), which was not possible with Docker.
Checks if the container runtime is containerd, and if it is, then also
mount /etc/hosts file (to C:\Windows\System32\drivers\etc\hosts).
add host file write for podIPs
update tests
remove import alias
update type check
update type check
remove import alias
update open api spec
add tests
update test
add tests
address review comments
update imports
remove todo and import alias
This patch moves the HostUtil functionality from the util/mount package
to the volume/util/hostutil package.
All `*NewHostUtil*` calls are changed to return concrete types instead
of interfaces.
All callers are changed to use the `*NewHostUtil*` methods instead of
directly instantiating the concrete types.
This patch refactors pkg/util/mount to be more usable outside of
Kubernetes. This is done by refactoring mount.Interface to only contain
methods that are not K8s specific. Methods that are not relevant to
basic mount activities but still have OS-specific implementations are
now found in a mount.HostUtils interface.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add proxy for container streaming in kubelet for streaming auth.
For https://github.com/kubernetes/kubernetes/issues/36666, option 2 of https://github.com/kubernetes/kubernetes/issues/36666#issuecomment-378440458.
This PR:
1. Removed the `DirectStreamingRuntime`, and changed `IndirectStreamingRuntime` to `StreamingRuntime`. All `DirectStreamingRuntime`s, `dockertools` and `rkt`, were removed.
2. Proxy container streaming in kubelet instead of returning redirect to apiserver. This solves the container runtime authentication issue, which is what we agreed on in https://github.com/kubernetes/kubernetes/issues/36666.
Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.
If a general authentication mechanism proposed in https://github.com/kubernetes/kubernetes/issues/62747 is ready, we can switch back to redirect, and all code can be found in github history.
Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?
@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer
/cc @kubernetes/sig-node-pr-reviews
**Release note**:
```release-note
Kubelet now proxies container streaming between apiserver and container runtime. The connection between kubelet and apiserver is authenticated. Container runtime should change streaming server to serve on localhost, to make the connection between kubelet and container runtime local.
In this way, the whole container streaming connection is secure. To switch back to the old behavior, set `--redirect-container-streaming=true` flag.
```
Automatic merge from submit-queue (batch tested with PRs 61147, 62236, 62018). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix local volume absolute path issue on Windows
**What this PR does / why we need it**:
remove IsAbs validation on local volume since it does not work on windows cluster, Windows absolute path `D:` is not allowed in local volume, the [validation](https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/validation/validation.go#L1386) happens on both master and agent node, while for windows cluster, the master is Linux and agent is Windows, so `path.IsAbs()` func will not work all in both nodes.
**Instead**, this PR use `MakeAbsolutePath` func to convert `local.path` value in kubelet, it supports both linux and windows styple.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62016
**Special notes for your reviewer**:
**Release note**:
```
fix local volume absolute path issue on Windows
```
/sig storage
/sig windows
use MakeAbsolutePath to convert path in Windows
fix test error: allow relative path for local volume
fix comments
fix comments and add windows unit tests
We have 2 scenarios where we copy /etc/hosts
- with host network (we just copy the /etc/hosts from node)
- without host network (create a fresh /etc/hosts from pod info)
We are having trouble figuring out whether a /etc/hosts in a
pod/container has been "fixed-up" or not. And whether we used
host network or a fresh /etc/hosts in the various ways we start
up the tests which are:
- VM/box against a remote cluster
- As a container inside the k8s cluster
- DIND scenario in CI where test runs inside a managed container
Please see previous mis-guided attempt to fix this problem at
ba20e63446 In this commit we revert
the code from there as well.
So we should make sure:
- we always add a header if we touched the file
- we add slightly different headers so we can figure out if we used the
host network or not.
Update the test case to inject /etc/hosts from node to another path
(/etc/hosts-original) as well and use that to compare.
Users must not be allowed to step outside the volume with subPath.
Therefore the final subPath directory must be "locked" somehow
and checked if it's inside volume.
On Windows, we lock the directories. On Linux, we bind-mount the final
subPath into /var/lib/kubelet/pods/<uid>/volume-subpaths/<container name>/<subPathName>,
it can't be changed to symlink user once it's bind-mounted.
This also incorporates the version string into the package name so
that incompatibile versions will fail to connect.
Arbitrary choices:
- The proto3 package name is runtime.v1alpha2. The proto compiler
normally translates this to a go package of "runtime_v1alpha2", but
I renamed it to "v1alpha2" for consistency with existing packages.
- kubelet/apis/cri is used as "internalapi". I left it alone and put the
public "runtimeapi" in kubelet/apis/cri/runtime.
Since Mesos is no longer in your main repository and since we have
things like dynamic kubelet configuration in progress, we should
drop these undocumented, untested, private hooks.
cmd/kubelet/app/server.go::CreateAPIServerClientConfig
CreateAPIServerClientConfig::getRuntime
pkg/kubelet/kubelet_pods.go::getPhase
Also remove stuff from Dependencies struct that were specific to
the Mesos integration (ContainerRuntimeOptions and Options)
Also remove stale references in test/e2e and and test owners file
The POSIX standard restricts environment variable names to uppercase
letters, digits, and the underscore character in shell contexts only.
For generic application usage, it is stated that all other characters
shall be tolerated.
This change relaxes the rules to some degree. Namely, we stop requiring
environment variable names to be strict C_IDENTIFIERS and start
permitting lowercase, dot, and dash characters.
Public container images using environment variable names beyond the
shell-only context can benefit from this relaxation. Elasticsearch is
one popular example.
Introduced chages:
1. Re-writing of the resolv.conf file generated by docker.
Cluster dns settings aren't passed anymore to docker api in all cases, not only for pods with host network:
the resolver conf will be overwritten after infra-container creation to override docker's behaviour.
2. Added new one dnsPolicy - 'ClusterFirstWithHostNet', so now there are:
- ClusterFirstWithHostNet - use dns settings in all cases, i.e. with hostNet=true as well
- ClusterFirst - use dns settings unless hostNetwork is true
- Default
Fixes#17406
Automatic merge from submit-queue
Allow multipe DNS servers as comma-seperated argument for kubelet --dns
This PR explores how kubectls "--dns" could be extended to specify multiple DNS servers for in-cluster PODs. Testing on the local libvirt-coreos cluster shows that multiple DNS server are injected without issues.
Specifying multiple DNS servers increases resilience against
- Packet drops
- Single server failure
I am debugging services that do 50+ DNS requests for a single incoming interactive request, thus highly increase the chance of a slowdown (+5s) due to a single packet drop. Switching to two DNS servers will reduce the impact of the issues (roughly +1s on glibc, 0s on musl, error-rate goes down to error-rate^2).
Note that there is no need to change any runtime related code as far as I know. In the case of "default" dns the /etc/resolv.conf is parsed and multiple DNS server are send to the backend anyway. This only adds the same capability for the clusterFirst case.
I've heard from @thockin that multiple DNS entries are somehow considered. I've no idea what was considered, though. This is what I would like to see for our production use, though.
```release-note
NONE
```
- adjust ports to int32
- CRI flows the websocket ports as query params
- Do not validate ports since the protocol is unknown
SPDY flows the ports as headers and websockets uses query params
- Only flow query params if there is at least one port query param
Depending on an exact cluster setup multiple dns may make sense.
Comma-seperated lists of DNS server are quite common as DNS servers
are always plain IPs.
Automatic merge from submit-queue
Optional configmaps and secrets
Allow configmaps and secrets for environment variables and volume sources to be optional
Implements approved proposal c9f881b7bb
Release note:
```release-note
Volumes and environment variables populated from ConfigMap and Secret objects can now tolerate the named source object or specific keys being missing, by adding `optional: true` to the volume or environment variable source specifications.
```
Automatic merge from submit-queue (batch tested with PRs 37228, 40146, 40075, 38789, 40189)
Cleanup temp dirs
So funny story my /tmp ran out of space running the unit tests so I am cleaning up all the temp dirs we create.
Automatic merge from submit-queue
Fixed merging of host's and dns' search lines
Fixed forming of pod's Search line in resolv.conf:
- exclude duplicates while merging of host's and dns' search lines to form pod's one
- truncate pod's search line if it exceeds resolver limits: is > 255 chars and containes > 6 searches
- monitoring the resolv.conf file which is used by kubelet (set thru --resolv-conf="") and logging and eventing if search line in it consists of more than 3 entries (or 6 if Cluster Domain is set) or its lenght is > 255 chars
- logging and eventing when a pod's search line is > 255 chars or containes > 6 searches during forming
Fixes#29270
**Release note**:
```release-note
Fixed forming resolver search line for pods: exclude duplicates, obey libc limitations, logging and eventing appropriately.
```
- exclude duplicates while merging of host's and dns' search lines to form pod's one
- truncate pod's search line if it exceeds resolver limits: is > 255 chars and containes > 6 searches
- monitoring the resolv.conf file which is used by kubelet (set thru --resolv-conf="") and logging and eventing if search line in it consists of more than 3 entries
(or 6 if Cluster Domain is set) or its lenght is > 255 chars
- logging and eventing when a pod's search line is > 255 chars or containes > 6 searches during forming
Fixes#29270