Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)
Set up DNS server in containerized mounter path
During NFS/GlusterFS mount, it requires to have DNS server to be able to
resolve service name. This PR gets the DNS server ip from kubelet and
add it to the containerized mounter path. So if containerized mounter is
used, service name could be resolved during mount
**Release note**:
```release-note
Allow DNS resolution of service name for COS using containerized mounter. It fixed the issue with DNS resolution of NFS and Gluster services.
```
Automatic merge from submit-queue (batch tested with PRs 51186, 50350, 51751, 51645, 51837)
Wait for container cleanup before deletion
We should wait to delete pod API objects until the pod's containers have been cleaned up. See issue: #50268 for background.
This changes the kubelet container gc, which deletes containers belonging to pods considered "deleted".
It adds two conditions under which a pod is considered "deleted", allowing containers to be deleted:
Pods where deletionTimestamp is set, and containers are not running
Pods that are evicted
This PR also changes the function PodResourcesAreReclaimed by making it return false if containers still exist.
The eviction manager will wait for containers of previous evicted pod to be deleted before evicting another pod.
The status manager will wait for containers to be deleted before removing the pod API object.
/assign @vishh
During NFS/GlusterFS mount, it requires to have DNS server to be able to
resolve service name. This PR gets the DNS server ip from kubelet and
add it to the containerized mounter path. So if containerized mounter is
used, service name could be resolved during mount
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)
Deprecation warnings for auto detecting cloud providers
**What this PR does / why we need it**:
Adds deprecation warnings for auto detecting cloud providers. As part of the initiative for out-of-tree cloud providers, this feature is conflicting since we're shifting the dependency of kubernetes core into cAdvisor. In the future kubelets should be using `--cloud-provider=external` or no cloud provider at all.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50986
**Special notes for your reviewer**:
NOTE: I still have to coordinate with sig-node and kubernetes-dev to get approval for this deprecation, I'm only opening this PR since we're close to code freeze and it's something presentable.
**Release note**:
```release-note
Deprecate auto detecting cloud providers in kubelet. Auto detecting cloud providers go against the initiative for out-of-tree cloud providers as we'll now depend on cAdvisor integrations with cloud providers instead of the core repo. In the near future, `--cloud-provider` for kubelet will either be an empty string or `external`.
```
Automatic merge from submit-queue (batch tested with PRs 50932, 49610, 51312, 51415, 50705)
Implement StatsProvider interface using cadvisor
Ref: https://github.com/kubernetes/kubernetes/issues/46984
- This PR changes the `StatsProvider` interface in `pkg/kubelet/server/stats` so that it can provide container stats from either cadvisor or CRI, and the summary API can consume the stats without knowing how they are provided.
- The `StatsProvider` struct in the newly added package `pkg/kubelet/stats` implements part of the `StatsProvider` interface in `pkg/kubelet/server/stats`.
- In `pkg/kubelet/stats`,
- `stats_provider.go`: implements the node level stats and provides the entry point for this package.
- `cadvisor_stats_provider.go`: implements the container level stats using cadvisor.
- `cri_stats_provider.go`: implements the container level stats using CRI.
- `helper.go`: utility functions shared by the above three components.
- There should be no user visible behaviors change in this PR.
- A follow up PR will implement the StatsProvider interface using CRI.
**Release note**:
```
None
```
/assign @yujuhong
/assign @WIZARD-CXY
Automatic merge from submit-queue (batch tested with PRs 49284, 49555, 47639, 49526, 49724)
skip WaitForAttachAndMount for terminated pods in syncPod
Fixes https://github.com/kubernetes/kubernetes/issues/49663
I tried to tread lightly with a small localized change because this needs to be picked to 1.7 and 1.6 as well.
I suspect this has been as issue since we started unmounting volumes on pod termination https://github.com/kubernetes/kubernetes/pull/37228
xref openshift/origin#14383
@derekwaynecarr @eparis @smarterclayton @saad-ali @jwforres
/release-note-none
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747)
Add support for `no_new_privs` via AllowPrivilegeEscalation
**What this PR does / why we need it**:
Implements kubernetes/community#639
Fixes#38417
Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`.
Adds `AllowPrivilegeEscalation` to container `SecurityContext`.
Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set.
Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity.
**Release note**:
```release-note
Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process
```
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Use presence of kubeconfig file to toggle standalone mode
Fixes#40049
```release-note
The deprecated --api-servers flag has been removed. Use --kubeconfig to provide API server connection information instead. The --require-kubeconfig flag is now deprecated. The default kubeconfig path is also deprecated. Both --require-kubeconfig and the default kubeconfig path will be removed in Kubernetes v1.10.0.
```
/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-misc
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Remove flags low-diskspace-threshold-mb and outofdisk-transition-frequency
issue: #48843
This removes two flags replaced by the eviction manager. These have been depreciated for two releases, which I believe correctly follows the kubernetes depreciation guidelines.
```release-note
Remove depreciated flags: --low-diskspace-threshold-mb and --outofdisk-transition-frequency, which are replaced by --eviction-hard
```
cc @mtaufen since I am changing kubelet flags
cc @vishh @derekwaynecarr
/sig node
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)
Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This PR fixes issue #47809
Replaces use of --api-servers with --kubeconfig in Kubelet args across
the turnup scripts. In many cases this involves generating a kubeconfig
file for the Kubelet and placing it in the correct location on the node.
Automatic merge from submit-queue (batch tested with PRs 48981, 47316, 49180)
Added golint check for pkg/kubelet.
**What this PR does / why we need it**:
Added golint check for pkg/kubelet, and make golint happy.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#47315
**Release note**:
```release-note-none
```
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
Since v1.5 and the removal of --configure-cbr0:
0800df74ab "Remove the legacy networking mode --configure-cbr0"
kubelet hasn't done any shaping operations internally. They
have all been delegated to network plugins like kubenet or
external CNI plugins. But some shaping code was still left
in kubelet, so remove it now that it's unused.
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.
This patch fixes a bug in container manager where it was writing to a map without synchronization.
Signed-off-by: Vishnu kannan <vishnuk@google.com>
Automatic merge from submit-queue (batch tested with PRs 47694, 47772, 47783, 47803, 47673)
Make different container runtimes constant
Make different container runtimes constant to avoid hardcode
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 47000, 47188, 47094, 47323, 47124)
fix sync loop health check
This PR will do error logging about the fall behind sync for kubelet instead of sync loop healthz checking.
The reason is kubelet can not do sync loop and therefore can not update sync loop time when there is any runtime error, such as docker hung.
When there is any runtime error, according to current implementation, kubelet will not do sync operation and thus kubelet's sync loop time will not be updated. This will make when there is any runtime error, kubelet will also return non 200 response status code when accessing healthz endpoint. This is contrary with #37865 which prevents kubelet from being killed when docker hangs.
**Release note**:
```release-note
fix sync loop health check with seperating runtime errors
```
/cc @yujuhong @Random-Liu @dchen1107
This PR adds the support for allocatable local storage (scratch space).
This feature is only for root file system which is shared by kubernetes
componenets, users' containers and/or images. User could use
--kube-reserved flag to reserve the storage for kube system components.
If the allocatable storage for user's pods is used up, some pods will be
evicted to free the storage resource.