Automatic merge from submit-queue
Make /var/lib/kubelet as shared during startup
This is part of ~~https://github.com/kubernetes/community/pull/589~~https://github.com/kubernetes/community/pull/659
We'd like kubelet to be able to consume mounts from containers in the future, therefore kubelet should make sure that `/var/lib/kubelet` has shared mount propagation to be able to see these mounts.
On most distros, root directory is already mounted with shared mount propagation and this code will not do anything. On older distros such as Debian Wheezy, this code detects that `/var/lib/kubelet` is a directory on `/` which has private mount propagation and kubelet bind-mounts `/var/lib/kubelet` as rshared.
Both "regular" linux mounter and `NsenterMounter` are updated here.
@kubernetes/sig-storage-pr-reviews @kubernetes/sig-node-pr-reviews
@vishh
Release note:
```release-note
Kubelet re-binds /var/lib/kubelet directory with rshared mount propagation during startup if it is not shared yet.
```
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
Automatic merge from submit-queue (batch tested with PRs 51391, 51338, 51340, 50773, 49599)
Delete "hugetlb" from whitelistControllers
**What this PR does / why we need it**:
Delete "hugetlb" from whitelistControllers
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#50770
**Special notes for your reviewer**:
**Release note**:
```
NONE
```
Automatic merge from submit-queue (batch tested with PRs 50119, 48366, 47181, 41611, 49547)
Fail on swap enabled and deprecate experimental-fail-swap-on flag
**What this PR does / why we need it**:
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
Fixes#34726
**Special notes for your reviewer**:
**Release note**:
```release-note
Kubelet will by default fail with swap enabled from now on. The experimental flag "--experimental-fail-swap-on" has been deprecated, please set the new "--fail-swap-on" flag to false if you wish to run with /proc/swaps on.
```
* Deprecate the old experimental-fail-swap-on
* Add a new flag fail-swap-on and set it to true
Before this change, we would not fail when swap is on. With this
change we fail for everyone when swap is on, unless they explicitly
set --fail-swap-on to false.
Automatic merge from submit-queue (batch tested with PRs 48976, 49474, 40050, 49426, 49430)
Remove duplicated import and wrong alias name of api package
**What this PR does / why we need it**:
**Which issue this PR fixes**: fixes#48975
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48636, 49088, 49251, 49417, 49494)
Fix issues for local storage allocatable feature
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
This PR fixes issue #47809
This PR fixes the following issues:
1. Use ResourceStorageScratch instead of ResourceStorage API to represent
local storage capacity
2. In eviction manager, use container manager instead of node provider
(kubelet) to retrieve the node capacity and reserved resources. Node
provider (kubelet) has a feature gate so that storagescratch information
may not be exposed if feature gate is not set. On the other hand,
container manager has all the capacity and allocatable resource
information.
Automatic merge from submit-queue (batch tested with PRs 47232, 48625, 48613, 48567, 39173)
Fix issue when setting fileysystem capacity in container manager
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
This PR fixes issue #48452
Automatic merge from submit-queue
Local storage teardown fix
**What this PR does / why we need it**: Local storage uses bindmounts and the method IsLikelyNotMountPoint does not detect these as mountpoints. Therefore, local PVs are not properly unmounted when they are deleted.
**Which issue this PR fixes**: fixes#48331
**Special notes for your reviewer**:
You can use these e2e tests to reproduce the issue and validate the fix works appropriately https://github.com/kubernetes/kubernetes/pull/47999
The existing method IsLikelyNotMountPoint purposely does not check mountpoints reliability (4c5b22d4c6/pkg/util/mount/mount_linux.go (L161)), since the number of mountpoints can be large. 4c5b22d4c6/pkg/util/mount/mount.go (L46)
This implementation changes the behavior for local storage to detect mountpoints reliably, and avoids changing the behavior for any other callers to a UnmountPath.
**Release note**:
```
Fixes bind-mount teardown failure with non-mount point Local volumes (issue https://github.com/kubernetes/kubernetes/issues/48331).
```
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
In Container manager, we set up the capacity by retrieving information
from cadvisor. However unlike machineinfo, filesystem information is
available at a later unknown time. This PR uses a go routine to keep
retriving the information until it is avaialble or timeout.
Centralize Capacity discovery of standard resources in Container manager.
Have storage derive node capacity from container manager.
Move certain cAdvisor interfaces to the cAdvisor package in the process.
This patch fixes a bug in container manager where it was writing to a map without synchronization.
Signed-off-by: Vishnu kannan <vishnuk@google.com>