_netdev mount option is a userspace mount option and
isn't copied over when bind mount is created and remount
also does not copies it over and hence must be explicitly
used with bind mount
These are all flagged by Go 1.11's
more accurate printf checking in go vet,
which runs as part of go test.
Lubomir I. Ivanov <neolit123@gmail.com>
applied ammend for:
pkg/cloudprovider/provivers/vsphere/nodemanager.go
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Containerized subpath
**What this PR does / why we need it**:
Containerized kubelet needs a different implementation of `PrepareSafeSubpath` than kubelet running directly on the host.
On the host we safely open the subpath and then bind-mount `/proc/<pidof kubelet>/fd/<descriptor of opened subpath>`.
With kubelet running in a container, `/proc/xxx/fd/yy` on the host contains path that works only inside the container, i.e. `/rootfs/path/to/subpath` and thus any bind-mount on the host fails.
Solution:
- safely open the subpath and gets its device ID and inode number
- blindly bind-mount the subpath to `/var/lib/kubelet/pods/<uid>/volume-subpaths/<name of container>/<id of mount>`. This is potentially unsafe, because user can change the subpath source to a link to a bad place (say `/run/docker.sock`) just before the bind-mount.
- get device ID and inode number of the destination. Typical users can't modify this file, as it lies on /var/lib/kubelet on the host.
- compare these device IDs and inode numbers.
**Which issue(s) this PR fixes**
Fixes#61456
**Special notes for your reviewer**:
The PR contains some refactoring of `doBindSubPath` to extract the common code. New `doNsEnterBindSubPath` is added for the nsenter related parts.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 64013, 63896, 64139, 57527, 62102). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Setup fsGroup for local volumes correctly
**What this PR does / why we need it**:
This pr fixes fsGroup check in local volume in containerized kubelet. Except this, it also fixes fsGroup check when volume source is a normal directory whether kubelet is running on the host or in a container.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61741
**Special notes for your reviewer**:
Bind mounts are detected in `/proc/mounts`, but it does not contain root of mount for bind mounts. So `mount.GetMountRefsByDev()` cannot get all references if source is a normal directory. e.g.
```
# mkdir /tmp/src /mnt/dst
# mount --bind /tmp/src /tmp/src # required by local-volume-provisioner, see https://github.com/kubernetes-incubator/external-storage/pull/499
# mount --bind /tmp/src /mnt/dst
# grep -P 'src|dst' /proc/mounts
tmpfs /tmp/src tmpfs rw,nosuid,nodev,noatime,size=4194304k 0 0
tmpfs /mnt/dst tmpfs rw,nosuid,nodev,noatime,size=4194304k 0 0
# grep -P 'src|dst' /proc/self/mountinfo
234 409 0:42 /src /tmp/src rw,nosuid,nodev,noatime shared:30 - tmpfs tmpfs rw,size=4194304k
235 24 0:42 /src /mnt/dst rw,nosuid,nodev,noatime shared:30 - tmpfs tmpfs rw,size=4194304k
```
We need to compare root of mount and device in this case.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 64102, 63303, 64150, 63841). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
When creating ext3/ext4 volume, disable reserved blocks
**What this PR does / why we need it**:
When creating ext3/ext4 volume, `mkfs` defaults to reserving 5% of the volume for the super-user (root). This patch changes the `mkfs` to pass `-m0` to disable this setting.
Rationale: Reserving a percentage of the volume is generally a neither useful nor desirable feature for volumes that aren't used as root file systems for Linux distributions, since the reserved portion becomes unavailable for non-root users. For containers, the general case is to use the entire volume for data, without running as root. The case where one might want reserved blocks enabled is much rarer.
**Special notes for your reviewer**:
I also added some comments to describe the flags passed to `mkfs`.
**Release note**:
```release-note
Changes ext3/ext4 volume creation to not reserve any portion of the volume for the root user.
```
- getSubpathBindTarget() computes final target of subpath bind-mount.
- prepareSubpathTarget() creates target for bind-mount.
- safeOpenSubPath() checks symlinks in Subpath and safely opens it.
Kubelet should not resolve symlinks outside of mounter interface.
Only mounter interface knows, how to resolve them properly on the host.
As consequence, declaration of SafeMakeDir changes to simplify the
implementation:
from SafeMakeDir(fullPath string, base string, perm os.FileMode)
to SafeMakeDir(subdirectoryInBase string, base string, perm os.FileMode)
super-user-reserved blocks, which otherwise defaults to 5% of the
entire disk.
Rationale: Reserving a percentage of the volume is generally a neither
useful nor desirable feature for volumes that aren't used as root file
systems for Linux distributions, since the reserved portion becomes
unavailable for non-root users. For containers, the general case is to
use the entire volume for data, without running as root. The case where
one might want reserved blocks enabled is much rarer.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
passthrough readOnly to subpath
**What this PR does / why we need it**:
If a volume is mounted as readonly, or subpath volumeMount is configured as readonly, then the subpath bind mount should be readonly.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#62752
**Special notes for your reviewer**:
**Release note**:
```release-note
Fixes issue where subpath readOnly mounts failed
```
Users must not be allowed to step outside the volume with subPath.
Therefore the final subPath directory must be "locked" somehow
and checked if it's inside volume.
On Windows, we lock the directories. On Linux, we bind-mount the final
subPath into /var/lib/kubelet/pods/<uid>/volume-subpaths/<container name>/<subPathName>,
it can't be changed to symlink user once it's bind-mounted.
Automatic merge from submit-queue (batch tested with PRs 59103, 58433). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
bugfix(mount): lstat with abs path of parent instead of '/..'
**What this PR does / why we need it**:
If a nfs volume with improper permission is mounted on a Pod, operation of deleting this Pod will fail and the pod itself will be stuck at a 'TERMINATING' status. Kubelet cannot reconcile it correctly.
This is because kubelet will try to find the mount-point with '..' file which needs `x` permission of dir. When it's forbidden, the nfs volume will never umount without a correct mount-point finded.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#57095
**Special notes for your reviewer**:
**Release note**:
```release-note
Get parent dir via canonical absolute path when trying to judge mount-point
```
`lsblk` reads fs type info from udev files. If udev rules are not
installed. `lsblk` could not get correct fs type. This will cause
problems, e.g. expanding volume depends on fs type of disk.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
fix warning messages due to GetMountRefs func not implemented in windows
**What this PR does / why we need it**:
This PR completes the windows implementation of GetMountRefs in mount.go. In linux, the GetMountRefs implementaion is: read `/proc/mounts` and find all mount points, while in Windows, there is no such `/proc/mounts` place which shows all mounting points.
There is another way in windows, **we could walk through(by `getAllParentLinks` func) the mount path(symbolic link) and get all symlinks until we got the final device, which is actually a drive**.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#54670
This PR fixed the warnning issue mentioned in https://github.com/kubernetes/kubernetes/pull/51252
**Special notes for your reviewer**:
Some values in the code would be like follwoing:
```
GetMountRefs: mountPath ("\\var\\lib\\kubelet\\pods/4c74b128-92ca-11e7-b86b-000d3a36d70c/volumes/kubernetes.io~azure-disk/pvc-1cc91c70-92ca-11e7-b86b-000d3a36d70c")
getAllParentLinks: refs (["" "" "c:\\var\\lib\\kubelet\\plugins\\kubernetes.io\\azure-disk\\mounts\\b1246717734" "G:\\"])
basemountPath c:\var\lib\kubelet\plugins\kubernetes.io\azure-disk\mounts
got volumeID b1246717734
```
**Release note**:
```
fix warning messages due to GetMountRefs func not implemented in windows
```
Automatic merge from submit-queue (batch tested with PRs 43016, 50503, 51281, 51518, 51582). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
double const in mount_linux.go
**What this PR does / why we need it**:
fix some typo and double const
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 43016, 50503, 51281, 51518, 51582). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>..
Clean up diskLooksUnformatted literal
**What this PR does / why we need it**:
#16948 moved the `formatAndMount` function to mount_linux.go, but `diskLooksUnformatted` does not necessarily need to appear in mount_unsupported.go
#31515 Renames `diskLooksUnformatted` to `getDiskFormat`, but did not update the comment
This is to do the small cleanup.
**Which issue this PR fixes**
**Special notes for your reviewer**:
**Release note**:
add initial work for mount azure file on windows
fix review comments
full implementation for attach azure file on windows node
working azure file mount
remove useless functions
add a workable implementation about mounting azure file on windows node
fix review comments and make the pod creating successful even azure file mount failed
fix according to review comments
add mount_windows_test
add implementation for IsLikelyNotMountPoint func
remove mount_windows_test.go temporaly
add back unit test for mount_windows.go
add normalizeWindowsPath func
fix normalizeWindowsPath func issue
implment azure disk on windows
update bazel BUILD
revert validation.go change as it's another PR
fix merge issue and compiling issue
fix windows compiling issue
fix according to review comments
fix according to review comments
fix cross-build failure
fix according to review comments
fix test build failure temporalily
fix darwin build failure
fix azure windows test failure
add empty implementation of MakeRShared on windows
fix gofmt errors
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
Automatic merge from submit-queue
Run mount in its own systemd scope.
Kubelet needs to run /bin/mount in its own cgroup.
- When kubelet runs as a systemd service, "systemctl restart kubelet" may kill all processes in the same cgroup and thus terminate fuse daemons that are needed for gluster and cephfs mounts.
- When kubelet runs in a docker container, restart of the container kills all fuse daemons started in the container.
Killing fuse daemons is bad, it basically unmounts volumes from running pods.
This patch runs mount via "systemd-run --scope /bin/mount ...", which makes sure that any fuse daemons are forked in its own systemd scope (= cgroup) and they will survive restart of kubelet's systemd service or docker container.
This helps with #34965
As a downside, each new fuse daemon will run in its own transient systemd service and systemctl output may be cluttered.
@kubernetes/sig-storage-pr-reviews
@kubernetes/sig-node-pr-reviews
```release-note
fuse daemons for GlusterFS and CephFS are now run in their own systemd scope when Kubernetes runs on a system with systemd.
```