Commit Graph

59 Commits

Author SHA1 Message Date
Travis Rhoden
2c4d748bed Refactor subpath out of pkg/util/mount
This patch moves subpath functionality out of pkg/util/mount and into a
new package pkg/volume/util/subpath. NSEnter funtionality is maintained.
2019-02-26 19:59:53 -07:00
Travis Rhoden
766cf26897 Move original mount files back
Move original mount files back into pkg/util/mount. This move is done to
preserve git history.
2019-02-26 12:18:25 -07:00
Travis Rhoden
f2438cacf5 Copy mount files to pkg/volume/util/subpath
Files in pkg/util/mount that contain significant code implementation for
subpaths are moved to a new package at pkg/volume/util/subpath. This
move is done in order to preserve git history.
2019-02-26 12:14:55 -07:00
Andrew Kim
84191eb99b replace pkg/util/file with k8s.io/utils/path 2019-01-29 15:20:13 -05:00
Andrew Kim
2ea82cea20 replace pkg/util/nsenter with k8s.io/utils/nsenter 2019-01-24 13:49:04 -05:00
Yecheng Fu
037ab98521 Deprecate mount.IsNotMountPoint 2019-01-06 20:25:31 +08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
2018-11-10 07:50:31 -05:00
Jan Safranek
0b8c472578 Fixed subpath in containerized kubelet.
IsNotMountPoint should return no error when the checked directory does not
exists - missing directory can't be mounted. Therefore containerized
kubelet should check if the target exists first before resolving symlinks.
EvalHostSymlinks() returns indistinguishible error in case the path does
not exist.
2018-10-09 13:11:23 +02:00
k8s-ci-robot
c7a67b3e1b
Merge pull request #68626 from gnufied/fix-netdev-mount-opt
Apply _netdev mount option in bind mount if available
2018-09-25 17:00:36 -07:00
David Zhu
704573d304 GetMountRefs shouldn't error when file doesn'g exist in Windows and nsenter. Add unit test 2018-09-18 10:45:02 -07:00
Hemant Kumar
e881a29107 Apply _netdev mount option in bind mount if available
_netdev mount option is a userspace mount option and
isn't copied over when bind mount is created and remount
also does not copies it over and hence must be explicitly
used with bind mount
2018-09-13 13:47:34 -04:00
NickrenREN
7157d4582b make pathWithinBase public 2018-09-03 13:34:56 +08:00
Matthew Wong
b376b31ee0 Resolve potential devicePath symlink when MapVolume in containerized kubelet 2018-06-26 13:08:36 -04:00
Kubernetes Submit Queue
d2495b8329
Merge pull request #63143 from jsafrane/containerized-subpath
Automatic merge from submit-queue (batch tested with PRs 63348, 63839, 63143, 64447, 64567). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Containerized subpath

**What this PR does / why we need it**:
Containerized kubelet needs a different implementation of `PrepareSafeSubpath` than kubelet running directly on the host.

On the host we safely open the subpath and then bind-mount `/proc/<pidof kubelet>/fd/<descriptor of opened subpath>`.

With kubelet running in a container, `/proc/xxx/fd/yy` on the host contains path that works only inside the container, i.e. `/rootfs/path/to/subpath` and thus any bind-mount on the host fails.

Solution:
- safely open the subpath and gets its device ID and inode number
- blindly bind-mount the subpath to `/var/lib/kubelet/pods/<uid>/volume-subpaths/<name of container>/<id of mount>`. This is potentially unsafe, because user can change the subpath source to a link to a bad place (say `/run/docker.sock`) just before the bind-mount.
- get device ID and inode number of the destination. Typical users can't modify this file, as it lies on /var/lib/kubelet on the host.
- compare these device IDs and inode numbers.

**Which issue(s) this PR fixes**
Fixes #61456

**Special notes for your reviewer**:

The PR contains some refactoring of `doBindSubPath` to extract the common code. New `doNsEnterBindSubPath` is added for the nsenter related parts.

**Release note**:

```release-note
NONE
```
2018-06-01 12:12:19 -07:00
Yecheng Fu
28b6f34107 Should use hostProcMountinfoPath constant in nsenter_mount.go. 2018-05-26 00:09:25 +08:00
Jan Safranek
9b74125440 Pass Nsenter to NsenterMounter and NsenterWriter
So Nsenter is initialized only once and with the right parameters.
2018-05-23 10:21:21 +02:00
Jan Safranek
a8a37fb714 Created directories in /var/lib/kubelet directly. 2018-05-23 10:21:21 +02:00
Jan Safranek
9f80de3772 Split NsEnterMounter and Mounter implementation of doBindSubpath
nsenter implementation needs to mount different thing in the end and do
different checks on the result.
2018-05-23 10:21:21 +02:00
Jan Safranek
7e3fb502a8 Change SafeMakeDir to resolve symlinks in mounter implementation
Kubelet should not resolve symlinks outside of mounter interface.
Only mounter interface knows, how to resolve them properly on the host.

As consequence, declaration of SafeMakeDir changes to simplify the
implementation:
from SafeMakeDir(fullPath string, base string, perm os.FileMode)
to   SafeMakeDir(subdirectoryInBase string, base string, perm os.FileMode)
2018-05-23 10:21:20 +02:00
Jan Safranek
74ba0878a1 Enhance ExistsPath check
It should return error when the check fails (e.g. no permissions, symlink link
loop etc.)
2018-05-23 10:21:20 +02:00
Jan Safranek
7450d1b427 Allow EvalSymlinks target not to exist.
Various NsEnterMounter function need to resolve the part of the path that
exists and blindly add the part that doesn't.
2018-05-23 10:21:18 +02:00
Jan Safranek
97b5299cd7 Add GetMode to mounter interface.
Kubelet must not call os.Lstat on raw volume paths when it runs in a container.
Mounter knows where the file really is.
2018-05-23 10:17:59 +02:00
Yecheng Fu
df0f108a02 Fixes fsGroup check in local volume in containerized kubelet. Except
this, it also fixes fsGroup check when volume source is a normal
directory whether kubelet is running on the host or in a container.
2018-05-23 10:41:42 +08:00
Jan Safranek
598ca5accc Add GetSELinuxSupport to mounter. 2018-05-17 13:36:37 +02:00
Kubernetes Submit Queue
186dd7beb1
Merge pull request #62903 from cofyc/fixfsgroupcheckinlocal
Automatic merge from submit-queue (batch tested with PRs 62657, 63278, 62903, 63375). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add more volume types in e2e and fix part of them.

**What this PR does / why we need it**:

- Add dir-link/dir-bindmounted/dir-link-bindmounted/bockfs volume types for e2e tests.
- Fix fsGroup related e2e tests partially.
- Return error if we cannot resolve volume path.
  - Because we should not fallback to volume path, if it's a symbolic link, we may get wrong results.

To safely set fsGroup on local volume, we need to implement these two methods correctly for all volume types both on the host and in container:

- get volume path kubelet can access
  - paths on the host and in container are different
- get mount references
  - for directories, we cannot use its mount source (device field) to identify mount references, because directories on same filesystem have same mount source (e.g. tmpfs), we need to check filesystem's major:minor and directory root path on it

Here is current status:

| | (A) volume-path (host) | (B) volume-path (container) | (C) mount-refs (host) | (D) mount-refs (container) |
| --- | --- | --- | --- | --- |
| (1) dir | OK | FAIL | FAIL | FAIL |
| (2) dir-link | OK | FAIL | FAIL | FAIL |
| (3) dir-bindmounted | OK | FAIL | FAIL | FAIL |
| (4) dir-link-bindmounted | OK | FAIL | FAIL | FAIL |
| (5) tmpfs| OK | FAIL | FAIL | FAIL |
| (6) blockfs| OK | FAIL | OK | FAIL |
| (7) block| NOTNEEDED | NOTNEEDED | NOTNEEDED | NOTNEEDED |
| (8) gce-localssd-scsi-fs| NOTTESTED | NOTTESTED | NOTTESTED | NOTTESTED |

- This PR uses `nsenter ... readlink` to resolve path in container as @msau42  @jsafrane [suggested](https://github.com/kubernetes/kubernetes/pull/61489#pullrequestreview-110032850). This fixes B1:B6 and D6, , the rest will be addressed in https://github.com/kubernetes/kubernetes/pull/62102.
- C5:D5 marked `FAIL` because `tmpfs` filesystems can share same mount source, we cannot rely on it to check mount references. e2e tests passes due to we use unique mount source string in tests.
- A7:D7 marked `NOTNEEDED` because we don't set fsGroup on block devices in local plugin. (TODO: Should we set fsGroup on block device?)
- A8:D8 marked `NOTTESTED` because I didn't test it, I leave it to `pull-kubernetes-e2e-gce`. I think it should be same as `blockfs`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-05-02 20:13:11 -07:00
Yecheng Fu
3748197876 Add more volume types in e2e and fix part of them.
- Add dir-link/dir-bindmounted/dir-link-bindmounted/blockfs volume types for e2e
tests.
- Return error if we cannot resolve volume path.
- Add GetFSGroup/GetMountRefs methods for mount.Interface.
- Fix fsGroup related e2e tests partially.
2018-05-02 10:31:42 +08:00
Davanum Srinivas
2f98d7a3ea Support nsenter in non-systemd environments
In our CI, we run kubekins image for most of the jobs. This is a
debian image with upstart and does not enable systemd. So we should:

* Bailout if any binary is missing other than systemd-run.
* SupportsSystemd should check the binary path to correctly
  identify if the systemd-run is present or not
* Pass the errors back to the callers so kubelet is forced to
  fail early when there is a problem. We currently assume
  that all binaries are in the root directory by default which
  is wrong.
2018-04-22 22:10:36 -04:00
andyzhangx
d154f8d021 fix nsenter GetFileType issue
use outputBytes as return error
2018-04-12 13:12:13 +00:00
Jan Safranek
5110db5087 Lock subPath volumes
Users must not be allowed to step outside the volume with subPath.
Therefore the final subPath directory must be "locked" somehow
and checked if it's inside volume.

On Windows, we lock the directories. On Linux, we bind-mount the final
subPath into /var/lib/kubelet/pods/<uid>/volume-subpaths/<container name>/<subPathName>,
it can't be changed to symlink user once it's bind-mounted.
2018-03-05 09:14:44 +01:00
Michal Fojtik
fd520aef61
Show findmt command output in case of error 2018-01-15 13:02:31 +01:00
Di Xu
57ead4898b use GetFileType per mount.Interface to check hostpath type 2017-09-26 09:57:06 +08:00
Di Xu
46b0b3491f refactor nsenter to new pkg/util 2017-09-26 09:56:44 +08:00
Jan Safranek
d9500105d8 Share /var/lib/kubernetes on startup
Kubelet makes sure that /var/lib/kubelet is rshared when it starts.
If not, it bind-mounts it with rshared propagation to containers
that mount volumes to /var/lib/kubelet can benefit from mount propagation.
2017-08-30 16:45:04 +02:00
Kubernetes Submit Queue
68ac78ae45 Merge pull request #49640 from jsafrane/systemd-mount-service
Automatic merge from submit-queue

Run mount in its own systemd scope.

Kubelet needs to run /bin/mount in its own cgroup.

- When kubelet runs as a systemd service, "systemctl restart kubelet" may kill all processes in the same cgroup and thus terminate fuse daemons that are needed for gluster and cephfs mounts.

- When kubelet runs in a docker container, restart of the container kills all fuse daemons started in the container.

Killing fuse daemons is bad, it basically unmounts volumes from running pods.

This patch runs mount via "systemd-run --scope /bin/mount ...", which makes sure that any fuse daemons are forked in its own systemd scope (= cgroup) and they will survive restart of kubelet's systemd service or docker container.

This helps with #34965

As a downside, each new fuse daemon will run in its own transient systemd service and systemctl output may be cluttered.

@kubernetes/sig-storage-pr-reviews 
@kubernetes/sig-node-pr-reviews 

```release-note
fuse daemons for GlusterFS and CephFS are now run in their own systemd scope when Kubernetes runs on a system with systemd.
```
2017-08-09 12:05:01 -07:00
Jan Safranek
5a8a6110a2 Run mount in its own systemd scope.
Kubelet needs to run /bin/mount in its own cgroup.

- When kubelet runs as a systemd service, "systemctl restart kubelet" may kill
  all processes in the same cgroup and thus terminate fuse daemons that are
  needed for gluster and cephfs mounts.

- When kubelet runs in a docker container, restart of the container kills all
  fuse daemons started in the container.

Killing fuse daemons is bad, it basically unmounts volumes from running pods.

This patch runs mount via "systemd-run --scope /bin/mount ...", which makes
sure that any fuse daemons are forked in its own systemd scope (= cgroup) and
they will survive restart of kubelet's systemd service or docker container.

As a downside, each new fuse daemon will run in its own transient systemd
service and systemctl output may be cluttered.
2017-07-26 16:14:39 +02:00
Kubernetes Submit Queue
feed4aa12a Merge pull request #49234 from mengqiy/master
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)

Move util/exec to vendor

Move util/exec to vendor.
Update import paths.
Update godep

Part of #48209

Associate PR against `k8s.io/utils` repo: https://github.com/kubernetes/utils/pull/5

```release-note
NONE
```

/assign @apelisse
2017-07-20 15:08:22 -07:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Jan Safranek
87551071a1 Fix findmnt parsing in containerized kubelet
NsEnterMounter should not stop parsing findmnt output on the first space but
on the last one, just in case the mount point name itself contains a space.
2017-07-18 13:35:44 +02:00
Ian Chakeres
2b18d3b6f7 Fixes bind-mount teardown failure with non-mount point Local volumes
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
2017-07-11 17:19:58 -04:00
Jędrzej Nowak
7a7c36261e fixed absense to absence 2016-10-14 16:28:46 +02:00
Jing Xu
f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Cindy Wang
e13c678e3b Make volume unmount more robust using exclusive mount w/ O_EXCL 2016-07-18 16:20:08 -07:00
David McMahon
ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
k8s-merge-robot
51dd3d562d Merge pull request #27380 from rootfs/fix-nsenter-list
Automatic merge from submit-queue

in nsenter mounter, read  hosts PID 1 /proc/mounts to list the mounts

fix #27378
2016-06-19 18:38:54 -07:00
Huamin Chen
8e67a308ed in nsenter mounter, read hosts PID 1 /proc/mounts to list the mounts
Signed-off-by: Huamin Chen <hchen@redhat.com>
2016-06-14 17:54:59 +00:00
Jing Xu
809dae1978 Fix bug in isLikelyNotMountPoint function
In nsenter_mount.go/isLikelyNotMountPoint function, the returned output
from findmnt command misses the last letter. Modify the code to make sure
that output has the full target path. fix #26421 #25056 #22911
2016-06-13 17:28:38 -07:00
k8s-merge-robot
04df711a35 Merge pull request #24035 from jsafrane/devel/nsenter-error-messages
Automatic merge from submit-queue

Add path to log messages.

It is useful to see the path that failed to mount in the log messages.

Fixes #23990.
2016-04-14 03:47:26 -07:00
Jan Safranek
be2ca8dec3 Add path to log messages.
It is useful to see the path that failed to mount in the log messages.
2016-04-08 09:41:09 +02:00
Jan Safranek
92f181174b Fixed mounting with containerized kubelet.
Kubelet was not able to mount volumes when running inside a container and
using nsenter mounter,

NsenterMounter.IsLikelyNotMountPoint() should return ErrNotExist when the
checked directory does not exists as the regular mounted does this and
some volume plugins depend on this behavior.
2016-03-24 17:27:11 +01:00
Filip Grzadkowski
217d9f38e7 Don't return error when findmnt exits with error.
Fixes #21303
2016-02-19 14:09:33 +01:00