Commit Graph

42 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
5d24a2c199 Merge pull request #49300 from tklauser/syscall-to-x-sys-unix
Automatic merge from submit-queue

Switch from package syscall to golang.org/x/sys/unix

**What this PR does / why we need it**:

The syscall package is locked down and the comment in https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24 advises to switch code to use the corresponding package from golang.org/x/sys. This PR does so and replaces usage of package syscall with package golang.org/x/sys/unix where applicable. This will also allow to get updates and fixes
without having to use a new go version.

In order to get the latest functionality, golang.org/x/sys/ is re-vendored. This also allows to use Eventfd() from this package instead of calling the eventfd() C function.

**Special notes for your reviewer**:

This follows previous works in other Go projects, see e.g. moby/moby#33399, cilium/cilium#588

**Release note**:

```release-note
NONE
```
2017-08-03 04:02:12 -07:00
Tobias Klauser
4a69005fa1 switch from package syscall to x/sys/unix
The syscall package is locked down and the comment in [1] advises to
switch code to use the corresponding package from golang.org/x/sys. Do
so and replace usage of package syscall with package
golang.org/x/sys/unix where applicable.

  [1] https://github.com/golang/go/blob/master/src/syscall/syscall.go#L21-L24

This will also allow to get updates and fixes for syscall wrappers
without having to use a new go version.

Errno, Signal and SysProcAttr aren't changed as they haven't been
implemented in /x/sys/. Stat_t from syscall is used if standard library
packages (e.g. os) require it. syscall.SIGTERM is used for
cross-platform files.
2017-07-21 12:14:42 +02:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Ian Chakeres
2b18d3b6f7 Fixes bind-mount teardown failure with non-mount point Local volumes
Added IsNotMountPoint method to mount utils (pkg/util/mount/mount.go)
Added UnmountMountPoint method to volume utils (pkg/volume/util/util.go)
Call UnmountMountPoint method from local storage (pkg/volume/local/local.go)
IsLikelyNotMountPoint behavior was not modified, so the logic/behavior for UnmountPath is not modified
2017-07-11 17:19:58 -04:00
Jan Safranek
4cf36b8b39 Do not reformat devices with partitions
lsblk reports FSTYPE of devices with partition tables as empty string "",
which is indistinguishable from empty devices. We must look for dependent
devices (i.e. partitions) to see that the device is really empty and report
error otherwise.

I checked that LVM, LUKS and MD RAID have their own FSTYPE in lsblk output,
so it should be only a partition table that has empty FSTYPE.

The main point of this patch is to run lsblk without "-n", i.e. print all
dependent devices and check if they're there.
2017-03-20 13:08:13 +01:00
Kubernetes Submit Queue
81d01a84e0 Merge pull request #41944 from jingxu97/Feb/mounter
Automatic merge from submit-queue (batch tested with PRs 35094, 42095, 42059, 42143, 41944)

Use chroot for containerized mounts

This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-28 09:20:21 -08:00
Kubernetes Submit Queue
a426904009 Merge pull request #31515 from jsafrane/format-error
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)

Show specific error when a volume is formatted by unexpected filesystem.

kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.

Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.

Example kubectl describe pod output:

```
  FirstSeen     LastSeen        Count   From                                    SubobjectPath   Type            Reason          Message
  41s           3s              7       {kubelet ip-172-18-3-82.ec2.internal}                   Warning         FailedMount     MountVolume.MountDevice failed for volume "kubernetes.io/aws-ebs/aws://us-east-1d/vol-ba79c81d" (spec.Name: "pvc-ce175cbb-6b82-11e6-9fe4-0e885cca73d3") pod "3d19cb64-6b83-11e6-9fe4-0e885cca73d3" (UID: "3d19cb64-6b83-11e6-9fe4-0e885cca73d3") with: failed to mount the volume as "ext4", it's already formatted with "xfs". Mount error: mount failed: exit status 32
Mounting arguments: /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-east-1d/vol-ba79c81d ext4 [defaults]
Output: mount: wrong fs type, bad option, bad superblock on /dev/xvdba,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
```
2017-02-25 02:17:57 -08:00
Jing Xu
ac22416835 Use chroot for containerized mounts
This PR is to modify the containerized mounter script to use chroot
instead of rkt fly. This will avoid the problem of possible large number
of mounts caused by rkt containers if they are not cleaned up.
2017-02-24 13:46:26 -08:00
Harry Zhang
3bdc3f25ec Use fnv.New32a() in hash instead adler32 2017-02-15 14:03:54 +08:00
Jan Safranek
c8df30973b Show specific error when a volume is formatted by unexpected filesystem.
kubelet now detects that e.g. xfs volume is being mounted as ext3 because of
wrong volume.Spec.

Mount error is left in the error message to diagnose issues with mounting e.g.
'ext3' volume as 'ext4' - they are different filesystems, however kernel should
mount ext3 as ext4 without errors.
2017-02-13 12:15:34 +01:00
Kubernetes Submit Queue
6dfe5c49f6 Merge pull request #38865 from vwfs/ext4_no_lazy_init
Automatic merge from submit-queue

Enable lazy initialization of ext3/ext4 filesystems

**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #30752, fixes #30240

**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```

**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.

These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount

Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```

The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.

When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system. 

As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory. 

I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.

As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
2017-01-18 09:09:52 -08:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Alexander Block
13a2bc8afb Enable lazy initialization of ext3/ext4 filesystems 2016-12-18 11:08:51 +01:00
Jing Xu
37136e9780 Enable containerized mounter only for nfs and glusterfs types
This change is to only enable containerized mounter for nfs and
glusterfs types. For other types such as tmpfs, ext2/3/4 or empty type,
we should still use mount from $PATH
2016-12-02 15:06:24 -08:00
Pengfei Ni
f584ed4398 Fix package aliases to follow golang convention 2016-11-30 15:40:50 +08:00
Vishnu kannan
dd8ec911f3 Revert "Revert "Merge pull request #35821 from vishh/gci-mounter-scope""
This reverts commit 402116aed4.
2016-11-08 11:09:10 -08:00
saadali
402116aed4 Revert "Merge pull request #35821 from vishh/gci-mounter-scope"
This reverts commit 973fa6b334, reversing
changes made to 41b5fe86b6.
2016-11-03 20:23:25 -07:00
Vishnu Kannan
414e4ae549 Revert "Adding a root filesystem override for kubelet mounter"
This reverts commit e861a5761d.
2016-11-02 15:18:09 -07:00
Vishnu Kannan
1ecc12f724 [Kubelet] Do not use custom mounter script for bind mounts, ext* and tmpfs mounts
Signed-off-by: Vishnu Kannan <vishnuk@google.com>
2016-11-02 15:18:08 -07:00
Vishnu kannan
7fd03c4b6e Fix source and target path with overriden rootfs in mount utility package
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-27 09:46:33 -07:00
Vishnu kannan
e861a5761d Adding a root filesystem override for kubelet mounter
This is useful for supporting hostPath volumes via containerized
mounters in kubelet.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-10-26 21:42:59 -07:00
Jing Xu
b02481708a Fix volume states out of sync problem after kubelet restarts
When kubelet restarts, all the information about the volumes will be
gone from actual/desired states. When update node status with mounted
volumes, the volume list might be empty although there are still volumes
are mounted and in turn causing master to detach those volumes since
they are not in the mounted volumes list. This fix is to make sure only
update mounted volumes list after reconciler starts sync states process.
This sync state process will scan the existing volume directories and
reconstruct actual states if they are missing.

This PR also fixes the problem during orphaned pods' directories. In
case of the pod directory is unmounted but has not yet deleted (e.g.,
interrupted with kubelet restarts), clean up routine will delete the
directory so that the pod directoriy could be cleaned up (it is safe to
delete directory since it is no longer mounted)

The third issue this PR fixes is that during reconstruct volume in
actual state, mounter could not be nil since it is required for creating
container.VolumeMap. If it is nil, it might cause nil pointer exception
in kubelet.

Details are in proposal PR #33203
2016-10-25 12:29:12 -07:00
Michael Taufen
dba917c5b7 Include mount command in Kubelet mounter output 2016-10-24 05:50:24 -07:00
Jing Xu
34ef93aa0c Add mounterPath to mounter interface
In order to be able to use new mounter library, this PR adds the
mounterPath flag to kubelet which passes the flag to the mount
interface. If flag is empty, mount uses default mount path.
2016-10-20 14:15:27 -07:00
Brendan Burns
07c8f9a173 Don't return an error if a file doesn't exist for IsPathDevice(...) 2016-09-05 20:45:22 -07:00
Morgan Bauer
92a043e833
ensure pkg/util/mount compiles & crosses
- move compile time check from linux code to generic code
2016-08-21 17:47:24 -07:00
Jing Xu
f19a1148db This change supports robust kubelet volume cleanup
Currently kubelet volume management works on the concept of desired
and actual world of states. The volume manager periodically compares the
two worlds and perform volume mount/unmount and/or attach/detach
operations. When kubelet restarts, the cache of those two worlds are
gone. Although desired world can be recovered through apiserver, actual
world can not be recovered which may cause some volumes cannot be cleaned
up if their information is deleted by apiserver. This change adds the
reconstruction of the actual world by reading the pod directories from
disk. The reconstructed volume information is added to both desired
world and actual world if it cannot be found in either world. The rest
logic would be as same as before, desired world populator may clean up
the volume entry if it is no longer in apiserver, and then volume
manager should invoke unmount to clean it up.
2016-08-15 11:29:15 -07:00
Scott Creeley
11d1289afa Add volume and mount logging 2016-07-21 09:10:00 -04:00
Cindy Wang
e13c678e3b Make volume unmount more robust using exclusive mount w/ O_EXCL 2016-07-18 16:20:08 -07:00
David McMahon
ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
laushinka
7ef585be22 Spelling fixes inspired by github.com/client9/misspell 2016-02-18 06:58:05 +07:00
Sami Wagiaalla
c18f342ac6 Use constants for fsck return values 2015-12-08 10:51:12 -05:00
Sami Wagiaalla
10688f1a11 Run fsck before formatting disk
Signed-off-by: Sami Wagiaalla <swagiaal@redhat.com>
2015-12-08 10:50:30 -05:00
Sami Wagiaalla
1880c4eedb move formatAndMount and diskLooksUnformatted to mount_linux 2015-11-06 15:37:46 -05:00
Huamin Chen
3b14135cad mount returns more verbose message upon error
Signed-off-by: Huamin Chen <hchen@redhat.com>
2015-10-21 11:52:02 -04:00
Eric Paris
f125ad88ce Rename IsMountPoint to IsLikelyNotMountPoint
IsLikelyNotMountPoint determines if a directory is not a mountpoint.
It is fast but not necessarily ALWAYS correct. If the path is in fact
a bind mount from one part of a mount to another it will not be detected.
mkdir /tmp/a /tmp/b; mount --bin /tmp/a /tmp/b; IsLikelyNotMountPoint("/tmp/b")
will return true. When in fact /tmp/b is a mount point. So this patch
renames the function and switches it from a positive to a negative (I
could think of a good positive name). This should make future users of
this function aware that it isn't quite perfect, but probably good
enough.
2015-08-14 18:45:43 -04:00
markturansky
450002a52e Fixed formatting of error message 2015-06-19 11:21:57 -04:00
Paul Morie
e5521234e4 Add NsenterMounter mount implementation 2015-05-04 14:40:04 -04:00
Eric Paris
6b3a6e6b98 Make copyright ownership statement generic
Instead of saying "Google Inc." (which is not always correct) say "The
Kubernetes Authors", which is generic.
2015-05-01 17:49:56 -04:00
Deyuan Deng
6897095e56 Change mount.Interface.Mount to exec('mount'), instead of syscall 2015-04-29 10:46:32 -04:00
Deyuan Deng
d62afa85ff Abstract ismountpoint and use platform mounter for NFS volume 2015-04-01 23:05:02 -04:00
Paul Morie
8ef04a8425 Factor mount utility code out gce_pd volume plugin 2015-03-05 13:49:32 -05:00