Commit Graph

165 Commits

Author SHA1 Message Date
Paul "TBBle" Hampson
8a4cbabc64 Reimport Windows layers when committing snapshots
A Scratch layer only contains a sandbox.vhdx, but to be used as a parent
layer, it must also contain the files on-disk.

Hence, we Export the layer from the sandbox.vhdx and Import it back into
itself, so that both data formats are present.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2021-04-14 20:45:59 +10:00
Alakesh Haloi
5ce35ac398 devmapper: log pool status when mkfs fails
If mkfs on a device mapper thin pool fails, the pool status returned
by dmsetup is now logged for enhanced error reporting.

Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
2021-04-12 19:24:04 +00:00
Derek McGowan
261c107ffc
Merge pull request #5278 from mxpv/toml
Migrate TOML to github.com/pelletier/go-toml
2021-04-01 21:24:52 -07:00
Kazuyoshi Kato
e1f51ba73d Use os.File#Seek() to get the size of a block device
Instead of calling blockdev(1), this change uses os.File#Seek, which
should be more efficient.

https://github.com/firecracker-microvm/firecracker/pull/1371

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-03-26 10:14:38 -07:00
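
The Seek-based approach named in commit e1f51ba73d can be sketched as follows. This is a minimal illustration, not the code from the commit: `sizeBySeek` and `makeSample` are hypothetical names, and a sparse temporary file stands in for a block device so the sketch runs without privileges.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// sizeBySeek returns the size in bytes of the file or block device at path.
// Seeking to the end returns the end offset, which equals the size.
func sizeBySeek(path string) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return f.Seek(0, io.SeekEnd)
}

// makeSample is a hypothetical helper: it creates a sparse temporary file of
// n bytes (standing in for a block device) and returns its path and a cleanup func.
func makeSample(n int64) (string, func(), error) {
	f, err := os.CreateTemp("", "seek-size-*")
	if err != nil {
		return "", nil, err
	}
	defer f.Close()
	cleanup := func() { os.Remove(f.Name()) }
	if err := f.Truncate(n); err != nil {
		cleanup()
		return "", nil, err
	}
	return f.Name(), cleanup, nil
}

func run() error {
	path, cleanup, err := makeSample(1 << 20)
	if err != nil {
		return err
	}
	defer cleanup()

	size, err := sizeBySeek(path)
	if err != nil {
		return err
	}
	fmt.Println("size in bytes:", size)
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Compared with running an external command, this needs no child process and no output parsing; the same call works on a device node such as a path under /dev, given read permission.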
Maksym Pavlenko
ddd4298a10 Migrate current TOML code to github.com/pelletier/go-toml
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-25 13:13:33 -07:00
Sebastiaan van Stijn
708299ca40
Move RunningInUserNS() to its own package
This allows using the utility without bringing the whole of "sys" with it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-23 11:29:53 +01:00
Kazuyoshi Kato
7704fe72d0 Specifically mention "mkfs.ext4" on the error from the command
Before the change, the error on the caller-side (e.g. ctr) was
something like

> unpack: failed to prepare extraction snapshot "...": exit status 5:
> unknown

which was too cryptic.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-03-19 10:38:47 -07:00
Sebastiaan van Stijn
ba8f9845ec
move overlay-checks to an overlayutils package
This allows using the utilities without importing the whole
snapshotter.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-03-15 19:18:50 +01:00
Derek McGowan
35eeb24a17
Fix exported comments enforcer in CI
Add comments where missing and fix incorrect comments

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-03-12 08:47:05 -08:00
Derek McGowan
ddf6594fbe
Merge pull request #5076 from AkihiroSuda/ovl-k511
overlay: support "userxattr" option (kernel 5.11)
2021-03-09 07:07:30 -08:00
Michael Crosby
7738246cd9
Merge pull request #5111 from ctrlaltdel121/master
mark device faulty after parent fails to suspend
2021-03-08 14:13:25 -05:00
Maksym Pavlenko
e1b4c0ad43 Remove flaky devmapper check
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-03-03 14:51:11 -08:00
Jeremy Williams
51a72f0492 mark device faulty after parent fails to suspend
When an error is returned here, unlike the other error returns in the function, nothing is done to mark the added device as faulty or remove it.
I have observed this causing future snapshot creations to continue to attempt to use the same ID (from the sequence) to create new devices
and get blocked because the device already exists because it was not rolled back here.

Hopefully fixes #5110

Signed-off-by: Jeremy Williams <ctrlaltdel121@gmail.com>
2021-03-03 17:02:07 -05:00
Akihiro Suda
9ade247b38
overlay: support "userxattr" option (kernel 5.11)
The "userxattr" option is needed for mounting overlayfs inside a user namespace with kernel >= 5.11.

The "userxattr" option is NOT needed for the initial user namespace (aka "the host").

Also, Ubuntu (since circa 2015) and Debian (since 10) with kernel < 5.11 can mount the overlayfs in a user namespace without the "userxattr" option.

The corresponding kernel commit: 2d2f2d7322ff43e0fe92bf8cccdc0b09449bf2e1
> ovl: user xattr
>
> Optionally allow using "user.overlay." namespace instead of "trusted.overlay."
> ...
> Disable redirect_dir and metacopy options, because these would allow privilege escalation through direct manipulation of the
> "user.overlay.redirect" or "user.overlay.metacopy" xattrs.

Fix issue 5060

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-03-01 13:54:51 +09:00
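
The rule stated in commit 9ade247b38 (add "userxattr" only inside a user namespace and only on kernel 5.11 or later) can be sketched as below. This is an assumption-laden illustration: `needsUserXAttr` and `overlayOptions` are hypothetical names, the kernel version and namespace state are passed in as plain parameters, and the commit itself may detect support differently (for example by probing a mount).

```go
package main

import (
	"fmt"
	"strings"
)

// needsUserXAttr is a hypothetical helper. It reports whether the
// "userxattr" mount option should be added, following the commit message:
// only inside a user namespace, and only with kernel >= 5.11.
func needsUserXAttr(inUserNS bool, major, minor int) bool {
	if !inUserNS {
		// The initial user namespace ("the host") does not need the option.
		return false
	}
	return major > 5 || (major == 5 && minor >= 11)
}

// overlayOptions is a hypothetical helper that builds the overlay mount
// option string for the given directories.
func overlayOptions(lower, upper, work string, userxattr bool) string {
	opts := []string{
		"lowerdir=" + lower,
		"upperdir=" + upper,
		"workdir=" + work,
	}
	if userxattr {
		opts = append(opts, "userxattr")
	}
	return strings.Join(opts, ",")
}

func main() {
	// Example inputs are assumptions, not values read from the system.
	use := needsUserXAttr(true, 5, 11)
	fmt.Println(overlayOptions("/lower", "/upper", "/work", use))
}
```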
Justin
0cc3991387
Merge pull request #4912 from dcantah/dcantah/wcow-sandbox-size
Scratch size customization and UVM scratch creation for WCOW snapshotter
2021-02-17 15:19:19 -08:00
Kazuyoshi Kato
2ac33d79fe test: fix assert.Check's arguments to show its parameters correctly
The change I made at db6075fc2 didn't show its parameters correctly.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-02-04 10:56:58 -08:00
Kazuyoshi Kato
db6075fc24 snapshot/devmapper: log actual values to investigate #4965
This test has been flaky in GitHub Actions. This change logs the
values from devmapper to further investigate the issue.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2021-02-01 16:27:59 -08:00
Shengjing Zhu
074873c68e Add cgo tag to btrfs plugin
The btrfs plugin needs cgo support. However, on riscv64, cgo
is only supported in Go 1.16 (not released yet).
Instead of setting no_btrfs manually, adding a cgo tag tells
the compiler to skip it automatically.

Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2021-01-23 02:42:57 +08:00
Derek McGowan
9b9de47eb9
Merge pull request #4824 from dcantah/dcantah/reuse-scratch
Add scratch space re-use functionality to LCOW snapshotter
2021-01-21 17:21:31 -08:00
Daniel Canter
ff1451cab8 Scratch size customization and UVM scratch creation for WCOW snapshotter
* Currently we rely on making the UVM's sandbox.vhdx in the shim instead of having it
made by the snapshotter. This change adds a label that controls whether the UVM's
scratch layer is created in the snapshotter itself.

* Adds container scratch size customization. Before adding the computestorage calls
(vendored in with https://github.com/containerd/containerd/pull/4859) there was no way to make a container's
or UVM's scratch size less than the default (20 for containers and 10 for the UVM).

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-01-18 07:33:52 -08:00
Daniel Canter
3e5acb9d91 Add scratch space re-use functionality to LCOW snapshotter
Currently we would create a new disk and mount this into the LCOW UVM for every container but there
are certain scenarios where we'd rather just mount a single disk and then have every container share this one
storage space instead of every container having its own xGB of space to play around with.

This is accomplished by just making a symlink to the disk that we'd like to share and then
using ref counting later on down the stack in hcsshim if we see that we've already mounted this
disk.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2021-01-13 15:20:46 -08:00
Maksym Pavlenko
04df60d106
Merge pull request #4858 from samuelkarp/freebsd-native-snapshotter
Support the native snapshotter on FreeBSD
2021-01-11 09:52:56 -08:00
Peng Tao
b7026236f4 snapshot/devmapper: use losetup in mount package
No need to use the private losetup command line wrapper package.
The generic package provides the same functionality.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-01-04 10:15:04 -08:00
Samuel Karp
b624486c84
native: support for FreeBSD
Signed-off-by: Samuel Karp <me@samuelkarp.com>
2020-12-22 21:26:04 -08:00
Shengjing Zhu
5988bfc1ef docs: Various typos found by codespell
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2020-12-22 13:22:16 +08:00
Maksym Pavlenko
a5f9613b83
Merge pull request #3927 from katiewasnothere/snapshotter_check
add check that snapshotter supports the image platform when unpacking
2020-12-16 14:25:08 -08:00
Kathryn Baldauf
f8992f451c add optional check that snapshotter supports the image platform when unpacking
Signed-off-by: Kathryn Baldauf <kabaldau@microsoft.com>
2020-12-10 10:54:22 -08:00
Maksym Pavlenko
da68609866 Fix devmapper test
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-12-09 09:35:17 -08:00
Maksym Pavlenko
2b87d4554f Add retries when deleting a devmapper device
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-12-09 09:13:34 -08:00
Michael Crosby
b9092fae15
Merge pull request #4643 from dcantah/feedback-lcow-snapshotter
Optimize Windows and LCOW snapshotters to only create scratch layer on the final snapshot
2020-12-01 10:38:02 -05:00
Daniel Canter
a91c298d1d Optimize Windows and LCOW snapshotters to only create scratch layer on the final snapshot
For LCOW currently we copy (or create) the scratch.vhdx for every single snapshot
so there ends up being a sandbox.vhdx in every directory seemingly unnecessarily. With the default scratch
size of 20GB the size on disk is about 17MB so there's a 17MB overhead per layer plus the time to copy the
file with every snapshot. Only the final sandbox.vhdx is actually used so this would be a nice little
optimization.

For WCOW we essentially do the exact same except copy the blank vhdx from the base layer.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-11-30 16:25:38 -08:00
Wei Fu
625da6b3e6
Merge pull request #4719 from estesp/fix-shm-relabel-test
Reenable make test targets in GH Actions CI
2020-11-23 13:11:32 +08:00
Phil Estes
85d9fe3e8c
Adjust overlay tests to expect "index=off"
When running tests on any modern distro, this assumption will work. If
we need to make it work with kernels where we don't append this option
it will require some more involved changes.

Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2020-11-19 10:59:40 -05:00
Phil Estes
027ee569a3
Import crypto for all snapshotters during testsuite
Fixes runtime panic for testing snapshotters

Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
2020-11-19 08:50:07 -05:00
Kazuyoshi Kato
bb8aac38a0 Do not hardcode "amd64" on LCOW and Windows-related files
Fixes #3281.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2020-11-09 13:39:07 -08:00
Daniel Canter
9a1f6ea4dc Cri - Pass snapshotter labels into customopts.WithNewSnapshot
Previously there wasn't a way to pass any labels to snapshotters, as the wrapper
around WithNewSnapshot didn't have a parameter to pass them in.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2020-10-14 04:14:03 -07:00
Teemu Kallio
71fd68a920 devicemapper: separate implementation pkg from plugin pkg
Signed-off-by: Teemu Kallio <teemu.kallio@pm.me>
2020-09-18 12:00:14 +02:00
Derek McGowan
d4e78200d6
Merge pull request #4518 from knight42/feat/btrfs-config-root-path
feat(snapshot::btrfs): config root_path
2020-09-03 11:12:27 -07:00
Jian Zeng
c50ff694f0
refactor(native): separate init from implementation
Part of #4513

Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2020-09-03 19:58:31 +08:00
Jian Zeng
98b0b2a7c6
feat: make native root_path configurable
Part of #4514

Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2020-09-03 19:58:05 +08:00
Jian Zeng
a52daa26ae
refactor(btrfs): separate init from implementation
Part of #4513

Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2020-09-03 19:54:18 +08:00
Jian Zeng
4154235735
feat: make btrfs root_path configurable
Part of #4514

Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
2020-09-03 19:52:13 +08:00
Derek McGowan
70ffb12c1b
Separate overlay implementation from plugin
Put the overlay plugin in a separate package to allow the overlay package to be
used without needing to import and initialize the plugin.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-08-26 18:50:51 -07:00
Ashray Jain
5ed177a2da Add configurable overlayfs path
This allows configuring the location of the overlayfs snapshotter by
adding the following in config.toml
```
[plugins]
  [plugins.overlayfs]
    root_path = "/custom_location"
```

This is useful to isolate disk I/O for overlayfs from the rest of
containerd and to prevent containers that saturate disk I/O from negatively
affecting containerd operations and causing timeouts.

Signed-off-by: Ashray Jain <ashrayj@palantir.com>
2020-08-26 16:08:10 +01:00
Kazuyoshi Kato
a1f6c9dd88 snapshots/devmapper: fix rollback
The rollback mechanism is implemented by calling deleteDevice() and
RemoveDevice(). But RemoveDevice() is internally calling
deleteDevice() as well.

Since the device has already been deleted by the first deleteDevice() call,
RemoveDevice() will always see ENODATA. That specific error must be
ignored so that the device's metadata is removed correctly.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2020-08-17 15:41:03 -07:00
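
The rollback behaviour described in commit a1f6c9dd88 can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the snapshotter's code: `fakePool` and its methods are hypothetical stand-ins that only mirror the names in the message, and `syscall.ENODATA` assumes a Linux build.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// fakePool is a hypothetical in-memory stand-in for a thin pool.
type fakePool struct {
	devices  map[string]bool // devices known to the pool
	metadata map[string]bool // metadata records kept alongside
}

func newFakePool(names ...string) *fakePool {
	p := &fakePool{devices: map[string]bool{}, metadata: map[string]bool{}}
	for _, n := range names {
		p.devices[n] = true
		p.metadata[n] = true
	}
	return p
}

// deleteDevice deletes the device and returns ENODATA if it is already gone.
func (p *fakePool) deleteDevice(name string) error {
	if !p.devices[name] {
		return fmt.Errorf("delete %q: %w", name, syscall.ENODATA)
	}
	delete(p.devices, name)
	return nil
}

// removeDevice calls deleteDevice internally, then drops the metadata.
// ENODATA is ignored so that metadata is removed even when the device
// was already deleted by an earlier call.
func (p *fakePool) removeDevice(name string) error {
	if err := p.deleteDevice(name); err != nil && !errors.Is(err, syscall.ENODATA) {
		return err
	}
	delete(p.metadata, name)
	return nil
}

// rollback mirrors the sequence in the commit message: deleteDevice
// first, then removeDevice.
func (p *fakePool) rollback(name string) error {
	if err := p.deleteDevice(name); err != nil {
		return err
	}
	return p.removeDevice(name)
}

func main() {
	p := newFakePool("dev1")
	fmt.Println("rollback error:", p.rollback("dev1"))
	fmt.Println("metadata left:", len(p.metadata))
}
```

Without the `errors.Is` check, `removeDevice` would return early on ENODATA and the metadata record would be left behind.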
Kazuyoshi Kato
74e9aa7abb snapshots/devmapper: don't hardcode the platform strings
The snapshotter doesn't have to exclude non-amd64 platforms.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2020-08-03 11:55:36 -07:00
Kazuyoshi Kato
c383436af7 snapshots/devmapper: suspend a device to avoid data corruption
According to https://github.com/torvalds/linux/blob/v5.7/Documentation/admin-guide/device-mapper/thin-provisioning.rst#internal-snapshots:

> If the origin device that you wish to snapshot is active, you
> must suspend it before creating the snapshot to avoid corruption.

However, the devmapper snapshotter was not doing that.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2020-07-16 15:08:07 -07:00
Rudy Zhang
d36810d66d overlay: use index=off to fix EBUSY on mount
Kernel versions later than 4.13-rc1 support the index=on feature, which
can cause mounts to fail with EBUSY.

Related: https://github.com/moby/moby/pull/37993

Signed-off-by: Rudy Zhang <rudyflyzhang@gmail.com>
2020-06-08 15:51:15 +08:00
Sebastiaan van Stijn
dc92ad6520
Replace errors.Cause() with errors.Is()
Dependencies may be switching to use the new `%w` formatting
option to wrap errors; switching to use `errors.Is()` makes
sure that we are still able to unwrap the error and detect the
underlying cause.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-08 14:36:45 +02:00
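
The motivation given in commit dc92ad6520 can be shown with the standard library alone. In this minimal sketch, `errNotFound` and `find` are hypothetical names; the point is that an error wrapped with `%w` no longer compares equal to the sentinel, while `errors.Is()` still finds it by following the `Unwrap` chain.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error used for illustration.
var errNotFound = errors.New("not found")

// find is a hypothetical function that wraps the sentinel with %w,
// as a dependency might start doing.
func find(key string) error {
	return fmt.Errorf("find %q: %w", key, errNotFound)
}

// isNotFound detects the underlying cause through any number of wraps.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	err := find("layer")
	fmt.Println("direct comparison:", err == errNotFound)
	fmt.Println("errors.Is:", isNotFound(err))
}
```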
Eric Ren
63b7587cd6 snapshots/devmapper: fix race window causing IO hangup
The issue below happened several times before the root
cause was found:

  1. A `fdisk -l` process has been hung for a long time;
  2. An image layer snapshot device is visible to dmsetup, which
       should *not* happen because it should be deactivated after
       `Commit()`;

The backtrace of `fdisk` is always the same over time:

```
[<ffffffff810bbc6a>] io_schedule+0x2a/0x80
[<ffffffff81295a3f>] do_blockdev_direct_IO+0x1e9f/0x2f10
[<ffffffff81296aea>] __blockdev_direct_IO+0x3a/0x40
[<ffffffff81290e43>] blkdev_direct_IO+0x43/0x50
[<ffffffff811b8a14>] generic_file_read_iter+0x374/0x960
[<ffffffff81291ad5>] blkdev_read_iter+0x35/0x40
[<ffffffff8125229b>] new_sync_read+0xfb/0x240
[<ffffffff81252406>] __vfs_read+0x26/0x40
[<ffffffff81252b96>] vfs_read+0x96/0x130
[<ffffffff812540e5>] SyS_read+0x55/0xc0
[<ffffffff81003c04>] do_syscall_64+0x74/0x180
```

The root cause is that, in Commit(), there's a race window between
`SuspendDevice()` and `DeactivateDevice()`, which may cause the
IOs of a process or command like `fdisk` on the "suspended" device
to hang forever. The cause is twofold:

  1. The IOs are suspended on the device;
  2. The device stays in the `Suspended` state, because it's deactivated with
     the `deferred` flag and without the `force` flag;

So they cannot make progress.

One reproducer is:
 1. enlarge the race window by inserting a sleep of a few seconds there;
 2. run `while true; do sudo fdisk -l; sleep 0.5; done` in one terminal;
 3. and pull an image in another terminal;

Fix it by:
 1. Resuming the device again after flushing IO by suspend;
 2. Removing the device without the `deferred` flag;
Fix: #4234
Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
2020-05-07 07:46:45 +08:00
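
The ordering described in the fix for commit 63b7587cd6 (suspend to flush IO, resume, then deactivate without `deferred`) can be sketched as below. Everything here is hypothetical: the `pool` interface, its method signatures, and the `recorder` type are illustrative assumptions that only borrow the names from the message, and no real device mapper call is made.

```go
package main

import "fmt"

// pool is a hypothetical interface; the method names follow the commit
// message, but the signatures are assumptions made for this sketch.
type pool interface {
	SuspendDevice(name string) error
	ResumeDevice(name string) error
	DeactivateDevice(name string, deferred bool) error
}

// recorder is a fake pool that only records the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) SuspendDevice(name string) error {
	r.calls = append(r.calls, "suspend "+name)
	return nil
}

func (r *recorder) ResumeDevice(name string) error {
	r.calls = append(r.calls, "resume "+name)
	return nil
}

func (r *recorder) DeactivateDevice(name string, deferred bool) error {
	r.calls = append(r.calls, fmt.Sprintf("deactivate %s deferred=%t", name, deferred))
	return nil
}

// commitDevice applies the ordering from the fix: suspend to flush IO,
// resume so pending IO can make progress, then deactivate immediately
// rather than in deferred mode.
func commitDevice(p pool, name string) error {
	if err := p.SuspendDevice(name); err != nil {
		return err
	}
	if err := p.ResumeDevice(name); err != nil {
		return err
	}
	return p.DeactivateDevice(name, false)
}

func main() {
	r := &recorder{}
	if err := commitDevice(r, "snap-1"); err != nil {
		fmt.Println("commit failed:", err)
		return
	}
	for _, c := range r.calls {
		fmt.Println(c)
	}
}
```

Resuming before deactivation means the device is never left suspended while another process holds it open, which is the state the commit identifies as the cause of the hang.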