If mkfs on a device mapper thin pool device fails, the error now
includes the pool status as returned by dmsetup, for enhanced error
reporting.
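A minimal sketch of the idea, shelling out to `mkfs.ext4` and
`dmsetup status` for illustration only (the real snapshotter uses its
own dmsetup wrappers; the device and pool names below are
placeholders):
```go
package main

import (
	"fmt"
	"os/exec"
)

// formatDevice runs mkfs and, on failure, folds the thin pool status
// reported by dmsetup into the returned error.
func formatDevice(devicePath, poolName string) error {
	out, err := exec.Command("mkfs.ext4", devicePath).CombinedOutput()
	if err == nil {
		return nil
	}
	status, _ := exec.Command("dmsetup", "status", poolName).CombinedOutput()
	return fmt.Errorf("mkfs on %s failed: %w: %s\npool status: %s",
		devicePath, err, out, status)
}

func main() {
	if err := formatDevice("/dev/mapper/pool-snap-1", "pool"); err != nil {
		fmt.Println(err)
	}
}
```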
Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
Before this change, the error on the caller side (e.g. ctr) was
something like
> unpack: failed to prepare extraction snapshot "...": exit status 5:
> unknown
which was too cryptic.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
When an error is returned here, unlike the other error returns in this
function, nothing is done to mark the added device as faulty or to
remove it.
I have observed this causing future snapshot creations to keep
attempting to use the same ID (from the sequence) for new devices and
get blocked, because the device already exists and was never rolled
back here.
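A rough sketch of the missing rollback, using stand-in functions rather
than the real PoolDevice API:
```go
package main

import (
	"errors"
	"fmt"
)

// createDevice registers a new thin device; if persisting it fails, the
// device is removed (best effort) so its ID is not left allocated in the
// pool, which would otherwise block later snapshot creations that pick
// the same ID.
func createDevice(create, save, remove func() error) error {
	if err := create(); err != nil {
		return err
	}
	if err := save(); err != nil {
		if rerr := remove(); rerr != nil {
			return fmt.Errorf("%w (rollback also failed: %v)", err, rerr)
		}
		return err
	}
	return nil
}

func main() {
	err := createDevice(
		func() error { return nil },                                  // create thin device
		func() error { return errors.New("failed to save device") }, // persist metadata
		func() error { return nil },                                  // rollback
	)
	fmt.Println(err)
}
```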
Hopefully fixes #5110
Signed-off-by: Jeremy Williams <ctrlaltdel121@gmail.com>
This test has been flaky in GitHub Actions. This change logs the
values from devmapper to further investigate the issue.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
No need to use the private losetup command line wrapper package.
The generic package provides the same functionality.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
The rollback mechanism is implemented by calling deleteDevice() and
RemoveDevice(), but RemoveDevice() internally calls deleteDevice() as
well.
Since the device has already been deleted by the first deleteDevice()
call, RemoveDevice() will always see ENODATA. That specific error must
be ignored so the device's metadata can still be removed correctly.
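A minimal sketch, assuming golang.org/x/sys/unix for the errno constant
(the helper name is illustrative, not the actual containerd API):
```go
package main

import (
	"errors"
	"fmt"

	"golang.org/x/sys/unix"
)

// ignoreAlreadyDeleted treats ENODATA as success, so RemoveDevice can
// still remove the device's metadata after deleteDevice() has already
// deleted the thin device from the pool.
func ignoreAlreadyDeleted(err error) error {
	if err != nil && !errors.Is(err, unix.ENODATA) {
		return err
	}
	return nil
}

func main() {
	// Simulate the second deletion hitting ENODATA because the device is gone.
	err := fmt.Errorf("dmsetup message failed: %w", unix.ENODATA)
	fmt.Println(ignoreAlreadyDeleted(err)) // <nil>
}
```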
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
Dependencies may be switching to the new `%w` formatting verb to wrap
errors; switching to `errors.Is()` makes sure that we are still able
to unwrap the error and detect the underlying cause.
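A small illustration of the difference (the sentinel error is made up
for the example):
```go
package main

import (
	"errors"
	"fmt"
)

var errDeviceBusy = errors.New("device busy")

func removeDevice() error {
	// A dependency that adopts %w returns a wrapped error instead of the
	// sentinel itself.
	return fmt.Errorf("dmsetup remove failed: %w", errDeviceBusy)
}

func main() {
	err := removeDevice()
	fmt.Println(err == errDeviceBusy)          // false: == breaks once the error is wrapped
	fmt.Println(errors.Is(err, errDeviceBusy)) // true: errors.Is unwraps the chain
}
```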
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
The issue below happened several times before the root cause was
found:
1. A `fdisk -l` process has been hung for a long time;
2. An image layer snapshot device is visible to dmsetup, which
should *not* happen because it should be deactivated after
`Commit()`.
The backtrace of `fdisk` is always the same over time:
```bash
[<ffffffff810bbc6a>] io_schedule+0x2a/0x80
[<ffffffff81295a3f>] do_blockdev_direct_IO+0x1e9f/0x2f10
[<ffffffff81296aea>] __blockdev_direct_IO+0x3a/0x40
[<ffffffff81290e43>] blkdev_direct_IO+0x43/0x50
[<ffffffff811b8a14>] generic_file_read_iter+0x374/0x960
[<ffffffff81291ad5>] blkdev_read_iter+0x35/0x40
[<ffffffff8125229b>] new_sync_read+0xfb/0x240
[<ffffffff81252406>] __vfs_read+0x26/0x40
[<ffffffff81252b96>] vfs_read+0x96/0x130
[<ffffffff812540e5>] SyS_read+0x55/0xc0
[<ffffffff81003c04>] do_syscall_64+0x74/0x180
```
The root cause is that, in Commit(), there is a race window between
`SuspendDevice()` and `DeactivateDevice()`, which may cause the IOs of
a process or command like `fdisk` on the "suspended" device to hang
forever. The problem is twofold:
1. The IOs are suspended on the device;
2. The device remains in the `Suspended` state, because it is
deactivated with the `deferred` flag and without the `force` flag;
So they can never make progress.
One reproducer is:
1. enlarge the race window by inserting a few seconds of sleep there;
2. run `while true; do sudo fdisk -l; sleep 0.5; done` on one terminal;
3. pull an image on another terminal;
Fix it by (see the sketch below):
1. Resuming the device again after flushing IO via suspend;
2. Removing the device without the `deferred` flag.
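A simplified sketch of the fixed ordering, shelling out to dmsetup for
illustration (the real code uses containerd's own dmsetup helpers, and
details are trimmed):
```go
package main

import (
	"fmt"
	"os/exec"
)

func dmsetup(args ...string) error {
	out, err := exec.Command("dmsetup", args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("dmsetup %v: %w: %s", args, err, out)
	}
	return nil
}

// commitDevice suspends the thin device to flush IO, persists the
// snapshot while IO is quiesced, resumes it so blocked readers (e.g.
// `fdisk -l`) can finish, and finally removes it synchronously, i.e.
// without --deferred, so it cannot linger in a suspended state.
func commitDevice(name string) error {
	if err := dmsetup("suspend", name); err != nil {
		return err
	}
	// ... take the snapshot / save metadata here ...
	if err := dmsetup("resume", name); err != nil {
		return err
	}
	return dmsetup("remove", name)
}

func main() {
	_ = commitDevice // invoked from Commit() in the real snapshotter
}
```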
Fix: #4234
Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
Snapshot GC makes use of the pruneBranch() function to remove
snapshots, but GC will stop if snapshotter.Remove() returns an error
other than ErrFailedPrecondition. This can leave thousands of dm
snapshots undeleted when a single snapshot fails to be removed due to
an error like "contains a filesystem in use".
So return ErrFailedPrecondition from Remove() where appropriate, and
let the GC process go on collecting other snapshots.
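A hedged sketch of the idea, assuming containerd's errdefs package
(this is not the exact diff):
```go
package main

import (
	"errors"
	"fmt"

	"github.com/containerd/containerd/errdefs"
)

// remove wraps "still in use"-style failures in ErrFailedPrecondition so
// the snapshot GC treats them as "skip this one" instead of aborting the
// whole pass.
func remove(name string, deactivate func() error) error {
	if err := deactivate(); err != nil {
		return fmt.Errorf("failed to remove snapshot %q: %v: %w",
			name, err, errdefs.ErrFailedPrecondition)
	}
	return nil
}

func main() {
	err := remove("snap-1", func() error {
		return errors.New("contains a filesystem in use")
	})
	// This is effectively what the GC checks before moving on.
	fmt.Println(errors.Is(err, errdefs.ErrFailedPrecondition)) // true
}
```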
Fix: #3923
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
Signed-off-by: Eric Ren <renzhen.rz@linux.alibaba.com>
base_image_size is effectively the limit on the size of a layer that
can be created using the devmapper snapshotter. While this will also
depend on the thin pool size itself, something closer to the total
image size (80%?) is more appropriate.
As-is, if you try to run an image like elastic, you'll need a much
larger base_image_size than 128MB.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
- reproducer
1. stop a container;
2. reboot, or dmsetup remove its corresponding dm device;
3. start the container; it will fail like:
"""
Error: failed to start containers: {"message":"failed to create container(4f33d2760760c41518a84821153ccdf7f80980b797b783cdd75178fc6ca0bf4b) on containerd: failed to create task for container(4f33d2760760c41518a84821153ccdf7f80980b797b783cdd75178fc6ca0bf4b): failed to mount rootfs component &{ext4 /dev/mapper/vg0-mythinpool-snap-2 []}: no such file or directory: unknown"}
"""
- how the fix works
activate the dm device if necessary and emit a warning message:
"""
time="2019-08-21T22:44:08.422695797+08:00" level=warning msg="devmapper device \"vg0-mythinpool-snap-2\" marked as \"Activated\" but not active, activating it"
"""
Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>
1. reason to deactivate a committed snapshot
The thin device will not be used for IO after it is committed, and
further thin snapshotting works fine using an inactive thin device as
origin. The benefits of deactivating are:
- the device is not unnecessarily visible, avoiding any unexpected IO;
- no kernel data structures are wasted on maintaining an active dm
device.
Quote from kernel doc (Documentation/device-mapper/thin-provisioning.txt):
"
ii) Using an internal snapshot.
Once created, the user doesn't have to worry about any connection
between the origin and the snapshot. Indeed the snapshot is no
different from any other thinly-provisioned device and can be
snapshotted itself via the same method. It's perfectly legal to
have only one of them active, and there's no ordering requirement on
activating or removing them both. (This differs from conventional
device-mapper snapshots.)
"
2. a thin pool metadata bug goes away naturally
A problem occurs when suspending/resuming the origin thin device fails
while creating a snapshot:
"failed to create snapshot device from parent vg0-mythinpool-snap-3"
error="failed to save initial metadata for snapshot "vg0-mythinpool-snap-19":
object already exists"
This issue occurs because, when the snapshot creation fails, the
snapshotter.store can be rolled back, but the thin pool metadata
boltdb fails to roll back in PoolDevice.CreateSnapshotDevice(), so the
metadata becomes inconsistent: the snapshot ID is not taken in
snapshotter.store, but it is saved in the pool metadata boltdb.
The cause is that, in PoolDevice.CreateSnapshotDevice(), the defer
calls are invoked in "first-in-last-out" order. When the error happens
in the "resume device" defer call, the metadata has already been saved
and the snapshot created, with no chance to roll them back (a small
illustration of the defer ordering follows).
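A tiny, self-contained illustration of the defer ordering at play (the
labels map only loosely onto CreateSnapshotDevice; this is not the real
code):
```go
package main

import "fmt"

func main() {
	// Deferred calls run in first-in-last-out order, so the defer
	// registered first runs last. If it is the one that fails, the
	// rollback logic registered after it has already run without seeing
	// any error, and the saved pool metadata is never undone.
	defer fmt.Println("runs LAST:  resume origin device (an error here is too late to act on)")
	defer fmt.Println("runs FIRST: rollback pool metadata on error (sees no error at this point)")
	fmt.Println("create snapshot device, save pool metadata")
}
```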
Signed-off-by: Eric Ren <renzhen@linux.alibaba.com>