Commit Graph

102 Commits

Author SHA1 Message Date
Amit Barve
daa1ea522b Add cimfs differ and snapshotter
Details about CimFs project are discussed in #8346

Signed-off-by: Amit Barve <ambarve@microsoft.com>
2023-12-20 09:29:08 -08:00
Wei Fu
23278c81fb *: introduce image_pull_with_sync_fs in CRI
It's to ensure the data integrity during unexpected power failure.

Background:

Since release 1.3, in Linux system, containerD unpacks and writes files into
overlayfs snapshot directly. It doesn’t involve any mount-umount operations
so that the performance of pulling image has been improved.

As we know, the umount syscall for overlayfs will force kernel to flush
all the dirty pages into disk. Without umount syscall, the files’ data relies
on kernel’s writeback threads or filesystem's commit setting (for
instance, ext4 filesystem).

The files in committed snapshot can be loss after unexpected power failure.
However, the snapshot has been committed and the metadata also has been
fsynced. There is data inconsistency between snapshot metadata and files
in that snapshot.

We, containerd, received several issues about data loss after unexpected
power failure.

* https://github.com/containerd/containerd/issues/5854
* https://github.com/containerd/containerd/issues/3369#issuecomment-1787334907

Solution:

* Option 1: SyncFs after unpack

Linux platform provides [syncfs][syncfs] syscall to synchronize just the
filesystem containing a given file.

* Option 2: Fsync directories recursively and fsync on regular file

The fsync doesn't support symlink/block device/char device files. We
need to use fsync the parent directory to ensure that entry is
persisted.

However, based on [xfstest-dev][xfstest-dev], there is no case to ensure
fsync-on-parent can persist the special file's metadata, for example,
uid/gid, access mode.

Checkout [generic/690][generic/690]: Syncing parent dir can persist
symlink. But for f2fs, it needs special mount option. And it doesn't say
that uid/gid can be persisted. All the details are behind the
implemetation.

> NOTE: All the related test cases has `_flakey_drop_and_remount` in
[xfstest-dev].

Based on discussion about [Documenting the crash-recovery guarantees of Linux file systems][kernel-crash-recovery-data-integrity],
we can't rely on Fsync-on-parent.

* Option 1 is winner

This patch is using option 1.

There is test result based on [test-tool][test-tool].
All the networking traffic created by pull is local.

  * Image: docker.io/library/golang:1.19.4 (992 MiB)
    * Current: 5.446738579s
      * WIOS=21081, WBytes=1329741824, RIOS=79, RBytes=1197056
    * Option 1: 6.239686088s
      * WIOS=34804, WBytes=1454845952, RIOS=79, RBytes=1197056
    * Option 2: 1m30.510934813s
      * WIOS=42143, WBytes=1471397888, RIOS=82, RBytes=1209344

  * Image: docker.io/tensorflow/tensorflow:latest (1.78 GiB, ~32590 Inodes)
    * Current: 8.852718042s
      * WIOS=39417, WBytes=2412818432, RIOS=2673, RBytes=335987712
    * Option 1: 9.683387174s
      * WIOS=42767, WBytes=2431750144, RIOS=89, RBytes=1238016
    * Option 2: 1m54.302103719s
      * WIOS=54403, WBytes=2460528640, RIOS=1709, RBytes=208237568

The Option 1 will increase `wios`. So, the `image_pull_with_sync_fs` is
option in CRI plugin.

[syncfs]: <https://man7.org/linux/man-pages/man2/syncfs.2.html>
[xfstest-dev]: <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git>
[generic/690]: <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/tree/tests/generic/690?h=v2023.11.19>
[kernel-crash-recovery-data-integrity]: <https://lore.kernel.org/linux-fsdevel/1552418820-18102-1-git-send-email-jaya@cs.utexas.edu/>
[test-tool]: <a17fb2010d/contrib/syncfs/containerd/main_test.go (L51)>

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-12-12 10:18:39 +08:00
Sebastiaan van Stijn
2af6db672e
switch back from golang.org/x/sys/execabs to os/exec (go1.19)
This is effectively a revert of 2ac9968401, which
switched from os/exec to the golang.org/x/sys/execabs package to mitigate
security issues (mainly on Windows) with lookups resolving to binaries in the
current directory.

from the go1.19 release notes https://go.dev/doc/go1.19#os-exec-path

> ## PATH lookups
>
> Command and LookPath no longer allow results from a PATH search to be found
> relative to the current directory. This removes a common source of security
> problems but may also break existing programs that depend on using, say,
> exec.Command("prog") to run a binary named prog (or, on Windows, prog.exe) in
> the current directory. See the os/exec package documentation for information
> about how best to update such programs.
>
> On Windows, Command and LookPath now respect the NoDefaultCurrentDirectoryInExePath
> environment variable, making it possible to disable the default implicit search
> of “.” in PATH lookups on Windows systems.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-11-02 21:15:40 +01:00
Derek McGowan
9db21401c4
Switch to github.com/containerd/plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-11-01 23:01:42 -07:00
Derek McGowan
5fdf55e493
Update go module to github.com/containerd/containerd/v2
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-29 20:52:21 -07:00
Derek McGowan
7b2a918213
Generalize the plugin package
Remove containerd specific parts of the plugin package to prepare its
move out of the main repository. Separate the plugin registration
singleton into a separate package.

Separating out the plugin package and registration makes it easier to
implement external plugins without creating a dependency loop.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-12 21:22:32 -07:00
Derek McGowan
a80606bc2d
Move plugin type definitions to containerd plugins package
The plugins packages defines the plugins used by containerd.
Move all the types and properties to this package.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-10-12 20:52:56 -07:00
Derek McGowan
508aa3a1ef
Move to use github.com/containerd/log
Add github.com/containerd/log to go.mod

Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-09-22 07:53:23 -07:00
Akihiro Suda
e939d13195
Revert "Revert 416899fc8e81a80a4b09b59c801f98d36ddc0e74"
This reverts commit 6c9c711120.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-07-19 00:22:05 +09:00
Marat Radchenko
6c9c711120 Revert 416899fc8e
That commit neither helps without a working bind-mount implementation nor is needed when such implementation exists.

Testing shows that containerd can properly download and unpack image using bindfs mounts (see previous commit) even without Darwin-specific applier code.

Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
2023-07-17 23:27:04 +03:00
Kazuyoshi Kato
099d2e7c76
Merge pull request #8757 from dcantah/proto-api-conversions
Add From/ToProto helpers
2023-07-03 10:59:08 -07:00
Akihiro Suda
5dedb6d0d2
archive: use 1970-01-01 as the whiteout timestamp
The whiteout timestamps are no longer set to the source date epoch.
The source date epoch still applies to non-whiteout files.

Discussion happened in moby/buildkit PR 3560.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-06-30 11:30:01 +09:00
Danny Canter
0a6b8f0ee0 OCI: Add From/ToProto helpers for Descriptor
Helpers to convert from the OCI image specs [Descriptor] to its protobuf
structure for Descriptor and vice-versa appear three times. It seems sane
to just expose this facility in /oci.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-28 12:16:20 -07:00
Danny Canter
55a8102ec1 mount: Add From/ToProto helpers
Helpers to convert from containerd's [Mount] to its protobuf structure for
[Mount] and vice-versa appear three times. It seems sane to just expose
this facility in /mount.

Signed-off-by: Danny Canter <danny@dcantah.dev>
2023-06-28 04:03:18 -07:00
Maksym Pavlenko
6f34da5f80 Cleanup logrus imports
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-05 11:54:14 -07:00
Derek McGowan
0d975230e1
Fix panic when remote differ returns empty result
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-04-29 22:55:21 -07:00
Derek McGowan
3784c1c917
Add proxy differ
Signed-off-by: Derek McGowan <derek@mcg.dev>
2023-04-15 22:37:23 -07:00
Gabriel Adrian Samfira
6a5b4c9c24 Remove "bind" code path from diff
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-04-03 08:11:35 -07:00
Gabriel Adrian Samfira
7f82dd91f4 Add ReadOnly() function
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-04-01 08:43:14 -07:00
Paul "TBBle" Hampson
34b07d3e2d Implement WCOW parentless active snapshots and view snapshots
The WCOW layer support does not support creating sandboxes with no
parent.  Instead, parentless scratch layers must be layed out as a
directory containing only a directory named 'Files', and all data stored
inside 'Files'. At commit-time, this will be converted in-place into a
read-only layer suitable for use as a parent layer.

The WCOW layer support also does not deal with making read-only layers,
i.e. layers that are prepared to be parent layers, visible in a
read-only manner. A bind-mount or junction point cannot be made
read-only, so a view must instead be a small sandbox layer that we can
mount via WCOW, and discard later, to protect the layer against
accidental or deliberate modification.

Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
2023-03-31 06:15:17 -07:00
Derek McGowan
e2cb6b82d1
Merge pull request #8259 from laurazard/readonly-overlay
Add `ReadonlyMounts` to make overlay mounts readonly
2023-03-17 22:34:38 -07:00
Laura Brehm
daa3a7665e
Add WithReadonlyTempMount to create readonly temporary mounts
This is necessary so we can mount snapshots more than once with overlayfs,
otherwise mounts enter an unknown state.

related: https://github.com/moby/buildkit/pull/1100

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
Co-authored-by: Zou Nengren <zouyee1989@gmail.com>
2023-03-17 15:51:18 +00:00
Akihiro Suda
86fc1ccab4
Remove aufs snapshotter (deprecated since v1.5)
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-03-14 14:37:13 +09:00
Fu Wei
8cb00f45c9
Merge pull request #8143 from mxpv/log
Add Fields type alias to log package
2023-02-21 10:22:23 +08:00
Maksym Pavlenko
06e085c8b5 Add Fields type alias to log package
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-02-20 17:29:08 -08:00
Akihiro Suda
d8b68e3ccc
Stop using math/rand.Read and rand.Seed (deprecated in Go 1.20)
From golangci-lint:

> SA1019: rand.Read has been deprecated since Go 1.20 because it
>shouldn't be used: For almost all use cases, crypto/rand.Read is more
>appropriate. (staticcheck)

> SA1019: rand.Seed has been deprecated since Go 1.20 and an alternative
>has been available since Go 1.0: Programs that call Seed and then expect
>a specific sequence of results from the global random source (using
>functions such as Int) can be broken when a dependency changes how
>much it consumes from the global random source. To avoid such breakages,
>programs that need a specific result sequence should use
>NewRand(NewSource(seed)) to obtain a random generator that other
>packages cannot access. (staticcheck)

See also:

- https://pkg.go.dev/math/rand@go1.20#Read
- https://pkg.go.dev/math/rand@go1.20#Seed

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-16 03:50:23 +09:00
Akihiro Suda
b61988670c
go.mod: github.com/containerd/typeurl/v2 v2.1.0
Changes: https://github.com/containerd/typeurl/compare/7f6e6d160d67...v2.1.0

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-11 23:39:52 +09:00
Akihiro Suda
4adf3fb3af
Merge pull request #7906 from Iceber/use_label_uncompressed
Use the const labels.LabelUncompressed
2023-01-04 01:04:20 +09:00
Iceber Gu
778e8f2af4 Use the const labels.LabelUncompressed
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2023-01-03 18:29:21 +08:00
Wei Fu
6b7e237fc7 chore: use go fix to cleanup old +build buildtag
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-12-29 14:25:14 +08:00
Akihiro Suda
dc48349248
epoch: propagate SOURCE_DATE_EPOCH via ctx
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-12-12 09:02:35 +09:00
Maksym Pavlenko
3bc8fc4d30 Cleanup build constraints
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-12-08 09:36:20 -08:00
Phil Estes
cef99bea26
Merge pull request #7478 from AkihiroSuda/archive-whiteout-source-date-epoch
archive: add WithSourceDateEpoch() for whiteouts
2022-10-10 15:45:25 -07:00
Akihiro Suda
39158629f7
diff/apply.readCounter: check negative size
`rc.r.Read()` may return a negative `int` on an error
when the reader is set to a custom content store implementation

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-10-08 10:19:14 +09:00
Akihiro Suda
70fbedc217
archive: add WithSourceDateEpoch() for whiteouts
This makes diff archives to be reproducible.

The value is expected to be passed from CLI applications via the $SOUCE_DATE_EPOCH env var.

See https://reproducible-builds.org/docs/source-date-epoch/
for the $SOURCE_DATE_EPOCH specification.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-10-08 08:45:03 +09:00
Maksym Pavlenko
0f51aa874d Add NoSameOwner option when unpacking tars
When unpacking a TAR archive, containerd preserves file's owner:
https://github.com/containerd/containerd/blob/main/archive/tar.go#L384

In some cases this behavior is not desired. In current implementation we
avoid `Lchown` on Windows. Another case when this should be skipped is
when using native snapshotter on darwin and running as non-root user.

This PR extracts a generic option - `WithNoSameOwner` (same as
`tar --no-same-owner`) to skip `Lchown` when its not required.

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-09-09 17:07:26 -07:00
Daniel Canter
d2588b3fa4 LCOW differ return ErrNotImplemented for wrong mount type
On Windows the two differs we register by default are the "windows" and
"windows-lcow" differs. The diff service checks if Apply returns
ErrNotImplemented and will move on to the next differ in the line.
The Windows differ makes use of this to fallback to LCOW if it's
determined the mount type passed is incorrect, but the LCOW differ
does not return ErrNotImplemented for the same scenario. This puts
a strict ordering requirement on the default differ entries in the config,
namely that ["windows", "windows-lcow"] will work, as windows will correctly
fall back to the lcow differ, but ["windows-lcow", "windows"] won't as
the diff services Apply will just return the error directly.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-06-28 17:19:04 -07:00
Hamza El-Saawy
ad8b87ba23 Add Wait to binaryProcessor
Add exported `Wait(ctx context.Context) error` interface that waits on
the underlying command (or context cancellation) and returns the error.

This fixes a race condition between `.wait()` and `.Err error`:
https://github.com/containerd/containerd/issues/6914

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
2022-05-09 17:15:00 -04:00
Kazuyoshi Kato
dfa6e8763e diff: hide types.Any from clients
This commit hides types.Any from the diff package's interface. Clients
(incl. imgcrypt) shouldn't aware about gogo/protobuf.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-21 13:43:20 +00:00
Kazuyoshi Kato
88c0c7201e Consolidate gogo/protobuf dependencies under our own protobuf package
This would make gogo/protobuf migration easier.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 15:53:36 +00:00
haoyun
bbe46b8c43 feat: replace github.com/pkg/errors to errors
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: zounengren <zouyee1989@gmail.com>
2022-01-07 10:27:03 +08:00
Maksym Pavlenko
416899fc8e Allow native snapshotter on Darwin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-12-04 14:08:09 -08:00
zounengren
3a713811be run gofmt with Go 1.17
Signed-off-by: Zou Nengren <zouyee1989@gmail.com>
2021-10-07 20:16:59 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Sebastiaan van Stijn
2ac9968401
replace uses of os/exec with golang.org/x/sys/execabs
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.

This issue is not limited to the go command itself, and can also affect binaries
that use `os.Command`, `os.LookPath`, etc.

From the related blogpost (ttps://blog.golang.org/path-security):

> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing

This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2021-08-25 18:11:09 +02:00
Akihiro Suda
d3aa7ee9f0
Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Derek McGowan
6f027e38a8
Remove redundant build tags
Remove build tags which are already implied by the name of the file.
Ensures build tags are used consistently

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:27:46 -07:00
ktock
b483177ee2 Support custom compressor for walking differ
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-07-20 11:00:01 +09:00
Derek McGowan
69f43d4589
Revert diff/walking error change
Clarify error scope and create variable for deferring cleanup

Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-06-04 08:44:24 -07:00
Iceber Gu
558fdc6808 diff/walking: fix defer cleanup
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2021-06-01 18:24:47 +08:00