Commit Graph

10422 Commits

Author SHA1 Message Date
Derek McGowan
ffddd4446c
Merge pull request #6761 from kzys/bbolt-freelist
Disable writing freelist to make the file robust against data corruptions
2022-04-05 21:36:49 -07:00
Derek McGowan
d351162178
Merge pull request #6777 from AkihiroSuda/docs-move-design-historical
mv design docs/historical/design
2022-04-05 21:29:52 -07:00
Derek McGowan
a5b0d3b3af
Merge pull request #6702 from fuweid/RFC-enhance-pull-performance
CRI: improve image pulling performance
2022-04-05 21:28:34 -07:00
Wei Fu
8113758568 CRI: improve image pulling performance
Background:

With the current design, the content backend uses a key-lock for long-lived
write transactions. If a content reference has been marked for a write
transaction, other requests on the same reference fail fast with an
unavailable error. Since the metadata plugin is based on boltdb, which
only supports a single writer, the content backend can't block or hold
the request for too long. It requires the client to handle retries itself,
for example with the OpenWriter backoff retry helper. But the maximum retry
interval can be up to 2 seconds. If there are several concurrent requests for
the same image, the waiters may wake up at the same time, yet only one
waiter can continue. Many waiters go back to sleep, so it takes a long time
to finish all the pulling jobs, and it gets worse if the image has many
layers, as mentioned in issue #4937.

After fetching, the containerd.Pull API allows several handlers to commit
a snapshot with the same ChainID, but only one can succeed. Since
unpacking tar.gz is a time-consuming job, unpacking the same ChainID
in parallel hurts performance.

For instance, Request 2 doesn't need to prepare and commit; it
should just wait for Request 1 to finish, as mentioned in pull
request #6318.

```text
	Request 1	Request 2

	Prepare
	   |
	   |
	   |
	   |		Prepare
	Commit		   |
			   |
			   |
			   |
			Commit(failed on exist)
```

Both the content backoff retry and the unnecessary unpack hurt performance.

Solution:

Introduce duplicate suppression in the fetch and unpack context. The
duplicate suppression uses a key-mutex and single-waiter-notify to support
singleflight. The caller can use the duplicate suppression in different
PullImage handlers so that we can avoid the unnecessary unpack and the
spin-lock in OpenWriter.
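
The idea can be illustrated with a minimal, standard-library-only sketch of
keyed duplicate suppression. This is not containerd's implementation: the
names `Group`, `Do`, `waiting` and `simulate`, and the `chain-id` key, are
hypothetical and exist only for this example. The first caller for a key does
the work; concurrent callers for the same key block on a channel and reuse
the result instead of retrying with backoff.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight unit of work for a key.
type call struct {
	done    chan struct{} // closed when the leader finishes
	err     error         // result shared with every waiter
	waiters int           // callers currently blocked on done
}

// Group is a hypothetical keyed duplicate suppressor.
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn at most once among concurrent callers that use the same key.
// shared is true when the caller reused another caller's result.
func (g *Group) Do(key string, fn func() error) (shared bool, err error) {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		<-c.done // wait for the leader instead of retrying with backoff
		return true, c.err
	}
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.err = fn()

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	close(c.done) // wake every waiter at once
	return false, c.err
}

// waiting reports how many callers are blocked on key (demo helper only).
func (g *Group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// simulate issues n concurrent requests for one key and returns how many
// times the work ran and how many callers reused the shared result.
func simulate(n int) (runs, shared int64) {
	var g Group
	var wg sync.WaitGroup
	started := make(chan struct{})
	release := make(chan struct{})
	const key = "chain-id" // placeholder key

	request := func(work func() error) {
		defer wg.Done()
		if reused, _ := g.Do(key, work); reused {
			atomic.AddInt64(&shared, 1)
		}
	}

	wg.Add(1)
	go request(func() error { // leader: holds the key until released
		atomic.AddInt64(&runs, 1)
		close(started)
		<-release
		return nil
	})
	<-started

	for i := 1; i < n; i++ {
		wg.Add(1)
		go request(func() error { // skipped while the leader is in flight
			atomic.AddInt64(&runs, 1)
			return nil
		})
	}
	for g.waiting(key) < n-1 { // let every follower join before releasing
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return runs, shared
}

func main() {
	runs, shared := simulate(20)
	fmt.Printf("work ran %d time(s); %d caller(s) reused the result\n", runs, shared)
}
```

In this sketch the followers never execute their own work function, which
mirrors Request 2 waiting for Request 1 rather than preparing and committing
on its own.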

Test Result:

Before enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	1m6.172s
user	0m0.268s
sys	0m0.193s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.324s
user	0m0.441s
sys	0m0.316s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	1m47.657s
user	0m0.284s
sys	0m0.224s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.381s
user	0m0.488s
sys	0m0.358s
```

With this enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	0m1.140s
user	0m0.243s
sys	0m0.178s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.239s
user	0m0.463s
sys	0m0.275s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	0m5.546s
user	0m0.217s
sys	0m0.219s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.090s
user	0m0.501s
sys	0m0.331s
```

Test Script:

localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The image is held in a local registry
service started by `docker run -d -p 5000:5000 --name registry registry:2`.

```bash

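# Usage: sudo bash testing.sh <image> [pull_times]
image_name="${1}"       # image reference to pull
pull_times="${2:-10}"   # number of concurrent pulls (default: 10)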

cleanup() {
  ctr image rmi "${image_name}"
  ctr -n k8s.io image rmi "${image_name}"
  crictl rmi "${image_name}"
  docker rmi "${image_name}"
  sleep 2
}

crictl_testing() {
  for idx in $(seq 1 ${pull_times}); do
    crictl pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

docker_testing() {
  for idx in $(seq 1 ${pull_times}); do
    docker pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

cleanup > /dev/null 2>&1

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```

Fixes: #4937
Close: #4985
Close: #6318

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-06 07:14:18 +08:00
Kazuyoshi Kato
83f44ddab5
Merge pull request #6776 from AkihiroSuda/docs-remove-runtime-v1
docs: remove runtime v1; migrate config v1 to v2
2022-04-05 11:13:50 -07:00
Kazuyoshi Kato
626608e272
Merge pull request #6779 from gabriel-samfira/skip-flaky-test
Skip flaky test on Windows
2022-04-05 11:12:02 -07:00
Maksym Pavlenko
acdbf05adc
Merge pull request #6775 from AkihiroSuda/docs-typo-20220405
docs/getting-started.md: typo
2022-04-05 09:36:00 -07:00
Gabriel Adrian Samfira
16fbbaeeea
Skip flaky test on Windows
The tty test fails on ltsc2022. Disable that test until we manage to
reproduce and fix it.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-04-05 16:43:48 +03:00
Akihiro Suda
44d7cd1528
mv design docs/historical/design
The docs have been out of sync with the actual implementation since 2018.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-05 16:50:12 +09:00
Akihiro Suda
195fc74244
docs: migrate config v1 to v2
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-05 16:41:54 +09:00
Akihiro Suda
84cebafe8f
docs: remove deprecated io.containerd.runtime.v1.linux
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-05 16:13:42 +09:00
Akihiro Suda
83665bf8d2
docs/getting-started.md: typo
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-05 15:56:21 +09:00
Derek McGowan
e079e4a155
Merge pull request #6750 from mxpv/tracing
Add no_tracing tag
2022-04-04 09:53:22 -07:00
Kazuyoshi Kato
0f5c06bd72
Merge pull request #6754 from AkihiroSuda/move-historical-docs
Move historical docs to `docs/historical`
2022-04-04 09:48:49 -07:00
Akihiro Suda
7f7ba2b1a0
Merge pull request #6768 from gabriel-samfira/tidy-integration-modules
Run go mod tidy in integration tests
2022-04-04 18:53:14 +09:00
Akihiro Suda
ccea927d95
Move historical docs to docs/historical
To clarify that end users do not need to read these docs, and that these
docs do not need to be updated

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-04 17:48:46 +09:00
Gabriel Adrian Samfira
50921e71bb
Run go mod tidy in integration tests
make integration currently fails due to outdated go.mod.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-04-04 11:01:56 +03:00
Akihiro Suda
9f4e13973d
Merge pull request #6765 from thaJeztah/move_indirects
go.mod: move indirects, and update integration go.mod to 1.18
2022-04-03 07:09:20 +09:00
Phil Estes
aaf64c455a
Merge pull request #6762 from mxpv/testify
Drop gotest.tools
2022-04-02 17:25:32 -04:00
Sebastiaan van Stijn
99c194e033
go.mod: move indirects, and update integration go.mod to 1.18
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-04-02 12:02:13 +02:00
Maksym Pavlenko
6ccec53d3e Remove gotest.tools
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 18:18:04 -07:00
Maksym Pavlenko
871b6b6a9f Use testify
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 18:17:58 -07:00
Maksym Pavlenko
9766107a53
Merge pull request #6760 from mxpv/env
Use t.Setenv instead of os.Setenv
2022-04-01 14:37:13 -07:00
Kazuyoshi Kato
999cbc4049
Merge pull request #6709 from BooleanCat/main
Upgrade to Go 1.18
2022-04-01 14:26:01 -07:00
Kazuyoshi Kato
6da3183105 Disable writing freelist to make the file robust against data corruptions
A bbolt database has a freelist to track all pages that are available
for allocation. However writing the list takes some time and reading
the list sometimes panics.

This commit sets NoFreelistSync to true to skip writing the freelist entirely,
following what etcd does.

https://github.com/etcd-io/etcd/blob/v3.5.2/server/mvcc/backend/config_linux.go#L31

Fixes #4838.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-01 21:06:42 +00:00
Maksym Pavlenko
62c846b17b Update linters to use t.Setenv
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 13:53:42 -07:00
Maksym Pavlenko
2d59a39445 Use t.Setenv instead of os.Setenv
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-04-01 13:53:17 -07:00
Phil Estes
00d951e16a
Merge pull request #6751 from nobellium1997/arm-support-gce-configure
Adding multi-arch support for the configure.sh script
2022-04-01 15:58:30 -04:00
Nobel Barakat
4bdac2b43b Adding multi-arch support for the configure.sh script
This script is used by various tests to install and configure
containerd. However, right now it's hardcoded to only support x86
and not other architectures like arm. This change will now check the
architecture of the machine the script is running on and will pull
the correct artifacts accordingly from the correct artifact store
(GCS vs GitHub).
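
As a rough illustration of this kind of architecture check, here is a small
Go sketch (the actual change is to the configure.sh shell script, not Go).
The `source` type, the `pickSource` function, and especially the mapping of
amd64 to GCS and arm64 to GitHub are assumptions made for the example; the
commit message does not state which store serves which architecture.

```go
package main

import (
	"fmt"
	"runtime"
)

// source describes where a build would be fetched from (hypothetical type).
type source struct {
	Store string // "GCS" or "GitHub"
	Arch  string // architecture suffix used in the artifact name
}

// pickSource maps a Go architecture name to an artifact source.
// ASSUMPTION: amd64 -> GCS and arm64 -> GitHub is illustrative only; the
// commit message does not say which store serves which architecture.
func pickSource(goarch string) (source, error) {
	switch goarch {
	case "amd64":
		return source{Store: "GCS", Arch: "amd64"}, nil
	case "arm64":
		return source{Store: "GitHub", Arch: "arm64"}, nil
	default:
		// Fail loudly instead of silently falling back to x86 artifacts.
		return source{}, fmt.Errorf("unsupported architecture %q", goarch)
	}
}

func main() {
	// runtime.GOARCH reports the architecture this binary was built for.
	src, err := pickSource(runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("fetch %s artifacts from %s\n", src.Arch, src.Store)
}
```

Returning an error for unknown architectures is a deliberate choice in this
sketch: a hardcoded default is what made the original script x86-only.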

Signed-off-by: Nobel Barakat <nobelbarakat@google.com>
2022-04-01 19:10:13 +00:00
Maksym Pavlenko
1ba613e200
Merge pull request #6758 from AkihiroSuda/docs-getting-started
docs/getting-started.md: massive update
2022-04-01 11:20:28 -07:00
Derek McGowan
00a1cb4b97
Merge pull request #6755 from AkihiroSuda/remove-unmaintained-linuxkit-kubernetes
Remove unmaintained contrib/linuxkit
2022-04-01 11:09:36 -07:00
Maksym Pavlenko
549cc6890d
Merge pull request #6757 from gabriel-samfira/address-test-timeouts
[Windows CI] Address some timeout issues
2022-04-01 10:32:41 -07:00
Maksym Pavlenko
1e5b23aeb1
Merge pull request #6756 from AkihiroSuda/update-build-deps
BUILDING.md: update supported Go versions
2022-04-01 10:31:37 -07:00
Fu Wei
7a9e5ac42b
Merge pull request #6753 from AkihiroSuda/runc-1.1.1
2022-04-01 20:24:03 +08:00
Akihiro Suda
6f269ccb3c
docs/getting-started.md: massive update
The previous documentation focused too much on the Go API and was not useful
for users who are not interested in implementing their own containerd client.
It also recommended the deprecated way (cri-containerd-*.tar.gz) to install
containerd and its dependencies.

The new documentation recommends the current official way to install containerd,
and provides several links for end users.

This will replace the content of https://containerd.io/docs/getting-started/
after merging the containerd/containerd.io PR 120.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-01 20:45:17 +09:00
Gabriel Adrian Samfira
c7bdcdfbef
Address some timeout issues in the Windows CI
This change disables Windows Defender real-time monitoring on the test
workers, and increases the test timeout to 20 minutes (default is 10).

The Windows Defender real time monitoring feature scans any newly
created files for malicious contents. This takes up a lot of CPU when
expanding image archives, which contain lots of files. The CI has been
timing out due to the fact that tests take longer than 10 minutes. This
change should address that issue.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-04-01 14:02:20 +03:00
Akihiro Suda
a2d22ac057
BUILDING.md: update supported Go versions
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-01 18:22:00 +09:00
Akihiro Suda
d0bd65d3c7
Remove unmaintained contrib/linuxkit
The last commit on https://github.com/linuxkit/kubernetes was in Nov 2018.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-01 18:16:21 +09:00
Akihiro Suda
f2d5f71a78
update runc binary to v1.1.1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-01 15:20:26 +09:00
Akihiro Suda
11a31320bb
go.mod: github.com/opencontainers/runc v1.1.1
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-04-01 15:16:08 +09:00
Maksym Pavlenko
0b2a95e107 Add no_tracing tag
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2022-03-31 14:37:11 -07:00
Fu Wei
887615c7d0
Merge pull request #6747 from AkihiroSuda/rocky-ci
2022-03-31 20:19:19 +08:00
Fu Wei
9617d2f29b
Merge pull request #6748 from AkihiroSuda/crun-1.4.4
CI: bump up crun to 1.4.4
2022-03-30 22:10:27 +08:00
Akihiro Suda
b42e936c55
CI: add Rocky Linux 8
Testing containerd on an EL8 variant will be beneficial for enterprise users.

EL9 is coming soon, but we should keep maintaining EL8 CI for a couple of years for long-time stability.

Fixes issue 6542

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-03-30 21:27:15 +09:00
Akihiro Suda
b1030e7b68
CI: bump up crun to 1.4.4
https://github.com/containers/crun/compare/1.3...1.4.4

Also adds `crun-version` file for consistency with `runc-version`.
(Note: unlike runc, crun does not prepend "v" to a version tag)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-03-30 21:08:17 +09:00
Phil Estes
11685ccdef
Merge pull request #6743 from Jordy24/makehelp
added make help for cri integration
2022-03-29 09:59:59 -04:00
Phil Estes
bbabe76d5f
Merge pull request #6740 from dabaooline/dabaooline-patch-1
Update README.md cncf landscape url
2022-03-29 08:38:57 -04:00
dabaooline
b737cb10e6 Update README.md
update cncf landscape url

Signed-off-by: Baoli Qiao <201028369@qq.com>
2022-03-29 10:04:00 +08:00
Jordan Karaze
cf571fa968 added make help for cri integration
Signed-off-by: Jordan Karaze <jordan.karaze@ibm.com>
2022-03-28 16:44:12 -05:00
Fu Wei
d394e00c7e
Merge pull request #6738 from zhsj/fix-test-msg
Fix error message in TestNewBinaryIO
2022-03-25 23:40:06 +08:00