Commit Graph

21 Commits

Author SHA1 Message Date
Swagat Bora
7def13dde3 Add a thin wrapper around otel Span object
Signed-off-by: Swagat Bora <sbora@amazon.com>
2022-11-11 01:28:27 +00:00
Swagat Bora
ee64926a72 add SpanAttribute
Signed-off-by: Swagat Bora <sbora@amazon.com>
2022-11-03 18:34:06 +00:00
Swagat Bora
3b87d46ce2 Add tracing spans in CRI image service and pull.go
Signed-off-by: Swagat Bora <sbora@amazon.com>

Add spans around image unpack operations
Use image.ref to denote image name and image.id for the image config digest
Add top-level spand and record errors in the CRI instrumentation service
2022-11-03 17:03:43 +00:00
Sebastiaan van Stijn
d215725136
pkg/cri/(server|sbserver): criService.getTLSConfig() add TODO to verify nolint
This `//nolint`  was added in f5c7ac9272
to suppress warnings about the `NameToCertificate` function being deprecated:

    // Deprecated: NameToCertificate only allows associating a single certificate
    // with a given name. Leave that field nil to let the library select the first
    // compatible chain from Certificates.

Looking at that, it was deprecated in Go 1.14 through
eb93c684d4
(https://go-review.googlesource.com/c/go/+/205059), which describes:

    crypto/tls: select only compatible chains from Certificates

    Now that we have a full implementation of the logic to check certificate
    compatibility, we can let applications just list multiple chains in
    Certificates (for example, an RSA and an ECDSA one) and choose the most
    appropriate automatically.

    NameToCertificate only maps each name to one chain, so simply deprecate
    it, and while at it simplify its implementation by not stripping
    trailing dots from the SNI (which is specified not to have any, see RFC
    6066, Section 3) and by not supporting multi-level wildcards, which are
    not a thing in the WebPKI (and in crypto/x509).

We should at least have a comment describing why we are ignoring this, but preferably
review whether we should still use it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-10-12 14:40:11 +02:00
Iceber Gu
3cfde732e1 remotes/docker/config: Skipping TLS verification for localhost
Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>
2022-09-13 17:40:23 +08:00
shuaichang
7b9f1d4058 Added support for runtime level snapshotter, issue 6657
Signed-off-by: shuaichang <shuai.chang@databricks.com>

Updated annotation name
2022-06-02 16:29:59 -07:00
Akihiro Suda
42584167b7
Officially deprecate Schema 1
Schema 1 has been substantially deprecated since circa. 2017 in favor of Schema 2 introduced in Docker 1.10 (Feb 2016)
and its successor OCI Image Spec v1, but we have not officially deprecated Schema 1.

One of the reasons was that Quay did not support Schema 2 so far, but it is reported that Quay has been
supporting Schema 2 since Feb 2020 (moby/buildkit issue 409).

This PR deprecates pulling Schema 1 images but the feature will not be removed before containerd 2.0.
Pushing Schema 1 images was never implemented in containerd (and its consumers such as BuildKit).

Docker/Moby already disabled pushing Schema 1 images in Docker 20.10 (moby/moby PR 41295),
but Docker/Moby has not yet disabled pulling Schema 1 as containerd has not yet deprecated Schema 1.
(See the comments in moby/moby PR 42300.)
Docker/Moby is expected to disable pulling Schema 1 images in future after this deprecation.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-05-02 19:08:38 +09:00
Wei Fu
00d102da9f feature: support image pull progress timeout
Kubelet sends the PullImage request without timeout, because the image size
is unknown and timeout is hard to defined. The pulling request might run
into 0B/s speed, if containerd can't receive any packet in that connection.
For this case, the containerd should cancel the PullImage request.

Although containerd provides ingester manager to track the progress of pulling
request, for example `ctr image pull` shows the console progress bar, it needs
more CPU resources to open/read the ingested files to get status.

In order to support progress timeout feature with lower overhead, this
patch uses http.RoundTripper wrapper to track active progress. That
wrapper will increase active-request number and return the
countingReadCloser wrapper for http.Response.Body. Each bytes-read
can be count and the active-request number will be descreased when the
countingReadCloser wrapper has been closed. For the progress tracker,
it can check the active-request number and bytes-read at intervals. If
there is no any progress, the progress tracker should cancel the
request.

NOTE: For each blob data, the containerd will make sure that the content
writer is opened before sending http request to the registry. Therefore, the
progress reporter can rely on the active-request number.

fixed: #4984

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-27 00:02:27 +08:00
Wei Fu
8113758568 CRI: improve image pulling performance
Background:

With current design, the content backend uses key-lock for long-lived
write transaction. If the content reference has been marked for write
transaction, the other requestes on the same reference will fail fast with
unavailable error. Since the metadata plugin is based on boltbd which
only supports single-writer, the content backend can't block or handle
the request too long. It requires the client to handle retry by itself,
like OpenWriter - backoff retry helper. But the maximum retry interval
can be up to 2 seconds. If there are several concurrent requestes fo the
same image, the waiters maybe wakeup at the same time and there is only
one waiter can continue. A lot of waiters will get into sleep and we will
take long time to finish all the pulling jobs and be worse if the image
has many more layers, which mentioned in issue #4937.

After fetching, containerd.Pull API allows several hanlers to commit
same ChainID snapshotter but only one can be done successfully. Since
unpack tar.gz is time-consuming job, it can impact the performance on
unpacking for same ChainID snapshotter in parallel.

For instance, the Request 2 doesn't need to prepare and commit, it
should just wait for Request 1 finish, which mentioned in pull
request #6318.

```text
	Request 1	Request 2

	Prepare
	   |
	   |
	   |
	   |		Prepare
	Commit		   |
			   |
			   |
			   |
			Commit(failed on exist)
```

Both content backoff retry and unnecessary unpack impacts the performance.

Solution:

Introduced the duplicate suppression in fetch and unpack context. The
deplicate suppression uses key-mutex and single-waiter-notify to support
singleflight. The caller can use the duplicate suppression in different
PullImage handlers so that we can avoid unnecessary unpack and spin-lock
in OpenWriter.

Test Result:

Before enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	1m6.172s
user	0m0.268s
sys	0m0.193s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.324s
user	0m0.441s
sys	0m0.316s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	1m47.657s
user	0m0.284s
sys	0m0.224s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.381s
user	0m0.488s
sys	0m0.358s
```

With this enhancement:

```bash
➜  /tmp sudo bash testing.sh "localhost:5000/redis:latest" 20
crictl pull localhost:5000/redis:latest (x20) takes ...

real	0m1.140s
user	0m0.243s
sys	0m0.178s

docker pull localhost:5000/redis:latest (x20) takes ...

real	0m1.239s
user	0m0.463s
sys	0m0.275s

➜  /tmp sudo bash testing.sh "localhost:5000/golang:latest" 20
crictl pull localhost:5000/golang:latest (x20) takes ...

real	0m5.546s
user	0m0.217s
sys	0m0.219s

docker pull localhost:5000/golang:latest (x20) takes ...

real	0m6.090s
user	0m0.501s
sys	0m0.331s
```

Test Script:

localhost:5000/{redis|golang}:latest is equal to
docker.io/library/{redis|golang}:latest. The image is hold in local registry
service by `docker run -d -p 5000:5000 --name registry registry:2`.

```bash

image_name="${1}"
pull_times="${2:-10}"

cleanup() {
  ctr image rmi "${image_name}"
  ctr -n k8s.io image rmi "${image_name}"
  crictl rmi "${image_name}"
  docker rmi "${image_name}"
  sleep 2
}

crictl_testing() {
  for idx in $(seq 1 ${pull_times}); do
    crictl pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

docker_testing() {
  for idx in $(seq 1 ${pull_times}); do
    docker pull "${image_name}" > /dev/null 2>&1 &
  done
  wait
}

cleanup > /dev/null 2>&1

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "crictl pull $image_name (x${pull_times}) takes ..."
time crictl_testing
echo

echo 3 > /proc/sys/vm/drop_caches
sleep 3
echo "docker pull $image_name (x${pull_times}) takes ..."
time docker_testing
```

Fixes: #4937
Close: #4985
Close: #6318

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-06 07:14:18 +08:00
Shengjing Zhu
f4f41296c2 Replace golang.org/x/net/context with std library
Signed-off-by: Shengjing Zhu <zhsj@debian.org>
2022-02-22 02:27:05 +08:00
haoyun
bbe46b8c43 feat: replace github.com/pkg/errors to errors
Signed-off-by: haoyun <yun.hao@daocloud.io>
Co-authored-by: zounengren <zouyee1989@gmail.com>
2022-01-07 10:27:03 +08:00
wanglei01
5f293d9ac4 [CRI] Fix panic when registry.mirrors use localhost
When containerd use this config:

```
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
      endpoint = ["http://localhost:5000"]
```

Due to the `newTransport` function does not initialize the `TLSClientConfig` field.
Then use `TLSClientConfig` to cause nil pointer dereference

Signed-off-by: wanglei <wllenyj@linux.alibaba.com>
2021-11-19 10:56:46 +08:00
Eng Zer Jun
50da673592
refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-09-21 09:50:38 +08:00
Wei Fu
2bcd6a4e88 cri: patch update image labels
The CRI-plugin subscribes the image event on k8s.io namespace. By
default, the image event is created by CRI-API. However, the image can
be downloaded by containerd API on k8s.io with the customized labels.
The CRI-plugin should use patch update for `io.cri-containerd.image`
label in this case.

Fixes: #5900

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2021-09-05 18:48:26 +08:00
Mike Brown
a5c417ac06 move up to CRI v1 and support v1alpha in parallel
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2021-06-28 09:34:12 -05:00
Phil Estes
e47400cbd2
Merge pull request #5100 from adisky/skip-tls-localHost
Skip TLS verification for localhost
2021-05-12 14:56:53 -04:00
Lantao Liu
81402e4758 Fix different registry hosts referencing the same auth config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2021-05-03 17:42:57 -07:00
Aditi Sharma
8014d9fee0 Skip TLS verification for localhost
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
2021-05-03 10:21:54 +05:30
Brian Goff
b0b6d9aa03 Add support for using a host registry dir in cri
This will be used instead of the cri registry config in the main config
toml.

---

Also pulls in changes from containerd/cri@d0b4eecbb3

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2021-03-12 22:42:22 +00:00
Samarth Shah
5fc721370d Add manifest digest annotation for snapshotters
Signed-off-by: Samarth Shah <samarthmshah@gmail.com>
2020-10-07 23:12:01 +00:00
Derek McGowan
b22b627300
Move cri server packages under pkg/cri
Organizes the cri related server packages under pkg/cri

Signed-off-by: Derek McGowan <derek@mcg.dev>
2020-10-07 13:09:37 -07:00