For LCOW using the Virtual Machines SID for the shared read-only layers
improves overall performance avoiding the need to set per VM access at runtime.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
This whitelists the statx syscall; libseccomp-2.3.3 or up
is needed for this, older seccomp versions will ignore this.
Equivalent of https://github.com/moby/moby/pull/36417
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Docker registries return errors in a know format so this change now checks for these
errors and returns the message field. If the error is not in the expected format fall
back to the original behaviour.
https://github.com/containerd/containerd/issues/3076
Signed-off-by: Jack Baines <jack.baines@uk.ibm.com>
io_pgetevents() is a new Linux system call, similar to the already-whitelisted
io_getevents(). It has no security implications. Whitelist it so applications can
use the new system call.
Fixes#3105.
Signed-off-by: Avi Kivity <avi@scylladb.com>
Use sha256 hash to shorten the unix socket path to satisfy the
length limitation of abstract socket path
This commit also backports the feature storing address path to
a file from v2 to keep compatibility
Fixes#3032
Signed-off-by: Eric Lin <linxiulei@gmail.com>
With this patch applied, the package-name in the `--version` output can be overridden;
make PACKAGE=containerd.io binaries
./bin/containerd --version
containerd containerd.io v1.2.0-329-ga15b6e20.m a15b6e2097c48b632dbdc63254bad4c62b69e709.m
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Closes#603
This adds logging facilities at the shim level to provide minimal I/O
overhead and pluggable logging options. Log handling is done within the
shim so that all I/O, cpu, and memory can be charged to the container.
A sample logging driver setting up logging for a container the systemd
journal looks like this:
```go
package main
import (
"bufio"
"context"
"fmt"
"io"
"sync"
"github.com/containerd/containerd/runtime/v2/logging"
"github.com/coreos/go-systemd/journal"
)
func main() {
logging.Run(log)
}
func log(ctx context.Context, config *logging.Config, ready func() error) error {
// construct any log metadata for the container
vars := map[string]string{
"SYSLOG_IDENTIFIER": fmt.Sprintf("%s:%s", config.Namespace, config.ID),
}
var wg sync.WaitGroup
wg.Add(2)
// forward both stdout and stderr to the journal
go copy(&wg, config.Stdout, journal.PriInfo, vars)
go copy(&wg, config.Stderr, journal.PriErr, vars)
// signal that we are ready and setup for the container to be started
if err := ready(); err != nil {
return err
}
wg.Wait()
return nil
}
func copy(wg *sync.WaitGroup, r io.Reader, pri journal.Priority, vars map[string]string) {
defer wg.Done()
s := bufio.NewScanner(r)
for s.Scan() {
if s.Err() != nil {
return
}
journal.Send(s.Text(), pri, vars)
}
}
```
A `logging` package has been created to assist log developers create
logging plugins for containerd.
This uses a URI based approach for logging drivers that can be expanded
in the future.
Supported URI scheme's are:
* binary
* fifo
* file
You can pass the log url via ctr on the command line:
```bash
> ctr run --rm --runtime io.containerd.runc.v2 --log-uri binary://shim-journald docker.io/library/redis:alpine redis
```
```bash
> journalctl -f -t default:redis
-- Logs begin at Tue 2018-12-11 16:29:51 EST. --
Mar 08 16:08:22 deathstar default:redis[120760]: 1:C 08 Mar 2019 21:08:22.703 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Current maximum open files is 1024. maxclients has been reduced to 992 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Running mode=standalone, port=6379.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # Server initialized
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Ready to accept connections
Mar 08 16:08:50 deathstar default:redis[120760]: 1:signal-handler (1552079330) Received SIGINT scheduling shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.405 # User requested shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.406 * Saving the final RDB snapshot before exiting.
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.452 * DB saved on disk
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.453 # Redis is now ready to exit, bye bye...
```
The following client side Opts are added:
```go
// LogURI provides the raw logging URI
func LogURI(uri *url.URL) Creator { }
// BinaryIO forwards contianer STDOUT|STDERR directly to a logging binary
func BinaryIO(binary string, args map[string]string) Creator {}
```
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
We can use cross repository push feature to reuse the existing blobs in
the same registry. Before make push fast, we know where the blob comes
from.
Use the `containerd.io/distribution.source. = [,]` as label format. For
example, the blob is downloaded by the docker.io/library/busybox:latest
and the label will be
containerd.io/distribution.source.docker.io = library/busybox
If the blob is shared by different repos in the same registry, the repo
name will be appended, like:
containerd.io/distribution.source.docker.io = library/busybox,x/y
NOTE:
1. no need to apply for legacy docker image schema1.
2. the concurrent fetch actions might miss some repo names in label, but
it is ok.
3. it is optional. no need to add label if the engine only uses images
not push.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
This includes an improved fix for CVE-2019-5736 to reduce the
increased memory-consumption introduced by the original patch,
RHEL 7.6 getting into a loop due to a kernel bug in those kernels,
and improve compatibility with older kernels.
changes included:
- opencontainers/runc#1973 Vendor opencontainers/runtime-spec 29686dbc
- opencontainers/runc#1978 Remove detection for scope properties, which have always been broken
- opencontainers/runc#1963 Vendor in go-criu and use it for CRIU's RPC definition
- opencontainers/runc#1995 exec: expose --preserve-fds
- opencontainers/runc#2000 fix preserve-fds flag may cause runc hang
- opencontainers/runc#1968 Create bind mount mountpoints during restore
- opencontainers/runc#1984 nsenter: cloned_binary: "memfd" cleanups
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make sure that Annotations we write into ocispec.Descriptors are
written into the store and can be read back.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>