containerd

Author	SHA1	Message	Date
Akshat Kumar	4cc99e57a7	Remove unnecessary logging binary helpers and add godoc Signed-off-by: Akshat Kumar <kshtku@amazon.com>	2020-08-26 09:15:02 -07:00
Akshat Kumar	7a9fbec5fb	Add logging binary support when terminal is true Currently the shims only support starting the logging binary process if the io.Creator Config does not specify Terminal: true. This means that the program using containerd will only be able to specify FIFO io when Terminal: true, rather than allowing the shim to fork the logging binary process. Hence, containerd consumers face an inconsistent behavior regarding logging binary management depending on the Terminal option. Allowing the shim to fork the logging binary process will introduce consistency between the running container and the logging process. Otherwise, the logging process may die if its parent process dies whereas the container will keep running, resulting in the loss of container logs. Signed-off-by: Akshat Kumar <kshtku@amazon.com>	2020-08-25 17:28:29 -07:00
Brian Goff	d7b9cb0019	shim: move event context timeout to publisher Before this change, if an event fails to send on the first attempt, subsequent attempts will fail with context.Canceled because the caller of publish passes a cancellable timeout, which the publisher uses to send the event. The publisher returns immediately if the send fails, but adds the event to an async queue to try again. Meanwhile the caller returns, cancelling the context. Additionally, subsequent attempts may fail to send because the timeout was expected to be for a single request but the queue sleeps for `attempt*time.Second`. In the shim service, the timeout was set to 5s, which means the send will fail with context.DeadlineExceeded before it reaches `maxRequeue` (which is currently 5). This change moves the timeout to the publisher so each send attempt gets its own timeout. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2020-07-20 17:51:10 -07:00
Akihiro Suda	fd99b6566b	decrease log level of cgroup2 ToggleController error when running in UserNS Fix #4312 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-06-24 18:15:16 +09:00
Akihiro Suda	f1a469a035	shim v2 runc: propagate options.Root to Cleanup Previously shim v2 (`io.containerd.runc.{v1,v2}`) always used `/run/containerd/runc` as the runc root. Fix #4326 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-06-17 19:06:36 +09:00
Akihiro Suda	2f601013e6	cgroup2: implement `containerd.events.TaskOOM` event How to test (from https://github.com/opencontainers/runc/pull/2352#issuecomment-620834524): (host)$ sudo swapoff -a (host)$ sudo ctr run -t --rm --memory-limit $((1024*1024*32)) docker.io/library/alpine:latest foo (container)$ sh -c 'VAR=$(seq 1 100000000)' An event `/tasks/oom {"container_id":"foo"}` will be displayed in `ctr events`. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-06-01 14:00:13 +09:00
Tobias Klauser	a9bd451ab4	Avoid duplicate imports of github.com/gogo/protobuf/types Re-use the import aliased as `ptypes`. Signed-off-by: Tobias Klauser <tklauser@distanz.ch>	2020-03-10 09:41:03 +01:00
Ted Yu	a687d3a36d	Check error return from json.Unmarshal Signed-off-by: Ted Yu <yuzhihong@gmail.com> Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2020-03-05 13:38:08 -05:00
Maksym Pavlenko	4d242818bf	Merge pull request #4053 from AkihiroSuda/vendor-grpc-20200225 vendor protobuf & grpc (GoGoProtoPackageIsVersion3)	2020-02-27 11:59:59 -08:00
Phil Estes	669f516b0e	Merge pull request #4062 from tedyu/start-shim-defer Use named error return for service#StartShim	2020-02-27 13:23:31 -05:00
Ted Yu	f8ade8debd	Use named error return for service#StartShim Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-02-27 06:18:05 -08:00
Ted Yu	4105135e36	fix killall when use pidnamespace Signed-off-by: Ted Yu <yuzhihong@gmail.com>	2020-02-26 20:56:49 -08:00
Akihiro Suda	8e448bb279	vendor protobuf & grpc Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-02-26 10:57:05 +09:00
Seth Pellegrino	66508589d3	fix: eventfd leak for v2 runtime with v1 cgroups There's no OOM monitoring for the v2 cgroups yet, so it seems unlikely that there was a leak in that case. Signed-off-by: Seth Pellegrino <spellegrino@newrelic.com>	2020-01-13 10:49:11 -08:00
Seth Pellegrino	9456040acb	fix: eventfd leak Only start watching the cgroup for OOMs when the first process starts instead of on every process. Signed-off-by: Seth Pellegrino <spellegrino@newrelic.com>	2020-01-13 10:39:54 -08:00
Erik Sipsma	fbd46d7094	runtime v2: Close platform in runc shim's Shutdown method. Previously, the platform was closed as part of the Delete method when the process was an init for a task and there were no more tasks after its deletion. This can create problems if another task is created within the shim right after the delete runs, which results in the platform being closed but the shim continuing to run. This change moves closing the platform to the Shutdown method after the shim's context is canceled, which ensures the platform is only closed once the shim is sure it is done servicing containers. Signed-off-by: Erik Sipsma <sipsma@amazon.com>	2019-12-19 09:47:40 -05:00
Akihiro Suda	b02e20f12e	cgroup2: enable controllers automatically Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-12-12 02:56:51 +09:00
Akihiro Suda	8f870c233f	support cgroup2 * only shim v2 runc v2 ("io.containerd.runc.v2") is supported * only PID metrics is implemented. Others should be implemented in separate PRs. * lots of code duplication in v1 metrics and v2 metrics. Dedupe should be separate PR. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2019-12-12 02:56:51 +09:00
Michael Crosby	f8cca26f3c	Handle large output in v2 shim with TTY Resized the I/O buffers to align with the size of the kernel buffers for fifos and moved the close aspect of the console to key off of the stdin closing. Fixes #3738 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-10-11 15:42:05 -04:00
Michael Crosby	6cf031e1e4	Pass ttrpc address to shim via env Because of the way go handles flags, passing a flag that is not defined will cause an error. In our case, if we kept this as a flag, then third-party shims would break when they see this new flag. To fix this, I moved this new configuration option to an env var. We should use env vars from here on out to avoid breaking shim compat. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-08-22 20:37:49 +00:00
Kevin Parsons	d7e1b25384	Allow explicit configuration of TTRPC address Previously the TTRPC address was generated as "<GRPC address>.ttrpc". This change now allows explicit configuration of the TTRPC address, with the default still being the old format if no value is specified. As part of this change, a new configuration section is added for TTRPC listener options. Signed-off-by: Kevin Parsons <kevpar@microsoft.com>	2019-08-22 00:56:27 -07:00
Phil Estes	640860a042	Merge pull request #3559 from fuweid/avoid-read-config runtime: only check killall for init process	2019-08-20 13:08:55 -04:00
Wei Fu	1073868e5e	runtime: only check killall for init process When containerd-shim reaps processes, most of them are not init processes. Since json.Decode consumes CPU, the killall option should be checked for the init process only. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2019-08-20 19:18:34 +08:00
Michael Crosby	0d27d8f4f2	Unify reaper logic into package Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-08-16 13:55:05 +00:00
Maksym Pavlenko	ef7f46eb7b	Fix linter errors Signed-off-by: Maksym Pavlenko <makpav@amazon.com>	2019-07-14 20:49:40 -07:00
Michael Crosby	6601b406b7	Refactor runtime code for code sharing Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-07-08 11:47:53 -04:00
Michael Crosby	7dfc605fc6	Set shim OOM scores to +1 containerd daemon score This changes the shim's OOM score from a static max killable of -999 to be +1 of the containerd daemon's score. This should allow the shims to be killed first in an OOM condition but leave the daemon alone for a bit to help clean up and manage the containers during this situation. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-27 11:14:14 -04:00
Michael Crosby	1a8df3f237	Reserve exec id to prevent race ref #2820 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-21 14:52:44 -04:00
Lantao Liu	48b81e872c	Do not return error when rootfs already exists. Signed-off-by: Lantao Liu <lantaol@google.com>	2019-05-22 15:57:19 -07:00
Michael Crosby	fe6a2b03ed	Add shim cgroup support for v2 runtimes Closes #3198 Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-05-20 16:04:06 +00:00
Michael Crosby	57fbb16234	Merge pull request #3149 from lifubang/pidnamespace fix killall when use pidnamespace	2019-05-09 14:28:44 -04:00
Michael Crosby	19af235051	Merge pull request #3148 from masters-of-cats/wip-rootless-containerd Skip rootfs unmount when no mounts are provided	2019-05-07 10:39:02 -04:00
Michael Crosby	ae87730ad2	Improve shim shutdown logic Shims no longer call `os.Exit` but close the context on shutdown so that events and other resources have hit the `defer`s. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-04-10 18:17:07 -04:00
Michael Crosby	a6f587e4c4	Use ttrpc to publish runtime v2 events Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-04-09 14:38:50 -04:00
Georgi Sabev	c0f0b21314	Apply PR feedback * Rootfs dir is created during container creation not during bundle creation * Add support for v2 * UnmountAll is a no-op when the path to unmount (i.e. the rootfs dir) does not exist or is invalid Co-authored-by: Danail Branekov <danailster@gmail.com> Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>	2019-04-04 18:40:30 +03:00
Sebastiaan van Stijn	2d11f5e6d5	Regenerate protobufs Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-04-03 23:41:15 +02:00
Sebastiaan van Stijn	01310eaebc	do not use unkeyed fields in compose literals Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-04-03 22:20:39 +02:00
Lifubang	872296642a	fix shouldKillAllOnExit check for v2 Signed-off-by: Lifubang <lifubang@acmcoder.com>	2019-03-30 11:37:14 +08:00
Michael Crosby	e6ae9cc64f	Shim pluggable logging

Closes #603

This adds logging facilities at the shim level to provide minimal I/O overhead and pluggable logging options. Log handling is done within the shim so that all I/O, cpu, and memory can be charged to the container.

A sample logging driver setting up logging for a container to the systemd journal looks like this:

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"sync"

	"github.com/containerd/containerd/runtime/v2/logging"
	"github.com/coreos/go-systemd/journal"
)

func main() {
	logging.Run(log)
}

func log(ctx context.Context, config *logging.Config, ready func() error) error {
	// construct any log metadata for the container
	vars := map[string]string{
		"SYSLOG_IDENTIFIER": fmt.Sprintf("%s:%s", config.Namespace, config.ID),
	}
	var wg sync.WaitGroup
	wg.Add(2)
	// forward both stdout and stderr to the journal
	go copy(&wg, config.Stdout, journal.PriInfo, vars)
	go copy(&wg, config.Stderr, journal.PriErr, vars)

	// signal that we are ready and setup for the container to be started
	if err := ready(); err != nil {
		return err
	}
	wg.Wait()
	return nil
}

// copy takes the WaitGroup by pointer so that Done() is seen by the caller's Wait()
func copy(wg *sync.WaitGroup, r io.Reader, pri journal.Priority, vars map[string]string) {
	defer wg.Done()
	s := bufio.NewScanner(r)
	for s.Scan() {
		if s.Err() != nil {
			return
		}
		journal.Send(s.Text(), pri, vars)
	}
}
```

A `logging` package has been created to assist log developers create logging plugins for containerd.

This uses a URI based approach for logging drivers that can be expanded in the future.

Supported URI schemes are:

* binary
* fifo
* file

You can pass the log URI via ctr on the command line:

```bash
> ctr run --rm --runtime io.containerd.runc.v2 --log-uri binary://shim-journald docker.io/library/redis:alpine redis
```

```bash
> journalctl -f -t default:redis
-- Logs begin at Tue 2018-12-11 16:29:51 EST. --
Mar 08 16:08:22 deathstar default:redis[120760]: 1:C 08 Mar 2019 21:08:22.703 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Current maximum open files is 1024. maxclients has been reduced to 992 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Running mode=standalone, port=6379.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # Server initialized
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Ready to accept connections
Mar 08 16:08:50 deathstar default:redis[120760]: 1:signal-handler (1552079330) Received SIGINT scheduling shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.405 # User requested shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.406 * Saving the final RDB snapshot before exiting.
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.452 * DB saved on disk
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.453 # Redis is now ready to exit, bye bye...
```

The following client side Opts are added:

```go
// LogURI provides the raw logging URI
func LogURI(uri *url.URL) Creator {
}

// BinaryIO forwards container STDOUT|STDERR directly to a logging binary
func BinaryIO(binary string, args map[string]string) Creator {}
```

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-03-12 12:18:28 -04:00
Michael Crosby	84a24711e8	Add runc.v2 multi-shim Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-02-21 11:09:46 -05:00
Michael Crosby	6bcbf88f82	Move runc shim code into common package Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-02-21 10:47:41 -05:00
Michael Crosby	85aa8ad361	Move task events to runc v2 shim Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-01-25 14:15:43 -05:00
Lantao Liu	26ab393e7d	Use context.Background for `O_NONBLOCK` `OpenFifo`. Signed-off-by: Lantao Liu <lantaol@google.com>	2019-01-23 10:18:54 -08:00
Ace-Tang	6593399e9f	cr: support checkpoint/restore without image support checkpoint without committing a checkpoint dir into a checkpoint image and restore without untar image into checkpoint directory. support for both v1 and v2 runtime Signed-off-by: Ace-Tang <aceapril@126.com>	2018-11-29 10:19:39 +08:00
Ace-Tang	fd16bf6d46	runtimev2: add image-path and work-path for c/r add ImagePath and WorkPath for checkpoint process, add CriuImagePath and CriuWorkPath for create process in runtime v2 protobuf Signed-off-by: Ace-Tang <aceapril@126.com>	2018-11-24 23:08:25 +08:00
Lifubang	e76a8879eb	fix pipe in broken may cause shim lock forever for runtime v1 Signed-off-by: Lifubang <lifubang@acmcoder.com>	2018-11-19 09:25:43 +08:00
Lifubang	b3438f7a6f	fix pipe in broken may cause shim lock forever for runtime v2 Signed-off-by: Lifubang <lifubang@acmcoder.com>	2018-11-19 09:02:49 +08:00
Ace-Tang	c4feaa75cf	fix: fix failed to get container-shim relation with io.containerd.runc.v1 add '-id' flag when start container with io.containerd.runc.v1 shim, or user can not get container-shim relation from 'ps -ef',like ``` /usr/bin/containerd-shim-runc-v1 -namespace default -address /run/containerd/containerd.sock -publish-binary /usr/bin/containerd ``` Signed-off-by: Ace-Tang <aceapril@126.com>	2018-11-09 11:01:35 +08:00
Wei Fu	38d7d59e8a	enhance: update v1/v2 runtime 1. avoid deadlock during kill, fetch allProcesses before handling events 2. use the argument's ctx instead of context.Background() in openlog Signed-off-by: Wei Fu <fuweid89@gmail.com>	2018-11-06 22:48:43 +08:00
Tom Godkin	b5ccc66c2c	Do not kill all on task delete by default - Still KillAll if the task uses the hosts pid namespace - Test for both host pid namespace and normal cases Co-authored-by: Oliver Stenbom <ostenbom@pivotal.io> Co-authored-by: Georgi Sabev <georgethebeatle@gmail.com> Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io>	2018-08-30 15:58:33 +01:00


109 Commits