Commit Graph

144 Commits

Author SHA1 Message Date
Ashray Jain
3e95727f39 Make killing shims more resilient
Currently, we send a single SIGKILL to the shim process
once and then we spin in a loop where we use kill(pid, 0)
to detect when the pid has disappeared completely.

Unfortunately, this has a race condition since pids can be reused causing us
to spin in an infinite loop when that happens.

This adds a timeout to this loop which logs a warning and exits the
infinite loop.

Signed-off-by: Ashray Jain <ashrayj@palantir.com>
2020-06-03 12:57:08 +01:00
Sebastiaan van Stijn
dc92ad6520
Replace errors.Cause() with errors.Is()
Dependencies may be switching to use the new `%w` formatting
option to wrap errors; switching to use `errors.Is()` makes
sure that we are still able to unwrap the error and detect the
underlying cause.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2020-05-08 14:36:45 +02:00
zyu
e3ab8bda60 Avoid allocating slice for finding Process
Signed-off-by: zyu <yuzhihong@gmail.com>
2020-03-06 09:51:26 -08:00
Ted Yu
a687d3a36d Check error return from json.Unmarshal
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2020-03-05 13:38:08 -05:00
Maksym Pavlenko
4d242818bf
Merge pull request #4053 from AkihiroSuda/vendor-grpc-20200225
vendor protobuf & grpc (GoGoProtoPackageIsVersion3)
2020-02-27 11:59:59 -08:00
Phil Estes
ebec675a8d
Merge pull request #3802 from vladimiroff/unify-dialers
Unify dialer implementations
2020-02-26 16:54:22 -05:00
Kiril Vladimiroff
4dd75be2b9
Unify dialer implementations
Instead of having several dialer implementations, leave only one in
`pkg/dialer` and call it from `pkg/ttrpcutil`, `runtime/v(1|2)/shim`
which had their own

Closes #3471.

Signed-off-by: Kiril Vladimiroff <kiril@vladimiroff.org>
2020-02-26 23:29:04 +02:00
Akihiro Suda
8e448bb279 vendor protobuf & grpc
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-02-26 10:57:05 +09:00
Wei Fu
18e581dd91 bugfix: cleanup dangling shim by brand new context
When there is timeout or cancel for create container, killShim will fail
because of canceled context. The shim will be dangling and unmanageable.

Need to use new context to do cleanup.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2020-02-21 16:49:58 +08:00
Michael Crosby
f8cca26f3c Handle large output in v2 shim with TTY
Reized the I/O buffers to align with the size of the kernel buffers with fifos
and move the close aspect of the console to key off of the stdin closing.

Fixes #3738

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-10-11 15:42:05 -04:00
Lantao Liu
ffcb1cc9be Fix delete error code on the containerd daemon side.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-09 00:28:51 -07:00
Lantao Liu
06be794cb2 Fix shim delete error code.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-10-07 23:21:57 -07:00
Derek McGowan
0b224ac7d6
Update metadata interfaces for containers and leases
Add more thorough dirty checking across all types which
may be deleted and hold references.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2019-09-23 15:27:39 -07:00
Phil Estes
640860a042
Merge pull request #3559 from fuweid/avoid-read-config
runtime: only check killall for init process
2019-08-20 13:08:55 -04:00
Wei Fu
1073868e5e runtime: only check killall for init process
When containerd-shim does reaper, the most processes are not init
process. Since json.Decode consumes more CPU resource, we should check
killall option for init process only.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2019-08-20 19:18:34 +08:00
Michael Crosby
0d27d8f4f2 Unifi reaper logic into package
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-08-16 13:55:05 +00:00
Shukui Yang
bb4c92c773 Fix shim hung
shim.Reap and shim.Default.Wait may deadlock, use Monitor.Notify
to fix this issue.

Signed-off-by: Shukui Yang <keloyangsk@gmail.com>
2019-08-16 13:55:05 +00:00
Michael Crosby
eb4b3e8772 Fast path getting pid from task
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-07-26 17:48:00 +00:00
dzzg
c27e48d666
fix mis-spelling in client.go
Signed-off-by: dzzg <zhengguang.zhu@daocloud.io>
2019-07-26 13:33:04 +08:00
Akihiro Suda
fab016c7a1 runtime/v1/linux: ignore ErrCgroupDeleted in Task.Start
Fix a Rootless Docker-in-Docker issue on Fedora 30: https://github.com/docker-library/docker/pull/165#issuecomment-511717143
Related: #1598

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-07-17 12:19:15 +09:00
Michael Crosby
6601b406b7 Refactor runtime code for code sharing
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-07-08 11:47:53 -04:00
Derek McGowan
041d8d7051
Merge pull request #3366 from crosbymichael/exec-pid
Robust pid locking for shim processes
2019-06-29 15:36:51 +08:00
Michael Crosby
7dfc605fc6 Set shim OOM scores to +1 containerd daemon score
This changes the shim's OOM score from a static max killable of -999 to
be +1 of the containerd daemon's score.  This should allow the shim's to
be killed first in an OOM condition but leave the daemon alone for a bit
to help cleanup and manage the containers during this situation.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-06-27 11:14:14 -04:00
Michael Crosby
719a2c594e Robust pid locking for shim processes
Closes #2832

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-06-26 11:43:57 -04:00
Maksym Pavlenko
174c4907d0 Fix shim's file IO logging
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2019-06-24 13:21:41 -07:00
Michael Crosby
245052243d Add timeout for I/O waitgroups
Closes #3286

This and a combination of a couple Docker changes are needed to fully
resolve the issue on the Docker side.  However, this ensures that after
processes exit, we still leave some time for the I/O to fully flush
before closing.  Without this timeout, the delete methods would block
forever.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-06-20 16:13:51 -04:00
Maksym Pavlenko
fbf96d302a Fix path in LogFile creator
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2019-06-19 16:53:33 -07:00
Maksym Pavlenko
5e0d793801 Fix bugs in BinaryIO creator
Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2019-06-19 11:15:17 -07:00
Maksym Pavlenko
bca5667362 Make newBinaryIO public
Allow third-party runtime implementations to reuse NewBinaryIO
in order to support pluggable shim logging binary protocol.

Signed-off-by: Maksym Pavlenko <makpav@amazon.com>
2019-06-12 16:22:10 -07:00
Michael Crosby
42f4bb98ac
Merge pull request #3311 from jing-rui/shimlog
fix shim std logs not close after shim exit
2019-06-10 12:05:35 -04:00
Jing Rui
9e0cd529d3 fix shim std logs not close after shim exit
Signed-off-by: Jing Rui <jingrui@huawei.com>
2019-06-10 11:50:07 +08:00
Michael Crosby
7531c66d5a Ensure that the rootfs dir is created in the bundle
This fixes issues running gvisor on top of containerd without docker.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-06-03 19:56:19 +00:00
Danni Xia
bf24fb0cad Close file r.log after used to release resources.
Signed-off-by: Danni Xia <xiadanni1@huawei.com>
2019-06-04 06:41:38 +08:00
Lantao Liu
48b81e872c Do not return error when rootfs already exists.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-05-22 15:57:19 -07:00
Derek McGowan
c9c555cd71
Merge pull request #3226 from Ace-Tang/kill_shim_in_clean
runtime-v1: kill shim in exit handler
2019-05-22 11:56:40 -07:00
Ace-Tang
5b7a327c47 Improve atomic delete
skip hidden directories in load task, and return soon if path not exist
in atomicDelete

carry of #3233

Closes #3233

Signed-off-by: Ace-Tang <aceapril@126.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-05-20 20:13:35 +00:00
Justin Terry (VM)
5e962dd8ba Remove unused Resize method from initState
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
2019-05-13 12:35:22 -07:00
Li Yuxuan
66036d9206 v1: Respect the shim_debug flag when load tasks
Currently when we restart containerd it will load all tasks with shim
logs whether the `shim_debug` is set or not.

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2019-05-13 23:51:16 +08:00
Michael Crosby
57fbb16234
Merge pull request #3149 from lifubang/pidnamespace
fix killall when use pidnamespace
2019-05-09 14:28:44 -04:00
Li Yuxuan
cf6e008542 Fix fd leak of shim log
Open shim v2 log with the flag `O_RDWR` will cause the `Read()` block
forever even if the pipe has been closed on the shim side. Then the
`io.Copy()` would never return and lead to a fd leak.
Fix typo when closing shim v1 log which causes the `stdouLog` leak.
Update `numPipes` function in test case to get the opened FIFO
correctly.

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
2019-05-09 20:21:57 +08:00
Michael Crosby
5cf1356c5c
Merge pull request #3255 from dvrkps/usecancel
Use cancel on errors
2019-05-07 11:40:35 -04:00
Phil Estes
836cf53e40
Merge pull request #3244 from Random-Liu/fix-container-cleanup
Return NotFound error for kill and delete in deleted state.
2019-05-07 16:49:45 +02:00
Michael Crosby
19af235051
Merge pull request #3148 from masters-of-cats/wip-rootless-containerd
Skip rootfs unmount when no mounts are provided
2019-05-07 10:39:02 -04:00
Davor Kapsa
38e3696574 Use cancel on errors
Signed-off-by: Davor Kapsa <davor.kapsa@gmail.com>
2019-04-30 21:11:34 +02:00
Sebastiaan van Stijn
8c5779c32b
bump containerd/ttrpc 699c4e40d1e7416e08bf7019c7ce2e9beced4636
full diff: f02858b145...699c4e40d1

- containerd/ttrpc#33 Fix returns error message
- containerd/ttrpc#35 Make onclose an option

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-04-27 15:30:18 -07:00
Lantao Liu
dff7456804 Return NotFound error for kill and delete in deleted state.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-04-26 15:17:18 -07:00
Ace-Tang
dfa51c9279 runtime-v1: kill shim in cleanupAfterDeadShim
1. kill shim in cleanupAfterDeadShim avoid shim leak
2. refactor cleanupAfterDeadShim, get pid from bundle
path instead of make pid as a parameter

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-04-22 14:29:40 +08:00
Lantao Liu
9cc58781fa Check task list to avoid unnecessary cleanup.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-04-11 16:29:09 -07:00
Georgi Sabev
c0f0b21314 Apply PR feedback
* Rootfs dir is created during container creation not during bundle
  creation
* Add support for v2
* UnmountAll is a no-op when the path to unmount (i.e. the rootfs dir)
  does not exist or is invalid

Co-authored-by: Danail Branekov <danailster@gmail.com>
Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>
2019-04-04 18:40:30 +03:00
Georgi Sabev
2a5e4c4be7 Skip rootfs unmount when no mounts are provided
Co-authored-by: Julia Nedialkova <julianedialkova@hotmail.com>
Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>
2019-04-04 18:20:09 +03:00
Sebastiaan van Stijn
2d11f5e6d5
Regenerate protobufs
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-04-03 23:41:15 +02:00
Peter Wagner
ae04c16607 runtime: guard Close() until both streams are complete
Signed-off-by: Peter Wagner <thepwagner@github.com>
2019-04-01 15:23:57 -04:00
Peter Wagner
e96ac2040d runtime: log IO error when copying output streams
Signed-off-by: Peter Wagner <thepwagner@github.com>
2019-04-01 15:23:57 -04:00
Lifubang
fa5f744a79 fix shouldKillAllOnExit check
Signed-off-by: Lifubang <lifubang@acmcoder.com>
2019-03-30 11:36:56 +08:00
Michael Crosby
ef45e4f021
Merge pull request #3046 from linxiulei/fix_shim_socket
Shorten the unix socket path for shim
2019-03-15 09:10:47 -05:00
Eric Lin
a631796fda horten the unix socket path for shim
Use sha256 hash to shorten the unix socket path to satisfy the
length limitation of abstract socket path

This commit also backports the feature storing address path to
a file from v2 to keep compatibility

Fixes #3032

Signed-off-by: Eric Lin <linxiulei@gmail.com>
2019-03-15 11:58:30 +08:00
Michael Crosby
e6ae9cc64f Shim pluggable logging
Closes #603

This adds logging facilities at the shim level to provide minimal I/O
overhead and pluggable logging options.  Log handling is done within the
shim so that all I/O, cpu, and memory can be charged to the container.

A sample logging driver setting up logging for a container the systemd
journal looks like this:

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"sync"

	"github.com/containerd/containerd/runtime/v2/logging"
	"github.com/coreos/go-systemd/journal"
)

func main() {
	logging.Run(log)
}

func log(ctx context.Context, config *logging.Config, ready func() error) error {
	// construct any log metadata for the container
	vars := map[string]string{
		"SYSLOG_IDENTIFIER": fmt.Sprintf("%s:%s", config.Namespace, config.ID),
	}
	var wg sync.WaitGroup
	wg.Add(2)
	// forward both stdout and stderr to the journal
	go copy(&wg, config.Stdout, journal.PriInfo, vars)
	go copy(&wg, config.Stderr, journal.PriErr, vars)

	// signal that we are ready and setup for the container to be started
	if err := ready(); err != nil {
		return err
	}
	wg.Wait()
	return nil
}

func copy(wg *sync.WaitGroup, r io.Reader, pri journal.Priority, vars map[string]string) {
	defer wg.Done()
	s := bufio.NewScanner(r)
	for s.Scan() {
		if s.Err() != nil {
			return
		}
		journal.Send(s.Text(), pri, vars)
	}
}
```

A `logging` package has been created to assist log developers create
logging plugins for containerd.

This uses a URI based approach for logging drivers that can be expanded
in the future.

Supported URI scheme's are:

* binary
* fifo
* file

You can pass the log url via ctr on the command line:

```bash
> ctr run --rm --runtime io.containerd.runc.v2 --log-uri binary://shim-journald docker.io/library/redis:alpine redis
```

```bash
> journalctl -f -t default:redis

-- Logs begin at Tue 2018-12-11 16:29:51 EST. --
Mar 08 16:08:22 deathstar default:redis[120760]: 1:C 08 Mar 2019 21:08:22.703 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.704 # Current maximum open files is 1024. maxclients has been reduced to 992 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Running mode=standalone, port=6379.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # Server initialized
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Mar 08 16:08:22 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:22.705 * Ready to accept connections
Mar 08 16:08:50 deathstar default:redis[120760]: 1:signal-handler (1552079330) Received SIGINT scheduling shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.405 # User requested shutdown...
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.406 * Saving the final RDB snapshot before exiting.
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.452 * DB saved on disk
Mar 08 16:08:50 deathstar default:redis[120760]: 1:M 08 Mar 2019 21:08:50.453 # Redis is now ready to exit, bye bye...
```

The following client side Opts are added:

```go
// LogURI provides the raw logging URI
func LogURI(uri *url.URL) Creator { }
// BinaryIO forwards contianer STDOUT|STDERR directly to a logging binary
func BinaryIO(binary string, args map[string]string) Creator {}
```

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-03-12 12:18:28 -04:00
Lantao Liu
952d58297d Add a separate lock for pid.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-02-01 08:48:26 -08:00
Lantao Liu
9777d76890 Revert "use state machine management for exec.Pid()"
This reverts commit bbc2a995f9.

Signed-off-by: Lantao Liu <lantaol@google.com>
2019-01-31 18:59:34 -08:00
Lantao Liu
26ab393e7d Use context.Background for O_NONBLOCK OpenFifo.
Signed-off-by: Lantao Liu <lantaol@google.com>
2019-01-23 10:18:54 -08:00
Michael Crosby
35582cb7a3
Merge pull request #2899 from fuweid/proposal-add-Add-method-in-PlatformRuntime
runtime: add Add/Delete method in PlatformRuntime interface
2019-01-22 13:48:39 -05:00
Wei Fu
568b5be936 runtime: add Add/Delete method in PlatformRuntime interface
The two new method Add/Delete can allow custom plugin to add or migrate
existing task into major Runtime plugin.

close: #2888

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2018-12-29 13:56:38 +08:00
Phil Estes
47b328aab7
Merge pull request #2897 from crosbymichael/atomic-delete
Ensure bundle removal is atomic
2018-12-21 08:27:43 -05:00
Michael Crosby
36e4dc603e Ensure bundle removal is atomic
This makes bundle removal atomic by first renaming the bundle and
working directories to a hidden path before removing the underlying
directories.

Closes #2567
Closes #2327

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-12-20 13:45:18 -05:00
Phil Estes
06e04bc5a9
Merge pull request #2830 from Ace-Tang/support_cr_without_image
cr: support checkpoint/restore without image
2018-12-20 13:24:37 -05:00
Michael Crosby
a2a4241979 Add timeout and cancel to shim fifo open
There is still a special case where the client side fails to open or
load causes things to be slow and the shim can lock up when this
happens.  This adds a timeout to the context for this case to abort fifo
creation.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-12-13 14:43:41 -05:00
Michael Crosby
66c20f2b75 Update runc to 96ec2177ae841256168fcf76954f7177af
This fixes a regression in runc that didn't allow signals being sent to
paused containers.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-12-04 11:21:20 -05:00
Lantao Liu
79499980e4 Kill should still work in stopped state.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-12-03 16:57:20 -08:00
Ace-Tang
6593399e9f cr: support checkpoint/restore without image
support checkpoint without committing a checkpoint dir into a
checkpoint image and restore without untar image into checkpoint
directory. support for both v1 and v2 runtime

Signed-off-by: Ace-Tang <aceapril@126.com>
2018-11-29 10:19:39 +08:00
Phil Estes
dcb82064d3
Merge pull request #2826 from lifubang/statemachineforpid
Fixes: shim service event blocked when waiting for IO finished
2018-11-27 15:46:28 -05:00
Michael Crosby
3eae8b9c3f
Merge pull request #2631 from masters-of-cats/shim-io-redirect
Use named pipes for shim logs
2018-11-27 10:44:00 -05:00
Lifubang
bbc2a995f9 use state machine management for exec.Pid()
Signed-off-by: Lifubang <lifubang@acmcoder.com>
2018-11-23 17:46:32 +08:00
Phil Estes
181a522142
Merge pull request #2807 from lifubang/shimlockwhenstdinclose
fix pipe in broken may cause shim lock forever
2018-11-20 22:38:22 +08:00
Lantao Liu
7d91d631e0 Lock KillAll.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-11-19 15:19:35 -08:00
Lifubang
e76a8879eb fix pipe in broken may cause shim lock forever for runtime v1
Signed-off-by: Lifubang <lifubang@acmcoder.com>
2018-11-19 09:25:43 +08:00
Michael Crosby
d48d7464ad
Merge pull request #2773 from crosbymichael/state-locking
Fix process locking and state management
2018-11-16 14:15:04 -05:00
Julia Nedialkova
1d4105cacf Use named pipes for shim logs
Relating to issue [#2606](https://github.com/containerd/containerd/issues/2606)

Co-authored-by: Oliver Stenbom <ostenbom@pivotal.io>
Co-authored-by: Georgi Sabev <georgethebeatle@gmail.com>
Co-authored-by: Giuseppe Capizzi <gcapizzi@pivotal.io>
Co-authored-by: Danail Branekov <danailster@gmail.com>

Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io>
Signed-off-by: Georgi Sabev <georgethebeatle@gmail.com>
Signed-off-by: Giuseppe Capizzi <gcapizzi@pivotal.io>
Signed-off-by: Danail Branekov <danailster@gmail.com>
2018-11-16 16:11:43 +02:00
Michael Crosby
831a41b958 Fix process locking and state management
There were races with the way process states.  This displayed in ways,
especially around pausing the container for atomic operations.  Users
would get errors like, cannnot delete container in paused state and
such.

This can be eaisly reproduced with `docker` and the following command:

```bash
> (for i in `seq 1 25`; do id=$(docker create  alpine usleep 50000);docker start $id;docker commit $id;docker wait $id;docker rm $id; done)
```

This two issues that this fixes are:

* locks must be held by the owning process, not the state operations.
* If a container ends up being paused but before the operation
completes, the process exists, make sure we resume the container before
setting the the process as exited.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-11-09 11:40:37 -05:00
Lantao Liu
c524b9ce41 Partially revert the event discard change in #2748.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-11-08 14:47:32 -08:00
Wei Fu
38d7d59e8a enhance: update v1/v2 runtime
1. avoid dead lock during kill, fetch allProcesses before handle events
2. use argu's ctx instead of context.Backgroud() in openlog

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2018-11-06 22:48:43 +08:00
Michael Crosby
232a063496 Increase reaper buffer size and non-blocking send
Fixes #2709

This increases the buffer size for process exit subscribers. It also
implements a non-blocking send on the subscriber channel.  It is better
to drop an exit even than it is to block a shim for one slow subscriber.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-10-29 16:46:58 -04:00
Ace-Tang
c206da7957 optimize shim lock in runtime v1
apply lock only around process map of shim service, avoid lock affect
other procs operations.

Signed-off-by: Ace-Tang <aceapril@126.com>
2018-10-25 15:27:46 +08:00
Michael Crosby
87d1118a0f
Merge pull request #2605 from lifubang/runafterstart
fix delete running bundle dir when ctr t start a container again
2018-09-21 14:22:33 -04:00
Lifubang
557e8e0b0d fix delete running bundle dir when run t start cmd again
Signed-off-by: Lifubang <lifubang@acmcoder.com>

code optimization after review

Signed-off-by: Lifubang <lifubang@acmcoder.com>
2018-09-21 06:33:23 +08:00
John Howard
2586f3fbb9 boltdb/bolt --> go.etcd.io/bbolt
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-09-12 15:23:57 -07:00
Phil Estes
ed2bf6dd8a
Merge pull request #2624 from Ace-Tang/fix_delete_lock
fix: modify lock location of exec delete avoid exec hang
2018-09-11 10:26:32 -04:00
Ace-Tang
079292e3fc fix: modify lock location of exec delete
func (e *execProcess) delete(ctx context.Context) error {
    e.wg.Wait()
...
}
delete exec process will wait for io copy finish, if wait here,
other process can not get lock of shim service.

1. apply lock around s.transition() calls in the Delete methods.
2. put lock after wait io copy in exec Delete.

Signed-off-by: Ace-Tang <aceapril@126.com>
2018-09-11 13:22:59 +08:00
Michael Crosby
906acb18b6 Don't provide IO when it's not set
This makes sure that runc does not get any valid IO for the pipe.  Some
builds and other containers will be stuck if they inspect stdin
expecially and its a pipe but not connected to any user input.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-09-07 18:30:31 -04:00
Claudia Beresford
32e6aa742b Fix teeny tiny typos
Signed-off-by: Claudia Beresford <cberesford@pivotal.io>
2018-09-05 14:44:44 +01:00
yanxuean
517930187e remove useless parameter from newTask
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
2018-09-04 10:59:00 +08:00
Lantao Liu
7a4e0806c2 Fix runc state error handling.
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-08-30 10:47:04 -07:00
Tom Godkin
b5ccc66c2c Do not kill all on task delete by default
- Still KillAll if the task uses the hosts pid namespace
 - Test for both host pid namespace and normal cases

Co-authored-by: Oliver Stenbom <ostenbom@pivotal.io>
Co-authored-by: Georgi Sabev <georgethebeatle@gmail.com>
Signed-off-by: Oliver Stenbom <ostenbom@pivotal.io>
2018-08-30 15:58:33 +01:00
Michael Crosby
bc1ff51411 Don't block on STDIN open
This was found testing other runtime shims that are faster than runc(no
containerization).  This is a race that can cause the shim to block
forever.  It's not an issue for out/err because we open both sides of
the pipe, but for stdin, it expects the client to have it opened.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-08-27 10:44:53 -04:00
Michael Crosby
da1b5470cd Runtime v2
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-07-17 10:21:29 -04:00