Use vndr instead of godep.
Signed-off-by: Lantao Liu <lantaol@google.com>
This commit is contained in:
1
vendor/github.com/opencontainers/go-digest/.mailmap
generated
vendored
1
vendor/github.com/opencontainers/go-digest/.mailmap
generated
vendored
@@ -1 +0,0 @@
|
||||
Stephen J Day <stephen.day@docker.com> <stevvooe@users.noreply.github.com>
|
12
vendor/github.com/opencontainers/go-digest/.pullapprove.yml
generated
vendored
12
vendor/github.com/opencontainers/go-digest/.pullapprove.yml
generated
vendored
@@ -1,12 +0,0 @@
|
||||
approve_by_comment: true
|
||||
approve_regex: '^(Approved|lgtm|LGTM|:shipit:|:star:|:\+1:|:ship:)'
|
||||
reject_regex: ^Rejected
|
||||
reset_on_push: true
|
||||
author_approval: ignored
|
||||
signed_off_by:
|
||||
required: true
|
||||
reviewers:
|
||||
teams:
|
||||
- go-digest-maintainers
|
||||
name: default
|
||||
required: 2
|
4
vendor/github.com/opencontainers/go-digest/.travis.yml
generated
vendored
4
vendor/github.com/opencontainers/go-digest/.travis.yml
generated
vendored
@@ -1,4 +0,0 @@
|
||||
language: go
|
||||
go:
|
||||
- 1.7
|
||||
- master
|
72
vendor/github.com/opencontainers/go-digest/CONTRIBUTING.md
generated
vendored
72
vendor/github.com/opencontainers/go-digest/CONTRIBUTING.md
generated
vendored
@@ -1,72 +0,0 @@
|
||||
# Contributing to Docker open source projects
|
||||
|
||||
Want to hack on this project? Awesome! Here are instructions to get you started.
|
||||
|
||||
This project is a part of the [Docker](https://www.docker.com) project, and follows
|
||||
the same rules and principles. If you're already familiar with the way
|
||||
Docker does things, you'll feel right at home.
|
||||
|
||||
Otherwise, go read Docker's
|
||||
[contributions guidelines](https://github.com/docker/docker/blob/master/CONTRIBUTING.md),
|
||||
[issue triaging](https://github.com/docker/docker/blob/master/project/ISSUE-TRIAGE.md),
|
||||
[review process](https://github.com/docker/docker/blob/master/project/REVIEWING.md) and
|
||||
[branches and tags](https://github.com/docker/docker/blob/master/project/BRANCHES-AND-TAGS.md).
|
||||
|
||||
For an in-depth description of our contribution process, visit the
|
||||
contributors guide: [Understand how to contribute](https://docs.docker.com/opensource/workflow/make-a-contribution/)
|
||||
|
||||
### Sign your work
|
||||
|
||||
The sign-off is a simple line at the end of the explanation for the patch. Your
|
||||
signature certifies that you wrote the patch or otherwise have the right to pass
|
||||
it on as an open-source patch. The rules are pretty simple: if you can certify
|
||||
the below (from [developercertificate.org](http://developercertificate.org/)):
|
||||
|
||||
```
|
||||
Developer Certificate of Origin
|
||||
Version 1.1
|
||||
|
||||
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
|
||||
1 Letterman Drive
|
||||
Suite D4700
|
||||
San Francisco, CA, 94129
|
||||
|
||||
Everyone is permitted to copy and distribute verbatim copies of this
|
||||
license document, but changing it is not allowed.
|
||||
|
||||
|
||||
Developer's Certificate of Origin 1.1
|
||||
|
||||
By making a contribution to this project, I certify that:
|
||||
|
||||
(a) The contribution was created in whole or in part by me and I
|
||||
have the right to submit it under the open source license
|
||||
indicated in the file; or
|
||||
|
||||
(b) The contribution is based upon previous work that, to the best
|
||||
of my knowledge, is covered under an appropriate open source
|
||||
license and I have the right under that license to submit that
|
||||
work with modifications, whether created in whole or in part
|
||||
by me, under the same open source license (unless I am
|
||||
permitted to submit under a different license), as indicated
|
||||
in the file; or
|
||||
|
||||
(c) The contribution was provided directly to me by some other
|
||||
person who certified (a), (b) or (c) and I have not modified
|
||||
it.
|
||||
|
||||
(d) I understand and agree that this project and the contribution
|
||||
are public and that a record of the contribution (including all
|
||||
personal information I submit with it, including my sign-off) is
|
||||
maintained indefinitely and may be redistributed consistent with
|
||||
this project or the open source license(s) involved.
|
||||
```
|
||||
|
||||
Then you just add a line to every git commit message:
|
||||
|
||||
Signed-off-by: Joe Smith <joe.smith@email.com>
|
||||
|
||||
Use your real name (sorry, no pseudonyms or anonymous contributions.)
|
||||
|
||||
If you set your `user.name` and `user.email` git configs, you can sign your
|
||||
commit automatically with `git commit -s`.
|
9
vendor/github.com/opencontainers/go-digest/MAINTAINERS
generated
vendored
9
vendor/github.com/opencontainers/go-digest/MAINTAINERS
generated
vendored
@@ -1,9 +0,0 @@
|
||||
Aaron Lehmann <aaron.lehmann@docker.com> (@aaronlehmann)
|
||||
Brandon Philips <brandon.philips@coreos.com> (@philips)
|
||||
Brendan Burns <bburns@microsoft.com> (@brendandburns)
|
||||
Derek McGowan <derek@mcgstyle.net> (@dmcgowan)
|
||||
Jason Bouzane <jbouzane@google.com> (@jbouzane)
|
||||
John Starks <jostarks@microsoft.com> (@jstarks)
|
||||
Jonathan Boulle <jon.boulle@coreos.com> (@jonboulle)
|
||||
Stephen Day <stephen.day@docker.com> (@stevvooe)
|
||||
Vincent Batts <vbatts@redhat.com> (@vbatts)
|
167
vendor/github.com/opencontainers/image-spec/README.md
generated
vendored
Normal file
167
vendor/github.com/opencontainers/image-spec/README.md
generated
vendored
Normal file
@@ -0,0 +1,167 @@
|
||||
# OCI Image Format Specification
|
||||
<div>
|
||||
<a href="https://travis-ci.org/opencontainers/image-spec">
|
||||
<img src="https://travis-ci.org/opencontainers/image-spec.svg?branch=master"></img>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
The OCI Image Format project creates and maintains the software shipping container image format spec (OCI Image Format).
|
||||
|
||||
**[The specification can be found here](spec.md).**
|
||||
|
||||
This repository also provides [Go types](specs-go), [intra-blob validation tooling, and JSON Schema](schema).
|
||||
The Go types and validation should be compatible with the current Go release; earlier Go releases are not supported.
|
||||
|
||||
Additional documentation about how this group operates:
|
||||
|
||||
- [Code of Conduct](https://github.com/opencontainers/tob/blob/d2f9d68c1332870e40693fe077d311e0742bc73d/code-of-conduct.md)
|
||||
- [Roadmap](#roadmap)
|
||||
- [Releases](RELEASES.md)
|
||||
- [Project Documentation](project.md)
|
||||
|
||||
The _optional_ and _base_ layers of all OCI projects are tracked in the [OCI Scope Table](https://www.opencontainers.org/about/oci-scope-table).
|
||||
|
||||
## Running an OCI Image
|
||||
|
||||
The OCI Image Format partner project is the [OCI Runtime Spec project](https://github.com/opencontainers/runtime-spec).
|
||||
The Runtime Specification outlines how to run a "[filesystem bundle](https://github.com/opencontainers/runtime-spec/blob/master/bundle.md)" that is unpacked on disk.
|
||||
At a high-level an OCI implementation would download an OCI Image then unpack that image into an OCI Runtime filesystem bundle.
|
||||
At this point the OCI Runtime Bundle would be run by an OCI Runtime.
|
||||
|
||||
This entire workflow supports the UX that users have come to expect from container engines like Docker and rkt: primarily, the ability to run an image with no additional arguments:
|
||||
|
||||
* docker run example.com/org/app:v1.0.0
|
||||
* rkt run example.com/org/app,version=v1.0.0
|
||||
|
||||
To support this UX the OCI Image Format contains sufficient information to launch the application on the target platform (e.g. command, arguments, environment variables, etc).
|
||||
|
||||
## FAQ
|
||||
|
||||
**Q: Why doesn't this project mention distribution?**
|
||||
|
||||
A: Distribution, for example using HTTP as both Docker v2.2 and AppC do today, is currently out of scope on the [OCI Scope Table](https://www.opencontainers.org/about/oci-scope-table).
|
||||
There has been [some discussion on the TOB mailing list](https://groups.google.com/a/opencontainers.org/d/msg/tob/A3JnmI-D-6Y/tLuptPDHAgAJ) to make distribution an optional layer, but this topic is a work in progress.
|
||||
|
||||
**Q: What happens to AppC or Docker Image Formats?**
|
||||
|
||||
A: Existing formats can continue to be a proving ground for technologies, as needed.
|
||||
The OCI Image Format project strives to provide a dependable open specification that can be shared between different tools and be evolved for years or decades of compatibility; as the deb and rpm format have.
|
||||
|
||||
Find more [FAQ on the OCI site](https://www.opencontainers.org/faq).
|
||||
|
||||
## Roadmap
|
||||
|
||||
The [GitHub milestones](https://github.com/opencontainers/image-spec/milestones) lay out the path to the OCI v1.0.0 release in late 2016.
|
||||
|
||||
# Contributing
|
||||
|
||||
Development happens on GitHub for the spec.
|
||||
Issues are used for bugs and actionable items and longer discussions can happen on the [mailing list](#mailing-list).
|
||||
|
||||
The specification and code is licensed under the Apache 2.0 license found in the `LICENSE` file of this repository.
|
||||
|
||||
## Discuss your design
|
||||
|
||||
The project welcomes submissions, but please let everyone know what you are working on.
|
||||
|
||||
Before undertaking a nontrivial change to this specification, send mail to the [mailing list](#mailing-list) to discuss what you plan to do.
|
||||
This gives everyone a chance to validate the design, helps prevent duplication of effort, and ensures that the idea fits.
|
||||
It also guarantees that the design is sound before code is written; a GitHub pull-request is not the place for high-level discussions.
|
||||
|
||||
Typos and grammatical errors can go straight to a pull-request.
|
||||
When in doubt, start on the [mailing-list](#mailing-list).
|
||||
|
||||
## Weekly Call
|
||||
|
||||
The contributors and maintainers of all OCI projects have a weekly meeting Wednesdays at 2:00 PM (USA Pacific).
|
||||
Everyone is welcome to participate via [UberConference web][UberConference] or audio-only: +1-415-968-0849 (no PIN needed).
|
||||
An initial agenda will be posted to the [mailing list](#mailing-list) earlier in the week, and everyone is welcome to propose additional topics or suggest other agenda alterations there.
|
||||
Minutes are posted to the [mailing list](#mailing-list) and minutes from past calls are archived [here][minutes].
|
||||
|
||||
## Mailing List
|
||||
|
||||
You can subscribe and join the mailing list on [Google Groups](https://groups.google.com/a/opencontainers.org/forum/#!forum/dev).
|
||||
|
||||
## IRC
|
||||
|
||||
OCI discussion happens on #opencontainers on Freenode ([logs][irc-logs]).
|
||||
|
||||
## Markdown style
|
||||
|
||||
To keep consistency throughout the Markdown files in the Open Container spec all files should be formatted one sentence per line.
|
||||
This fixes two things: it makes diffing easier with git and it resolves fights about line wrapping length.
|
||||
For example, this paragraph will span three lines in the Markdown source.
|
||||
|
||||
## Git commit
|
||||
|
||||
### Sign your work
|
||||
|
||||
The sign-off is a simple line at the end of the explanation for the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open-source patch.
|
||||
The rules are pretty simple: if you can certify the below (from [developercertificate.org](http://developercertificate.org/)):
|
||||
|
||||
```
|
||||
Developer Certificate of Origin
|
||||
Version 1.1
|
||||
|
||||
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
|
||||
660 York Street, Suite 102,
|
||||
San Francisco, CA 94110 USA
|
||||
|
||||
Everyone is permitted to copy and distribute verbatim copies of this
|
||||
license document, but changing it is not allowed.
|
||||
|
||||
|
||||
Developer's Certificate of Origin 1.1
|
||||
|
||||
By making a contribution to this project, I certify that:
|
||||
|
||||
(a) The contribution was created in whole or in part by me and I
|
||||
have the right to submit it under the open source license
|
||||
indicated in the file; or
|
||||
|
||||
(b) The contribution is based upon previous work that, to the best
|
||||
of my knowledge, is covered under an appropriate open source
|
||||
license and I have the right under that license to submit that
|
||||
work with modifications, whether created in whole or in part
|
||||
by me, under the same open source license (unless I am
|
||||
permitted to submit under a different license), as indicated
|
||||
in the file; or
|
||||
|
||||
(c) The contribution was provided directly to me by some other
|
||||
person who certified (a), (b) or (c) and I have not modified
|
||||
it.
|
||||
|
||||
(d) I understand and agree that this project and the contribution
|
||||
are public and that a record of the contribution (including all
|
||||
personal information I submit with it, including my sign-off) is
|
||||
maintained indefinitely and may be redistributed consistent with
|
||||
this project or the open source license(s) involved.
|
||||
```
|
||||
|
||||
then you just add a line to every git commit message:
|
||||
|
||||
Signed-off-by: Joe Smith <joe@gmail.com>
|
||||
|
||||
using your real name (sorry, no pseudonyms or anonymous contributions.)
|
||||
|
||||
You can add the sign off when creating the git commit via `git commit -s`.
|
||||
|
||||
### Commit Style
|
||||
|
||||
Simple house-keeping for clean git history.
|
||||
Read more on [How to Write a Git Commit Message](http://chris.beams.io/posts/git-commit/) or the Discussion section of [`git-commit(1)`](http://git-scm.com/docs/git-commit).
|
||||
|
||||
1. Separate the subject from body with a blank line
|
||||
2. Limit the subject line to 50 characters
|
||||
3. Capitalize the subject line
|
||||
4. Do not end the subject line with a period
|
||||
5. Use the imperative mood in the subject line
|
||||
6. Wrap the body at 72 characters
|
||||
7. Use the body to explain what and why vs. how
|
||||
* If there was important/useful/essential conversation or information, copy or include a reference
|
||||
8. When possible, one keyword to scope the change in the subject (i.e. "README: ...", "runtime: ...")
|
||||
|
||||
|
||||
[UberConference]: https://www.uberconference.com/opencontainers
|
||||
[irc-logs]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/
|
||||
[minutes]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/
|
244
vendor/github.com/opencontainers/runc/README.md
generated
vendored
Normal file
244
vendor/github.com/opencontainers/runc/README.md
generated
vendored
Normal file
@@ -0,0 +1,244 @@
|
||||
# runc
|
||||
|
||||
[](https://travis-ci.org/opencontainers/runc)
|
||||
[](https://goreportcard.com/report/github.com/opencontainers/runc)
|
||||
[](https://godoc.org/github.com/opencontainers/runc)
|
||||
|
||||
## Introduction
|
||||
|
||||
`runc` is a CLI tool for spawning and running containers according to the OCI specification.
|
||||
|
||||
## Releases
|
||||
|
||||
`runc` depends on and tracks the [runtime-spec](https://github.com/opencontainers/runtime-spec) repository.
|
||||
We will try to make sure that `runc` and the OCI specification major versions stay in lockstep.
|
||||
This means that `runc` 1.0.0 should implement the 1.0 version of the specification.
|
||||
|
||||
You can find official releases of `runc` on the [release](https://github.com/opencontainers/runc/releases) page.
|
||||
|
||||
### Security
|
||||
|
||||
If you wish to report a security issue, please disclose the issue responsibly
|
||||
to security@opencontainers.org.
|
||||
|
||||
## Building
|
||||
|
||||
`runc` currently supports the Linux platform with various architecture support.
|
||||
It must be built with Go version 1.6 or higher in order for some features to function properly.
|
||||
|
||||
In order to enable seccomp support you will need to install `libseccomp` on your platform.
|
||||
> e.g. `libseccomp-devel` for CentOS, or `libseccomp-dev` for Ubuntu
|
||||
|
||||
Otherwise, if you do not want to build `runc` with seccomp support you can add `BUILDTAGS=""` when running make.
|
||||
|
||||
```bash
|
||||
# create a 'github.com/opencontainers' in your GOPATH/src
|
||||
cd github.com/opencontainers
|
||||
git clone https://github.com/opencontainers/runc
|
||||
cd runc
|
||||
|
||||
make
|
||||
sudo make install
|
||||
```
|
||||
|
||||
`runc` will be installed to `/usr/local/sbin/runc` on your system.
|
||||
|
||||
#### Build Tags
|
||||
|
||||
`runc` supports optional build tags for compiling support of various features.
|
||||
To add build tags to the make option the `BUILDTAGS` variable must be set.
|
||||
|
||||
```bash
|
||||
make BUILDTAGS='seccomp apparmor'
|
||||
```
|
||||
|
||||
| Build Tag | Feature | Dependency |
|
||||
|-----------|------------------------------------|-------------|
|
||||
| seccomp | Syscall filtering | libseccomp |
|
||||
| selinux | selinux process and mount labeling | <none> |
|
||||
| apparmor | apparmor profile support | libapparmor |
|
||||
| ambient | ambient capability support | kernel 4.3 |
|
||||
|
||||
|
||||
### Running the test suite
|
||||
|
||||
`runc` currently supports running its test suite via Docker.
|
||||
To run the suite just type `make test`.
|
||||
|
||||
```bash
|
||||
make test
|
||||
```
|
||||
|
||||
There are additional make targets for running the tests outside of a container but this is not recommended as the tests are written with the expectation that they can write and remove anywhere.
|
||||
|
||||
You can run a specific test case by setting the `TESTFLAGS` variable.
|
||||
|
||||
```bash
|
||||
# make test TESTFLAGS="-run=SomeTestFunction"
|
||||
```
|
||||
|
||||
### Dependencies Management
|
||||
|
||||
`runc` uses [vndr](https://github.com/LK4D4/vndr) for dependencies management.
|
||||
Please refer to [vndr](https://github.com/LK4D4/vndr) for how to add or update
|
||||
new dependencies.
|
||||
|
||||
## Using runc
|
||||
|
||||
### Creating an OCI Bundle
|
||||
|
||||
In order to use runc you must have your container in the format of an OCI bundle.
|
||||
If you have Docker installed you can use its `export` method to acquire a root filesystem from an existing Docker container.
|
||||
|
||||
```bash
|
||||
# create the top most bundle directory
|
||||
mkdir /mycontainer
|
||||
cd /mycontainer
|
||||
|
||||
# create the rootfs directory
|
||||
mkdir rootfs
|
||||
|
||||
# export busybox via Docker into the rootfs directory
|
||||
docker export $(docker create busybox) | tar -C rootfs -xvf -
|
||||
```
|
||||
|
||||
After a root filesystem is populated you just generate a spec in the format of a `config.json` file inside your bundle.
|
||||
`runc` provides a `spec` command to generate a base template spec that you are then able to edit.
|
||||
To find features and documentation for fields in the spec please refer to the [specs](https://github.com/opencontainers/runtime-spec) repository.
|
||||
|
||||
```bash
|
||||
runc spec
|
||||
```
|
||||
|
||||
### Running Containers
|
||||
|
||||
Assuming you have an OCI bundle from the previous step you can execute the container in two different ways.
|
||||
|
||||
The first way is to use the convenience command `run` that will handle creating, starting, and deleting the container after it exits.
|
||||
|
||||
```bash
|
||||
# run as root
|
||||
cd /mycontainer
|
||||
runc run mycontainerid
|
||||
```
|
||||
|
||||
If you used the unmodified `runc spec` template this should give you a `sh` session inside the container.
|
||||
|
||||
The second way to start a container is using the specs lifecycle operations.
|
||||
This gives you more power over how the container is created and managed while it is running.
|
||||
This will also launch the container in the background so you will have to edit the `config.json` to remove the `terminal` setting for the simple examples here.
|
||||
Your process field in the `config.json` should look like this below with `"terminal": false` and `"args": ["sleep", "5"]`.
|
||||
|
||||
|
||||
```json
|
||||
"process": {
|
||||
"terminal": false,
|
||||
"user": {
|
||||
"uid": 0,
|
||||
"gid": 0
|
||||
},
|
||||
"args": [
|
||||
"sleep", "5"
|
||||
],
|
||||
"env": [
|
||||
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
|
||||
"TERM=xterm"
|
||||
],
|
||||
"cwd": "/",
|
||||
"capabilities": {
|
||||
"bounding": [
|
||||
"CAP_AUDIT_WRITE",
|
||||
"CAP_KILL",
|
||||
"CAP_NET_BIND_SERVICE"
|
||||
],
|
||||
"effective": [
|
||||
"CAP_AUDIT_WRITE",
|
||||
"CAP_KILL",
|
||||
"CAP_NET_BIND_SERVICE"
|
||||
],
|
||||
"inheritable": [
|
||||
"CAP_AUDIT_WRITE",
|
||||
"CAP_KILL",
|
||||
"CAP_NET_BIND_SERVICE"
|
||||
],
|
||||
"permitted": [
|
||||
"CAP_AUDIT_WRITE",
|
||||
"CAP_KILL",
|
||||
"CAP_NET_BIND_SERVICE"
|
||||
],
|
||||
"ambient": [
|
||||
"CAP_AUDIT_WRITE",
|
||||
"CAP_KILL",
|
||||
"CAP_NET_BIND_SERVICE"
|
||||
]
|
||||
},
|
||||
"rlimits": [
|
||||
{
|
||||
"type": "RLIMIT_NOFILE",
|
||||
"hard": 1024,
|
||||
"soft": 1024
|
||||
}
|
||||
],
|
||||
"noNewPrivileges": true
|
||||
},
|
||||
```
|
||||
|
||||
Now we can go through the lifecycle operations in your shell.
|
||||
|
||||
|
||||
```bash
|
||||
# run as root
|
||||
cd /mycontainer
|
||||
runc create mycontainerid
|
||||
|
||||
# view the container is created and in the "created" state
|
||||
runc list
|
||||
|
||||
# start the process inside the container
|
||||
runc start mycontainerid
|
||||
|
||||
# after 5 seconds view that the container has exited and is now in the stopped state
|
||||
runc list
|
||||
|
||||
# now delete the container
|
||||
runc delete mycontainerid
|
||||
```
|
||||
|
||||
This adds more complexity but allows higher level systems to manage runc and provides points in the containers creation to setup various settings after the container has created and/or before it is deleted.
|
||||
This is commonly used to setup the container's network stack after `create` but before `start` where the user's defined process will be running.
|
||||
|
||||
#### Rootless containers
|
||||
`runc` has the ability to run containers without root privileges. This is called `rootless`. You need to pass some parameters to `runc` in order to run rootless containers. See below and compare with the previous version. Run the following commands as an ordinary user:
|
||||
```bash
|
||||
# Same as the first example
|
||||
mkdir ~/mycontainer
|
||||
cd ~/mycontainer
|
||||
mkdir rootfs
|
||||
docker export $(docker create busybox) | tar -C rootfs -xvf -
|
||||
|
||||
# The --rootless parameter instructs runc spec to generate a configuration for a rootless container, which will allow you to run the container as a non-root user.
|
||||
runc spec --rootless
|
||||
|
||||
# The --root parameter tells runc where to store the container state. It must be writable by the user.
|
||||
runc --root /tmp/runc run mycontainerid
|
||||
```
|
||||
|
||||
#### Supervisors
|
||||
|
||||
`runc` can be used with process supervisors and init systems to ensure that containers are restarted when they exit.
|
||||
An example systemd unit file looks something like this.
|
||||
|
||||
```systemd
|
||||
[Unit]
|
||||
Description=Start My Container
|
||||
|
||||
[Service]
|
||||
Type=forking
|
||||
ExecStart=/usr/local/sbin/runc run -d --pid-file /run/mycontainerid.pid mycontainerid
|
||||
ExecStopPost=/usr/local/sbin/runc delete mycontainerid
|
||||
WorkingDirectory=/mycontainer
|
||||
PIDFile=/run/mycontainerid.pid
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
328
vendor/github.com/opencontainers/runc/libcontainer/README.md
generated
vendored
Normal file
328
vendor/github.com/opencontainers/runc/libcontainer/README.md
generated
vendored
Normal file
@@ -0,0 +1,328 @@
|
||||
# libcontainer
|
||||
|
||||
[](https://godoc.org/github.com/opencontainers/runc/libcontainer)
|
||||
|
||||
Libcontainer provides a native Go implementation for creating containers
|
||||
with namespaces, cgroups, capabilities, and filesystem access controls.
|
||||
It allows you to manage the lifecycle of the container performing additional operations
|
||||
after the container is created.
|
||||
|
||||
|
||||
#### Container
|
||||
A container is a self contained execution environment that shares the kernel of the
|
||||
host system and which is (optionally) isolated from other containers in the system.
|
||||
|
||||
#### Using libcontainer
|
||||
|
||||
Because containers are spawned in a two step process you will need a binary that
|
||||
will be executed as the init process for the container. In libcontainer, we use
|
||||
the current binary (/proc/self/exe) to be executed as the init process, and use
|
||||
arg "init", we call the first step process "bootstrap", so you always need a "init"
|
||||
function as the entry of "bootstrap".
|
||||
|
||||
In addition to the go init function the early stage bootstrap is handled by importing
|
||||
[nsenter](https://github.com/opencontainers/runc/blob/master/libcontainer/nsenter/README.md).
|
||||
|
||||
```go
|
||||
import (
|
||||
_ "github.com/opencontainers/runc/libcontainer/nsenter"
|
||||
)
|
||||
|
||||
func init() {
|
||||
if len(os.Args) > 1 && os.Args[1] == "init" {
|
||||
runtime.GOMAXPROCS(1)
|
||||
runtime.LockOSThread()
|
||||
factory, _ := libcontainer.New("")
|
||||
if err := factory.StartInitialization(); err != nil {
|
||||
logrus.Fatal(err)
|
||||
}
|
||||
panic("--this line should have never been executed, congratulations--")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Then to create a container you first have to initialize an instance of a factory
|
||||
that will handle the creation and initialization for a container.
|
||||
|
||||
```go
|
||||
factory, err := libcontainer.New("/var/lib/container", libcontainer.Cgroupfs, libcontainer.InitArgs(os.Args[0], "init"))
|
||||
if err != nil {
|
||||
logrus.Fatal(err)
|
||||
return
|
||||
}
|
||||
```
|
||||
|
||||
Once you have an instance of the factory created we can create a configuration
|
||||
struct describing how the container is to be created. A sample would look similar to this:
|
||||
|
||||
```go
|
||||
defaultMountFlags := unix.MS_NOEXEC | unix.MS_NOSUID | unix.MS_NODEV
|
||||
config := &configs.Config{
|
||||
Rootfs: "/your/path/to/rootfs",
|
||||
Capabilities: &configs.Capabilities{
|
||||
Bounding: []string{
|
||||
"CAP_CHOWN",
|
||||
"CAP_DAC_OVERRIDE",
|
||||
"CAP_FSETID",
|
||||
"CAP_FOWNER",
|
||||
"CAP_MKNOD",
|
||||
"CAP_NET_RAW",
|
||||
"CAP_SETGID",
|
||||
"CAP_SETUID",
|
||||
"CAP_SETFCAP",
|
||||
"CAP_SETPCAP",
|
||||
"CAP_NET_BIND_SERVICE",
|
||||
"CAP_SYS_CHROOT",
|
||||
"CAP_KILL",
|
||||
"CAP_AUDIT_WRITE",
|
||||
},
|
||||
Effective: []string{
|
||||
"CAP_CHOWN",
|
||||
"CAP_DAC_OVERRIDE",
|
||||
"CAP_FSETID",
|
||||
"CAP_FOWNER",
|
||||
"CAP_MKNOD",
|
||||
"CAP_NET_RAW",
|
||||
"CAP_SETGID",
|
||||
"CAP_SETUID",
|
||||
"CAP_SETFCAP",
|
||||
"CAP_SETPCAP",
|
||||
"CAP_NET_BIND_SERVICE",
|
||||
"CAP_SYS_CHROOT",
|
||||
"CAP_KILL",
|
||||
"CAP_AUDIT_WRITE",
|
||||
},
|
||||
Inheritable: []string{
|
||||
"CAP_CHOWN",
|
||||
"CAP_DAC_OVERRIDE",
|
||||
"CAP_FSETID",
|
||||
"CAP_FOWNER",
|
||||
"CAP_MKNOD",
|
||||
"CAP_NET_RAW",
|
||||
"CAP_SETGID",
|
||||
"CAP_SETUID",
|
||||
"CAP_SETFCAP",
|
||||
"CAP_SETPCAP",
|
||||
"CAP_NET_BIND_SERVICE",
|
||||
"CAP_SYS_CHROOT",
|
||||
"CAP_KILL",
|
||||
"CAP_AUDIT_WRITE",
|
||||
},
|
||||
Permitted: []string{
|
||||
"CAP_CHOWN",
|
||||
"CAP_DAC_OVERRIDE",
|
||||
"CAP_FSETID",
|
||||
"CAP_FOWNER",
|
||||
"CAP_MKNOD",
|
||||
"CAP_NET_RAW",
|
||||
"CAP_SETGID",
|
||||
"CAP_SETUID",
|
||||
"CAP_SETFCAP",
|
||||
"CAP_SETPCAP",
|
||||
"CAP_NET_BIND_SERVICE",
|
||||
"CAP_SYS_CHROOT",
|
||||
"CAP_KILL",
|
||||
"CAP_AUDIT_WRITE",
|
||||
},
|
||||
Ambient: []string{
|
||||
"CAP_CHOWN",
|
||||
"CAP_DAC_OVERRIDE",
|
||||
"CAP_FSETID",
|
||||
"CAP_FOWNER",
|
||||
"CAP_MKNOD",
|
||||
"CAP_NET_RAW",
|
||||
"CAP_SETGID",
|
||||
"CAP_SETUID",
|
||||
"CAP_SETFCAP",
|
||||
"CAP_SETPCAP",
|
||||
"CAP_NET_BIND_SERVICE",
|
||||
"CAP_SYS_CHROOT",
|
||||
"CAP_KILL",
|
||||
"CAP_AUDIT_WRITE",
|
||||
},
|
||||
},
|
||||
Namespaces: configs.Namespaces([]configs.Namespace{
|
||||
{Type: configs.NEWNS},
|
||||
{Type: configs.NEWUTS},
|
||||
{Type: configs.NEWIPC},
|
||||
{Type: configs.NEWPID},
|
||||
{Type: configs.NEWUSER},
|
||||
{Type: configs.NEWNET},
|
||||
}),
|
||||
Cgroups: &configs.Cgroup{
|
||||
Name: "test-container",
|
||||
Parent: "system",
|
||||
Resources: &configs.Resources{
|
||||
MemorySwappiness: nil,
|
||||
AllowAllDevices: nil,
|
||||
AllowedDevices: configs.DefaultAllowedDevices,
|
||||
},
|
||||
},
|
||||
MaskPaths: []string{
|
||||
"/proc/kcore",
|
||||
"/sys/firmware",
|
||||
},
|
||||
ReadonlyPaths: []string{
|
||||
"/proc/sys", "/proc/sysrq-trigger", "/proc/irq", "/proc/bus",
|
||||
},
|
||||
Devices: configs.DefaultAutoCreatedDevices,
|
||||
Hostname: "testing",
|
||||
Mounts: []*configs.Mount{
|
||||
{
|
||||
Source: "proc",
|
||||
Destination: "/proc",
|
||||
Device: "proc",
|
||||
Flags: defaultMountFlags,
|
||||
},
|
||||
{
|
||||
Source: "tmpfs",
|
||||
Destination: "/dev",
|
||||
Device: "tmpfs",
|
||||
Flags: unix.MS_NOSUID | unix.MS_STRICTATIME,
|
||||
Data: "mode=755",
|
||||
},
|
||||
{
|
||||
Source: "devpts",
|
||||
Destination: "/dev/pts",
|
||||
Device: "devpts",
|
||||
Flags: unix.MS_NOSUID | unix.MS_NOEXEC,
|
||||
Data: "newinstance,ptmxmode=0666,mode=0620,gid=5",
|
||||
},
|
||||
{
|
||||
Device: "tmpfs",
|
||||
Source: "shm",
|
||||
Destination: "/dev/shm",
|
||||
Data: "mode=1777,size=65536k",
|
||||
Flags: defaultMountFlags,
|
||||
},
|
||||
{
|
||||
Source: "mqueue",
|
||||
Destination: "/dev/mqueue",
|
||||
Device: "mqueue",
|
||||
Flags: defaultMountFlags,
|
||||
},
|
||||
{
|
||||
Source: "sysfs",
|
||||
Destination: "/sys",
|
||||
Device: "sysfs",
|
||||
Flags: defaultMountFlags | unix.MS_RDONLY,
|
||||
},
|
||||
},
|
||||
UidMappings: []configs.IDMap{
|
||||
{
|
||||
ContainerID: 0,
|
||||
HostID: 1000,
|
||||
Size: 65536,
|
||||
},
|
||||
},
|
||||
GidMappings: []configs.IDMap{
|
||||
{
|
||||
ContainerID: 0,
|
||||
HostID: 1000,
|
||||
Size: 65536,
|
||||
},
|
||||
},
|
||||
Networks: []*configs.Network{
|
||||
{
|
||||
Type: "loopback",
|
||||
Address: "127.0.0.1/0",
|
||||
Gateway: "localhost",
|
||||
},
|
||||
},
|
||||
Rlimits: []configs.Rlimit{
|
||||
{
|
||||
Type: unix.RLIMIT_NOFILE,
|
||||
Hard: uint64(1025),
|
||||
Soft: uint64(1025),
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
Once you have the configuration populated you can create a container:
|
||||
|
||||
```go
|
||||
container, err := factory.Create("container-id", config)
|
||||
if err != nil {
|
||||
logrus.Fatal(err)
|
||||
return
|
||||
}
|
||||
```
|
||||
|
||||
To spawn bash as the initial process inside the container and have the
|
||||
processes pid returned in order to wait, signal, or kill the process:
|
||||
|
||||
```go
|
||||
process := &libcontainer.Process{
|
||||
Args: []string{"/bin/bash"},
|
||||
Env: []string{"PATH=/bin"},
|
||||
User: "daemon",
|
||||
Stdin: os.Stdin,
|
||||
Stdout: os.Stdout,
|
||||
Stderr: os.Stderr,
|
||||
}
|
||||
|
||||
err := container.Run(process)
|
||||
if err != nil {
|
||||
container.Destroy()
|
||||
logrus.Fatal(err)
|
||||
return
|
||||
}
|
||||
|
||||
// wait for the process to finish.
|
||||
_, err := process.Wait()
|
||||
if err != nil {
|
||||
logrus.Fatal(err)
|
||||
}
|
||||
|
||||
// destroy the container.
|
||||
container.Destroy()
|
||||
```
|
||||
|
||||
Additional ways to interact with a running container are:
|
||||
|
||||
```go
|
||||
// return all the pids for all processes running inside the container.
|
||||
processes, err := container.Processes()
|
||||
|
||||
// get detailed cpu, memory, io, and network statistics for the container and
|
||||
// it's processes.
|
||||
stats, err := container.Stats()
|
||||
|
||||
// pause all processes inside the container.
|
||||
container.Pause()
|
||||
|
||||
// resume all paused processes.
|
||||
container.Resume()
|
||||
|
||||
// send signal to container's init process.
|
||||
container.Signal(signal)
|
||||
|
||||
// update container resource constraints.
|
||||
container.Set(config)
|
||||
|
||||
// get current status of the container.
|
||||
status, err := container.Status()
|
||||
|
||||
// get current container's state information.
|
||||
state, err := container.State()
|
||||
```
|
||||
|
||||
|
||||
#### Checkpoint & Restore
|
||||
|
||||
libcontainer now integrates [CRIU](http://criu.org/) for checkpointing and restoring containers.
|
||||
This let's you save the state of a process running inside a container to disk, and then restore
|
||||
that state into a new process, on the same machine or on another machine.
|
||||
|
||||
`criu` version 1.5.2 or higher is required to use checkpoint and restore.
|
||||
If you don't already have `criu` installed, you can build it from source, following the
|
||||
[online instructions](http://criu.org/Installation). `criu` is also installed in the docker image
|
||||
generated when building libcontainer with docker.
|
||||
|
||||
|
||||
## Copyright and license
|
||||
|
||||
Code and documentation copyright 2014 Docker, inc. Code released under the Apache 2.0 license.
|
||||
Docs released under Creative commons.
|
||||
|
44
vendor/github.com/opencontainers/runc/libcontainer/nsenter/README.md
generated
vendored
Normal file
44
vendor/github.com/opencontainers/runc/libcontainer/nsenter/README.md
generated
vendored
Normal file
@@ -0,0 +1,44 @@
|
||||
## nsenter
|
||||
|
||||
The `nsenter` package registers a special init constructor that is called before
|
||||
the Go runtime has a chance to boot. This provides us the ability to `setns` on
|
||||
existing namespaces and avoid the issues that the Go runtime has with multiple
|
||||
threads. This constructor will be called if this package is registered,
|
||||
imported, in your go application.
|
||||
|
||||
The `nsenter` package will `import "C"` and it uses [cgo](https://golang.org/cmd/cgo/)
|
||||
package. In cgo, if the import of "C" is immediately preceded by a comment, that comment,
|
||||
called the preamble, is used as a header when compiling the C parts of the package.
|
||||
So every time we import package `nsenter`, the C code function `nsexec()` would be
|
||||
called. And package `nsenter` is now only imported in `main_unix.go`, so every time
|
||||
before we call `cmd.Start` on linux, that C code would run.
|
||||
|
||||
Because `nsexec()` must be run before the Go runtime in order to use the
|
||||
Linux kernel namespace, you must `import` this library into a package if
|
||||
you plan to use `libcontainer` directly. Otherwise Go will not execute
|
||||
the `nsexec()` constructor, which means that the re-exec will not cause
|
||||
the namespaces to be joined. You can import it like this:
|
||||
|
||||
```go
|
||||
import _ "github.com/opencontainers/runc/libcontainer/nsenter"
|
||||
```
|
||||
|
||||
`nsexec()` will first get the file descriptor number for the init pipe
|
||||
from the environment variable `_LIBCONTAINER_INITPIPE` (which was opened
|
||||
by the parent and kept open across the fork-exec of the `nsexec()` init
|
||||
process). The init pipe is used to read bootstrap data (namespace paths,
|
||||
clone flags, uid and gid mappings, and the console path) from the parent
|
||||
process. `nsexec()` will then call `setns(2)` to join the namespaces
|
||||
provided in the bootstrap data (if available), `clone(2)` a child process
|
||||
with the provided clone flags, update the user and group ID mappings, do
|
||||
some further miscellaneous setup steps, and then send the PID of the
|
||||
child process to the parent of the `nsexec()` "caller". Finally,
|
||||
the parent `nsexec()` will exit and the child `nsexec()` process will
|
||||
return to allow the Go runtime take over.
|
||||
|
||||
NOTE: We do both `setns(2)` and `clone(2)` even if we don't have any
|
||||
CLONE_NEW* clone flags because we must fork a new process in order to
|
||||
enter the PID namespace.
|
||||
|
||||
|
||||
|
32
vendor/github.com/opencontainers/runc/libcontainer/nsenter/namespace.h
generated
vendored
Normal file
32
vendor/github.com/opencontainers/runc/libcontainer/nsenter/namespace.h
generated
vendored
Normal file
@@ -0,0 +1,32 @@
|
||||
#ifndef NSENTER_NAMESPACE_H
|
||||
#define NSENTER_NAMESPACE_H
|
||||
|
||||
#ifndef _GNU_SOURCE
|
||||
# define _GNU_SOURCE
|
||||
#endif
|
||||
#include <sched.h>
|
||||
|
||||
/* All of these are taken from include/uapi/linux/sched.h */
|
||||
#ifndef CLONE_NEWNS
|
||||
# define CLONE_NEWNS 0x00020000 /* New mount namespace group */
|
||||
#endif
|
||||
#ifndef CLONE_NEWCGROUP
|
||||
# define CLONE_NEWCGROUP 0x02000000 /* New cgroup namespace */
|
||||
#endif
|
||||
#ifndef CLONE_NEWUTS
|
||||
# define CLONE_NEWUTS 0x04000000 /* New utsname namespace */
|
||||
#endif
|
||||
#ifndef CLONE_NEWIPC
|
||||
# define CLONE_NEWIPC 0x08000000 /* New ipc namespace */
|
||||
#endif
|
||||
#ifndef CLONE_NEWUSER
|
||||
# define CLONE_NEWUSER 0x10000000 /* New user namespace */
|
||||
#endif
|
||||
#ifndef CLONE_NEWPID
|
||||
# define CLONE_NEWPID 0x20000000 /* New pid namespace */
|
||||
#endif
|
||||
#ifndef CLONE_NEWNET
|
||||
# define CLONE_NEWNET 0x40000000 /* New network namespace */
|
||||
#endif
|
||||
|
||||
#endif /* NSENTER_NAMESPACE_H */
|
12
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter.go
generated
vendored
Normal file
12
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter.go
generated
vendored
Normal file
@@ -0,0 +1,12 @@
|
||||
// +build linux,!gccgo
|
||||
|
||||
package nsenter
|
||||
|
||||
/*
|
||||
#cgo CFLAGS: -Wall
|
||||
extern void nsexec();
|
||||
void __attribute__((constructor)) init(void) {
|
||||
nsexec();
|
||||
}
|
||||
*/
|
||||
import "C"
|
25
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter_gccgo.go
generated
vendored
Normal file
25
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter_gccgo.go
generated
vendored
Normal file
@@ -0,0 +1,25 @@
|
||||
// +build linux,gccgo
|
||||
|
||||
package nsenter
|
||||
|
||||
/*
|
||||
#cgo CFLAGS: -Wall
|
||||
extern void nsexec();
|
||||
void __attribute__((constructor)) init(void) {
|
||||
nsexec();
|
||||
}
|
||||
*/
|
||||
import "C"
|
||||
|
||||
// AlwaysFalse is here to stay false
|
||||
// (and be exported so the compiler doesn't optimize out its reference)
|
||||
var AlwaysFalse bool
|
||||
|
||||
func init() {
|
||||
if AlwaysFalse {
|
||||
// by referencing this C init() in a noop test, it will ensure the compiler
|
||||
// links in the C function.
|
||||
// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65134
|
||||
C.init()
|
||||
}
|
||||
}
|
5
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter_unsupported.go
generated
vendored
Normal file
5
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsenter_unsupported.go
generated
vendored
Normal file
@@ -0,0 +1,5 @@
|
||||
// +build !linux !cgo
|
||||
|
||||
package nsenter
|
||||
|
||||
import "C"
|
862
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsexec.c
generated
vendored
Normal file
862
vendor/github.com/opencontainers/runc/libcontainer/nsenter/nsexec.c
generated
vendored
Normal file
@@ -0,0 +1,862 @@
|
||||
#define _GNU_SOURCE
|
||||
#include <endian.h>
|
||||
#include <errno.h>
|
||||
#include <fcntl.h>
|
||||
#include <grp.h>
|
||||
#include <sched.h>
|
||||
#include <setjmp.h>
|
||||
#include <signal.h>
|
||||
#include <stdarg.h>
|
||||
#include <stdbool.h>
|
||||
#include <stdint.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <stdbool.h>
|
||||
#include <string.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include <sys/ioctl.h>
|
||||
#include <sys/prctl.h>
|
||||
#include <sys/socket.h>
|
||||
#include <sys/types.h>
|
||||
|
||||
#include <linux/limits.h>
|
||||
#include <linux/netlink.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
/* Get all of the CLONE_NEW* flags. */
|
||||
#include "namespace.h"
|
||||
|
||||
/* Synchronisation values. */
|
||||
enum sync_t {
|
||||
SYNC_USERMAP_PLS = 0x40, /* Request parent to map our users. */
|
||||
SYNC_USERMAP_ACK = 0x41, /* Mapping finished by the parent. */
|
||||
SYNC_RECVPID_PLS = 0x42, /* Tell parent we're sending the PID. */
|
||||
SYNC_RECVPID_ACK = 0x43, /* PID was correctly received by parent. */
|
||||
SYNC_GRANDCHILD = 0x44, /* The grandchild is ready to run. */
|
||||
SYNC_CHILD_READY = 0x45, /* The child or grandchild is ready to return. */
|
||||
|
||||
/* XXX: This doesn't help with segfaults and other such issues. */
|
||||
SYNC_ERR = 0xFF, /* Fatal error, no turning back. The error code follows. */
|
||||
};
|
||||
|
||||
/* longjmp() arguments. */
|
||||
#define JUMP_PARENT 0x00
|
||||
#define JUMP_CHILD 0xA0
|
||||
#define JUMP_INIT 0xA1
|
||||
|
||||
/* JSON buffer. */
|
||||
#define JSON_MAX 4096
|
||||
|
||||
/* Assume the stack grows down, so arguments should be above it. */
|
||||
struct clone_t {
|
||||
/*
|
||||
* Reserve some space for clone() to locate arguments
|
||||
* and retcode in this place
|
||||
*/
|
||||
char stack[4096] __attribute__ ((aligned(16)));
|
||||
char stack_ptr[0];
|
||||
|
||||
/* There's two children. This is used to execute the different code. */
|
||||
jmp_buf *env;
|
||||
int jmpval;
|
||||
};
|
||||
|
||||
struct nlconfig_t {
|
||||
char *data;
|
||||
uint32_t cloneflags;
|
||||
char *uidmap;
|
||||
size_t uidmap_len;
|
||||
char *gidmap;
|
||||
size_t gidmap_len;
|
||||
char *namespaces;
|
||||
size_t namespaces_len;
|
||||
uint8_t is_setgroup;
|
||||
uint8_t is_rootless;
|
||||
char *oom_score_adj;
|
||||
size_t oom_score_adj_len;
|
||||
};
|
||||
|
||||
/*
|
||||
* List of netlink message types sent to us as part of bootstrapping the init.
|
||||
* These constants are defined in libcontainer/message_linux.go.
|
||||
*/
|
||||
#define INIT_MSG 62000
|
||||
#define CLONE_FLAGS_ATTR 27281
|
||||
#define NS_PATHS_ATTR 27282
|
||||
#define UIDMAP_ATTR 27283
|
||||
#define GIDMAP_ATTR 27284
|
||||
#define SETGROUP_ATTR 27285
|
||||
#define OOM_SCORE_ADJ_ATTR 27286
|
||||
#define ROOTLESS_ATTR 27287
|
||||
|
||||
/*
|
||||
* Use the raw syscall for versions of glibc which don't include a function for
|
||||
* it, namely (glibc 2.12).
|
||||
*/
|
||||
#if __GLIBC__ == 2 && __GLIBC_MINOR__ < 14
|
||||
# define _GNU_SOURCE
|
||||
# include "syscall.h"
|
||||
# if !defined(SYS_setns) && defined(__NR_setns)
|
||||
# define SYS_setns __NR_setns
|
||||
# endif
|
||||
|
||||
#ifndef SYS_setns
|
||||
# error "setns(2) syscall not supported by glibc version"
|
||||
#endif
|
||||
|
||||
int setns(int fd, int nstype)
|
||||
{
|
||||
return syscall(SYS_setns, fd, nstype);
|
||||
}
|
||||
#endif
|
||||
|
||||
/* XXX: This is ugly. */
|
||||
static int syncfd = -1;
|
||||
|
||||
/* TODO(cyphar): Fix this so it correctly deals with syncT. */
|
||||
#define bail(fmt, ...) \
|
||||
do { \
|
||||
int ret = __COUNTER__ + 1; \
|
||||
fprintf(stderr, "nsenter: " fmt ": %m\n", ##__VA_ARGS__); \
|
||||
if (syncfd >= 0) { \
|
||||
enum sync_t s = SYNC_ERR; \
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) \
|
||||
fprintf(stderr, "nsenter: failed: write(s)"); \
|
||||
if (write(syncfd, &ret, sizeof(ret)) != sizeof(ret)) \
|
||||
fprintf(stderr, "nsenter: failed: write(ret)"); \
|
||||
} \
|
||||
exit(ret); \
|
||||
} while(0)
|
||||
|
||||
static int write_file(char *data, size_t data_len, char *pathfmt, ...)
|
||||
{
|
||||
int fd, len, ret = 0;
|
||||
char path[PATH_MAX];
|
||||
|
||||
va_list ap;
|
||||
va_start(ap, pathfmt);
|
||||
len = vsnprintf(path, PATH_MAX, pathfmt, ap);
|
||||
va_end(ap);
|
||||
if (len < 0)
|
||||
return -1;
|
||||
|
||||
fd = open(path, O_RDWR);
|
||||
if (fd < 0) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
len = write(fd, data, data_len);
|
||||
if (len != data_len) {
|
||||
ret = -1;
|
||||
goto out;
|
||||
}
|
||||
|
||||
out:
|
||||
close(fd);
|
||||
return ret;
|
||||
}
|
||||
|
||||
enum policy_t {
|
||||
SETGROUPS_DEFAULT = 0,
|
||||
SETGROUPS_ALLOW,
|
||||
SETGROUPS_DENY,
|
||||
};
|
||||
|
||||
/* This *must* be called before we touch gid_map. */
|
||||
static void update_setgroups(int pid, enum policy_t setgroup)
|
||||
{
|
||||
char *policy;
|
||||
|
||||
switch (setgroup) {
|
||||
case SETGROUPS_ALLOW:
|
||||
policy = "allow";
|
||||
break;
|
||||
case SETGROUPS_DENY:
|
||||
policy = "deny";
|
||||
break;
|
||||
case SETGROUPS_DEFAULT:
|
||||
default:
|
||||
/* Nothing to do. */
|
||||
return;
|
||||
}
|
||||
|
||||
if (write_file(policy, strlen(policy), "/proc/%d/setgroups", pid) < 0) {
|
||||
/*
|
||||
* If the kernel is too old to support /proc/pid/setgroups,
|
||||
* open(2) or write(2) will return ENOENT. This is fine.
|
||||
*/
|
||||
if (errno != ENOENT)
|
||||
bail("failed to write '%s' to /proc/%d/setgroups", policy, pid);
|
||||
}
|
||||
}
|
||||
|
||||
static void update_uidmap(int pid, char *map, size_t map_len)
|
||||
{
|
||||
if (map == NULL || map_len <= 0)
|
||||
return;
|
||||
|
||||
if (write_file(map, map_len, "/proc/%d/uid_map", pid) < 0)
|
||||
bail("failed to update /proc/%d/uid_map", pid);
|
||||
}
|
||||
|
||||
static void update_gidmap(int pid, char *map, size_t map_len)
|
||||
{
|
||||
if (map == NULL || map_len <= 0)
|
||||
return;
|
||||
|
||||
if (write_file(map, map_len, "/proc/%d/gid_map", pid) < 0)
|
||||
bail("failed to update /proc/%d/gid_map", pid);
|
||||
}
|
||||
|
||||
static void update_oom_score_adj(char *data, size_t len)
|
||||
{
|
||||
if (data == NULL || len <= 0)
|
||||
return;
|
||||
|
||||
if (write_file(data, len, "/proc/self/oom_score_adj") < 0)
|
||||
bail("failed to update /proc/self/oom_score_adj");
|
||||
}
|
||||
|
||||
/* A dummy function that just jumps to the given jumpval. */
|
||||
static int child_func(void *arg) __attribute__ ((noinline));
|
||||
static int child_func(void *arg)
|
||||
{
|
||||
struct clone_t *ca = (struct clone_t *)arg;
|
||||
longjmp(*ca->env, ca->jmpval);
|
||||
}
|
||||
|
||||
static int clone_parent(jmp_buf *env, int jmpval) __attribute__ ((noinline));
|
||||
static int clone_parent(jmp_buf *env, int jmpval)
|
||||
{
|
||||
struct clone_t ca = {
|
||||
.env = env,
|
||||
.jmpval = jmpval,
|
||||
};
|
||||
|
||||
return clone(child_func, ca.stack_ptr, CLONE_PARENT | SIGCHLD, &ca);
|
||||
}
|
||||
|
||||
/*
|
||||
* Gets the init pipe fd from the environment, which is used to read the
|
||||
* bootstrap data and tell the parent what the new pid is after we finish
|
||||
* setting up the environment.
|
||||
*/
|
||||
static int initpipe(void)
|
||||
{
|
||||
int pipenum;
|
||||
char *initpipe, *endptr;
|
||||
|
||||
initpipe = getenv("_LIBCONTAINER_INITPIPE");
|
||||
if (initpipe == NULL || *initpipe == '\0')
|
||||
return -1;
|
||||
|
||||
pipenum = strtol(initpipe, &endptr, 10);
|
||||
if (*endptr != '\0')
|
||||
bail("unable to parse _LIBCONTAINER_INITPIPE");
|
||||
|
||||
return pipenum;
|
||||
}
|
||||
|
||||
/* Returns the clone(2) flag for a namespace, given the name of a namespace. */
|
||||
static int nsflag(char *name)
|
||||
{
|
||||
if (!strcmp(name, "cgroup"))
|
||||
return CLONE_NEWCGROUP;
|
||||
else if (!strcmp(name, "ipc"))
|
||||
return CLONE_NEWIPC;
|
||||
else if (!strcmp(name, "mnt"))
|
||||
return CLONE_NEWNS;
|
||||
else if (!strcmp(name, "net"))
|
||||
return CLONE_NEWNET;
|
||||
else if (!strcmp(name, "pid"))
|
||||
return CLONE_NEWPID;
|
||||
else if (!strcmp(name, "user"))
|
||||
return CLONE_NEWUSER;
|
||||
else if (!strcmp(name, "uts"))
|
||||
return CLONE_NEWUTS;
|
||||
|
||||
/* If we don't recognise a name, fallback to 0. */
|
||||
return 0;
|
||||
}
|
||||
|
||||
static uint32_t readint32(char *buf)
|
||||
{
|
||||
return *(uint32_t *) buf;
|
||||
}
|
||||
|
||||
static uint8_t readint8(char *buf)
|
||||
{
|
||||
return *(uint8_t *) buf;
|
||||
}
|
||||
|
||||
static void nl_parse(int fd, struct nlconfig_t *config)
|
||||
{
|
||||
size_t len, size;
|
||||
struct nlmsghdr hdr;
|
||||
char *data, *current;
|
||||
|
||||
/* Retrieve the netlink header. */
|
||||
len = read(fd, &hdr, NLMSG_HDRLEN);
|
||||
if (len != NLMSG_HDRLEN)
|
||||
bail("invalid netlink header length %zu", len);
|
||||
|
||||
if (hdr.nlmsg_type == NLMSG_ERROR)
|
||||
bail("failed to read netlink message");
|
||||
|
||||
if (hdr.nlmsg_type != INIT_MSG)
|
||||
bail("unexpected msg type %d", hdr.nlmsg_type);
|
||||
|
||||
/* Retrieve data. */
|
||||
size = NLMSG_PAYLOAD(&hdr, 0);
|
||||
current = data = malloc(size);
|
||||
if (!data)
|
||||
bail("failed to allocate %zu bytes of memory for nl_payload", size);
|
||||
|
||||
len = read(fd, data, size);
|
||||
if (len != size)
|
||||
bail("failed to read netlink payload, %zu != %zu", len, size);
|
||||
|
||||
/* Parse the netlink payload. */
|
||||
config->data = data;
|
||||
while (current < data + size) {
|
||||
struct nlattr *nlattr = (struct nlattr *)current;
|
||||
size_t payload_len = nlattr->nla_len - NLA_HDRLEN;
|
||||
|
||||
/* Advance to payload. */
|
||||
current += NLA_HDRLEN;
|
||||
|
||||
/* Handle payload. */
|
||||
switch (nlattr->nla_type) {
|
||||
case CLONE_FLAGS_ATTR:
|
||||
config->cloneflags = readint32(current);
|
||||
break;
|
||||
case ROOTLESS_ATTR:
|
||||
config->is_rootless = readint8(current);
|
||||
break;
|
||||
case OOM_SCORE_ADJ_ATTR:
|
||||
config->oom_score_adj = current;
|
||||
config->oom_score_adj_len = payload_len;
|
||||
break;
|
||||
case NS_PATHS_ATTR:
|
||||
config->namespaces = current;
|
||||
config->namespaces_len = payload_len;
|
||||
break;
|
||||
case UIDMAP_ATTR:
|
||||
config->uidmap = current;
|
||||
config->uidmap_len = payload_len;
|
||||
break;
|
||||
case GIDMAP_ATTR:
|
||||
config->gidmap = current;
|
||||
config->gidmap_len = payload_len;
|
||||
break;
|
||||
case SETGROUP_ATTR:
|
||||
config->is_setgroup = readint8(current);
|
||||
break;
|
||||
default:
|
||||
bail("unknown netlink message type %d", nlattr->nla_type);
|
||||
}
|
||||
|
||||
current += NLA_ALIGN(payload_len);
|
||||
}
|
||||
}
|
||||
|
||||
void nl_free(struct nlconfig_t *config)
|
||||
{
|
||||
free(config->data);
|
||||
}
|
||||
|
||||
void join_namespaces(char *nslist)
|
||||
{
|
||||
int num = 0, i;
|
||||
char *saveptr = NULL;
|
||||
char *namespace = strtok_r(nslist, ",", &saveptr);
|
||||
struct namespace_t {
|
||||
int fd;
|
||||
int ns;
|
||||
char type[PATH_MAX];
|
||||
char path[PATH_MAX];
|
||||
} *namespaces = NULL;
|
||||
|
||||
if (!namespace || !strlen(namespace) || !strlen(nslist))
|
||||
bail("ns paths are empty");
|
||||
|
||||
/*
|
||||
* We have to open the file descriptors first, since after
|
||||
* we join the mnt namespace we might no longer be able to
|
||||
* access the paths.
|
||||
*/
|
||||
do {
|
||||
int fd;
|
||||
char *path;
|
||||
struct namespace_t *ns;
|
||||
|
||||
/* Resize the namespace array. */
|
||||
namespaces = realloc(namespaces, ++num * sizeof(struct namespace_t));
|
||||
if (!namespaces)
|
||||
bail("failed to reallocate namespace array");
|
||||
ns = &namespaces[num - 1];
|
||||
|
||||
/* Split 'ns:path'. */
|
||||
path = strstr(namespace, ":");
|
||||
if (!path)
|
||||
bail("failed to parse %s", namespace);
|
||||
*path++ = '\0';
|
||||
|
||||
fd = open(path, O_RDONLY);
|
||||
if (fd < 0)
|
||||
bail("failed to open %s", path);
|
||||
|
||||
ns->fd = fd;
|
||||
ns->ns = nsflag(namespace);
|
||||
strncpy(ns->path, path, PATH_MAX);
|
||||
} while ((namespace = strtok_r(NULL, ",", &saveptr)) != NULL);
|
||||
|
||||
/*
|
||||
* The ordering in which we join namespaces is important. We should
|
||||
* always join the user namespace *first*. This is all guaranteed
|
||||
* from the container_linux.go side of this, so we're just going to
|
||||
* follow the order given to us.
|
||||
*/
|
||||
|
||||
for (i = 0; i < num; i++) {
|
||||
struct namespace_t ns = namespaces[i];
|
||||
|
||||
if (setns(ns.fd, ns.ns) < 0)
|
||||
bail("failed to setns to %s", ns.path);
|
||||
|
||||
close(ns.fd);
|
||||
}
|
||||
|
||||
free(namespaces);
|
||||
}
|
||||
|
||||
void nsexec(void)
|
||||
{
|
||||
int pipenum;
|
||||
jmp_buf env;
|
||||
int sync_child_pipe[2], sync_grandchild_pipe[2];
|
||||
struct nlconfig_t config = {0};
|
||||
|
||||
/*
|
||||
* If we don't have an init pipe, just return to the go routine.
|
||||
* We'll only get an init pipe for start or exec.
|
||||
*/
|
||||
pipenum = initpipe();
|
||||
if (pipenum == -1)
|
||||
return;
|
||||
|
||||
/* Parse all of the netlink configuration. */
|
||||
nl_parse(pipenum, &config);
|
||||
|
||||
/* Set oom_score_adj. This has to be done before !dumpable because
|
||||
* /proc/self/oom_score_adj is not writeable unless you're an privileged
|
||||
* user (if !dumpable is set). All children inherit their parent's
|
||||
* oom_score_adj value on fork(2) so this will always be propagated
|
||||
* properly.
|
||||
*/
|
||||
update_oom_score_adj(config.oom_score_adj, config.oom_score_adj_len);
|
||||
|
||||
/*
|
||||
* Make the process non-dumpable, to avoid various race conditions that
|
||||
* could cause processes in namespaces we're joining to access host
|
||||
* resources (or potentially execute code).
|
||||
*
|
||||
* However, if the number of namespaces we are joining is 0, we are not
|
||||
* going to be switching to a different security context. Thus setting
|
||||
* ourselves to be non-dumpable only breaks things (like rootless
|
||||
* containers), which is the recommendation from the kernel folks.
|
||||
*/
|
||||
if (config.namespaces) {
|
||||
if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) < 0)
|
||||
bail("failed to set process as non-dumpable");
|
||||
}
|
||||
|
||||
/* Pipe so we can tell the child when we've finished setting up. */
|
||||
if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sync_child_pipe) < 0)
|
||||
bail("failed to setup sync pipe between parent and child");
|
||||
|
||||
/*
|
||||
* We need a new socketpair to sync with grandchild so we don't have
|
||||
* race condition with child.
|
||||
*/
|
||||
if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sync_grandchild_pipe) < 0)
|
||||
bail("failed to setup sync pipe between parent and grandchild");
|
||||
|
||||
/* TODO: Currently we aren't dealing with child deaths properly. */
|
||||
|
||||
/*
|
||||
* Okay, so this is quite annoying.
|
||||
*
|
||||
* In order for this unsharing code to be more extensible we need to split
|
||||
* up unshare(CLONE_NEWUSER) and clone() in various ways. The ideal case
|
||||
* would be if we did clone(CLONE_NEWUSER) and the other namespaces
|
||||
* separately, but because of SELinux issues we cannot really do that. But
|
||||
* we cannot just dump the namespace flags into clone(...) because several
|
||||
* usecases (such as rootless containers) require more granularity around
|
||||
* the namespace setup. In addition, some older kernels had issues where
|
||||
* CLONE_NEWUSER wasn't handled before other namespaces (but we cannot
|
||||
* handle this while also dealing with SELinux so we choose SELinux support
|
||||
* over broken kernel support).
|
||||
*
|
||||
* However, if we unshare(2) the user namespace *before* we clone(2), then
|
||||
* all hell breaks loose.
|
||||
*
|
||||
* The parent no longer has permissions to do many things (unshare(2) drops
|
||||
* all capabilities in your old namespace), and the container cannot be set
|
||||
* up to have more than one {uid,gid} mapping. This is obviously less than
|
||||
* ideal. In order to fix this, we have to first clone(2) and then unshare.
|
||||
*
|
||||
* Unfortunately, it's not as simple as that. We have to fork to enter the
|
||||
* PID namespace (the PID namespace only applies to children). Since we'll
|
||||
* have to double-fork, this clone_parent() call won't be able to get the
|
||||
* PID of the _actual_ init process (without doing more synchronisation than
|
||||
* I can deal with at the moment). So we'll just get the parent to send it
|
||||
* for us, the only job of this process is to update
|
||||
* /proc/pid/{setgroups,uid_map,gid_map}.
|
||||
*
|
||||
* And as a result of the above, we also need to setns(2) in the first child
|
||||
* because if we join a PID namespace in the topmost parent then our child
|
||||
* will be in that namespace (and it will not be able to give us a PID value
|
||||
* that makes sense without resorting to sending things with cmsg).
|
||||
*
|
||||
* This also deals with an older issue caused by dumping cloneflags into
|
||||
* clone(2): On old kernels, CLONE_PARENT didn't work with CLONE_NEWPID, so
|
||||
* we have to unshare(2) before clone(2) in order to do this. This was fixed
|
||||
* in upstream commit 1f7f4dde5c945f41a7abc2285be43d918029ecc5, and was
|
||||
* introduced by 40a0d32d1eaffe6aac7324ca92604b6b3977eb0e. As far as we're
|
||||
* aware, the last mainline kernel which had this bug was Linux 3.12.
|
||||
* However, we cannot comment on which kernels the broken patch was
|
||||
* backported to.
|
||||
*
|
||||
* -- Aleksa "what has my life come to?" Sarai
|
||||
*/
|
||||
|
||||
switch (setjmp(env)) {
|
||||
/*
|
||||
* Stage 0: We're in the parent. Our job is just to create a new child
|
||||
* (stage 1: JUMP_CHILD) process and write its uid_map and
|
||||
* gid_map. That process will go on to create a new process, then
|
||||
* it will send us its PID which we will send to the bootstrap
|
||||
* process.
|
||||
*/
|
||||
case JUMP_PARENT: {
|
||||
int len;
|
||||
pid_t child;
|
||||
char buf[JSON_MAX];
|
||||
bool ready = false;
|
||||
|
||||
/* For debugging. */
|
||||
prctl(PR_SET_NAME, (unsigned long) "runc:[0:PARENT]", 0, 0, 0);
|
||||
|
||||
/* Start the process of getting a container. */
|
||||
child = clone_parent(&env, JUMP_CHILD);
|
||||
if (child < 0)
|
||||
bail("unable to fork: child_func");
|
||||
|
||||
/*
|
||||
* State machine for synchronisation with the children.
|
||||
*
|
||||
* Father only return when both child and grandchild are
|
||||
* ready, so we can receive all possible error codes
|
||||
* generated by children.
|
||||
*/
|
||||
while (!ready) {
|
||||
enum sync_t s;
|
||||
int ret;
|
||||
|
||||
syncfd = sync_child_pipe[1];
|
||||
close(sync_child_pipe[0]);
|
||||
|
||||
if (read(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with child: next state");
|
||||
|
||||
switch (s) {
|
||||
case SYNC_ERR:
|
||||
/* We have to mirror the error code of the child. */
|
||||
if (read(syncfd, &ret, sizeof(ret)) != sizeof(ret))
|
||||
bail("failed to sync with child: read(error code)");
|
||||
|
||||
exit(ret);
|
||||
case SYNC_USERMAP_PLS:
|
||||
/*
|
||||
* Enable setgroups(2) if we've been asked to. But we also
|
||||
* have to explicitly disable setgroups(2) if we're
|
||||
* creating a rootless container (this is required since
|
||||
* Linux 3.19).
|
||||
*/
|
||||
if (config.is_rootless && config.is_setgroup) {
|
||||
kill(child, SIGKILL);
|
||||
bail("cannot allow setgroup in an unprivileged user namespace setup");
|
||||
}
|
||||
|
||||
if (config.is_setgroup)
|
||||
update_setgroups(child, SETGROUPS_ALLOW);
|
||||
if (config.is_rootless)
|
||||
update_setgroups(child, SETGROUPS_DENY);
|
||||
|
||||
/* Set up mappings. */
|
||||
update_uidmap(child, config.uidmap, config.uidmap_len);
|
||||
update_gidmap(child, config.gidmap, config.gidmap_len);
|
||||
|
||||
s = SYNC_USERMAP_ACK;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with child: write(SYNC_USERMAP_ACK)");
|
||||
}
|
||||
break;
|
||||
case SYNC_RECVPID_PLS: {
|
||||
pid_t old = child;
|
||||
|
||||
/* Get the init_func pid. */
|
||||
if (read(syncfd, &child, sizeof(child)) != sizeof(child)) {
|
||||
kill(old, SIGKILL);
|
||||
bail("failed to sync with child: read(childpid)");
|
||||
}
|
||||
|
||||
/* Send ACK. */
|
||||
s = SYNC_RECVPID_ACK;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(old, SIGKILL);
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with child: write(SYNC_RECVPID_ACK)");
|
||||
}
|
||||
}
|
||||
break;
|
||||
case SYNC_CHILD_READY:
|
||||
ready = true;
|
||||
break;
|
||||
default:
|
||||
bail("unexpected sync value: %u", s);
|
||||
}
|
||||
}
|
||||
|
||||
/* Now sync with grandchild. */
|
||||
|
||||
ready = false;
|
||||
while (!ready) {
|
||||
enum sync_t s;
|
||||
int ret;
|
||||
|
||||
syncfd = sync_grandchild_pipe[1];
|
||||
close(sync_grandchild_pipe[0]);
|
||||
|
||||
s = SYNC_GRANDCHILD;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with child: write(SYNC_GRANDCHILD)");
|
||||
}
|
||||
|
||||
if (read(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with child: next state");
|
||||
|
||||
switch (s) {
|
||||
case SYNC_ERR:
|
||||
/* We have to mirror the error code of the child. */
|
||||
if (read(syncfd, &ret, sizeof(ret)) != sizeof(ret))
|
||||
bail("failed to sync with child: read(error code)");
|
||||
|
||||
exit(ret);
|
||||
case SYNC_CHILD_READY:
|
||||
ready = true;
|
||||
break;
|
||||
default:
|
||||
bail("unexpected sync value: %u", s);
|
||||
}
|
||||
}
|
||||
|
||||
/* Send the init_func pid back to our parent. */
|
||||
len = snprintf(buf, JSON_MAX, "{\"pid\": %d}\n", child);
|
||||
if (len < 0) {
|
||||
kill(child, SIGKILL);
|
||||
bail("unable to generate JSON for child pid");
|
||||
}
|
||||
if (write(pipenum, buf, len) != len) {
|
||||
kill(child, SIGKILL);
|
||||
bail("unable to send child pid to bootstrapper");
|
||||
}
|
||||
|
||||
exit(0);
|
||||
}
|
||||
|
||||
/*
|
||||
* Stage 1: We're in the first child process. Our job is to join any
|
||||
* provided namespaces in the netlink payload and unshare all
|
||||
* of the requested namespaces. If we've been asked to
|
||||
* CLONE_NEWUSER, we will ask our parent (stage 0) to set up
|
||||
* our user mappings for us. Then, we create a new child
|
||||
* (stage 2: JUMP_INIT) for PID namespace. We then send the
|
||||
* child's PID to our parent (stage 0).
|
||||
*/
|
||||
case JUMP_CHILD: {
|
||||
pid_t child;
|
||||
enum sync_t s;
|
||||
|
||||
/* We're in a child and thus need to tell the parent if we die. */
|
||||
syncfd = sync_child_pipe[0];
|
||||
close(sync_child_pipe[1]);
|
||||
|
||||
/* For debugging. */
|
||||
prctl(PR_SET_NAME, (unsigned long) "runc:[1:CHILD]", 0, 0, 0);
|
||||
|
||||
/*
|
||||
* We need to setns first. We cannot do this earlier (in stage 0)
|
||||
* because of the fact that we forked to get here (the PID of
|
||||
* [stage 2: JUMP_INIT]) would be meaningless). We could send it
|
||||
* using cmsg(3) but that's just annoying.
|
||||
*/
|
||||
if (config.namespaces)
|
||||
join_namespaces(config.namespaces);
|
||||
|
||||
/*
|
||||
* Unshare all of the namespaces. Now, it should be noted that this
|
||||
* ordering might break in the future (especially with rootless
|
||||
* containers). But for now, it's not possible to split this into
|
||||
* CLONE_NEWUSER + [the rest] because of some RHEL SELinux issues.
|
||||
*
|
||||
* Note that we don't merge this with clone() because there were
|
||||
* some old kernel versions where clone(CLONE_PARENT | CLONE_NEWPID)
|
||||
* was broken, so we'll just do it the long way anyway.
|
||||
*/
|
||||
if (unshare(config.cloneflags) < 0)
|
||||
bail("failed to unshare namespaces");
|
||||
|
||||
/*
|
||||
* Deal with user namespaces first. They are quite special, as they
|
||||
* affect our ability to unshare other namespaces and are used as
|
||||
* context for privilege checks.
|
||||
*/
|
||||
if (config.cloneflags & CLONE_NEWUSER) {
|
||||
/*
|
||||
* We don't have the privileges to do any mapping here (see the
|
||||
* clone_parent rant). So signal our parent to hook us up.
|
||||
*/
|
||||
|
||||
/* Switching is only necessary if we joined namespaces. */
|
||||
if (config.namespaces) {
|
||||
if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) < 0)
|
||||
bail("failed to set process as dumpable");
|
||||
}
|
||||
s = SYNC_USERMAP_PLS;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with parent: write(SYNC_USERMAP_PLS)");
|
||||
|
||||
/* ... wait for mapping ... */
|
||||
|
||||
if (read(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with parent: read(SYNC_USERMAP_ACK)");
|
||||
if (s != SYNC_USERMAP_ACK)
|
||||
bail("failed to sync with parent: SYNC_USERMAP_ACK: got %u", s);
|
||||
/* Switching is only necessary if we joined namespaces. */
|
||||
if (config.namespaces) {
|
||||
if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) < 0)
|
||||
bail("failed to set process as dumpable");
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* TODO: What about non-namespace clone flags that we're dropping here?
|
||||
*
|
||||
* We fork again because of PID namespace, setns(2) or unshare(2) don't
|
||||
* change the PID namespace of the calling process, because doing so
|
||||
* would change the caller's idea of its own PID (as reported by getpid()),
|
||||
* which would break many applications and libraries, so we must fork
|
||||
* to actually enter the new PID namespace.
|
||||
*/
|
||||
child = clone_parent(&env, JUMP_INIT);
|
||||
if (child < 0)
|
||||
bail("unable to fork: init_func");
|
||||
|
||||
/* Send the child to our parent, which knows what it's doing. */
|
||||
s = SYNC_RECVPID_PLS;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with parent: write(SYNC_RECVPID_PLS)");
|
||||
}
|
||||
if (write(syncfd, &child, sizeof(child)) != sizeof(child)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with parent: write(childpid)");
|
||||
}
|
||||
|
||||
/* ... wait for parent to get the pid ... */
|
||||
|
||||
if (read(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with parent: read(SYNC_RECVPID_ACK)");
|
||||
}
|
||||
if (s != SYNC_RECVPID_ACK) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with parent: SYNC_RECVPID_ACK: got %u", s);
|
||||
}
|
||||
|
||||
s = SYNC_CHILD_READY;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s)) {
|
||||
kill(child, SIGKILL);
|
||||
bail("failed to sync with parent: write(SYNC_CHILD_READY)");
|
||||
}
|
||||
|
||||
/* Our work is done. [Stage 2: JUMP_INIT] is doing the rest of the work. */
|
||||
exit(0);
|
||||
}
|
||||
|
||||
/*
|
||||
* Stage 2: We're the final child process, and the only process that will
|
||||
* actually return to the Go runtime. Our job is to just do the
|
||||
* final cleanup steps and then return to the Go runtime to allow
|
||||
* init_linux.go to run.
|
||||
*/
|
||||
case JUMP_INIT: {
|
||||
/*
|
||||
* We're inside the child now, having jumped from the
|
||||
* start_child() code after forking in the parent.
|
||||
*/
|
||||
enum sync_t s;
|
||||
|
||||
/* We're in a child and thus need to tell the parent if we die. */
|
||||
syncfd = sync_grandchild_pipe[0];
|
||||
close(sync_grandchild_pipe[1]);
|
||||
close(sync_child_pipe[0]);
|
||||
close(sync_child_pipe[1]);
|
||||
|
||||
/* For debugging. */
|
||||
prctl(PR_SET_NAME, (unsigned long) "runc:[2:INIT]", 0, 0, 0);
|
||||
|
||||
if (read(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with parent: read(SYNC_GRANDCHILD)");
|
||||
if (s != SYNC_GRANDCHILD)
|
||||
bail("failed to sync with parent: SYNC_GRANDCHILD: got %u", s);
|
||||
|
||||
if (setsid() < 0)
|
||||
bail("setsid failed");
|
||||
|
||||
if (setuid(0) < 0)
|
||||
bail("setuid failed");
|
||||
|
||||
if (setgid(0) < 0)
|
||||
bail("setgid failed");
|
||||
|
||||
if (!config.is_rootless && config.is_setgroup) {
|
||||
if (setgroups(0, NULL) < 0)
|
||||
bail("setgroups failed");
|
||||
}
|
||||
|
||||
s = SYNC_CHILD_READY;
|
||||
if (write(syncfd, &s, sizeof(s)) != sizeof(s))
|
||||
bail("failed to sync with patent: write(SYNC_CHILD_READY)");
|
||||
|
||||
/* Close sync pipes. */
|
||||
close(sync_grandchild_pipe[0]);
|
||||
|
||||
/* Free netlink data. */
|
||||
nl_free(&config);
|
||||
|
||||
/* Finish executing, let the Go runtime take over. */
|
||||
return;
|
||||
}
|
||||
default:
|
||||
bail("unexpected jump value");
|
||||
}
|
||||
|
||||
/* Should never be reached. */
|
||||
bail("should never be reached");
|
||||
}
|
21
vendor/github.com/opencontainers/runc/vendor.conf
generated
vendored
Normal file
21
vendor/github.com/opencontainers/runc/vendor.conf
generated
vendored
Normal file
@@ -0,0 +1,21 @@
|
||||
# OCI runtime-spec. When updating this, make sure you use a version tag rather
|
||||
# than a commit ID so it's much more obvious what version of the spec we are
|
||||
# using.
|
||||
github.com/opencontainers/runtime-spec v1.0.0
|
||||
# Core libcontainer functionality.
|
||||
github.com/mrunalp/fileutils ed869b029674c0e9ce4c0dfa781405c2d9946d08
|
||||
github.com/opencontainers/selinux v1.0.0-rc1
|
||||
github.com/seccomp/libseccomp-golang 32f571b70023028bd57d9288c20efbcb237f3ce0
|
||||
github.com/sirupsen/logrus a3f95b5c423586578a4e099b11a46c2479628cac
|
||||
github.com/syndtr/gocapability db04d3cc01c8b54962a58ec7e491717d06cfcc16
|
||||
github.com/vishvananda/netlink 1e2e08e8a2dcdacaae3f14ac44c5cfa31361f270
|
||||
# systemd integration.
|
||||
github.com/coreos/go-systemd v14
|
||||
github.com/coreos/pkg v3
|
||||
github.com/godbus/dbus v3
|
||||
github.com/golang/protobuf 18c9bb3261723cd5401db4d0c9fbc5c3b6c70fe8
|
||||
# Command-line interface.
|
||||
github.com/docker/docker 0f5c9d301b9b1cca66b3ea0f9dec3b5317d3686d
|
||||
github.com/docker/go-units v0.2.0
|
||||
github.com/urfave/cli d53eb991652b1d438abdd34ce4bfa3ef1539108e
|
||||
golang.org/x/sys 0e0164865330d5cf1c00247be08330bf96e2f87c https://github.com/golang/sys
|
158
vendor/github.com/opencontainers/runtime-spec/README.md
generated
vendored
Normal file
158
vendor/github.com/opencontainers/runtime-spec/README.md
generated
vendored
Normal file
@@ -0,0 +1,158 @@
|
||||
# Open Container Initiative Runtime Specification
|
||||
|
||||
The [Open Container Initiative][oci] develops specifications for standards on Operating System process and application containers.
|
||||
|
||||
The specification can be found [here](spec.md).
|
||||
|
||||
## Table of Contents
|
||||
|
||||
Additional documentation about how this group operates:
|
||||
|
||||
- [Code of Conduct][code-of-conduct]
|
||||
- [Style and Conventions](style.md)
|
||||
- [Implementations](implementations.md)
|
||||
- [Releases](RELEASES.md)
|
||||
- [project](project.md)
|
||||
- [charter][charter]
|
||||
|
||||
## Use Cases
|
||||
|
||||
To provide context for users the following section gives example use cases for each part of the spec.
|
||||
|
||||
### Application Bundle Builders
|
||||
|
||||
Application bundle builders can create a [bundle](bundle.md) directory that includes all of the files required for launching an application as a container.
|
||||
The bundle contains an OCI [configuration file](config.md) where the builder can specify host-independent details such as [which executable to launch](config.md#process) and host-specific settings such as [mount](config.md#mounts) locations, [hook](config.md#hooks) paths, Linux [namespaces](config-linux.md#namespaces) and [cgroups](config-linux.md#control-groups).
|
||||
Because the configuration includes host-specific settings, application bundle directories copied between two hosts may require configuration adjustments.
|
||||
|
||||
### Hook Developers
|
||||
|
||||
[Hook](config.md#hooks) developers can extend the functionality of an OCI-compliant runtime by hooking into a container's lifecycle with an external application.
|
||||
Example use cases include sophisticated network configuration, volume garbage collection, etc.
|
||||
|
||||
### Runtime Developers
|
||||
|
||||
Runtime developers can build runtime implementations that run OCI-compliant bundles and container configuration, containing low-level OS and host-specific details, on a particular platform.
|
||||
|
||||
## Contributing
|
||||
|
||||
Development happens on GitHub for the spec.
|
||||
Issues are used for bugs and actionable items and longer discussions can happen on the [mailing list](#mailing-list).
|
||||
|
||||
The specification and code is licensed under the Apache 2.0 license found in the [LICENSE](./LICENSE) file.
|
||||
|
||||
### Discuss your design
|
||||
|
||||
The project welcomes submissions, but please let everyone know what you are working on.
|
||||
|
||||
Before undertaking a nontrivial change to this specification, send mail to the [mailing list](#mailing-list) to discuss what you plan to do.
|
||||
This gives everyone a chance to validate the design, helps prevent duplication of effort, and ensures that the idea fits.
|
||||
It also guarantees that the design is sound before code is written; a GitHub pull-request is not the place for high-level discussions.
|
||||
|
||||
Typos and grammatical errors can go straight to a pull-request.
|
||||
When in doubt, start on the [mailing-list](#mailing-list).
|
||||
|
||||
### Weekly Call
|
||||
|
||||
The contributors and maintainers of all OCI projects have a weekly meeting on Wednesdays at:
|
||||
|
||||
* 8:00 AM (USA Pacific), during [odd weeks][iso-week].
|
||||
* 2:00 PM (USA Pacific), during [even weeks][iso-week].
|
||||
|
||||
There is an [iCalendar][rfc5545] format for the meetings [here](meeting.ics).
|
||||
|
||||
Everyone is welcome to participate via [UberConference web][uberconference] or audio-only: +1 415 968 0849 (no PIN needed).
|
||||
An initial agenda will be posted to the [mailing list](#mailing-list) earlier in the week, and everyone is welcome to propose additional topics or suggest other agenda alterations there.
|
||||
Minutes are posted to the [mailing list](#mailing-list) and minutes from past calls are archived [here][minutes], with minutes from especially old meetings (September 2015 and earlier) archived [here][runtime-wiki].
|
||||
|
||||
### Mailing List
|
||||
|
||||
You can subscribe and join the mailing list on [Google Groups][dev-list].
|
||||
|
||||
### IRC
|
||||
|
||||
OCI discussion happens on #opencontainers on Freenode ([logs][irc-logs]).
|
||||
|
||||
### Git commit
|
||||
|
||||
#### Sign your work
|
||||
|
||||
The sign-off is a simple line at the end of the explanation for the patch, which certifies that you wrote it or otherwise have the right to pass it on as an open-source patch.
|
||||
The rules are pretty simple: if you can certify the below (from http://developercertificate.org):
|
||||
|
||||
```
|
||||
Developer Certificate of Origin
|
||||
Version 1.1
|
||||
|
||||
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
|
||||
660 York Street, Suite 102,
|
||||
San Francisco, CA 94110 USA
|
||||
|
||||
Everyone is permitted to copy and distribute verbatim copies of this
|
||||
license document, but changing it is not allowed.
|
||||
|
||||
|
||||
Developer's Certificate of Origin 1.1
|
||||
|
||||
By making a contribution to this project, I certify that:
|
||||
|
||||
(a) The contribution was created in whole or in part by me and I
|
||||
have the right to submit it under the open source license
|
||||
indicated in the file; or
|
||||
|
||||
(b) The contribution is based upon previous work that, to the best
|
||||
of my knowledge, is covered under an appropriate open source
|
||||
license and I have the right under that license to submit that
|
||||
work with modifications, whether created in whole or in part
|
||||
by me, under the same open source license (unless I am
|
||||
permitted to submit under a different license), as indicated
|
||||
in the file; or
|
||||
|
||||
(c) The contribution was provided directly to me by some other
|
||||
person who certified (a), (b) or (c) and I have not modified
|
||||
it.
|
||||
|
||||
(d) I understand and agree that this project and the contribution
|
||||
are public and that a record of the contribution (including all
|
||||
personal information I submit with it, including my sign-off) is
|
||||
maintained indefinitely and may be redistributed consistent with
|
||||
this project or the open source license(s) involved.
|
||||
```
|
||||
|
||||
then you just add a line to every git commit message:
|
||||
|
||||
Signed-off-by: Joe Smith <joe@gmail.com>
|
||||
|
||||
using your real name (sorry, no pseudonyms or anonymous contributions.)
|
||||
|
||||
You can add the sign off when creating the git commit via `git commit -s`.
|
||||
|
||||
#### Commit Style
|
||||
|
||||
Simple house-keeping for clean git history.
|
||||
Read more on [How to Write a Git Commit Message][how-to-git-commit] or the Discussion section of [git-commit(1)][git-commit.1].
|
||||
|
||||
1. Separate the subject from body with a blank line
|
||||
2. Limit the subject line to 50 characters
|
||||
3. Capitalize the subject line
|
||||
4. Do not end the subject line with a period
|
||||
5. Use the imperative mood in the subject line
|
||||
6. Wrap the body at 72 characters
|
||||
7. Use the body to explain what and why vs. how
|
||||
* If there was important/useful/essential conversation or information, copy or include a reference
|
||||
8. When possible, one keyword to scope the change in the subject (i.e. "README: ...", "runtime: ...")
|
||||
|
||||
|
||||
[charter]: https://www.opencontainers.org/about/governance
|
||||
[code-of-conduct]: https://github.com/opencontainers/tob/blob/master/code-of-conduct.md
|
||||
[dev-list]: https://groups.google.com/a/opencontainers.org/forum/#!forum/dev
|
||||
[how-to-git-commit]: http://chris.beams.io/posts/git-commit
|
||||
[irc-logs]: http://ircbot.wl.linuxfoundation.org/eavesdrop/%23opencontainers/
|
||||
[iso-week]: https://en.wikipedia.org/wiki/ISO_week_date#Calculating_the_week_number_of_a_given_date
|
||||
[minutes]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/
|
||||
[oci]: https://www.opencontainers.org
|
||||
[rfc5545]: https://tools.ietf.org/html/rfc5545
|
||||
[runtime-wiki]: https://github.com/opencontainers/runtime-spec/wiki
|
||||
[uberconference]: https://www.uberconference.com/opencontainers
|
||||
|
||||
[git-commit.1]: http://git-scm.com/docs/git-commit
|
84
vendor/github.com/opencontainers/runtime-tools/README.md
generated
vendored
Normal file
84
vendor/github.com/opencontainers/runtime-tools/README.md
generated
vendored
Normal file
@@ -0,0 +1,84 @@
|
||||
# oci-runtime-tool [](https://travis-ci.org/opencontainers/runtime-tools) [](https://goreportcard.com/report/github.com/opencontainers/runtime-tools)
|
||||
|
||||
oci-runtime-tool is a collection of tools for working with the [OCI runtime specification][runtime-spec].
|
||||
To build from source code, runtime-tools requires Go 1.7.x or above.
|
||||
|
||||
## Generating an OCI runtime spec configuration files
|
||||
|
||||
[`oci-runtime-tool generate`][generate.1] generates [configuration JSON][config.json] for an [OCI bundle][bundle].
|
||||
[OCI-compatible runtimes][runtime-spec] like [runC][] expect to read the configuration from `config.json`.
|
||||
|
||||
```sh
|
||||
$ oci-runtime-tool generate --output config.json
|
||||
$ cat config.json
|
||||
{
|
||||
"ociVersion": "0.5.0",
|
||||
…
|
||||
}
|
||||
```
|
||||
|
||||
## Validating an OCI bundle
|
||||
|
||||
[`oci-runtime-tool validate`][validate.1] validates an OCI bundle.
|
||||
The error message will be printed if the OCI bundle failed the validation procedure.
|
||||
|
||||
```sh
|
||||
$ oci-runtime-tool generate
|
||||
$ oci-runtime-tool validate
|
||||
INFO[0000] Bundle validation succeeded.
|
||||
```
|
||||
|
||||
## Testing OCI runtimes
|
||||
|
||||
```sh
|
||||
$ sudo make RUNTIME=runc localvalidation
|
||||
RUNTIME=runc go test -tags "" -v github.com/opencontainers/runtime-tools/validation
|
||||
=== RUN TestValidateBasic
|
||||
TAP version 13
|
||||
ok 1 - root filesystem
|
||||
ok 2 - hostname
|
||||
ok 3 - mounts
|
||||
ok 4 - capabilities
|
||||
ok 5 - default symlinks
|
||||
ok 6 - default devices
|
||||
ok 7 - linux devices
|
||||
ok 8 - linux process
|
||||
ok 9 - masked paths
|
||||
ok 10 - oom score adj
|
||||
ok 11 - read only paths
|
||||
ok 12 - rlimits
|
||||
ok 13 - sysctls
|
||||
ok 14 - uid mappings
|
||||
ok 15 - gid mappings
|
||||
1..15
|
||||
--- PASS: TestValidateBasic (0.08s)
|
||||
=== RUN TestValidateSysctls
|
||||
TAP version 13
|
||||
ok 1 - root filesystem
|
||||
ok 2 - hostname
|
||||
ok 3 - mounts
|
||||
ok 4 - capabilities
|
||||
ok 5 - default symlinks
|
||||
ok 6 - default devices
|
||||
ok 7 - linux devices
|
||||
ok 8 - linux process
|
||||
ok 9 - masked paths
|
||||
ok 10 - oom score adj
|
||||
ok 11 - read only paths
|
||||
ok 12 - rlimits
|
||||
ok 13 - sysctls
|
||||
ok 14 - uid mappings
|
||||
ok 15 - gid mappings
|
||||
1..15
|
||||
--- PASS: TestValidateSysctls (0.20s)
|
||||
PASS
|
||||
ok github.com/opencontainers/runtime-tools/validation 0.281s
|
||||
```
|
||||
|
||||
[bundle]: https://github.com/opencontainers/runtime-spec/blob/master/bundle.md
|
||||
[config.json]: https://github.com/opencontainers/runtime-spec/blob/master/config.md
|
||||
[runC]: https://github.com/opencontainers/runc
|
||||
[runtime-spec]: https://github.com/opencontainers/runtime-spec
|
||||
|
||||
[generate.1]: man/oci-runtime-tool-generate.1.md
|
||||
[validate.1]: man/oci-runtime-tool-validate.1.md
|
Reference in New Issue
Block a user