Let OCI runtime create netns when userns is used

As explained in the comments, this patch lets the OCI runtime create the
netns when a userns is in use. This is needed because the netns must be
owned by the userns (otherwise we can't modify the IP, etc.).

Before this patch, we created the netns first and then started the pod
sandbox, asking it to join this netns. This can never work with userns, as
the userns needs to be created first for the netns ownership to be
correct.

One option would be to also create the userns in containerd, then create
the netns. But this is painful (needs tricks with the go runtime,
special care to write the mapping, etc.).

So, we just let the OCI runtime create the userns and netns, which
creates them with the proper ownership.

As requested by Mike Brown, the current code when userns is not used is
left unchanged. We can unify the cases (with and without userns) in a
future release.
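
The two cases described above can be sketched as follows. This is only an
illustration, not the code from this patch: nsEntry and sandboxNamespaces are
hypothetical names, and the type/path fields only loosely mirror the namespace
entries of an OCI runtime spec, where an empty path asks the runtime to create
a new namespace. netNSPathFromPID mirrors the getNetNSPathFromPID helper added
in the diff.

```go
package main

import "fmt"

// nsEntry is a hypothetical stand-in for a namespace entry in the sandbox spec.
// An empty Path means "the OCI runtime creates a new namespace of this type".
type nsEntry struct {
	Type string
	Path string
}

// sandboxNamespaces sketches the choice described in the commit message.
// Without userns, the netns was created beforehand and the sandbox joins it by path.
// With userns, both entries are left without a path, so the runtime creates the
// userns and a netns owned by it.
func sandboxNamespaces(useUserNS bool, precreatedNetNS string) []nsEntry {
	if useUserNS {
		return []nsEntry{{Type: "user"}, {Type: "network"}}
	}
	return []nsEntry{{Type: "network", Path: precreatedNetNS}}
}

// netNSPathFromPID returns the procfs path of the netns of the given process.
// With userns, this is the path that gets bind mounted after the sandbox starts.
func netNSPathFromPID(pid uint32) string {
	return fmt.Sprintf("/proc/%d/ns/net", pid)
}

func main() {
	// The netns path used here is an arbitrary example value.
	for _, ns := range sandboxNamespaces(false, "/run/netns/example") {
		fmt.Printf("no userns: type=%s path=%q\n", ns.Type, ns.Path)
	}
	for _, ns := range sandboxNamespaces(true, "") {
		fmt.Printf("userns:    type=%s path=%q\n", ns.Type, ns.Path)
	}
	// The PID is a made-up sandbox PID, used only to show the path format.
	fmt.Println(netNSPathFromPID(1234))
}
```

In the userns case the order is reversed compared to the old flow: the sandbox
is started first, and only then is its netns looked up by PID and made
persistent with a bind mount.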

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Rodrigo Campos
2022-12-15 14:31:36 -03:00
parent 3233d5d6f5
commit 36f520dc04
4 changed files with 144 additions and 5 deletions

@@ -50,7 +50,9 @@ import (
 // newNS creates a new persistent (bind-mounted) network namespace and returns the
 // path to the network namespace.
-func newNS(baseDir string) (nsPath string, err error) {
+// If pid is not 0, returns the netns from that pid persistently mounted. Otherwise,
+// a new netns is created.
+func newNS(baseDir string, pid uint32) (nsPath string, err error) {
 	b := make([]byte, 16)
 	_, err = rand.Read(b)
@@ -81,6 +83,16 @@ func newNS(baseDir string) (nsPath string, err error) {
 		}
 	}()
+	if pid != 0 {
+		procNsPath := getNetNSPathFromPID(pid)
+		// bind mount the netns onto the mount point. This causes the namespace
+		// to persist, even when there are no threads in the ns.
+		if err = unix.Mount(procNsPath, nsPath, "none", unix.MS_BIND, ""); err != nil {
+			return "", fmt.Errorf("failed to bind mount ns src: %v at %s: %w", procNsPath, nsPath, err)
+		}
+		return nsPath, nil
+	}
 	var wg sync.WaitGroup
 	wg.Add(1)
@@ -155,6 +167,10 @@ func getCurrentThreadNetNSPath() string {
 	return fmt.Sprintf("/proc/%d/task/%d/ns/net", os.Getpid(), unix.Gettid())
 }
+func getNetNSPathFromPID(pid uint32) string {
+	return fmt.Sprintf("/proc/%d/ns/net", pid)
+}
 // NetNS holds network namespace.
 type NetNS struct {
 	path string
@@ -162,7 +178,12 @@ type NetNS struct {
 // NewNetNS creates a network namespace.
 func NewNetNS(baseDir string) (*NetNS, error) {
-	path, err := newNS(baseDir)
+	return NewNetNSFromPID(baseDir, 0)
+}
+// NewNetNSFromPID returns the netns from pid or a new netns if pid is 0.
+func NewNetNSFromPID(baseDir string, pid uint32) (*NetNS, error) {
+	path, err := newNS(baseDir, pid)
 	if err != nil {
 		return nil, fmt.Errorf("failed to setup netns: %w", err)
 	}