cri: ensure NRI API never has nil CRI
A nil CRIImplementation field can cause a nil pointer dereference and panic during startup recovery.

Prior to this change, the nri.API struct would have a nil cri (CRIImplementation) field after nri.NewAPI until nri.Register was called. Register is called mid-way through initialization of the CRI plugin, but recovery for containers occurs prior to that. Container recovery includes establishing new exit monitors for existing containers that were discovered. When a container exits, NRI plugins are given the opportunity to be notified about the lifecycle event, and this is done by accessing that CRIImplementation field inside the nri.API. If a container exits prior to nri.Register being called, access to the CRIImplementation field can cause a panic.

Here's the call-path:

* The CRI plugin starts running [here](ae71819c4f/pkg/cri/server/service.go (L222))
* It then [calls into](ae71819c4f/pkg/cri/server/service.go (L227)) `recover()` to recover state from previous runs of containerd
* `recover()` then attempts to recover all containers through [`loadContainer()`](ae7d74b9e2/internal/cri/server/restart.go (L175))
* When `loadContainer()` finds a container that is still running, it waits for the task (internal containerd object) to exit and sets up [exit monitoring](ae7d74b9e2/internal/cri/server/restart.go (L391))
* Any exit that then happens must be [handled](ae7d74b9e2/internal/cri/server/events.go (L145))
* Handling an exit includes [deleting the Task](ae7d74b9e2/internal/cri/server/events.go (L188)) and specifying [`nri.WithContainerExit`](ae7d74b9e2/internal/cri/nri/nri_api_linux.go (L348)) to [notify](ae7d74b9e2/internal/cri/nri/nri_api_linux.go (L356)) any subscribed NRI plugins
* NRI plugins need to know information about the pod (not just the sandbox), so before a plugin is notified the NRI API package [queries the Sandbox Store](ae7d74b9e2/internal/cri/nri/nri_api_linux.go (L232)) through the CRI implementation
* The `cri` implementation member field in the `nri.API` struct is set as part of the [`Register()`](ae7d74b9e2/internal/cri/nri/nri_api_linux.go (L66)) method
* The `nri.Register()` method is only called [much further down in the CRI `Run()` method](ae71819c4f/pkg/cri/server/service.go (L279))

Signed-off-by: Samuel Karp <samuelkarp@google.com>
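
The ordering hazard in the call-path above can be reproduced in isolation. The sketch below is not containerd code: `lateAPI`, `sandboxStore`, `memStore`, and `panics` are hypothetical names chosen for illustration. It assumes only the shape described in the message, namely a struct whose interface field stays nil until a later `Register` call, and an exit handler that may run before that call.

```go
package main

import "fmt"

// sandboxStore is a hypothetical stand-in for the CRI implementation
// that the NRI API consults for pod information.
type sandboxStore interface {
	PodName(id string) string
}

// memStore is a toy in-memory store keyed by container ID.
type memStore map[string]string

func (m memStore) PodName(id string) string { return m[id] }

// lateAPI mirrors the pre-change shape: cri stays nil until Register runs.
type lateAPI struct {
	cri sandboxStore
}

// Register sets the dependency late, as the commit message describes.
func (a *lateAPI) Register(cri sandboxStore) { a.cri = cri }

// NotifyExit calls through a.cri; with a nil interface value this panics.
func (a *lateAPI) NotifyExit(id string) string {
	return a.cri.PodName(id)
}

// panics reports whether f panicked, recovering so the demo keeps running.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	api := &lateAPI{}

	// An exit monitor firing during recovery, before Register is called.
	fmt.Println("before Register panics:", panics(func() { api.NotifyExit("c1") }))

	api.Register(memStore{"c1": "pod-a"})
	fmt.Println("after Register panics:", panics(func() { api.NotifyExit("c1") }))
}
```

In a real daemon there is no `recover` wrapper around the exit handler, so the first call would take the whole process down rather than return `true`.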
@@ -32,7 +32,6 @@ import (
 	criconfig "github.com/containerd/containerd/v2/internal/cri/config"
 	"github.com/containerd/containerd/v2/internal/cri/constants"
 	"github.com/containerd/containerd/v2/internal/cri/instrument"
-	"github.com/containerd/containerd/v2/internal/cri/nri"
 	"github.com/containerd/containerd/v2/internal/cri/server"
 	nriservice "github.com/containerd/containerd/v2/internal/nri"
 	"github.com/containerd/containerd/v2/plugins"
@@ -212,7 +211,7 @@ func (c criGRPCServerWithTCP) RegisterTCP(s *grpc.Server) error {
 }
 
 // Get the NRI plugin, and set up our NRI API for it.
-func getNRIAPI(ic *plugin.InitContext) *nri.API {
+func getNRIAPI(ic *plugin.InitContext) nriservice.API {
 	const (
 		pluginType = plugins.NRIApiPlugin
 		pluginName = "nri"
@@ -234,8 +233,7 @@ func getNRIAPI(ic *plugin.InitContext) *nri.API {
 	}
 
 	log.G(ctx).Info("using experimental NRI integration - disable nri plugin to prevent this")
-
-	return nri.NewAPI(api)
+	return api
 }
 
 func getSandboxControllers(ic *plugin.InitContext) (map[string]sandbox.Controller, error) {
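
The hunks above show `getNRIAPI` returning the underlying `nriservice.API` instead of wrapping it with `nri.NewAPI`, which suggests the wrapper is now built at a point where the CRI implementation is already available. The rest of the change is not shown here, so the following is only a sketch of the general pattern (constructor injection), with hypothetical names (`podLookup`, `memStore`, and this `API`/`NewAPI` pair are illustrative, not containerd's actual signatures).

```go
package main

import (
	"errors"
	"fmt"
)

// podLookup is a hypothetical stand-in for the CRI implementation.
type podLookup interface {
	PodName(id string) string
}

// memStore is a toy in-memory lookup keyed by container ID.
type memStore map[string]string

func (m memStore) PodName(id string) string { return m[id] }

// API holds its dependency from construction onward, so there is no
// window in which cri is nil.
type API struct {
	cri podLookup
}

// NewAPI rejects a nil interface value instead of deferring the failure
// to the first exit event.
func NewAPI(cri podLookup) (*API, error) {
	if cri == nil {
		return nil, errors.New("cri implementation must not be nil")
	}
	return &API{cri: cri}, nil
}

// NotifyExit can be called as soon as NewAPI has returned successfully.
func (a *API) NotifyExit(id string) string {
	return a.cri.PodName(id)
}

func main() {
	api, err := NewAPI(memStore{"c1": "pod-a"})
	if err != nil {
		fmt.Println("construction failed:", err)
		return
	}
	// Safe even if an exit is handled before any later registration step.
	fmt.Println(api.NotifyExit("c1"))
}
```

The trade-off is that the dependency must exist before the API value is created, which is why construction has to move later in initialization or closer to the code that owns the CRI implementation.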