Microsoft announced the removal of nondistributable layers from their
images today. This makes the convert test fail since it assumes the
first layer is nondistributable on Windows during the test.
Signed-off-by: Phil Estes <estesp@amazon.com>
As a follow-up to adding a SandboxMetrics rpc to the core
sandbox service, the controller needs a corresponding rpc for CRI
and others to eventually implement.
This leaves the CRI (non-shim mode) controller unimplemented so that
the API addition can land as its own change first.
Signed-off-by: Danny Canter <danny@dcantah.dev>
To gather metrics/stats about a specific sandbox instance, it'd be nice to
have a dedicated rpc for this. Because the same "what kind of stats are going
to be returned" dilemma exists for sandboxes as well, I've re-used the metrics
type we have, as the data field is just an `any`, leaving the metrics returned
entirely up to the shim author. For CRI use cases this will just be cgroup and
Windows stats, as that's all that's supported right now.
Signed-off-by: Danny Canter <danny@dcantah.dev>
eventSendMu is causing severe lock contention when multiple processes
start and exit concurrently. Replace it with a different scheme for
maintaining causality w.r.t. start and exit events for a process which
does not rely on big locks for synchronization.
Keep track of all processes for which a Task(Exec)Start event has been
published and which have not yet exited in a map, keyed by their PID.
Processing exits then is as simple as looking up which process
corresponds to the PID. If there are no started processes known with
that PID, the PID must belong either to a process which was started by
s.Start() but which that s.Start() call has not yet added to the map
of running processes, or to a reparented process which we don't care about.
Handle the former case by having each s.Start() call subscribe to exit
events before starting the process. It checks if the PID has exited in
the time between it starting the process and publishing the TaskStart
event, handling the exit if it has. Exit events for reparented processes
received when no s.Start() calls are in flight are immediately
discarded, and events received during an s.Start() call are discarded
when the s.Start() call returns.
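
A minimal sketch of the subscribe-before-start idea described above, not the
actual shim code. The `tracker` type, its fields, and the `launch` callback are
hypothetical names chosen for illustration, and event publication is omitted.
The small mutex only guards map updates and is never held while a process is
being started.

```go
package main

import (
	"fmt"
	"sync"
)

// exitEvent is a hypothetical stand-in for a reaped process exit.
type exitEvent struct {
	pid, status int
}

// tracker is a simplified, hypothetical model of the scheme.
type tracker struct {
	mu       sync.Mutex
	running  map[int]string            // started, not yet exited, keyed by PID
	inFlight map[*[]exitEvent]struct{} // one exit buffer per in-flight start call
	handled  []string                  // record of handled exits, for the demo
}

func newTracker() *tracker {
	return &tracker{
		running:  make(map[int]string),
		inFlight: make(map[*[]exitEvent]struct{}),
	}
}

func (t *tracker) handleExit(id string, e exitEvent) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.handled = append(t.handled, fmt.Sprintf("%s exited with status %d", id, e.status))
}

// start subscribes to exits, launches the process, then checks whether the
// new PID already exited while the call was in flight.
func (t *tracker) start(id string, launch func() int) {
	missed := new([]exitEvent)
	t.mu.Lock()
	t.inFlight[missed] = struct{}{}
	t.mu.Unlock()

	pid := launch() // no lock is held while the process starts

	t.mu.Lock()
	delete(t.inFlight, missed) // unrelated buffered exits are dropped here
	var early *exitEvent
	for i := range *missed {
		if (*missed)[i].pid == pid {
			early = &(*missed)[i]
			break
		}
	}
	if early == nil {
		t.running[pid] = id
	}
	t.mu.Unlock()

	if early != nil {
		t.handleExit(id, *early)
	}
}

// processExit looks the PID up in the running map. Unknown PIDs are buffered
// for in-flight start calls, or discarded if there are none.
func (t *tracker) processExit(e exitEvent) {
	t.mu.Lock()
	id, ok := t.running[e.pid]
	if ok {
		delete(t.running, e.pid)
	} else {
		for buf := range t.inFlight {
			*buf = append(*buf, e)
		}
	}
	t.mu.Unlock()
	if ok {
		t.handleExit(id, e)
	}
}

func main() {
	t := newTracker()
	t.start("a", func() int { return 100 })
	t.processExit(exitEvent{pid: 100})
	// "b" exits before its start call has returned.
	t.start("b", func() int {
		t.processExit(exitEvent{pid: 200, status: 1})
		return 200
	})
	t.processExit(exitEvent{pid: 999}) // reparented process: discarded
	for _, h := range t.handled {
		fmt.Println(h)
	}
}
```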
Co-authored-by: Laura Brehm <laurabrehm@hey.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
When a container is in the created or exited state, it will not have stats. A common case for this in k8s is the init containers for a pod. They will be present in the listed containers but will not have a running task and therefore no stats.
Signed-off-by: James Sturtevant <jstur@microsoft.com>
This allows standard OTLP env vars to be used for configuring tracing
exporters.
Note: This does mean that, as written now, if no env var is set the
trace exporter will try to connect to the default OTLP address
(`localhost:4318`).
I've left this alone for now, but we could detect the OTLP vars
ourselves and skip configuring the exporter if none are set.
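
For illustration only, one way such detection could look. This is a sketch of
the possible follow-up, not something this change does. The helper name
`otlpConfigured` and the choice to check only the two endpoint variables are
assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// otlpEnvVars lists the standard OTLP endpoint variables checked by this
// sketch. Which variables to consider is an assumption.
var otlpEnvVars = []string{
	"OTEL_EXPORTER_OTLP_ENDPOINT",
	"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
}

// otlpConfigured (hypothetical helper) reports whether any of the endpoint
// variables is set to a non-empty value. The lookup function is injected so
// the check can be exercised without touching the real environment.
func otlpConfigured(lookup func(string) (string, bool)) bool {
	for _, key := range otlpEnvVars {
		if v, ok := lookup(key); ok && v != "" {
			return true
		}
	}
	return false
}

func main() {
	if otlpConfigured(os.LookupEnv) {
		fmt.Println("OTLP endpoint set; configuring trace exporter")
	} else {
		fmt.Println("no OTLP endpoint set; skipping trace exporter")
	}
}
```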
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The 10-containerd-net.conflist file generated from the conf_template
should be written atomically so that partial writes are not visible to
CNI plugins. Use the new consistentfile package to ensure this on
Unix-like platforms such as Linux, FreeBSD, and Darwin.
Fixes https://github.com/containerd/containerd/issues/8607
Signed-off-by: Samuel Karp <samuelkarp@google.com>
Certain files may need to be written atomically so that partial writes
are not visible to other processes. On Unix-like platforms such as
Linux, FreeBSD, and Darwin, this is accomplished by writing a temporary
file, syncing, and renaming over the destination file name. On Windows,
the same operations are performed, but Windows does not guarantee that a
rename operation is atomic.
Partial/inconsistent reads can occur due to:
1. A process attempting to read the file while containerd is writing it
   (both in the case of a new file with a short/incomplete write or in
   the case of an existing, updated file where new bytes may be written
   at the beginning but old bytes may still be present after).
2. Concurrent goroutines in containerd leading to multiple active
   writers of the same file.
The above mechanism explicitly protects against (1) as all writes are to
a file with a temporary name.
There is no explicit protection against multiple, concurrent goroutines
attempting to write the same file. However, atomically writing the file
should mean only one writer will "win" and a consistent file will be
visible.
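
The write, sync, and rename sequence described above can be sketched roughly
as follows. `writeFileAtomic` and `roundTrip` are hypothetical names and this
is not the consistentfile package's API. The sketch does not sync the parent
directory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic (hypothetical helper) writes data to a temporary file in
// the destination directory, syncs it, and renames it over path, so readers
// see either the old contents or the new contents.
func writeFileAtomic(path string, data []byte, perm os.FileMode) (retErr error) {
	// The temp file must live on the same filesystem as the destination
	// for the rename to replace it in a single step.
	tmp, err := os.CreateTemp(filepath.Dir(path), "."+filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	defer func() {
		if retErr != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err := tmp.Write(data); err != nil {
		return err
	}
	if err := tmp.Chmod(perm); err != nil {
		return err
	}
	if err := tmp.Sync(); err != nil {
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes the same file twice and returns what a reader sees
// afterwards.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.conflist")
	for _, content := range []string{"first", "second"} {
		if err := writeFileAtomic(path, []byte(content), 0o644); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := roundTrip()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```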
Signed-off-by: Samuel Karp <samuelkarp@google.com>
The initial PR had a check for nil metrics, but after some refactoring in the PR the test case that was supposed to cover HPC was missing a scenario where the metric was not nil but didn't contain any metrics. This fixes that case and adds a test case to cover it.
Signed-off-by: James Sturtevant <jstur@microsoft.com>
Go deprecation comments must be formatted to have an empty comment line before
them. Fix the formatting to make sure linters and editors detect that these
are deprecated.
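
As an illustration of the expected layout (the function names are
hypothetical and not taken from this change):

```go
package main

import "fmt"

// NewGreeting returns a greeting. It is a hypothetical replacement API.
func NewGreeting() string { return "hello" }

// OldGreeting returns a greeting.
//
// Deprecated: use [NewGreeting] instead.
func OldGreeting() string { return NewGreeting() }

func main() {
	fmt.Println(OldGreeting())
}
```

The empty `//` line puts the "Deprecated:" notice in its own paragraph of the
doc comment, which is the form the Go convention specifies.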
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>