77 Commits

Author SHA1 Message Date
Maksym Pavlenko
6f34da5f80 Cleanup logrus imports
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-05-05 11:54:14 -07:00
Maksym Pavlenko
ef516a1507 Remove runtime v1
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2023-03-15 09:18:14 -07:00
Akihiro Suda
b61988670c go.mod: github.com/containerd/typeurl/v2 v2.1.0
Changes: https://github.com/containerd/typeurl/compare/7f6e6d160d67...v2.1.0

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2023-02-11 23:39:52 +09:00
Wei Fu
6b7e237fc7 chore: use go fix to cleanup old +build buildtag
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-12-29 14:25:14 +08:00
Kazuyoshi Kato
6596a70861 Use github.com/containerd/cgroups/v3 to remove gogo
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-11-14 21:07:48 +00:00
Daniel Canter
3a2197f5fe metrics/cgroups/v1: Remove unused event parameter
The event parameter wasn't actually used when processing oom events,
likely because it's only ever available for reads.

Additionally clarify flush is for eventfds, and point to where the
buffer size of 8 is coming from.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
2022-09-02 20:38:09 -07:00
Kazuyoshi Kato
88c0c7201e Consolidate gogo/protobuf dependencies under our own protobuf package
This would make gogo/protobuf migration easier.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
2022-04-19 15:53:36 +00:00
Nguyen Phan Huy
c525aa5f85 Set timeout when collecting metrics from shim's Stat
Signed-off-by: Nguyen Phan Huy <phanhuy1502@gmail.com>
2022-04-12 10:49:29 +08:00
Wei Fu
8a1280b2b6 metrics/cgroups: fix deadlock issue in Add during Collect
Collector.Collect is registered as the Collect callback of the ns field,
which is invoked periodically while ns holds its internal lock. Collector.Add
also takes ns.Lock while holding Collector.Lock, which can easily deadlock.

Goroutine X:

	ns.Collect
	  ns.Lock
	    Collector.Collect
	      Collector.RLock

Goroutine Y:

	Collector.Add
	  Collector.Lock
	    ns.Lock

We should use ns.Lock without Collector.Lock in Add.

Fix: #6772

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2022-04-10 09:17:21 +08:00
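
The lock-order inversion described in 8a1280b2b6 can be sketched in a few lines of Go. This is a minimal illustration, not the containerd code: the `namespace` and `collector` types, their fields, and the `collectTasks` callback are hypothetical stand-ins for `ns` and `Collector`. The point it shows is that `Add` releases its own lock before taking the namespace lock, so no goroutine ever holds both in the opposite order from `Collect`.

```go
package main

import (
	"fmt"
	"sync"
)

// namespace is a hypothetical stand-in for the metrics namespace ("ns").
// It holds its own lock while invoking the registered collect callback.
type namespace struct {
	mu      sync.Mutex
	metrics []string
	collect func()
}

// Add registers a metric under the namespace lock.
func (n *namespace) Add(m string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.metrics = append(n.metrics, m)
}

// Collect runs the callback while the namespace lock is held.
func (n *namespace) Collect() {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.collect != nil {
		n.collect()
	}
}

// collector is a hypothetical stand-in for Collector.
type collector struct {
	mu    sync.RWMutex
	tasks map[string]struct{}
	ns    *namespace
}

func newCollector() *collector {
	c := &collector{tasks: map[string]struct{}{}, ns: &namespace{}}
	c.ns.collect = c.collectTasks
	return c
}

// collectTasks is called with ns.mu held, so the order is ns.mu then c.mu.
func (c *collector) collectTasks() {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_ = len(c.tasks)
}

// Add releases c.mu before touching the namespace, so it never holds
// c.mu while waiting for ns.mu.
func (c *collector) Add(id string) {
	c.mu.Lock()
	c.tasks[id] = struct{}{}
	c.mu.Unlock()
	c.ns.Add("task:" + id)
}

func main() {
	c := newCollector()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func(i int) { defer wg.Done(); c.Add(fmt.Sprintf("t%d", i)) }(i)
		go func() { defer wg.Done(); c.ns.Collect() }()
	}
	wg.Wait()
	fmt.Println(len(c.tasks))
}
```

If `Add` instead called `c.ns.Add` while still holding `c.mu`, goroutine X (in `Collect`, holding `ns.mu`, waiting for `c.mu`) and goroutine Y (in `Add`, holding `c.mu`, waiting for `ns.mu`) could block each other, which is the cycle the commit message draws.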
Akihiro Suda
d3aa7ee9f0 Run go fmt with Go 1.17
The new `go fmt` adds `//go:build` lines (https://golang.org/doc/go1.17#tools).

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2021-08-22 09:31:50 +09:00
Derek McGowan
0a0621bb47 Move plugin context events into separate plugin
Signed-off-by: Derek McGowan <derek@mcg.dev>
2021-08-05 22:59:20 -07:00
Maksym Pavlenko
efa8ab7158 Add runtime label to metrics
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-07-23 10:04:46 -07:00
Akihiro Suda
2f601013e6 cgroup2: implement containerd.events.TaskOOM event
How to test (from https://github.com/opencontainers/runc/pull/2352#issuecomment-620834524):
  (host)$ sudo swapoff -a
  (host)$ sudo ctr run -t --rm --memory-limit $((1024*1024*32)) docker.io/library/alpine:latest foo
  (container)$ sh -c 'VAR=$(seq 1 100000000)'

An event `/tasks/oom {"container_id":"foo"}` will be displayed in `ctr events`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-06-01 14:00:13 +09:00
Michael Crosby
d654dbafac Allow the id for cgroup metrics to be changed
This makes the metrics package more extensible by allowing the default name of
`container_id` to be changed by the package caller.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2020-03-26 11:55:44 -04:00
Michael Crosby
1239f54035 export cgroups collectors
This makes it easier to extend the collectors to be used by external code and
task managers

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2020-03-06 12:51:22 -05:00
Boris Popovschi
3eb57b01be Added IO metrics
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-01-15 14:35:47 +02:00
Boris Popovschi
b9d9bdf1fd make cpu metrics consistent with v2 docs
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2019-12-17 12:22:55 +02:00
Boris Popovschi
929ab521c6 fix system usage naming
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2019-12-17 11:05:28 +02:00
Boris Popovschi
23dbae3e71 Schema name fix
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2019-12-17 10:40:11 +02:00
Boris Popovschi
17d61d6b7e Units fix
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2019-12-16 19:54:21 +02:00
bpopovschi
f287bc2292 Schema names fix
Signed-off-by: bpopovschi <zyqsempai@mail.ru>
2019-12-16 19:28:42 +02:00
bpopovschi
6bfb24824b Fix prometheus metrics units
Signed-off-by: bpopovschi <zyqsempai@mail.ru>
2019-12-16 18:27:50 +02:00
bpopovschi
b98cc79184 Added memory and cpu metrics for cgroupv2
Signed-off-by: bpopovschi <zyqsempai@mail.ru>
2019-12-16 16:10:51 +02:00
Akihiro Suda
43fca9eba2 metrics: rename pids_v2 to pids
Discussed in #3726

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-13 15:41:08 +09:00
Akihiro Suda
8f870c233f support cgroup2
* only shim v2 runc v2 ("io.containerd.runc.v2") is supported
* only PID metrics are implemented. Others should be implemented in separate PRs.
* lots of code is duplicated between v1 metrics and v2 metrics. Deduplication should be a separate PR.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-12-12 02:56:51 +09:00
Michael Crosby
f3148d0b98 Add metrics type alias
This will help to decouple the import in CRI from the cgroups package
directly by importing the type alias in containerd repo.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2019-09-19 14:13:56 -04:00
ethan
a80db38c33 blkio.go: correct spelling in help messages.
Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>
2019-08-13 09:41:25 +08:00
Mike Brown
879b2ae291 allow idempotence when adding a task to cgroup metrics collection
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2018-10-18 01:32:56 -05:00
Stephen Day
a88b631961 Merge pull request #2471 from crosbymichael/fatal
Don't fatal on epoll wait
2018-07-23 14:17:57 -07:00
Michael Crosby
9743ff21c9 Don't fatal on epoll wait
This removes a log fatal on epoll wait for OOM events.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-07-18 16:40:31 -04:00
Michael Crosby
da1b5470cd Runtime v2
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-07-17 10:21:29 -04:00
Evan Hazlett
cae94b930d linux -> runtime/linux
Signed-off-by: Evan Hazlett <ejhazlett@gmail.com>
2018-05-30 09:23:10 -04:00
Kir Kolyshkin
bbe14f0a2e Switch from x/net/context to context
Since Go 1.7, context is a standard package, superseding the
"x/net/context". Since Go 1.9, the latter only provides a few type
aliases from the former. Therefore, it makes sense to switch to the
standard package.

This commit was generated by the following script (with a couple of
minor fixups to remove extra changes done by goimports):

	#!/bin/bash

	if [ $# -ge 1 ]; then
		FILES=$*
	else
		FILES=$(git ls-files \*.go | grep -vF ".pb.go" | grep -v ^vendor/)
	fi

	for f in $FILES; do
		printf .
		sed -i -e 's|"golang.org/x/net/context"$|"context"|' $f
		goimports -w $f
		awk '	/^$/ {e=1; next;}
			/[[:space:]]"context"$/ {e=0;}
			{if (e) {print ""; e=0}; print;}' < $f > $f.new && \
				mv $f.new $f
		goimports -w $f
	done
	echo

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2018-04-24 14:33:34 -07:00
Kunal Kushwaha
b12c3215a0 Licence header added
Signed-off-by: Kunal Kushwaha <kushwaha_kunal_v7@lab.ntt.co.jp>
2018-02-19 10:32:26 +09:00
Daniel Nephin
184bc25629 Add unconvert linter
This linter checks for unnecessary type conversions.

Some conversions are whitelisted because their type is different
on 32-bit platforms

Signed-off-by: Daniel Nephin <dnephin@gmail.com>
2018-01-09 17:36:44 -05:00
Daniel Nephin
8fe12adc20 Warn if OOM monitoring is not available
Instead of failing with an error

Signed-off-by: Daniel Nephin <dnephin@gmail.com>
2017-11-23 15:10:44 -05:00
Stephen J Day
09b5ca1072 api/events: split event types from events service
To avoid importing all of grpc when consuming events, the types of
events have been split into a separate package. This should allow a
reduction in memory usage in cases where a package is consuming events
but not using the gRPC service directly.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-11-16 15:20:46 -08:00
Kenfe-Mickael Laventure
d8e489443c linux: Ensure count is 64-bit aligned for proper atomic use on 32-bit machines
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-10-16 10:15:01 -07:00
Stephen J Day
8508e8252b plugin: refactor plugin system to support initialization reporting
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-10-10 16:40:47 -07:00
Stephen J Day
77e5f6553c metrics/cgroups: handle error on call to cgroup
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-10-06 15:51:20 -07:00
Phil Estes
987fcd1201 Merge pull request #1598 from Random-Liu/fix-load-task
Fix task load.
2017-10-06 16:38:40 -04:00
Michael Crosby
d92f6eea1f Allow blocking and non-blocking metrics collection
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-10-06 12:05:56 -04:00
Mathieu Pasquet
ed519bb5ce Collect cgroup stats one last time before exit
This commit adds a collection step in the Stop() task handler which will
retrieve the metrics available for this container at that time, and
store them until the next prometheus Collect() cycle.

This allows short-lived containers to be visible in prometheus, which
would otherwise be ignored (for example, running containerd-stress would
show something like 2 or 3 containers in the end, while now we can see
all of them). It also allows for more accurate collection when
long-running containers end (for example CPU usage could spike in the
last few seconds).

A simple case illustrating this with cpu usage would be:

  ctr run -t --rm docker.io/library/alpine:latest mycontainer sh -c 'yes > /dev/null & sleep 3 && pkill yes'

Signed-off-by: Mathieu Pasquet <mathieu.pasquet@alterway.fr>
2017-10-06 15:40:38 +02:00
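
The idea in ed519bb5ce, taking one last sample when a task stops and holding it until the next Collect cycle, can be sketched as follows. This is a simplified illustration under stated assumptions: `metricsStore`, `sample`, `stats`, and the reader callbacks are hypothetical names, and the real code works with cgroup stats and the prometheus collector interface rather than plain slices.

```go
package main

import (
	"fmt"
	"sync"
)

// stats is a hypothetical, reduced stand-in for a cgroup stats snapshot.
type stats struct{ cpuNanos uint64 }

// sample pairs a snapshot with the task it came from.
type sample struct {
	id string
	s  stats
}

// metricsStore is a hypothetical collector that tracks live tasks and
// keeps the final samples of stopped tasks until the next Collect call.
type metricsStore struct {
	mu     sync.Mutex
	live   map[string]func() stats
	stored []sample
}

func newMetricsStore() *metricsStore {
	return &metricsStore{live: map[string]func() stats{}}
}

// Add starts tracking a task; read returns its current stats.
func (m *metricsStore) Add(id string, read func() stats) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.live[id] = read
}

// Stop takes one last sample before the task disappears and stores it.
func (m *metricsStore) Stop(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if read, ok := m.live[id]; ok {
		m.stored = append(m.stored, sample{id: id, s: read()})
		delete(m.live, id)
	}
}

// Collect returns the stored final samples plus fresh samples from live
// tasks, then forgets the stored ones.
func (m *metricsStore) Collect() []sample {
	m.mu.Lock()
	defer m.mu.Unlock()
	out := m.stored
	m.stored = nil
	for id, read := range m.live {
		out = append(out, sample{id: id, s: read()})
	}
	return out
}

func main() {
	m := newMetricsStore()
	m.Add("short-lived", func() stats { return stats{cpuNanos: 42} })
	m.Stop("short-lived") // the task exits before any Collect cycle ran
	fmt.Println(len(m.Collect()), len(m.Collect()))
}
```

Without the sample taken in `Stop`, a task that starts and exits between two Collect cycles would never appear in the output, which matches the containerd-stress observation in the commit message.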
Lantao Liu
28ca8f05d3 Fix task load.
Signed-off-by: Lantao Liu <lantaol@google.com>
2017-10-05 21:03:24 +00:00
Michael Crosby
451421b615 Comment more packages to pass go lint
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-10-02 13:54:56 -04:00
Michael Crosby
72bcdb8fa9 Add config for exporting container metrics to prom
This adds an option for the cgroups monitor to include container metrics
in the prometheus output. We will still have to use the plugin to emit OOM
events via the events service, but when the `no_prom` setting is set for
the plugin, container metrics will not be included in the prom output.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-07 13:40:55 -04:00
Michael Crosby
2ed3c62e27 Update cgroups to 5933ab4dc4f7caa3a73a1dc141bd11f4
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-06 16:20:19 -04:00
Michael Crosby
ed45952826 Use cgroups proto for prom metrics
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-05 17:26:26 -04:00
Michael Crosby
697dcdd407 Refactor task service metrics
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-05 17:26:26 -04:00
Michael Crosby
b04e408a4b Convert OOM Metric to Const
This converts the oom metric to be a const metric so that deleted tasks
do not fill up the metric labels.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-09-01 16:43:30 -04:00