* Add topologyScopeName parameter to NewManager().
* Add a scope interface and a struct that implement the common logic.
* Add pod and container scopes.
* Add pod lifecycle functions.
Co-authored-by: sw.han <sw.han@samsung.com>
Signed-off-by: Krzysztof Wiatrzyk <k.wiatrzyk@samsung.com>
This PR changes the buckets of the
kubelet_runtime_operation_duration_seconds metric to be
metrics.ExponentialBuckets(.005, 2.5, 14) in order to
allow debugging image pull times. Right now the biggest bucket is 10
seconds, which is an ordinary time frame to pull an image, making the
metric useless for the aforementioned use case.
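
To make the new bucket layout concrete, here is a minimal sketch that
reproduces the boundaries for the parameters above. exponentialBuckets is a
local stand-in written for illustration (an assumption, not the metrics
library function itself); it follows the usual start/factor/count semantics.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds. The first bound is start,
// and each following bound is the previous one multiplied by factor.
// This is a local illustration, not the library implementation.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	// Same parameters as in the commit message.
	buckets := exponentialBuckets(0.005, 2.5, 14)
	for i, b := range buckets {
		fmt.Printf("bucket %2d: le=%.3fs\n", i, b)
	}
}
```

With these parameters the largest finite bound is 0.005 * 2.5^13, roughly
745 seconds, so image pulls that take several minutes still land in a
distinct bucket.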
It covers deviceplugin & cpumanager.
It has a drawback: cpuset and all other structs, including cadvisor's, keep
CPU IDs as int, but for a protobuf-based interface it is better to have a
fixed-size int.
This patch also introduces an additional interface, CPUsProvider, although
extending DeviceProvider would also have been an option.
Checkpointing is not covered by unit tests.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
Check the in-memory cache to determine whether volumes are still mounted, and check the on-disk directory for volume paths instead of checking for mounted volumes.
Signed-off-by: Mucahit Kurt <mucahitkurt@gmail.com>
Ensure node name and file path are included in the hash which produces
the pod UID, otherwise all pods created from the same manifest have
the same UID across the cluster.
The real author of this code is Yu-Ju Hong <yjhong@google.com>.
I am resurrecting an abandoned PR and have changed the git author to pass
the CLA check.
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
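
The UID fix described above can be sketched as follows. This is a minimal
illustration, not the kubelet code: staticPodUID, the "host:" and "file:"
prefixes, and the choice of MD5 are all assumptions made for the example.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// staticPodUID is a hypothetical helper. It derives a UID from the manifest
// bytes plus the node name and the manifest's file path, so that the same
// manifest yields different UIDs on different nodes or paths.
func staticPodUID(manifest []byte, nodeName, filePath string) string {
	h := md5.New() // hash choice is an assumption for this sketch
	h.Write(manifest)
	fmt.Fprintf(h, "host:%s", nodeName)
	fmt.Fprintf(h, "file:%s", filePath)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	manifest := []byte("kind: Pod\nmetadata:\n  name: example\n")
	fmt.Println(staticPodUID(manifest, "node-a", "/etc/manifests/example.yaml"))
	fmt.Println(staticPodUID(manifest, "node-b", "/etc/manifests/example.yaml"))
}
```

Hashing only the manifest would print the same value twice; mixing in the
node name and file path is what makes the two UIDs differ.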
Based on the comments in this file, it seems like these import
restrictions were originally meant for the kubelet CRI streaming
package. This commit moves the import restrictions to
pkg/kubelet/cri/streaming so that pkg/kubelet/cri can import internal
packages like pkg/probe/exec.
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
This change also involves adding a custom error type for probe timeouts
so that the kubelet exec prober can distinguish between failed probes
that have exited and probes that have timed out.
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
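
The idea of a dedicated timeout error can be sketched as below. All names
here (TimeoutError, ExitError, classify) are hypothetical stand-ins chosen
for the example; they are not the types added by this change.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// TimeoutError is a hypothetical error type for a probe that ran past its
// deadline.
type TimeoutError struct {
	Timeout time.Duration
}

func (e *TimeoutError) Error() string {
	return fmt.Sprintf("command timed out after %s", e.Timeout)
}

// ExitError is a hypothetical error type for a probe command that ran to
// completion with a non-zero exit code.
type ExitError struct {
	Code int
}

func (e *ExitError) Error() string {
	return fmt.Sprintf("command exited with code %d", e.Code)
}

// classify shows how a caller can tell the two failure modes apart.
// errors.As also matches errors that wrap one of the types above.
func classify(err error) string {
	var timeout *TimeoutError
	var exit *ExitError
	switch {
	case err == nil:
		return "success"
	case errors.As(err, &timeout):
		return "timeout"
	case errors.As(err, &exit):
		return "failure"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify(&TimeoutError{Timeout: time.Second}))
	fmt.Println(classify(fmt.Errorf("probe: %w", &ExitError{Code: 1})))
}
```

A distinct type lets the prober handle timeouts explicitly instead of
inferring them from an exit code.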
This fixes a bug where exec timeouts are not respected with
containerd.
The exec prober expects a utilexec.CodeExitError on failed probes; otherwise
the prober returns 'Unknown' and a non-nil error, which the kubelet throws
away. As a temporary fix, ExecSync in the CRI remote runtime
should return utilexec.CodeExitError when the gRPC error code is
DeadlineExceeded. This ensures the exec prober reports exec
timeouts as real probe failures to the kubelet. We should also add a
TimededError type to k8s.io/utils/exec, since it doesn't really make
sense to use CodeExitError for exec timeouts.
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* Rename const for topology.../zone
* Rename const for topology.../region
* Rename const for failure-domain.../zone
* Rename const for failure-domain.../region
* Restore old names for compat
Pod metrics may not be the same as the sum of container metrics. Add support for pod-specific
metrics to allow for more accurate accounting of resources.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>