kubernetes

Author	SHA1	Message	Date
RainbowMango	debe2f7b43	Refactor TestRunningPodAndContainerCount with metrics testutil	2019-10-09 15:09:23 +08:00
Rajdeep Das	c02d49d775	Update running_pod_count and running_container_count metric As already mentioned in this issue https://github.com/kubernetes/kubernetes/issues/79286, some metrics like "running_pod_count" and "running_container_count" use non-standard prometheus metrics, this change converts them to be standard prometheus gauges Minor refactor in kubelet/pleg/generic.go and added some tests for running container and running pod metrics Fixed issues related to github CI pipeline failure * Updated bazel for new deps * Add comment for exported metrics variables, RunningContainerCount and RunningPodCount * Specify keys explicitly in Gauge metric instantiation Fix go lint errors Replace "+=1" with "++", as reported by go lint Set container state as a label for the metrics "running_container_count" As per the metrics name "running_container_count" it should "ideally" be showing the number of containers in "running" state, but it was showing all the container count, irrespective of the state it is in. This commit adds a new label "container_running_state" to the metrics "running_container_count", which doesn't change the base metrics but adds the option to query the metrics with "container_state" such as "running"/"unknown"/... remove unused methods reported by staticcheck Remove variables while instantiating gauge(vec) which are default set to nil Convert kubelet metrics(running_pod_count and running_container_count) to standard gauges and added label to running_container_count metrics. Currently kubelet metrics(running_pod_count and running_container_count) use non-standard prometheus collectors, this change converts them to standard prometheus gauges. Also this adds a new label(container_state) to running_container_count which does a breakdown of containers tracked by kubelet based on the containers' state(running/unknown/created/exited). Set stability explicitly for running_pod_count and running_container_count and reformat test register metrics explicitly in test, so that they don't become no-op	2019-08-29 17:23:04 +02:00
Khaled Henidak(Kal)	dba434c4ba	kubenet for ipv6 dualstack	2019-07-02 22:26:25 +00:00
changyaowei	850f4bbd36	modify random failure	2019-04-27 08:04:58 +08:00
changyaowei	123d1a925f	modify random failure	2019-04-15 20:26:00 +08:00
changyaowei	19f73899fc	modify test case	2019-02-13 16:27:15 +08:00
changyaowei	c70ee4272b	delete prometheus in unit testing	2019-01-31 12:18:02 +08:00
changyaowei	b52afc350f	when pleg channel is full, discard events and record how many events were discarded	2019-01-30 20:43:54 +08:00
Seth Jennings	5eab76934b	improve pleg error msg when it has never been successful	2018-10-01 16:41:01 -05:00
Dan Williams	8c16260160	kubelet: fix inconsistent display of terminated pod IPs by using events instead PLEG and kubelet race when reading and sending pod status to the apiserver. PLEG inserts status into a cache, and then signals kubelet. Kubelet then eventually reads the status out of that cache, but in the meantime the status could have been changed by PLEG. When a pod exits, pod status will no longer include the pod's IP address because the network plugin/runtime will report "" for terminated pod IPs. If this status gets inserted into the PLEG cache before kubelet gets the status out of the cache, kubelet will see a blank pod IP address. This happens in about 1/5 of cases when pods are short-lived, and somewhat less frequently for longer running pods. To ensure consistency for properties of dead pods, copy an old status update's IP address over to the new status update if (a) the new status update's IP is missing and (b) all sandboxes of the pod are dead/not-ready (e.g., no possibility for a valid IP from the sandbox). Fixes: https://github.com/kubernetes/kubernetes/issues/47265	2017-07-21 09:52:10 -05:00
Clayton Coleman	3e095d12b4	Refactor move of client-go/util/clock to apimachinery	2017-05-20 14:19:48 -04:00
deads2k	5a8f075197	move authoritative client-go utils out of pkg	2017-01-24 08:59:18 -05:00
deads2k	c47717134b	move utils used in restclient to client-go	2017-01-19 07:55:14 -05:00
deads2k	6a4d5cd7cc	start the apimachinery repo	2017-01-11 09:09:48 -05:00
Yu-Ju Hong	a49d28710a	Extend PLEG to handle pod sandboxes PLEG will treat them as if they are regular containers and detect changes in the same manner. Note that this makes an assumption that container IDs will not collide with the podsandbox IDs.	2016-08-30 09:54:24 -07:00
Harry Zhang	cb14b35bde	Refactor util clock into its own pkg	2016-07-28 02:29:04 -04:00
Ron Lai	a58c774c08	Including ContainerRemoved in PLEG event reporting	2016-07-14 16:39:03 -07:00
David McMahon	ef0c9f0c5b	Remove "All rights reserved" from all the headers.	2016-06-29 17:47:36 -07:00
Dan Williams	9865ac325c	kubelet/cni: make cni plugin runtime agnostic Use the generic runtime method to get the netns path. Also move reading the container IP address into cni (based off kubenet) instead of having it in the Docker manager code. Both old and new methods use nsenter and /sbin/ip and should be functionally equivalent.	2016-06-22 11:36:10 -05:00
Andy Goldstein	3a87bfb6f7	PLEG: reinspect pods that failed prior inspections Fix the following sequence of events: 1. relist call 1 successfully inspects a pod (just has infra container) 2. relist call 2 gets an error inspecting the same pod (has infra container and a transient container that failed to create) and doesn't update the old/new pod records 3. relist calls 3+ don't inspect the pod any more (just has infra container so it doesn't look like anything changed) This change adds a new list that keeps track of pods that failed inspection and retries them the next time relist is called. Without this change, a pod in this state would never be inspected again, its entry in the status cache would never be updated, and the pod worker would never call syncPod again because the most recent entry in the status cache has an error associated with it. Without this change, pods in this state would be stuck Terminating forever, unless the user issued a deletion with a grace period value of 0.	2016-05-03 11:06:35 -04:00
harry	b0900bf0d4	Refactor diff into sub pkg	2016-03-21 20:21:39 +08:00
k8s-merge-robot	3f16f5f2b8	Merge pull request #22233 from yujuhong/pleg_health Auto commit by PR queue bot	2016-03-03 11:01:26 -08:00
Yu-Ju Hong	4846c1e1b2	pleg: add an internal clock for testability Also add tests for the health check.	2016-03-01 17:53:03 -08:00
Yu-Ju Hong	e770f25882	pleg: add more tests for detecting missing container/pods	2016-03-01 17:23:23 -08:00
Tim St. Clair	7b6d843309	Move test-only files to test-only packages	2016-03-01 09:11:32 -08:00
Yu-Ju Hong	b56ed1a8c2	Support populating the runtime cache in PLEG This change does not turn on this feature (cache) for kubelet.	2016-01-13 10:19:47 -08:00
Yu-Ju Hong	73a4f8225c	PLEG should report events if a container is removed Currently, pleg would report an event if a container transitions from running to exited between relisting. However, it would not report any event if a container gets stopped and removed between relisting. This event will eventually be handled when the pod syncs periodically, but this is undesirable. This change ensures that we detect all such events.	2016-01-12 16:25:19 -08:00
Random-Liu	3cbdf79f8c	Change original PodStatus to APIPodStatus, and start using kubelet internal PodStatus in dockertools	2015-12-04 17:37:39 -08:00
Yu-Ju Hong	bc6414a873	kubelet: add a generic pod lifecycle event generator This change introduces pod lifecycle event generator (PLEG), and adds a generic PLEG. The generic PLEG relies on relisting to discover container events, and is container-runtime-agnostic. Both docker and rkt are changed to use generic PLEG.	2015-11-13 09:55:36 -08:00

29 Commits
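
Several of the commits above describe the same mechanism: the generic PLEG discovers container events by relisting and comparing snapshots (bc6414a873), it also reports containers that vanish between relists (73a4f8225c), and it discards events when the event channel is full while counting the discards (b52afc350f). The following is a minimal sketch of that idea, not the code in kubelet/pleg/generic.go. The types `State` and `Event`, the functions `diff` and `send`, and the event-type strings are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// State is a simplified container state (hypothetical, for illustration only).
type State string

const (
	Running State = "running"
	Exited  State = "exited"
)

// Event is a pared-down lifecycle event (hypothetical type).
type Event struct {
	ContainerID string
	Type        string
}

// diff compares two relist snapshots (container ID -> state) and returns at
// most one event per container whose state changed, sorted by container ID.
func diff(old, cur map[string]State) []Event {
	var events []Event
	for id, s := range cur {
		if prev, ok := old[id]; ok && prev == s {
			continue // unchanged since the previous relist
		}
		switch s {
		case Running:
			events = append(events, Event{id, "ContainerStarted"})
		case Exited:
			events = append(events, Event{id, "ContainerDied"})
		}
	}
	// Containers present in the old snapshot but absent from the new one
	// were stopped and removed between relists.
	for id := range old {
		if _, ok := cur[id]; !ok {
			events = append(events, Event{id, "ContainerRemoved"})
		}
	}
	sort.Slice(events, func(i, j int) bool {
		return events[i].ContainerID < events[j].ContainerID
	})
	return events
}

// send delivers e without blocking; if the channel buffer is full the event
// is dropped and the discard counter is incremented.
func send(ch chan<- Event, e Event, discarded *int) {
	select {
	case ch <- e:
	default:
		*discarded++
	}
}

func main() {
	old := map[string]State{"a": Running, "b": Running}
	cur := map[string]State{"a": Exited, "c": Running}

	ch := make(chan Event, 2) // deliberately small buffer to force a discard
	discarded := 0
	for _, e := range diff(old, cur) {
		send(ch, e, &discarded)
	}
	fmt.Println("queued:", len(ch), "discarded:", discarded)
}
```

The non-blocking `select` with a `default` branch keeps a slow consumer from stalling the relist loop, at the cost of lost events; the sketch assumes a later periodic sync would pick up anything that was dropped.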