kubernetes

Author	SHA1	Message	Date
Chen Wang	7db339dba2	This commit contains the following: 1. Scheduler bug-fix + scheduler-focussed E2E tests 2. Add cgroup v2 support for in-place pod resize 3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes. Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>	2023-02-24 18:21:21 +00:00
Vinay Kulkarni	f2bd94a0de	In-place Pod Vertical Scaling - core implementation 1. Core Kubelet changes to implement In-place Pod Vertical Scaling. 2. E2E tests for In-place Pod Vertical Scaling. 3. Refactor kubelet code and add missing tests (Derek's kubelet review) 4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature. 5. Fix corner-case where resize A->B->A gets ignored 6. Add cgroup v2 support to pod resize E2E test. KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>	2023-02-24 18:21:21 +00:00
Harshal Patil	86284d42f8	Add support for Evented PLEG Signed-off-by: Harshal Patil <harpatil@redhat.com> Co-authored-by: Swarup Ghosh <swghosh@redhat.com>	2022-11-08 20:06:16 +05:30
David Ashpole	64af1adace	Second attempt: Plumb context to Kubelet CRI calls (#113591 ) * plumb context from CRI calls through kubelet * clean up extra timeouts * try fixing incorrectly cancelled context	2022-11-05 06:02:13 -07:00
Antonio Ojea	9c2b333925	Revert "plumb context from CRI calls through kubelet" This reverts commit `f43b4f1b95`.	2022-11-02 13:37:23 +00:00
David Ashpole	f43b4f1b95	plumb context from CRI calls through kubelet	2022-10-28 02:55:28 +00:00
XuzhengChang	6266554b34	refactor: pleg/getContainersFromPods	2022-04-06 14:12:52 +08:00
Patrick Ohly	edffc700a4	enhance and fix log calls Some of these changes are cosmetic (repeatedly calling klog.V instead of reusing the result), others address real issues: - Logging a message only above a certain verbosity threshold without recording that verbosity level (if klog.V().Enabled() { klog.Info... }): this matters when using a logging backend which records the verbosity level. - Passing a format string with parameters to a logging function that doesn't do string formatting. All of these locations were found by the enhanced logcheck tool from https://github.com/kubernetes/klog/pull/297. In some cases it reports false positives, but those can be suppressed with source code comments.	2022-03-24 11:13:50 +01:00
Patrick Ohly	9eaa2dc554	avoid klog Info calls without verbosity In the following code pattern, the log message will get logged with v=0 in JSON output although conceptually it has a higher verbosity: if klog.V(5).Enabled() { klog.Info("hello world") } Having the actual verbosity in the JSON output is relevant, for example for filtering out only the important info messages. The solution is to use klog.V(5).Info or something similar. Whether the outer if is necessary at all depends on how complex the parameters are. The return value of klog.V can be captured in a variable and be used multiple times to avoid the overhead for that function call and to avoid repeating the verbosity level.	2022-01-12 07:48:36 +01:00
Kubernetes Prow Robot	5f4914604d	Merge pull request #106353 from gjkim42/remove-false-pleg-errors kubelet: Remove false PLEG errors	2022-01-11 10:48:26 -08:00
Sascha Grunert	de37b9d293	Make CRI `v1` the default and allow a fallback to `v1alpha2` This patch makes the CRI `v1` API the new project-wide default version. To allow backwards compatibility, a fallback to `v1alpha2` has been added as well. This fallback can be determined automatically by the kubelet. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-11-17 11:05:05 -08:00
Gunju Kim	2dd4a00509	kubelet: Remove false PLEG errors	2021-11-12 00:03:01 +09:00
wojtekt	53ce79a18a	Migrate to k8s.io/utils/clock in pkg/kubelet	2021-09-10 12:20:09 +02:00
yuzhiquan	d483872d64	fix potential nil pointer	2021-04-26 15:31:34 +08:00
Kubernetes Prow Robot	27e23967f4	Merge pull request #99880 from Dragoncell/pleg-log Add exit code log when container died	2021-04-22 13:18:01 -07:00
Jiaming Xu	5f8dd349d1	Add exit code log when container died update log exit code logic adjust log exit code logic fix invalid memory access in unit test adjust log update log message address latest comment change logging format remove space in key of log address latest comments address comments	2021-04-20 00:19:16 +00:00
Paco Xu	54606db1b4	Update pkg/kubelet/pleg/generic.go Co-authored-by: Elana Hashman <ehashman@users.noreply.github.com>	2021-03-26 13:19:51 +08:00
pacoxu	3fc1e0891b	Update the kubelet log status to level 6 as it is so big Signed-off-by: pacoxu <paco.xu@daocloud.io>	2021-03-26 10:09:20 +08:00
Geonju Kim	221025ce74	Migrate `pkg/kubelet/pleg` to structured logging	2021-02-14 07:24:28 +09:00
xiaofei.sun	a724481f5c	fix metrics kubelet_running_pod_count	2020-07-31 16:35:53 +08:00
Sergey Kanzhelev	ee53488f19	fix golint issues in pkg/kubelet/container	2020-06-19 15:48:08 +00:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Kubernetes Prow Robot	de34d2ce1e	Merge pull request #87193 from mattjmcnaughton/mattjmcnaughton/cleanup-rkt-code-in-pleg Clean up rkt specific code in `pkg/kubelet/pleg`	2020-01-14 22:21:46 -08:00
mattjmcnaughton	ab7e0f58d5	Clean up rkt specific code in `pkg/kubelet/pleg` Clean up code in PLEG which was only necessary for the `rkt` runtime. Rkt is no longer a built-in runtime and docker(shim) uses the CRI, so it's safe to remove this code entirely. This diff removes the last mentions of `rkt` in the kubelet.	2020-01-14 07:42:30 -05:00
danielqsj	1a9b121764	remove deprecated metrics of kubelet	2020-01-10 16:46:52 +08:00
Kubernetes Prow Robot	49bc696614	Merge pull request #86251 from bboreham/pleg-last-seen-metric Kubelet: add a metric to observe time since PLEG last seen	2020-01-06 18:06:18 -08:00
Bryan Boreham	cc0b3e82eb	Kubelet: add a metric to observe time since PLEG last seen Expose the measurement that kubelet uses to judge that "PLEG is unhealthy". If we can observe the measurement growing then we can alert before the node goes unhealthy. Note that the existing metrics PLEGRelistInterval and PLEGRelistDuration are poor for this, because when relist() gets stuck they are never updated. Signed-off-by: Bryan Boreham <bryan@weave.works>	2020-01-03 10:01:27 +00:00
yiyang5055	0f410d625a	change CounterVec to use Counter in the Kubelet's Pod Lifecycle Event Generator	2019-12-11 23:51:28 +08:00
Rajdeep Das	c02d49d775	Update running_pod_count and running_container_count metric As already mentioned in this issue https://github.com/kubernetes/kubernetes/issues/79286, some metrics like "running_pod_count" and "running_container_count" use non-standard prometheus metrics, this change converts them to be standard prometheus gauges Minor refactor in kubelet/pleg/generic.go and added some test for running container and running pod metrics Fixed issues related to github CI pipeline failure * Updated bazel for new deps * Add comment for exported metrics variables, RunningContainerCount and RunningPodCount * Specify keys explicitly in Gauge metric instantiation Fix go lint errors Replace "+=1" with "++", as reported by go lint Set container state as a label for the metrics "running_container_count" As per the metrics name "running_container_count" it should "ideally" be showing the number of containers in "running" state, but it was showing all the container count, irrespective of the state it is in. This commit adds a new label "container_running_state" to the metrics "running_container_count", which doesn't change the base metrics but adds the option to query the metrics with "container_state" such as "running"/"unknown/... remove unused methods reported by staticcheck Remove variables while instantiating gauge(vec) which are default set to nil Convert kubelet metrics(running_pod_count and running_container_count) to standard gauges and added label to running_container_count metrics. Currently kubelet metrics(running_pod_count and running_container_count) use non-standard prometheus collectors; this change converts them to standard prometheus gauges. Also this adds a new label(container_state) to running_container_count which does a breakdown of containers tracked by kubelet based on the containers' state(running/unknown/created/exited). Set stability explicitly for running_pod_count and running_container_count and reformat test register metrics explicitly in test, so that they don't become no-op	2019-08-29 17:23:04 +02:00
Tim Allclair	a2c51674cf	Cleanup more static check issues (S1,ST)	2019-08-21 10:40:21 -07:00
Khaled Henidak(Kal)	dba434c4ba	kubenet for ipv6 dualstack	2019-07-02 22:26:25 +00:00
Davanum Srinivas	33081c1f07	New staging repository for cri-api Change-Id: I2160b0b0ec4b9870a2d4452b428e395bbe12afbb	2019-03-26 18:21:04 -04:00
danielqsj	79a3eb816c	rename latency to duration in metrics	2019-02-18 17:40:04 +08:00
danielqsj	9fd99a48f5	Change kubelet metrics to conform guideline	2019-02-18 14:01:58 +08:00
Kubernetes Prow Robot	289a60ad71	Merge pull request #72709 from changyaowei/pleg_relist When pleg channel is full, discard events and record its count	2019-02-13 01:44:48 -08:00
xichengliudui	5dd26ecab5	Fix function comment to consistent with its name update pull request update pull request	2019-02-12 01:37:20 -05:00
changyaowei	b52afc350f	when pleg channel is full, discard events and record how many events discard	2019-01-30 20:43:54 +08:00
Robert Krawitz	3373fcf0fc	Reduce logspam for crash looping containers	2018-11-28 10:48:52 -05:00
Davanum Srinivas	954996e231	Move from glog to klog - Move from the old github.com/golang/glog to k8s.io/klog - klog as explicit InitFlags() so we add them as necessary - we update the other repositories that we vendor that made a similar change from glog to klog * github.com/kubernetes/repo-infra * k8s.io/gengo/ * k8s.io/kube-openapi/ * github.com/google/cadvisor - Entirely remove all references to glog - Fix some tests by explicit InitFlags in their init() methods Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135	2018-11-10 07:50:31 -05:00
k8s-ci-robot	45f6845a59	Merge pull request #69008 from sjenning/better-pleg-msg improve pleg error msg when it has never been successful	2018-10-30 16:15:43 -07:00
Seth Jennings	5eab76934b	improve pleg error msg when it has never been successful	2018-10-01 16:41:01 -05:00
Pingan2017	158552ff35	fix golint failures - /pkg/kubelet/images	2018-09-17 10:52:25 +08:00
Lee Verberne	e10042d22f	Increment CRI version from v1alpha1 to v1alpha2 This also incorporates the version string into the package name so that incompatibile versions will fail to connect. Arbitrary choices: - The proto3 package name is runtime.v1alpha2. The proto compiler normally translates this to a go package of "runtime_v1alpha2", but I renamed it to "v1alpha2" for consistency with existing packages. - kubelet/apis/cri is used as "internalapi". I left it alone and put the public "runtimeapi" in kubelet/apis/cri/runtime.	2018-02-07 09:06:26 +01:00
Marcin Owsiany	36dc1c4515	Fix typo in function name. Also remove a superfluous comment.	2017-10-17 11:31:46 +02:00
Kubernetes Submit Queue	28df7a1cae	Merge pull request #47806 from dcbw/fix-pod-ip-race Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. kubelet: fix inconsistent display of terminated pod IPs PLEG and kubelet race when reading and sending pod status to the apiserver. PLEG inserts status into a cache, and then signals kubelet. Kubelet then eventually reads the status out of that cache, but in the mean time the status could have been changed by PLEG. When a pod exits, pod status will no longer include the pod's IP address because the network plugin/runtime will report "" for terminated pod IPs. If this status gets inserted into the PLEG cache before kubelet gets the status out of the cache, kubelet will see a blank pod IP address. This happens in about 1/5 of cases when pods are short-lived, and somewhat less frequently for longer running pods. To ensure consistency for properties of dead pods, copy an old status update's IP address over to the new status update if (a) the new status update's IP is missing and (b) all sandboxes of the pod are dead/not-ready (eg, no possibility for a valid IP from the sandbox). Fixes: https://github.com/kubernetes/kubernetes/issues/47265 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1449373 @eparis @freehan @kubernetes/rh-networking @kubernetes/sig-network-misc	2017-09-22 21:01:50 -07:00
Casey Davenport	be5cd7fed2	Recreate pod sandbox when the sandbox does not have an IP address.	2017-09-15 09:23:52 -07:00
Dan Williams	8c16260160	kubelet: fix inconsistent display of terminated pod IPs by using events instead PLEG and kubelet race when reading and sending pod status to the apiserver. PLEG inserts status into a cache, and then signals kubelet. Kubelet then eventually reads the status out of that cache, but in the mean time the status could have been changed by PLEG. When a pod exits, pod status will no longer include the pod's IP address because the network plugin/runtime will report "" for terminated pod IPs. If this status gets inserted into the PLEG cache before kubelet gets the status out of the cache, kubelet will see a blank pod IP address. This happens in about 1/5 of cases when pods are short-lived, and somewhat less frequently for longer running pods. To ensure consistency for properties of dead pods, copy an old status update's IP address over to the new status update if (a) the new status update's IP is missing and (b) all sandboxes of the pod are dead/not-ready (eg, no possibility for a valid IP from the sandbox). Fixes: https://github.com/kubernetes/kubernetes/issues/47265	2017-07-21 09:52:10 -05:00
Kubernetes Submit Queue	c1f8fcd9fe	Merge pull request #45496 from andyxning/fix_pleg_relist_time Automatic merge from submit-queue fix pleg relist time This PR fixes the PLEG relist time. In the current implementation, a `Healthy` method periodically checks the relist time. If the current timestamp minus the latest relist time exceeds `relistThreshold` (default is 3 minutes), it should return an error to indicate a runtime problem. The `relist` method is also called periodically. If the runtime (docker) hangs, the `relist` method should return immediately without updating the latest relist time. If the latest relist time is updated regardless of whether the runtime (docker) hung (default timeout is 2 minutes), the `Healthy` method will never return an error. ```release-note Kubelet PLEG updates the relist timestamp only after successfully relisting. ``` /cc @yujuhong @Random-Liu @dchen1107	2017-05-21 04:17:14 -07:00
Clayton Coleman	3e095d12b4	Refactor move of client-go/util/clock to apimachinery	2017-05-20 14:19:48 -04:00
Andy Xie	af6c040630	fix pleg relist time	2017-05-18 11:40:04 +08:00

78 Commits