Commit Graph

184 Commits

Clayton Coleman
e9a5fb7372
kubelet: Record a metric for latency of pod status update
Track how long it takes for pod updates to propagate from detection
to a successful change on the API server. This will guide future
improvements in pod start and shutdown latency.

The metric is `kubelet_pod_status_sync_duration_seconds` and has ALPHA
stability. Histogram buckets are chosen based on the distribution of
status delays observed in practice.
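A minimal sketch, assuming the component-base metrics library, of how a histogram like this could be declared; the bucket boundaries below are illustrative placeholders, not the values chosen in the commit:

```go
package example

import "k8s.io/component-base/metrics"

// podStatusSyncDuration records detection-to-apiserver latency for pod
// status updates. Bucket boundaries below are illustrative only.
var podStatusSyncDuration = metrics.NewHistogram(
	&metrics.HistogramOpts{
		Subsystem:      "kubelet",
		Name:           "pod_status_sync_duration_seconds",
		Help:           "Duration in seconds to sync a pod status update.",
		Buckets:        []float64{0.01, 0.05, 0.1, 0.5, 1, 5, 10, 20, 30, 45, 60},
		StabilityLevel: metrics.ALPHA,
	},
)
```

The measured delay would then be recorded with something like `podStatusSyncDuration.Observe(elapsed.Seconds())`.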
2022-09-08 12:17:44 -04:00
Michal Wozniak
04fcbd721c Introduce a pod condition type indicating disruption. Its reason field indicates the cause:
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by the Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
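A hedged sketch of attaching such a condition to a pod's status; the condition type name "DisruptionTarget" and the helper function are assumptions for illustration:

```go
package example

import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// addDisruptionCondition is a hypothetical helper; the real constant for
// the condition type lives in the k8s.io/api types.
func addDisruptionCondition(pod *v1.Pod, reason, message string) {
	pod.Status.Conditions = append(pod.Status.Conditions, v1.PodCondition{
		Type:               v1.PodConditionType("DisruptionTarget"), // assumed name
		Status:             v1.ConditionTrue,
		Reason:             reason, // e.g. "PreemptionByKubeScheduler"
		Message:            message,
		LastTransitionTime: metav1.Now(),
	})
}
```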
2022-08-02 11:12:16 +02:00
Deep Debroy
dfdf8245bb Introduce PodHasNetwork condition for pods
Signed-off-by: Deep Debroy <ddebroy@gmail.com>
2022-08-01 09:51:43 -07:00
David Porter
7811d84fef kubelet: Mark ready condition as false explicitly for terminal pods
Terminal pods may continue to report a ready condition of true because
there is a delay in reconciling the containers' ready condition from
the runtime with the pod status. It should be invalid for the kubelet
to report a terminal phase with a true ready condition. To fix the
issue, explicitly override the ready condition to false for terminal
pods during status updates.
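An illustrative sketch of the override, assuming a helper of this shape (not the actual kubelet function):

```go
package example

import v1 "k8s.io/api/core/v1"

// markReadyFalseIfTerminal forces Ready=False once a pod reaches a
// terminal phase, instead of waiting for container reconciliation.
func markReadyFalseIfTerminal(status *v1.PodStatus) {
	if status.Phase != v1.PodSucceeded && status.Phase != v1.PodFailed {
		return
	}
	for i := range status.Conditions {
		if status.Conditions[i].Type == v1.PodReady {
			status.Conditions[i].Status = v1.ConditionFalse
			status.Conditions[i].Reason = "PodCompleted" // assumed reason string
		}
	}
}
```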

Signed-off-by: David Porter <david@porter.me>
2022-06-08 16:19:16 -07:00
Antonio Ojea
d16d23e0c7 add pod util to verify pod is terminal
Pods in phase Succeeded or Failed are guaranteed to have all containers
stopped and to never regress to another phase.
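The helper reduces to a phase check; a minimal sketch (names assumed):

```go
package example

import v1 "k8s.io/api/core/v1"

// isPodTerminal: Succeeded and Failed are the two terminal phases, and a
// pod in either is guaranteed never to regress.
func isPodTerminal(pod *v1.Pod) bool {
	return pod.Status.Phase == v1.PodSucceeded || pod.Status.Phase == v1.PodFailed
}
```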
2022-05-27 06:42:39 +02:00
Kir Kolyshkin
4513de06a8 Regen mocks using go 1.18
Generated by ./hack/update-mocks.sh using go 1.18

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-23 10:19:38 -07:00
David Porter
c70f1955c4
test: Add E2E for job completions with cpu reservation
Create an E2E test that creates a job that spawns a pod that should
succeed. The job reserves a fixed amount of CPU and has a large number
of completions and parallelism. Used to reproduce github.com/kubernetes/kubernetes/issues/106884

Signed-off-by: David Porter <david@porter.me>
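A sketch of the shape such a job could take; the counts, image, and CPU request are illustrative, not the values used by the E2E test:

```go
package example

import (
	batchv1 "k8s.io/api/batch/v1"
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func int32Ptr(i int32) *int32 { return &i }

// reproJob: many completions run in parallel, each pod reserving a fixed
// amount of CPU and exiting successfully.
func reproJob() *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "cpu-reserved-job"},
		Spec: batchv1.JobSpec{
			Completions: int32Ptr(50),
			Parallelism: int32Ptr(50),
			Template: v1.PodTemplateSpec{
				Spec: v1.PodSpec{
					RestartPolicy: v1.RestartPolicyNever,
					Containers: []v1.Container{{
						Name:    "work",
						Image:   "busybox",
						Command: []string{"true"},
						Resources: v1.ResourceRequirements{
							Requests: v1.ResourceList{
								v1.ResourceCPU: resource.MustParse("100m"),
							},
						},
					}},
				},
			},
		},
	}
}
```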
2022-03-16 13:15:03 -04:00
Clayton Coleman
69a3820214
kubelet: Delay writing a terminal phase until the pod is terminated
Other components must know when the Kubelet has released critical
resources for terminal pods. Do not set the phase in the apiserver
to terminal until all containers are stopped and cannot restart.

As a consequence of this change, the Kubelet must explicitly transition
a terminal pod to the terminating state in the pod worker which is
handled by returning a new isTerminal boolean from syncPod.

Finally, if a pod with init containers hasn't been initialized yet,
don't default the statuses of its containers, or of init containers
that haven't been attempted yet, to the unknown failure state.
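A simplified, self-contained sketch of the control flow the commit describes; every name here is illustrative rather than the real kubelet API:

```go
package example

import "context"

// syncOutcome stands in for the result of the kubelet's syncPod.
type syncOutcome struct {
	isTerminal bool
	err        error
}

// handleSync: syncPod now reports whether the pod reached a terminal
// phase, and only then does the pod worker begin termination.
func handleSync(ctx context.Context, syncPod func(context.Context) syncOutcome,
	startTerminating func()) error {
	out := syncPod(ctx)
	if out.err != nil {
		return out.err
	}
	if out.isTerminal {
		startTerminating()
	}
	return nil
}
```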
2022-03-16 13:15:00 -04:00
Patrick Ohly
9eaa2dc554 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

```go
if klog.V(5).Enabled() {
    klog.Info("hello world")
}
```

Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
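Both recommended forms, shown together in a small sketch using klog/v2:

```go
package example

import "k8s.io/klog/v2"

func logExamples() {
	// Preferred: the message carries its verbosity into the JSON output.
	klog.V(5).Info("hello world")

	// Capture klog.V once when parameters are expensive to build or when
	// several messages share the same level.
	if logger := klog.V(5); logger.Enabled() {
		logger.Info("hello world")
		logger.Info("more details")
	}
}
```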
2022-01-12 07:48:36 +01:00
Hanna Lee
e78b3e8dfe Use nolint directive instead of stopping ticker, per liggitt's suggestion 2021-11-17 08:56:57 +01:00
Hanna Lee
69d029bddb Add syncTicker.Stop() 2021-11-17 08:56:57 +01:00
Hanna Lee
1fbf06f5ad Use time.NewTicker instead of time.Tick to avoid leaking 2021-11-17 08:56:00 +01:00
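The replacement pattern, sketched with an illustrative interval:

```go
package example

import "time"

// time.Tick exposes no Stop method and leaks its underlying Ticker;
// time.NewTicker plus a deferred Stop releases it when the loop exits.
func runLoop(done <-chan struct{}) {
	syncTicker := time.NewTicker(10 * time.Second)
	defer syncTicker.Stop()
	for {
		select {
		case <-syncTicker.C:
			// periodic work goes here
		case <-done:
			return
		}
	}
}
```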
Elana Hashman
5ff6c2396d
Do not sync Waiting statuses for Terminated pods 2021-10-04 11:05:54 -07:00
vikram Jadhav
0de4397490 mockery to mockgen conversion 2021-09-25 16:15:08 +00:00
Clayton Coleman
3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Container removal no longer blocks final pod deletion in the API
server and is handled as background cleanup. Node shutdown no longer
marks pods as failed, since they can be restarted in the next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
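A self-contained sketch of the kill path after this change; the types are simplified stand-ins for the pod worker API named above:

```go
package example

// UpdateType and the options struct mirror the names in the commit
// message but are simplified for illustration.
type UpdateType int

const SyncPodKill UpdateType = iota

type Pod struct{ Name string }

type UpdatePodOptions struct {
	UpdateType UpdateType
	Pod        *Pod
}

type PodWorkers interface {
	UpdatePod(options UpdatePodOptions)
}

// killPod: components that used to call the pod killer directly now route
// everything through the pod worker's UpdatePod loop.
func killPod(w PodWorkers, pod *Pod) {
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodKill, Pod: pod})
}
```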
2021-07-06 15:55:22 -04:00
sanwishe
9e257ec194 Optimize logging format for pkg/kubelet
Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>
2021-05-25 08:52:08 +08:00
Elana Hashman
6af7eb6d49
Migrate missed log entries in kubelet
Co-Authored-By: pacoxu <paco.xu@daocloud.io>
2021-03-18 14:26:26 -07:00
Navid Shaikh
dbe5476a2a Migrate pkg/kubelet/status to structured logging 2021-03-05 20:58:46 +05:30
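An illustrative before/after for this migration, assuming klog/v2:

```go
package example

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/klog/v2"
)

func logStatusUpdate(pod *v1.Pod) {
	// before: klog.Infof("Status for pod %q updated", pod.Name)
	// after: structured key/value pairs that log backends can index
	klog.InfoS("Status for pod updated", "pod", klog.KObj(pod))
}
```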
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
Seth Jennings
acae34be79 kubelet: reduce no-op status manager msg log level 2020-12-03 13:06:02 -06:00
Jun Gong
454f9acc24 Remove unhelpful error message about updating pod conditions not owned by the kubelet 2020-07-24 09:56:03 +08:00
Amim Knabben
0ed41c3f10 Deprecating --bootstrap-checkpoint-path flag 2020-06-09 15:27:01 -04:00
Davanum Srinivas
07d88617e5
Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Mike Danese
76f8594378 more artisanal fixes
Most of these could have been refactored automatically, but the result
would have been uglier: the unsophisticated tooling left lots of
unnecessary struct -> pointer -> struct transitions.
2020-03-05 14:59:47 -08:00
Clayton Coleman
8bc5cb01a9
kubelet: Clear the podStatusChannel before invoking syncBatch
The status manager syncBatch() method processes the current state
of the cache, which should include all entries in the channel. Flush
the channel before calling syncBatch to avoid unnecessary work and
to unblock pod workers when the node is congested.

Discovered while investigating long shutdown intervals on the node
where the status channel stayed full for tens of seconds.

Add a for loop around the select statement to avoid unnecessary
invocations of the wait.Forever closure each time.
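A minimal sketch of the drain-then-sync pattern described above; the channel element type and function names are placeholders:

```go
package example

// drainAndSync empties everything already queued on the status channel,
// then runs one batch sync against the authoritative cache.
func drainAndSync(podStatusChannel <-chan struct{}, syncBatch func()) {
	for {
		select {
		case <-podStatusChannel:
			// discard: syncBatch reads the cache, which supersedes entries
		default:
			syncBatch()
			return
		}
	}
}
```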
2020-03-04 13:34:25 -05:00
Clayton Coleman
ad3d8949f0
kubelet: Preserve existing container status when pod terminated
The kubelet must not allow a container that was reported failed in a
restartPolicy=Never pod to be reported to the apiserver as success.
If a client deletes a restartPolicy=Never pod, the dispatchWork and
status manager race to update the container status. When dispatchWork
(specifically podIsTerminated) returns true, it means all containers
are stopped, which means the container status is accurate. However,
the TerminatePod method then clears this status. This results in a
pod that has been reported with status.phase=Failed getting reset to
status.phase=Succeeded, which is a violation of the guarantees around
terminal phase.

Ensure the Kubelet never reports that a container succeeded when it
hasn't run or been executed, by guarding the TerminatePod loop so it
never reports exit code 0 in the absence of container status.
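An illustrative guard, not the actual kubelet code, capturing the invariant:

```go
package example

import v1 "k8s.io/api/core/v1"

// canReportTerminal: only report a terminal phase when every container has
// a real terminated status, so a pod reported as Failed is never reset to
// Succeeded on the strength of missing status.
func canReportTerminal(status v1.PodStatus) bool {
	if len(status.ContainerStatuses) == 0 {
		return false
	}
	for _, cs := range status.ContainerStatuses {
		if cs.State.Terminated == nil {
			return false
		}
	}
	return true
}
```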
2020-03-04 13:34:24 -05:00
Clayton Coleman
b252865479
kubelet: Avoid sending no-op patches
In an e2e run, out of 1857 pod status updates executed by the
Kubelet, 453 (25%) were no-ops: they only contained the UID of
the pod and no status changes. If the patch is a no-op we can
avoid invoking the server and continue.
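A sketch of the skip, assuming the no-op patch reduces to just the UID precondition (the exact byte layout is an assumption for illustration):

```go
package example

import (
	"bytes"
	"fmt"

	"k8s.io/apimachinery/pkg/types"
)

// isNoopPatch reports whether the computed patch carries only the pod UID
// precondition and therefore changes nothing on the server.
func isNoopPatch(patchBytes []byte, uid types.UID) bool {
	unchanged := []byte(fmt.Sprintf(`{"metadata":{"uid":%q}}`, uid))
	return bytes.Equal(patchBytes, unchanged)
}
```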
2020-02-26 23:06:38 -05:00
Kubernetes Prow Robot
dde6e8e746 Merge pull request #87858 from smarterclayton/different_type
kubelet: Debug pod status output diff is wrong
2020-02-08 06:44:06 -08:00
Mike Danese
3aa59f7f30 generated: run refactor 2020-02-07 18:16:47 -08:00
Clayton Coleman
aed4d639a5
kubelet: Debug pod status output diff is wrong
The types were different, so the diff output was not useful; both
should be pointers:

```
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: I0205 19:44:40.222259    2737 status_manager.go:642] Pod status is inconsistent with cached status for pod "prometheus-k8s-1_openshift-monitoring(0e9137b8-3bd2-4353-b7f5-672749106dc1)", a reconciliation should be triggered:
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]:    interface{}(
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: -         s"&PodStatus{Phase:Running,Conditions:[]PodCondition{PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2020-02-05 19:13:30 +0000 UTC,Reason:,Message:,},PodCondit>
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +         v1.PodStatus{
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +                 Phase: "Running",
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +                 Conditions: []v1.PodCondition{
```
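A minimal sketch of the fix, assuming the apimachinery diff helper available at the time: hand it two values of the same pointer type so it prints a field-by-field comparison.

```go
package example

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/diff"
	"k8s.io/klog"
)

// logStatusDiff: both arguments share the same pointer type, so the output
// is a readable structured diff instead of an opaque string dump.
func logStatusDiff(oldStatus, newStatus *v1.PodStatus) {
	klog.V(2).Infof("status diff:\n%s", diff.ObjectDiff(oldStatus, newStatus))
}
```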
2020-02-05 14:52:46 -05:00
Jordan Liggitt
a65d8aeb76 Add UID precondition to kubelet pod status patch updates 2019-12-16 14:27:32 -05:00
yuxiaobo
81e9f21f83 Correct spelling mistakes
Signed-off-by: yuxiaobo <yuxiaobogo@163.com>
2019-11-06 20:25:19 +08:00
Kubernetes Prow Robot
72cd1c14ef
Merge pull request #83325 from yutedz/static-mirror-pod
Check whether mirror pod is critical in managerImpl#evictPod
2019-10-07 22:15:40 -07:00
Ted Yu
0939f90103 Check whether mirror pod is critical in managerImpl#evictPod 2019-10-01 11:12:18 -07:00
Uzuku
5a2e6bd000 Fix golint failures of pkg/kubelet/status/... 2019-09-21 23:43:37 +08:00
Kubernetes Prow Robot
3f4e30a80e
Merge pull request #82113 from kebe7jun/fix/log-format-and-typo
Fix sync pod log format
2019-09-11 10:39:14 -07:00
Matthias Bertschy
1a08ea5984 startupProbe: Test changes 2019-08-30 00:40:26 +02:00
Matthias Bertschy
323f99ea8c startupProbe: Kubelet changes 2019-08-30 00:40:26 +02:00
KEBE
8dc401d141 Fix sync pod log format and a func typo. 2019-08-29 14:39:43 +08:00
Tim Allclair
6510d26b6a Fix misc static check issues 2019-08-21 10:40:21 -07:00
Himanshu Pandey
c05d506019 changed IsCriticalPod to return true in the case of static pods 2019-08-07 15:47:43 -07:00
Mike Brown
7b6bb58f3a update code docs around old todo that is not going to happen
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2019-07-08 09:24:50 -05:00
Davanum Srinivas
7b8c9acc09
remove unused code
Change-Id: If821920ec8872e326b7d85437ad8d2620807799d
2019-04-19 08:36:31 -04:00
Minhan Xia
47bc948fe3 reconcile pod ready condition when message is not expected 2019-03-20 14:05:40 -07:00
Pingan2017
fddaf257af correct the typo in status_manager.go 2019-01-25 14:34:11 +08:00
Kubernetes Prow Robot
d582682b7f
Merge pull request #72312 from Pingan2017/correct-ready-condition
correctly update pod ready condition
2019-01-02 16:51:50 -08:00
Pingan2017
1148ecfaf6 correctly update pod ready condition 2018-12-25 09:36:37 +08:00
David Ashpole
70a7fdda02 use Pod.Status.StartTime as pod's cgroup start time in summary API 2018-12-14 14:26:55 -08:00
Davanum Srinivas
954996e231
Move from glog to klog
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has explicit InitFlags(), so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
  * github.com/kubernetes/repo-infra
  * k8s.io/gengo/
  * k8s.io/kube-openapi/
  * github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods

Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
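A short sketch of the explicit registration the migration requires, since klog, unlike glog, does not register its flags automatically on import:

```go
package example

import (
	"flag"

	"k8s.io/klog"
)

func init() {
	// Register klog's flags (-v, -logtostderr, ...) on the default flag
	// set, then parse; glog registered its flags implicitly on import.
	klog.InitFlags(nil)
	flag.Parse()
}
```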
2018-11-10 07:50:31 -05:00