Commit Graph

195 Commits

Author SHA1 Message Date
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields, excluding the Resources field, so the feature gate can be toggled without restarting containers that do not use the feature (sketched below).
5. Fix a corner case where a resize A->B->A gets ignored.
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
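A minimal sketch of the idea in item 4, with hypothetical helper names (the kubelet's real implementation lives in its container-hash utilities): hash a copy of the container whose Resources field has been cleared, so an in-place resize changes the pod spec without changing the hash that decides restarts.

```go
package main

import (
	"fmt"
	"hash/fnv"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// hashContainerWithoutResources is a hypothetical helper: it hashes a copy
// of the container whose Resources field has been cleared, so changes made
// only by an in-place resize do not alter the hash.
func hashContainerWithoutResources(c *v1.Container) uint64 {
	clone := c.DeepCopy()
	clone.Resources = v1.ResourceRequirements{}
	h := fnv.New64a()
	fmt.Fprintf(h, "%v", clone)
	return h.Sum64()
}

func main() {
	c := v1.Container{
		Name:  "app",
		Image: "nginx:1.25",
		Resources: v1.ResourceRequirements{
			Requests: v1.ResourceList{v1.ResourceCPU: resource.MustParse("500m")},
		},
	}
	before := hashContainerWithoutResources(&c)

	// An in-place resize (A -> B) touches only Resources, so the hash is
	// unchanged and the container is not restarted.
	c.Resources.Requests[v1.ResourceCPU] = resource.MustParse("1")
	after := hashContainerWithoutResources(&c)
	fmt.Println(before == after) // true
}
```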
arrowfeng
6a57404e28 kubelet: cleanup secretManager and configManager in podManager
Signed-off-by: arrowfeng <289716347@qq.com>
2022-11-14 23:05:32 +08:00
Michal Wozniak
c803892bd8 Enable the feature into beta 2022-11-09 09:02:40 +01:00
Michal Wozniak
026b97352f Add comments to clarify the updated logic in kubelet's status_manager 2022-11-08 10:21:25 +01:00
Michal Wozniak
4e732e20d0 Do not revert the pod condition if there might be running containers, skip condition update instead. 2022-11-07 16:22:29 +01:00
Michal Wozniak
52cd6755eb Add pod disruption conditions for kubelet initiated failures 2022-11-07 11:23:22 +01:00
Artur Żyliński
492f5fa82c Regenerate mocks 2022-10-26 11:31:50 +02:00
Artur Żyliński
b0fac15cd6 Make the interface local to each package 2022-10-26 11:28:18 +02:00
Artur Żyliński
9f31669a53 New histogram: Pod start SLI duration 2022-10-26 11:28:17 +02:00
Kubernetes Prow Robot
244c035b87
Merge pull request #110263 from claudiubelu/unittests
unittests: Fixes unit tests for Windows
2022-10-25 14:50:34 -07:00
Claudiu Belu
6f2eeed2e8 unittests: Fixes unit tests for Windows
Currently, some unit tests are failing on Windows for various reasons:

- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
  consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
  the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field contained OS-specific
  fields. It is now set during Kubelet's initialization, after the config
  file is read.
2022-10-25 23:46:56 +03:00
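A minimal sketch of the portability rules from the list above (filepath.Join, the OS temp directory, closing files before rename); the file names are illustrative:

```go
package main

import (
	"os"
	"path/filepath"
)

func main() {
	// Use the OS-specific temp directory rather than a hardcoded /tmp.
	dir, err := os.MkdirTemp("", "kubelet-test-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Join paths portably instead of concatenating with "/".
	path := filepath.Join(dir, "status.json")
	f, err := os.Create(path)
	if err != nil {
		panic(err)
	}
	// On Windows an open file cannot be removed or renamed, so close first.
	f.Close()

	if err := os.Rename(path, filepath.Join(dir, "status.bak")); err != nil {
		panic(err)
	}
}
```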
Clayton Coleman
e9a5fb7372
kubelet: Record a metric for latency of pod status update
Track how long it takes for pod updates to propagate from detection
to successful change on API server. Will guide future improvements
in pod start and shutdown latency.

Metric is `kubelet_pod_status_sync_duration_seconds` and is ALPHA
stability. Histogram buckets are chosen based on distribution of
observed status delays in practice.
2022-09-08 12:17:44 -04:00
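A sketch of how such an ALPHA histogram can be declared with the k8s.io/component-base metrics wrappers; the bucket values here are illustrative, not the distribution-derived ones from the commit:

```go
package metrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// PodStatusSyncDuration records how long a pod status update takes from
// detection to a successful write on the API server.
var PodStatusSyncDuration = metrics.NewHistogram(
	&metrics.HistogramOpts{
		Subsystem:      "kubelet",
		Name:           "pod_status_sync_duration_seconds",
		Help:           "Duration in seconds to sync a pod status update.",
		Buckets:        []float64{0.01, 0.05, 0.1, 0.5, 1, 5, 10, 20, 30, 45, 60}, // illustrative
		StabilityLevel: metrics.ALPHA,
	},
)

func init() {
	legacyregistry.MustRegister(PodStatusSyncDuration)
}
```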
Michal Wozniak
04fcbd721c Introduction of a pod condition type indicating disruption. Its reason field indicates the cause:
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
2022-08-02 11:12:16 +02:00
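Illustrative only: what such a condition might look like on a pod's status (the KEP names the condition type DisruptionTarget):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// A disruption condition with one of the reasons listed above.
	cond := v1.PodCondition{
		Type:               v1.PodConditionType("DisruptionTarget"),
		Status:             v1.ConditionTrue,
		Reason:             "PreemptionByKubeScheduler",
		Message:            "preempted to accommodate a higher-priority pod",
		LastTransitionTime: metav1.Now(),
	}
	fmt.Printf("%s=%s (%s)\n", cond.Type, cond.Status, cond.Reason)
}
```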
Deep Debroy
dfdf8245bb Introduce PodHasNetwork condition for pods
Signed-off-by: Deep Debroy <ddebroy@gmail.com>
2022-08-01 09:51:43 -07:00
David Porter
7811d84fef kubelet: Mark ready condition as false explicitly for terminal pods
Terminal pods may continue to report a ready condition of true because
there is a delay in reconciling the ready condition of the containers
from the runtime with the pod status. It should be invalid for kubelet
to report a terminal phase with a true ready condition. To fix the
issue, explicitly override the ready condition to false for terminal
pods during status updates.

Signed-off-by: David Porter <david@porter.me>
2022-06-08 16:19:16 -07:00
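A minimal sketch of the override, assuming hypothetical helper names rather than the status manager's actual code:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func isPhaseTerminal(phase v1.PodPhase) bool {
	return phase == v1.PodSucceeded || phase == v1.PodFailed
}

// overrideReadyForTerminalPods is illustrative: when the pod phase is
// terminal, force the Ready condition to false before publishing status.
func overrideReadyForTerminalPods(status *v1.PodStatus) {
	if !isPhaseTerminal(status.Phase) {
		return
	}
	for i := range status.Conditions {
		if status.Conditions[i].Type == v1.PodReady {
			status.Conditions[i].Status = v1.ConditionFalse
			status.Conditions[i].Reason = "PodCompleted"
		}
	}
}

func main() {
	st := v1.PodStatus{
		Phase:      v1.PodFailed,
		Conditions: []v1.PodCondition{{Type: v1.PodReady, Status: v1.ConditionTrue}},
	}
	overrideReadyForTerminalPods(&st)
	fmt.Println(st.Conditions[0].Status) // False
}
```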
Antonio Ojea
d16d23e0c7 add pod util to verify pod is terminal
Pods in phase Succeeded or Failed are guaranteed to have all containers
stopped and to never regress
2022-05-27 06:42:39 +02:00
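A sketch of such a utility under the guarantee stated above (Succeeded and Failed are terminal and never regress):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// IsPodTerminal reports whether the pod has reached a terminal phase.
func IsPodTerminal(pod *v1.Pod) bool {
	return pod.Status.Phase == v1.PodSucceeded || pod.Status.Phase == v1.PodFailed
}

func main() {
	p := &v1.Pod{Status: v1.PodStatus{Phase: v1.PodSucceeded}}
	fmt.Println(IsPodTerminal(p)) // true
}
```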
Kir Kolyshkin
4513de06a8 Regen mocks using go 1.18
Generated by ./hack/update-mocks.sh using go 1.18

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2022-03-23 10:19:38 -07:00
David Porter
c70f1955c4
test: Add E2E for job completions with cpu reservation
Create an E2E test that creates a job that spawns a pod that should
succeed. The job reserves a fixed amount of CPU and has a large number
of completions and parallelism. Used to reproduce github.com/kubernetes/kubernetes/issues/106884.

Signed-off-by: David Porter <david@porter.me>
2022-03-16 13:15:03 -04:00
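Illustrative only: the shape of the Job the test might create, with made-up values for the CPU reservation, completions, and parallelism (the real E2E test builds its Job through the e2e framework):

```go
package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/pointer"
)

// repro106884Job is a hypothetical helper sketching the test's Job shape.
func repro106884Job() *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "cpu-reservation-repro"},
		Spec: batchv1.JobSpec{
			Completions: pointer.Int32(100), // large number of completions
			Parallelism: pointer.Int32(100), // ...and high parallelism
			Template: v1.PodTemplateSpec{
				Spec: v1.PodSpec{
					RestartPolicy: v1.RestartPolicyNever,
					Containers: []v1.Container{{
						Name:    "work",
						Image:   "busybox",
						Command: []string{"true"}, // the pod should succeed
						Resources: v1.ResourceRequirements{
							Requests: v1.ResourceList{
								v1.ResourceCPU: resource.MustParse("100m"), // fixed CPU reservation
							},
						},
					}},
				},
			},
		},
	}
}

func main() {
	fmt.Println(repro106884Job().Name)
}
```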
Clayton Coleman
69a3820214
kubelet: Delay writing a terminal phase until the pod is terminated
Other components must know when the Kubelet has released critical
resources for terminal pods. Do not set the phase in the apiserver
to terminal until all containers are stopped and cannot restart.

As a consequence of this change, the Kubelet must explicitly transition
a terminal pod to the terminating state in the pod worker which is
handled by returning a new isTerminal boolean from syncPod.

Finally, if a pod with init containers hasn't been initialized yet, don't
default container statuses or not-yet-attempted init containers to the
unknown failure state.
2022-03-16 13:15:00 -04:00
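A toy model (hypothetical names, not the kubelet's real types) of the flow described above: syncPod now reports isTerminal, and the worker delays publishing a terminal phase until containers are stopped and cannot restart.

```go
package main

import "fmt"

type podState struct{ runningContainers int }

// syncPod returns whether the pod has reached a terminal state: no running
// containers and none that can start again.
func syncPod(p *podState) (isTerminal bool, err error) {
	return p.runningContainers == 0, nil
}

func main() {
	p := &podState{runningContainers: 2}
	for attempt := 1; ; attempt++ {
		isTerminal, _ := syncPod(p)
		if isTerminal {
			fmt.Printf("attempt %d: publish terminal phase to the apiserver\n", attempt)
			return
		}
		fmt.Printf("attempt %d: still terminating, keep apiserver phase non-terminal\n", attempt)
		p.runningContainers-- // containers stop between syncs
	}
}
```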
Patrick Ohly
9eaa2dc554 avoid klog Info calls without verbosity
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:

   if klog.V(5).Enabled() {
       klog.Info("hello world")
   }

Having the actual verbosity in the JSON output is relevant, for example
when filtering for only the important info messages. The solution is to use
klog.V(5).Info or something similar.

Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
2022-01-12 07:48:36 +01:00
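The fix in context: let klog record the verbosity by calling Info on the klog.V(5) handle, and reuse that handle when logging repeatedly at the same level.

```go
package main

import "k8s.io/klog/v2"

func main() {
	defer klog.Flush()

	// Bad: logged with v=0 in JSON output even though it is guarded by V(5).
	if klog.V(5).Enabled() {
		klog.Info("hello world")
	}

	// Good: the message carries v=5.
	klog.V(5).Info("hello world")

	// Capture the verbosity handle to avoid repeating the level and the
	// overhead of multiple klog.V calls.
	if v := klog.V(5); v.Enabled() {
		v.Info("hello world, step one")
		v.Info("hello world, step two")
	}
}
```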
Hanna Lee
e78b3e8dfe Use nolint directive instead of stopping ticker, per liggitt's suggestion 2021-11-17 08:56:57 +01:00
Hanna Lee
69d029bddb Add syncTicker.Stop() 2021-11-17 08:56:57 +01:00
Hanna Lee
1fbf06f5ad Use time.NewTicker instead of time.Tick to avoid leaking 2021-11-17 08:56:00 +01:00
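The leak these commits address, sketched: time.Tick gives no way to release its underlying ticker, so a long-lived loop should create and stop its own.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	syncTicker := time.NewTicker(100 * time.Millisecond)
	defer syncTicker.Stop() // releases the ticker; a time.Tick channel cannot be stopped

	for i := 0; i < 3; i++ {
		<-syncTicker.C
		fmt.Println("sync")
	}
}
```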
Elana Hashman
5ff6c2396d
Do not sync Waiting statuses for Terminated pods 2021-10-04 11:05:54 -07:00
vikram Jadhav
0de4397490 mockery to mockgen conversion 2021-09-25 16:15:08 +00:00
Clayton Coleman
3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
rely on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers no longer blocks final pod deletion in the
API server; container removal is handled as background cleanup. Node
shutdown no longer marks pods as failed, as they can be restarted in
the next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
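A toy model (simplified types, not the kubelet's real API) of the single entry point described above, where killing a pod is just another UpdatePod call:

```go
package main

import "fmt"

type UpdateType int

const (
	SyncPodUpdate UpdateType = iota
	SyncPodKill
)

type Pod struct{ UID, Name string }

type UpdatePodOptions struct {
	UpdateType UpdateType
	Pod        *Pod
}

type podWorkers struct{}

// UpdatePod is the single entry point: killing a pod is just another update,
// so only the pod worker decides when the pod is truly terminated.
func (w *podWorkers) UpdatePod(opts UpdatePodOptions) {
	switch opts.UpdateType {
	case SyncPodKill:
		fmt.Printf("terminating %s: stop containers, then run cleanup\n", opts.Pod.Name)
	default:
		fmt.Printf("syncing %s\n", opts.Pod.Name)
	}
}

func main() {
	w := &podWorkers{}
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodKill, Pod: &Pod{UID: "uid-1", Name: "demo"}})
}
```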
sanwishe
9e257ec194 Optimize logging format for pkg/kubelet
Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>
2021-05-25 08:52:08 +08:00
Elana Hashman
6af7eb6d49
Migrate missed log entries in kubelet
Co-Authored-By: pacoxu <paco.xu@daocloud.io>
2021-03-18 14:26:26 -07:00
Navid Shaikh
dbe5476a2a Migrate pkg/kubelet/status to structured logging 2021-03-05 20:58:46 +05:30
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
Seth Jennings
acae34be79 kubelet: reduce no-op status manager msg log level 2020-12-03 13:06:02 -06:00
Jun Gong
454f9acc24 Remove unhelpful error message about updating pod conditions not owned by kubelet 2020-07-24 09:56:03 +08:00
Amim Knabben
0ed41c3f10 Deprecating --bootstrap-checkpoint-path flag 2020-06-09 15:27:01 -04:00
Davanum Srinivas
07d88617e5
Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Mike Danese
76f8594378 more artisanal fixes
Most of these could have been refactored automatically, but the result
would have been uglier. The unsophisticated tooling left lots of
unnecessary struct -> pointer -> struct transitions.
2020-03-05 14:59:47 -08:00
Clayton Coleman
8bc5cb01a9
kubelet: Clear the podStatusChannel before invoking syncBatch
The status manager syncBatch() method processes the current state
of the cache, which should include all entries in the channel. Flush
the channel before we call a batch to avoid unnecessary work and
to unblock pod workers when the node is congested.

Discovered while investigating long shutdown intervals on the node
where the status channel stayed full for tens of seconds.

Add a for loop around the select statement to avoid unnecessary
invocations of the wait.Forever closure each time.
2020-03-04 13:34:25 -05:00
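A sketch of the drain-then-batch pattern with simplified types: empty the channel into the cache first, then run one syncBatch.

```go
package main

import "fmt"

func main() {
	podStatusChannel := make(chan string, 8)
	podStatusChannel <- "pod-a"
	podStatusChannel <- "pod-b"

	syncBatch := func() { fmt.Println("syncBatch: reconcile cached statuses") }

	// Flush everything queued on the channel; each receive unblocks a pod
	// worker that may be waiting to send on a congested node.
drain:
	for {
		select {
		case uid := <-podStatusChannel:
			fmt.Println("cached status update for", uid)
		default:
			break drain
		}
	}
	syncBatch()
}
```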
Clayton Coleman
ad3d8949f0
kubelet: Preserve existing container status when pod terminated
The kubelet must not allow a container that was reported failed in a
restartPolicy=Never pod to be reported to the apiserver as success.
If a client deletes a restartPolicy=Never pod, the dispatchWork and
status manager race to update the container status. When dispatchWork
(specifically podIsTerminated) returns true, it means all containers
are stopped, which means status in the container is accurate. However,
the TerminatePod method then clears this status. This results in a
pod that has been reported with status.phase=Failed getting reset to
status.phase=Succeeded, which is a violation of the guarantees around
terminal phase.

Ensure the Kubelet never reports that a container succeeded when it
hasn't run or been executed by guarding the terminate pod loop from
ever reporting 0 in the absence of container status.
2020-03-04 13:34:24 -05:00
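A sketch of the guard with simplified types: a container with no recorded terminal status must never be defaulted to exit code 0 (success).

```go
package main

import "fmt"

type containerStatus struct {
	exited   bool
	exitCode int
}

// finalExitCode returns the exit code to report, refusing to default a
// container with no recorded terminal status to success.
func finalExitCode(statuses map[string]*containerStatus, name string) (int, error) {
	st, ok := statuses[name]
	if !ok || !st.exited {
		return 0, fmt.Errorf("no terminal status recorded for container %q", name)
	}
	return st.exitCode, nil
}

func main() {
	statuses := map[string]*containerStatus{"app": {exited: true, exitCode: 1}}
	if _, err := finalExitCode(statuses, "sidecar"); err != nil {
		fmt.Println("refusing to report success:", err)
	}
}
```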
Clayton Coleman
b252865479
kubelet: Avoid sending no-op patches
In an e2e run, out of 1857 pod status updates executed by the
Kubelet, 453 (25%) were no-ops - they only contained the UID of
the pod and no status changes. If the patch is a no-op we can
avoid invoking the server and continue.
2020-02-26 23:06:38 -05:00
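A sketch of the short-circuit with a hypothetical helper: if the generated patch carries only the pod UID and no status change, skip the API call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isNoOpPatch is illustrative: it reports whether the patch changes anything
// beyond the metadata.uid precondition.
func isNoOpPatch(patchBytes []byte) bool {
	var patch struct {
		Metadata struct {
			UID string `json:"uid"`
		} `json:"metadata"`
		Status json.RawMessage `json:"status"`
	}
	if err := json.Unmarshal(patchBytes, &patch); err != nil {
		return false
	}
	return len(patch.Status) == 0
}

func main() {
	fmt.Println(isNoOpPatch([]byte(`{"metadata":{"uid":"abc"}}`)))             // true: skip the server call
	fmt.Println(isNoOpPatch([]byte(`{"metadata":{"uid":"abc"},"status":{}}`))) // false: send the patch
}
```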
Kubernetes Prow Robot
dde6e8e746 Merge pull request #87858 from smarterclayton/different_type
kubelet: Debug pod status output diff is wrong
2020-02-08 06:44:06 -08:00
Mike Danese
3aa59f7f30 generated: run refactor 2020-02-07 18:16:47 -08:00
Clayton Coleman
aed4d639a5
kubelet: Debug pod status output diff is wrong
The types were different, so the diff output was not useful; both
should be pointers:

```
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: I0205 19:44:40.222259    2737 status_manager.go:642] Pod status is inconsistent with cached status for pod "prometheus-k8s-1_openshift-monitoring(0e9137b8-3bd2-4353-b7f5-672749106dc1)", a reconciliation should be triggered:
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]:    interface{}(
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: -         s"&PodStatus{Phase:Running,Conditions:[]PodCondition{PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2020-02-05 19:13:30 +0000 UTC,Reason:,Message:,},PodCondit>
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +         v1.PodStatus{
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +                 Phase: "Running",
Feb 05 19:44:40 ci-ln-6k7l4-w-c-w9wbb.c.openshift-gce-devel-ci.internal hyperkube[2737]: +                 Conditions: []v1.PodCondition{
```
2020-02-05 14:52:46 -05:00
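The gist of the fix, illustrated with go-cmp (the kubelet uses its own diff helper): pass the same pointer type on both sides so the differ prints a structured field diff instead of dumping one side as an opaque string.

```go
package main

import (
	"fmt"

	"github.com/google/go-cmp/cmp"
	v1 "k8s.io/api/core/v1"
)

func main() {
	cached := &v1.PodStatus{Phase: v1.PodRunning}
	current := &v1.PodStatus{Phase: v1.PodFailed}

	// Both sides are *v1.PodStatus, so the output shows which fields differ.
	fmt.Println(cmp.Diff(cached, current))
}
```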
Jordan Liggitt
a65d8aeb76 Add UID precondition to kubelet pod status patch updates 2019-12-16 14:27:32 -05:00
yuxiaobo
81e9f21f83 Correct spelling mistakes
Signed-off-by: yuxiaobo <yuxiaobogo@163.com>
2019-11-06 20:25:19 +08:00
Kubernetes Prow Robot
72cd1c14ef
Merge pull request #83325 from yutedz/static-mirror-pod
Check whether mirror pod is critical in managerImpl#evictPod
2019-10-07 22:15:40 -07:00
Ted Yu
0939f90103 Check whether mirror pod is critical in managerImpl#evictPod 2019-10-01 11:12:18 -07:00
Uzuku
5a2e6bd000 Fix golint failures of pkg/kubelet/status/... 2019-09-21 23:43:37 +08:00
Kubernetes Prow Robot
3f4e30a80e
Merge pull request #82113 from kebe7jun/fix/log-format-and-typo
Fix sync pod log format
2019-09-11 10:39:14 -07:00
Matthias Bertschy
1a08ea5984 startupProbe: Test changes 2019-08-30 00:40:26 +02:00
Matthias Bertschy
323f99ea8c startupProbe: Kubelet changes 2019-08-30 00:40:26 +02:00