kubernetes

Author	SHA1	Message	Date
Michal Wozniak	c803892bd8	Enable the feature into beta	2022-11-09 09:02:40 +01:00
Michal Wozniak	52cd6755eb	Add pod disruption conditions for kubelet initiated failures	2022-11-07 11:23:22 +01:00
Patrick Ohly	65385fec20	kubelet: convert node shutdown manager to contextual logging This will make output checking easier (done in a separate commit). kubelet itself still uses the global logger.	2022-06-24 11:20:34 +02:00
Clayton Coleman	1d518adb76	kubelet: Pod probes should be handled by pod worker The pod worker is the owner of when a container is running or not, and the start and stop of the probes for a given pod should be handled during the pod sync loop. This ensures that probes do not continue running even after eviction. Because the pod semantics allow lifecycle probes to shorten grace period, the probe is removed after the containers in a pod are terminated successfully. As an optimization, if the pod will have a very short grace period (0 or 1 seconds) we stop the probes immediately to reduce resource usage during eviction slightly. After this change, the probe manager is only called by the pod worker or by the reconcile loop.	2022-06-06 17:00:54 -05:00
Shiming Zhang	ced991cb00	Emit Metrics in the shutdown process	2022-03-16 10:14:55 +08:00
Shiming Zhang	5eb3e88f6b	Support metrics for node shutdown	2022-03-11 17:31:10 +08:00
Kubernetes Prow Robot	09fccc3533	Merge pull request #106796 from jonyhy96/fix-timer kubelet: use newtimer instead in nodeshutdown manager	2022-01-06 11:47:12 -08:00
haoyun	92fa957dd1	feat: use clock instead Signed-off-by: haoyun <yun.hao@daocloud.io>	2021-12-10 13:59:12 +08:00
David Porter	95264a418d	kubelet: set failed phase during graceful shutdown Revert to previous behavior in 1.21/1.20 of setting pod phase to failed during graceful node shutdown. Setting pods to failed phase will ensure that external controllers that manage pods like deployments will create new pods to replace those that are shut down. Many customers have taken a dependency on this behavior and it was a breaking change in 1.22, so this change reverts back to the previous behavior. Signed-off-by: David Porter <david@porter.me>	2021-12-09 13:17:40 -08:00
Shiming Zhang	545313bdc7	Implement graceful shutdown based on Pod priority	2021-11-17 11:47:12 +08:00
Shiming Zhang	e47c78a354	Add log for creating node shutdown manager	2021-10-15 11:16:21 +08:00
Shiming Zhang	b468c24e85	Refactor to use structure to pass parameters	2021-10-15 11:16:21 +08:00
Ryan Phillips	e2e938066d	kubelet: add probe termination to graceful shutdowns	2021-09-22 14:13:25 -05:00
wojtekt	53ce79a18a	Migrate to k8s.io/utils/clock in pkg/kubelet	2021-09-10 12:20:09 +02:00
Stephen Augustus	481cf6fbe7	generated: Run hack/update-gofmt.sh Signed-off-by: Stephen Augustus <foo@auggie.dev>	2021-08-24 15:47:49 -04:00
Kubernetes Prow Robot	8dbc33d649	Merge pull request #101081 from rphillips/add_graceful_shutdown_event kubelet: add graceful shutdown events	2021-08-17 22:08:08 -07:00
Kubernetes Prow Robot	d7c1663556	Merge pull request #103137 from wzshiming/fix/expected_inhibit_delay Allow the actual inhibit delay to be greater than the expected inhibit delay	2021-08-17 11:41:49 -07:00
Clayton Coleman	3eadd1a9ea	Keep pod worker running until pod is truly complete A number of race conditions exist when pods are terminated early in their lifecycle because components in the kubelet need to know "no running containers" or "containers can't be started from now on" but were relying on outdated state. Only the pod worker knows whether containers are being started for a given pod, which is required to know when a pod is "terminated" (no running containers, none coming). Move that responsibility and podKiller function into the pod workers, and have everything that was killing the pod go into the UpdatePod loop. Split syncPod into three phases - setup, terminate containers, and cleanup pod - and have transitions between those methods be visible to other components. After this change, to kill a pod you tell the pod worker to UpdatePod({UpdateType: SyncPodKill, Pod: pod}). Several places in the kubelet were incorrect about whether they were handling terminating (should stop running, might have containers) or terminated (no running containers) pods. The pod worker exposes methods that allow other loops to know when to set up or tear down resources based on the state of the pod - these methods remove the possibility of race conditions by ensuring a single component is responsible for knowing each pod's allowed state and other components simply delegate to checking whether they are in the window by UID. Removing containers now no longer blocks final pod deletion in the API server and are handled as background cleanup. Node shutdown no longer marks pods as failed as they can be restarted in the next step. See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details	2021-07-06 15:55:22 -04:00
Shiming Zhang	97bcfbd674	Allow the actual inhibit delay to be greater than the expected inhibit delay	2021-06-24 14:11:58 +08:00
Ryan Phillips	d9be5abc37	kubelet: add shutdown events	2021-06-23 16:44:19 -05:00
Guillaume Le Biller	f1de598233	Improve terminated pod message when node is shutting down Signed-off-by: Guillaume Le Biller <glebiller@Traveldoo.com>	2021-06-15 18:29:54 +02:00
Shiming Zhang	9c59e6c85f	After dbus restarts, make GracefulNodeShutdown work again	2021-05-19 10:05:38 +08:00
Kubernetes Prow Robot	3c20c5aa2f	Merge pull request #100177 from wangyx1992/wrapped-error fix errors in wrapped format	2021-04-13 23:24:42 -07:00
wangyx1992	34c2b2360b	fix errors in wrapped format Signed-off-by: wangyx1992 <wang.yixiang@zte.com.cn>	2021-03-26 14:57:55 +08:00
JUN YANG	90bfd38b83	Structured Logging migration: modify node and pod part logs of kubelet. Signed-off-by: JunYang <yang.jun22@zte.com.cn>	2021-03-13 12:31:09 +08:00
David Porter	893f5fd4f0	Promote kubelet graceful node shutdown to beta - Change the feature gate from alpha to beta and enable it by default - Update a few of the unit tests due to feature gate being enabled by default - Small refactor in `nodeshutdown_manager` which adds `featureEnabled` function (which checks that feature gate and that `kubeletConfig.ShutdownGracePeriod > 0`). - Use `featureEnabled()` to exit early from shutdown manager in the case that the feature is disabled - Update kubelet config defaulting to be explicit that `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` default to zero and update the godoc comments. - Update defaults and add featureGate tag in api config godoc. With this feature now in beta and the feature gate enabled by default, to enable graceful shutdown all that will be required is to configure `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` in the kubelet config. If not configured, they will be defaulted to zero, and graceful shutdown will effectively be disabled.	2021-03-05 15:21:37 -08:00
Kubernetes Prow Robot	9ec1e23e41	Merge pull request #98005 from wzshiming/fix-rescheduling-to-the-shutdown-node Sync node status during kubelet node shutdown	2021-01-28 17:51:53 -08:00
wzshiming	d9df265af0	Sync node status during kubelet node shutdown	2021-01-21 11:01:13 +08:00
wzshiming	4e17e58552	Fix repeatedly acquiring the inhibit lock	2021-01-15 10:49:11 +08:00
David Porter	16f71c6d47	Implement shutdown manager in kubelet Implements KEP 2000, Graceful Node Shutdown: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2000-graceful-node-shutdown * Add new FeatureGate `GracefulNodeShutdown` to control enabling/disabling the feature * Add two new KubeletConfiguration options * `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods` * Add new package, `nodeshutdown` that implements the Node shutdown manager * The node shutdown manager uses the systemd inhibit package to create a system inhibitor, monitor for node shutdown events, and gracefully terminate pods upon a node shutdown.	2020-11-12 21:47:55 +00:00

30 Commits