2131 Commits

Author SHA1 Message Date
Clayton Coleman
3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know "no
running containers" or "containers can't be started from now on" but
were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod - these methods remove
the possibility of race conditions by ensuring a single component is
responsible for knowing each pod's allowed state and other components
simply delegate to checking whether they are in the window by UID.

Removing containers no longer blocks final pod deletion in the
API server and is handled as background cleanup. Node shutdown
no longer marks pods as failed, as they can be restarted in the
next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
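The state ownership described in this commit can be illustrated with a minimal sketch. It is not the kubelet's implementation: only `UpdatePod`, `UpdateType`, and `SyncPodKill` appear in the commit message, while `podWorkers`, `syncOnce`, `CouldHaveRunningContainers`, and the phase names are hypothetical and chosen for illustration. The worker loop is reduced to a synchronous `syncOnce` call so the terminating and terminated states can be observed separately.

```go
package main

import (
	"fmt"
	"sync"
)

// UpdateType and SyncPodKill are named in the commit message; every other
// identifier in this file is hypothetical and exists only for illustration.
type UpdateType int

const (
	SyncPodUpdate UpdateType = iota
	SyncPodKill
)

// phase tracks where a pod is in its lifecycle.
type phase int

const (
	phaseSetup       phase = iota // containers may be started
	phaseTerminating              // should stop running, might have containers
	phaseTerminated               // no running containers, none coming
)

type Pod struct{ UID string }

type UpdatePodOptions struct {
	UpdateType UpdateType
	Pod        *Pod
}

// podWorkers is the single owner of each pod's allowed state.
type podWorkers struct {
	mu     sync.Mutex
	phases map[string]phase
}

func newPodWorkers() *podWorkers {
	return &podWorkers{phases: make(map[string]phase)}
}

// UpdatePod records the requested update. A kill request moves a pod that is
// still being set up into the terminating phase; a pod never moves backwards.
func (w *podWorkers) UpdatePod(opts UpdatePodOptions) {
	w.mu.Lock()
	defer w.mu.Unlock()
	uid := opts.Pod.UID
	if _, ok := w.phases[uid]; !ok {
		w.phases[uid] = phaseSetup
	}
	if opts.UpdateType == SyncPodKill && w.phases[uid] == phaseSetup {
		w.phases[uid] = phaseTerminating
	}
}

// syncOnce stands in for one iteration of the per-pod worker loop.
func (w *podWorkers) syncOnce(uid string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if p, ok := w.phases[uid]; ok && p == phaseTerminating {
		// Containers would be stopped here; afterwards none are running.
		w.phases[uid] = phaseTerminated
	}
}

// CouldHaveRunningContainers lets other loops ask, by UID, whether they
// must still keep resources for the pod in place.
func (w *podWorkers) CouldHaveRunningContainers(uid string) bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	p, ok := w.phases[uid]
	return ok && p != phaseTerminated
}

func main() {
	w := newPodWorkers()
	pod := &Pod{UID: "pod-a"}
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodUpdate, Pod: pod})
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodKill, Pod: pod})
	fmt.Println("after kill request:", w.CouldHaveRunningContainers(pod.UID))
	w.syncOnce(pod.UID)
	fmt.Println("after worker sync:", w.CouldHaveRunningContainers(pod.UID))
}
```

The point of the design is that callers never inspect container state themselves; they send a kill request and then query the worker by UID, so there is only one place where "terminating" versus "terminated" is decided.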
Elana Hashman
0deef4610e
Set MemorySwapLimitInBytes for CRI when NodeSwapEnabled 2021-06-29 11:59:02 -07:00
Ryan Phillips
d9be5abc37 kubelet: add shutdown events 2021-06-23 16:44:19 -05:00
Sascha Grunert
8b7003aff4
Add SeccompDefault feature
This adds the gate `SeccompDefault` as a new alpha feature. Seccomp path
and field fallbacks are now passed to the helper functions, and unit
tests covering those code paths have been added as well.

Besides enabling the feature gate, the feature has to be enabled by the
`SeccompDefault` kubelet configuration or its corresponding
`--seccomp-default` CLI flag.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>

Apply suggestions from code review

Co-authored-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-06-23 10:22:57 +02:00
Artyom Lukianov
03830db82d Implement all necessary methods to provide memory manager data under pod resources metrics
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-06-22 13:06:32 +03:00
sanwishe
9e257ec194 Optimization logging format for pkg/kubelet
Signed-off-by: sanwishe <jiang.mingzhi35@zte.com.cn>
2021-05-25 08:52:08 +08:00
Danil-Grigorev
5d57b3794c Add DisableCloudProviders FG
FeatureGate acts as a secondary switch to disable cloud-controller loops
in KCM, Kubelet and KAPI.

Provide comprehensive logging information to users to guide them in
adopting an out-of-tree cloud provider implementation.
2021-05-21 16:09:44 +02:00
Kubernetes Prow Robot
3e588be763
Merge pull request #101712 from SergeyKanzhelev/disableAcceleratorUsageMetricsOnContainerd
disable collecting of accelerator metrics in cAdvisor
2021-05-17 13:39:51 -07:00
Kubernetes Prow Robot
cff652d951
Merge pull request #101369 from markusthoemmes/status-simplification
pkg/kubelet: Simplify status string generation on probes
2021-05-03 17:21:22 -07:00
Sergey Kanzhelev
e8ae653c1d disable collecting of accelerator metrics and exposing it for containerd 2021-04-30 22:16:34 +00:00
yuzhiquan
02c3d53a23 typo 2021-04-23 17:55:54 +08:00
Markus Thömmes
f00441d2ee pkg/kubelet: Simplify status string generation on probes 2021-04-22 14:06:18 +02:00
Lubomir I. Ivanov
7deac5e697 pkg/kubelet: improve the node informer sync check
GetNode() is called in a lot of places including a hot loop in
fastStatusUpdateOnce. Having a poll in it is delaying
the kubelet /readyz status=200 report.

If a client is available attempt to wait for the sync to happen,
before starting the list watch for pods at the apiserver.
2021-04-21 22:46:27 +03:00
Niekvdplas
fec272a7b2 Fixed several spelling mistakes 2021-03-30 23:02:09 +02:00
Elana Hashman
6af7eb6d49
Migrate missed log entries in kubelet
Co-Authored-By: pacoxu <paco.xu@daocloud.io>
2021-03-18 14:26:26 -07:00
Navid Shaikh
be91ea5bd1 Migrate pkg/kubelet/kubelet.go to structured logging 2021-03-17 14:39:08 +05:30
Kubernetes Prow Robot
4b6e3e164f
Merge pull request #99221 from jsturtevant/windows-host-stats-provider
Get filesystem stats for files on Windows
2021-03-10 11:09:23 -08:00
fengzixu
edc1c62471 feature: add CSIVolumeHealth feature and gate
1. add EventRecorder to ResourceAnalyzer
2. add CSIVolumeHealth feature and gate
2021-03-10 01:16:37 +09:00
Kubernetes Prow Robot
b1367af8b5
Merge pull request #99799 from QiWang19/kobj-slice
Refactor pods format to return ObjRef slice
2021-03-08 16:27:19 -08:00
James Sturtevant
c9eff4e906 Get filesystem stats for files on Windows 2021-03-08 12:50:23 -08:00
Kubernetes Prow Robot
eb4dafb7f1
Merge pull request #99651 from umohnani8/cri
Move CRIContainerLogRotation to GA
2021-03-08 12:07:20 -08:00
chenyw1990
57a3b0abd6 reduce configmap and secret watch of kubelet 2021-03-08 16:55:39 +08:00
Kubernetes Prow Robot
c193c1b234
Merge pull request #98376 from matthyx/mega
Make all health checks probing consistent
2021-03-06 11:45:41 -08:00
Qi Wang
8133d29586 Refactor pods format to ObjRef slice
Refactor format.Pods to return a slice of ObjRef for structured logging.
Ref: https://github.com/kubernetes/kubernetes/pull/99029#discussion_r586785552

Signed-off-by: Qi Wang <qiwan@redhat.com>
2021-03-06 11:26:50 -05:00
Kubernetes Prow Robot
55f255208a
Merge pull request #83730 from claudiubelu/windows/containerd-etc-hosts
Windows: Fixes /etc/hosts file mounting support for containerd
2021-03-05 05:08:22 -08:00
Matthias Bertschy
431e6a7044 Move readinessManager updates handling to kubelet 2021-03-05 07:02:25 +01:00
Matthias Bertschy
eed218a3a2 Move startupManager updates handling to kubelet 2021-03-05 07:02:25 +01:00
Urvashi Mohnani
ca99aa587d Move CRIContainerLogRotation to GA
Graduate the CRIContainerLogRotation feature gate
from beta to GA.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
2021-03-04 09:40:02 -05:00
pacoxu
cd54bd94e9 deprecate cAdvisor json metrics collected by Kubelet
- remove unused code for cadvisor json metrics collected

Signed-off-by: pacoxu <paco.xu@daocloud.io>
2021-03-02 15:36:21 +08:00
Kubernetes Prow Robot
17c3ee8708
Merge pull request #98742 from gjkim42/sync-until-terminate-containers
kubelet: Sync completed pods until their containers have been terminated
2021-02-24 15:29:26 -08:00
Kubernetes Prow Robot
739a72b9cc
Merge pull request #99158 from wgahnagl/lock-sysctls
Graduate sysctls to GA
2021-02-24 13:39:24 -08:00
xiaofei.sun
fd62f32125 Scheduler: remove pkg/apis/core/field_constants.go 2021-02-24 18:06:29 +08:00
pacoxu
3de4dd841f
remove featuregate for sysctl
Co-authored-by: Skyler Clark <wgahnagl@protonmail.com>
2021-02-22 16:51:43 -05:00
Sri Saran Balaji Vellore Rajakumar
af05a7eca3 Refactor Kubelet Server to take kubeConfiguration instead of multiple fields 2021-02-11 16:15:35 -08:00
Sri Saran Balaji Vellore Rajakumar
51cdf4e97b Add support to disable /debug/pprof and /debug/flags/v endpoint
Co-authored-by: xiaofei.sun <sunxiaofei@kuaishou.com>
Co-authored-by: SaranBalaji90 <srisaranbalaji@gmail.com>
2021-02-11 15:56:53 -08:00
Geonju Kim
321ca8af52 kubelet: Sync completed pods until their containers have been terminated 2021-02-06 14:06:50 +09:00
Ryan Phillips
f918e11e3a register all pending pod deletions and check for kill
do not delete the cgroup from a pod when it is being killed
2021-02-04 11:45:42 -06:00
Claudiu Belu
de4602995b Windows: Fixes /etc/hosts file mounting support for containerd
If Containerd is used on Windows, then we can also mount individual
files into containers (e.g.: /etc/hosts), which was not possible with Docker.

Checks if the container runtime is containerd, and if it is, then also
mount /etc/hosts file (to C:\Windows\System32\drivers\etc\hosts).
2021-01-30 04:54:42 -08:00
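The decision described in this commit can be sketched as a small function. This is an illustration only: `hostsMountTarget` and its string parameters are hypothetical, and the real kubelet check is structured differently. The two container paths are the ones named in the commit message.

```go
package main

import "fmt"

// hostsMountTarget is a hypothetical helper. It returns the in-container
// path for the managed hosts file and whether the file should be mounted.
func hostsMountTarget(goos, containerRuntime string) (string, bool) {
	if goos != "windows" {
		return "/etc/hosts", true
	}
	// On Windows, individual files can only be mounted with containerd.
	if containerRuntime == "containerd" {
		return `C:\Windows\System32\drivers\etc\hosts`, true
	}
	return "", false
}

func main() {
	for _, rt := range []string{"docker", "containerd"} {
		path, ok := hostsMountTarget("windows", rt)
		fmt.Printf("runtime=%s mount=%v path=%q\n", rt, ok, path)
	}
}
```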
Kubernetes Prow Robot
9ec1e23e41
Merge pull request #98005 from wzshiming/fix-rescheduling-to-the-shutdown-node
Sync node status during kubelet node shutdown
2021-01-28 17:51:53 -08:00
Kubernetes Prow Robot
e05c9ab04b
Merge pull request #97932 from ehashman/kubelet-standalone-doc
Add explanation for kubeClient != nil in NewMainKubelet
2021-01-28 16:59:59 -08:00
wzshiming
d9df265af0 Sync node status during kubelet node shutdown 2021-01-21 11:01:13 +08:00
Kubernetes Prow Robot
09f4baed35
Merge pull request #98103 from gjkim42/delete-static-pod-gracefully
Delete static pod gracefully and fix mirrorPodTerminationMap leak
2021-01-19 10:01:44 -08:00
chymy
f25b902b83 kubelet logs print 'kubelet nodes sync' frequently
Signed-off-by: chymy <chang.min1@zte.com.cn>
2021-01-19 08:57:35 +08:00
Geonju Kim
1563fb68e6 kubelet: Fix mirrorPodTerminationMap leak 2021-01-19 08:55:54 +09:00
Geonju Kim
9bcb451d7d kubelet: Delete static pods gracefully
Add a new static pod after checking if its mirror pod is pending termination.
2021-01-19 08:54:24 +09:00
Kubernetes Prow Robot
8bf42039e6
Merge pull request #96552 from pandaamanda/klog_fmt
use klog.Info and klog.Warning when had no format
2021-01-15 17:57:43 -08:00
Kubernetes Prow Robot
e0b2787ee1
Merge pull request #97980 from SergeyKanzhelev/revertSandboxCheckInStatus
Revert "Merge pull request #92817 from kmala/kubelet"
2021-01-12 16:54:35 -08:00
Kubernetes Prow Robot
4e93dbcd0d
Merge pull request #94087 from derekwaynecarr/node-sync-once
kubelet waits for node lister to sync at least once
2021-01-12 15:06:35 -08:00
Sergey Kanzhelev
4c9e96c238 Revert "Merge pull request #92817 from kmala/kubelet"
This reverts commit 88512be213, reversing
changes made to c3b888f647.
2021-01-12 22:27:22 +00:00
Elana Hashman
f3073b739f
Add explanation for kubeClient != nil in NewMainKubelet 2021-01-11 14:57:38 -08:00
Kubernetes Prow Robot
75115236e7
Merge pull request #97006 from lingsamuel/fix-cadvisor-machine-metrics
Fix missing cadvisor machine metrics
2020-12-31 21:15:51 -08:00
Erik Wilson
a4037d2684
Fix cadvisor machine metrics
Signed-off-by: Ling Samuel <lingsamuelgrace@gmail.com>
2020-12-04 10:08:05 +08:00
Derek Carr
acb43c7c4a Rework hostfs metrics
Ephemeral storage usage should be calculated by the metrics code,
not the eviction code.
2020-12-03 13:04:25 -07:00
Joel Smith
39a11744ce Partially revert "Include pod /etc/hosts in ephemeral storage calculation for eviction"
This reverts (most of) commit f34b586d01.
2020-12-03 04:47:16 -07:00
xiongzhongliang
90f4aeeea4 use klog.Info and klog.Warning when had no format 2020-11-14 00:55:06 +08:00
David Porter
16f71c6d47 Implement shutdown manager in kubelet
Implements KEP 2000, Graceful Node Shutdown:
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2000-graceful-node-shutdown

* Add new FeatureGate `GracefulNodeShutdown` to control
enabling/disabling the feature
* Add two new KubeletConfiguration options
  * `ShutdownGracePeriod` and `ShutdownGracePeriodCriticalPods`
* Add new package, `nodeshutdown` that implements the Node shutdown
manager
  * The node shutdown manager uses the systemd inhibit package to
  create a system inhibitor, monitor for node shutdown events, and
  gracefully terminate pods upon a node shutdown.
2020-11-12 21:47:55 +00:00
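As a rough sketch of how the two options relate, the following assumes the behavior described in KEP 2000 rather than anything stated in this commit message: `ShutdownGracePeriod` is the total budget, and `ShutdownGracePeriodCriticalPods` is the portion of it reserved for critical pods. The function name `splitGracePeriod` is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// splitGracePeriod divides the total shutdown budget between regular and
// critical pods. Assumption (from KEP 2000, not this commit message): the
// critical-pod period is carved out of the total, so it cannot exceed it.
func splitGracePeriod(total, criticalPods time.Duration) (regular, critical time.Duration, err error) {
	if criticalPods > total {
		return 0, 0, errors.New("critical-pod grace period exceeds the total shutdown grace period")
	}
	return total - criticalPods, criticalPods, nil
}

func main() {
	regular, critical, err := splitGracePeriod(30*time.Second, 10*time.Second)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("regular pods:", regular, "critical pods:", critical)
}
```

Under this assumption, regular pods are terminated first within their share of the budget, and critical pods are terminated in the time that remains.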
Kubernetes Prow Robot
f653e6cf92
Merge pull request #94624 from dims/deprecate-dockershim
Deprecate Dockershim
2020-11-12 08:04:50 -08:00
Kubernetes Prow Robot
12d9183da0
Merge pull request #95718 from SergeyKanzhelev/runtimeClass2
RuntimeClass GA
2020-11-12 00:44:51 -08:00
Kubernetes Prow Robot
d233111f5b
Merge pull request #94196 from andrewsykim/registry-creds
kubelet: add alpha credential provider plugins
2020-11-11 19:59:11 -08:00
Kubernetes Prow Robot
f5abe26a19
Merge pull request #93243 from AlexeyPerevalov/NUMAidAndCPUidsInPodResources
Implement TopologyInfo and cpu_ids in podresources interface
2020-11-11 12:35:11 -08:00
Sergey Kanzhelev
06da0e5e74 GA of RuntimeClass feature gate and API 2020-11-11 19:22:32 +00:00
Alexey Perevalov
a8b8995ef2 Implement TopologyInfo and cpu_ids in podresources
It covers deviceplugin & cpumanager.

It has a drawback: cpuset and all other structs, including cadvisor's, keep
cpu as int, but for a protobuf-based interface it is better to have a
fixed-size int.
This patch also introduces an additional interface, CPUsProvider, while
DeviceProvider might have been extended too.

Checkpoint not covered by unit test.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
2020-11-11 13:50:49 +03:00
Haowei Cai
e79ba4877e move lease controller to k8s.io/component-helpers/apimachinery 2020-11-10 15:51:03 -08:00
Andrew Sy Kim
51441fd052 kubelet: support alpha credential provider exec plugins
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
2020-11-10 13:44:06 -05:00
Tim Allclair
a439bc5572
Remove --redirect-container-streaming functionality (#95935)
* Remove --redirect-container-streaming functionality

* Update bazel
2020-11-09 11:50:11 -08:00
Ling Samuel
c7f2b2aa2f
Remove useless variable and if
Signed-off-by: Ling Samuel <lingsamuelgrace@gmail.com>
2020-11-03 15:57:22 +08:00
Sergey Kanzhelev
d974b142d3 follow up for #94109 2020-10-27 07:02:44 +00:00
Kubernetes Prow Robot
47943d5f9c
Merge pull request #94109 from derekwaynecarr/cleanup-kubelet-todos
Cleanup kubelet TODOs that are no longer pertinent.
2020-10-26 23:49:59 -07:00
Kubernetes Prow Robot
f20a36f784
Merge pull request #95428 from roycaihw/cleanup/generalize-lease-controller
Generalize node lease controller
2020-10-23 13:43:02 -07:00
Kubernetes Prow Robot
c6f7fbcfbc
Merge pull request #93220 from wawa0210/fix-93165
ignore apparmor on windows
2020-10-22 23:17:59 -07:00
Haowei Cai
c9bbd8532f generalize lease controller 2020-10-22 11:58:59 -07:00
Kubernetes Prow Robot
01f3f67989
Merge pull request #92663 from AndersonQ/68026-golint-/pkg/kubelet/stats
cleanup: fix golint errors in /pkg/kubelet/stats
2020-10-12 23:48:26 -07:00
Anderson Queiroz
8c724d7933 cleanup: fix golint errors in /pkg/kubelet/stats 2020-10-08 21:59:42 +02:00
Dan Winship
75242fce7a kubelet: allow specifying dual-stack node IPs on bare metal
Discussion is ongoing about how to best handle dual-stack with clouds
and autodetected IPs, but there is at least agreement that people on
bare metal ought to be able to specify two explicit IPs on dual-stack
hosts, so allow that.
2020-10-07 17:25:54 -04:00
Dan Winship
9a7afa69ef kubelet: do dual-stack iptables rules
When the dual-stack feature gate is enabled, set up dual-stack
iptables rules. (When it is not enabled, preserve the traditional
behavior of setting up IPv4 rules only, unless the user explicitly
passed an IPv6 --node-ip.)
2020-10-03 07:46:02 -04:00
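The rule selection described in this commit can be sketched as follows. The function `iptablesFamilies` and its parameters are hypothetical, and returning only IPv6 for an IPv6 `--node-ip` with the gate disabled is one reading of the commit message, not a confirmed detail of the implementation.

```go
package main

import (
	"fmt"
	"net"
)

// iptablesFamilies is a hypothetical helper that reports which IP families
// should get iptables rules. nodeIP is the value of --node-ip, or "" if unset.
func iptablesFamilies(dualStackEnabled bool, nodeIP string) []string {
	if dualStackEnabled {
		return []string{"IPv4", "IPv6"}
	}
	// Gate disabled: traditional single-family behavior. An explicitly
	// passed IPv6 node IP is assumed here to select IPv6 rules instead.
	ip := net.ParseIP(nodeIP)
	if ip != nil && ip.To4() == nil {
		return []string{"IPv6"}
	}
	return []string{"IPv4"}
}

func main() {
	fmt.Println(iptablesFamilies(true, ""))
	fmt.Println(iptablesFamilies(false, "10.0.0.1"))
	fmt.Println(iptablesFamilies(false, "fd00::1"))
}
```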
Srini Brahmaroutu
fbe5daed73 Change code to use staging/k8s.io/mount-utils 2020-09-16 21:51:24 -07:00
wawa0210
995d654167
ignore apparmor on non-Linux operating systems. 2020-09-15 17:30:44 +08:00
Kubernetes Prow Robot
88512be213
Merge pull request #92817 from kmala/kubelet
Check for sandboxes before deleting the pod from apiserver
2020-09-10 07:27:45 -07:00
Davanum Srinivas
d63dd5fe4d
Deprecate Dockershim
Start the clock on removing dockershim in a subsequent release

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-09-09 10:08:07 -04:00
Kubernetes Prow Robot
1d1daaa044
Merge pull request #94084 from brianpursley/kubernetes-93925-logging
Add logging when fail to kill container or pod
2020-09-04 03:32:23 -07:00
Derek Carr
752135242e WIP: node sync at least once 2020-09-01 12:17:26 -04:00
Kubernetes Prow Robot
dce91dece3
Merge pull request #93283 from runzexia/cleanup-unused-container-cache
clean up unused var containerCache
2020-08-28 06:36:33 -07:00
brianpursley
6d001ebb68 Add logging if container or pod fails to be killed 2020-08-25 20:37:49 -04:00
Kubernetes Prow Robot
6da73aa572
Merge pull request #93333 from loburm/fix-logrotate
Fix an issue when rotated logs of dead containers are not removed.
2020-08-20 03:27:23 -07:00
Derek Carr
02daa3ec23 Cleanup kubelet TODOs that are no longer pertinent. 2020-08-19 16:40:54 -04:00
Jordan Liggitt
b181c76cbd Deflake TestUpdateNodeStatusWithLease - guard cached machineInfo 2020-08-05 10:00:36 -04:00
Marian Lobur
5d1b3e26af Fix an issue when rotated logs of dead containers are not removed. 2020-07-24 10:06:24 +02:00
Keerthan Reddy,Mala
90cc954eed add sandbox deletor to delete sandboxes on pod delete event 2020-07-22 11:54:58 -07:00
Jordan Liggitt
d195fc2ec8 Ensure runtimeCache contains all observed started containers on pod delete 2020-07-21 15:54:29 -04:00
RyderXia
136df8ce53 update 2020-07-21 17:00:49 +08:00
Kubernetes Prow Robot
8398bc3b53
Merge pull request #92916 from joelsmith/count-etc-hosts
Include pod /etc/hosts in ephemeral storage calculation for eviction
2020-07-12 06:59:36 -07:00
Kubernetes Prow Robot
93e76f5081
Merge pull request #92442 from tedyu/grace-period-with-map
Respect grace period when removing mirror pod
2020-07-10 17:49:23 -07:00
Kubernetes Prow Robot
a6378d8b12
Merge pull request #92779 from fisherxu/patch-2
Return err when create ContainerLogsDir failed
2020-07-10 15:41:37 -07:00
Kubernetes Prow Robot
1e3eeba9fa
Merge pull request #91577 from knabben/kubelet-bootstrap
kubelet: remove the --bootstrap-checkpoint-path feature
2020-07-09 00:03:41 -07:00
Ted Yu
a76a959294 Respect grace period when removing mirror pod
Signed-off-by: Ted Yu <yuzhihong@gmail.com>
2020-07-08 13:38:24 -07:00
Joel Smith
f34b586d01 Include pod /etc/hosts in ephemeral storage calculation for eviction 2020-07-08 12:58:11 -06:00
Fei Xu
34826c82be Return err when create ContainerLogsDir failed 2020-07-07 09:36:35 +08:00
Sri Saran Balaji Vellore Rajakumar
05240c9218 Add support for disabling /logs endpoint in kubelet 2020-07-06 07:52:30 -07:00
Kubernetes Prow Robot
4a91ecb976
Merge pull request #91863 from knabben/kubelet-memcg-notification
Moving Kubelet kernel-memcg-notification to configuration file
2020-06-25 00:20:37 -07:00
Amim Knabben
c39cf28ed3 Moving Kubelet kernel-memcg-notification to configuration file 2020-06-24 06:44:00 -04:00