kubernetes/pkg/kubelet
Clayton Coleman 3eadd1a9ea
Keep pod worker running until pod is truly complete
A number of race conditions exist when pods are terminated early in
their lifecycle because components in the kubelet need to know whether
a pod has "no running containers" or whether "containers can't be
started from now on", but were relying on outdated state.

Only the pod worker knows whether containers are being started for
a given pod, which is required to know when a pod is "terminated"
(no running containers, none coming). Move that responsibility and
podKiller function into the pod workers, and have everything that
was killing the pod go into the UpdatePod loop. Split syncPod into
three phases - setup, terminate containers, and cleanup pod - and
have transitions between those methods be visible to other
components. After this change, to kill a pod you tell the pod worker
to UpdatePod({UpdateType: SyncPodKill, Pod: pod}).
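
The three-phase flow described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the kubelet's actual implementation: the names UpdatePod, UpdateType, SyncPodKill and Pod come from the commit message, while SyncPodUpdate as modeled here, the phase constants, the podWorkers fields and newPodWorkers are simplified assumptions made for the example. The real pod worker runs per-pod goroutines and moves from terminating to cleanup only once containers have actually stopped.

```go
package main

import "fmt"

// UpdateType says why the worker is being asked to sync a pod.
type UpdateType int

const (
	SyncPodUpdate UpdateType = iota // ordinary sync (simplified assumption)
	SyncPodKill                     // request termination
)

// Pod is a stand-in for the real API object; only the UID matters here.
type Pod struct{ UID string }

// UpdatePodOptions is a simplified version of the options named in the commit message.
type UpdatePodOptions struct {
	UpdateType UpdateType
	Pod        *Pod
}

// phase is a hypothetical per-pod state tracked only by the worker.
type phase int

const (
	phaseSetup       phase = iota // containers may be started
	phaseTerminating              // containers are being stopped, none may start
	phaseTerminated               // no running containers, none coming
)

// podWorkers is the single owner of each pod's phase, keyed by UID.
type podWorkers struct {
	phases map[string]phase
	steps  []string // record of the work performed, for illustration
}

func newPodWorkers() *podWorkers {
	return &podWorkers{phases: map[string]phase{}}
}

// UpdatePod is the only entry point; a kill request moves the pod forward
// and nothing can ever move it back to the setup phase.
func (w *podWorkers) UpdatePod(opts UpdatePodOptions) {
	uid := opts.Pod.UID
	cur := w.phases[uid] // zero value is phaseSetup for unseen pods
	if opts.UpdateType == SyncPodKill && cur == phaseSetup {
		cur = phaseTerminating
	}
	switch cur {
	case phaseSetup:
		w.steps = append(w.steps, uid+": setup")
	case phaseTerminating:
		w.steps = append(w.steps, uid+": terminate containers")
		w.steps = append(w.steps, uid+": cleanup pod")
		cur = phaseTerminated
	case phaseTerminated:
		// Nothing left to do; later updates are ignored.
	}
	w.phases[uid] = cur
}

func main() {
	w := newPodWorkers()
	pod := &Pod{UID: "uid-1"}
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodUpdate, Pod: pod})
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodKill, Pod: pod})
	w.UpdatePod(UpdatePodOptions{UpdateType: SyncPodUpdate, Pod: pod}) // ignored
	for _, s := range w.steps {
		fmt.Println(s)
	}
}
```

The point of the sketch is that every caller goes through one method, so the decision "can containers still be started for this pod" is made in exactly one place.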

Several places in the kubelet were incorrect about whether they
were handling terminating (should stop running, might have
containers) or terminated (no running containers) pods. The pod worker
exposes methods that allow other loops to know when to set up or tear
down resources based on the state of the pod. These methods remove
the possibility of race conditions by making a single component
responsible for knowing each pod's allowed state; other components
simply check, by UID, whether the pod is within the relevant window.
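
The delegation pattern described in this paragraph might look something like the sketch below. Everything in it is hypothetical and chosen for illustration: podStateTracker, its method names, and volumesSafeToUnmount are not taken from the commit, and treating an unknown UID as "no running containers" is a simplification. It only shows the shape of the idea: one component owns the state, and other loops ask it a yes/no question by UID.

```go
package main

import (
	"fmt"
	"sync"
)

// podState distinguishes the two cases the commit message says were confused.
type podState int

const (
	stateRunning     podState = iota // containers may be running or started
	stateTerminating                 // should stop running, might still have containers
	stateTerminated                  // no running containers, none coming
)

// podStateTracker is a hypothetical single owner of per-pod state.
type podStateTracker struct {
	mu     sync.RWMutex
	states map[string]podState
}

func newPodStateTracker() *podStateTracker {
	return &podStateTracker{states: map[string]podState{}}
}

// set is called only by the owning component.
func (t *podStateTracker) set(uid string, s podState) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.states[uid] = s
}

// CouldHaveRunningContainers is true for known pods that are not yet terminated.
// Unknown UIDs return false here, which is a simplification for the example.
func (t *podStateTracker) CouldHaveRunningContainers(uid string) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	s, ok := t.states[uid]
	return ok && s != stateTerminated
}

// ShouldContainersBeTerminating is true once termination has been requested.
func (t *podStateTracker) ShouldContainersBeTerminating(uid string) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	s, ok := t.states[uid]
	return ok && s >= stateTerminating
}

// volumesSafeToUnmount is what a hypothetical cleanup loop would call
// instead of inspecting possibly outdated state itself.
func volumesSafeToUnmount(t *podStateTracker, uid string) bool {
	return !t.CouldHaveRunningContainers(uid)
}

func main() {
	t := newPodStateTracker()
	t.set("uid-1", stateTerminating)
	fmt.Println("safe to unmount while terminating:", volumesSafeToUnmount(t, "uid-1"))
	t.set("uid-1", stateTerminated)
	fmt.Println("safe to unmount once terminated:", volumesSafeToUnmount(t, "uid-1"))
}
```

Because the cleanup loop never caches the answer, it cannot act on a stale view of whether containers are still running.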

Removing containers no longer blocks final pod deletion in the
API server; container removal is now handled as background cleanup.
Node shutdown no longer marks pods as failed, since they can be
restarted in the next step.

See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details
2021-07-06 15:55:22 -04:00
apis Merge pull request #101943 from saschagrunert/seccomp-default 2021-06-24 13:07:41 -07:00
cadvisor disable collecting of accelerator metrics and exposing it for containerd 2021-04-30 22:16:34 +00:00
certificate Add test cases to the LoadClientConfig function 2021-06-02 15:22:00 +00:00
checkpointmanager remove fakefs to drop spf13/afero dependency 2021-06-24 09:51:34 -04:00
client hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
cloudresource Apply suggestions from code review 2021-03-05 23:59:23 +05:30
cm Merge pull request #103146 from tech-geek29/fix-95380 2021-06-25 07:44:45 -07:00
config Pre-allocated memory 2021-06-01 15:19:44 +08:00
configmap reduce configmap and secret watch of kubelet 2021-03-08 16:55:39 +08:00
container Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
cri Merge pull request #100175 from changshuchao/testcase_utils 2021-04-08 20:28:22 -07:00
custommetrics hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
dockershim Update pause image to v3.5 2021-05-25 09:04:46 +02:00
envvars hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
events hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
eviction Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
images Merge pull request #99994 from AfrouzMashayekhi/sl-cmd-kubelet 2021-03-16 14:49:56 -07:00
kubeletconfig remove fakefs to drop spf13/afero dependency 2021-06-24 09:51:34 -04:00
kuberuntime Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
leaky hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
legacy hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
lifecycle Scheduler: remove pkg/features dependency from NodeResources plugins 2021-05-18 08:59:02 -04:00
logs Merge pull request #99680 from CaoDonghui123/fixissues4 2021-05-24 16:18:20 -07:00
metrics update kubelet_running_pods metrics comments: pods that have a running pod sandbox 2021-04-29 11:05:52 +08:00
network Add feature gate ExpandedDNSConfig 2021-05-27 07:10:13 +09:00
nodeshutdown Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
nodestatus fix_change_error_to_info 2021-04-21 10:35:23 +02:00
oom Merge pull request #99479 from mengjiao-liu/migrate_to_structured_logs 2021-03-11 17:28:33 -08:00
pleg Merge pull request #101308 from pacoxu/doc-kubelet-running-pods 2021-05-26 03:17:20 -07:00
pluginmanager migrate pkg/kubelet/pluginmanager to structured logging 2021-03-10 15:44:16 +08:00
pod Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
preemption Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
prober fix manual trigger of readinessProbe on startupProbe success 2021-05-05 11:21:40 +02:00
qos Only system-node-critical pods should be OOM Killed last 2021-03-03 16:34:27 -05:00
runtimeclass hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
secret reduce configmap and secret watch of kubelet 2021-03-08 16:55:39 +08:00
server Implement all necessary methods to provide memory manager data under pod resources metrics 2021-06-22 13:06:32 +03:00
stats Merge pull request #101712 from SergeyKanzhelev/disableAcceleratorUsageMetricsOnContainerd 2021-05-17 13:39:51 -07:00
status Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
sysctl hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
token Merge pull request #99264 from palnabarun/structured-logging-pkg/kubelet/token 2021-03-08 19:23:11 -08:00
types Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
util remove fakefs to drop spf13/afero dependency 2021-06-24 09:51:34 -04:00
volumemanager Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
winstats Merge pull request #99855 from hexxdump/master 2021-03-17 00:46:56 -07:00
active_deadline_test.go
active_deadline.go
doc.go
errors.go
kubelet_dockerless_test.go
kubelet_dockershim_nodocker.go
kubelet_dockershim.go Structured Logging migration: modify dockershim and network part logs of kubelet. 2021-02-14 16:01:47 +08:00
kubelet_getters_test.go
kubelet_getters.go pkg/kubelet: improve the node informer sync check 2021-04-21 22:46:27 +03:00
kubelet_network_linux.go Structured Logging migration: modify dockershim and network part logs of kubelet. 2021-02-14 16:01:47 +08:00
kubelet_network_others.go
kubelet_network_test.go
kubelet_network.go Structured Logging migration: modify dockershim and network part logs of kubelet. 2021-02-14 16:01:47 +08:00
kubelet_node_status_others.go
kubelet_node_status_test.go Move pkg/kubelet/apis to k8s.io/kubelet/pkg/apis 2021-02-09 21:37:39 +01:00
kubelet_node_status_windows.go
kubelet_node_status.go Merge pull request #98154 from yangjunmyfm192085/run-test 2021-03-11 17:28:18 -08:00
kubelet_pods_linux_test.go Windows: Fixes /etc/hosts file mounting support for containerd 2021-01-30 04:54:42 -08:00
kubelet_pods_test.go Ensure kubelet statuses can handle loss of container runtime state 2021-06-15 11:12:55 -07:00
kubelet_pods_windows_test.go Windows: Fixes /etc/hosts file mounting support for containerd 2021-01-30 04:54:42 -08:00
kubelet_pods.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
kubelet_resources_test.go
kubelet_resources.go Migrate pkg/kubelet/kubeletconfig to Structured Logging 2021-03-15 15:42:34 -07:00
kubelet_test.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
kubelet_volumes_linux_test.go kubelet: do not call RemoveAll on volumes directory for orphaned pods 2021-06-08 13:57:35 -06:00
kubelet_volumes_test.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
kubelet_volumes.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
kubelet.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
OWNERS Add klueska as an approver in pkg/kubelet/OWNERS 2021-02-09 21:43:35 +01:00
pod_container_deletor_test.go
pod_container_deletor.go Structured Logging migration: modify volume and container part logs of kubelet. 2021-03-17 08:59:03 +08:00
pod_workers_test.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
pod_workers.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
reason_cache_test.go
reason_cache.go
runonce_test.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
runonce.go Keep pod worker running until pod is truly complete 2021-07-06 15:55:22 -04:00
runtime.go
time_cache_test.go
time_cache.go
volume_host.go Structured Logging migration: modify volume and container part logs of kubelet. 2021-03-17 08:59:03 +08:00