kubernetes

Author	SHA1	Message	Date
Jason Simmons	5a6acf85fa	Align lifecycle handlers and probes Align the behavior of HTTP-based lifecycle handlers and HTTP-based probers, converging on the probers implementation. This fixes multiple deficiencies in the current implementation of lifecycle handlers surrounding what functionality is available. The functionality is gated by the features.ConsistentHTTPGetHandlers feature gate.	2022-10-19 09:51:52 -07:00
Kubernetes Prow Robot	843ad71cac	Merge pull request #113041 from saschagrunert/kubelet-pods-creation-time Sort kubelet pods by their creation time	2022-10-18 09:17:19 -07:00
Sascha Grunert	b296f82c69	Sort kubelet pods by their creation time

There is a corner case when blocking Pod termination via a lifecycle preStop hook, for example by using this StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: ubi
  serviceName: "ubi"
  replicas: 1
  template:
    metadata:
      labels:
        app: ubi
    spec:
      terminationGracePeriodSeconds: 1000
      containers:
      - name: ubi
        image: ubuntu:22.04
        command: ['sh', '-c', 'echo The app is running! && sleep 360000']
        ports:
        - containerPort: 80
          name: web
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - 'echo aaa; trap : TERM INT; sleep infinity & wait'
```

After creation, downscaling, forced deletion and upscaling of the replica like this:

```
> kubectl apply -f sts.yml
> kubectl scale sts web --replicas=0
> kubectl delete pod web-0 --grace-period=0 --force
> kubectl scale sts web --replicas=1
```

We will end up having two pods running by the container runtime, while the API only reports one:

```
> kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          92s
```

```
> sudo crictl pods
POD ID          CREATED          STATE   NAME    NAMESPACE   ATTEMPT   RUNTIME
e05bb7dbb7e44   12 minutes ago   Ready   web-0   default     0         (default)
d90088614c73b   12 minutes ago   Ready   web-0   default     0         (default)
```

When now running `kubectl exec -it web-0 -- ps -ef`, there is a random chance that we hit the wrong container reporting the lifecycle command `/bin/sh -c echo aaa; trap : TERM INT; sleep infinity & wait`.

This is caused by the container lookup via its name (and no podUID) at: `02109414e8/pkg/kubelet/kubelet_pods.go (L1905-L1914)`

And more specifically by the conversion of the pod result map to a slice in `GetPods`: `02109414e8/pkg/kubelet/kuberuntime/kuberuntime_manager.go (L407-L411)`

We now solve that unexpected behavior by tracking the creation time of the pod and sorting the result based on that. This causes the lookup to always match the most recently created pod.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2022-10-13 16:32:44 +02:00
Dixita Narang	ff1f525511	Setting LockToDefault as true for KubeletCredentialProviders feature, and removing conditions that check if the feature is enabled since now the feature is enabled by default	2022-09-29 16:42:48 +00:00
Antonio Ojea	d434c588d7	Revert "change CPUCFSQuotaPeriod default value to 100us to match Linux default" This reverts commit `f2d591fae6`.	2022-08-26 23:51:04 +02:00
Dmitry Verkhoturov	f2d591fae6	change CPUCFSQuotaPeriod default value to 100us to match Linux default cpu.cfs_period_us is 100μs by default despite having an "ms" unit for some unfortunate reason. Documentation: https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management The desired effect of that change is to match k8s default `CPUCFSQuotaPeriod` value (100ms before that change) with one used in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled and Linux CFS (100us, 1000x smaller than 100ms).	2022-08-10 03:25:05 +02:00
Kubernetes Prow Robot	2e1a4da8df	Merge pull request #111358 from ddebroy/hasnet1 Introduce PodHasNetwork condition for pods	2022-08-01 15:04:52 -07:00
Deep Debroy	dfdf8245bb	Introduce PodHasNetwork condition for pods Signed-off-by: Deep Debroy <ddebroy@gmail.com>	2022-08-01 09:51:43 -07:00
Lee Verberne	d238e67ba6	Remove EphemeralContainers feature-gate checks	2022-07-26 02:55:30 +02:00
Adrian Reber	8c24857ba3	kubelet: add CheckpointContainer() to the runtime Signed-off-by: Adrian Reber <areber@redhat.com>	2022-07-14 10:27:41 +00:00
Kubernetes Prow Robot	1b2de5cf01	Merge pull request #109042 from bjorand/network_panic_kubelet kubelet: fix panic triggered when playing with a wip CRI	2022-05-03 18:24:20 -07:00
Benjamin Jorand	3c65728ede	kubelet: fix panic triggered when playing with a wip CRI	2022-03-26 00:23:35 +01:00
Deep Debroy	023d6fb8f4	Pass instrumented runtime service to containergc Signed-off-by: Deep Debroy <ddebroy@gmail.com>	2022-03-08 14:33:37 +00:00
KeZhang	3946d99904	Ignore container notfound error while getPodstatuses	2022-02-16 08:55:19 +08:00
Kubernetes Prow Robot	64e83a7e43	Merge pull request #107945 from saschagrunert/cri-verbose Add support for CRI `verbose` fields	2022-02-14 17:58:12 -08:00
Sascha Grunert	effbcd3a0a	Add support for CRI `verbose` fields The remote runtime implementation now supports the `verbose` fields, which are required for consumers like cri-tools to enable multi CRI version support. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2022-02-10 17:12:26 +01:00
Ciprian Hacman	0819451ea6	Clean up logic for deprecated flag --container-runtime in kubelet Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>	2022-02-10 13:26:59 +02:00
cyclinder	07999dac70	Clean up dockershim flags in the kubelet Signed-off-by: cyclinder <qifeng.guo@daocloud.io> Co-authored-by: Ciprian Hacman <ciprian@hakman.dev> Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>	2022-01-14 16:02:50 +02:00
Ciprian Hacman	5bae9b9288	Clean up DockerLegacyService interface Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>	2021-12-18 12:24:54 +02:00
Sascha Grunert	de37b9d293	Make CRI `v1` the default and allow a fallback to `v1alpha2` This patch makes the CRI `v1` API the new project-wide default version. To allow backwards compatibility, a fallback to `v1alpha2` has been added as well. This fallback can be determined automatically by the kubelet. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-11-17 11:05:05 -08:00
Kubernetes Prow Robot	5d60c8d857	Merge pull request #102393 from mengjiao-liu/fix-sysctl-regex Upgrade preparation to verify sysctl values containing forward slashes by regex	2021-11-09 18:23:26 -08:00
Mark Rossetti	ef324d6bbd	Adding kubelet metrics for started and failed to start HostProcess containers Signed-off-by: Mark Rossetti <marosset@microsoft.com>	2021-11-04 14:39:57 -07:00
Mengjiao Liu	275d832ce2	Upgrade preparation to verify sysctl values containing forward slashes by regex	2021-11-04 11:49:56 +08:00
yxxhero	35df409a7e	remove StartedPodsErrorsTotal metric message Signed-off-by: yxxhero <aiopsclub@163.com>	2021-09-23 22:18:56 +08:00
Sascha Grunert	46077e6be7	Remove deprecated `--seccomp-profile-root`/`seccompProfileRoot` configuration The configuration is deprecated and targets removal for v1.23. Test cases have been changed as well. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-08-31 09:55:28 +02:00
Ryan Phillips	30e9a420c4	kubelet: fix sandbox creation error suppression when pods are quickly deleted	2021-08-10 08:55:25 -05:00
Kubernetes Prow Robot	dab6f6a43d	Merge pull request #102344 from smarterclayton/keep_pod_worker Prevent Kubelet from incorrectly interpreting "not yet started" pods as "ready to terminate pods" by unifying responsibility for pod lifecycle into pod worker	2021-07-08 16:48:53 -07:00
Kubernetes Prow Robot	a9d7526864	Merge pull request #102970 from tkestack/feature-memory-qos Feature: Support memory qos with cgroups v2	2021-07-08 14:01:36 -07:00
Kubernetes Prow Robot	7c84064a4f	Merge pull request #99000 from verb/1.21-kubelet-metrics Add kubelet metrics for ephemeral containers	2021-07-08 14:00:55 -07:00
Li Bo	c3d9b10ca8	feature: support Memory QoS for cgroups v2	2021-07-08 09:26:46 +08:00
Clayton Coleman	3eadd1a9ea	Keep pod worker running until pod is truly complete A number of race conditions exist when pods are terminated early in their lifecycle because components in the kubelet need to know "no running containers" or "containers can't be started from now on" but were relying on outdated state. Only the pod worker knows whether containers are being started for a given pod, which is required to know when a pod is "terminated" (no running containers, none coming). Move that responsibility and podKiller function into the pod workers, and have everything that was killing the pod go into the UpdatePod loop. Split syncPod into three phases - setup, terminate containers, and cleanup pod - and have transitions between those methods be visible to other components. After this change, to kill a pod you tell the pod worker to UpdatePod({UpdateType: SyncPodKill, Pod: pod}). Several places in the kubelet were incorrect about whether they were handling terminating (should stop running, might have containers) or terminated (no running containers) pods. The pod worker exposes methods that allow other loops to know when to set up or tear down resources based on the state of the pod - these methods remove the possibility of race conditions by ensuring a single component is responsible for knowing each pod's allowed state and other components simply delegate to checking whether they are in the window by UID. Removing containers no longer blocks final pod deletion in the API server and is handled as background cleanup. Node shutdown no longer marks pods as failed as they can be restarted in the next step. See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details	2021-07-06 15:55:22 -04:00
Elana Hashman	0deef4610e	Set MemorySwapLimitInBytes for CRI when NodeSwapEnabled	2021-06-29 11:59:02 -07:00
Sascha Grunert	8b7003aff4	Add SeccompDefault feature This adds the gate `SeccompDefault` as a new alpha feature. Seccomp path and field fallbacks are now passed to the helper functions, and unit tests covering those code paths have been added as well. Besides enabling the feature gate, the feature has to be enabled by the `SeccompDefault` kubelet configuration or its corresponding `--seccomp-default` CLI flag. Signed-off-by: Sascha Grunert <sgrunert@redhat.com> Apply suggestions from code review Co-authored-by: Paulo Gomes <pjbgf@linux.com> Signed-off-by: Sascha Grunert <sgrunert@redhat.com>	2021-06-23 10:22:57 +02:00
yuzhiquan	bebca30309	comment should have function name as prefix	2021-04-28 15:26:46 +08:00
Lee Verberne	29178fff1c	Add kubelet managed pod metrics	2021-04-13 14:13:30 +02:00
Aditi Sharma	461c0c1656	Fix structured logging for kuberuntime_manager.go	2021-03-15 10:13:18 +05:30
Elana Hashman	9fb6e712ff	Override terminationLivenessGracePeriod for probes	2021-03-11 14:38:03 -08:00
Kubernetes Prow Robot	c22f099395	Merge pull request #99841 from adisky/kuberuntime_manager Migrate pkg/kubelet/kuberuntime/kuberuntime_manager.go to structured logging	2021-03-08 16:27:44 -08:00
Aditi Sharma	45c7608379	Migrate to structured logging pkg/kubelet/kuberuntime/kuberuntime_manager.go Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>	2021-03-08 11:27:44 +05:30
Matthias Bertschy	431e6a7044	Move readinessManager updates handling to kubelet	2021-03-05 07:02:25 +01:00
Ryan Phillips	f989adaa18	kubelet: fix create sandbox delete pod race	2021-02-18 11:22:12 -06:00
changshuchao	42eb85e4fb	Made some optimizations, including modifying variable names, omitting unnecessary parentheses, and resolving conflicts between variable names and package names. Signed-off-by: changshuchao <chang.shuchao1@zte.com.cn>	2021-01-16 17:24:08 +08:00
Andrew Sy Kim	51441fd052	kubelet: support alpha credential provider exec plugins Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>	2020-11-10 13:44:06 -05:00
Kubernetes Prow Robot	402b94f313	Merge pull request #91469 from kinvolk/rata/fix-kubelet-log-msg Fix kubelet log message when starting a container	2020-09-21 22:28:46 -07:00
Kubernetes Prow Robot	48d5d204c3	Merge pull request #92614 from tnqn/onfailure-recreate Don't create a new sandbox for pod with RestartPolicyOnFailure if all containers succeeded	2020-09-03 14:57:40 -07:00
Rodrigo Campos	e6c67c32e1	Fix kubelet log message when starting a container This code can be called not only when a container is dead and restarted, but also when it is started for the first time. For example, any pod with initContainer and containers will exhibit this behaviour. The reason is that in that case, the "if createPodSandbox" path will return the initContainers only and on the next call to this function this code is executed to start the containers for the first time. In that case, it is wrong to log that the container is dead and will be restarted, as it was never started. In fact, the restart count will not be increased. This commit just changes this to say that the container is not in the desired state and should be started. In the end, the kubelet is a state machine and that is all we really care about. No tests are added, as the behaviour was correct and tests don't check log messages. Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>	2020-08-04 14:58:27 -03:00
Marian Lobur	5d1b3e26af	Fix an issue when rotated logs of dead containers are not removed.	2020-07-24 10:06:24 +02:00
Quan Tian	b2b082f54f	Don't create a new sandbox for pod with RestartPolicyOnFailure if all containers succeeded The kubelet would attempt to create a new sandbox for a pod whose RestartPolicy is OnFailure even after all containers succeeded. It caused unnecessary CRI and CNI calls, confusing logs and conflicts between the routine that creates the new sandbox and the routine that kills the Pod. This patch checks the containers to start and stops creating a sandbox if no container is supposed to start.	2020-07-07 22:49:48 +08:00
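The check described in the commit above (only create a sandbox when at least one container is actually supposed to start) can be sketched as follows. This is a simplified illustration, not the kubelet implementation: the types `restartPolicy` and `containerState` and the functions `containersToStart` and `shouldCreateSandbox` are hypothetical, and the restart rules are reduced to the three policy names mentioned in Kubernetes pod specs.

```go
package main

import "fmt"

// restartPolicy is a hypothetical, simplified pod restart policy.
type restartPolicy string

const (
	policyAlways    restartPolicy = "Always"
	policyOnFailure restartPolicy = "OnFailure"
	policyNever     restartPolicy = "Never"
)

// containerState is a hypothetical summary of one container's last known state.
type containerState struct {
	Name     string
	Running  bool
	Exited   bool // true once the container has run and terminated
	ExitCode int
}

// containersToStart lists the containers that still need to be (re)started
// under the given policy.
func containersToStart(policy restartPolicy, states []containerState) []string {
	var names []string
	for _, s := range states {
		switch {
		case s.Running:
			// Already running, nothing to do.
		case !s.Exited:
			// Never started yet, so it must be started once.
			names = append(names, s.Name)
		case policy == policyAlways:
			names = append(names, s.Name)
		case policy == policyOnFailure && s.ExitCode != 0:
			names = append(names, s.Name)
		}
	}
	return names
}

// shouldCreateSandbox reports whether a new sandbox is worth creating:
// only when none is ready and at least one container would be started in it.
func shouldCreateSandbox(sandboxReady bool, policy restartPolicy, states []containerState) bool {
	if sandboxReady {
		return false
	}
	return len(containersToStart(policy, states)) > 0
}

func main() {
	// All containers succeeded and the sandbox is gone.
	done := []containerState{
		{Name: "a", Exited: true, ExitCode: 0},
		{Name: "b", Exited: true, ExitCode: 0},
	}
	fmt.Println("OnFailure, all succeeded:", shouldCreateSandbox(false, policyOnFailure, done))
	fmt.Println("Always, all succeeded:", shouldCreateSandbox(false, policyAlways, done))
}
```

Deciding on the list of containers first, and on the sandbox second, avoids the CRI and CNI calls for a sandbox that would never host a container.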
Kubernetes Prow Robot	14d9b5d758	Merge pull request #92325 from brianpursley/sync-pod-log Add pod and container name in log message when container fails to start	2020-06-24 04:55:18 -07:00
Brian Pursley	2afc8e0eab	Add pod and container name in log message when container fails to start	2020-06-23 12:59:53 -04:00

203 Commits