Several reports exist (for both device plugins and CSI) that the
kubelet with grpc-go sends an invalid Authority header, and some
non-grpc-go servers reject these unix domain socket client connections.
grpc-go sets the Authority header correctly when the dial address is in
a format whose address scheme can be determined. Instead of changing
all server addresses to the unix:// prefixed format, set the
grpc.WithAuthority("localhost") client connection override to get the
same result.
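For illustration, a minimal sketch of such a dial; the helper name and
the socket handling are simplified, not the exact kubelet code:

    package main

    import (
    	"context"
    	"net"
    	"time"

    	"google.golang.org/grpc"
    )

    // dialPlugin dials a unix domain socket given as a bare filesystem
    // path. Because the target has no scheme, grpc-go cannot derive a
    // valid :authority header, so it is overridden explicitly.
    func dialPlugin(socketPath string) (*grpc.ClientConn, error) {
    	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    	defer cancel()
    	return grpc.DialContext(ctx, socketPath,
    		grpc.WithInsecure(),
    		grpc.WithAuthority("localhost"),
    		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
    			return (&net.Dialer{}).DialContext(ctx, "unix", addr)
    		}),
    	)
    }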
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Device Plugins that wish to leverage the Topology Manager can send back a populated
TopologyInfo struct as part of the device registration, along with the device IDs
and the health of the device. TopologyInfo is converted to TopologyHints and
used by TopologyManager to find the optimal/desired resource allocation for a Pod.
If a plugin sends an empty but non-nil instance of TopologyInfo for a
resource, devicemanager passes it on as an empty instance of
TopologyHint, which is currently interpreted as "Hint Provider has no
possible NUMA affinities for resource", which in turn means that pods
requesting that resource will fail. To avoid blocking device resources
that pass TopologyInfo{Nodes:[]*NUMANode{}} from being used, interpret
that as a nil set of hints rather than a []TopologyHint{}.
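A minimal sketch of the intended interpretation, with TopologyHint as a
simplified stand-in for the topologymanager type and a hypothetical
helper name:

    package devicemanager

    import pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"

    // TopologyHint is a simplified stand-in for the topologymanager type.
    type TopologyHint struct {
    	NUMANodeIDs []int64
    	Preferred   bool
    }

    // hintsForDevice returns nil ("no topology preference") for a nil or
    // empty TopologyInfo, instead of an empty slice ("no possible NUMA
    // affinity"), which would make pods requesting the resource fail.
    func hintsForDevice(info *pluginapi.TopologyInfo) []TopologyHint {
    	if info == nil || len(info.Nodes) == 0 {
    		return nil
    	}
    	hints := make([]TopologyHint, 0, len(info.Nodes))
    	for _, n := range info.Nodes {
    		hints = append(hints, TopologyHint{NUMANodeIDs: []int64{n.ID}, Preferred: true})
    	}
    	return hints
    }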
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
cpu.cfs_period_us is measured in microseconds in the kernel but
provided as a time.Duration by the user; this change clarifies the code
to make this evident to the reader.
Also, the minimum value for that setting is 1ms, not 1μs, and this
change alters the validation to reject values smaller than 1ms.
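A sketch of the tightened check, using a hypothetical helper (the
kernel only accepts periods between 1ms and 1s):

    package kubelet

    import (
    	"fmt"
    	"time"
    )

    // validateCPUCFSQuotaPeriod rejects periods the kernel would refuse:
    // cpu.cfs_period_us must be between 1ms and 1s, inclusive.
    func validateCPUCFSQuotaPeriod(period time.Duration) error {
    	if period < time.Millisecond || period > time.Second {
    		return fmt.Errorf("invalid CPU CFS quota period %v: must be between 1ms and 1s", period)
    	}
    	return nil
    }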
Track how long it takes for pod updates to propagate from detection
to a successful change on the API server. This will guide future
improvements in pod start and shutdown latency.
Metric is `kubelet_pod_status_sync_duration_seconds` and is ALPHA
stability. Histogram buckets are chosen based on distribution of
observed status delays in practice.
If you run "kubelet --cloud-provider X --node-ip Y", kubelet will set
an annotation on the node, but previously, if you then ran just
"kubelet --cloud-provider X" (or just "kubelet --node-ip Y"), it
wouldn't delete the stale annotation. Fix that.
* Adapt https://github.com/kubernetes/kubernetes/pull/109441 but
ensure that `search .` does not get propagated into containers'
/etc/resolv.conf. There is no reason to put `.` in a container's
search field, and it causes issues for musl.
These host paths have a well-known location under /tmp/hostpath_pv
and are therefore safe to be labeled with the shared SELinux label.
Without this label, the mounted volumes cannot be accessed by the
container processes.
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
Drop the bitArray implementation and use the bitmap implementation
from k8s.io/kubernetes/pkg/registry/core/service/allocator.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
By doing so, fix a bug where the stats providers report a directory
as not found after a pod's storage is removed.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of this change is to match the k8s default
`CPUCFSQuotaPeriod` value (100ms before this change) with the one used
in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled and in Linux
CFS (100us, 1000x smaller than 100ms).
github.com/opencontainers/selinux/go-selinux needs an OS that supports
SELinux, with SELinux enabled, to return useful data, therefore add an
interface in front of it so we can mock its behavior in unit tests.
In future commits we will need this to set the user/group of supported
volumes of KEP 127 - Phase 1.
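A rough sketch of such an interface; the names here are illustrative,
not the exact ones added:

    package volume

    import selinux "github.com/opencontainers/selinux/go-selinux"

    // seLinuxHost abstracts the parts of go-selinux we call, so unit
    // tests can substitute a fake on hosts without SELinux.
    type seLinuxHost interface {
    	GetEnabled() bool
    }

    type realSELinux struct{}

    func (realSELinux) GetEnabled() bool { return selinux.GetEnabled() }

    // fakeSELinux lets unit tests pretend SELinux is enabled (or not).
    type fakeSELinux struct{ enabled bool }

    func (f fakeSELinux) GetEnabled() bool { return f.enabled }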
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
It is used to allocate and keep track of the unique user ranges
assigned to each pod that runs in a user namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Co-authored-by: Rodrigo Campos <rodrigoca@microsoft.com>
This change promotes the local storage capacity isolation feature to
GA. At the same time, to allow rootless systems to disable this feature
(they are unable to get root fs stats), this change introduces a new
kubelet config field, "localStorageCapacityIsolation". By default it is
set to true. Rootless systems can set this configuration to false to
disable the feature. Once it is set to false, users cannot set
ephemeral-storage requests/limits because capacity and allocatable will
not be set.
Change-Id: I48a52e737c6a09e9131454db6ad31247b56c000a
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)
- PreemptedByScheduler (Pod preempted by kube-scheduler)
We now partly drop support for seccomp annotations, as planned for
v1.25 as part of the KEP:
https://github.com/kubernetes/enhancements/issues/135
Pod security policies are not touched by this change and therefore we
have to keep the annotation key constants.
This means we only allow usage of the annotations for backwards
compatibility reasons, while synchronization of the field to the
annotation is no longer supported. Using the annotations for static
pods is also no longer supported.
Making the annotations fully non-functional will be deferred to a
future release.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of this change is more clarity on the default value,
so users would be aware that a 10ms custom value would be not 0.1x of
the default, but 100x of it.
Currently, when the kubelet tries to compress the logs into a .gz
file, it attempts to rename the archive before closing its file
handles, which results in an error on Windows.
This addresses the issue mentioned above.
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Currently, the plugin Watcher checks whether a file is a socket by
testing mode&os.ModeSocket != 0, which can never be true on Windows.
util.IsUnixDomainSocket should be used instead.
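The difference, sketched; util is k8s.io/kubernetes/pkg/kubelet/util:

    package pluginwatcher

    import (
    	"os"

    	"k8s.io/kubernetes/pkg/kubelet/util"
    )

    // oldCheck never matches on Windows, where sockets do not carry the
    // os.ModeSocket mode bit.
    func oldCheck(fi os.FileInfo) bool {
    	return fi.Mode()&os.ModeSocket != 0
    }

    // newCheck works on both Linux and Windows.
    func newCheck(path string) (bool, error) {
    	return util.IsUnixDomainSocket(path)
    }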
ensureDirectory is just a wrapper around MkdirAll.
Since MkdirAll doesn't treat an existing directory as an error, there
is no need for the extra stat() syscall that was previously performed.
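After the change, the helper reduces to something like this sketch:

    package kubelet

    import "os"

    // ensureDirectory creates the directory if needed; MkdirAll already
    // returns nil when the directory exists, so no extra stat() call is
    // required beforehand.
    func ensureDirectory(path string) error {
    	return os.MkdirAll(path, 0755)
    }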
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This is the first step to implement checkpointing and restoring of
containers, starting from the lowest layer in the kubelet.
Signed-off-by: Adrian Reber <areber@redhat.com>
The utils found in pkg/kubelet/cri/remote/utils are the same as the
ones in pkg/kubelet/utils, with the difference that the latter have
had a few improvements recently.
This commit removes the duplicated code.
When adding functionality to the kubelet package and a test file, it
is kind of painful to run unit tests locally today.
We usually can't run tests specifying just the test file, because if
xx_test.go and xx.go use the same package, we need to specify all the
dependencies. As soon as xx.go uses the Kubelet type (we need to do
that to fake a kubelet in the unit tests), this is completely
impossible to do in practice.
So the other option is to run the unit tests for the whole package or
run only a specific function. Running a single function can work in
some cases, but it is painful when we want to test all the functions
we wrote. On the other hand, running the tests for the whole package
is very slow.
Today some unit tests try to connect to the API server (with retries),
create and list lots of pods/volumes, etc. This makes running the unit
tests for the kubelet package slow.
This patch tries to make running the unit tests for the whole package
more palatable. It adds a skip when the short version is requested
(go test -short ...), so we don't try to connect to the API server and
we skip other slow tests.
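The pattern applied, with a hypothetical test name:

    package kubelet

    import "testing"

    func TestSyncPodAgainstAPIServer(t *testing.T) {
    	if testing.Short() {
    		// Requested via `go test -short`: skip tests that reach out
    		// to the API server or are otherwise slow.
    		t.Skip("skipping test in short mode")
    	}
    	// ... slow test body ...
    }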
Before this patch, running the unit tests took the following on my
computer (I've run it several times, so compilation is already done):
$ time go test -v
real 0m21.303s
user 0m9.033s
sys 0m2.052s
With this patch it takes ~1/3 of the time:
$ time go test -short -v
real 0m7.825s
user 0m9.588s
sys 0m1.723s
Around 8 seconds is something I can wait to run the tests :)
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
The code as it stands now works, but it is still complicated, and
previous versions had race conditions
(https://github.com/kubernetes/kubernetes/issues/108040). Now the test
works without modifying global state. The individual test cases could
run in parallel; this just isn't done because they already complete
quickly (2 seconds).
It is useful to have the ability to control whether alpha or beta features are
enabled. We can group features under LoggingAlphaOptions and LoggingBetaOptions
because the configuration is designed so that each feature individually must be
enabled via its own option.
Currently, the JSON format itself is beta (graduated in 1.23) but additional
options for it were only added in 1.23 and thus are still alpha:
$ go run ./staging/src/k8s.io/component-base/logs/example/cmd/logger.go --logging-format=json --log-json-split-stream --log-json-info-buffer-size 1M --feature-gates LoggingBetaOptions=false
[format: Forbidden: Log format json is BETA and disabled, see LoggingBetaOptions feature, options.json.splitStream: Forbidden: Feature LoggingAlphaOptions is disabled, options.json.infoBufferSize: Forbidden: Feature LoggingAlphaOptions is disabled]
$ go run ./staging/src/k8s.io/component-base/logs/example/cmd/logger.go --logging-format=json --log-json-split-stream --log-json-info-buffer-size 1M
[options.json.splitStream: Forbidden: Feature LoggingAlphaOptions is disabled, options.json.infoBufferSize: Forbidden: Feature LoggingAlphaOptions is disabled]
This is the same approach that was taken for CPUManagerPolicyAlphaOptions and
CPUManagerPolicyBetaOptions.
In order to test this without modifying the global feature gate in a
test file, ValidateKubeletConfiguration must take a feature gate as an
argument.
Making the LoggingConfiguration part of the versioned
component-base/config API had the theoretical advantage that components
could have offered different configuration APIs with experimental
features limited to alpha versions (for example, sanitization offered
only in a v1alpha1.KubeletConfiguration). Some components could have
decided to only use stable logging options.
In practice, this wasn't done. Furthermore, we don't want different components
to make different choices regarding which logging features they offer to
users. It should always be the same everywhere, for the sake of consistency.
This can be achieved with a saner Go API by dropping the distinction
between internal and external LoggingConfiguration types. Different
stability levels of individual fields have to be covered by
documentation (done) and potentially feature gates (not currently
done).
Advantages:
- everything related to logging is under component-base/logs;
previously this was scattered across different packages and
different files under "logs" (why some code was in logs/config.go
vs. logs/options.go vs. logs/logs.go always confused me again
and again when coming back to the code):
- long-term config and command line API are clearly separated
into the "api" package underneath that
- logs/logs.go itself only deals with legacy global flags and
logging configuration
- removal of separate Go APIs like logs.BindLoggingFlags and
logs.Options
- LogRegistry becomes an implementation detail, with less code
and less exported functionality (only registration needs to
be exported, querying is internal)
Terminal pods may continue to report a ready condition of true because
there is a delay in reconciling the ready condition of the containers
from the runtime with the pod status. It should be invalid for kubelet
to report a terminal phase with a true ready condition. To fix the
issue, explicitly override the ready condition to false for terminal
pods during status updates.
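A sketch of the override during status generation; the function name is
illustrative, with types from k8s.io/api/core/v1:

    package status

    import v1 "k8s.io/api/core/v1"

    // overrideTerminalPodReady forces the Ready condition to false once
    // a pod has reached a terminal phase, regardless of lagging
    // container status from the runtime.
    func overrideTerminalPodReady(status *v1.PodStatus) {
    	if status.Phase != v1.PodSucceeded && status.Phase != v1.PodFailed {
    		return
    	}
    	for i := range status.Conditions {
    		if status.Conditions[i].Type == v1.PodReady {
    			status.Conditions[i].Status = v1.ConditionFalse
    			status.Conditions[i].Reason = "PodCompleted"
    		}
    	}
    }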
Signed-off-by: David Porter <david@porter.me>
The pod worker is the owner of when a container is running or not,
and the start and stop of the probes for a given pod should be
handled during the pod sync loop. This ensures that probes do not
continue running even after eviction.
Because the pod semantics allow lifecycle probes to shorten grace
period, the probe is removed after the containers in a pod are
terminated successfully. As an optimization, if the pod will have
a very short grace period (0 or 1 seconds) we stop the probes
immediately to reduce resource usage during eviction slightly.
After this change, the probe manager is only called by the pod
worker or by the reconcile loop.
This fixes a case where the pod sandbox won't have the HostProcess bit
set if the pod does not have a security context but its containers
specify HostProcess.
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
This resolves a couple of issues for CSI volume reconstruction.
1. IsLikelyNotMountPoint is known not to work for bind mounts and was
causing problems for subpaths and hostpath volumes.
2. Inline volumes were failing reconstruction due to calling
GetVolumeName, which only works when there is a PV spec.
grpc-go v1.43.0 marked grpc.WithInsecure() as deprecated, so this
commit moves to the recommended replacement:
grpc.WithTransportCredentials(insecure.NewCredentials())
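Before and after, sketched:

    package main

    import (
    	"google.golang.org/grpc"
    	"google.golang.org/grpc/credentials/insecure"
    )

    func dial(target string) (*grpc.ClientConn, error) {
    	// Deprecated since grpc-go v1.43.0:
    	//   grpc.Dial(target, grpc.WithInsecure())
    	return grpc.Dial(target,
    		grpc.WithTransportCredentials(insecure.NewCredentials()))
    }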
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
The means by which we extract and parse the version of an API object
is not specific to etcd3. In order to allow for a generic suite of
tests against any storage.Interface implementation, we need this logic
to live outside of the etcd3 package, or import cycles will exist.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
This is the first step towards being able to support a new plugin API version
in parallel with the existing one.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
When parsing a resolv.conf file that has "search .", parseResolvConf should
accept the "." entry verbatim. Before this commit, parseResolvConf
unconditionally trimmed the "." suffix, which in the case of "." resulted
in a "" entry (that is, the empty string). This empty entry could lead
parseResolvConf to produce a resolv.conf file with "search ". Resolvers
could fail to parse such a resolv.conf file from parseResolvConf, thus
breaking DNS resolution in pods. After this commit, parseResolvConf
accepts a resolv.conf file with "search ." and passes the "." entry through
verbatim to produce a valid resolv.conf file. The "." suffix is still
trimmed for any entry that does not solely comprise ".".
Follow-up to commit a215a88d91.
* pkg/kubelet/network/dns/dns.go (parseResolvConf): Handle a "." entry in
the search path by copying it verbatim.
* pkg/kubelet/network/dns/dns_test.go (TestParseResolvConf): Add a test
case for "search .".
* Add FeatureGate PodHostIPs
* Add HostIPs field and update PodIPs field
* Types conversion
* Add dropDisabledStatusFields
* Add HostIPs for kubelet
* Add fuzzer for PodStatus
* Add status.hostIPs in ConvertDownwardAPIFieldLabel
* Add status.hostIPs in validEnvDownwardAPIFieldPathExpressions
* Downward API support for status.hostIPs
* Add DownwardAPI validation for status.hostIPs
* Add e2e to check that hostIPs works
* Add e2e to check that Downward API works
* Regenerate
The changes (mostly in pkg/kubelet/cm) are there to adopt the changed
runc 1.1 API, and simplify things a bit. In particular:
1. simplify cgroup manager instantiation, using the new, easier
libcontainer/cgroups/manager.New;
2. replace libcontainerAdapter with a boolean variable (all it did
was passing on whether systemd manager should be used);
3. trivial change due to removed cgroupfs.HugePageSizes and added
cgroups.HugePageSizes();
4. do not calculate cgroup paths in update / destroy, since libcontainer
cgroup managers now calculate the paths upon creation (previously,
they were doing that only in Apply, so using e.g. Set or Destroy right
after creation was impossible without specifying paths).
We currently still calculate cgroup paths in Exists -- this is to be
addressed separately.
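A sketch of the simplified instantiation; the field values here are
illustrative:

    package cm

    import (
    	"github.com/opencontainers/runc/libcontainer/cgroups/manager"
    	"github.com/opencontainers/runc/libcontainer/configs"
    )

    // newManager lets manager.New pick the right backend (cgroup v1, v2,
    // or systemd) from the config, replacing the per-backend
    // constructors and the old libcontainerAdapter indirection.
    func newManager(name, parent string, systemd bool) error {
    	cg := &configs.Cgroup{
    		Name:      name,
    		Parent:    parent,
    		Systemd:   systemd,
    		Resources: &configs.Resources{},
    	}
    	mgr, err := manager.New(cg)
    	if err != nil {
    		return err
    	}
    	// Paths are computed at creation time now, so Set and Destroy
    	// work immediately, without a prior Apply.
    	return mgr.Set(cg.Resources)
    }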
Co-Authored-By: Elana Hashman <ehashman@redhat.com>
Components that run in a container but modify the host network
namespace iptables rules need to know whether the system is using
iptables-legacy or iptables-nft. Given that kubelet will run before
any container-based components, it is well-positioned to help them
figure this out. So create a chain with a well-known name that they
can look for.
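A sketch using kubelet's iptables wrapper; the chain name here is
illustrative of the well-known-name idea, not necessarily the exact
one chosen:

    package kubelet

    import (
    	utiliptables "k8s.io/kubernetes/pkg/util/iptables"
    	utilexec "k8s.io/utils/exec"
    )

    // ensureHintChain creates a chain with a well-known name so that
    // containerized components can probe which iptables mode the host
    // is using.
    func ensureHintChain() error {
    	ipt := utiliptables.New(utilexec.New(), utiliptables.ProtocolIPv4)
    	_, err := ipt.EnsureChain(utiliptables.TableMangle,
    		utiliptables.Chain("KUBE-IPTABLES-HINT"))
    	return err
    }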
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations were found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
When using a legacy cloud provider, if the kubelet is passed a node
address in --node-ip, it will use this address in preference to the
addresses reported by the cloud provider.
When using an external cloud provider, kubelet will annotate the Node
with the first --node-ip for use by the cloud provider. The cloud
provider validates this annotation but does not otherwise use it,
meaning that --node-ip has no effect.
This change moves the node address filtering code from kubelet to
component-helpers and updates both kubelet and cloud-provider to use it.
There is no functional change to kubelet, but cloud-provider now honours
kubelet's --node-ip.
Create an E2E test that creates a job that spawns a pod that should
succeed. The job reserves a fixed amount of CPU and has a large number
of completions and parallelism. Used to reproduce
github.com/kubernetes/kubernetes/issues/106884
Signed-off-by: David Porter <david@porter.me>
Other components must know when the Kubelet has released critical
resources for terminal pods. Do not set the phase in the apiserver
to terminal until all containers are stopped and cannot restart.
As a consequence of this change, the Kubelet must explicitly transition
a terminal pod to the terminating state in the pod worker which is
handled by returning a new isTerminal boolean from syncPod.
Finally, if a pod with init containers hasn't been initialized yet,
don't default container statuses, or the statuses of init containers
that have not yet been attempted, to the unknown failure state.
Previously, callers of `Exists()` would not know why the cgroup was or
was not existing. In one call-site in particular, the `kubelet` would
entirely fail to start if the cgroup validation did not succeed. In
these cases we MUST explain what went wrong and pass that information
clearly to the caller. Previously, some but not all of the reasons for
invalidation were logged at a low log-level instead. This led to poor
UX.
The original method was retained on the interface so as to make this
diff small.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Instead of doing (almost) the same thing from the three different
methods (Create, Update, Destroy), move the functionality to
libctCgroupConfig, replacing updateSystemdCgroupInfo.
The needResources bool is needed because we do not need resources
during Destroy, so we skip the unneeded resource conversion.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 79be8be10e made hugetlb settings optional if cgroup v2 is used and
hugetlb is not available, fixing issue 92933. Note at that time this was only
needed for v2, because for v1 the resources were set one-by-one, and only for
supported resources.
Commit d312ef7eb6 switched the code to using Set from runc/libcontainer
cgroups manager, and expanded the check to cgroup v1 as well.
Move this check earlier, to inside m.toResources, so instead of
converting all hugetlb resources from ResourceConfig to libcontainer's
Resources.HugetlbLimit and then setting it to nil, we can skip the
conversion entirely if hugetlb is not supported, thus avoiding work
that is not needed.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit ecd6361f added setting PidsLimit to Create and Update.
Commit bce9d5f2 added setting PidsLimit to m.toResources.
Now, PidsLimit is assigned twice.
Remove the duplicate.
Fixes: bce9d5f2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There's no need to call m.Update (which will create another instance of
libcontainer cgroup manager, convert all the resources and then set
them). All this is already done here, except for Set().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For the 'single-numa' and 'restricted' TopologyManager policies, pods are only
admitted if all of their containers have perfect alignment across the set of
resources they are requesting. The best-effort policy, on the other hand, will
prefer allocations that have perfect alignment, but fall back to a non-preferred
alignment if perfect alignment can't be achieved.
The existing algorithm of how to choose the best hint from the set of
"non-preferred" hints is fairly naive and often results in choosing a
sub-optimal hint. It works fine in cases where all resources would end
up coming from a single NUMA node (even if it's not the same NUMA node
for each resource), but
breaks down as soon as multiple NUMA nodes are required for the "best"
alignment. We will never be able to achieve perfect alignment with these
non-preferred hints, but we should try and do something more intelligent than
simply choosing the hint with the narrowest mask.
In an ideal world, we would have the TopologyManager return a set of
"resources-relative" hints (as opposed to a common hint for all resources as is
done today). Each resource-relative hint would indicate how many other
resources could be aligned to it on a given NUMA node, and a hint provider
would use this information to allocate its resources in the most aligned way
possible. There are likely some edge cases to consider here, but such an
algorithm would allow us to do partial-perfect-alignment of "some" resources,
even if all resources could not be perfectly aligned.
Unfortunately, supporting something like this would require a major redesign to
how the TopologyManager interacts with its hint providers (as well as how those
hint providers make decisions based on the hints they get back).
That said, we can still do better than the naive algorithm we have today, and
this patch provides a mechanism to do so.
We start by looking at the set of hints passed into the TopologyManager for
each resource and generate a list of the minimum number of NUMA nodes required
to satisfy an allocation for a given resource. Each entry in this list then
contains the 'minNUMAAffinity.Count()' for a given resource. Once we have this
list, we find the *maximum* 'minNUMAAffinity.Count()' from the list and mark
that as the 'bestNonPreferredAffinityCount' that we would like to have
associated with whatever "bestHint" we ultimately generate. The intuition being
that we would like to (at the very least) get alignment for those resources
that *require* multiple NUMA nodes to satisfy their allocation. If we can't
quite get there, then we should try to come as close to it as possible.
Once we have this 'bestNonPreferredAffinityCount', the algorithm proceeds as
follows:
If the mergedHint and bestHint are both non-preferred, then try and find a hint
whose affinity count is as close to (but not higher than) the
bestNonPreferredAffinityCount as possible. To do this we need to consider the
following cases and react accordingly:
1. bestHint.NUMANodeAffinity.Count() > bestNonPreferredAffinityCount
2. bestHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
3. bestHint.NUMANodeAffinity.Count() < bestNonPreferredAffinityCount
For case (1), the current bestHint is larger than the
bestNonPreferredAffinityCount, so updating to any narrower mergedHint is
preferred over staying where we are.
For case (2), the current bestHint is equal to the
bestNonPreferredAffinityCount, so we would like to stick with what we have
*unless* the current mergedHint is also equal to bestNonPreferredAffinityCount
and it is narrower.
For case (3), the current bestHint is less than bestNonPreferredAffinityCount,
so we would like to creep back up to bestNonPreferredAffinityCount as close as
we can. There are three cases to consider here:
3a. mergedHint.NUMANodeAffinity.Count() > bestNonPreferredAffinityCount
3b. mergedHint.NUMANodeAffinity.Count() == bestNonPreferredAffinityCount
3c. mergedHint.NUMANodeAffinity.Count() < bestNonPreferredAffinityCount
For case (3a), we just want to stick with the current bestHint because choosing
a new hint that is greater than bestNonPreferredAffinityCount would be
counter-productive.
For case (3b), we want to immediately update bestHint to the current
mergedHint, making it now equal to bestNonPreferredAffinityCount.
For case (3c), we know that *both* the current bestHint and the current
mergedHint are less than bestNonPreferredAffinityCount, so we want to choose
one that brings us back up as close to bestNonPreferredAffinityCount as
possible. There are three cases to consider here:
3ca. mergedHint.NUMANodeAffinity.Count() > bestHint.NUMANodeAffinity.Count()
3cb. mergedHint.NUMANodeAffinity.Count() < bestHint.NUMANodeAffinity.Count()
3cc. mergedHint.NUMANodeAffinity.Count() == bestHint.NUMANodeAffinity.Count()
For case (3ca), we want to immediately update bestHint to mergedHint because
that will bring us closer to the (higher) value of
bestNonPreferredAffinityCount.
For case (3cb), we want to stick with the current bestHint because choosing the
current mergedHint would strictly move us further away from the
bestNonPreferredAffinityCount.
Finally, for case (3cc), we know that the current bestHint and the current
mergedHint are equal, so we simply choose the narrower of the 2.
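The case analysis above, condensed into a sketch. The types are
minimal stand-ins for the ones in pkg/kubelet/cm/topologymanager, and
IsNarrowerThan abstracts the bitmask comparison (fewer bits set,
tie-broken on lower node IDs):

    package topologymanager

    // BitMask and TopologyHint are minimal stand-ins for the real types.
    type BitMask interface {
    	Count() int
    	IsNarrowerThan(BitMask) bool
    }

    type TopologyHint struct {
    	NUMANodeAffinity BitMask
    	Preferred        bool
    }

    // compareNonPreferred picks between two non-preferred hints, trying
    // to land as close to (but not above) target, the
    // bestNonPreferredAffinityCount described above.
    func compareNonPreferred(merged, best TopologyHint, target int) TopologyHint {
    	m := merged.NUMANodeAffinity.Count()
    	b := best.NUMANodeAffinity.Count()
    	switch {
    	case b > target: // case (1): any narrower hint is an improvement
    		if m < b {
    			return merged
    		}
    	case b == target: // case (2): keep best unless merged also hits target and is narrower
    		if m == target && merged.NUMANodeAffinity.IsNarrowerThan(best.NUMANodeAffinity) {
    			return merged
    		}
    	default: // case (3): b < target, creep back up toward target
    		switch {
    		case m > target: // (3a): overshooting target is counter-productive
    		case m == target: // (3b): merged lands exactly on target
    			return merged
    		case m > b: // (3ca): merged is closer to target from below
    			return merged
    		case m == b && merged.NUMANodeAffinity.IsNarrowerThan(best.NUMANodeAffinity):
    			return merged // (3cc): equal counts, choose the narrower
    		}
    		// (3cb): m < b, keep best
    	}
    	return best
    }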
This patch implements this algorithm for the case where we must choose from a
set of non-preferred hints and provides a set of unit-tests to verify its
correctness.
Signed-off-by: Kevin Klues <kklues@nvidia.com>