Commit Graph

10700 Commits

Author SHA1 Message Date
vinay kulkarni
b0dce923f1 Add Get interfaces for container's checkpointed ResourcesAllocated and Resize values, remove error logging for valid standalone kubelet scenario 2023-03-06 09:50:12 +00:00
huyinhou
88274d96fc update code style
Signed-off-by: huyinhou <huyinhou@bytedance.com>
2023-03-06 14:23:14 +08:00
vinay kulkarni
12435b26fc Fix nil pointer access panic in kubelet from uninitialized pod allocation checkpoint manager in standalone kubelet scenario 2023-03-04 08:07:40 +00:00
Sergey Kanzhelev
04189b1fc4 rename ExperimentalPodPidsLimit to PodPidsLimit 2023-03-04 01:48:16 +00:00
Paco Xu
81c5a122c3 add pageSize to memory.high formula 2023-03-03 11:24:50 +08:00
Paco Xu
7dab6253e1 default memoryThrottlingFactor to 0.9 and optimize the memory.high calculation formulas 2023-03-03 11:24:40 +08:00
Sergey Kanzhelev
e360de48b2 GRPCContainerProbe is GA 2023-03-02 22:07:59 +00:00
TommyStarK
951decd1e6 kubelet: fix recording when pulling image did finish
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-03-02 20:21:35 +01:00
Kubernetes Prow Robot
57fd02ca29 Merge pull request #116218 from pohly/test-lease-controller-leak
update lease controller
2023-03-02 10:30:56 -08:00
Kubernetes Prow Robot
efe20f6c9b Merge pull request #114114 from ffromani/full-pcpus-stricter-precheck-issue113537
node: cpumgr: stricter pre-check for the policy option full-pcpus-only
2023-03-02 09:04:56 -08:00
Francesco Romani
0e9b92090c node: cpumgr: stricter precheck for full-pcpus-only
In order to implement the `full-pcpus-only` cpumanager policy option,
we leverage the implementation of the algorithm which picks CPUs.
By design, CPUs are taken from the biggest chunk available (socket
or NUMA zone) to physical cores, down to single cores.

Leveraging this, if the requested CPU count is a multiple of the SMT
level (commonly 2), we're guaranteed that only full physical cores
will be taken.

The hidden assumption here is that this holds true by construction iff
the user reserved CPUs (if any) considering full physical CPUs.
IOW, if the user did, intentionally or mistakenly, reserve single threads
which are not core siblings[1], then the simple check we implemented
is not sufficient.

An easy example can probably outline this better. With this setup:

cores: [(0, 4), (1, 5), (2, 6), (3, 8)] (in parens: thread siblings).
SMT level: 2 (each tuple is 2 elements)
Reserved CPUs: 0,1 (explicit pick using `--reserved-cpus`)

A container then requests 6 cpus. full-pcpus-only check: 6 % 2 == 0. Passed.
The CPU allocator will first take full cores, (2,6) and (3,8), and will
then pick the remaining single CPUs. The allocation will succeed, but
it's incorrect.

We can fix this case with a stricter precheck.
We need to additionally consider all the core siblings of the reserved
CPUs as unavailable when computing the free CPUs, before starting the
actual allocation. Doing so, we fall back to the intended behavior, and
by construction all possible CPU allocations whose size is a multiple
of the SMT level are now correct again.

+++

[1] or thread siblings in the linux parlance, in any case:
hyperthread siblings of the same physical core

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-02 16:00:58 +01:00
Patrick Ohly
dad95e1be6 update lease controller
Passing in a context instead of a stop channel has several advantages:
- ensures that client-go calls return as soon as the controller is asked to stop
- contextual logging can be used

By passing that context down to its own functions and checking it while
waiting, the lease controller also doesn't get stuck in backoffEnsureLease
anymore (https://github.com/kubernetes/kubernetes/issues/116196).
2023-03-02 15:06:00 +01:00
Paco Xu
bea956568f add ip_local_reserved_ports to safe sysctl allow list only if kernel version >= 3.16 2023-03-02 12:40:42 +08:00
ruiwen-zhao
572e6e0ffb Add MaxParallelImagePulls support
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-03-02 03:57:59 +00:00
Kubernetes Prow Robot
53f3583c7f Merge pull request #114785 from TommyStarK/kubelet/replace-deprecated-pointer-function
kubelet: Replace deprecated pointer function
2023-03-01 18:04:55 -08:00
Patrick Ohly
961819a4d0 dependencies: update klog v2.90.1
This improves performance of the text formatting and ktesting.

Because ktesting no longer buffers messages by default, one unit
test needs to ask for that explicitly.
2023-03-01 19:03:50 +01:00
Harshal Patil
412b4b3329 Add connection related metrics to EventedPLEG
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2023-03-01 11:35:27 -05:00
SataQiu
91089ce65b kubelet: remove the deprecated --master-service-namespace flag 2023-03-01 18:44:59 +08:00
Kubernetes Prow Robot
6a25c528bb Merge pull request #115891 from bart0sh/PR103-CRI-add-CDI-devices
DRA: Pass CDI devices with a new CRI field
2023-02-28 14:53:28 -08:00
Kubernetes Prow Robot
18eea58ac2 Merge pull request #115359 from iancoolidge/devel-cpuset
More code-review changes from k/utils cpuset review
2023-02-28 10:55:16 -08:00
Ed Bartosh
5a86895070 DRA: pass CDI devices through CRI CDIDevice field 2023-02-28 19:21:20 +02:00
Paco Xu
ca4022c4da add net.ipv4.ip_local_reserved_ports to safe sysctls
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2023-02-27 19:02:20 +08:00
SataQiu
ed2caf17e0 kubelet: remove unused DockerID type 2023-02-27 16:02:59 +08:00
Chen Wang
7db339dba2 This commit contains the following:
1. Scheduler bug-fix + scheduler-focussed E2E tests
2. Add cgroup v2 support for in-place pod resize
3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes.

Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>
2023-02-24 18:21:21 +00:00
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
Jan Safranek
7bf9991389 Add metric for failed orphan pod cleanup 2023-02-22 18:43:38 +01:00
Jan Safranek
bd73aee9db Add volume reconstruction metrics
Count nr. of volumes that kubelet tried to reconstruct + reconstruction
errors.
2023-02-22 13:01:26 +01:00
HirazawaUi
692e7cd3be delete kubelet unused function 2023-02-21 16:08:02 +08:00
Ian K. Coolidge
d4a1bf83c1 cpuset: Convert Fatalf to Errorf in tests
Use of Fatalf is not appropriate in any of these cases:
None of these failures are prerequisites.
2023-02-21 05:41:16 +00:00
Ian K. Coolidge
b536851fc7 cpuset: Add a few more test cases
Feedback from https://github.com/kubernetes/utils/pull/267 and related
reviews.

* Equality when insertion order is different
* UnsortedList contents
* Not-Subset cases
* Clone coverage
2023-02-21 05:40:54 +00:00
Ian K. Coolidge
22d3f67850 cpuset: Fix Parse() error message for n-k s.t. k<n
This case is tested extensively in cpuset_test.go, but the error message
needs a small adjustment.
2023-02-21 04:51:14 +00:00
Sascha Grunert
58923c9f1a Default to sandbox Seccomp field instead of SeccompProfilePath
The seccomp field has been the default for a couple of releases, which
means we can stop using `SeccompProfilePath`.

Follow-up on https://github.com/kubernetes/kubernetes/pull/96281

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-02-20 12:16:32 +01:00
huyinhou
32495ae3f1 add lock in generate topology hints function 2023-02-20 10:56:53 +08:00
Kubernetes Prow Robot
ffe410bbb4 Merge pull request #115604 from pacoxu/fix-design-proposals-links
old design proposals are now moved to Design Proposals Archive repo
2023-02-16 09:55:38 -08:00
Paco Xu
3d536bd14b API docs: point to current docs instead of archived designs 2023-02-16 15:32:08 +08:00
Kubernetes Prow Robot
e18fa74551 Merge pull request #115590 from swatisehgal/topology-mgr-duration-metrics
node: topology-mgr: Add metric to measure topology manager admission latency
2023-02-15 07:12:25 -08:00
Swati Sehgal
8442b450e5 node: topology-mgr: code optimization
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-02-15 14:04:10 +00:00
Swati Sehgal
bc941633c1 node: topology-mgr: add metric to measure topology mgr admission latency
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-02-15 13:59:47 +00:00
Kubernetes Prow Robot
8f55d34507 Merge pull request #115384 from sourcelliu/allowlist
Add test for pkg/kubelet/sysctl/allowlist_test.go
2023-02-14 12:45:51 -08:00
Kubernetes Prow Robot
5071c4f57e Merge pull request #111982 from cvvz/kubelet-del-unnecessary-code
cleanup: delete useless code from kubelet volumemanager
2023-02-14 10:31:31 -08:00
cyclinder
1bdcd18bf6 close grpc server in test file to avoid goroutine leak
Signed-off-by: cyclinder <kuocyclinder@gmail.com>
2023-02-10 09:51:26 +08:00
Claudiu Belu
5cce4bccad tests: Port kubelet tests to Windows
Ports kubelet/util unit tests to Windows.
2023-02-09 13:50:51 +00:00
Paco Xu
019d2615af archived design proposals are now moved to Design Proposals Archive Repo. 2023-02-08 11:12:22 +08:00
Kubernetes Prow Robot
5437d493da Merge pull request #114364 from bart0sh/PR102-prepare-DRA-resources-before-CNI-setup
kubelet: prepare DRA resources before CNI setup
2023-02-07 08:09:04 -08:00
Kubernetes Prow Robot
22b88dea36 Merge pull request #115315 from enj/enj/i/kas_kubelet_conn_close
kubelet/client: collapse transport wiring onto standard approach
2023-02-07 07:01:14 -08:00
silenceshell
99154b1661 fix concurrent-map-write of FakeOS.Create 2023-02-07 21:37:31 +08:00
Madhav Jivrajani
5e1f440d0a *: Fix linter warnings
Adapt to newly improved linters in golangci-lint v1.51.1

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2023-02-07 13:01:41 +05:30
Monis Khan
754cb3d601 kubelet/client: collapse transport wiring onto standard approach
Signed-off-by: Monis Khan <mok@microsoft.com>
2023-02-06 20:34:49 -05:00
Ed Bartosh
4f88332ab4 kubelet: prepare DRA resources before CNI setup 2023-02-06 20:40:11 +02:00
Kubernetes Prow Robot
d3a62dcb76 Merge pull request #114351 from ruiwen-zhao/event_ignore_nil
[Evented PLEG] Ignore container events with nil PodSandboxStatus
2023-02-02 12:52:42 -08:00