Commit Graph

515 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
28fa3cbbf1
Merge pull request #115847 from moshe010/pod-resource-api-dra-upstream
Extend the PodResources API to include resources allocated by DRA
2023-03-14 14:12:26 -07:00
Moshe Levi
67a71c0bd7 kubelet podresources: add unit tests for DynamicResource and Get method
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
2a568bcfc8 kubelet podresources: extend List to support Dynamic Resources and implement Get API
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Moshe Levi
9c57613912 Add ClassName to checkpoint state and in-memory cache
Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-03-14 19:33:04 +02:00
Francesco Romani
5e03998991 kubelet: podresources: pack parameters in a struct
To enable rate limiting, needed for GA graduation,
we need to pass more parameters to the already crowded
`ListenAndServePodresources` function.

To tidy up a bit, pack the parameters in a helper struct,
with no intended changes in behavior.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-14 19:33:01 +02:00
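
The refactoring described in the commit above (replacing a long parameter list with a helper struct) can be sketched as follows. This is only an illustration: `serverOptions`, its fields, and `describeServer` are hypothetical names, not the actual kubelet types or the real signature of `ListenAndServePodresources`.

```go
package main

import "fmt"

// serverOptions is a hypothetical helper struct that groups what would
// otherwise be separate positional arguments. New settings (for example
// rate-limit values) become new fields instead of new parameters.
type serverOptions struct {
	Endpoint string
	QPS      int
	Burst    int
}

// describeServer stands in for a server entry point that takes the struct.
// It only formats the options; no server is started in this sketch.
func describeServer(opts serverOptions) string {
	return fmt.Sprintf("serving on %s (qps=%d, burst=%d)", opts.Endpoint, opts.QPS, opts.Burst)
}

func main() {
	opts := serverOptions{Endpoint: "unix:///tmp/example.sock", QPS: 100, Burst: 10}
	fmt.Println(describeServer(opts))
}
```

Named fields keep call sites readable, and adding a field does not break existing callers.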
Aravindh Puthiyaparambil
26279a5282
kubelet: Add validation for EnableNodeLogQuery 2023-03-14 08:45:20 -07:00
Aravindh Puthiyaparambil
aadad09410
api: Add EnableNodeLogQuery to KubeletConfiguration
Added EnableNodeLogQuery field to kubelet/apis/config/types.go and
staging/src/k8s.io/kubelet/config/v1beta1/types.go, then executed
`hack/update-codegen.sh`.

This new field will default to off and will need to be explicitly
enabled in addition to the NodeLogQuery gate to use the feature.
2023-03-14 08:45:19 -07:00
Francesco Romani
b837a0c1ff kubelet: podresources: DOS prevention with builtin ratelimit
Implement DOS prevention by wiring in a global rate limit for the podresources
API. The goal here is not to introduce a general ratelimiting solution
for the kubelet (we need more research and discussion to get there),
but rather to prevent misuse of the API.

Known limitations:
- the rate limit values (QPS, BurstTokens) are hardcoded to
  "high enough" values.
  Enabling user-configuration would require more discussion
  and sweeping changes to the other kubelet endpoints, so it
  is postponed for now.
- the rate limiting is global. Malicious clients can starve other
  clients by consuming the QPS quota.

Add e2e test to exercise the flow, because the wiring itself
is mostly boilerplate and API adaptation.
2023-03-11 08:00:54 +01:00
Kubernetes Prow Robot
625b8be09e
Merge pull request #115371 from pacoxu/cgroup-v2-memory-tuning
default memoryThrottlingFactor to 0.9 and optimize the memory.high formulas
2023-03-08 18:46:00 -08:00
Kubernetes Prow Robot
8d5c96fed2
Merge pull request #116093 from swatisehgal/topologymanager-ga-graduation
node: topologymgr: Graduate Kubelet Topology Manager to GA
2023-03-08 16:56:06 -08:00
Paco Xu
f368413d65 sync default qps of kubelet change 2023-03-08 14:04:51 +08:00
Swati Sehgal
ae964a493f node: topologymgr: remove comments with feature gate references
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-07 09:42:54 +00:00
Swati Sehgal
d536a342b4 node: topologymgr: GA graduation implies Feature Gate is ON by default
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:51:05 +00:00
Wojciech Tyczyński
280651abcc Autogenerated 2023-03-06 12:08:34 +01:00
Wojciech Tyczyński
760acbbbe3 Bump QPS limits for Kubelet 2023-03-06 12:07:52 +01:00
Paco Xu
7dab6253e1 default memoryThrottlingFactor to 0.9 and optimize the memory.high calculation formulas 2023-03-03 11:24:40 +08:00
ruiwen-zhao
572e6e0ffb Add MaxParallelImagePulls support
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-03-02 03:57:59 +00:00
Kubernetes Prow Robot
53f3583c7f
Merge pull request #114785 from TommyStarK/kubelet/replace-deprecated-pointer-function
kubelet: Replace deprecated pointer function
2023-03-01 18:04:55 -08:00
Paco Xu
3d536bd14b API docs: point to current docs instead of archived designs 2023-02-16 15:32:08 +08:00
Paco Xu
019d2615af archived design proposals are now moved to Design Proposals Archive Repo. 2023-02-08 11:12:22 +08:00
songxiao-wang87
3e6b954290 Making a run test.
Signed-off-by: songxiao-wang87 <wang.xiaosong23@zte.com.cn>
2023-01-31 09:38:48 +00:00
TommyStarK
1fcc8fbf59 kubelet: Replace deprecated pointer function
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-01-08 13:44:09 +01:00
Moshe Levi
ce46ba7be8 kubelet podresource: fix GetAllocatableResources metrics
GetAllocatableResources increments PodResourcesEndpointRequestsTotalCount twice.
This PR fixes that.

Signed-off-by: Moshe Levi <moshele@nvidia.com>
2023-01-04 10:58:55 +02:00
Paco Xu
f28f40e521 remove a flag check that was introduced in #112542; address several comments
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-12-13 14:00:29 +08:00
Aditi Sharma
214a0ee7b8 Migrate container runtime endpoint flag to config
Signed-off-by: Aditi Sharma <adi.sky17@gmail.com>
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2022-12-13 14:00:29 +08:00
PiotrProkop
daee219210 Improved multi-numa alignment in Topology Manager: add topology-manager-policy-options flag in Kubelet
This patch adds a new Kubelet option, topologyManagerPolicyOptions.
To introduce new TopologyManager options, we first need to introduce a new
flag called `topology-manager-policy-options` to allow users to modify
the behaviour of the best-effort and restricted policies.

Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-03 09:45:33 +01:00
Kubernetes Prow Robot
244c035b87
Merge pull request #110263 from claudiubelu/unittests
unittests: Fixes unit tests for Windows
2022-10-25 14:50:34 -07:00
Claudiu Belu
6f2eeed2e8 unittests: Fixes unit tests for Windows
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
  consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
  the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field contained
  OS-specific fields. The default has been moved: the field is now set
  during Kubelet's initialization, after the config file is read.
2022-10-25 23:46:56 +03:00
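
Several of the portability points listed in the commit above (joining paths with `filepath.Join`, using the OS-specific temp directory instead of `/tmp`, and closing files before renaming or removing them) can be sketched in a few lines. The helper `writeThenRename` is a hypothetical name made up for this illustration and is not taken from the Kubernetes tests.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeThenRename writes data to a file inside a fresh temporary directory,
// renames the file, and returns the base name of the renamed file.
func writeThenRename(name string, data []byte) (string, error) {
	// An empty first argument makes MkdirTemp use the OS-specific temp
	// directory (os.TempDir) rather than a hardcoded /tmp.
	dir, err := os.MkdirTemp("", "example-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	// filepath.Join uses the separator of the host OS.
	path := filepath.Join(dir, name)
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return "", err
	}
	// Close before renaming: an open file cannot be renamed or removed
	// on Windows.
	if err := f.Close(); err != nil {
		return "", err
	}

	renamed := filepath.Join(dir, name+".bak")
	if err := os.Rename(path, renamed); err != nil {
		return "", err
	}
	return filepath.Base(renamed), nil
}

func main() {
	name, err := writeThenRename("data.txt", []byte("hello"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```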
Kubernetes Prow Robot
6f579d3ceb
Merge pull request #111616 from ndixita/credential-api-ga
Move the Kubelet Credential Provider feature to GA and Update the Credential Provider API to GA
2022-10-15 07:53:09 -07:00
Monis Khan
b738be9b46
Use https links for k8s KEPs, issues, PRs, etc
Signed-off-by: Monis Khan <mok@microsoft.com>
2022-09-23 23:36:24 +00:00
Dixita Narang
9c3cb6e66d Fixing boilerplate header 2022-09-16 21:20:30 +00:00
Dixita Narang
4cc741955c Adding default values for v1 credential provider config 2022-09-09 06:11:15 +00:00
Dixita Narang
977a8ebb3a Renaming usage of v1beta1 to v1, and adding API violation exceptions and
vendor module for v1
2022-09-09 06:11:06 +00:00
Dmitry Verkhoturov
d0f9e6dc36 clarify CPUCFSQuotaPeriod values, set the minimum to 1ms
cpu.cfs_period_us is measured in microseconds in the kernel but
provided as a time.Duration by the user; this change clarifies the code
to make that evident to the reader.

Also, the minimum value for that feature is 1ms and not 1μs, and this
change alters the validation to reject values smaller than 1ms.
2022-09-08 23:29:13 +02:00
Antonio Ojea
d434c588d7 Revert "change CPUCFSQuotaPeriod default value to 100us to match Linux default"
This reverts commit f2d591fae6.
2022-08-26 23:51:04 +02:00
Kubernetes Prow Robot
2b5475b3fa
Merge pull request #111554 from paskal/paskal/clarify_default_cfs_period
Clarify cpu.cfs_period_us default value
2022-08-25 07:28:07 -07:00
Kubernetes Prow Robot
70254065ea
Merge pull request #109966 from zhangxyjlu/config_validation_test
Add validation test for features.GracefulNodeShutdownBasedOnPodPriority
2022-08-24 00:02:24 -07:00
Dmitry Verkhoturov
f2d591fae6 change CPUCFSQuotaPeriod default value to 100us to match Linux default
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management

The desired effect of that change is to match
k8s default `CPUCFSQuotaPeriod` value (100ms before that change)
with one used in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled
and Linux CFS (100us, 1000x smaller than 100ms).
2022-08-10 03:25:05 +02:00
jinxu
0064010cdd Promote Local storage capacity isolation feature to GA
This change promotes the local storage capacity isolation feature to GA.

At the same time, to allow rootless systems to disable this feature
because they are unable to get the root fs, this change introduces a new
kubelet config "localStorageCapacityIsolation". By default it is set to
true. Rootless systems can set this configuration to false to disable
the feature. Once it is set to false, users cannot set ephemeral-storage
request/limit because capacity and allocatable will not be set.

Change-Id: I48a52e737c6a09e9131454db6ad31247b56c000a
2022-08-02 23:45:48 -07:00
zhangxiaoyang
7375ba4e27 add validation test for features.GracefulNodeShutdownBasedOnPodPriority 2022-08-03 14:43:00 +08:00
Kubernetes Prow Robot
d40bc18461
Merge pull request #105126 from sallyom/tracing-kubelet
kubelet tracing instrumentation
2022-08-02 11:38:06 -07:00
Dmitry Verkhoturov
32df800ba7 change CPUCFSQuotaPeriod default value to 100us to match Linux default
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management

The desired effect of that change is to match
k8s default `CPUCFSQuotaPeriod` value (100ms before that change)
with one used in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled
and Linux CFS (100us, 1000x smaller than 100ms).
2022-08-02 09:55:50 +02:00
Kubernetes Prow Robot
ea21947641
Merge pull request #111426 from ping035627/k8s-220726
Update design-proposals URL
2022-08-01 23:50:30 -07:00
PingWang
473be65a3c Update design-proposals URL
Signed-off-by: PingWang <wang.ping5@zte.com.cn>

update url

Signed-off-by: PingWang <wang.ping5@zte.com.cn>
2022-08-02 09:13:38 +08:00
Sally O'Malley
5b4456ceea
kubelet tracing: generated files
Signed-off-by: Sally O'Malley <somalley@redhat.com>
2022-08-01 12:55:14 -04:00
Sally O'Malley
47e7d8034f
kubelet tracing
Signed-off-by: Sally O'Malley <somalley@redhat.com>
Co-authored-by: David Ashpole <dashpole@google.com>
2022-08-01 12:55:02 -04:00
Kubernetes Prow Robot
bebea5f950
Merge pull request #111152 from sivchari/fix-refer-url
fix: refer to url of Node Allocatable
2022-07-31 20:32:39 -07:00
Dmitry Verkhoturov
5126192548 clarify cpu.cfs_period_us default value
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management

The desired effect of that change is more clarity on the default value
so users would be aware that the 10ms custom value would be
not 0.1x of the default, but 100x of it.
2022-07-29 23:02:35 +02:00
Kubernetes Prow Robot
631a5a849a
Merge pull request #109778 from mythi/grpc-go-update
grpc: move to use grpc.WithTransportCredentials()
2022-07-26 12:45:09 -07:00
sivchari
3db9e1c64c fix: refer to url of Node Allocatable 2022-07-15 00:54:33 +09:00