Commit Graph

1176 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
68f808e6db Merge pull request #111371 from sivchari/improve-naming
feat: improve naming
2022-12-14 02:23:37 -08:00
Kubernetes Prow Robot
7754f007d6 Merge pull request #114169 from jpbetz/improve-kubelet-flag-errors
Improve error messages of flags that parse quantities and percentages
2022-12-10 06:05:11 -08:00
Kubernetes Prow Robot
a668924cb6 Merge pull request #113255 from claudiubelu/path-filepath-update-kubelet
Replaces path.Operation with filepath.Operation (kubelet)
2022-12-09 22:27:41 -08:00
Joe Betz
ab3c353227 Improve error messages for parse errors of --kube-reserved, --system-reserved and --qos-reserved 2022-11-28 16:35:26 -05:00
Ed Bartosh
abcb56defb kubelet: do not enter termination status if pod might need to unprepare resources 2022-11-11 21:58:03 +01:00
Ed Bartosh
ae0f38437c kubelet: add support for dynamic resource allocation
Dependencies need to be updated to use
github.com/container-orchestrated-devices/container-device-interface.

It's not decided yet whether we will implement Topology support
for DRA or not. Not having any toppology-related code
will help to avoid wrong impression that DRA is used as a hint
provider for the Topology Manager.
2022-11-11 21:58:03 +01:00
PiotrProkop
540b5bd308 [topologymanager] rely on Cadvisor to calculate NUMA distance
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-09 17:52:14 +01:00
PiotrProkop
315f0dc6f1 Fix discovering numa distance when node ids are not starting from 0 or their ids are not sequential
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-09 17:52:08 +01:00
Claudiu Belu
b9bf3e5c49 Replaces path.Operation with filepath.Operation (kubelet)
The path module has a few different functions:
Clean, Split, Join, Ext, Dir, Base, IsAbs. These functions do not
take into account the OS-specific path separator, meaning that they
won't behave as intended on Windows.

For example, Dir is supposed to return all but the last element of the
path. For the path "C:\some\dir\somewhere", it is supposed to return
"C:\some\dir\", however, it returns ".".

Instead of these functions, the ones in filepath should be used instead.
2022-11-08 16:05:48 +00:00
Kubernetes Prow Robot
243ba086e7 Merge pull request #112914 from PiotrProkop/topology-manager-policies-flag
node: topologymanager:  Improved multi-numa alignment in Topology Manager
2022-11-07 16:00:51 -08:00
David Ashpole
64af1adace Second attempt: Plumb context to Kubelet CRI calls (#113591)
* plumb context from CRI calls through kubelet

* clean up extra timeouts

* try fixing incorrectly cancelled context
2022-11-05 06:02:13 -07:00
PiotrProkop
75bb437a6b Improved multi-numa alignment in Topology Manager: implement closest numa policy
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-03 10:45:25 +01:00
PiotrProkop
d5dd42dfac Improved multi-numa alignment in Topology Manager: introduce TopologyManagerOptions
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-03 10:45:21 +01:00
PiotrProkop
58ef3f202a Improved multi-numa alignment in Topology Manager: add NUMAInfo
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
2022-11-03 10:45:09 +01:00
Kubernetes Prow Robot
6754265580 Merge pull request #109757 from STRRL/enriching-unit-test-for-container-manager
Add testcases for pkg/kubelet/cm/pod_container_manager_linux.go
2022-11-02 23:45:35 -07:00
Kubernetes Prow Robot
433787d25b Merge pull request #113018 from fromanirh/cpumanager-ga-features
node: kubelet: cpumgr: CPU Manager to GA
2022-11-02 14:41:01 -07:00
Kubernetes Prow Robot
25dc4c4f32 Merge pull request #112980 from swatisehgal/devicemanager-ga-graduation
node: devicemgr: Graduate Kubelet DeviceManager to GA
2022-11-02 13:17:01 -07:00
Francesco Romani
a6b928d90c kubelet: cpumgr: internal variable trivial rename
CPUManager is going GA, thus it makes little sense
to keep the names of the internal configuration
variables `Experimental*`.

Trivial rename only.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-11-02 18:41:42 +01:00
Francesco Romani
5e12338a22 node: cpumgr: address golint complains
Add docstrings and trivial fixes.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-11-02 18:41:42 +01:00
Francesco Romani
ff44dc1932 cpumanager: the FG is locked to default (ON)
hence we can remove the if() guards, the feature
is always available.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-11-02 18:41:41 +01:00
Antonio Ojea
9c2b333925 Revert "plumb context from CRI calls through kubelet"
This reverts commit f43b4f1b95.
2022-11-02 13:37:23 +00:00
Swati Sehgal
40741681a2 node: devicemgr: Address warnings from golint
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-11-02 11:05:20 +00:00
Swati Sehgal
8b29eded52 node: devicemgr: Remove devicePluginEnabled field from container mgr
With graduation of device plugins to GA in 1.26, the feature gate is
enabled by default so `devicePluginEnabled` field no longer needs to
be passed at the time of Container Manager creation.

In addition to that, we remove the `ManagerStub` as it is no longer
needed.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-11-02 11:05:20 +00:00
Swati Sehgal
752fa093e0 node: devicemgr: GA graduation implies Feature Gate is ON by default
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-11-02 11:05:20 +00:00
Kubernetes Prow Robot
7b84436168 Merge pull request #113408 from dashpole/kubelet_context
Plumb context to Kubelet CRI calls
2022-11-01 19:59:08 -07:00
Kubernetes Prow Robot
1a41cb8985 Merge pull request #113021 from rphillips/fixes/112936
kubelet: fix nil crash in allocateRemainingFrom
2022-11-01 10:46:45 -07:00
Kubernetes Prow Robot
4c657e5014 Merge pull request #110403 from claudiubelu/unittests-3
unittests: Fixes unit tests for Windows (part 3)
2022-10-31 15:52:44 -07:00
Kubernetes Prow Robot
d0e86111ef Merge pull request #112855 from fromanirh/cpumanager-metrics
node: metrics: cpumanager: add metrics about pinning
2022-10-31 03:12:56 -07:00
Kubernetes Prow Robot
9702161caa Merge pull request #112597 from mythi/grpc-authority
grpc: set localhost Authority to unix client calls
2022-10-31 03:12:45 -07:00
David Ashpole
f43b4f1b95 plumb context from CRI calls through kubelet 2022-10-28 02:55:28 +00:00
Kubernetes Prow Robot
ab4907d2f4 Merge pull request #112913 from Garrybest/pr_cpumanager
fix GetAllocatableCPUs in cpumanager
2022-10-27 07:20:33 -07:00
Francesco Romani
47d3299781 node: metrics: cpumanager: add pinning metrics
In order to improve the observability of the cpumanager,
add and populate metrics to track if the combination of
the kubelet configuration and podspec would trigger
exclusive core allocation and pinning.

We should avoid leaking any node/machine specific information
(e.g. core ids, even though this is admittedly an extreme example);
tracking these metrics seems to be a good first step, because
it allows us to get feedback without exposing details.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-10-27 14:40:40 +02:00
Garrybest
95eb5670cf add GetAllocatableCPUs test in cpumanager
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-10-27 19:57:12 +08:00
Garrybest
d446f5f90e fix GetAllocatableCPUs in cpumanager
Signed-off-by: Garrybest <garrybest@foxmail.com>
2022-10-27 19:57:06 +08:00
Kubernetes Prow Robot
244c035b87 Merge pull request #110263 from claudiubelu/unittests
unittests: Fixes unit tests for Windows
2022-10-25 14:50:34 -07:00
Claudiu Belu
6f2eeed2e8 unittests: Fixes unit tests for Windows
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
  consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
  the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field was containing
  OS-specific fields. This is now moved, the field is now set during
  Kubelet's initialization, after the config file is read.
2022-10-25 23:46:56 +03:00
Claudiu Belu
9f95b7b18c unittests: Fixes unit tests for Windows (part 3)
Currently, there are some unit tests that are failing on Windows due to
various reasons:

- paths not properly joined (filepath.Join should be used).
- Proxy Mode IPVS not supported on Windows.
- DeadlineExceeded can occur when trying to read data from an UDP
  socket. This can be used to detect whether the port was closed or not.
- In Windows, with long file name support enabled, file names can have
  up to 32,767 characters. In this case, the error
  windows.ERROR_FILENAME_EXCED_RANGE will be encountered instead.
- files not closed, which means that they cannot be removed / renamed.
- time.Now() is not as precise on Windows, which means that 2
  consecutive calls may return the same timestamp.
- path.Base() will return the same path. filepath.Base() should be used
  instead.
- path.Join() will always join the paths with a / instead of the OS
  specific separator. filepath.Join() should be used instead.
2022-10-21 19:25:48 +03:00
Kubernetes Prow Robot
bf14677914 Merge pull request #112546 from oscr/the-the
grammar: replace all occurrences of "the the" with "the"
2022-10-19 10:03:02 -07:00
chaunceyjiang
d2b372e029 Remove redundant type conversion
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2022-10-18 14:37:40 +08:00
Oscar Utbult
e4f776f230 grammar: replace all occurrences of "the the" with "the" 2022-10-14 09:03:14 +02:00
Ryan Phillips
2514486d80 kubelet: fix nil crash in allocateRemainingFrom 2022-10-12 12:51:17 -05:00
jesse.tang
759e043136 Optimize: file /cpuset slice make cap (#112270) 2022-09-30 16:56:25 -07:00
Mikko Ylinen
fbcdf48bb8 grpc: set localhost Authority to unix client calls
Several reports exist (both with device plugins and CSI) that
kubelet w/ grpc-go sends invalid Authority header and some non
grpc-go servers reject these unix domain socket client connections.

grpc-go sets the Authority header correct when the dial address
is in a format where the its address scheme can be determined.

Instead of making changes to get the all server addresses to unix://
prefixed format, set grpc.WithAuthority("localhost") client connection
override to get the same result.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-09-20 13:15:36 +03:00
Kubernetes Prow Robot
127f33f63d Merge pull request #111221 from inosato/remove-ioutil-from-kubelet
Remove ioutil in kubelet/kubeadm and its tests
2022-09-17 21:56:28 -07:00
Kubernetes Prow Robot
c45ca46cdb Merge pull request #112387 from mythi/kubelet-devicemanager-topologyinfo
devicemanager: do not leak empty TopologyInfo to TopologyManager
2022-09-14 07:17:00 -07:00
Mikko Ylinen
68bb0935bd devicemanager: do not leak empty TopologyInfo to TopologyManager
Device Plugins that wish to leverage the Topology Manager can send back a populated
TopologyInfo struct as part of the device registration, along with the device IDs
and the health of the device. TopologyInfo is converted to TopologyHints and
used by TopologyManager to find the optimal/desired resource allocation for a Pod.

If a plugin sends an empty but non-nil instance of TopologyInfo for a resource,
devicemanager passes it on as an empty instance of TopologyHint which is
currently interpreted as "Hint Provider has no possible NUMA affinities
for resource" which further means that pods requesting that resource will fail.

To not block device resources that pass TopologyInfo{Nodes:[]*NUMANode{}} from being
used, interprete that as nil set of hints and not a []TopologyHint{}.

Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
2022-09-14 16:13:31 +03:00
Dmitry Verkhoturov
d0f9e6dc36 clarify CPUCFSQuotaPeriod values, set the minimum to 1ms
cpu.cfs_period_us is measured in microseconds in the kernel but
provided in time.Duration by the user, that change clarifies the code
to make this evident to the reader.

Also, the minimum value for that feature is 1ms and not 1μs, and this
change alters the validation to reject values smaller than 1ms.
2022-09-08 23:29:13 +02:00
sivchari
c62a7cdb32 fix: test 2022-08-26 01:25:44 +09:00
sivchari
12d49b6bfb fix: rename 2022-08-26 00:44:31 +09:00
Kubernetes Prow Robot
bc9f48b841 Merge pull request #112024 from cndoit18/remove-redundant-judgment
style: remove redundant judgment
2022-08-25 07:28:18 -07:00