CPUManager is going GA, thus it makes little sense
to keep the names of the internal configuration
variables `Experimental*`.
Trivial rename only.
Signed-off-by: Francesco Romani <fromani@redhat.com>
With graduation of device plugins to GA in 1.26, the feature gate is
enabled by default so `devicePluginEnabled` field no longer needs to
be passed at the time of Container Manager creation.
In addition to that, we remove the `ManagerStub` as it is no longer
needed.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
In order to improve the observability of the cpumanager,
add and populate metrics to track if the combination of
the kubelet configuration and podspec would trigger
exclusive core allocation and pinning.
We should avoid leaking any node/machine specific information
(e.g. core ids, even though this is admittedly an extreme example);
tracking these metrics seems to be a good first step, because
it allows us to get feedback without exposing details.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Currently, there are some unit tests that are failing on Windows due to
various reasons:
- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field was containing
OS-specific fields. This is now moved, the field is now set during
Kubelet's initialization, after the config file is read.
Currently, there are some unit tests that are failing on Windows due to
various reasons:
- paths not properly joined (filepath.Join should be used).
- Proxy Mode IPVS not supported on Windows.
- DeadlineExceeded can occur when trying to read data from an UDP
socket. This can be used to detect whether the port was closed or not.
- In Windows, with long file name support enabled, file names can have
up to 32,767 characters. In this case, the error
windows.ERROR_FILENAME_EXCED_RANGE will be encountered instead.
- files not closed, which means that they cannot be removed / renamed.
- time.Now() is not as precise on Windows, which means that 2
consecutive calls may return the same timestamp.
- path.Base() will return the same path. filepath.Base() should be used
instead.
- path.Join() will always join the paths with a / instead of the OS
specific separator. filepath.Join() should be used instead.
Several reports exist (both with device plugins and CSI) that
kubelet w/ grpc-go sends invalid Authority header and some non
grpc-go servers reject these unix domain socket client connections.
grpc-go sets the Authority header correct when the dial address
is in a format where the its address scheme can be determined.
Instead of making changes to get the all server addresses to unix://
prefixed format, set grpc.WithAuthority("localhost") client connection
override to get the same result.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Device Plugins that wish to leverage the Topology Manager can send back a populated
TopologyInfo struct as part of the device registration, along with the device IDs
and the health of the device. TopologyInfo is converted to TopologyHints and
used by TopologyManager to find the optimal/desired resource allocation for a Pod.
If a plugin sends an empty but non-nil instance of TopologyInfo for a resource,
devicemanager passes it on as an empty instance of TopologyHint which is
currently interpreted as "Hint Provider has no possible NUMA affinities
for resource" which further means that pods requesting that resource will fail.
To not block device resources that pass TopologyInfo{Nodes:[]*NUMANode{}} from being
used, interprete that as nil set of hints and not a []TopologyHint{}.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
cpu.cfs_period_us is measured in microseconds in the kernel but
provided in time.Duration by the user, that change clarifies the code
to make this evident to the reader.
Also, the minimum value for that feature is 1ms and not 1μs, and this
change alters the validation to reject values smaller than 1ms.
This change is to promote local storage capacity isolation feature to GA
At the same time, to allow rootless system disable this feature due to
unable to get root fs, this change introduced a new kubelet config
"localStorageCapacityIsolation". By default it is set to true. For
rootless systems, they can set this configuration to false to disable
the feature. Once it is set, user cannot set ephemeral-storage
request/limit because capacity and allocatable will not be set.
Change-Id: I48a52e737c6a09e9131454db6ad31247b56c000a
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of that change is more clarity on the default value
so users would be aware that the 10ms custom value would be
not 0.1x of the default, but 100x of it.
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>