Device Plugins that wish to leverage the Topology Manager can send back a populated
TopologyInfo struct as part of the device registration, along with the device IDs
and the health of the device. TopologyInfo is converted to TopologyHints and
used by TopologyManager to find the optimal/desired resource allocation for a Pod.
If a plugin sends an empty but non-nil instance of TopologyInfo for a resource,
devicemanager passes it on as an empty instance of TopologyHint which is
currently interpreted as "Hint Provider has no possible NUMA affinities
for resource" which further means that pods requesting that resource will fail.
To not block device resources that pass TopologyInfo{Nodes:[]*NUMANode{}} from being
used, interprete that as nil set of hints and not a []TopologyHint{}.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
cpu.cfs_period_us is measured in microseconds in the kernel but
provided in time.Duration by the user, that change clarifies the code
to make this evident to the reader.
Also, the minimum value for that feature is 1ms and not 1μs, and this
change alters the validation to reject values smaller than 1ms.
This change is to promote local storage capacity isolation feature to GA
At the same time, to allow rootless system disable this feature due to
unable to get root fs, this change introduced a new kubelet config
"localStorageCapacityIsolation". By default it is set to true. For
rootless systems, they can set this configuration to false to disable
the feature. Once it is set, user cannot set ephemeral-storage
request/limit because capacity and allocatable will not be set.
Change-Id: I48a52e737c6a09e9131454db6ad31247b56c000a
cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of that change is more clarity on the default value
so users would be aware that the 10ms custom value would be
not 0.1x of the default, but 100x of it.
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
v1.43.0 marked grpc.WithInsecure() deprecated so this commit moves to use
what is the recommended replacement:
grpc.WithTransportCredentials(insecure.NewCredentials())
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
This is the first step towards being able to support a new plugin API version
in parallel with the existing one.
Signed-off-by: Kevin Klues <kklues@nvidia.com>
The changes (mostly in pkg/kubelet/cm) are there to adopt changed
runc 1.1 API, and simplify things a bit. In particular:
1. simplify cgroup manager instantiation, using a new, easier way of
libcontainers/cgroups/manager.New;
2. replace libcontainerAdapter with a boolean variable (all it did
was passing on whether systemd manager should be used);
3. trivial change due to removed cgroupfs.HugePageSizes and added
cgroups.HugePageSizes();
4. do not calculate cgroup paths in update / destroy, since libcontainer
cgroup managers now calculate the paths upon creation (previously,
they were doing that only in Apply, so using e.g. Set or Destroy right
after creation was impossible without specifying paths).
We currently still calculate cgroup paths in Exists -- this is to be
addressed separately.
Co-Authored-By: Elana Hashman <ehashman@redhat.com>
Previously, callers of `Exists()` would not know why the cGroup was or
was not existing. In one call-site in particular, the `kubelet` would
entirely fail to start if the cGroup validation did not succeed. In
these cases we MUST explain what went wrong and pass that information
clearly to the caller. Previously, some but not all of the reasons for
invalidation were logged at a low log-level instead. This led to poor
UX.
The original method was retained on the interface so as to make this
diff small.
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Instead of doing (almost) the same thing from the three different
methods (Create, Update, Destroy), move the functionality to
libctCgroupConfig, replacing updateSystemdCgroupInfo.
The needResources bool is needed because we do not need resources
during Destroy, so we skip the unneeded resource conversion.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 79be8be10e made hugetlb settings optional if cgroup v2 is used and
hugetlb is not available, fixing issue 92933. Note at that time this was only
needed for v2, because for v1 the resources were set one-by-one, and only for
supported resources.
Commit d312ef7eb6 switched the code to using Set from runc/libcontainer
cgroups manager, and expanded the check to cgroup v1 as well.
Move this check earlier, to inside m.toResources, so instead of
converting all hugetlb resources from ResourceConfig to libcontainers's
Resources.HugetlbLimit, and then setting it to nil, we can skip the
conversion entirely if hugetlb is not supported, thus not doing the work
that is not needed.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit ecd6361f added setting PidsLimit to Create and Update.
Commit bce9d5f2 added setting PidsLimit to m.toResources.
Now, PidsLimit is assigned twice.
Remove the duplicate.
Fixes: bce9d5f2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
There's no need to call m.Update (which will create another instance of
libcontainer cgroup manager, convert all the resources and then set
them). All this is already done here, except for Set().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>