Include number of requested and available CPUs in the error message
when the assignment of CPUs fails because there are less available
CPUs than requested.
Kubelet, if using cloud provider external, initializes temporary
the node addresses using the non-cloud provider logic, until the
cloud provider overrides it.
This behavior has undesired consequences if the cloud-provider addresses
are different than the original ones, specially for hostNetwork pods,
that inherit these addresses from the Node.
Since some cloud-providers depend on this behavior, in order to keep
backward compatibility, assume that the specifying addresses via
the node-ip flags means that the intent is to keep the existing
behavior to temporary initialize the addresses.
If the node-ips are the unspecified addresses or are not set, then
wait for the external cloud provider to set the node addresses.
Change-Id: I3a3895f9b830769f9658e6a03f058c914c438a09
Signed-off-by: Antonio Ojea <aojea@google.com>
Add file doc.go with some rudimentary information to package
kubelet/cm. This will make it easier for people approaching the
kubelet codebase for the first time to quickly understand what's
in the package, since its name is abbreviated and hostile to
newcomers.
This removes deprecated sets.String and sets.Int
- replace sets.String with sets.Set[string]
- replace sets.Int with sets.Set[int]
- replace sets.NewString with sets.New[string]
- replace sets.NewInt with sets.New[int]
- replace sets.(OLD).List with sets.List(NEW)
This change bypasses all logic to set swap in the linux container
resources if a swap controller is not available on node. Failing
to do so may cause errors in runc when starting a container with
a swap configuration -- even if this is set to 0.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change switches to using isCgroup2UnifiedMode locally to ensure
that any mocked function is also used when checking the swap controller
availability.
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Add and use more facilities to the *internal* podresources client.
Checking e2e test runs, we have quite some
```
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused"
```
This is likely caused by kubelet restarts, which we do plenty in e2e tests,
combined with the fact gRPC does lazy connection AND we don't really
check the errors in client code - we just bubble them up.
While it's arguably bad we don't check properly error codes, it's also
true that in the main case, e2e tests, the functions should just never
fail besides few well known cases, we're connecting over a
super-reliable unix domain socket after all.
So, we centralize the fix adding a function (alongside with minor
cleanups) which wants to trigger and ensure the connection happens,
localizing the changes just here. The main advantage is this approach
is opt-in, composable, and doesn't leak gRPC details into the client
code.
Signed-off-by: Francesco Romani <fromani@redhat.com>
As part of this change, the code responsible for managing the sandbox
image within the kubelet has been removed. Previously, the kubelet used
to prevent sandbox image from the garbage collection process. However,
with this update, the responsibility of managing the sandbox containers
has been shifted to the CRI implementation itself. By allowing sandbox
image pinning from CRI, we improve efficiency and simplify the kubelet's
interaction with the container runtime. As a result, the kubelet can now
rely on the container runtime's built-in mechanisms for sandbox container
lifecycle management.
Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>