Instead of erroring during the preflight check 'CreateJob'
from "upgrade" commands when there are no schedulable nodes,
show a warning.
This can happen in single node clusters.
Also increase the Job TTL after completion to 20 seconds
to make sure it's more than the timeout what waits
for the Job to complete.
Follow the same process of adding the Timeouts struct
to UpgradeConfiguration similarly to how it was done for
other API Kinds.
In the Timeouts struct include one new timeout:
- UpgradeManifests
WaitForAllControlPlaneComponents is a new feature gate
that can be used to tell kubeadm to wait for all control plane
components and not only kube-apiserver.
- Add the Waiter function WaitForControlPlaneComponents
that waits for all CP components in parallel. Uses the regular
healthz endpoint for checks of status 200.
- Add a new experimental phase to kubeadm join called "wait-control-plane".
A similar phase exists for kubeadm init.
The function wait.go#WaitForKubeletAndFunc() has been used in
a number of places in kubeadm. It starts a go routine to wait for
the kubelet /healthz and in parallel starts another go routine
to wait for an custom function.
This logic is problematic. If kubeadm is waiting for the kubelet
in parallel with something that requires the kubelet, the right
solution would be to first wait for the kubelet in serial and only
then proceed with the other action. The parallelism here particularly
during "init" required a unwanted "initial timeout" of 40s, before
the kubelet waiting even starts. In most cases, this makes the kubelet
waiter to not even start, while the main point of waiting becomes
the "other action".
- Remove the function WaitForKubeletAndFunc() from the Waiter interface.
- Rename the function WaitForHealthyKubelet() to just WaitForKubelet()
to be consistent with the naming WaitForAPI().
- Update WaitForKubelet() to not use TryRunCommand() and instead
use PollUntilContextTimeout().
- Remove the "initial timeout" of 40s in WaitForKubelet().
- Make both WaitForKubelet() and WaitForAPI() use similar error
handling and output.
- Update all usage of WaitForKubelet() to be a serial call before
any other action, such as another wait* call.
- Make the default wait timeout for the kubelet
/healthz to be 1 minute (kubeadmconstants.DefaultKubeletTimeout).
- Apply updates to all implementations of the Waiter interface.
- Register the new file in /certs/renewal, so that the
file is renewed if present. If not present the common message "MISSING"
is shown. Same for other certs/kubeconfig files.
- In /kubeconfig, update the spec for admin.conf to use
the "kubeadm:cluster-admins" Group. A new spec is added for
the "super-admin.conf" file that uses the "system:masters" Group.
- Add a new function EnsureAdminClusterRoleBinding() that includes
logic to ensure that admin.conf contains a User that is properly
bound on the "cluster-admin" built-in ClusterRole. This requires
bootstrapping using the "system:masters" containing "super-admin.conf".
Add detailed unit tests for this new logic.
- In /upgrade#PerformPostUpgradeTasks() add logic to create the
"admin.conf" and "super-admin.conf" with the new, updated specs.
Add detailed unit tests for this new logic.
- In /upgrade#StaticPodControlPlane() ensure that renewal of
"super-admin.conf" is performed if the file exists.
Update unit tests.
Back up kubelet config file for `kubeadm upgrade apply`, some code
refactoring is done to de-dup some redundant code logic.
Signed-off-by: Dave Chen <dave.chen@arm.com>