v1beta4 added the Timeouts struct and an EtcdAPICall timeout
field, but the field was never used in the etcd client calls.
This is a bug, so fix it. Also reduce the timeout from
200 seconds of exponential backoff to a 2 minute
linear default timeout.
With https://github.com/kubernetes/kubernetes/pull/122079,
kubeadm now relies on `ttlSecondsAfterFinished` to clean
up `upgrade-health-check` once its pod reaches a terminal state.
However, if the pod never reaches a terminal state, the job never
registers a terminal state either, so it is never garbage collected.
For example, if the pause image is not present, `ErrImagePull` makes
the pod keep retrying to pull the image, and the pod never reaches a
terminal state on its own. The job then keeps waiting for a terminal
state that never comes.
So we need to set `activeDeadlineSeconds` to prevent the job from
waiting forever for the pod to reach a terminal state.
Without this, users invoking `kubeadm upgrade plan` need to clean up the
job outside of kubeadm, even if they ignore the preflight result, because
the job still runs when the result is configured to be ignored via
the `--ignore-preflight-errors=CreateJob` flag.
Since the timeout for the polling in the `CreateJob` step in kubeadm
is 15 seconds, we should set the `activeDeadlineSeconds` to the same
timeout.
Starting with 1.30, kubeadm join respects the KubeletConfiguration
healthz address and port. Previously it hardcoded the health check
to localhost and the default port.
A corner case was not handled where the user applies --patches
on join to modify the local KubeletConfiguration. In that case the
patches for the kubeletconfiguration patch target are not applied to
the KubeletConfiguration in memory, and the health check
runs against the address:port present in the kubelet-config
ConfigMap.
Fix that by explicitly calling a new function to patch the
KubeletConfiguration in memory. This is scoped to only handle
the healthz checks *after* the kubelet config.yaml was already
patched and written to disk.
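
To illustrate the idea, here is a simplified sketch that overlays a
patch on an in-memory config before building the healthz URL. It is not
the kubeadm patch machinery: `kubeletConfig`, `patchInMemory` and
`healthzURL` are hypothetical names, the struct carries only the two
healthz fields, and the patch is assumed to be a plain JSON object
whose keys overwrite matching fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// kubeletConfig is a reduced stand-in holding only the healthz fields.
type kubeletConfig struct {
	HealthzBindAddress string `json:"healthzBindAddress,omitempty"`
	HealthzPort        int32  `json:"healthzPort,omitempty"`
}

// patchInMemory overlays the patch on cfg. json.Unmarshal only touches
// fields that are present in the patch, so other values are preserved.
func patchInMemory(cfg *kubeletConfig, patch []byte) error {
	if err := json.Unmarshal(patch, cfg); err != nil {
		return fmt.Errorf("applying patch: %w", err)
	}
	return nil
}

// healthzURL builds the endpoint the health check should probe.
func healthzURL(cfg kubeletConfig) string {
	hostPort := net.JoinHostPort(cfg.HealthzBindAddress, strconv.Itoa(int(cfg.HealthzPort)))
	return "http://" + hostPort + "/healthz"
}
```

If the patch is skipped, `healthzURL` keeps pointing at the unpatched
address:port, which is the corner case described above.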
Some unit tests currently fail on Windows for the following
reasons:
- IPVS proxy mode is not supported on Windows.
- pkg/kubelet/cri/remote was moved to cri-client.
v1beta3.ClusterConfiguration.APIServer.TimeoutForControlPlane
must be migrated to
{Init|Join}Configuration.Timeouts.ControlPlaneComponentHealthCheck.
To achieve this sort of cross-Kind migration, do the following:
- Use a temporary, thread-safe variable in timeoututils.go
- Make the order of GVKs in documentMapToInitConfiguration
deterministic.
Prior to v1beta4, kubeadm init flags such as --apiserver-extra-args
used a map[string]string for pflag.Value storage. This no
longer works because in v1beta4 extra args are a slice of Arg.
Add a new flag type argSlice and implement parsing for
these flags.
At the same time, deprecate these flags and show a warning
that users should use config instead.
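
A hedged sketch of what a slice-backed flag value could look like. The
method set (String, Set, Type) matches the pflag.Value interface, but
the type below is an illustration and not the kubeadm argSlice; the
comma-separated `name=value` input format and the local `arg` struct
are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// arg is a local stand-in for a name/value extra argument.
type arg struct {
	Name  string
	Value string
}

// argSlice stores parsed flag values in a caller-owned slice.
type argSlice struct {
	args *[]arg
}

func newArgSlice(target *[]arg) *argSlice {
	return &argSlice{args: target}
}

// String renders the current values as comma-separated name=value pairs.
func (s *argSlice) String() string {
	if s == nil || s.args == nil {
		return ""
	}
	pairs := make([]string, 0, len(*s.args))
	for _, a := range *s.args {
		pairs = append(pairs, a.Name+"="+a.Value)
	}
	return strings.Join(pairs, ",")
}

// Set parses "name=value[,name=value...]" and appends each pair, so
// passing the flag more than once accumulates values.
func (s *argSlice) Set(value string) error {
	for _, pair := range strings.Split(value, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		name, val, found := strings.Cut(pair, "=")
		if !found || name == "" {
			return fmt.Errorf("malformed pair %q, expected name=value", pair)
		}
		*s.args = append(*s.args, arg{Name: name, Value: val})
	}
	return nil
}

// Type names the flag type for help output.
func (s *argSlice) Type() string {
	return "argSlice"
}
```

Unlike a map, the slice preserves order and can hold the same name more
than once.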
When converting from/to v1beta3, keep ExtraEnvs for control plane
components nil instead of defaulting it to an empty slice.
This allows callers to expect a nil value in the internal
config, similarly to ExtraArgs.
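
A small illustration of the nil-preserving behavior, with a
hypothetical `convertEnvs` helper and a local `envVar` type standing in
for the real conversion code:

```go
package main

// envVar is a local stand-in for an environment variable entry.
type envVar struct {
	Name  string
	Value string
}

// convertEnvs copies the slice between versions. A nil input stays nil
// instead of becoming an empty, non-nil slice.
func convertEnvs(in []envVar) []envVar {
	if in == nil {
		return nil
	}
	out := make([]envVar, len(in))
	copy(out, in)
	return out
}
```

The distinction matters for deep equality checks in round-trip tests,
where a nil slice and an empty slice do not compare as equal.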