- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
Client side period validation of certificates should not be
fatal, as local clock skews are not so uncommon. The validation
should be left to the running servers.
- Remove this validation from TryLoadCertFromDisk().
- Add a new function ValidateCertPeriod(), that can be used for this
purpose on demand.
- In phases/certs add a new function CheckCertificatePeriodValidity()
that will print warnings if a certificate does not pass period
validation, and caches certificates that were already checked.
- Use the function in a number of places where certificates
are loaded from disk.
- Ensure the directory is created with 0700 via a new function
called CreateDataDirectory().
- Call this function in the init phases instead of the manual call
to MkdirAll.
- Call this function when joining control-plane nodes with local etcd.
If the directory creation is left to the kubelet via the
static Pod hostPath mounts, it will end up with 0755
which is not desired.
Pinning the kube-controller-manager and kube-scheduler kubeconfig files
to point to the control-plane-endpoint can be problematic during
immutable upgrades if one of these components ends up contacting an N-1
kube-apiserver:
https://kubernetes.io/docs/setup/release/version-skew-policy/#kube-controller-manager-kube-scheduler-and-cloud-controller-manager
For example, the components can send a request for a non-existing API
version.
Instead of using the CPE for these components, use the LocalAPIEndpoint.
This guarantees that the components would talk to the local
kube-apiserver, which should be the same version, unless the user
explicitly patched manifests.
Back in the v1alpha2 days the fuzzer test needed to be disabled. To ensure that
there were no config breaks and everything worked correctly extensive replacement
tests were put in place that functioned as unit tests for the kubeadm config utils
as well.
The fuzzer test has been reenabled for a long time now and there's no need for
these replacements. Hence, over time most of these were disabled, deleted and
refactored. The last remnants are part of the LoadJoinConfigurationFromFile test.
The test data for those old tests remains largely unused today, but it still receives
updates as it contains kubelet's and kube-proxy's component configs. Updates to these
configs are usually done because the maintainers of those need to add a new field.
Hence, to cleanup old code and reduce maintenance burden, the last test that depends
on this test data is finally refactored and cleaned up to represent a simple unit test
of `LoadJoinConfigurationFromFile`.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Add PatchStaticPod() in staticpod/utils.go
Apply patches to static Pods in:
- phases/controlplane/CreateStaticPodFiles()
- phases/etcd/CreateLocalEtcdStaticPodManifestFile() and
CreateStackedEtcdStaticPodManifestFile()
Add unit tests and update Bazel.
Component configs are used by kubeadm upgrade plan at the moment. However, they
can prevent kubeadm upgrade plan from functioning if loading of an unsupported
version of a component config is attempted. For that matter it's best to just
stop loading component configs as part of the kubeadm config load process.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
In slower setups it can take more time for the existing cluster
to be in a healthy state, so the existing backoff of ~50 seconds
is apparently not sufficient.
The client dial can also fail for similar reasons.
Improve kubeadm's join toleration of adding new etcd members.
Wrap both the client dial and member add in a longer backoff
(up to ~200 seconds).
This particular change should be backported to the support skew.
In a future change for master, all etcd client operations should be
make consistent so that the etcd logic is in a sane state.