Originally raised as an issue with invalid versions to plan, but it has
been determined with air gapped environments and development versions it
is not possible to fully address that issue.
But one thing that was identified was that we can do a better job in how
we output the upgrade plan information. Kubeadm outputs the requested
version as "Latest stable version", though that may not actually be the
case. For this instance, we want to change this to "Target version" to
be a little more accurate.
Then in the component upgrade table that is emitted, the last column of
AVAILABLE isn't quite right either. Also changing this to TARGET to
reflect that this is the version we are targetting to upgrade to,
regardless of its availability.
There could be some improvements in checking available versions,
particularly in air gapped environments, to make sure we actually have
access to the requested version. But this at least clarifies some of the
output a bit.
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Add DefaultedStaticInitConfiguration() which can be
used instead of DefaultedInitConfiguration() during unit tests.
The later can be slow since it performs dynamic defaulting.
Apply the label:
"node.kubernetes.io/exclude-from-external-load-balancers"
To control plane nodes to preserve backwards compatibility
with the legacy mode where "master" nodes were excluded from
LBs.
During upgrade the coredns migration library seems to require
that the input version doesn't have the "v" prefix".
Fixes a bug where the user cannot run commands such as
"kubeadm upgrade plan" if they have `v1.8.0` installed.
Assuming this is caused by the fact that previously the image didn't
have a "v" prefix.
Fixes an issue where some kubeadm phases fail if a certificate file
contains a certificate chain with one or more intermediate CA
certificates. The validation algorithm has been changed from requiring
that a certificate was signed directly by the root CA to requiring that
there is a valid certificate chain back to the root CA.
In kubeadm etcd join there is a a bug that exists where,
if a peer already exists in etcd, it attempts to mitigate
by continuing and generating the etcd manifest file. However,
this existing "member name" may actually be unset, causing
subsequent etcd consistency checks to fail.
This change checks if the member name is empty - if it is,
it sets the member name to the node name, and resumes.
- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
the controller manager should validate the podSubnet against the node-mask
because if they are incorrect can cause the controller-manager to fail.
We don't need to calculate the node-cidr-masks, because those should
be provided by the user, if they are wrong we fail in validation.
Currently the "generate-csr" command does not have any output.
Pass an io.Writer (bound to os.Stdout from /cmd) to the functions
responsible for generating the kubeconfig / certs keys and CSRs.
If nil is passed these functions don't output anything.
The kubeconfig phase of "kubeadm init" detects external CA mode
and skips the generation of kubeconfig files. The kubeconfig
handling during control-plane join executes
CreateJoinControlPlaneKubeConfigFiles() which requires the presence
of ca.key when preparing the spec of a kubeconfig file and prevents
usage of external CA mode.
Modify CreateJoinControlPlaneKubeConfigFiles() to skip generating
the kubeconfig files if external CA mode is detected.
- Modify validateCACertAndKey() to print warnings for missing keys
instead of erroring out.
- Update unit tests.
This allows doing a CP node join in a case where the user has:
- copied shared certificates to the new CP node, but not copied
ca.key files, treating the cluster CAs as external
- signed other required certificates in advance
For external CA users that have prepared the kubeconfig files
for components, they might wish to provide a custom API server URL.
When performing validation on these kubeconfig files, instead of
erroring out on such custom URLs, show a klog Warning.
This allows flexibility around topology setup, where users
wish to make the kubeconfigs point to the ControlPlaneEndpoint instead
of the LocalAPIEndpoint.
Fix validation in ValidateKubeconfigsForExternalCA expecting
all kubeconfig files to use the CPE. The kube-scheduler and
kube-controller-manager now use LAE.
Client side period validation of certificates should not be
fatal, as local clock skews are not so uncommon. The validation
should be left to the running servers.
- Remove this validation from TryLoadCertFromDisk().
- Add a new function ValidateCertPeriod(), that can be used for this
purpose on demand.
- In phases/certs add a new function CheckCertificatePeriodValidity()
that will print warnings if a certificate does not pass period
validation, and caches certificates that were already checked.
- Use the function in a number of places where certificates
are loaded from disk.
The isCoreDNSVersionSupported() check assumes that
there is a running kubelet, that manages the CoreDNS containers.
If the containers are being created it is not possible to fetch
their image digest. To workaround that, a poll can be used in
isCoreDNSVersionSupported() and wait for the CoreDNS Pods
are expected to be running. Depending on timing and CNI
yet to be installed this can cause problems related to
addon idempotency of "kubeadm init", because if the CoreDNS
Pods are waiting for another step they will never get running.
Remove the function isCoreDNSVersionSupported() and assume that
the version is always supported. Rely on the Corefile migration
library to error out if it must.
Pinning the kube-controller-manager and kube-scheduler kubeconfig files
to point to the control-plane-endpoint can be problematic during
immutable upgrades if one of these components ends up contacting an N-1
kube-apiserver:
https://kubernetes.io/docs/setup/release/version-skew-policy/#kube-controller-manager-kube-scheduler-and-cloud-controller-manager
For example, the components can send a request for a non-existing API
version.
Instead of using the CPE for these components, use the LocalAPIEndpoint.
This guarantees that the components would talk to the local
kube-apiserver, which should be the same version, unless the user
explicitly patched manifests.
A check that verifies that kubeadm does not "upgrade" to an older release was
overly optimized by skipping upgrade if the new version is the same as the old
one. This somewhat makes sense, but that way changes in any of the etcd fields
in the ClusterConfiguration won't be applied if the etcd version is not
changed.
Hence, this simple change ensures that the upgrade is done even when no version
change takes place.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>