- lock the FG to true by default
- cleanup wrappers and logic related to versioned vs unversioned
naming of API objects (CMs and RBAC)
- update unit tests
The OldControlPlaneTaint taint (master) can be replaced
with the new ControlPlaneTaint (control-plane) taint.
Adapt unit tests in markcontrolplane_test.go
and cluster_test.go.
The CRI socket that kubeadm writes as an annotation
on a particular Node object can include an endpoint that
does not have an URL scheme. This is undesired as long term
the kubelet can stop allowing endpoints without URL scheme.
For control plane nodes "kubeadm upgrade apply" takes
the locally defaulted / populated NodeRegistration and refreshes
the CRI socket in PerformPostUpgradeTasks. But for secondary
nodes "kubeadm upgrade node" does not.
Adapt "upgrade node" to fetch the NodeRegistration for this node
and fix the CRI socket missing URL scheme if needed in the Node
annotation.
Add the UnversionedKubeletConfigMap feature gate that can
be used to control legacy vs new behavior for naming the
default configmap used to store the KubeletConfiguration.
Update related unit tests.
During operations such as "upgrade", kubeadm fetches the
ClusterConfiguration object from the kubeadm ConfigMap.
However, due to requiring node specifics it wraps it in an
InitConfiguration object. The function responsible for that is:
app/util/config#FetchInitConfigurationFromCluster().
A problem with this function (and sub-calls) is that it ignores
the static defaults applied from versioned types
(e.g. v1beta3/defaults.go) and only applies dynamic defaults for:
- API endpoints
- node registration
- etc...
The introduction of Init|JoinConfiguration.ImagePullPolicy now
has static defaulting of the NodeRegistration object with a default
policy of "PullIfNotPresent". Respect this defaulting by constructing
a defaulted internal InitConfiguration from
FetchInitConfigurationFromCluster() and only then apply the dynamic
defaults over it.
This fixes a bug where "kubeadm upgrade ..." fails when pulling images
due to an empty ("") ImagePullPolicy. We could assume that empty
string means default policy on runtime in:
cmd/kubeadm/app/preflight/checks.go#ImagePullCheck()
but that might actually not be the user intent during "init" and "join",
due to e.g. a typo. Similarly, we don't allow empty tokens
on runtime and error out.
- Remove the object form v1beta3 and internal type
- Deprecate a couple of phases that were specifically designed / named to
modify the ClusterStatus object
- Adapt logic around annotation vs ClusterStatus retrieval
- Update unit tests
- Run generators
- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
Component configs are used by kubeadm upgrade plan at the moment. However, they
can prevent kubeadm upgrade plan from functioning if loading of an unsupported
version of a component config is attempted. For that matter it's best to just
stop loading component configs as part of the kubeadm config load process.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
While `ClusterStatus` will be maintained and uploaded, it won't be
used by the internal `kubeadm` logic in order to determine the etcd
endpoints anymore.
The only exception is during the first upgrade cycle (`kubeadm upgrade
apply`, `kubeadm upgrade node`), in which we will fallback to the
ClusterStatus to let the upgrade path add the required annotations to
the newly created static pods.
If the user has modified the kubelet.conf post TLS bootstrap
to become invalid, the function getNodeNameFromKubeletConfig() can
panic. This was observed to trigger in "kubeadm reset" use cases.
Add basic validation and unit tests around parsing the kubelet.conf
with the aforementioned function.
kubeadm's current implementation of component config support is "kind" centric.
This has its downsides. Namely:
- Kind names and numbers can change between config versions.
Newer kinds can be ignored. Therefore, detection of a version change is
considerably harder.
- A component config can have only one kind that is managed by kubeadm.
Thus a more appropriate way to identify component configs is required.
Probably the best solution identified so far is a config group.
A group name is unlikely to change between versions, while the kind names and
structure can.
Tracking component configs by group name allows us to:
- Spot more easily config version changes and manage alternate versions.
- Support more than one kind in a config group/version.
- Abstract component configs by hiding their exact structure.
Hence, this change rips off the old kind based support for component configs
and replaces it with a group name based one. This also has the following
extra benefits:
- More tests were added.
- kubeadm now errors out if an unsupported version of a known component group
is used.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Whenever kubeadm needs to fetch its configuration from the cluster, it gets
the component configuration of all supported components (currently only kubelet
and kube-proxy). However, kube-proxy is deemed an optional component and its
installation may be skipped (by skipping the addon/kube-proxy phase on init).
When kube-proxy's installation is skipped, its config map is not created and
all kubeadm operations, that fetch the config from the cluster, are bound to
fail with "not found" or "forbidden" (because of missing RBAC rules) errors.
To fix this issue, we have to ignore the 403 and 404 errors, returned on an
attempt to fetch kube-proxy's component config from the cluster.
The `GetFromKubeProxyConfigMap` function now supports returning nil for both
error and object to indicate just such a case.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Currently ConfigFileAndDefaultsToInternalConfig and
FetchConfigFromFileOrCluster are used to default and load InitConfiguration
from file or cluster. These two APIs do a couple of completely separate things
depending on how they were invoked. In the case of
ConfigFileAndDefaultsToInternalConfig, an InitConfiguration could be either
defaulted with external override parameters, or loaded from file.
With FetchConfigFromFileOrCluster an InitConfiguration is either loaded from
file or from the config map in the cluster.
The two share both some functionality, but not enough code. They are also quite
difficult to use and sometimes even error prone.
To solve the issues, the following steps were taken:
- Introduce DefaultedInitConfiguration which returns defaulted version agnostic
InitConfiguration. The function takes InitConfiguration for overriding the
defaults.
- Introduce LoadInitConfigurationFromFile, which loads, converts, validates and
defaults an InitConfiguration from file.
- Introduce FetchInitConfigurationFromCluster that fetches InitConfiguration
from the config map.
- Reduce, when possible, the usage of ConfigFileAndDefaultsToInternalConfig by
replacing it with DefaultedInitConfiguration or LoadInitConfigurationFromFile
invocations.
- Replace all usages of FetchConfigFromFileOrCluster with calls to
LoadInitConfigurationFromFile or FetchInitConfigurationFromCluster.
- Delete FetchConfigFromFileOrCluster as it's no longer used.
- Rename ConfigFileAndDefaultsToInternalConfig to
LoadOrDefaultInitConfiguration in order to better describe what the function
is actually doing.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Replaced hardcoded "v0.12.0" strings with MinimumControlPlaneVersion and
MinimumKubeletVersion global variables.
This should help with a regular release version bumps.