- Modify VerifyUnmarshalStrict to use serializer/json instead
of sigs.k8s.io/yaml. In strict mode, the serializers
in serializer/json use the new sigs.k8s.io/json library
that also catches case sensitive errors for field names -
e.g. foo vs Foo. Include test case for that in strict/testdata.
- Move the hardcoded schemes to check to the side of the
caller - i.e. accept a slice of runtime.Scheme.
- Move the klog warnings outside of VerifyUnmarshalStrict
and make them the responsibility of the caller.
- Call VerifyUnmarshalStrict when downloading the configuration
from kubeadm-config or the kube-proxy or kubelet-config CMs.
This validation is useful if the user has manually patched the CMs.
- Apply "control-plane" taint during init/join by adding the
taint in SetNodeRegistrationDynamicDefaults(). The old
taint "master" is still applied.
- Clarify API docs (v1beta2 and v1beta3) for nodeRegistration.Taint
to not mention "master" taint and be more generic. Remove
example for taints that includes the word "master".
- Update unit tests.
Change the default container runtime CRI socket endpoint to the
one of containerd. Previously it was the one for Docker
- Rename constants.DefaultDockerCRISocket to DefaultCRISocket
- Make the constants files include the endpoints for all supported
container runtimes for Unix/Windows.
- Update unit tests related to docker runtime testing.
- In kubelet/flags.go hardcode the legacy docker socket as a check
to allow kubeadm 1.24 to run against kubelet 1.23 if the user
explicitly sets the criSocket field to "npipe:////./pipe/dockershim"
on Windows or "unix:///var/run/dockershim.sock" on Linux.
The CRI socket that kubeadm writes as an annotation
on a particular Node object can include an endpoint that
does not have an URL scheme. This is undesired as long term
the kubelet can stop allowing endpoints without URL scheme.
For control plane nodes "kubeadm upgrade apply" takes
the locally defaulted / populated NodeRegistration and refreshes
the CRI socket in PerformPostUpgradeTasks. But for secondary
nodes "kubeadm upgrade node" does not.
Adapt "upgrade node" to fetch the NodeRegistration for this node
and fix the CRI socket missing URL scheme if needed in the Node
annotation.
- Update defaults for v1beta2 and 3 to have URL scheme
- Raname DefaultUrlScheme to DefaultContainerRuntimeURLScheme
- Prepend a missing URL scheme to user sockets and warn them
that this might not be supported in the future
- Update socket validation to exclude IsAbs() testing
(This is broken on Windows). Assume the path is not empty and has
URL scheme at this point (validation happens after defaulting).
- Use net.Dial to open Unix sockets
- Update all related unit tests
Signed-off-by: pacoxu <paco.xu@daocloud.io>
Signed-off-by: Lubomir I. Ivanov <lubomirivanov@vmware.com>
Add the UnversionedKubeletConfigMap feature gate that can
be used to control legacy vs new behavior for naming the
default configmap used to store the KubeletConfiguration.
Update related unit tests.
Kubeadm requires manual version updates of its current supported k8s
control plane version and minimally supported k8s control plane and
kubelet versions every release cycle.
To avoid that, in constants.go:
- Add the helper function getSkewedKubernetesVersion() that can be
used to retrieve a MAJOR.(MINOR+n).0 version of k8s. It currently
uses the kubeadm version populated in "component-base/version" during
the kubeadm build process.
- Use the function to set existing version constants (variables).
Update util/config/common.go#NormalizeKubernetesVersion() to
tolerate the case where a k8s version in the ClusterConfiguration
is too old for the kubeadm binary to use during code freeze.
Include unit tests for the new utilities.
During operations such as "upgrade", kubeadm fetches the
ClusterConfiguration object from the kubeadm ConfigMap.
However, due to requiring node specifics it wraps it in an
InitConfiguration object. The function responsible for that is:
app/util/config#FetchInitConfigurationFromCluster().
A problem with this function (and sub-calls) is that it ignores
the static defaults applied from versioned types
(e.g. v1beta3/defaults.go) and only applies dynamic defaults for:
- API endpoints
- node registration
- etc...
The introduction of Init|JoinConfiguration.ImagePullPolicy now
has static defaulting of the NodeRegistration object with a default
policy of "PullIfNotPresent". Respect this defaulting by constructing
a defaulted internal InitConfiguration from
FetchInitConfigurationFromCluster() and only then apply the dynamic
defaults over it.
This fixes a bug where "kubeadm upgrade ..." fails when pulling images
due to an empty ("") ImagePullPolicy. We could assume that empty
string means default policy on runtime in:
cmd/kubeadm/app/preflight/checks.go#ImagePullCheck()
but that might actually not be the user intent during "init" and "join",
due to e.g. a typo. Similarly, we don't allow empty tokens
on runtime and error out.
- Make v1beta3 use bootstraptoken/v1 instead of local copies
- Make the internal API use bootstraptoken/v1
- Update validation, /cmd, /util and other packages
- Update v1beta2 conversion
- Remove the object form v1beta3 and internal type
- Deprecate a couple of phases that were specifically designed / named to
modify the ClusterStatus object
- Adapt logic around annotation vs ClusterStatus retrieval
- Update unit tests
- Run generators
- Pin the ClusterConfiguration when fuzzing
the internal InitConfiguration that embeds it. Kubeadm includes
separate constructs for this embedding in the internal type
and this round trip is not viable.
- Remove the artificial calls to SetDefaults_ClusterConfiguration()
in v1beta{2|3}'s converters from public to internal InitConfiguration.
- Make sure the internal InitConfiguration.ClusterConfiguration is
defaulted in initconfiguration.go instead.
- scheme: switch to:
utilruntime.Must(scheme.SetVersionPriority(v1beta3.SchemeGroupVersion))
- change all imports in the code base from v1beta2 to v1beta3
- rename all import aliases for kubeadmapiv1beta2 to "kubeadmapiv".
this allows smaller diffs when changing the default public API.
Add DefaultedStaticInitConfiguration() which can be
used instead of DefaultedInitConfiguration() during unit tests.
The later can be slow since it performs dynamic defaulting.
- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
Back in the v1alpha2 days the fuzzer test needed to be disabled. To ensure that
there were no config breaks and everything worked correctly extensive replacement
tests were put in place that functioned as unit tests for the kubeadm config utils
as well.
The fuzzer test has been reenabled for a long time now and there's no need for
these replacements. Hence, over time most of these were disabled, deleted and
refactored. The last remnants are part of the LoadJoinConfigurationFromFile test.
The test data for those old tests remains largely unused today, but it still receives
updates as it contains kubelet's and kube-proxy's component configs. Updates to these
configs are usually done because the maintainers of those need to add a new field.
Hence, to cleanup old code and reduce maintenance burden, the last test that depends
on this test data is finally refactored and cleaned up to represent a simple unit test
of `LoadJoinConfigurationFromFile`.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Component configs are used by kubeadm upgrade plan at the moment. However, they
can prevent kubeadm upgrade plan from functioning if loading of an unsupported
version of a component config is attempted. For that matter it's best to just
stop loading component configs as part of the kubeadm config load process.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
When doing the very first upgrade from a cluster that contains the
source of truth in the ClusterStatus struct, the new kubeadm logic
will try to retrieve this information from annotations.
This changeset adds to both etcd and apiserver endpoint retrieval the
special case in which they won't retry if we are in such cases. The
logic will retry if we find any unknown error, but will not retry in
the following cases:
- etcd annotations do not contain etcd endpoints, but the overall list
of etcd pods is greater than 0. This means that we listed at least
one etcd pod, but they are missing the annotation.
- API server annotation is not found on the api server pod for a given
node name, but no errors aside from that one were found. This means
that the API server pod is present, but is missing the annotation.
In both cases there is no point in retrying, and so, this speeds up the
upgrade path when coming from a previous existing cluster.
While `ClusterStatus` will be maintained and uploaded, it won't be
used by the internal `kubeadm` logic in order to determine the etcd
endpoints anymore.
The only exception is during the first upgrade cycle (`kubeadm upgrade
apply`, `kubeadm upgrade node`), in which we will fallback to the
ClusterStatus to let the upgrade path add the required annotations to
the newly created static pods.
The warning message
```
[config] WARNING: Ignored YAML document with GroupVersionKind ...
```
is printed for all GVKs that are not part of the kubeadm core types.
This is wrong as the component config types are supported and successfully
parsed and used despite the fact that the warning is printed for them too.
Hence this simple fix first checks if the group of the GVK is a supported
component config group and the warning is printed only if it's not.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
If the user has modified the kubelet.conf post TLS bootstrap
to become invalid, the function getNodeNameFromKubeletConfig() can
panic. This was observed to trigger in "kubeadm reset" use cases.
Add basic validation and unit tests around parsing the kubelet.conf
with the aforementioned function.
kubeadm's current implementation of component config support is "kind" centric.
This has its downsides. Namely:
- Kind names and numbers can change between config versions.
Newer kinds can be ignored. Therefore, detection of a version change is
considerably harder.
- A component config can have only one kind that is managed by kubeadm.
Thus a more appropriate way to identify component configs is required.
Probably the best solution identified so far is a config group.
A group name is unlikely to change between versions, while the kind names and
structure can.
Tracking component configs by group name allows us to:
- Spot more easily config version changes and manage alternate versions.
- Support more than one kind in a config group/version.
- Abstract component configs by hiding their exact structure.
Hence, this change rips off the old kind based support for component configs
and replaces it with a group name based one. This also has the following
extra benefits:
- More tests were added.
- kubeadm now errors out if an unsupported version of a known component group
is used.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>