The idempotency.go (perhaps not so accurately named) contains
API calls that kubeadm does against an API server using client-go.
Some users seem to have unstable setups where for unknown reasons
the API server can be unavailable or refuse to respond as expected.
Use PollUntilContextTimeout in all exported functions to ensure
such API calls are all retry-able.
NOTE: The context passed to PollUntilContextTimeout is not propagated
in the polled function. Instead the poll function creates it's own
context 'ctx := context.Background()', this is to avoid
breaking expectations on the side of the callers, that expect
a certain type of error and not "context timeout" errors.
Additional changes:
- Make all context.TODO() -> context.Background()
- Update all unit tests and make sure during testing the retry
interval and timeout are short. Test coverage of idempotency.go
is at ~97%.
- Remove the TestMutateConfigMapWithConflict test. It does not
contribute much, because conflict handling is done at the API,
server side, not on the side of kubeadm. This simulating this is not
needed.
kubeadm upgrade checks the migration path for the existing CoreDNS
deployment pre-flight. Migration paths are defined for CoreDNS
versions, which are derived from the image tag used in the existing
deployment.
The kubeadm ClusterConfiguration.DNS.ImageMeta supports suffixing the
tag with a digest, but at upgrade time does not derive the version
correctly from an image with digest suffix, because DeployedDNSAddon
does not deal with digests correctly. This commit makes DeployedDNSAddon
digest-aware.
Signed-off-by: Markus Rudy <mr@edgeless.systems>
This commit syncs RBAC from coredns/deployment and removes a get nodes
RBAC.
Historically the federation CoreDNS plugin needed the nodes resource to
fetch zone and region labels.
However, the CoreDNS federation plugin was deprecated and cleaned up a
long time ago and removed the Nodes RBAC requirement here in
`coredns/deployment` coredns.yaml.sed:
https://github.com/coredns/deployment/pull/229
This change however, never made it to `kubernetes/kubernetes`.
Signed-off-by: Nico Berlee <nico.berlee@on2it.net>
kubeam:node-proxier -> kubeadm:node-proxier
This causes e2e test failures:
"[area-kubeadm] proxy addon kube-proxy ServiceAccount should
be bound to the system:node-proxier cluster role"
in:
- kubeadm-kinder-latest
- kubeadm-kinder-latest-on-...
- other tests
Kubeadm no longer supports kube-dns and CoreDNS is the only
supported DNS server. Remove ClusterConfiguration.DNS.Type
from v1beta3 that is used to set the DNS server type.
Add DefaultedStaticInitConfiguration() which can be
used instead of DefaultedInitConfiguration() during unit tests.
The later can be slow since it performs dynamic defaulting.
During upgrade the coredns migration library seems to require
that the input version doesn't have the "v" prefix".
Fixes a bug where the user cannot run commands such as
"kubeadm upgrade plan" if they have `v1.8.0` installed.
Assuming this is caused by the fact that previously the image didn't
have a "v" prefix.
- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
the controller manager should validate the podSubnet against the node-mask
because if they are incorrect can cause the controller-manager to fail.
We don't need to calculate the node-cidr-masks, because those should
be provided by the user, if they are wrong we fail in validation.
The isCoreDNSVersionSupported() check assumes that
there is a running kubelet, that manages the CoreDNS containers.
If the containers are being created it is not possible to fetch
their image digest. To workaround that, a poll can be used in
isCoreDNSVersionSupported() and wait for the CoreDNS Pods
are expected to be running. Depending on timing and CNI
yet to be installed this can cause problems related to
addon idempotency of "kubeadm init", because if the CoreDNS
Pods are waiting for another step they will never get running.
Remove the function isCoreDNSVersionSupported() and assume that
the version is always supported. Rely on the Corefile migration
library to error out if it must.
Until now, users were always asked to manually convert a component config to a
version supported by kubeadm, if kubeadm is not supporting its version.
This is true even for configs generated with older kubeadm versions, hence
getting users to make manual conversions on kubeadm generated configs.
This is not appropriate and user friendly, although, it tends to be the most
common case. Hence, we sign kubeadm generated component configs stored in
config maps with a SHA256 checksum. If a configs is loaded by kubeadm from a
config map and has a valid signature it's considered "kubeadm generated" and if
a version migration is required, this config is automatically discarded and a
new one is generated.
If there is no checksum or the checksum is not matching, the config is
considered as "user supplied" and, if a version migration is required, kubeadm
will bail out with an error, requiring manual config migration (as it's today).
The behavior when supplying component configs on the kubeadm command line
does not change. Kubeadm would still bail out with an error requiring migration
if it can recognize their groups but not versions.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
A narrow assumption of what is contained in the `imageID` fields for the
CoreDNS pods causes a panic upon upgrade.
Fix this by using a proper regex to match a trailing SHA256 image digest
in `imageID` or return an error if it cannot find it.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>