Explicitly state in the help message that `kubeadm upgrade plan` can
only run on a node where "admin.conf" exists; normally, this is a
control plane node.
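
A minimal sketch (hypothetical wiring and wording, not the actual
cmd/kubeadm source) of how such a note can be surfaced through the
command's long description with cobra:

    package upgrade

    import "github.com/spf13/cobra"

    func newCmdPlan() *cobra.Command {
        return &cobra.Command{
            Use:   "plan",
            Short: "Check which versions are available to upgrade to",
            Long: "Check which versions are available to upgrade to and validate whether your " +
                "current cluster is upgradeable.\n" +
                "This command can only run on a node where the \"admin.conf\" kubeconfig " +
                "file exists; normally, this is a control plane node.",
            RunE: func(cmd *cobra.Command, args []string) error {
                // The actual upgrade planning logic would go here.
                return nil
            },
        }
    }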
Signed-off-by: Dave Chen <dave.chen@arm.com>
The notes printed to the user from common.go when
loadConfig fails are outdated and incorrect.
If the config cannot be loaded, the user should not be instructed
to re-upload the config with kubeadm commands; instead, they
should do it manually with kubectl.
On a loadConfig() error, just wrap the error in a simple message
and show it to the user.
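
A minimal, self-contained sketch (hypothetical helper and path, not the
actual common.go code) of the intended behavior: wrap the underlying
error in a simple message rather than printing detailed recovery
instructions:

    package main

    import (
        "fmt"
        "os"
    )

    // loadConfig stands in for the real kubeadm helper in common.go.
    func loadConfig(path string) ([]byte, error) {
        return os.ReadFile(path)
    }

    func main() {
        cfg, err := loadConfig("/etc/kubernetes/kubeadm-config.yaml")
        if err != nil {
            // Just wrap the error; any manual fix-up of the config
            // (e.g. editing it with kubectl) is left to the user.
            fmt.Fprintf(os.Stderr, "unable to load the kubeadm configuration: %v\n", err)
            os.Exit(1)
        }
        fmt.Printf("loaded %d bytes of configuration\n", len(cfg))
    }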
The current setup stomps on IsNotFound errors for missing Node objects.
The underlying fetching of the init configuration uses
the Node object to construct an InitConfiguration for this
upgrade process, so if the Node is missing, the kube-config ConfigMap
is reported as missing instead, which is incorrect.
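
A minimal sketch (hypothetical wiring, not the actual upgrade code path)
of how a missing Node can be surfaced explicitly instead of letting the
error be attributed to a missing ConfigMap further up the stack:

    package upgrade

    import (
        "context"
        "fmt"

        apierrors "k8s.io/apimachinery/pkg/api/errors"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // getNodeForInitConfig stands in for the code path that reads the Node
    // object while building the InitConfiguration.
    func getNodeForInitConfig(client kubernetes.Interface, nodeName string) error {
        _, err := client.CoreV1().Nodes().Get(context.Background(), nodeName, metav1.GetOptions{})
        if apierrors.IsNotFound(err) {
            // Do not stomp on this error: report the missing Node as such,
            // instead of letting a later step blame a missing ConfigMap.
            return fmt.Errorf("node %q was not found in the cluster: %w", nodeName, err)
        }
        return err
    }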
The component connection between the kube-apiserver and the kubelet does
not require the "O" field on the certificate Subject to be set to the
"system:masters" privileged group. It can be a less
privileged group like "kubeadm:cluster-admins".
Change the group in the apiserver-kubelet-client
certificate specification. This cert is passed to
--kubelet-client-certificate.
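
A rough sketch of what the specification change amounts to, expressed
with k8s.io/client-go/util/cert for illustration (kubeadm uses its own
wrapper type; the names here mirror the intent rather than the exact
source):

    package certs

    import (
        "crypto/x509"

        certutil "k8s.io/client-go/util/cert"
    )

    // The "O" (Organization) of the client certificate handed to
    // --kubelet-client-certificate carries the group; it no longer needs
    // to be the privileged "system:masters".
    var apiServerKubeletClientCertConfig = certutil.Config{
        CommonName:   "kube-apiserver-kubelet-client",
        Organization: []string{"kubeadm:cluster-admins"},
        Usages:       []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
    }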
The addition of the "super-admin.conf" functionality required
init.go's Client() to create RBAC rules on its first invocation.
However, this created a problem with the "wait-control-plane" phase
of "kubeadm init", where a client is needed to connect to the
API server's "/healthz" endpoint via the discovery client. The logic
that ensures the RBAC thus became the step that polled and waited for
the API server.
To avoid this, introduce a new InitData function, ClientWithoutBootstrap.
In "wait-control-plane" use this client, which has no permissions
(anonymous), but is sufficient to connect to "/healthz".
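
A minimal sketch (hypothetical helpers, not the actual InitData method)
of such a bootstrap-free client and why it is sufficient: "/healthz" is
reachable without credentials:

    package waitcontrolplane

    import (
        "context"
        "fmt"

        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"
    )

    // clientWithoutBootstrap builds a client that carries no credentials at all.
    func clientWithoutBootstrap(endpoint string, caCert []byte) (kubernetes.Interface, error) {
        return kubernetes.NewForConfig(&rest.Config{
            Host:            endpoint,
            TLSClientConfig: rest.TLSClientConfig{CAData: caCert},
            // No client certificate, token or username: requests are anonymous.
        })
    }

    // healthz checks the "/healthz" endpoint, which is all that the
    // "wait-control-plane" phase needs from this client.
    func healthz(client kubernetes.Interface) error {
        result := client.Discovery().RESTClient().Get().AbsPath("/healthz").Do(context.Background())
        if err := result.Error(); err != nil {
            return fmt.Errorf("the API server is not healthy yet: %w", err)
        }
        return nil
    }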
Pending changes here would be:
- Stop using "/healthz"; instead, a regular REST client can be
constructed from the kubelet cert/key.
- Make the wait for the kubelet / API server linear (not in goroutines).
In EnsureAdminClusterRoleBindingImpl() there are a couple of
polls around CRB create calls. When testing the function,
a short retry and a timeout are used. These introduce around
2x20 fake client "connections" / poll iterations under a couple
of test cases, adding about 2 seconds to the overall test time.
Given the polls in EnsureAdminClusterRoleBindingImpl()
are of type PollUntilContextTimeout() with "immediate" set to "true",
the short retry / timeout can be removed when testing,
because one poll iteration is guaranteed and the tested function
remains at 100% coverage with the existing reactors and test cases.
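
A minimal sketch of the poll shape in question (hypothetical helper
name; intervals are placeholders): because "immediate" is true, the
condition runs once before any sleeping, so a unit test driven by
fake-client reactors always exercises at least one iteration regardless
of the configured retry/timeout:

    package rbacbootstrap

    import (
        "context"
        "time"

        rbacv1 "k8s.io/api/rbac/v1"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        "k8s.io/client-go/kubernetes"
    )

    func createCRBWithRetry(client kubernetes.Interface, crb *rbacv1.ClusterRoleBinding,
        interval, timeout time.Duration) error {
        // "immediate" is true: the condition is evaluated before any wait,
        // so one iteration is guaranteed even with long interval/timeout values.
        return wait.PollUntilContextTimeout(context.Background(), interval, timeout, true,
            func(ctx context.Context) (bool, error) {
                _, err := client.RbacV1().ClusterRoleBindings().Create(ctx, crb, metav1.CreateOptions{})
                if err != nil && !apierrors.IsAlreadyExists(err) {
                    return false, nil // treat as a momentary error and retry
                }
                return true, nil
            })
    }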
Poll the CRB create calls for kubeadm:cluster-admins when using the
super-admin.conf credential. The prior create call, which uses the
admin.conf credential, was already polled. Polling this subsequent
call seems advisable to ensure that momentary errors in between
cannot trip EnsureAdminClusterRoleBindingImpl().
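
A rough sketch (hypothetical names and simplified control flow; the
real EnsureAdminClusterRoleBindingImpl() differs) of the flow described
above, with both create calls wrapped in a poll:

    package rbacbootstrap

    import (
        "context"
        "time"

        rbacv1 "k8s.io/api/rbac/v1"
        apierrors "k8s.io/apimachinery/pkg/api/errors"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        "k8s.io/client-go/kubernetes"
    )

    func ensureClusterAdminsCRB(ctx context.Context, adminClient, superAdminClient kubernetes.Interface,
        crb *rbacv1.ClusterRoleBinding, interval, timeout time.Duration) error {
        needSuperAdmin := false

        // First attempt with the admin.conf credential (this poll already existed).
        err := wait.PollUntilContextTimeout(ctx, interval, timeout, true,
            func(ctx context.Context) (bool, error) {
                _, err := adminClient.RbacV1().ClusterRoleBindings().Create(ctx, crb, metav1.CreateOptions{})
                switch {
                case err == nil, apierrors.IsAlreadyExists(err):
                    return true, nil
                case apierrors.IsForbidden(err):
                    needSuperAdmin = true // admin.conf is not bound to cluster-admin yet
                    return true, nil
                default:
                    return false, nil // momentary error, retry
                }
            })
        if err != nil {
            return err
        }
        if !needSuperAdmin {
            return nil
        }

        // Fallback with the super-admin.conf credential; this create call is
        // now polled as well, so a momentary error here cannot trip the step.
        return wait.PollUntilContextTimeout(ctx, interval, timeout, true,
            func(ctx context.Context) (bool, error) {
                _, err := superAdminClient.RbacV1().ClusterRoleBindings().Create(ctx, crb, metav1.CreateOptions{})
                if err != nil && !apierrors.IsAlreadyExists(err) {
                    return false, nil
                }
                return true, nil
            })
    }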
- Update unit tests in certs_test.go related to the "renew" CLI command.
- In /init, (d *initData) Client(), make sure that the new logic
for bootstrapping an "admin.conf" user is performed by calling
EnsureAdminClusterRoleBinding() from the phases backend. Add an
"adminKubeConfigBootstrapped" flag that helps call this logic only
once per "kubeadm init" binary execution.
- In /phases/init include a new subphase for generating
the "super-admin.conf" file.
- In /phases/reset make sure the file "super-admin.conf" is
cleaned if present. Update unit tests.
- Register the new file in /certs/renewal, so that the
file is renewed if present. If it is not present, the common "MISSING"
message is shown, as for other certs/kubeconfig files.
- In /kubeconfig, update the spec for admin.conf to use
the "kubeadm:cluster-admins" Group. A new spec is added for
the "super-admin.conf" file that uses the "system:masters" Group
(see the sketch after this list).
- Add a new function EnsureAdminClusterRoleBinding() that includes
logic to ensure that admin.conf contains a User that is properly
bound to the "cluster-admin" built-in ClusterRole. This requires
bootstrapping with the "super-admin.conf" file, which contains the
"system:masters" Group. Add detailed unit tests for this new logic.
- In /upgrade#PerformPostUpgradeTasks() add logic to create the
"admin.conf" and "super-admin.conf" with the new, updated specs.
Add detailed unit tests for this new logic.
- In /upgrade#StaticPodControlPlane() ensure that renewal of
"super-admin.conf" is performed if the file exists.
Update unit tests.
- Add the new file name super-admin.conf and a function,
GetSuperAdminKubeConfigPath(), that returns its default path.
- Add the ClusterAdminsGroupAndClusterRoleBinding object name.
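
A rough sketch of the /kubeconfig difference referenced above, expressed
with k8s.io/client-go/util/cert for illustration (kubeadm builds these
specs with internal types; the common names shown are illustrative):

    package kubeconfigspecs

    import (
        "crypto/x509"

        certutil "k8s.io/client-go/util/cert"
    )

    var (
        // admin.conf: its client certificate carries the new, less privileged
        // Group, and the user is bound to the built-in "cluster-admin"
        // ClusterRole through an explicit ClusterRoleBinding.
        adminConfClientCert = certutil.Config{
            CommonName:   "kubernetes-admin",
            Organization: []string{"kubeadm:cluster-admins"},
            Usages:       []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
        }

        // super-admin.conf: keeps the privileged "system:masters" Group, which
        // bypasses authorization and is only needed to bootstrap the binding above.
        superAdminConfClientCert = certutil.Config{
            CommonName:   "kubernetes-super-admin",
            Organization: []string{"system:masters"},
            Usages:       []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
        }
    )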
These were found with a modified klog that enables "go vet" to check klog call
parameters:
cmd/kubeadm/app/features/features.go:149:4: printf: k8s.io/klog/v2.Warningf format %t has arg v of wrong type string (govet)
    klog.Warningf("Setting deprecated feature gate %s=%t. It will be removed in a future release.", k, v)
test/images/sample-device-plugin/sampledeviceplugin.go:147:5: printf: k8s.io/klog/v2.Errorf does not support error-wrapping directive %w (govet)
    klog.Errorf("error: %w", err)
test/images/sample-device-plugin/sampledeviceplugin.go:155:3: printf: k8s.io/klog/v2.Errorf does not support error-wrapping directive %w (govet)
    klog.Errorf("Failed to add watch to %q: %w", triggerPath, err)
staging/src/k8s.io/code-generator/cmd/prerelease-lifecycle-gen/prerelease-lifecycle-generators/status.go:207:5: printf: k8s.io/klog/v2.Fatalf does not support error-wrapping directive %w (govet)
    klog.Fatalf("Package %v: unsupported %s value: %q :%w", i, tagEnabledName, ptag.value, err)
staging/src/k8s.io/legacy-cloud-providers/vsphere/nodemanager.go:286:3: printf: (k8s.io/klog/v2.Verbose).Infof format %s reads arg #1, but call has 0 args (govet)
    klog.V(4).Infof("Node %s missing in vSphere cloud provider cache, trying node informer")
staging/src/k8s.io/legacy-cloud-providers/vsphere/nodemanager.go:302:3: printf: (k8s.io/klog/v2.Verbose).Infof format %s reads arg #1, but call has 0 args (govet)
    klog.V(4).Infof("Node %s missing in vSphere cloud provider caches, trying the API server")