- replace all stray calls of os.Exit() to util.CheckError() instead
- CheckError() now checks if the klog verbosity level is >=5
and shows a stack trace of the error
- don't call klog.Fatal in version.go
It seems undesirable that Kubernetes as a system should be
blocking a node if it's Linux kernel is way too new.
If such a problem even occurs we should exclude versions from
the list of supported versions instead of blocking users
from trying e.g. the latest 7.0.0-beta kernel because our
validators are not aware of this new version.
Etcd v3.3.0 added the --listen-metrics-urls flag which allows specifying
addition URLs to the already present /health and /metrics endpoints.
While /health and /metrics are enabled for URLS defined with
--listen-client-urls (v3+ ?) they do require HTTPS.
Replace the present etcdctl based liveness probe with a standard HTTP
GET v1.Probe that connects to http://127.0.0.1:2381/health.
These endpoints are not reachable from the outside and only available
for localhost connections.
For file discovery, in case the user feeds a file for the CA
from the kubeconfig, make sure it's preloaded and embedded using
the new function EnsureCertificateAuthorityIsEmbedded().
This commit also applies cleanup:
- unroll validateKubeConfig() into ValidateConfigInfo() as this way
the default cluster can be re-used.
- in ValidateConfigInfo() reuse the variable config instead of creating
a new variable kubeconfig.
- make the Ensure* functions return descriptive errors instead of
wrapping the errors on the side of the callers.
When adding a new etcd member the etcd cluster can enter a state
of vote, where any new members added at the exact same time will
fail with an error right away.
Implement exponential backoff retry around the MemberAdd call.
This solves a kubeadm problem when concurrently joining
control-plane nodes with stacked etcd members.
From experiment, a few retries with milliseconds apart are
sufficient to achieve the concurrent join of a 3xCP cluster.
Apply the same backoff to MemberRemove in case the concurrent
removal of members fails for similar reasons.
Instead of creating a Docker client and fetching an Info object
from the docker enpoint, call the "docker info" command
and populate a local dockerInfo struct from JSON output.
Also
- add unit tests.
- update import boss and bazel.
This change affects "test/e2e_node/e2e_node_suite_test.go"
as it consumes this Docker validator by calling
"system.ValidateSpec()".
MarshalClusterConfigurationToBytes has capabilities to output the component
configs, as separate YAML documents, besides the kubeadm ClusterConfiguration
kind. This is no longer necessary for the following reasons:
- All current use cases of this function require only the ClusterConfiguration.
- It will output component configs only if they are not the default ones. This
can produce undeterministic output and, thus, cause potential problems.
- There are only hacky ways to dump the ClusterConfiguration only (without the
component configs).
Hence, we simplify things by replacing the function with direct calls to the
underlaying MarshalToYamlForCodecs. Thus marshalling only ClusterConfiguration,
when needed.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
one test also proved it did not call the internet
but this was not fool proof as it did not return a string
and thus could be called with something expecting to fail.
updated the network calls to be package local so tests could pass their
own implementation. A public interface was not provided as it would not
be likely this would ever be needed or wanted.
During the control plane joins, sometimes the control plane returns an
expected error when trying to download the `kubeadm-config` ConfigMap.
This is a workaround for this issue until the root cause is completely
identified and fixed.
Ideally, this commit should be reverted in the near future.
Ever since v1alpha3, InitConfiguration is containing ClusterConfiguration
embedded in it. This was done to mimic the internal InitConfiguration, which in
turn is used throughout the kubeadm code base as if it is the old
MasterConfiguration of v1alpha2.
This, however, is confusing to users who vendor in kubeadm as the embedded
ClusterConfiguration inside InitConfiguration is not marshalled to YAML.
For this to happen, special care must be taken for the ClusterConfiguration
field to marshalled separately.
Thus, to make things smooth for users and to reduce third party exposure to
technical debt, this change removes ClusterConfiguration embedding from
InitConfiguration.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
* fix duplicated imports of api/core/v1
* fix duplicated imports of client-go/kubernetes
* fix duplicated imports of rest code
* change import name to more reasonable
There are a couple of problems with regards to the `omitempty` in v1beta1:
- It is not applied to certain fields. This makes emitting YAML configuration
files in v1beta1 config format verbose by both kubeadm and third party Go
lang tools. Certain fields, that were never given an explicit value would
show up in the marshalled YAML document. This can cause confusion and even
misconfiguration.
- It can be used in inappropriate places. In this case it's used for fields,
that need to be always serialized. The only one such field at the moment is
`NodeRegistrationOptions.Taints`. If the `Taints` field is nil, then it's
defaulted to a slice containing a single control plane node taint. If it's
an empty slice, no taints are applied, thus, the cluster behaves differently.
With that in mind, a Go program, that uses v1beta1 with `omitempty` on the
`Taints` field has no way to specify an explicit empty slice of taints, as
this would get lost after marshalling to YAML.
To fix these issues the following is done in this change:
- A whole bunch of additional omitemptys are placed at many fields in v1beta2.
- `omitempty` is removed from `NodeRegistrationOptions.Taints`
- A test, that verifies the ability to specify empty slice value for `Taints`
is included.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
This change introduces config fields to the v1beta2 format, that allow
certificate key to be specified in the config file. This certificate key is a
hex encoded AES key, that is used to encrypt certificates and keys, needed for
secondary control plane nodes to join. The same key is used for the decryption
during control plane join.
It is important to note, that this key is never uploaded to the cluster. It can
only be specified on either command line or the config file.
The new fields can be used like so:
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
certificateKey: "yourSecretHere"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
certificateKey: "yourSecretHere"
---
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
These are based on recommendation from
[staticcheck](http://staticcheck.io/).
- Remove unused struct fields
- Remove unused function
- Remove unused variables
- Remove unused constants.
- Miscellaneous cleanups
This unit test contains some hacks are causing the bazel-test
e2e job to flake very often. Instead of maintaining this
unit test remove it completely. It has little benefits
WRT testing app/util/chroot*.go.
kubeadm still generates RSA keys when deploying a node, but also
accepts ECDSA keys if they already exist pregenerated in the
directory specified in --cert-dir.
Add the functionality to support `CreateOrMutateConfigMap` and `MutateConfigMap`.
* `CreateOrMutateConfigMap` will try to create a given ConfigMap object; if this ConfigMap
already exists, a new version of the resource will be retrieved from the server and a
mutator callback will be called on it. Then, an `Update` of the mutated object will be
performed. If there's a conflict during this `Update` operation, retry until no conflict
happens. On every retry the object is refreshed from the server to the latest version.
* `MutateConfigMap` will try to get the latest version of the ConfigMap from the server,
call the mutator callback and then try to `Update` the mutated object. If there's a
conflict during this `Update` operation, retry until no conflict happens. On every retry
the object is refreshed from the server to the latest version.
Add unit tests for `MutateConfigMap`
* One test checks that in case of no conflicts, the update of the
given ConfigMap happens without any issues.
* Another test mimics 5 consecutive CONFLICT responses when updating
the given ConfigMap, whereas the sixth try it will work.
The function certs.NewCACertAndKey() is just a wrapper around
pkiutil.NewCertificateAuthority() which doesn't add any
additional functionality.
Instead use pkiutil.NewCertificateAuthority() directly.
Currently kubeadm produces an error upon parsing multiple
certificates stored in the cluster-info configmap. Yet it
should check all available certificates in a scenario like
CA key rotation.
Check all available CA certs against pinned certificate hashes.
Fixes https://github.com/kubernetes/kubeadm/issues/1399
In the case where newControlPlane is true we don't go through
getNodeRegistration() and initcfg.NodeRegistration.CRISocket is empty.
This forces DetectCRISocket() to be called later on, and if there is more than
one CRI installed on the system, it will error out, while asking for the user
to provide an override for the CRI socket. Even if the user provides an
override, the call to DetectCRISocket() can happen too early and thus ignore it
(while still erroring out).
However, if newControlPlane == true, initcfg.NodeRegistration is not used at
all and it's overwritten later on.
Thus it's necessary to supply some default value, that will avoid the call to
DetectCRISocket() and as initcfg.NodeRegistration is discarded, setting
whatever value here is harmless.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
We have an existing helper function for this: runtime.SerializerInfoForMediaType()
This is common prep-work for encoding runtime.Objects into JSON/YAML for transmission over the wire or writing to ComponentConfigs.
Add ResetClusterStatusForNode() that clears a certain
control-plane node's APIEndpoint from the ClusterStatus
key in the kubeadm ConfigMap on "kubeadm reset".
- move most unrelated to phases output to klog.V(1)
- rename some prefixes for consistency - e.g.
[kubelet] -> [kubelet-start]
- control-plane-prepare: print details for each generated CP
component manifest.
- uppercase the info text for all "[reset].." lines
- modify the text for one line in reset
This package contains public/private key utilities copied directly from
client-go/util/cert. All imports were updated.
Future PRs will actually refactor the libraries.
Updates #71004
Currently kubeadm supports a couple of configuration versions - v1alpha3 and
v1beta1. The former is deprecated, but still supported.
To discourage users from using it and to speedup conversion to newer versions,
we disable the loading of deprecated configurations by all kubeadm
sub-commands, but "kubeadm config migrate".
v1alpha3 is still present and supported at source level, but cannot be used
directly with kubeadm and some of its internal APIs.
The added benefit to this is, that users won't need to lookup for an old
kubeadm binary after upgrade, just because they were stuck with a deprecated
config version for too long.
To achieve this, the following was done:
- ValidateSupportedVersion now has an allowDeprecated boolean parameter, that
controls if the function should return an error upon detecting deprecated
config version. Currently the only deprecated version is v1alpha3.
- ValidateSupportedVersion is made package private, because it's not used
outside of the package anyway.
- BytesToInitConfiguration and LoadJoinConfigurationFromFile are modified to
disallow loading of deprecated kubeadm config versions. An error message,
that points users to kubeadm config migrate is returned.
- MigrateOldConfig is still allowed to load deprecated kubeadm config versions.
- A bunch of tests were fixed to not expect success if v1alpha3 config is
supplied.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
MigrateOldConfigFromFile is a function, whose purpose is to migrate one config
into another. It is working OK for now, but it has some issues:
- It is incredibly inefficient. It can reload and re-parse a single config file
for up to 3 times.
- Because of the reloads, it has to take a file containing the configuration
(not a byte slice as most of the rest config functions). However, it returns
the migrated config in a byte slice (rather asymmetric from the input
method).
- Due to the above points it's difficult to implement a proper interface for
deprecated kubeadm config versions.
To fix the issues of MigrateOldConfigFromFile, the following is done:
- Re-implement the function by removing the calls to file loading package
public APIs and replacing them with newly extracted package private APIs that
do the job with pre-provided input data in the form of
map[GroupVersionKind][]byte.
- Take a byte slice of the input configuration as an argument. This makes the
function input symmetric to its output. Also, it's now renamed to
MigrateOldConfig to represent the change from config file path as an input
to byte slice.
- As a bonus (actually forgotten from a previous change) BytesToInternalConfig
is renamed to the more descriptive BytesToInitConfiguration.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Remove the deprecated `scheduler.alpha.kubernetes.io/critical-pod` pod annotation and use
the `priorityClassName` first class attribute instead, setting all master components to
`system-cluster-critical`.
Add test files that exclude the field in question
under KubeletConfiguration -> evictionHard for non-Linux.
Add runtime abstraction for the test files in initconfiguration_tests.go
Currently ConfigFileAndDefaultsToInternalConfig and
FetchConfigFromFileOrCluster are used to default and load InitConfiguration
from file or cluster. These two APIs do a couple of completely separate things
depending on how they were invoked. In the case of
ConfigFileAndDefaultsToInternalConfig, an InitConfiguration could be either
defaulted with external override parameters, or loaded from file.
With FetchConfigFromFileOrCluster an InitConfiguration is either loaded from
file or from the config map in the cluster.
The two share both some functionality, but not enough code. They are also quite
difficult to use and sometimes even error prone.
To solve the issues, the following steps were taken:
- Introduce DefaultedInitConfiguration which returns defaulted version agnostic
InitConfiguration. The function takes InitConfiguration for overriding the
defaults.
- Introduce LoadInitConfigurationFromFile, which loads, converts, validates and
defaults an InitConfiguration from file.
- Introduce FetchInitConfigurationFromCluster that fetches InitConfiguration
from the config map.
- Reduce, when possible, the usage of ConfigFileAndDefaultsToInternalConfig by
replacing it with DefaultedInitConfiguration or LoadInitConfigurationFromFile
invocations.
- Replace all usages of FetchConfigFromFileOrCluster with calls to
LoadInitConfigurationFromFile or FetchInitConfigurationFromCluster.
- Delete FetchConfigFromFileOrCluster as it's no longer used.
- Rename ConfigFileAndDefaultsToInternalConfig to
LoadOrDefaultInitConfiguration in order to better describe what the function
is actually doing.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
systemd is the recommended driver as per the setup of running
the kubelet using systemd as the init system. Add a preflight
check that throws a warning if this isn't the case.
Currently JoinConfigFileAndDefaultsToInternalConfig is doing a couple of
different things depending on its parameters. It:
- loads a versioned JoinConfiguration from an YAML file.
- returns defaulted JoinConfiguration allowing for some overrides.
In order to make code more manageable, the following steps are taken:
- Introduce LoadJoinConfigurationFromFile, which loads a versioned
JoinConfiguration from an YAML file, defaults it (both dynamically and
statically), converts it to internal JoinConfiguration and validates it.
- Introduce DefaultedJoinConfiguration, which returns defaulted (both
dynamically and statically) and verified internal JoinConfiguration.
The possibility of overwriting defaults via versioned JoinConfiguration is
retained.
- Re-implement JoinConfigFileAndDefaultsToInternalConfig to use
LoadJoinConfigurationFromFile and DefaultedJoinConfiguration.
- Replace some calls to JoinConfigFileAndDefaultsToInternalConfig with calls to
either LoadJoinConfigurationFromFile or DefaultedJoinConfiguration where
appropriate.
- Rename JoinConfigFileAndDefaultsToInternalConfig to the more appropriate name
LoadOrDefaultJoinConfiguration.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
DetectUnsupportedVersion is somewhat uncomfortable, complex and inefficient
function to use. It takes an entire YAML document as bytes, splits it up to
byte slices of the different YAML sub-documents and group-version-kinds and
searches through those to detect an unsupported kubeadm config. If such config
is detected, the function returns an error, if it is not (i.e. the normal
function operation) everything done so far is discarded.
This could have been acceptable, if not the fact, that in all cases that this
function is called, the YAML document bytes are split up and an iteration on
GVK map is performed yet again. Hence, we don't need DetectUnsupportedVersion
in its current form as it's inefficient, complex and takes only YAML document
bytes.
This change replaces DetectUnsupportedVersion with ValidateSupportedVersion,
which takes a GroupVersion argument and checks if it is on the list of
unsupported config versions. In that case an error is returned.
ValidateSupportedVersion relies on the caller to read and split the YAML
document and then iterate on its GVK map checking if the particular
GroupVersion is supported or not.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
For historical reasons InitConfiguration is used almost everywhere in kubeadm
as a carrier of various configuration components such as ClusterConfiguration,
local API server endpoint, node registration settings, etc.
Since v1alpha2, InitConfiguration is meant to be used solely as a way to supply
the kubeadm init configuration from a config file. Its usage outside of this
context is caused by technical dept, it's clunky and requires hacks to fetch a
working InitConfiguration from the cluster (as it's not stored in the config
map in its entirety).
This change is a small step towards removing all unnecessary usages of
InitConfiguration. It reduces its usage by replacing it in some places with
some of the following:
- ClusterConfiguration only.
- APIEndpoint (as local API server endpoint).
- NodeRegistrationOptions only.
- Some combinations of the above types, or if single fields from them are used,
only those field.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
In order to allow for a smoother UX with CRIs different than Docker, we have to
make the --cri-socket command line flag optional when just one CRI is
installed.
This change does that by doing the following:
- Introduce a new runtime function (DetectCRISocket) that will attempt to
detect a CRI socket, or return an appropriate error.
- Default to using the above function if --cri-socket is not specified and
CRISocket in NodeRegistrationOptions is empty.
- Stop static defaulting to DefaultCRISocket. And rename it to
DefaultDockerCRISocket. Its use is now narrowed to "Docker or not"
distinguishment and tests.
- Introduce AddCRISocketFlag function that adds --cri-socket flag to a flagSet.
Use that in all commands, that support --cri-socket.
- Remove the deprecated --cri-socket-path flag from kubeadm config images pull
and deprecate --cri-socket in kubeadm upgrade apply.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
This avoids ending in a wrong cluster state by assuming that the
present certificates will work. It is specially important when we
are growing etcd from 1 member to 2, in which case in case of failure
upon joining etcd will be unavailable.
When the etcd cluster grows we need to explicitly wait for it to be
available. This ensures that we are not implicitly doing this in
following steps when they try to access the apiserver.
It may happen that both the git version and the remote version
are broken/inaccessible. In this case the broken remote version
would be used.
To overcome this situation fall back to the constant CurrentKubernetesVersion.
The alternative could be os.Exit(1).
Also this change fixes Bazel-based unit tests in air-gapped environment.
This is a minor cleanup which helps to make the code of kubeadm a bit
less error-prone by reducing the scope of local variables and
unexporting functions that are not meant to be used outside of their
respective modules.