The warning message
```
[config] WARNING: Ignored YAML document with GroupVersionKind ...
```
is printed for all GVKs that are not part of the kubeadm core types.
This is wrong as the component config types are supported and successfully
parsed and used despite the fact that the warning is printed for them too.
Hence this simple fix first checks if the group of the GVK is a supported
component config group and the warning is printed only if it's not.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
kubeadm deploys the apiserver, controller-manager and the scheduler
using liveness probes.
The bind-address option is used to configure the probe address, in
case this is configured with an unspecified address, the probe
will fail. When using an unspecified address the probe host field is
left empty, otherwise the bind-address is used.
The function validateKubeConfig() can end up comparing
a user generated kubeconfig to a kubeconfig generated by kubeadm.
If a user kubeconfig has a CA that is base64 encoded with whitespace,
if said kubeconfig is loaded using clientcmd.LoadFromFile()
the CertificateAuthorityData bytes will be decoded from base64
and placed in the v1.Config raw. On the other hand a kubeconfig
generated by kubeadm will have the ca.crt parsed to a Certificate
object with whitespace ignored in the PEM input.
Make sure that validateKubeConfig() tolerates whitespace differences
when comparing CertificateAuthorityData.
kubeadm removed the deprecated "--address" flag for controller-manager
and scheduler in favor of "--bind-address"
We should use bind-address to configure the manifest probe addresses.
If the user has modified the kubelet.conf post TLS bootstrap
to become invalid, the function getNodeNameFromKubeletConfig() can
panic. This was observed to trigger in "kubeadm reset" use cases.
Add basic validation and unit tests around parsing the kubelet.conf
with the aforementioned function.
Currently if the controlplane fails to init, we print out a message
with some example commands that only show docker CLI.
This tries to improve that by printing the example commands for
docker, cri-o and containerd by checking the socket looking for
the default docker socket.
CreateOrMutateConfigMap was not resilient when it was trying to Create
the ConfigMap. If this operation returned an unknown error the whole
operation would fail, because it was strict in what error it was
expecting right afterwards: if the error returned by the Create call
was a IsAlreadyExists error, it would work fine. However, if an
unexpected error (such as an EOF) happened, this call would fail.
We are seeing this error specially when running control plane node
joins in an automated fashion, where things happen at a relatively
high speed pace.
It was specially easy to reproduce with kind, with several control
plane instances. E.g.:
```
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
I1130 11:43:42.788952 887 round_trippers.go:443] POST https://172.17.0.2:6443/api/v1/namespaces/kube-system/configmaps?timeout=10s in 1013 milliseconds
Post https://172.17.0.2:6443/api/v1/namespaces/kube-system/configmaps?timeout=10s: unexpected EOF
unable to create ConfigMap
k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient.CreateOrMutateConfigMap
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/util/apiclient/idempotency.go:65
```
This change makes this logic more resilient to unknown errors. It will
retry on the light of unknown errors until some of the expected error
happens: either `IsAlreadyExists`, in which case we will mutate the
ConfigMap, or no error, in which case the ConfigMap has been created.
kubeadm always use the IPv4 localhost address by defaultA for etcd
The probe hostname is obtained before the generation of the etcd
parameters, so it can't detect the right IP familiy for the
host of the probe.
This causes that with IPv6 clusters doesn't work because the probe
uses the IPv4 localhost address.
This patchs configures the right localhost address based on the used
AdvertiseAddress IP family.
- Use `schema.TypeMeta` instead of custom `struct` for VK
- More strict check on GVK after `Interpret` in `SplitYAMLDocuments`
- Adjust `Interpret` comment to include JSON
- Add retrieveValidatedConfigInfo to be able to better unit
test the function.
- Break some of the logic in RetrieveValidatedConfigInfo into
helper functions.
- Pass JoinConfiguration.Discovery to RetrieveValidatedConfigInfo
instead of JoinConfiguration.
- Use the discovery timeout per API call to fetch cluster-info
(optionally the user value can be slit in 2).
- Add detailed unit tests for retrieveValidatedConfigInfo.
kubeadm's current implementation of component config support is "kind" centric.
This has its downsides. Namely:
- Kind names and numbers can change between config versions.
Newer kinds can be ignored. Therefore, detection of a version change is
considerably harder.
- A component config can have only one kind that is managed by kubeadm.
Thus a more appropriate way to identify component configs is required.
Probably the best solution identified so far is a config group.
A group name is unlikely to change between versions, while the kind names and
structure can.
Tracking component configs by group name allows us to:
- Spot more easily config version changes and manage alternate versions.
- Support more than one kind in a config group/version.
- Abstract component configs by hiding their exact structure.
Hence, this change rips off the old kind based support for component configs
and replaces it with a group name based one. This also has the following
extra benefits:
- More tests were added.
- kubeadm now errors out if an unsupported version of a known component group
is used.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
- Add a new preflight check for upgrade that runs the pause container
with -v in a Job.
- Wait for the Job to complete and return an error after N seconds.
- Manually clean the Job because we don't have the TTL controller
enabled in kubeadm yet (it's still alpha).
Currently this uses the combined kubelet output (stdout + stderr), but this
causes parsing issues if the kubelet logs something on stderr.
Thus we ignore the entire stderr and use stdout only.
We do disable a couple of tests here. That is because the fakeexecer only
supports combined output and return a "not supported" error if `.Output()`
gets invoked thus permanently failing those.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
- Don't always print to stdout that the kubelet is starting.
instead delegate this to the callers of TryStartKubelet.
- Add a new root kubeadm init phase called "kubelet-finalize"
- Add a sub-phase to "kubelet-finalize"
called "experimental-cert-rotation"
- "cert-rotation" performs the following actions:
- tries to guess if kubelet client cert rotation is enabled
- update the kubelet.conf to use the rotatable cert/key
The PR introducing 5bb8069 got merged accidentally (the CI robot not
respecting a hold). Hence, the feedback to that PR is merged separately.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
As the hyperkube image is itself deprecated and moved out of tree, its use with
kubeadm gets deprecated too. Hence, deprecation messages will be printed when
it is used.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
ToClientSet() in kubeconfig.go creates a clientset from
the passed Config object (kubeconfig). For IP addresses
that are not reachable e.g. Get() calls for ConfigMaps
can block for a few minutes with the default timeout.
Modify the timeout to a shorter value by passing an override.
There are two writes yet only one read on a non-buffered channel that is
created locally and not passed anywhere else.
Therefore, it could leak one of its two spawned Goroutines if either:
* The provided `f` takes longer than an erroneous result from
`waiter.WaitForHealthyKubelet`, or;
* The provided `f` completes before an erroneous result from
`waiter.WaitForHealthyKubelet`.
The fix is to add a one-element buffer so that the channel write happens
for the second Goroutine in these cases, allowing it to finish and freeing
references to the now-buffered channel, letting it to be GC'd.
3993c42431 introduced the propagation of *_PROXY
host env. variables to the kube-proxy container.
To allow The NODE_NAME variable to be properly updated by the downward
API make, sure we preserve the existing variables when adding *_PROXY.
This change removes dependencies on the internal types of the kubelet and
kube-proxy component configs. Along with that defaulting and validation is
removed as well. kubeadm will display a warning, that it did not verify the
component config upon load.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Checking if the path exists before creating the volume is
problematic because the path will be created regardless
after the initial call to "kubeadm init" and once the CM Pod
is running.
Then on subsequent calls to "kubeadm init" or the "control-plane"
phase the manifest for the CM will be different.
Always mount this path, but also consider the user provided
flag override from ClusterConfiguration.
Used strings instead of bytes in the TestTokenOutput test cases as
expected output is a plain text.
This should also simplify the data representation and the test code
a bit.
The flag has been problematic and abused by users.
While perhaps its original purpose was to be able to feed
a new version of the control-plane it also made it possible
to apply modifications to the ClusterConfiguration object
in the cluster. The lack of a feature in kubeadm for reconfiguration
of running clusters resulted in users using this flag for
the same purpose.
While it works for certain scenarios like updating
a static Pod for this control-plane only, it can result in
unexpected behavior if the user has for example fed a node name
different than the host name, when originally they created this node.
kubeadm 1.16 introduced the "kustomize" feature that
is a potential replacement for this user demand.
Add warning that this flag should not be used.
This field is not used in the kubeadm code. It was brought from
cli-runtime where it's used to support complex relationship between
command line parameters, which is not present in kubeadm.
This integration test allows us to detect if a given feature gate will
panic kubeadm. This builds on the assumption that a golang panic makes
the process exit with the code 2.
These tests are not trying to check if the init process succeeds or
not, their only purpose is to ensure that the exit code of the
`kubeadm init` invocation is not 2, thus, reflecting a golang panic.
Some refactors had to be made to the test code, so we return the exit
code along with stdout and stderr.
of service subnets.
Update DNS, Cert, dry-run logic to support list of Service CIDRs.
Added unit tests for GetKubernetesServiceCIDR and updated
GetDNSIP() unit test to inclue dual-sack cases.
The firewalld monitoring code was not well tested (and not easily
testable), would never be triggered on most platforms, and was only
being taken advantage of from one place (kube-proxy), which didn't
need it anyway since it already has its own resync loop.
Since the firewalld monitoring was the only consumer of pkg/util/dbus,
we can also now delete that.
Whenever kubeadm needs to fetch its configuration from the cluster, it gets
the component configuration of all supported components (currently only kubelet
and kube-proxy). However, kube-proxy is deemed an optional component and its
installation may be skipped (by skipping the addon/kube-proxy phase on init).
When kube-proxy's installation is skipped, its config map is not created and
all kubeadm operations, that fetch the config from the cluster, are bound to
fail with "not found" or "forbidden" (because of missing RBAC rules) errors.
To fix this issue, we have to ignore the 403 and 404 errors, returned on an
attempt to fetch kube-proxy's component config from the cluster.
The `GetFromKubeProxyConfigMap` function now supports returning nil for both
error and object to indicate just such a case.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
The new etcd balancer (>3.3.14, 3.4.0) uses an asynchronous resolver for
endpoints. Without "WithBlock", the client may return before the
connection is up.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
A recent commit added warnings for KubeletConfiguration and
KubeProxyConfiguration fields that kubeadm cares about and
does not recommend the user modifying them. Kubelet's
"rotateCertificates" cannot be handled using this function
as there is not way to figure out if the user has set it explicitly to
"false". Hardcode the value to "true" and add a comment about that.
Also apply the following changes to warnDefaultComponentConfigValue()
calls:
- use a local "kind" variable that defines the Kind we are warning about.
- fix wrong paths to fields.
- replace all stray calls of os.Exit() to util.CheckError() instead
- CheckError() now checks if the klog verbosity level is >=5
and shows a stack trace of the error
- don't call klog.Fatal in version.go
It seems undesirable that Kubernetes as a system should be
blocking a node if it's Linux kernel is way too new.
If such a problem even occurs we should exclude versions from
the list of supported versions instead of blocking users
from trying e.g. the latest 7.0.0-beta kernel because our
validators are not aware of this new version.
Usage of github.com/blang/semver is not needed and
k8s.io/apimachinery/pkg/util/version should be used instead
for semantic version parsing and version comparison.
Etcd v3.3.0 added the --listen-metrics-urls flag which allows specifying
addition URLs to the already present /health and /metrics endpoints.
While /health and /metrics are enabled for URLS defined with
--listen-client-urls (v3+ ?) they do require HTTPS.
Replace the present etcdctl based liveness probe with a standard HTTP
GET v1.Probe that connects to http://127.0.0.1:2381/health.
These endpoints are not reachable from the outside and only available
for localhost connections.