- During "upgrade apply" call a new function AddNewControlPlaneTaint()
that finds all nodes with the new "control-plane" node-role label
and adds the new "control-plane" taint to them.
- The function is called in "apply" and is separate from
the step to remove the old "master" label for better debugging
if errors occur.
- Apply "control-plane" taint during init/join by adding the
taint in SetNodeRegistrationDynamicDefaults(). The old
taint "master" is still applied.
- Clarify API docs (v1beta2 and v1beta3) for nodeRegistration.Taint
to not mention "master" taint and be more generic. Remove
example for taints that includes the word "master".
- Update unit tests.
- Update the markcontrolplane phase used by init and join to
only label the nodes with the new control plane label.
- Cleanup TODOs about the old label.
- Remove outdated comment about selfhosting in staticpod/utils.go.
Selfhosting has not been supported in kubeadm for a while
and the comment also mentions the "master" label.
- Update unit tests.
- Rename the function in postupgrade.go to better reflect
what is being done.
- During "upgrade apply" find all nodes with the old label
and remove it by calling PatchNode.
- Update health check for CP nodes to not track "master"
labeled nodes. At this point all CP nodes should have
"control-plane" and we can use that selector only.
- Throw an error if there is more than one known socket on the host.
- Remove the special handling for docker+containerd.
- Remove the local instances of constants for endpoints for
Windows / Unix and use the defaultKnownCRISockets variable
which is populated from OS specific constants.
- Update error message in detectCRISocketImpl to have more
details.
- Make detectCRISocketImpl accept a list of "known" sockets
- Update unit tests for detectCRISocketImpl and make them
use generic paths such as "unix:///foo/bar.sock".
Change the default container runtime CRI socket endpoint to the
one of containerd. Previously it was the one for Docker
- Rename constants.DefaultDockerCRISocket to DefaultCRISocket
- Make the constants files include the endpoints for all supported
container runtimes for Unix/Windows.
- Update unit tests related to docker runtime testing.
- In kubelet/flags.go hardcode the legacy docker socket as a check
to allow kubeadm 1.24 to run against kubelet 1.23 if the user
explicitly sets the criSocket field to "npipe:////./pipe/dockershim"
on Windows or "unix:///var/run/dockershim.sock" on Linux.
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
if klog.V(5).Enabled() {
klog.Info("hello world")
}
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
The API was deprecated in 1.23 when output/v1alpha2 was
added. v1alpha1 is problematic since it embeds kubeadm/v1beta2
BootstrapToken related types directly. v1alpha2 imports
a new group dedicated to bootstrap tokens apis/bootstraptoken.
cli.Run was an attempt to elliminate error handling in Kubernetes
commands. However, it had to rely on heuristics that are not necessarily right
for all commands.
kubectl is one example which has its own error printing code that should be
used in all cases after a command failure. It now gets used also for
`--warnings-as-errors`. Previously, that caused the following message to be
logged at the end:
E0110 16:56:01.987555 202060 run.go:120] "command failed" err="1 warning received"
Now it ends with:
error: 1 warning received
crictl already works with the current state of dockershim.
Using the docker CLI is not required and the DockerRuntime
can be removed from kubeadm. This means that crictl
can connect at the dockershim (or cri-dockerd) socket and
be used to list containers, pull images, remove containers, and
all actions that the kubelet can otherwise perform with the socket.
Ensure that crictl is now required for all supported container runtimes
in checks.go. In the help text in waitcontrolplane.go show only
the crictl example.
Remove the check for the docker service from checks.go.
Remove the DockerValidor check from checks.go.
These two checks were special casing Docker as CR and compensating
for the lack of the same checks in dockershim. With the
extraction of dockershim to cri-dockerd, ideally cri-dockerd
should perform the required checks whether it can support
a given Docker config / version running on a host.
During "upgrade node" and "upgrade apply" read the
kubelet env file from /var/lib/kubelet/kubeadm-flags.env
patch the --container-runtime-endpoint flag value to
have the appropriate URL scheme prefix (e.g. unix:// on Linux)
and write the file back to disk.
This is a temporary workaround that should be kept only for 1 release
cycle - i.e. remove this in 1.25.
The CRI socket that kubeadm writes as an annotation
on a particular Node object can include an endpoint that
does not have an URL scheme. This is undesired as long term
the kubelet can stop allowing endpoints without URL scheme.
For control plane nodes "kubeadm upgrade apply" takes
the locally defaulted / populated NodeRegistration and refreshes
the CRI socket in PerformPostUpgradeTasks. But for secondary
nodes "kubeadm upgrade node" does not.
Adapt "upgrade node" to fetch the NodeRegistration for this node
and fix the CRI socket missing URL scheme if needed in the Node
annotation.
- Update defaults for v1beta2 and 3 to have URL scheme
- Raname DefaultUrlScheme to DefaultContainerRuntimeURLScheme
- Prepend a missing URL scheme to user sockets and warn them
that this might not be supported in the future
- Update socket validation to exclude IsAbs() testing
(This is broken on Windows). Assume the path is not empty and has
URL scheme at this point (validation happens after defaulting).
- Use net.Dial to open Unix sockets
- Update all related unit tests
Signed-off-by: pacoxu <paco.xu@daocloud.io>
Signed-off-by: Lubomir I. Ivanov <lubomirivanov@vmware.com>
Currently when the dockershim socket is used, kubeadm only passes
the --network-plugin=cni to the kubelet and assumes the built-in
dockershim. This is valid for versions <1.24, but with dockershim
and related flags removed the kubelet will fail.
Use preflight.GetKubeletVersion() to find the version of the host
kubelet and if the version is <1.24 assume that it has built-in
dockershim. Newer versions should will be treated as "remote" even
if the socket is for dockershim, for example, provided by cri-dockerd.
Update related unit tests.
Since we removed dockershim we now rely on both flags, which therefore
should not marked experimental any more.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Apply a small fix to ensure the kubeconfig files
that kubeadm manages have a CA when printed in the table
of the "check expiration" command. "CAName" is the field used for that.
In practice kubeconfig files can contain multiple credentials
from different CAs, but this is not supported by kubeadm and there
is a single cluster CA that signs the single client cert/key
in kubeadm managed kubeconfigs.
In case stacked etcd is used, the code that does expiration checks
does not validate if the etcd CA is "external" (missing key)
and if the etcd CA signed certificates are valid.
Add a new function UsingExternalEtcdCA() similar to existing functions
for the cluster CA and front-proxy CA, that performs the checks for
missing etcd CA key and certificate validity.
This function only runs for stacked etcd, since if etcd is external
kubeadm does not track any certs signed by that etcd CA.
This fixes a bug where the etcd CA will be reported as local even
if the etcd/ca.key is missing during "certs check-expiration".
When the "kubeadm certs check-expiration" command is used and
if the ca.key is not present, regular on disk certificate reads
pass fine, but fail for kubeconfig files. The reason for the
failure is that reading of kubeconfig files currently
requires reading both the CA key and cert from disk. Reading the CA
is done to ensure that the CA cert in the kubeconfig is not out of date
during renewal.
Instead of requiring both a CA key and cert to be read, only read
the CA cert from disk, as only the cert is needed for kubeconfig files.
This fixes printing the cert expiration table even if the ca.key
is missing on a host (i.e. the CA is considered external).
If done too soon, the klog.V() calls are ignored because the log verbosity
isn't set. In Kubernetes 1.22, the verbosity was set, but not the logging
format.
Don't use a custom dialer for the kubelet if is not rotating
certificates, so we can reuse TCP connections because we don't need
a customer dialer.
Kubelet needs to be able to recover from stale http connections.
HTTP2 has a mechanism to detect broken connections by sending periodical pings.
HTTP1 only can have one persistent connection, and it will close all Idle connections
once the Kubelet heartbet fails. However, since there are many edge cases that we can't
control, users can still opt-in to the previous behavior for closing the connections by
setting the environment variable DISABLE_HTTP2.
kubeam:node-proxier -> kubeadm:node-proxier
This causes e2e test failures:
"[area-kubeadm] proxy addon kube-proxy ServiceAccount should
be bound to the system:node-proxier cluster role"
in:
- kubeadm-kinder-latest
- kubeadm-kinder-latest-on-...
- other tests
Before this commit when setting bindAddress to 1.2.3.4 the warning was:
The recommended value for "bindAddress" in "KubeProxyConfiguration" is: 1.2.3.4; the provided value is: 0.0.0.0
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
Signed-off-by: wangyysde <net_use@bzhy.com>
Generation swagger.json.
Use v2 path for hpa_cpu_field.
run update-codegen.sh
Signed-off-by: wangyysde <net_use@bzhy.com>
Add the UnversionedKubeletConfigMap feature gate that can
be used to control legacy vs new behavior for naming the
default configmap used to store the KubeletConfiguration.
Update related unit tests.
This PR removes Serve function and uses all required places
ServeWithListenerStopped which takes place new Serve function.
This function returns ListenerStopped channel can be used to drain
requests before shutting down the server.
Instead of the individual error and return, it's better to aggregate all
the errors so that we can fix them all at once.
Take the chance to fix some comments, since kubeadm are not checking that
the certs are equal across controlplane.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The addition of output/v1alpha2 made the converter-gen require
an explicit converter for:
kubeadm/v1beta2.BootstrapToken -> bootstraptoken/v1.BootstrapToken.
Add this converter under kubeadm/v1beta.
Use the converter in output/v1alpha1.
In various places log messages where emitted as part of validation or even
before it (for example, cli.PrintFlags). Those log messages did not use the
final logging configuration, for example text output instead of JSON or not the
final verbosity. The last point became more obvious after moving the setup of
verbosity into logs.Options.Apply because PrintFlags never printed anything
anymore.
In order to force applications to deal with logging as soon as possible, the
Options.Validate and Options.Apply methods are now private. Applications should
use the new Options.ValidateAndApply directly after parsing.
These three options are the ones from logs.AddFlags which are not deprecated.
Therefore it makes sense to make them available also via the configuration file
support in the one command which currently supports that (kubelet).
Long-term, all commands should use LoggingConfiguration, either with a
configuration file (as in kubelet) or via flags (kube-scheduler,
kube-apiserver, kube-controller-manager).
Short-term, both approaches have to be supported. As the majority of the
commands only use logs.AddFlags, that function by default continues to register
the flags and only leaves that to Options.AddFlags when explicitly requested.
A drive-by bug fix is done for log flushing: the periodic flushing called
klog.Flush and therefore missed explicit flushing of the newer logr
backend. This bug was never present in any release Kubernetes and therefore the
fix is not submitted in a separate PR.
This feature has graduated to GA in v1.11 and will always be
enabled. So no longe need to check if enabled.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
* De-share the Handler struct in core API
An upcoming PR adds a handler that only applies on one of these paths.
Having fields that don't work seems bad.
This never should have been shared. Lifecycle hooks are like a "write"
while probes are more like a "read". HTTPGet and TCPSocket don't really
make sense as lifecycle hooks (but I can't take that back). When we add
gRPC, it is EXPLICITLY a health check (defined by gRPC) not an arbitrary
RPC - so a probe makes sense but a hook does not.
In the future I can also see adding lifecycle hooks that don't make
sense as probes. E.g. 'sleep' is a common lifecycle request. The only
option is `exec`, which requires having a sleep binary in your image.
* Run update scripts
* Updated non idle logging time
* Update cmd/kubelet/app/options/options.go
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
The new handler is meant to be executed at the end of the delegation chain.
It simply checks if the request have been made before the server has installed all known HTTP paths.
In that case it returns a 503 response otherwise it returns a 404.
We don't want to add additional checks to the readyz path as it might prevent fixing bricked clusters.
This specific handler is meant to "protect" requests that arrive before the paths and handlers are fully initialized.
The feature gate gets locked to "true", with the goal to remove it in two
releases.
All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.
Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
The recommendation from #sig-cli was to print usage, then the error. Extra care
is taken to only print the usage instruction when the error really was about
flag parsing.
Taking kube-scheduler as example:
$ _output/bin/kube-scheduler
I0929 09:42:42.289039 149029 serving.go:348] Generated self-signed cert in-memory
...
W0929 09:42:42.489255 149029 client_config.go:620] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
E0929 09:42:42.489366 149029 run.go:98] "command failed" err="invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable"
$ _output/bin/kube-scheduler --xxx
Usage:
kube-scheduler [flags]
...
--vmodule moduleSpec
comma-separated list of pattern=N settings for file-filtered logging
Error: unknown flag: --xxx
The kubectl behavior doesn't change:
$ _output/bin/kubectl get nodes
Unable to connect to the server: dial tcp: lookup xxxx: No address associated with hostname
$ _output/bin/kubectl --xxx
Error: unknown flag: --xxx
See 'kubectl --help' for usage.
All Kubernetes commands should show flags with hyphens in their help text even
when the flag originally was defined with underscore. Converting a command to
this style is not breaking its command line API because the old-style parameter
with underscore is accepted as alias.
The easiest solution to achieve this is to set normalization shortly before
running the command in the new central cli.Run or the few places where that
function isn't used yet.
There may be some texts which depends on normalization at flag definition time,
like the --logging-format usage warning. Those get generated assuming that
hyphens will be used.
It wasn't documented that InitLogs already uses the log flush frequency, so
some commands have called it before parsing (for example, kubectl in the
original code for logs.go). The flag never had an effect in such commands.
Fixing this turned into a major refactoring of how commands set up flags and
run their Cobra command:
- component-base/logs: implicitely registering flags during package init is an
anti-pattern that makes it impossible to use the package in commands which
want full control over their command line. Logging flags must be added
explicitly now, something that the new cli.Run does automatically.
- component-base/logs: AddFlags would have crashed in kubectl-convert if it
had been called because it relied on the global pflag.CommandLine. This
has been fixed and kubectl-convert now has the same --log-flush-frequency
flag as other commands.
- component-base/logs/testinit: an exception are tests where flag.CommandLine has
to be used. This new package can be imported to add flags to that
once per test program.
- Normalization of the klog command line flags was inconsistent. Some commands
unintentionally didn't normalize to the recommended format with hyphens. This
gets fixed for sample programs, but not for production programs because
it would be a breaking change.
This refactoring has the following user-visible effects:
- The validation error for `go run ./cmd/kube-apiserver --logging-format=json
--add-dir-header` now references `add-dir-header` instead of `add_dir_header`.
- `staging/src/k8s.io/cloud-provider/sample` uses flags with hyphen instead of
underscore.
- `--log-flush-frequency` is not listed anymore in the --logging-format flag's
`non-default formats don't honor these flags` usage text because it will also
work for non-default formats once it is needed.
- `cmd/kubelet`: the description of `--logging-format` uses hyphens instead of
underscores for the flags, which now matches what the command is using.
- `staging/src/k8s.io/component-base/logs/example/cmd`: added logging flags.
- `apiextensions-apiserver` no longer prints a useless stack trace for `main`
when command line parsing raises an error.
The API is a copy of output/v1alpha1 with a minor difference
where output/v1alpha2.BootstrapToken embeds
bootstraptoken/v1.BootstrapToken instead of
kubeadm/v1beta2.BootstrapToken.
Embedding the later is an undesired binding between the "kubeadm"
and "output" groups, preventing the eventual deprecation and removal
of the kubeadm.v1beta2 API.
This new output API version, unlike v1alpha1, does not include
defaulting which is not needed.
Because the proxy.Provider interface included
proxyconfig.EndpointsHandler, all the backends needed to
implement its methods. But iptables, ipvs, and winkernel implemented
them as no-ops, and metaproxier had an implementation that wouldn't
actually work (because it couldn't handle Services with no active
Endpoints).
Since Endpoints processing in kube-proxy is deprecated (and can't be
re-enabled unless you're using a backend that doesn't support
EndpointSlice), remove proxyconfig.EndpointsHandler from the
definition of proxy.Provider and drop all the useless implementations.
Due to a cut-and-paste error in the original implementation in Kubernetes 1.19,
support for generic ephemeral inline volumes in the PVC protection controller
was incorrectly tied to the "storage object in use" feature gate.
The configuration is deprecated and targets removal for v1.23. Tests
cases have been changed as well.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
With this commit kube-proxy accepts current system values (retrieved by sysctl) which are higher than the internally known and expected values.
The code change was mistakenly created as PR in the k3s project (see https://github.com/k3s-io/k3s/pull/3505).
A real life use case is described in Rancher issue https://github.com/rancher/rancher/issues/33360.
When Kubernetes runs on a Node which itself is a container (e.g. LXC), and the value is changed on the (LXC) host, kube-proxy then fails at the next start as it does not recognize the current value and attempts to overwrite the current value with the previously known one. This result in:
```
I0624 07:38:23.053960 54 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_max' to 524288
F0624 07:38:23.053999 54 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied
```
However a sysctl overwrite only makes sense if the current value is lower than the previously known and expected value. If the value was increased on the host, that shouldn't really bother kube-proxy and just go on with it.
Signed-off-by: Claudio Kuenzler ck@claudiokuenzler.com
* This allows a controller to use cloud provider managed RBAC
when --use-service-account-credentials is set.
* Create ControllerInitFuncConstructor to pass to init funcs to avoid
future function signature growth.
* Add comments for context around legacy naming of node controllers.
* Add example for setting client names from cloud controller manager.
Panicing if not running in a test and if the component-base/version
variables are empty is not ideal. At some point sections
of kubeadm could be exposed as a library and if these sections
import the constants package, they would panic on the library
users unless they set the version information in component-base
with ldflags.
Instead:
- If the component-base version is empty, return a placeholder version
that should indicate to users that build kubeadm that something is not
right (e.g. they did not use 'make'). During library usage or unit
tests this version should not be relevant.
- Update unit tests to use hardcoded versions instead of the versions
from the constants package. Using the constants package for testing
is good but during unit tests these versions are already placeholders
since unit tests do not populate the actual component-base versions
(e.g. 1.23).
Tests under /app and /test would fail if the current/minimum k8s version
is dynamically populated from the version in the kubeadm binary.
Adapt the tests to support that.
Kubeadm requires manual version updates of its current supported k8s
control plane version and minimally supported k8s control plane and
kubelet versions every release cycle.
To avoid that, in constants.go:
- Add the helper function getSkewedKubernetesVersion() that can be
used to retrieve a MAJOR.(MINOR+n).0 version of k8s. It currently
uses the kubeadm version populated in "component-base/version" during
the kubeadm build process.
- Use the function to set existing version constants (variables).
Update util/config/common.go#NormalizeKubernetesVersion() to
tolerate the case where a k8s version in the ClusterConfiguration
is too old for the kubeadm binary to use during code freeze.
Include unit tests for the new utilities.
This change optimizes the kubeadm/etcd `AddMember` client-side function
by stopping early in the backoff loop when a peer conflict is found
(indicating the member has already been added to the etcd cluster). In
this situation, the function will stop early and relay a call to
`ListMembers` to fetch the current list of members to return. With this
optimization, front-loading a `ListMembers` call is no longer necessary,
as this functionally returns the equivalent response.
This helps reduce the amount of time taken in situational cases where an
initial client request to add a member is accepted by the server, but
fails client-side.
This situation is possible situationally, such as if network latency
causes the request to timeout after it was sent and accepted by the
cluster. In this situation, the following loop would occur and fail with
an `ErrPeerURLExist` response, and would be stuck until the backoff
timeout was met (roughly ~2min30sec currently).
Testing Done:
* Manual testing with an etcd cluster. Initial "AddMember` call was
successful, and the etcd manifest file was identical to prior version
of these files. Subsequent calls to add the same member succeeded
immediately (retaining idempotency), and the resulting manifest file
remains identical to previous version as well. The difference, this
time, is the call finished ~2min25sec faster in an identical test in
the environment tested with.
The purell package at github.com/PuerkitoBio/purell is no longer maintained and in k/k repo under kubeadm package its been used for normalizing the URL. This commit removes the dependency on this package and creates a local function for normalizing the URL within the preflight package under cmd/kubeadm.
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
chore: add new line at end of the file
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
fix: remove unused mod from vendor modules file
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
The CPUManagerPolicyOptions received from the kubelet config/command line args
is propogated to the Container Manager.
We defer the consumption of the options to a later patch(set).
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
In this patch we enhance the kubelet configuration to support
cpuManagerPolicyOptions.
In order to introduce SMT-awareness in CPU Manager, we introduce a
new flag in Kubelet to allow the user to specify an additional flag
called `cpumanager-policy-options` to allow the user to modify the
behaviour of static policy to strictly guarantee allocation of whole
core.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
During operations such as "upgrade", kubeadm fetches the
ClusterConfiguration object from the kubeadm ConfigMap.
However, due to requiring node specifics it wraps it in an
InitConfiguration object. The function responsible for that is:
app/util/config#FetchInitConfigurationFromCluster().
A problem with this function (and sub-calls) is that it ignores
the static defaults applied from versioned types
(e.g. v1beta3/defaults.go) and only applies dynamic defaults for:
- API endpoints
- node registration
- etc...
The introduction of Init|JoinConfiguration.ImagePullPolicy now
has static defaulting of the NodeRegistration object with a default
policy of "PullIfNotPresent". Respect this defaulting by constructing
a defaulted internal InitConfiguration from
FetchInitConfigurationFromCluster() and only then apply the dynamic
defaults over it.
This fixes a bug where "kubeadm upgrade ..." fails when pulling images
due to an empty ("") ImagePullPolicy. We could assume that empty
string means default policy on runtime in:
cmd/kubeadm/app/preflight/checks.go#ImagePullCheck()
but that might actually not be the user intent during "init" and "join",
due to e.g. a typo. Similarly, we don't allow empty tokens
on runtime and error out.
Instead of dynamically defaulting NodeRegistration.ImagePullPolicy,
which is common when doing defaulting depending on host state - e.g.
hostname, statically default it in v1beta3/defaults.go.
- Remove defaulting in checks.go
- Add one more unit test in checks_test.go
- Adapt v1beta2 conversion and fuzzer / round tripping tests
This also results in the default being visible when calling:
"kubeadm config print ...".