Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations where found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
We now re-use the crictl tool path within the `ContainerRuntime` when
exec'ing into it. This allows introducing a convenience function to
create the crictl command and re-use it where necessary.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
This change adds 2 options for windows:
--forward-healthcheck-vip: If true forward service VIP for health check
port
--root-hnsendpoint-name: The name of the hns endpoint name for root
namespace attached to l2bridge, default is cbr0
When --forward-healthcheck-vip is set as true and winkernel is used,
kube-proxy will add an hns load balancer to forward health check request
that was sent to lb_vip:healthcheck_port to the node_ip:healthcheck_port.
Without this forwarding, the health check from google load balancer will
fail, and it will stop forwarding traffic to the windows node.
This change fixes the following 2 cases for service:
- `externalTrafficPolicy: Cluster` (default option): healthcheck_port is
10256 for all services. Without this fix, all traffic won't be directly
forwarded to windows node. It will always go through a linux node and
get forwarded to windows from there.
- `externalTrafficPolicy: Local`: different healthcheck_port for each
service that is configured as local. Without this fix, this feature
won't work on windows node at all. This feature preserves client ip
that tries to connect to their application running in windows pod.
Change-Id: If4513e72900101ef70d86b91155e56a1f8c79719
* kube-proxy cluder-cidr arg accepts comma-separated list
It is possible in dual-stack clusters to provide kube-proxy with
a comma-separated list with an IPv4 and IPv6 CIDR for pods.
update: signoff
update2: update email profile
Signed-off-by: Tyler Lloyd <Tyler.Lloyd@microsoft.com>
Signed-off-by: Tyler Lloyd <tylerlloyd928@gmail.com>
* Updating cluster-cidr comment description
Signed-off-by: Tyler Lloyd <tyler.lloyd@microsoft.com>
For the YAML examples, make the indentation consistent
by starting with a space and following with a TAB.
Also adjust the indentation of some fields to place them under
the right YAML field parent - e.g. ignorePreflightErrors
is under nodeRegistration.
All the controllers should use context for signalling termination of communication with API server. Once kcm cancels context all the cert controllers which are started via kcm should cancel the APIServer request in flight instead of hanging around.
This commit includes all the changes needed for APIServer. Instead of modifying the existing signatures for the methods which either generate or return stopChannel, we generate a context from the channel and use the generated context to be passed to the controllers which are started in APIServer. This ensures we don't have to touch APIServer dependencies.
- Modify VerifyUnmarshalStrict to use serializer/json instead
of sigs.k8s.io/yaml. In strict mode, the serializers
in serializer/json use the new sigs.k8s.io/json library
that also catches case sensitive errors for field names -
e.g. foo vs Foo. Include test case for that in strict/testdata.
- Move the hardcoded schemes to check to the side of the
caller - i.e. accept a slice of runtime.Scheme.
- Move the klog warnings outside of VerifyUnmarshalStrict
and make them the responsibility of the caller.
- Call VerifyUnmarshalStrict when downloading the configuration
from kubeadm-config or the kube-proxy or kubelet-config CMs.
This validation is useful if the user has manually patched the CMs.
The apiserver owns and manages the kubernetes.default service.
It has 3 different options to reconcile the endpoints that belong to
that service:
- None: endpoints are handled by an external party.
- MasterCount: legacy, it reconciles based on the endpoints generated
and a flag specifying the number of master on the cluster.
- Lease: default since 1.11, each apiserver writes a lease in etcd
and renews periodically, the endpoints are generated based on the
existing leases.
It seems that when the default was set for the lease reconciler, the
controlplane code wasn't updated and kept using the master count
reconciler.
This also starts the deprecation of the master count reconciler in
favor of the lease reconciler.
The legacy naming "kubelet-config-x.yy" is no longer the
default behavior. Rename instances in documentation and comments
of "kubelet-config-x.yy" to "kubelet-config".
- Graduate the feature gate to Beta and enable it by default.
- Pre-set the default value for UnversionedKubeletConfigMap
to "true" in test/e2e_kubeadm.
- Fix a couple of typos in "tolerate" introduced in the PR that
added the FG in 1.23.
Compare with two pointers will always show that they are different value,
so it will always print the warning message.
Signed-off-by: Dave Chen <dave.chen@arm.com>
- During "upgrade apply" call a new function AddNewControlPlaneTaint()
that finds all nodes with the new "control-plane" node-role label
and adds the new "control-plane" taint to them.
- The function is called in "apply" and is separate from
the step to remove the old "master" label for better debugging
if errors occur.
- Apply "control-plane" taint during init/join by adding the
taint in SetNodeRegistrationDynamicDefaults(). The old
taint "master" is still applied.
- Clarify API docs (v1beta2 and v1beta3) for nodeRegistration.Taint
to not mention "master" taint and be more generic. Remove
example for taints that includes the word "master".
- Update unit tests.
- Update the markcontrolplane phase used by init and join to
only label the nodes with the new control plane label.
- Cleanup TODOs about the old label.
- Remove outdated comment about selfhosting in staticpod/utils.go.
Selfhosting has not been supported in kubeadm for a while
and the comment also mentions the "master" label.
- Update unit tests.
- Rename the function in postupgrade.go to better reflect
what is being done.
- During "upgrade apply" find all nodes with the old label
and remove it by calling PatchNode.
- Update health check for CP nodes to not track "master"
labeled nodes. At this point all CP nodes should have
"control-plane" and we can use that selector only.
- Throw an error if there is more than one known socket on the host.
- Remove the special handling for docker+containerd.
- Remove the local instances of constants for endpoints for
Windows / Unix and use the defaultKnownCRISockets variable
which is populated from OS specific constants.
- Update error message in detectCRISocketImpl to have more
details.
- Make detectCRISocketImpl accept a list of "known" sockets
- Update unit tests for detectCRISocketImpl and make them
use generic paths such as "unix:///foo/bar.sock".
Change the default container runtime CRI socket endpoint to the
one of containerd. Previously it was the one for Docker
- Rename constants.DefaultDockerCRISocket to DefaultCRISocket
- Make the constants files include the endpoints for all supported
container runtimes for Unix/Windows.
- Update unit tests related to docker runtime testing.
- In kubelet/flags.go hardcode the legacy docker socket as a check
to allow kubeadm 1.24 to run against kubelet 1.23 if the user
explicitly sets the criSocket field to "npipe:////./pipe/dockershim"
on Windows or "unix:///var/run/dockershim.sock" on Linux.
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
if klog.V(5).Enabled() {
klog.Info("hello world")
}
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.
The API was deprecated in 1.23 when output/v1alpha2 was
added. v1alpha1 is problematic since it embeds kubeadm/v1beta2
BootstrapToken related types directly. v1alpha2 imports
a new group dedicated to bootstrap tokens apis/bootstraptoken.
cli.Run was an attempt to elliminate error handling in Kubernetes
commands. However, it had to rely on heuristics that are not necessarily right
for all commands.
kubectl is one example which has its own error printing code that should be
used in all cases after a command failure. It now gets used also for
`--warnings-as-errors`. Previously, that caused the following message to be
logged at the end:
E0110 16:56:01.987555 202060 run.go:120] "command failed" err="1 warning received"
Now it ends with:
error: 1 warning received
crictl already works with the current state of dockershim.
Using the docker CLI is not required and the DockerRuntime
can be removed from kubeadm. This means that crictl
can connect at the dockershim (or cri-dockerd) socket and
be used to list containers, pull images, remove containers, and
all actions that the kubelet can otherwise perform with the socket.
Ensure that crictl is now required for all supported container runtimes
in checks.go. In the help text in waitcontrolplane.go show only
the crictl example.
Remove the check for the docker service from checks.go.
Remove the DockerValidor check from checks.go.
These two checks were special casing Docker as CR and compensating
for the lack of the same checks in dockershim. With the
extraction of dockershim to cri-dockerd, ideally cri-dockerd
should perform the required checks whether it can support
a given Docker config / version running on a host.
During "upgrade node" and "upgrade apply" read the
kubelet env file from /var/lib/kubelet/kubeadm-flags.env
patch the --container-runtime-endpoint flag value to
have the appropriate URL scheme prefix (e.g. unix:// on Linux)
and write the file back to disk.
This is a temporary workaround that should be kept only for 1 release
cycle - i.e. remove this in 1.25.
The CRI socket that kubeadm writes as an annotation
on a particular Node object can include an endpoint that
does not have an URL scheme. This is undesired as long term
the kubelet can stop allowing endpoints without URL scheme.
For control plane nodes "kubeadm upgrade apply" takes
the locally defaulted / populated NodeRegistration and refreshes
the CRI socket in PerformPostUpgradeTasks. But for secondary
nodes "kubeadm upgrade node" does not.
Adapt "upgrade node" to fetch the NodeRegistration for this node
and fix the CRI socket missing URL scheme if needed in the Node
annotation.
- Update defaults for v1beta2 and 3 to have URL scheme
- Raname DefaultUrlScheme to DefaultContainerRuntimeURLScheme
- Prepend a missing URL scheme to user sockets and warn them
that this might not be supported in the future
- Update socket validation to exclude IsAbs() testing
(This is broken on Windows). Assume the path is not empty and has
URL scheme at this point (validation happens after defaulting).
- Use net.Dial to open Unix sockets
- Update all related unit tests
Signed-off-by: pacoxu <paco.xu@daocloud.io>
Signed-off-by: Lubomir I. Ivanov <lubomirivanov@vmware.com>
Currently when the dockershim socket is used, kubeadm only passes
the --network-plugin=cni to the kubelet and assumes the built-in
dockershim. This is valid for versions <1.24, but with dockershim
and related flags removed the kubelet will fail.
Use preflight.GetKubeletVersion() to find the version of the host
kubelet and if the version is <1.24 assume that it has built-in
dockershim. Newer versions should will be treated as "remote" even
if the socket is for dockershim, for example, provided by cri-dockerd.
Update related unit tests.
Since we removed dockershim we now rely on both flags, which therefore
should not marked experimental any more.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Apply a small fix to ensure the kubeconfig files
that kubeadm manages have a CA when printed in the table
of the "check expiration" command. "CAName" is the field used for that.
In practice kubeconfig files can contain multiple credentials
from different CAs, but this is not supported by kubeadm and there
is a single cluster CA that signs the single client cert/key
in kubeadm managed kubeconfigs.
In case stacked etcd is used, the code that does expiration checks
does not validate if the etcd CA is "external" (missing key)
and if the etcd CA signed certificates are valid.
Add a new function UsingExternalEtcdCA() similar to existing functions
for the cluster CA and front-proxy CA, that performs the checks for
missing etcd CA key and certificate validity.
This function only runs for stacked etcd, since if etcd is external
kubeadm does not track any certs signed by that etcd CA.
This fixes a bug where the etcd CA will be reported as local even
if the etcd/ca.key is missing during "certs check-expiration".
When the "kubeadm certs check-expiration" command is used and
if the ca.key is not present, regular on disk certificate reads
pass fine, but fail for kubeconfig files. The reason for the
failure is that reading of kubeconfig files currently
requires reading both the CA key and cert from disk. Reading the CA
is done to ensure that the CA cert in the kubeconfig is not out of date
during renewal.
Instead of requiring both a CA key and cert to be read, only read
the CA cert from disk, as only the cert is needed for kubeconfig files.
This fixes printing the cert expiration table even if the ca.key
is missing on a host (i.e. the CA is considered external).
If done too soon, the klog.V() calls are ignored because the log verbosity
isn't set. In Kubernetes 1.22, the verbosity was set, but not the logging
format.
Don't use a custom dialer for the kubelet if is not rotating
certificates, so we can reuse TCP connections because we don't need
a customer dialer.
Kubelet needs to be able to recover from stale http connections.
HTTP2 has a mechanism to detect broken connections by sending periodical pings.
HTTP1 only can have one persistent connection, and it will close all Idle connections
once the Kubelet heartbet fails. However, since there are many edge cases that we can't
control, users can still opt-in to the previous behavior for closing the connections by
setting the environment variable DISABLE_HTTP2.
kubeam:node-proxier -> kubeadm:node-proxier
This causes e2e test failures:
"[area-kubeadm] proxy addon kube-proxy ServiceAccount should
be bound to the system:node-proxier cluster role"
in:
- kubeadm-kinder-latest
- kubeadm-kinder-latest-on-...
- other tests
Before this commit when setting bindAddress to 1.2.3.4 the warning was:
The recommended value for "bindAddress" in "KubeProxyConfiguration" is: 1.2.3.4; the provided value is: 0.0.0.0
Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
Signed-off-by: wangyysde <net_use@bzhy.com>
Generation swagger.json.
Use v2 path for hpa_cpu_field.
run update-codegen.sh
Signed-off-by: wangyysde <net_use@bzhy.com>
Add the UnversionedKubeletConfigMap feature gate that can
be used to control legacy vs new behavior for naming the
default configmap used to store the KubeletConfiguration.
Update related unit tests.
This PR removes Serve function and uses all required places
ServeWithListenerStopped which takes place new Serve function.
This function returns ListenerStopped channel can be used to drain
requests before shutting down the server.
Instead of the individual error and return, it's better to aggregate all
the errors so that we can fix them all at once.
Take the chance to fix some comments, since kubeadm are not checking that
the certs are equal across controlplane.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The addition of output/v1alpha2 made the converter-gen require
an explicit converter for:
kubeadm/v1beta2.BootstrapToken -> bootstraptoken/v1.BootstrapToken.
Add this converter under kubeadm/v1beta.
Use the converter in output/v1alpha1.
In various places log messages where emitted as part of validation or even
before it (for example, cli.PrintFlags). Those log messages did not use the
final logging configuration, for example text output instead of JSON or not the
final verbosity. The last point became more obvious after moving the setup of
verbosity into logs.Options.Apply because PrintFlags never printed anything
anymore.
In order to force applications to deal with logging as soon as possible, the
Options.Validate and Options.Apply methods are now private. Applications should
use the new Options.ValidateAndApply directly after parsing.
These three options are the ones from logs.AddFlags which are not deprecated.
Therefore it makes sense to make them available also via the configuration file
support in the one command which currently supports that (kubelet).
Long-term, all commands should use LoggingConfiguration, either with a
configuration file (as in kubelet) or via flags (kube-scheduler,
kube-apiserver, kube-controller-manager).
Short-term, both approaches have to be supported. As the majority of the
commands only use logs.AddFlags, that function by default continues to register
the flags and only leaves that to Options.AddFlags when explicitly requested.
A drive-by bug fix is done for log flushing: the periodic flushing called
klog.Flush and therefore missed explicit flushing of the newer logr
backend. This bug was never present in any release Kubernetes and therefore the
fix is not submitted in a separate PR.
This feature has graduated to GA in v1.11 and will always be
enabled. So no longe need to check if enabled.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
* De-share the Handler struct in core API
An upcoming PR adds a handler that only applies on one of these paths.
Having fields that don't work seems bad.
This never should have been shared. Lifecycle hooks are like a "write"
while probes are more like a "read". HTTPGet and TCPSocket don't really
make sense as lifecycle hooks (but I can't take that back). When we add
gRPC, it is EXPLICITLY a health check (defined by gRPC) not an arbitrary
RPC - so a probe makes sense but a hook does not.
In the future I can also see adding lifecycle hooks that don't make
sense as probes. E.g. 'sleep' is a common lifecycle request. The only
option is `exec`, which requires having a sleep binary in your image.
* Run update scripts
* Updated non idle logging time
* Update cmd/kubelet/app/options/options.go
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
The new handler is meant to be executed at the end of the delegation chain.
It simply checks if the request have been made before the server has installed all known HTTP paths.
In that case it returns a 503 response otherwise it returns a 404.
We don't want to add additional checks to the readyz path as it might prevent fixing bricked clusters.
This specific handler is meant to "protect" requests that arrive before the paths and handlers are fully initialized.
The feature gate gets locked to "true", with the goal to remove it in two
releases.
All code now can assume that the feature is enabled. Tests for "feature
disabled" are no longer needed and get removed.
Some code wasn't using the new helper functions yet. That gets changed while
touching those lines.
The recommendation from #sig-cli was to print usage, then the error. Extra care
is taken to only print the usage instruction when the error really was about
flag parsing.
Taking kube-scheduler as example:
$ _output/bin/kube-scheduler
I0929 09:42:42.289039 149029 serving.go:348] Generated self-signed cert in-memory
...
W0929 09:42:42.489255 149029 client_config.go:620] error creating inClusterConfig, falling back to default config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
E0929 09:42:42.489366 149029 run.go:98] "command failed" err="invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable"
$ _output/bin/kube-scheduler --xxx
Usage:
kube-scheduler [flags]
...
--vmodule moduleSpec
comma-separated list of pattern=N settings for file-filtered logging
Error: unknown flag: --xxx
The kubectl behavior doesn't change:
$ _output/bin/kubectl get nodes
Unable to connect to the server: dial tcp: lookup xxxx: No address associated with hostname
$ _output/bin/kubectl --xxx
Error: unknown flag: --xxx
See 'kubectl --help' for usage.
All Kubernetes commands should show flags with hyphens in their help text even
when the flag originally was defined with underscore. Converting a command to
this style is not breaking its command line API because the old-style parameter
with underscore is accepted as alias.
The easiest solution to achieve this is to set normalization shortly before
running the command in the new central cli.Run or the few places where that
function isn't used yet.
There may be some texts which depends on normalization at flag definition time,
like the --logging-format usage warning. Those get generated assuming that
hyphens will be used.
It wasn't documented that InitLogs already uses the log flush frequency, so
some commands have called it before parsing (for example, kubectl in the
original code for logs.go). The flag never had an effect in such commands.
Fixing this turned into a major refactoring of how commands set up flags and
run their Cobra command:
- component-base/logs: implicitely registering flags during package init is an
anti-pattern that makes it impossible to use the package in commands which
want full control over their command line. Logging flags must be added
explicitly now, something that the new cli.Run does automatically.
- component-base/logs: AddFlags would have crashed in kubectl-convert if it
had been called because it relied on the global pflag.CommandLine. This
has been fixed and kubectl-convert now has the same --log-flush-frequency
flag as other commands.
- component-base/logs/testinit: an exception are tests where flag.CommandLine has
to be used. This new package can be imported to add flags to that
once per test program.
- Normalization of the klog command line flags was inconsistent. Some commands
unintentionally didn't normalize to the recommended format with hyphens. This
gets fixed for sample programs, but not for production programs because
it would be a breaking change.
This refactoring has the following user-visible effects:
- The validation error for `go run ./cmd/kube-apiserver --logging-format=json
--add-dir-header` now references `add-dir-header` instead of `add_dir_header`.
- `staging/src/k8s.io/cloud-provider/sample` uses flags with hyphen instead of
underscore.
- `--log-flush-frequency` is not listed anymore in the --logging-format flag's
`non-default formats don't honor these flags` usage text because it will also
work for non-default formats once it is needed.
- `cmd/kubelet`: the description of `--logging-format` uses hyphens instead of
underscores for the flags, which now matches what the command is using.
- `staging/src/k8s.io/component-base/logs/example/cmd`: added logging flags.
- `apiextensions-apiserver` no longer prints a useless stack trace for `main`
when command line parsing raises an error.
The API is a copy of output/v1alpha1 with a minor difference
where output/v1alpha2.BootstrapToken embeds
bootstraptoken/v1.BootstrapToken instead of
kubeadm/v1beta2.BootstrapToken.
Embedding the later is an undesired binding between the "kubeadm"
and "output" groups, preventing the eventual deprecation and removal
of the kubeadm.v1beta2 API.
This new output API version, unlike v1alpha1, does not include
defaulting which is not needed.
Because the proxy.Provider interface included
proxyconfig.EndpointsHandler, all the backends needed to
implement its methods. But iptables, ipvs, and winkernel implemented
them as no-ops, and metaproxier had an implementation that wouldn't
actually work (because it couldn't handle Services with no active
Endpoints).
Since Endpoints processing in kube-proxy is deprecated (and can't be
re-enabled unless you're using a backend that doesn't support
EndpointSlice), remove proxyconfig.EndpointsHandler from the
definition of proxy.Provider and drop all the useless implementations.
Due to a cut-and-paste error in the original implementation in Kubernetes 1.19,
support for generic ephemeral inline volumes in the PVC protection controller
was incorrectly tied to the "storage object in use" feature gate.
The configuration is deprecated and targets removal for v1.23. Tests
cases have been changed as well.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
With this commit kube-proxy accepts current system values (retrieved by sysctl) which are higher than the internally known and expected values.
The code change was mistakenly created as PR in the k3s project (see https://github.com/k3s-io/k3s/pull/3505).
A real life use case is described in Rancher issue https://github.com/rancher/rancher/issues/33360.
When Kubernetes runs on a Node which itself is a container (e.g. LXC), and the value is changed on the (LXC) host, kube-proxy then fails at the next start as it does not recognize the current value and attempts to overwrite the current value with the previously known one. This result in:
```
I0624 07:38:23.053960 54 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_max' to 524288
F0624 07:38:23.053999 54 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied
```
However a sysctl overwrite only makes sense if the current value is lower than the previously known and expected value. If the value was increased on the host, that shouldn't really bother kube-proxy and just go on with it.
Signed-off-by: Claudio Kuenzler ck@claudiokuenzler.com
* This allows a controller to use cloud provider managed RBAC
when --use-service-account-credentials is set.
* Create ControllerInitFuncConstructor to pass to init funcs to avoid
future function signature growth.
* Add comments for context around legacy naming of node controllers.
* Add example for setting client names from cloud controller manager.
Panicing if not running in a test and if the component-base/version
variables are empty is not ideal. At some point sections
of kubeadm could be exposed as a library and if these sections
import the constants package, they would panic on the library
users unless they set the version information in component-base
with ldflags.
Instead:
- If the component-base version is empty, return a placeholder version
that should indicate to users that build kubeadm that something is not
right (e.g. they did not use 'make'). During library usage or unit
tests this version should not be relevant.
- Update unit tests to use hardcoded versions instead of the versions
from the constants package. Using the constants package for testing
is good but during unit tests these versions are already placeholders
since unit tests do not populate the actual component-base versions
(e.g. 1.23).
Tests under /app and /test would fail if the current/minimum k8s version
is dynamically populated from the version in the kubeadm binary.
Adapt the tests to support that.
Kubeadm requires manual version updates of its current supported k8s
control plane version and minimally supported k8s control plane and
kubelet versions every release cycle.
To avoid that, in constants.go:
- Add the helper function getSkewedKubernetesVersion() that can be
used to retrieve a MAJOR.(MINOR+n).0 version of k8s. It currently
uses the kubeadm version populated in "component-base/version" during
the kubeadm build process.
- Use the function to set existing version constants (variables).
Update util/config/common.go#NormalizeKubernetesVersion() to
tolerate the case where a k8s version in the ClusterConfiguration
is too old for the kubeadm binary to use during code freeze.
Include unit tests for the new utilities.
This change optimizes the kubeadm/etcd `AddMember` client-side function
by stopping early in the backoff loop when a peer conflict is found
(indicating the member has already been added to the etcd cluster). In
this situation, the function will stop early and relay a call to
`ListMembers` to fetch the current list of members to return. With this
optimization, front-loading a `ListMembers` call is no longer necessary,
as this functionally returns the equivalent response.
This helps reduce the amount of time taken in situational cases where an
initial client request to add a member is accepted by the server, but
fails client-side.
This situation is possible situationally, such as if network latency
causes the request to timeout after it was sent and accepted by the
cluster. In this situation, the following loop would occur and fail with
an `ErrPeerURLExist` response, and would be stuck until the backoff
timeout was met (roughly ~2min30sec currently).
Testing Done:
* Manual testing with an etcd cluster. Initial "AddMember` call was
successful, and the etcd manifest file was identical to prior version
of these files. Subsequent calls to add the same member succeeded
immediately (retaining idempotency), and the resulting manifest file
remains identical to previous version as well. The difference, this
time, is the call finished ~2min25sec faster in an identical test in
the environment tested with.
The purell package at github.com/PuerkitoBio/purell is no longer maintained and in k/k repo under kubeadm package its been used for normalizing the URL. This commit removes the dependency on this package and creates a local function for normalizing the URL within the preflight package under cmd/kubeadm.
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
chore: add new line at end of the file
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
fix: remove unused mod from vendor modules file
Signed-off-by: gkarthiks <github.gkarthiks@gmail.com>
The CPUManagerPolicyOptions received from the kubelet config/command line args
is propogated to the Container Manager.
We defer the consumption of the options to a later patch(set).
Co-authored-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
In this patch we enhance the kubelet configuration to support
cpuManagerPolicyOptions.
In order to introduce SMT-awareness in CPU Manager, we introduce a
new flag in Kubelet to allow the user to specify an additional flag
called `cpumanager-policy-options` to allow the user to modify the
behaviour of static policy to strictly guarantee allocation of whole
core.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
During operations such as "upgrade", kubeadm fetches the
ClusterConfiguration object from the kubeadm ConfigMap.
However, due to requiring node specifics it wraps it in an
InitConfiguration object. The function responsible for that is:
app/util/config#FetchInitConfigurationFromCluster().
A problem with this function (and sub-calls) is that it ignores
the static defaults applied from versioned types
(e.g. v1beta3/defaults.go) and only applies dynamic defaults for:
- API endpoints
- node registration
- etc...
The introduction of Init|JoinConfiguration.ImagePullPolicy now
has static defaulting of the NodeRegistration object with a default
policy of "PullIfNotPresent". Respect this defaulting by constructing
a defaulted internal InitConfiguration from
FetchInitConfigurationFromCluster() and only then apply the dynamic
defaults over it.
This fixes a bug where "kubeadm upgrade ..." fails when pulling images
due to an empty ("") ImagePullPolicy. We could assume that empty
string means default policy on runtime in:
cmd/kubeadm/app/preflight/checks.go#ImagePullCheck()
but that might actually not be the user intent during "init" and "join",
due to e.g. a typo. Similarly, we don't allow empty tokens
on runtime and error out.
Instead of dynamically defaulting NodeRegistration.ImagePullPolicy,
which is common when doing defaulting depending on host state - e.g.
hostname, statically default it in v1beta3/defaults.go.
- Remove defaulting in checks.go
- Add one more unit test in checks_test.go
- Adapt v1beta2 conversion and fuzzer / round tripping tests
This also results in the default being visible when calling:
"kubeadm config print ...".
This change updates the CSR API to add a new, optional field called
expirationSeconds. This field is a request to the signer for the
maximum duration the client wishes the cert to have. The signer is
free to ignore this request based on its own internal policy. The
signers built-in to KCM will honor this field if it is not set to a
value greater than --cluster-signing-duration. The minimum allowed
value for this field is 600 seconds (ten minutes).
This change will help enforce safer durations for certificates in
the Kube ecosystem and will help related projects such as
cert-manager with their migration to the Kube CSR API.
Future enhancements may update the Kubelet to take advantage of this
field when it is configured in a way that can tolerate shorter
certificate lifespans with regular rotation.
Signed-off-by: Monis Khan <mok@vmware.com>
Given bootstraptoken/v1 is now a separate GV, there is no need
to duplicate the API and utilities inside v1beta3 and the internal
version.
v1beta2 must continue to use its internal copy due, since output/v1alpha1
embeds the v1beta2.BootstrapToken object. See issue 2427 in k/kubeadm.
- Make v1beta3 use bootstraptoken/v1 instead of local copies
- Make the internal API use bootstraptoken/v1
- Update validation, /cmd, /util and other packages
- Update v1beta2 conversion
Package bootstraptoken contains an API and utilities wrapping the
"bootstrap.kubernetes.io/token" Secret type to ease its usage in kubeadm.
The API is released as v1, since these utilities have been part of a
GA workflow for 10+ releases.
The "bootstrap.kubernetes.io/token" Secret type is also GA.
During "join" of new control plane machines, kubeadm would
download shared certificates and keys from the cluster stored
in a Secret. Based on the contents of an entry in the Secret,
it would use helper functions from client-go to either write
it as public key, cert (mode 644) or as a private key (mode 600).
The existing logic is always writing both keys and certs with mode 600.
Allow detecting public readable data properly and writing some files
with mode 644.
First check the data with ParsePrivateKeyPEM(); if this passes
there must be at least one private key and the file should be written
with mode 600 as private. If that fails, validate if the data contains
public keys with ParsePublicKeysPEM() and write the file as public
(mode 644).
As a result of this new logic, and given the current set of managed
kubeadm files, .key files will end up with 600, while .crt and .pub
files will end up with 644.
This change updates the backdating logic to only be applied to the
NotBefore date and not the NotAfter date when the certificate is
short lived. Thus when such a certificate is issued, it will not be
immediately expired. Long lived certificates continue to have the
same lifetime as before.
Consolidated all certificate lifetime logic into the
PermissiveSigningPolicy.policy method.
Signed-off-by: Monis Khan <mok@vmware.com>
Add {Init|Join}Configuration.Patches, which is a structure that
contains patch related options. Currently it only has the "Directory"
field which is the same option as the existing --experimental-patches
flag.
The flags --[experimental-]patches value override this value
if both a flag and config is passed during "init" or "join".
The feature of "patches" in kubeadm has been in Alpha for a few
releases. It has not received major bug reports from users.
Deprecate the --experimental-patches flag and add --patches.
Both flags are allowed to be mixed with --config.
When API Priority and Fairness is enabled, the inflight limits must
add up to something positive.
This rejects the configuration that prompted
https://github.com/kubernetes/kubernetes/issues/102885
Update help for max inflight flags
This adds the gate `SeccompDefault` as new alpha feature. Seccomp path
and field fallbacks are now passed to the helper functions, whereas unit
tests covering those code paths have been added as well.
Beside enabling the feature gate, the feature has to be enabled by the
`SeccompDefault` kubelet configuration or its corresponding
`--seccomp-default` CLI flag.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Apply suggestions from code review
Co-authored-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
If the user has not specified a pull policy we must assume a default of
v1.PullIfNotPresent.
Add some extra verbose output to help users monitor what policy is
used and what images are skipped / pulled.
Use "fallthrough" and case handle "v1.PullAlways".
Update unit test.
Update CreateInitStaticPodManifestFiles, CreateStaticPodFiles and CreateLocalEtcdStaticPodManifestFile to take into account if the command was run as dry-run.
In the Alpha stage of the feature in kubeadm to support
a rootless control plane, the allocation and assignment of
UID/GIDs to containers in the static pods will be automated.
This automation will require management of users and groups
in /etc/passwd and /etc/group.
The tools on Linux for user/group management are inconsistent
and non-standardized. It also requires us to include a number of
more dependencies in the DEB/RPMs, while complicating the UX for
non-package manager users.
The format of /etc/passwd and /etc/group is standardized.
Add code for managing (adding and deleting) a set of managed
users and groups in these files.
During Runner data initialization, if the value for the flag
"--skip-phases" was empty set the {init|join}Runner.Options.SkipPhases
to the {Init|Join}Configuration.SkipPhases value.
- Add the field SkipPhases in the public v1beta3 as a []string (omitempty)
- Add the field in the internal type
- Run generators
- Adapt v1beta2 converter for JoinConfiguration
Ideally this should be part of dockershim/CRI and not on the
side of kubeadm.
Remove the detection during:
- During preflight
- During kubelet config defaulting
Update dependencies and the test images to use pause 3.5. We also
provide a changelog entry for the new container image version.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
FeatureGate acts as a secondary switch to disable cloud-controller loops
in KCM, Kubelet and KAPI.
Provide comprehensive logging information to users, so they will be
guided in adoption of out-of-tree cloud provider implementation.
- Remove the deprecated --csr* flags "init phase certs"
- Deprecate the same flags for "certs renew".
For both cases users should be using "certs generate-csr".
The command "kubeadm config view" was deprecated in 1.19.
Remove it as scheduled in 1.22.
The replacement is to use kubectl:
kubectl get cm -n kube-system kubeadm-config -o=jsonpath="{.data.ClusterConfiguration}"
- Remove the object form v1beta3 and internal type
- Deprecate a couple of phases that were specifically designed / named to
modify the ClusterStatus object
- Adapt logic around annotation vs ClusterStatus retrieval
- Update unit tests
- Run generators
Running "go test ./cmd/kubeadm/app/..." results in these 3 files
being generated, since we have more callers to the functions
for generating unique private keys during pkiutil tests.
Add the files to ensure they are not generated locally all the time.
Kubeadm no longer supports kube-dns and CoreDNS is the only
supported DNS server. Remove ClusterConfiguration.DNS.Type
from v1beta3 that is used to set the DNS server type.
Now the following flags have no effect and would be removed in v1.24:
* `--port`
* `--address`
The insecure port flags `--port` may only be set to 0 now.
Signed-off-by: Jian Zeng <zengjian.zj@bytedance.com>
- Pin the ClusterConfiguration when fuzzing
the internal InitConfiguration that embeds it. Kubeadm includes
separate constructs for this embedding in the internal type
and this round trip is not viable.
- Remove the artificial calls to SetDefaults_ClusterConfiguration()
in v1beta{2|3}'s converters from public to internal InitConfiguration.
- Make sure the internal InitConfiguration.ClusterConfiguration is
defaulted in initconfiguration.go instead.
- scheme: switch to:
utilruntime.Must(scheme.SetVersionPriority(v1beta3.SchemeGroupVersion))
- change all imports in the code base from v1beta2 to v1beta3
- rename all import aliases for kubeadmapiv1beta2 to "kubeadmapiv".
this allows smaller diffs when changing the default public API.
it turns out that setting a timeout on HTTP client affect watch requests made by the delegated authentication component.
with a 10 second timeout watch requests are being re-established exactly after 10 seconds even though the default request timeout for them is ~5 minutes.
this is because if multiple timeouts were set, the stdlib picks the smaller timeout to be applied, leaving other useless.
for more details see a937729c2c/src/net/http/client.go (L364)
instead of setting a timeout on the HTTP client we should use context for cancellation.
The v1beta1/2 API doc.go files include an example
flag for the kubelet binary "cgroup-driver" under
"kubeletExtraArgs".
This flag is deprecated and should not be in the examples.
Add "v" instead which is one of the flags we know will
not be deprecated soon.
This is part of the "master" -> "control-plane" rename
that we missed. It's not critical for 1.21 as the
"control-plane" taint is still not added to CP nodes,
but it would be best to add the toleration preemptively
like the KEP planned.
* Removes discovery v1alpha1 API
* Replaces per Endpoint Topology with a read only DeprecatedTopology
in GA API
* Adds per Endpoint Zone field in GA API
The kubeadm documentation instructs users to set the container
runtime driver to "systemd", since kubeadm manages a kubelet via
the systemd init system. The kubelet default however is "cgroupfs".
For new clusters set the driver to "systemd" unless the user
is explicit about it. The same defaulting would not happen
during "upgrade".
Errors from staticcheck:
cmd/preferredimports/preferredimports.go:38:2:
package golang.org/x/crypto/ssh/terminal is deprecated:
this package moved to golang.org/x/term. (SA1019)
vendor/k8s.io/client-go/plugin/pkg/client/auth/exec/exec.go:36:2:
package golang.org/x/crypto/ssh/terminal is deprecated:
this package moved to golang.org/x/term. (SA1019)
vendor/k8s.io/client-go/tools/clientcmd/auth_loaders.go:26:2:
package golang.org/x/crypto/ssh/terminal is deprecated:
this package moved to golang.org/x/term. (SA1019)
Please review the above warnings. You can test via:
hack/verify-staticcheck.sh <failing package>
If the above warnings do not make sense, you can exempt the line or
file. See:
https://staticcheck.io/docs/#ignoring-problems
generated:
- hack/update-internal-modules.sh
- hack/lint-dependencies.sh
- hack/update-vendor.sh
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Pass the flag --pod-infra-container-image to the kubelet not only
for Docker but for all CRs.
This flag tells the kubelet to special case the image and not garbage
collect it.
This change updates the number of workers that the CSR signing
controllers use. If a large number of certificates (especially
short lived ones) are approved at the same time, it can take the
signing controllers a long time to process them serially. The
NewCSRSigningController logic is already go routine safe.
Signed-off-by: Monis Khan <mok@vmware.com>
Looks like there is a bit of an issue in the Bluderbuss (Prow plugin)
where it prefers to pick reviewers from a parent OWNERS files,
instead of using an approver from a current OWNERS file as
an additional reviewer.
The dependencycheck binary name was vendorcheck, which was the original
name of the tool. This updates it to dependencycheck.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
As a part of cleaning up inactive members (those with no activity within
the past 18 months) from OWNERS files, this commit moves gmarek from an
approver to an emeritus_approver.
The new flag will parse the `--reserved-memory` flag straight forward
to the []kubeletconfig.MemoryReservation variable instead of parsing
it to the middle map representation.
It gives us possibility to get rid of a lot of unneeded code and use the single
presentation for the reserved-memory.
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
Updates kubeadm version resolution to use kubernetes community infra
bucket to fetch appropriate k8s ci versions. The images are already
being pulled from the kubernetes community infra bucket meaning that a
mismatch can occur when the ci version is fetched from the google infra
bucket and the image is not yet present on k8s infra.
Follow-up to kubernetes/kubernetes#97087
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Originally raised as an issue with invalid versions to plan, but it has
been determined with air gapped environments and development versions it
is not possible to fully address that issue.
But one thing that was identified was that we can do a better job in how
we output the upgrade plan information. Kubeadm outputs the requested
version as "Latest stable version", though that may not actually be the
case. For this instance, we want to change this to "Target version" to
be a little more accurate.
Then in the component upgrade table that is emitted, the last column of
AVAILABLE isn't quite right either. Also changing this to TARGET to
reflect that this is the version we are targetting to upgrade to,
regardless of its availability.
There could be some improvements in checking available versions,
particularly in air gapped environments, to make sure we actually have
access to the requested version. But this at least clarifies some of the
output a bit.
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
we disabled the /healthz check because our test blocks one post-start
hook from finishing. Instead we should check all the other /healthz/...
endpoints before running the tests
Add DefaultedStaticInitConfiguration() which can be
used instead of DefaultedInitConfiguration() during unit tests.
The later can be slow since it performs dynamic defaulting.
Apply the label:
"node.kubernetes.io/exclude-from-external-load-balancers"
To control plane nodes to preserve backwards compatibility
with the legacy mode where "master" nodes were excluded from
LBs.
This commit replaces the CSIMigrationXXXComplete flag
with InTreePluginXXUnregister flag. This new flag will
be a superset of the CSIMigrationXXXComplete. But this
decouple the plugin unregister from CSI migration. So
if a K8s distribution want to go directly with CSI and
do not support in-tree, they can use this flag directly.
Testing:
1. Enable the InTreePluginXXUnregister and not CSIMigrationXXX,
verify that the PVC using old plugin name will have error
saying cannot find the plugin
2. Enable both the InTreePluginXXUnregister and CSIMigrationXXX
verify that the PVC using old plugin name will start to use
the migrated CSI plugin
Migrate how resource lock and leader election config is generated to new way, hidding kubeClient. This also halfs kubeClient timeout, making it an useful value.
If timeout is equal to RenewDeadline and we hit client timeout on request, there will be no retry, as RenewDeadline part will cancel the context and lose leader election. So setting a timeout to value at least equal to RenewDeadline is pointless.
Setting it as half of RenewDeadline is a heuristic to resolve this missing retry problem without adding additional parameter.
Migrate how resource lock and leader election config is generated to new
way, hidding kubeClient. This also halfs kubeClient timeout, making it
an useful value.
During upgrade the coredns migration library seems to require
that the input version doesn't have the "v" prefix".
Fixes a bug where the user cannot run commands such as
"kubeadm upgrade plan" if they have `v1.8.0` installed.
Assuming this is caused by the fact that previously the image didn't
have a "v" prefix.
Fixes an issue where some kubeadm phases fail if a certificate file
contains a certificate chain with one or more intermediate CA
certificates. The validation algorithm has been changed from requiring
that a certificate was signed directly by the root CA to requiring that
there is a valid certificate chain back to the root CA.
In kubeadm etcd join there is a a bug that exists where,
if a peer already exists in etcd, it attempts to mitigate
by continuing and generating the etcd manifest file. However,
this existing "member name" may actually be unset, causing
subsequent etcd consistency checks to fail.
This change checks if the member name is empty - if it is,
it sets the member name to the node name, and resumes.
The error messages when the user feeds an invalid discovery token CA
hash are vague. Make sure to:
- Print the list of supported hash formats (currently only "sha256").
- Wrap the error from pubKeyPins.Allow() with a descriptive message.
Implement pod resource metrics as described in KEP 1916. The new
`/metrics/resources` endpoint is exposed on the active scheduler
and reports kube_pod_resources* metrics that present the effective
requests and limits for all resources on the pods as calculated by
the scheduler and kubelet. This allows administrators using the
system to quickly perform resource consumption, reservation, and
pending utilization calculations when those metrics are read.
Because metrics calculation is on-demand, there is no additional
resource consumption incurred by the scheduler unless the endpoint
is scraped.
- Mark the "node-role.kubernetes.io/master" key for labels
and taints as deprecated.
- During "kubeadm init/join" apply the label
"node-role.kubernetes.io/control-plane" to new control-plane nodes,
next to the existing "node-role.kubernetes.io/master" label.
- During "kubeadm upgrade apply", find all Nodes with the "master"
label and also apply the "control-plane" label to them
(if they don't have it).
- During upgrade health-checks collect Nodes labeled both "master"
and "control-plane".
- Rename the constants.ControlPlane{Taint|Toleraton} to
constants.OldControlPlane{Taint|Toleraton} to manage the transition.
- Mark constants.OldControlPlane{{Taint|Toleraton} as deprecated.
- Use constants.OldControlPlane{{Taint|Toleraton} instead of
constants.ControlPlane{Taint|Toleraton} everywhere.
- Introduce constants.ControlPlane{Taint|Toleraton}.
- Add constants.ControlPlaneToleraton to the kube-dns / CoreDNS
Deployments to make them anticipate the introduction
of the "node-role.kubernetes.io/control-plane:NoSchedule"
taint (constants.ControlPlaneTaint) on kubeadm control-plane Nodes.
Aborted requests are the ones that were disrupted with http.ErrAbortHandler.
For example, the timeout handler will panic with http.ErrAbortHandler when a response to the client has been already sent
and the timeout elapsed.
Additionally, a new metric requestAbortsTotal was defined to count aborted requests. The new metric allows for aggregation for each group, version, verb, resource, subresource and scope.
without APIServerIdentity enabled, stale apiserver leases won't be GC'ed
and the same for stale storage version entries. In that case the storage
migrator won't operate correctly without manual intervention.
To make sure that the storage version filter can block certain requests until
the storage version updates are completed, and that the apiserver works
properly after the storage version updates are done.
StorageVersions are updated during apiserver bootstrap.
Also add a poststarthook to the aggregator which updates the
StorageVersions via the storageversion.Manager
Previously no timeout was set. Requests without explicit timeout might potentially hang forever and lead to starvation of the application.
When no timeout was specified a default one will be applied.
* api: structure change
* api: defaulting, conversion, and validation
* [FIX] validation: auto remove second ip/family when service changes to SingleStack
* [FIX] api: defaulting, conversion, and validation
* api-server: clusterIPs alloc, printers, storage and strategy
* [FIX] clusterIPs default on read
* alloc: auto remove second ip/family when service changes to SingleStack
* api-server: repair loop handling for clusterIPs
* api-server: force kubernetes default service into single stack
* api-server: tie dualstack feature flag with endpoint feature flag
* controller-manager: feature flag, endpoint, and endpointSlice controllers handling multi family service
* [FIX] controller-manager: feature flag, endpoint, and endpointSlicecontrollers handling multi family service
* kube-proxy: feature-flag, utils, proxier, and meta proxier
* [FIX] kubeproxy: call both proxier at the same time
* kubenet: remove forced pod IP sorting
* kubectl: modify describe to include ClusterIPs, IPFamilies, and IPFamilyPolicy
* e2e: fix tests that depends on IPFamily field AND add dual stack tests
* e2e: fix expected error message for ClusterIP immutability
* add integration tests for dualstack
the third phase of dual stack is a very complex change in the API,
basically it introduces Dual Stack services. Main changes are:
- It pluralizes the Service IPFamily field to IPFamilies,
and removes the singular field.
- It introduces a new field IPFamilyPolicyType that can take
3 values to express the "dual-stack(mad)ness" of the cluster:
SingleStack, PreferDualStack and RequireDualStack
- It pluralizes ClusterIP to ClusterIPs.
The goal is to add coverage to the services API operations,
taking into account the 6 different modes a cluster can have:
- single stack: IP4 or IPv6 (as of today)
- dual stack: IPv4 only, IPv6 only, IPv4 - IPv6, IPv6 - IPv4
* [FIX] add integration tests for dualstack
* generated data
* generated files
Co-authored-by: Antonio Ojea <aojea@redhat.com>
the controller manager should validate the podSubnet against the node-mask
because if they are incorrect can cause the controller-manager to fail.
We don't need to calculate the node-cidr-masks, because those should
be provided by the user, if they are wrong we fail in validation.
This PR adds trailing unit tests to check the service cluster IP range and
improves the code coverage of k8s.io/kubernetes/cmd/kube-apiserver/app from
5.7% to 6.2%.
1) Dual stack IPv4/IPv6
2) Invalid IPv4, IPv6 mask
3) missing IPv4, IPv6 mask
4) invalid IP address format
The tests 2, 3, 4 are suggsted by Antonio Ojea.
Currently the "generate-csr" command does not have any output.
Pass an io.Writer (bound to os.Stdout from /cmd) to the functions
responsible for generating the kubeconfig / certs keys and CSRs.
If nil is passed these functions don't output anything.
Discussion is ongoing about how to best handle dual-stack with clouds
and autodetected IPs, but there is at least agreement that people on
bare metal ought to be able to specify two explicit IPs on dual-stack
hosts, so allow that.
Deprecate the experimental command "alpha self-hosting" and its
sub-command "pivot" that can be used to create a self-hosting
control-plane from static Pods.
The kubeconfig phase of "kubeadm init" detects external CA mode
and skips the generation of kubeconfig files. The kubeconfig
handling during control-plane join executes
CreateJoinControlPlaneKubeConfigFiles() which requires the presence
of ca.key when preparing the spec of a kubeconfig file and prevents
usage of external CA mode.
Modify CreateJoinControlPlaneKubeConfigFiles() to skip generating
the kubeconfig files if external CA mode is detected.
- Modify validateCACertAndKey() to print warnings for missing keys
instead of erroring out.
- Update unit tests.
This allows doing a CP node join in a case where the user has:
- copied shared certificates to the new CP node, but not copied
ca.key files, treating the cluster CAs as external
- signed other required certificates in advance
The provided DialContext wraps existing clients' DialContext in an attempt to
preserve any existing timeout configuration. In some cases, we may replace
infinite timeouts with golang defaults.
- scaleio: tcp connect/keepalive values changed from 0/15 to 30/30
- storageos: no change
The flag was deprecated as it is problematic since it allows
overrides of the kubelet configuration that is downloaded
from the cluster during upgrade.
Kubeadm node upgrades already download the KubeletConfiguration
and store it in the internal ClusterConfiguration type. It is then
only a matter of writing that KubeletConfiguration to disk.
For external CA users that have prepared the kubeconfig files
for components, they might wish to provide a custom API server URL.
When performing validation on these kubeconfig files, instead of
erroring out on such custom URLs, show a klog Warning.
This allows flexibility around topology setup, where users
wish to make the kubeconfigs point to the ControlPlaneEndpoint instead
of the LocalAPIEndpoint.
Fix validation in ValidateKubeconfigsForExternalCA expecting
all kubeconfig files to use the CPE. The kube-scheduler and
kube-controller-manager now use LAE.
This PR specifies minimum control plane version,
kubelet version and current K8s version for v1.20.
Signed-off-by: Kommireddy Akhilesh <akhileshkommireddy2412@gmail.com>
Client side period validation of certificates should not be
fatal, as local clock skews are not so uncommon. The validation
should be left to the running servers.
- Remove this validation from TryLoadCertFromDisk().
- Add a new function ValidateCertPeriod(), that can be used for this
purpose on demand.
- In phases/certs add a new function CheckCertificatePeriodValidity()
that will print warnings if a certificate does not pass period
validation, and caches certificates that were already checked.
- Use the function in a number of places where certificates
are loaded from disk.
CNI is no longer alpha and is widely used by almost every Kubernetes cluster, we should remove the alpha warnings that were originally added from the early days of CNI
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
The isCoreDNSVersionSupported() check assumes that
there is a running kubelet, that manages the CoreDNS containers.
If the containers are being created it is not possible to fetch
their image digest. To workaround that, a poll can be used in
isCoreDNSVersionSupported() and wait for the CoreDNS Pods
are expected to be running. Depending on timing and CNI
yet to be installed this can cause problems related to
addon idempotency of "kubeadm init", because if the CoreDNS
Pods are waiting for another step they will never get running.
Remove the function isCoreDNSVersionSupported() and assume that
the version is always supported. Rely on the Corefile migration
library to error out if it must.
- Ensure the directory is created with 0700 via a new function
called CreateDataDirectory().
- Call this function in the init phases instead of the manual call
to MkdirAll.
- Call this function when joining control-plane nodes with local etcd.
If the directory creation is left to the kubelet via the
static Pod hostPath mounts, it will end up with 0755
which is not desired.
A bug was discovered in the `enforceRequirements` func for `upgrade plan`.
If a command line argument that specifies the target Kubernetes version is
supplied, the returned `ClusterConfiguration` by `enforceRequirements` will
have its `KubernetesVersion` field set to the new version.
If no version was specified, the returned `KubernetesVersion` points to the
currently installed one.
This remained undetected for a couple of reasons
- It's only `upgrade plan` that allows for the version command line argument to
be optional (in `upgrade plan` it's mandatory)
- Prior to 1.19, the implementation of `upgrade plan` did not make use of the
`KubernetesVersion` returned by `enforceRequirements`.
`upgrade plan` supports this optional command line argument to enable
air-gapped setups (as not specifying a version on the command line will end up
looking for the latest version over the Interned).
Hence, the only option is to make `enforceRequirements` consistent in the
`upgrade plan` case and always return the currently installed version in the
`KubernetesVersion` field.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Pinning the kube-controller-manager and kube-scheduler kubeconfig files
to point to the control-plane-endpoint can be problematic during
immutable upgrades if one of these components ends up contacting an N-1
kube-apiserver:
https://kubernetes.io/docs/setup/release/version-skew-policy/#kube-controller-manager-kube-scheduler-and-cloud-controller-manager
For example, the components can send a request for a non-existing API
version.
Instead of using the CPE for these components, use the LocalAPIEndpoint.
This guarantees that the components would talk to the local
kube-apiserver, which should be the same version, unless the user
explicitly patched manifests.
A check that verifies that kubeadm does not "upgrade" to an older release was
overly optimized by skipping upgrade if the new version is the same as the old
one. This somewhat makes sense, but that way changes in any of the etcd fields
in the ClusterConfiguration won't be applied if the etcd version is not
changed.
Hence, this simple change ensures that the upgrade is done even when no version
change takes place.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
As a part of cleaning up inactive members (who haven't been active since
beginning of 2019) from OWNERS files, this commit removes euank and
CaoShuFeng from the list of reviewers.
dependencycheck verifies that no violating depdendencies exist in pkgs
passed via a JSON file generated from go list.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
The implementation consists of
- identifying all places where VolumeSource.PersistentVolumeClaim has
a special meaning and then ensuring that the same code path is taken
for an ephemeral volume, with the ownership check
- adding a controller that produces the PVCs for each embedded
VolumeSource.EphemeralVolume
- relaxing the PVC protection controller such that it removes
the finalizer already before the pod is deleted (only
if the GenericEphemeralVolume feature is enabled): this is
needed to break a cycle where foreground deletion of the pod
blocks on removing the PVC, which waits for deletion of the pod
The controller was derived from the endpointslices controller.
* Creates private keys and CSR files for all the control-plane certificates
* Helps with External CA mode of kubeadm
Signed-off-by: Richard Wall <richard.wall@jetstack.io>
Back in the v1alpha2 days the fuzzer test needed to be disabled. To ensure that
there were no config breaks and everything worked correctly extensive replacement
tests were put in place that functioned as unit tests for the kubeadm config utils
as well.
The fuzzer test has been reenabled for a long time now and there's no need for
these replacements. Hence, over time most of these were disabled, deleted and
refactored. The last remnants are part of the LoadJoinConfigurationFromFile test.
The test data for those old tests remains largely unused today, but it still receives
updates as it contains kubelet's and kube-proxy's component configs. Updates to these
configs are usually done because the maintainers of those need to add a new field.
Hence, to cleanup old code and reduce maintenance burden, the last test that depends
on this test data is finally refactored and cleaned up to represent a simple unit test
of `LoadJoinConfigurationFromFile`.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Over the course of recent development of the `componentconfigs` package,
it became evident that most of the tests in this package cannot be implemented without
using a component config. As all of the currently supported component configs are
external to the kubeadm project (kubelet and kube-proxy), practically all of the tests
in this package are now dependent on external code.
This is not desirable, because other component's configs may change frequently and
without much of a notice. In particular many configs add new fields without bumping their
versions. In addition to that, some components may be deprecated in the future and many
tests may use their configs as a place holder of a component config just to test some
common functionality.
To top that, there are many tests that test the same common functionality several times
(for each different component config).
Thus a kubeadm managed replacement and a fake test environment are introduced.
The new test environment uses kubeadm's very own `ClusterConfiguration`.
ClusterConfiguration is normally not managed by the `componentconfigs` package.
It's only used, because of the following:
- It's a versioned API that is under the control of kubeadm maintainers. This enables us to test
the componentconfigs package more thoroughly without having to have full and always up to date
knowledge about the config of another component.
- Other components often introduce new fields in their configs without bumping up the config version.
This, often times, requires that the PR that introduces such new fields to touch kubeadm test code.
Doing so, requires more work on the part of developers and reviewers. When kubeadm moves out of k/k
this would allow for more sporadic breaks in kubeadm tests as PRs that merge in k/k and introduce
new fields won't be able to fix the tests in kubeadm.
- If we implement tests for all common functionality using the config of another component and it gets
deprecated and/or we stop supporting it in production, we'll have to focus on a massive test refactoring
or just continue importing this config just for test use.
Thus, to reduce maintenance costs without sacrificing test coverage, we introduce this mini-framework
and set of tests here which replace the normal component configs with a single one (`ClusterConfiguration`)
and test the component config independent logic of this package.
As a result of this, many of the older test cases are refactored and greatly simplified to reflect
on the new change as well. The old tests that are strictly tied to specific component configs
(like the defaulting tests) are left unchanged.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Kubeadm setup of kube-controller-manager and kube-scheduler is
lacking the --port=0 option which caused the component to enable
the insecure port by default and serve insecurely on the default
node interface.
Add --port=0 by default to both components. Users are still allowed
the explicitly set the flag (via extraArgs), which allows them
to override this default kubeadm behavior and enable the insecure port.
NOTE: the flag is deprecated and should be removed from kubeadm manifests
once it's removed from core.
`kubeadm config upload` is a GA command that has been deprecated and scheduled
for removal since Kubernetes 1.15 (released 06/19/2019). This change will
finally removed it in Kubernetes 1.19 (planned for August 2020).
The original command has long since been replaced by a GA init phase:
`kubeadm init phase upload-config`
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Add PatchStaticPod() in staticpod/utils.go
Apply patches to static Pods in:
- phases/controlplane/CreateStaticPodFiles()
- phases/etcd/CreateLocalEtcdStaticPodManifestFile() and
CreateStackedEtcdStaticPodManifestFile()
Add unit tests and update Bazel.
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
This change enables kubeadm upgrade plan to print a state table with
information regarding known component config API groups. Most importantly this
information includes current and preferred version for each group and an
indication if a manual user upgrade is required.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
`kubeadm upgrade plan` is using the external (currently `v1alpha1`) types of
the kubeadm output API to collect upgrade plans. This is counter intuitive
since code structure gets bound to the whatever version the output API is at.
In addition to that, the versioned API is used only in the very last stages of
a machine readable output (which is currently not implemented).
Hence, to increase flexibility and keep up with the standard Kubernetes
ecosystem practice, `kubeadm upgrade plan` is migrated to use the internal
types of the output API.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
UploadConfiguration() now always retries the underling API calls,
which can make TestUploadConfiguration run for a long time.
Remove the negative test cases, where errors are expected.
Negative test cases should be tested in app/util/apiclient,
where a short timeout / retry count should be possible for unit tests.
Currently, kubeadm would refuse to perfom an upgrade (or even planing for one)
if it detects a user supplied unsupported component config version. Hence,
users are required to manually upgrade their component configs and store them
in the config maps prior to executing `kubeadm upgrade plan` or
`kubeadm upgrade apply`.
This change introduces the ability to use the `--config` option of the
`kubeadm upgrade plan` and `kubeadm upgrade apply` commands to supply a YAML
file containing component configs to be used in place of the existing ones in
the cluster upon upgrade.
The old behavior where `--config` is used to reconfigure a cluster is still
supported. kubeadm automatically detects which behavior to use based on the
presence (or absense) of kubeadm config types (API group
`kubeadm.kubernetes.io`).
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
We were detecting the IP family that kube-proxy should use
based on the bind address, however, this is not valid when
using an unspecified address, because on those cases
kube-proxy adopts the IP family of the address reported
in the Node API object.
The IP family will be determined by the nodeIP used by the proxier
The order of precedence is:
1. config.bindAddress if bindAddress is not 0.0.0.0 or ::
2. the primary IP from the Node object, if set
3. if no IP is found it defaults to 127.0.0.1 and IPv4
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
If an etcd member with the same address already exists, don't re-add it.
Instead, use the existing member list for creating the "initial cluster"
that is written for this etcd server instance static Pod.
Component configs are used by kubeadm upgrade plan at the moment. However, they
can prevent kubeadm upgrade plan from functioning if loading of an unsupported
version of a component config is attempted. For that matter it's best to just
stop loading component configs as part of the kubeadm config load process.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Under the feature gate DefaultPodTopologySpread, which will disable the legacy DefaultPodTopologySpread plugin from the default algorithm providers.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
`getK8sVersionFromUserInput` would attempt to load the config from a user
specified YAML file (via the `--config` option of `kubeadm upgrade plan` or
`kubeadm upgrade apply`). This is done in order to fetch the `KubernetesVersion`
field of the `ClusterConfiguration`. The complete config is then immediately
discarded. The actual config that is used during the upgrade process is fetched
from within `enforceRequirements`.
This, along with the fact that `getK8sVersionFromUserInput` is always called
immediately after `enforceRequirements` makes it possible to merge the two.
Merging them would help us simplify things and avoid future problems in
component config related patches.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
Until now, users were always asked to manually convert a component config to a
version supported by kubeadm, if kubeadm is not supporting its version.
This is true even for configs generated with older kubeadm versions, hence
getting users to make manual conversions on kubeadm generated configs.
This is not appropriate and user friendly, although, it tends to be the most
common case. Hence, we sign kubeadm generated component configs stored in
config maps with a SHA256 checksum. If a configs is loaded by kubeadm from a
config map and has a valid signature it's considered "kubeadm generated" and if
a version migration is required, this config is automatically discarded and a
new one is generated.
If there is no checksum or the checksum is not matching, the config is
considered as "user supplied" and, if a version migration is required, kubeadm
will bail out with an error, requiring manual config migration (as it's today).
The behavior when supplying component configs on the kubeadm command line
does not change. Kubeadm would still bail out with an error requiring migration
if it can recognize their groups but not versions.
Signed-off-by: Rostislav M. Georgiev <rostislavg@vmware.com>
In case a malformed flag is passed to k8s components
such as "–foo", where "–" is not an ASCII dash character,
the components currently silently ignore the flag
and treat it as a positional argument.
Make k8s components/commands exit with an error if a positional argument
that is not empty is found. Include a custom error message for all
components except kubeadm, as cobra.NoArgs is used in a lot of
places already (can be fixed in a followup).
The kubelet already handles this properly - e.g.:
'unknown command: "–foo"'
This change affects:
- cloud-controller-manager
- kube-apiserver
- kube-controller-manager
- kube-proxy
- kubeadm {alpha|config|token|version}
- kubemark
Signed-off-by: Monis Khan <mok@vmware.com>
Signed-off-by: Lubomir I. Ivanov <lubomirivanov@vmware.com>