Automatic merge from submit-queue (batch tested with PRs 63138, 63091, 63201, 63341). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Enable bypassing online checks in kubeadm upgrade plan
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
**What this PR does / why we need it**:
This PR makes `kubeadm upgrade plan` a little nicer to use in an air-gapped environment. `kubeadm upgrade plan` now accepts a version and returns that instead of checking the internet.
**Which issue(s) this PR fixes**:
Fixes kubernetes/kubeadm#698
**Special notes for your reviewer**:
I also cleaned up the tests for this section of code by adding formal names for table tests and using `t.Run`.
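For readers unfamiliar with the pattern, here is a minimal sketch of a named table test driven by `t.Run`. The helper `hasVersionArg` and its cases are hypothetical and exist only to give the table something to exercise; they are not the tests touched by this PR.

```go
package main

import "testing"

// hasVersionArg is a hypothetical helper: it reports whether exactly one
// non-empty version argument was supplied on the command line.
func hasVersionArg(args []string) bool {
	return len(args) == 1 && args[0] != ""
}

func TestHasVersionArg(t *testing.T) {
	// Each case carries a formal name so failures are easy to locate.
	tests := []struct {
		name string
		args []string
		want bool
	}{
		{name: "no arguments", args: nil, want: false},
		{name: "explicit version", args: []string{"v1.10.1"}, want: true},
		{name: "empty version string", args: []string{""}, want: false},
	}
	for _, tc := range tests {
		// t.Run creates one named subtest per table entry.
		t.Run(tc.name, func(t *testing.T) {
			if got := hasVersionArg(tc.args); got != tc.want {
				t.Errorf("hasVersionArg(%v) = %v, want %v", tc.args, got, tc.want)
			}
		})
	}
}
```

With named subtests, a single case can be selected with `go test -run 'TestHasVersionArg/explicit_version'`.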
**Release note**:
```release-note
`kubeadm upgrade plan` now accepts a version, which improves the UX in air-gapped environments.
```
`kubeadm upgrade plan <version>` is now supported. If no
version is supplied then the original behavior remains.
If a version is supplied there will be no pause when figuring out
versions. Kubeadm will assume the version you pass in is the latest
stable version.
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
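The behavior described in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the kubeadm implementation: `latestStableVersion` and the `versionFetcher` type are hypothetical names standing in for the argument handling and the online lookup.

```go
package main

import (
	"errors"
	"fmt"
)

// versionFetcher is a hypothetical stand-in for the online version lookup.
type versionFetcher func() (string, error)

// latestStableVersion returns the version passed on the command line if one
// was given; only when no version is supplied does it fall back to fetch.
func latestStableVersion(args []string, fetch versionFetcher) (string, error) {
	if len(args) > 1 {
		return "", errors.New("at most one version argument is allowed")
	}
	if len(args) == 1 && args[0] != "" {
		// Assume the supplied version is the latest stable one; no network access.
		return args[0], nil
	}
	return fetch()
}

func main() {
	// Simulate an air-gapped machine where the online check always fails.
	offline := func() (string, error) { return "", errors.New("no network") }

	v, err := latestStableVersion([]string{"v1.10.1"}, offline)
	fmt.Println(v, err)

	_, err = latestStableVersion(nil, offline)
	fmt.Println(err)
}
```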
Automatic merge from submit-queue (batch tested with PRs 59965, 59115, 63076, 63059).
Prepull etcd before an upgrade
If kubeadm ever has to upgrade etcd it should prepull the image so
there is less downtime during the upgrade when etcd versions change.
Fixes kubernetes/kubeadm#669
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
**What this PR does / why we need it**:
This PR prepulls the etcd image during a `kubeadm upgrade apply`.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes kubernetes/kubeadm#669
**Special notes for your reviewer**:
constants.MasterComponents was not changed because it is used in many places where etcd neither needs to be nor should be part of this slice.
**Release note**:
```release-note
NONE
```
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews
Automatic merge from submit-queue (batch tested with PRs 62655, 61711, 59122, 62853, 62390).
Modify the kubeadm upgrade DAG for the TLS Upgrade
**What this PR does / why we need it**:
This adds the necessary utilities to detect Etcd TLS on static pods from the file system and query Etcd.
It modifies the upgrade logic to make it support the APIServer downtime.
Tests are included and should be passing.
```bash
bazel test //cmd/kubeadm/... \
&& bazel build //cmd/kubeadm --platforms=@io_bazel_rules_go//go/toolchain:linux_amd64 \
&& issue=TLSUpgrade ~/Repos/vagrant-kubeadm-testing/copy_kubeadm_bin.sh
```
These cases are working consistently for me
```bash
kubeadm-1.9.6 reset \
&& kubeadm-1.9.6 init --kubernetes-version 1.9.1 \
&& kubectl apply -f https://git.io/weave-kube-1.6
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.6 # non-TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.0 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.10.1 # TLS to TLS
/vagrant/bin/TLSUpgrade_kubeadm upgrade apply 1.9.1 # TLS to TLS w/ minor version downgrade
```
This branch is based on top of #61942, as resolving the hash race condition is necessary for consistent behavior.
It looks to fit in pretty well with @craigtracey's PR: #62141
The interfaces are pretty similar
/assign @detiber @timothysc
**Which issue(s) this PR fixes**
Helps with https://github.com/kubernetes/kubeadm/issues/740
**Special notes for your reviewer**:
278b322a1c
[kubeadm] Implement ReadStaticPodFromDisk
c74b56372d
Implement etcdutils with Cluster.HasTLS()
- Test HasTLS()
- Instrument throughout upgrade plan and apply
- Update plan_test and apply_test to use new fake Cluster interfaces
- Add descriptions to upgrade range test
- Support KubernetesDir and EtcdDataDir in upgrade tests
- Cover etcdUpgrade in upgrade tests
- Cover upcoming TLSUpgrade in upgrade tests
8d8e5fe33b
Update test-case, fix nil-pointer bug, and improve error message
97117fa873
Modify the kubeadm upgrade DAG for the TLS Upgrade
- Calculate `beforePodHashMap` before the etcd upgrade in anticipation of
  KubeAPIServer downtime
- Detect if pre-upgrade etcd static pod cluster `HasTLS()==false` to switch
  on the Etcd TLS Upgrade
- If TLS Upgrade:
  - Skip L7 Etcd check (could implement a waiter for this)
  - Skip data rollback on etcd upgrade failure due to lack of L7 check
    (APIServer is already down and unable to serve new requests)
  - On APIServer upgrade failure, also roll back the etcd manifest to
    maintain protocol compatibility
- Add logging
**Release note**:
```release-note
kubeadm upgrade no longer races leading to unexpected upgrade behavior on pod restarts
kubeadm upgrade now successfully upgrades etcd and the controlplane to use TLS
kubeadm upgrade now supports external etcd setups
kubeadm upgrade can now rollback and restore etcd after an upgrade failure
```
Fix `rollbackEtcdData()` to return error=nil on success
`rollbackEtcdData()` used to always return an error making the rest of the
upgrade code completely unreachable.
Ignore errors from `rollbackOldManifests()` during the rollback since it
always returns an error.
Success of the rollback is gated with etcd L7 healthchecks.
Remove logic implying the etcd manifest should be rolled back when
`upgradeComponent()` fails
If kubeadm ever has to upgrade etcd it should prepull the image so
there is less downtime during the upgrade when etcd versions change.
Fixes kubernetes/kubeadm#669
Signed-off-by: Chuck Ha <ha.chuck@gmail.com>
- Update kubeadm static pod upgrades to use the
kubetypes.ConfigHashAnnotationKey annotation on the mirror pod rather
than generating a hash from the full object info. Previously, a status
update for the pod would allow the upgrade to proceed before the
new static pod manifest was actually deployed.
Signed-off-by: Jason DeTiberus <detiber@gmail.com>
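The idea behind the hash change can be sketched as below. This is a simplified illustration, not the kubeadm code: `manifestChanged` is a hypothetical helper, and the annotation key is written out as a literal on the assumption that it matches the value of `kubetypes.ConfigHashAnnotationKey`.

```go
package main

import "fmt"

// configHashAnnotationKey is assumed to match kubetypes.ConfigHashAnnotationKey.
const configHashAnnotationKey = "kubernetes.io/config.hash"

// manifestChanged (hypothetical) reports whether the mirror pod's config hash
// annotation differs from the hash recorded before the upgrade. A pod status
// update does not touch the annotation, so it cannot trigger a false positive.
func manifestChanged(beforeHash string, annotations map[string]string) bool {
	current, ok := annotations[configHashAnnotationKey]
	return ok && current != beforeHash
}

func main() {
	before := "abc123"

	// Status-only update: the annotation is unchanged, so keep waiting.
	fmt.Println(manifestChanged(before, map[string]string{configHashAnnotationKey: "abc123"}))

	// New static pod manifest deployed: the annotation carries a new hash.
	fmt.Println(manifestChanged(before, map[string]string{configHashAnnotationKey: "def456"}))
}
```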
Automatic merge from submit-queue (batch tested with PRs 62568, 62220, 62743, 62751, 62753).
Kubeadm upgrade same version
What this PR does / why we need it:
When kubeadm 1.10 came out, it inadvertently introduced a backwards-incompatible config change. Because the kubeadm MasterConfiguration is written by the old version of kubeadm and read by the new one, this incompatibility causes the upgrade to fail.
To mitigate this, I've written a simple transform that operates on a map-based version of the config. This map is mutated to make it compatible with the new structure, then serialised to JSON and deserialised by the usual APIMachinery.
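A rough sketch of this kind of map-based transform follows. Everything specific in it is an assumption made for illustration: the `newConfig` type, the `featureGates` field, and the string-to-map change are hypothetical and do not describe the actual incompatibility or the kubeadm API types. The sketch also decodes JSON directly to stay self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// newConfig is a hypothetical stand-in for the new typed configuration.
type newConfig struct {
	KubernetesVersion string          `json:"kubernetesVersion"`
	FeatureGates      map[string]bool `json:"featureGates"`
}

// migrate decodes the old config into an untyped map, mutates the map so it
// matches the new structure, then re-encodes it and decodes it the usual way.
func migrate(raw []byte) (*newConfig, error) {
	var m map[string]interface{}
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	// Hypothetical incompatibility: the old format stored featureGates as a
	// "Name=true,Other=false" string, the new format expects a map.
	if s, ok := m["featureGates"].(string); ok {
		gates := map[string]interface{}{}
		for _, pair := range strings.Split(s, ",") {
			kv := strings.SplitN(pair, "=", 2)
			if len(kv) != 2 {
				continue // skip empty or malformed entries
			}
			gates[strings.TrimSpace(kv[0])] = strings.TrimSpace(kv[1]) == "true"
		}
		m["featureGates"] = gates
	}
	fixed, err := json.Marshal(m)
	if err != nil {
		return nil, err
	}
	var cfg newConfig
	if err := json.Unmarshal(fixed, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	old := []byte(`{"kubernetesVersion":"v1.10.0","featureGates":"A=true,B=false"}`)
	cfg, err := migrate(old)
	fmt.Println(cfg, err)
}
```

Working on the untyped map avoids needing Go types for both the old and the new schema at the same time.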
Because of complications with the multiple versions, this PR enforces that kubeadm can only upgrade to a Kubernetes version with the same major and minor version as the kubeadm binary.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes [kubeadm#744](https://github.com/kubernetes/kubeadm/issues/744#issuecomment-379045823L)
This PR is an alternate take on #62353. Instead of trying to gate migration on versions, this constrains kubeadm to upgrading only to versions with the same major and minor version as the kubeadm binary.
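The version constraint could be expressed along these lines. This is a minimal sketch with hypothetical helper names (`majorMinor`, `enforceSameMinor`) and hand-rolled parsing; it is not the check added by this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor (hypothetical) extracts the major and minor numbers from a
// version string such as "v1.10.1" or "1.10.0-beta.1".
func majorMinor(v string) (int, int, error) {
	parts := strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("invalid version %q", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major version in %q", v)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor version in %q", v)
	}
	return major, minor, nil
}

// enforceSameMinor (hypothetical) rejects an upgrade target whose major or
// minor version differs from that of the kubeadm binary.
func enforceSameMinor(kubeadmVersion, targetVersion string) error {
	kMajor, kMinor, err := majorMinor(kubeadmVersion)
	if err != nil {
		return err
	}
	tMajor, tMinor, err := majorMinor(targetVersion)
	if err != nil {
		return err
	}
	if kMajor != tMajor || kMinor != tMinor {
		return fmt.Errorf("kubeadm %s can only upgrade to %d.%d.x, not %s",
			kubeadmVersion, kMajor, kMinor, targetVersion)
	}
	return nil
}

func main() {
	fmt.Println(enforceSameMinor("v1.10.1", "v1.10.3"))
	fmt.Println(enforceSameMinor("v1.10.1", "v1.9.6"))
}
```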
Special notes for your reviewer:
```release-note
fixes a configuration error when upgrading kubeadm from 1.9 to 1.10+
enforces that kubeadm only upgrades kubernetes to the same major and minor version as the kubeadm binary.
```
Automatic merge from submit-queue (batch tested with PRs 61129, 60359).
Cleanup old upgrading code that is v1.8->v1.9-specific
**What this PR does / why we need it**:
Cleanup old upgrading code that is v1.8->v1.9-specific
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes https://github.com/kubernetes/kubeadm/issues/622
This will finish the task in the issue.
**Special notes for your reviewer**:
/cc @luxas @vbmade2000
**Release note**:
```release-note
NONE
```
- Place etcd server and peer certs & keys into pki subdir
- Move certs.altName functions to pkiutil + add appendSANstoAltNames()
Share the append logic for the getAltName functions as suggested by
@jamiehannaford.
Move functions/tests to certs/pkiutil as suggested by @luxas.
Update Bazel BUILD deps
- Warn when an APIServerCertSANs or EtcdCertSANs entry is unusable
- Add MasterConfiguration.EtcdPeerCertSANs
- Move EtcdServerCertSANs and EtcdPeerCertSANs under MasterConfiguration.Etcd
- Generate Server and Peer cert for etcd
- Generate Client cert for apiserver
- Add flags / hostMounts for etcd static pod
- Add flags / hostMounts for apiserver static pod
- Generate certs on upgrade of static-pods for etcd/kube-apiserver
- Modify logic for appending etcd flags to staticpod to be safer for external etcd
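The shared append logic and the warning for unusable SAN entries might look roughly like the sketch below. The `altNames` type, the `appendSANs` function, and the simplified DNS-name pattern are assumptions made for this illustration and are not the pkiutil implementation.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// altNames is a hypothetical container for a certificate's alternate names.
type altNames struct {
	DNSNames []string
	IPs      []net.IP
}

// dnsName is a simplified lowercase DNS subdomain pattern (an assumption;
// the real validation may be stricter or looser).
var dnsName = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// appendSANs adds each entry to names as an IP or a DNS name and returns the
// entries that are neither, so the caller can warn about them.
func appendSANs(names *altNames, sans []string) (unusable []string) {
	for _, san := range sans {
		if ip := net.ParseIP(san); ip != nil {
			names.IPs = append(names.IPs, ip)
		} else if dnsName.MatchString(san) {
			names.DNSNames = append(names.DNSNames, san)
		} else {
			unusable = append(unusable, san)
		}
	}
	return unusable
}

func main() {
	names := &altNames{}
	bad := appendSANs(names, []string{"10.0.0.1", "etcd.example.com", "not a name!"})
	for _, san := range bad {
		fmt.Printf("warning: %q is not a valid IP or DNS name, skipping\n", san)
	}
	fmt.Println(names.DNSNames, names.IPs)
}
```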
Automatic merge from submit-queue (batch tested with PRs 59344, 59595, 59598).
Fix kubeadm typo
Automatic merge from submit-queue (batch tested with PRs 57824, 58806, 59410, 59280).
2nd try at using a vanity GCR name
The 2nd commit here is the changes relative to the reverted PR. Please focus review attention on that.
This is the 2nd attempt. The previous try (#57573) was reverted while we
figured out the regional mirrors (oops).
New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest. To publish
an image, push to staging-k8s.gcr.io and it will be synced to the regionals
automatically (similar to today). For now the staging is an alias to
gcr.io/google_containers (the legacy URL).
When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.
We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it. Nice and
visible, easy to keep track of.
xref https://github.com/kubernetes/release/issues/281
TL;DR:
* The new `staging-k8s.gcr.io` is where we push images. It is literally an alias to `gcr.io/google_containers` (the existing repo) and is hosted in the US.
* The contents of `staging-k8s.gcr.io` are automatically synced to `{asia,eu,us}-k8s.gcr.io`.
* The new `k8s.gcr.io` will be a read-only alias to whichever regional repo is closest to you.
* In the future, images will be promoted from `staging` to regional "prod" more explicitly and auditably.
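For tools that still carry the legacy name, the conversion amounts to a prefix swap. The helper below is a hypothetical sketch of that rewrite (`toVanityName` is not a function from this PR); it uses only the two registry names given above.

```go
package main

import (
	"fmt"
	"strings"
)

// toVanityName (hypothetical) rewrites an image reference that uses the
// legacy gcr.io/google_containers prefix to the k8s.gcr.io pull name.
// References under any other registry are returned unchanged.
func toVanityName(image string) string {
	const legacy = "gcr.io/google_containers/"
	const vanity = "k8s.gcr.io/"
	if strings.HasPrefix(image, legacy) {
		return vanity + strings.TrimPrefix(image, legacy)
	}
	return image
}

func main() {
	fmt.Println(toVanityName("gcr.io/google_containers/pause-amd64:3.1"))
	fmt.Println(toVanityName("quay.io/coreos/etcd:v3.1.12"))
}
```

Only pull references are rewritten here; publishing still goes through `staging-k8s.gcr.io` as described above.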
```release-note
Use "k8s.gcr.io" for pulling container images rather than "gcr.io/google_containers". Images are already synced, so this should not impact anyone materially.
Documentation and tools should all convert to the new name. Users should take note of this in case they see this new name in the system.
```
Automatic merge from submit-queue (batch tested with PRs 53895, 58013, 58466, 58531, 58535).
[kubeadm] Bump kube-dns to 1.14.8
**What this PR does / why we need it**:
Bump kube-dns to 1.14.8 for kubeadm. Ref https://github.com/kubernetes/kubernetes/pull/57918.
cc @rramkumar1
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue.
Improve error messages and comments in kubeadm.
**What this PR does / why we need it**:
Improve error messages and comments in kubeadm.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 57139, 57358).
kubeadm upgrade: fix unit test flake
The CA used by the test cases is global, and the cases modify its
expiry. This can flake depending on the order in which the tests run.
Generate a new CA for each test case instead.
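The fix pattern can be sketched with the standard library alone. `newTestCA` is a hypothetical helper written for this illustration, not the function used in the kubeadm tests; the point is that each case gets its own certificate, so changing one expiry cannot leak into another case.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newTestCA (hypothetical) creates a fresh self-signed CA certificate and key
// that expires validFor from now.
func newTestCA(validFor time.Duration) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "test-ca"},
		NotBefore:             now.Add(-time.Minute),
		NotAfter:              now.Add(validFor),
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
	}
	// Self-signed: the template acts as both subject and issuer.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return cert, key, nil
}

func main() {
	cases := []struct {
		name     string
		validFor time.Duration
	}{
		{"long-lived", 24 * time.Hour},
		{"about to expire", time.Minute},
	}
	for _, tc := range cases {
		// A new CA per case keeps the cases independent of execution order.
		ca, _, err := newTestCA(tc.validFor)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: NotAfter=%s\n", tc.name, ca.NotAfter.Format(time.RFC3339))
	}
}
```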
```release-note
NONE
```
Fixes https://github.com/kubernetes/kubernetes/issues/57357
/cc @kubernetes/sig-cluster-lifecycle-bugs
/cc @xiangpengzhao
/cc @luxas