Automatic merge from submit-queue
Fix detect-node-names to not error out if there are no nodes
Fixes#21564.
Teardown was not working correctly in rare cases because `detect-node-names` was failing before any of the actual cleanup was run. I'm pretty sure the issue was that there was an instance group, but no instances in the instance group, so we bailed out when we tried to expand the bash array.
This PR adds a guard so we don't bail if the array is empty.
cc @jlowdermilk @spxtr
Automatic merge from submit-queue
Add support for running clusters on GCI
Google Container-VM Image (GCI) is the next revision of Container-VM. See documentation at https://cloud.google.com/compute/docs/containers/vm-image/. This change adds support for starting a Kubernetes cluster using GCI.
With this change, users can start a kubernetes cluster using the latest kubelet and kubectl release binary built in the GCI image by running:
$ KUBE_OS_DISTRIBUTION="gci" cluster/kube-up.sh
Or run a testing cluster on GCI by running:
$ KUBE_OS_DISTRIBUTION="gci" go run hack/e2e.go -v --up
The commands above will choose the latest GCI image by default.
Automatic merge from submit-queue
Switch to ABAC authorization from AllowAll
Switch from AllowAll to ABAC. All existing identities (that are created by deployment scripts) are given full permissions through ABAC. Manually created identities will need policies added to the `policy.jsonl` file on the master.
Automatic merge from submit-queue
don't source the kube-env in addon-manager
This was added in 2feb658ed7 which became unused after #23603 but wasn't removed
Automatic merge from submit-queue
Trusty: Add debug supports for docker and kubelet
This PR adds debug support in two aspects: (1) For a test cluster, docker command will have "--debug" flag. Recently we noticed that this is very helpful in debug e2e test failures; (2) The kubelet command line will be put in /etc/default/kubelet. If a developer wants to test kubelet flags without recreating a cluster, she/he only needs to revise this file and then run "initctl restart kubelet". In addition, this PR fixes a couple of small things like comments and alignment.
Test result:
(1) Manually verified changing /etc/default/kubelet and run "initctl restart kubelet";
(2) Verified docker command line flag "--debug";
(3) e2e on pure trusty cluster and hybrid cluster all passed.
@roberthbailey @dchen1107 @zmerlynn please review it.
cc/ @yujuhong @fabioy @wonderfly FYI.
Automatic merge from submit-queue
Trusty: Add retry in curl commands
This fix is for improving robustness in fetch critical metadata files when the metadata server is temporarily unreachable.
@roberthbailey @zmerlynn @dchen1107 please review it.
cc/ @fabioy @wonderfly FYI.
Automatic merge from submit-queue
jenkins: Allow configuration of release bucket
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
Automatic merge from submit-queue
Trusty: Handle the new var in kube-proxy manifest
This is to capture the kube-proxy manifest change in PR #24429.
@roberthbailey @fabioy @zmerlynn please review this change and mark it as cherry pick candidate. We need to catch up 1.2.3 release.
cc/ @dchen1107 @wonderfly @cjcullen FYI.
I have verified this fix. Without this fix, kube-proxy pod in Trusty nodes cannot be started correctly, i.e., the command line has an unhadled variable. And some other kube-system pods do not work correctly as kube-proxy is not working well. After applying this fix, kube-proxy can be started correctly, and all kube-system pods run successfully.
Automatic merge from submit-queue
Strip comments from configure-vm.sh for gce
We are getting very close to the 32KiB limit on GCE metadata entry length. We used to strip comments before putting the value in metadata, but I think we removed it in a refactor because it wasn't absolutely necessary, and leaving it out made the scripts slightly cleaner. It's close to being necessary again.
Removing comments reduces the size from 31,609B to 27,221B: https://www.diffchecker.com/0xmmecvw.
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
Automatic merge from submit-queue
Trusty: Fixes for running GKE master
This PR includes two fixes for running GKE master on our image:
(1) The kubelet command line assembly had a missing part for cbr0. We did not catch it because the code path is not covered by OSS k8s tests;
(2) Remove the "" from the variables in the cert files. It causes a parsing issue in GKE. Again, this code path is not covered by k8s tests.
This PR also refactors the code for assembling kubelet flag. I move all logic into a single function assemble_kubelet_flags in configure-helper.sh for better readability and also simplify node.yaml and master.yaml.
@roberthbailey @dchen1107 please review it, and mark it as cherrypick-candidate. This PR is verified by @maisem. Together with his CL for GKE, we can run GKE cluster with master on our image and nodes on ContainerVM.
cc/ @maisem @fabioy @wonderfly FYI
This only applies to gce kube-up. 60 seconds of open connection should
be sufficient for anything that we should be downloading. The release
tar is currently 255M.
Automatic merge from submit-queue
Trusty: Regional release .tar.gz support
@zmerlynn and @roberthbailey please review it. This change is to support the feature added in PR #22234. The entire logic is pretty much the same as in #22234, with only few minor changes in implementation.
I had manually run e2e tests with "export RELEASE_REGION_FALLBACK=true" on two clusters: (1) Trusty on master nodes on ContainerVM; (2) Master and nodes all on trusty. All tests are green. I don't figure out a way to simulate regional fallback. But I did test the function download_or_bust() out-of-box.
cc/ @wonderfly @dchen1107 @fabioy FYI.