Automatic merge from submit-queue
cluster/images/hyperkube: re-add hyperkube busybox style symlinks
Originally symlinks were added with a `--make-symlinks` command discussed in https://github.com/kubernetes/kubernetes/issues/24510 and implemented in https://github.com/kubernetes/kubernetes/pull/24511.
It was backed out in https://github.com/kubernetes/kubernetes/pull/25693 because go binaries don't run in qemu and this breaks cross-building the Dockerfile for arm. In this case, due to running `hyperkube --make-symlinks`.
Lets just add the symlinks manually until the upstream bug is fixed (qemu).
fixes#28702
@mikedanese @thockin @yifan-gu @euank
Automatic merge from submit-queue
Updated fluentd configuration to spawn multiple threads for processing
(by default, fluentd uses a single thread).
@a-robinson @igorpeshansky
Automatic merge from submit-queue
Bump exechealthz image
With the new image at least if we observe an exec container taking more ram than it should (like the oom situation, which shouldn't happen today because of the increased limits), we can kubectl exec and check the pprof endpoints.
Note that I'm not bumping the rc version, because I just did so with: https://github.com/kubernetes/kubernetes/pull/29693.
Automatic merge from submit-queue
Give healthz more memory to mitigate #29688
This will recreate the rc but not the pods. At least on the clusters we patched, if the pods get recreated they'll ccome back up with the updated limits.
#29688
Automatic merge from submit-queue
export KUBE_USER to salt (support custom usernames) for vagrant, vsph…
GCE/GKE were handled in #29164, AWS was handled in #29428. This should cover the rest of the configurations that use ABAC.
Automatic merge from submit-queue
kube-up: increase download timeout for kubernetes.tar.gz
Particularly on smaller instances on AWS, we were hitting the 80 second
timeout now that our image is well over the 1GB mark.
Increase the timeout from 80 seconds to 300 seconds.
Fix#29418
Automatic merge from submit-queue
Include CNI for all architectures in the hyperkube image
Can some of you (@jfrazelle @mikedanese) quickly lgtm this?
I'd like it if we got it merged before v1.4.0-alpha.2
It's not a huge change, I'm just cross-compiling this CNI stuff while waiting for the v0.4.0 which likely will release binaries for all arches.
Automatic merge from submit-queue
AWS kube-up: fix MASTER_OS_DISTRIBUTION
On AWS we were defining KUBE_MASTER_OS_DISTRIBUTION, but the scripts
expect MASTER_OS_DISTRIBUTION.
Fixes#29422
Particularly on smaller instances on AWS, we were hitting the 80 second
timeout now that our image is well over the 1GB mark.
Increase the timeout from 80 seconds to 300 seconds.
Fix#29418
Automatic merge from submit-queue
Add load balancer in front of apiserver in HA master setup
The first commit is just https://github.com/kubernetes/kubernetes/pull/29201 and has been already LGTMed.
Second commit has some small fixes:
1. Precompute REGION variable in config
2. Add timeout for waiting for loadbalancer
3. Fix kube-down so that it doesn't delete some resources if there are still masters/nodes in other zones
Second commit also fixes bug in https://github.com/kubernetes/kubernetes/pull/29201 where variable `REGION` was unset in `kube-down` when deleting master IP. The bug caused leaking of IP addresses.
https://github.com/kubernetes/kubernetes/issues/21124
@davidopp @jszczepkowski @wojtek-t @mikedanese
Automatic merge from submit-queue
Bump the default etcd version in the Makefile to 3.0.3
Fixes: #29132
I haven't had time to manually validate the arm and arm64 version yet, but I think it should be fine.
cc @xiang90 @hongchaodeng @timothysc @lavalamp @wojtek-t @thockin @kubernetes/sig-scalability @Pensu @laboger
Automatic merge from submit-queue
Ubuntu: Enable ssh compression when downloading binaries during cluster creation
<!--
Checklist for submitting a Pull Request
Please remove this comment block before submitting.
1. Please read our [contributor guidelines](https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md).
2. See our [developer guide](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md).
3. If you want this PR to automatically close an issue when it is merged,
add `fixes #<issue number>` or `fixes #<issue number>, fixes #<issue number>`
to close multiple issues (see: https://github.com/blog/1506-closing-issues-via-pull-requests).
4. Follow the instructions for [labeling and writing a release note for this PR](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes) in the block below.
-->
resolves#20971 by using the options provided by ssh.
Native ssh compression has existed for years, and the server is free to disregard the setting, so this should be safe.
With things like the kube binaries I see about a 2x speed increase.
```
λ time scp kubes-bin.tar 9.30.182.251:/mnt/build/kubin
kubes-bin.tar 100% 344MB 10.7MB/s 00:32
real 0m32.284s
user 0m1.679s
sys 0m1.263s
λ time scp -C kubes-bin.tar 9.30.182.251:/mnt/build/kubin
kubes-bin.tar 100% 344MB 22.9MB/s 00:15
real 0m14.810s
user 0m12.858s
sys 0m0.994s
λ ls -lah kubes-bin.tar
-rw-r--r-- 1 mhb staff 344M Jun 2 15:29 kubes-bin.tar
λ tar -tf kubes-bin.tar
kubectl
master/
master/etcd
master/etcdctl
master/flanneld
master/kube-apiserver
master/kube-controller-manager
master/kube-scheduler
node/
node/flanneld
node/kube-proxy
node/kubelet
```
Automatic merge from submit-queue
fix logrotate config (again)
we need to add the dateformat option so that the logrotate
can create unique logfiles for each rotation. Without this,
logrotation is skipped with message like (generated in
verbose mode of logrotate):
rotating log /var/log/rotate-test.log, log->rotateCount is 5
dateext suffix '-20160718'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
destination /var/log/rotate-test2.log-20160718.gz already exists, skipping rotation
Tested as follows:
# config in '/etc/logrotate.d/rotate-test':
/var/log/rotate-test.log {
rotate 5
copytruncate
missingok
notifempty
compress
maxsize 100M
daily
dateext
dateformat -%Y%m%d-%s
create 0644 root root
}
# create 150Mb of /var/log/rotate-test.log
$ dd if=/dev/zero of=/var/log/rotate-test.log bs=1048576 count=150 conv=notrunc oflag=append
# run logrotate
$ /usr/sbin/logrotate -v /etc/logrotate.conf
...
rotating pattern: /var/log/rotate-test.log after 1 days (5 rotations)
empty log files are not rotated, log files >= 104857600 are rotated earlier, old logs are removed
considering log /var/log/rotate-test.log
log needs rotating
rotating log /var/log/rotate-test.log, log->rotateCount is 5
Converted ' -%Y%m%d-%s' -> '-%Y%m%d-%s'
dateext suffix '-20160718-1468875268'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
copying /var/log/rotate-test.log to /var/log/rotate-test.log-20160718-1468875268
truncating /var/log/rotate-test.log
compressing log with: /bin/gzip
Repeating 'dd' and 'logrotate' commands now generate logfiles correctly.
#27754
@bprashanth can you please review?
Automatic merge from submit-queue
hyperkube: fix build for 3rd party registry (again)
Fixes issue #28487
This is a minor fix for the issue reported in #28487
we need to add the dateformat option so that the logrotate
can create unique logfiles for each rotation. Without this,
we logrotation is skipped with message like (generated in
verbose mode of logrotate):
rotating log /var/log/rotate-test.log, log->rotateCount is 5
dateext suffix '-20160718'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
destination /var/log/rotate-test2.log-20160718.gz already exists, skipping rotation
Tested as follows:
# config in '/etc/logrotate.d/rotate-test':
/var/log/rotate-test.log {
rotate 5
copytruncate
missingok
notifempty
compress
maxsize 100M
daily
dateext
dateformat -%Y%m%d-%s
create 0644 root root
}
# create 150Mb of /var/log/rotate-test.log
$ dd if=/dev/zero of=/var/log/rotate-test.log bs=1048576 count=150 conv=notrunc oflag=append
# run logrotate
$ /usr/sbin/logrotate -v /etc/logrotate.conf
...
rotating pattern: /var/log/rotate-test.log after 1 days (5 rotations)
empty log files are not rotated, log files >= 104857600 are rotated earlier, old logs are removed
considering log /var/log/rotate-test.log
log needs rotating
rotating log /var/log/rotate-test.log, log->rotateCount is 5
Converted ' -%Y%m%d-%s' -> '-%Y%m%d-%s'
dateext suffix '-20160718-1468875268'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
copying /var/log/rotate-test.log to /var/log/rotate-test.log-20160718-1468875268
truncating /var/log/rotate-test.log
compressing log with: /bin/gzip
Repeating 'dd' and 'logrotate' commands now generate logfiles correctly.
Automatic merge from submit-queue
Unset KUBERNETES_PROVIDER when KUBERNETES_CONFORMANCE_TEST is set
fixes#26269
same as #26530 - i accidentally lost my fork and couldn't rebase there ;)
@mikedanese PTAL
Automatic merge from submit-queue
kube-up: install new Docker pre-requisite (libltdl7) when not in image
Docker now has a dependency on libltdl7; we have to specify it manually
if we are installing docker using dpkg (vs using apt-get or similar,
which would pull it in automatically)
Fixes#28644
This allows us to start building real dependencies into Makefile.
Leave old hack/* scripts in place but advise to use 'make'. There are a few
rules that call things like 'go run' or 'build/*' that I left as-is for now.
Docker now has a dependency on libltdl7; we have to specify it manually
if we are installing docker using dpkg (vs using apt-get or similar,
which would pull it in automatically)
Fixes#28644
Automatic merge from submit-queue
Fixes#28205, Check release tar location for Openstack-Heat provider
This does a basic check to see where the release tars are located.
Allows people to use openstack-heat outside of compiling k8s.
Automatic merge from submit-queue
Enable extensions/v1beta1/NetworkPolicy by default
Fixes https://github.com/kubernetes/kubernetes/issues/28401
For some reason this also triggered an update to the swagger spec (which apparently hadn't been done before but wasn't failing validation...)
Automatic merge from submit-queue
Implementing a proper master/worker split in the juju cluster code.
```
release-note-none
```
General updates to the cluster/juju Kubernetes provider, to bring it up to date.
Updating the skydns templates to version 11
Updating the etcd container definition to include arch.
Updating the master template to include arch and version for hyperkube container.
Adding dns_domain configuration options.
Adding storage layer options.
[]()
Automatic merge from submit-queue
Check existence of kubernetes dir for get-kube.sh
[]()
There are a lot of references to https://get.k8s.io/ over the internet.
Most of such references do not describe KUBERNETES_SKIP_DOWNLOAD env variable
and newbies can get into a situation described below:
- execute `wget -q -O - https://get.k8s.io | bash`
- receive a failure due too missed packages or some configs
- fix the issue
- try again `wget -q -O - https://get.k8s.io | bash`
In this case, get-kube.sh will not check that kubernetes directory already
exist and repeat download again.
Lets make get-kube.sh more user-friendly and check existence of kubernetes dir
Updating the skydns templates to version 11
Updating the etcd container definition to include arch.
Updating the master template to include arch and version for hyperkube container.
Adding dns_domain configuration options.
Adding storage layer options.
Fixing underscore problem and adding exceptions.
Fixing the underscore flag errors.
Automatic merge from submit-queue
Make GKE detect-instance-groups work on Mac.
Make the fix from #27803 also work on mac.
The GNU `expr` command supports both the `expr match STRING REGEXP` and `expr STRING : REGEXP` command syntax.
The BSD `expr` command only has the `expr STRING : REGEXP` syntax.
@fabioy @a-robinson
Automatic merge from submit-queue
Bump skydns godeps to latest
Update Godeps for github.com/skynetservices/skydns and miekg/dns.
Bump kubedns version to 1.6 with latest skynetservices/skydns code
Built kube-dns for all architectures and pushed containers to gcr.io.
Automatic merge from submit-queue
Enhance kubedns pod health checks to cover kubedns container
The existing health check hits port 53, the dnsmasq container, with the same domain name every time. Since dnsmasq looks up and caches results from the kubedns container, running on port 10053, the health check is not covering the kubedns container after the first query (and once every TTL expiration).
This PR enhances the health check to directly hit port 10053 (kubedns) in addition to port 53.
Automatic merge from submit-queue
Substitute federation_domain_map parameter with its value in node bootstrap scripts.
This PR also removes the substitution code we added to the build scripts.
**Release Note**
```release-note
If you use one of the kube-dns replication controller manifest in `cluster/saltbase/salt/kube-dns`, i.e. `cluster/saltbase/salt/kube-dns/{skydns-rc.yaml.base,skydns-rc.yaml.in}`, either substitute one of `__PILLAR__FEDERATIONS__DOMAIN__MAP__` or `{{ pillar['federations_domain_map'] }}` with the corresponding federation name to domain name value or remove them if you do not support cluster federation at this time. If you plan to substitute the parameter with its value, here is an example for `{{ pillar['federations_domain_map'] }`
pillar['federations_domain_map'] = "- --federations=myfederation=federation.test"
where `myfederation` is the name of the federation and `federation.test` is the domain name registered for the federation.
```
cc @erictune @kubernetes/sig-cluster-federation @MikeSpreitzer @luxas
[]()
Automatic merge from submit-queue
Remove duplicated nginx image. Use nginx-slim instead
This PR removes the image `gcr.io/google_containers/nginx:1.7.9` and uses `gcr.io/google_containers/nginx-slim:0.7`.
Besides removing the duplication `1.7.9` is 16 months old.
Automatic merge from submit-queue
Making DHCP_OPTION_SET_ID creation optional
Reason: We have a pre-configured VPC in AWS. `kube-up.sh` should not making changes to the VPC DHCP option if there's already DHCP options configured.
PR Changes: When `DHCP_OPTION_SET_ID` is given in environment variable, kube-up.sh will skip the `DHCP_OPTION_SET_ID` creation.
Automatic merge from submit-queue
Enable setting up Kubernetes cluster in Ubuntu on Azure
Implement basic cloud provider functionality to deploy Kubernetes on
Azure. SaltStack is used to deploy Kubernetes on top of Ubuntu
virtual machines. OpenVpn provides network connectivity. For
kubelet authentication, we use basic authentication (username and
password). The scripts use the legacy Azure Service Management APIs.
We have set up a nightly test job in our Jenkins server for federated
testing to run the e2e test suite on Azure. With the cloud provider
scripts in this commit, 14 e2e test cases pass in this environment.
We plan to implement additional Azure functionality to support more
test cases.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/21207)
<!-- Reviewable:end -->
Automatic merge from submit-queue
[fix] allow ALLOW_PRIVILEGED to be passed to kubelet and kube-api
This is something that we need for running docker in docker. Please let me know if you would consider this change. Thanks :)
Automatic merge from submit-queue
Add Calico as policy provider in GCE
Adds Calico as policy provider to GCE, enforcing the extensions/v1beta1 NetworkPolicy API.
Still to do:
- [x] Enable NetworkPolicy API when POLICY_PROVIDER is provided.
- [x] Fix CNI plugin, policy controller versions.
CC @thockin - does this general approach look good?
Automatic merge from submit-queue
Tracked addition of federation, sed support in kube DNS
[]()
The kube DNS app recently gained support for federation (whatever that
is), including a new Salt parameter. This broke the deployAddons.sh script for cluster ubuntu. The DNS app also gained alternate
templates, intended to be friendly to `sed`. Fortunately, those do
not demand a federation parameter.
This PR fixes up the ` cluster/ubuntu/deployAddons.sh` script to track those changes, by switching to the `sed`-friendly templates.
Automatic merge from submit-queue
mount instanceid file from config drive when using openstack cloud provider
fix https://github.com/kubernetes/kubernetes/issues/23191, the instanceid file is read however we do not mount it as a volume, and it would cause the cloud provider contacts the metadata server, in some cases, the metadata server is not able to serve, then the cloud provider would fail to initialize, we should avoid that.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/23733)
<!-- Reviewable:end -->
Automatic merge from submit-queue
cluster/aws: Add option for kubeconfig context
Added KUBE_CONFIG_CONTEXT environment variable to customize the kubeconfig context created at the end of the aws kube-up script.
Fixes#24877
This PR does barely anything and shouldn't require e2e tests. It's just a minor convenience.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24910)
<!-- Reviewable:end -->
The kube DNS app recently gained support for federation (whatever that
is), including a new Salt parameter. It also gained alternate
templates, intended to be friendly to `sed`. Fortunately, those do
not demand a federation parameter.
Automatic merge from submit-queue
Revert "Federation e2e supports aws"
Reverting https://github.com/kubernetes/kubernetes/pull/27791 to get our Jenkins tests green again.
cc @kubernetes/sig-cluster-federation
Automatic merge from submit-queue
federation: Updating KubeDNS to try finding a local service first for federation query
Ref https://github.com/kubernetes/kubernetes/issues/26762
Updating KubeDNS to try to find a local service first for federation query.
Without this change, KubeDNS always returns the DNS hostname, even if a local service exists.
Have updated the code to first remove federation name from path if it exists, so that the default search for local service happens. If we dont find a local service, then we try to find the DNS hostname.
Will appreciate a strong review since this is my first change to KubeDNS.
https://github.com/kubernetes/kubernetes/pull/25727 was the original PR that added federation support to KubeDNS.
cc @kubernetes/sig-cluster-federation @quinton-hoole @madhusudancs @bprashanth @mml
Automatic merge from submit-queue
Use new fluentd-gcp container with journal support
This makes use of the systemd-journal support added in PR #27981
and Fixes#27446.
cc/ @a-robinson @andyzheng0831
Automatic merge from submit-queue
Support journal logs in fluentd-gcp on GCI
This maintains a single common image for each rather than having to fork out separate images, relying on different commands in yaml manifests to differentiate in the behavior. This is treading on top of @adityakali's #27906, but I wasn't able to get in touch with him this afternoon until very recently. He's handling making sure that the new yaml manifests are used when running on GCI.
```release-note
```
Only run the systemd-journal plugin when on a platform that requests it.
The plugin crashes the fluentd process if the journal isn't present, so
it can't just be run blindly in all configurations.
Following from #27830, this copies the source onto the instance and
displays the location of it prominently (keeping the download link for
anyone that just wants to curl it).
Example output (this tag doesn't exist yet):
---
Welcome to Kubernetes v1.4.0!
You can find documentation for Kubernetes at:
http://docs.kubernetes.io/
The source for this release can be found at:
/usr/local/share/doc/kubernetes/kubernetes-src.tar.gz
Or you can download it at:
https://storage.googleapis.com/kubernetes-release/release/v1.4.0/kubernetes-src.tar.gz
It is based on the Kubernetes source at:
https://github.com/kubernetes/kubernetes/tree/v1.4.0
For Kubernetes copyright and licensing information, see:
/usr/local/share/doc/kubernetes/LICENSES
---
Automatic merge from submit-queue
Pushing a new KubeDNS image and updating the YAML files
Updating KubeDNS image to include https://github.com/kubernetes/kubernetes/pull/27845
@kubernetes/sig-cluster-federation @girishkalele @mml
Automatic merge from submit-queue
Revert kube-proxy as a DaemonSet in hyperkube for the v1.3 release
It was a bit sad, but I was a bit too fast with the kube-proxy DaemonSet thing, so we have to target v1.4 for that one. Reverting to a static-pod
This one is for v1.3
@mikedanese @cheld @zreigz
Automatic merge from submit-queue
increase addon check interval
Do static pods have a crash loop back off? If so, this test would be much faster if we restarted the kubelet to clear that.
Fixes#26770
Following from #27830, this copies the source onto the instance and
displays the location of it prominently (keeping the download link for
anyone that just wants to curl it).
Example output (this tag doesn't exist yet):
---
Welcome to Kubernetes v1.4.0!
You can find documentation for Kubernetes at:
http://docs.kubernetes.io/
The source for this release can be found at:
/usr/local/share/doc/kubernetes/kubernetes-src.tar.gz
Or you can download it at:
https://storage.googleapis.com/kubernetes-release/release/v1.4.0/kubernetes-src.tar.gz
It is based on the Kubernetes source at:
https://github.com/kubernetes/kubernetes/tree/v1.4.0
For Kubernetes copyright and licensing information, see:
/usr/local/share/doc/kubernetes/LICENSES
---
Automatic merge from submit-queue
AWS kube-up: Authorize route53 in the IAM policy
Federation needs this now (on the nodes), and I suspect ingress
controllers will shortly want this also. Given we're going to authorize
it on the nodes, we should authorize it on the master also (the master
is much more trusted).
Fix#27467
Automatic merge from submit-queue
Allow conformance tests to run on non-GCE providers
fixes https://github.com/kubernetes/kubernetes/issues/26869
Creates a skeleton provider which has all the required function stubs -- but will allow a previously set "skeleton" KUBERNETES_PROVIDER to not be overriden with "gce".
Federation needs this now (on the nodes), and I suspect ingress
controllers will shortly want this also. Given we're going to authorize
it on the nodes, we should authorize it on the master also (the master
is much more trusted).
Fix#27467
Automatic merge from submit-queue
AWS kube-up: move to Docker 1.11.2
This is to mirror GCE
Also we remove support for vivid as Docker no longer packages for it, and remove some of the unreachable distro code in aws kube-up.
Also bump the AMI to a 1.3 version (with preinstalled Docker 1.11.2)
Fixes https://github.com/kubernetes/kubernetes/issues/27654
Automatic merge from submit-queue
GCE provider: Limit Filter calls to regexps rather than insane blobs
Filters can't exceed 4k, and GET requests against the GCE API are also limited, so these break down in different ways at different cluster counts. Fix it by introducing an advisory `node-instance-prefix` configuration in the GCE provider that can hint the `EnsureLoadBalancer`/`UpdateLoadBalancer code` (and the firewall creation/update code). If it's not there, or wrong (a hostname that's registered violates it), just ignore it and grab the whole project.
Fixes#27731
[]()
Filters can't exceed 4k, and GET requests against the GCE API are also
limited, so these break down in different ways at different cluster
counts. Fix it by introducing an advisory node-instance-prefix
configuration in the GCE provider that can hint the
EnsureLoadBalancer/UpdateLoadBalancer code (and the firewall
creation/update code). If it's not there, or wrong (a hostname that's
registered violates it), just ignore it and grab the whole project.
Automatic merge from submit-queue
federation: Creating kubeconfig files to be used for creating secrets for clusters on aws and gke
Extension of https://github.com/kubernetes/kubernetes/pull/26914 which created the kubeconfig files for gce clusters.
This PR extends it to AWS, vagrant and GKE.
The change for AWS and vagrant is exactly same as GCE.
For GKE, since `gcloud create clusters` creates kubeconfig, we are just copying the generated kubeconfig to the desired location
cc @kubernetes/sig-cluster-federation @colhom
@roberthbailey for GKE
Automatic merge from submit-queue
rkt: Map kubelet's `--stage1-image` flag to rkt's `--stage1-name` flag.
This enables rkt to use cached stage1 image instead of unpacking the stage1 image every time for every pod.
After this change, users need to preload the stage1 images in order to enable rkt to find the stage1 image with the name specified by this flag.
Also, the cloud config is modified to pre-load the stage1 images.
cc @kubernetes/sig-rktnetes @kubernetes/sig-node
Automatic merge from submit-queue
add logrotate service and configuration for GCI
This change mirrors the configuration in cluster/saltbase/salt/logrotate for GCI.
On GCI we use systemd timers (https://www.freedesktop.org/software/systemd/man/systemd.timer.html) and install an hourly timer - kube-logrotate.timer. This will invoke kube-logrotate.service (which calls /usr/sbin/logrotate) once every hour to perform log rotation as per the rotation rules installed under /etc/logrotate.d/.
@kubernetes/goog-image @zmerlynn @dchen1107 @andyzheng0831
This enables rkt to use cached stage1 image instead of unpacking the
stage1 image every time for every pod.
After this change, users need to preload the stage1 images in order to
enable rkt to find the stage1 image with the name specified by this flag.
Automatic merge from submit-queue
make GCI image detection robust
This change makes sure that in case we roll back a released GCI image, the image detection logic picks a correct active image.
@kubernetes/goog-image @Amey-D @wonderfly @dchen1107
Automatic merge from submit-queue
Update to dnsmasq:1.3 and make hyperkube always use the latest addons
This bumps dnsmasq to a version that works on all architectures: https://github.com/kubernetes/contrib/pull/1192 (which have to be pushed first indeed)
Also I removed the manifests in hyperkube addons in favor for machine-generated ones, which will avoid mistakes.
This one is required for `v1.3`, so it has to be cherrypicked I think...
It makes docker and docker-multinode addons work again...
(Yes, we'll probably get rid of docker in favor for minikube, but we'll have to have it in this release at least)
@girishkalele @thockin @ArtfulCoder @david-mcmahon @bgrant0607 @mikedanese
This works around a linux kernel bug with overly aggressive caching of
ARP entries, which was causing problems when we reused IP addresses in
VPCs, for example with an ASG in a relatively small subnet.
See #23395 for more explanation.
Fixes#23395
Vivid is EOL, and Docker is no longer packaged for it.
Remove support for it in 1.3 (in 1.2 we had warned users it was EOL).
Also remove unused wheezy, trusty & coreos & do general cleanup.
Automatic merge from submit-queue
Prep for continuous Docker validation test
```release-note
Add a test config variable to specify desired Docker version to run on GCI.
```
We want to continuously validate Docker releases (#25215), on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
Tested on my local Jenkins instance that was able to start a cluster with the latest Docker version (different from installed version) running on both master and nodes.
@dchen1107 Can you review?
cc/ @andyzheng0831 for changes in `cluster/gce/gci/helper.sh`, and @ixdy @spxtr for changes to the Jenkins e2e-runner
cc/ @kubernetes/goog-image
Implement basic cloud provider functionality to deploy Kubernetes on
Azure. SaltStack is used to deploy Kubernetes on top of Ubuntu
virtual machines. OpenVpn provides network connectivity. For
kubelet authentication, we use basic authentication (username and
password). The scripts use the legacy Azure Service Management APIs.
We have set up a nightly test job in our Jenkins server for federated
testing to run the e2e test suite on Azure. With the cloud provider
scripts in this commit, 14 e2e test cases pass in this environment.
We plan to implement additional Azure functionality to support more
test cases.
This first reverts commit 8e8437dad8.
Also resolves conflicts with docs on f334fc41
And resolves conflicts with https://github.com/kubernetes/kubernetes/pull/22231/commits
to make people switching between two different methods of setting up by
setting env variables.
Conflicts:
cluster/get-kube.sh
cluster/saltbase/salt/README.md
cluster/saltbase/salt/kube-proxy/default
cluster/saltbase/salt/top.sls
Automatic merge from submit-queue
Revert "Revert "GCI: add support for network plugin""
PR #27027 added the network plugin support in GCI config, but later a bug in the network plugin broke e2e tests (see issue #27118). The bug was fixed by #27141 and we have been repeatedly run the serial e2e tests more than 10 times to verify the fix. Now it should be safe to put the GCI network plugin support back.
We will first merge in the master branch and monitor the Jenkins serial tests for a while and then cherry-pick it into release-1.3 branch.
Automatic merge from submit-queue
kube-up multizone: don't print scary warning
The node-count check gets confused when there are more nodes that we
launched, which is normal with KUBE_USE_EXISTING_MASTER.
This fix just suppresses the error message in that case.
Fix#23390
The node-count check gets confused when there are more nodes that we
launched, which is normal with KUBE_USE_EXISTING_MASTER.
This fix just suppresses the error message in that case.
Fix#23390
Automatic merge from submit-queue
Fixes and improvements to Photon Controller backend for kube-up
- Improve reliability of network address detection by using MAC
address. VMware has a MAC OUI that reliably distinguishes the VM's
NICs from the other NICs (like the CBR). This doesn't rely on the
unreliable reporting of the portgroup.
- Persist route changes. We configure routes on the master and nodes,
but previously we didn't persist them so they didn't last across
reboots. This persists them in /etc/network/interfaces
- Fix regression that didn't configure auth for kube-apiserver with
Photon Controller.
- Reliably run apt-get update: Not doing this can cause apt to fail.
- Remove unused nginx config in salt
- Improve reliability of network address detection by using MAC
address. VMware has a MAC OUI that reliably distinguishes the VM's
NICs from the other NICs (like the CBR). This doesn't rely on the
unreliable reporting of the portgroup.
- Persist route changes. We configure routes on the master and nodes,
but previously we didn't persist them so they didn't last across
reboots. This persists them in /etc/network/interfaces
- Fix regression that didn't configure auth for kube-apiserver with
Photon Controller.
- Reliably run apt-get update: Not doing this can cause apt to fail.
- Remove unused nginx config in salt
Automatic merge from submit-queue
version bump for gci to milestone 53
Fixes#26455
GCI release 53 includes kubernetes v1.3.0-alpha.5 with docker-1.11.2.
@dchen1107 @kubernetes/goog-image @andyzheng0831
Automatic merge from submit-queue
support for mounting local-ssds on GCI
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
@vulpecula @roberthbailey @andyzheng0831 @kubernetes/goog-image
Automatic merge from submit-queue
Trusty: fix the 'ping' issue and fluentd-gcp issue #26379
This PR is mainly for being picking up the fix in #27016 and #27102 in trusty code, so that we can fix the issues in the release-1.2 branch for GCI. It contains two parts:
(1) Adding iptables rules to accept ICMP traffic, otherwise 'ping' from a pod does not work;
(2) Revising the code for cleaning up docker0 stuff including the bridge and iptables rules. I slightly refactor the code of starting kubelet and removing docker0 stuff before starting kubelet. The old code did it after starting kubelet but before restarting docker. I think doing it before starting kubelet is safter.
cc/ @roberthbailey @fabioy @dchen1107 @a-robinson @kubernetes/goog-image
Automatic merge from submit-queue
cluster/gce/coreos: Update heapster apiVersion
This fixes an inadvertant search-replace error in #26617.
The error was missed then because the search-replace issue wasn't
present in the standalone controllers, but was in all the others.
I verified that with this change heapster comes up under the default influxdb monitoring and without this change addon manager spits out validation failure errors for the heapster yaml.
cc @yifan-gu
This change adds support for mounting local ssds on GCI.
It updates the previous container-vm behavior as well to
match that for GCI nodes by mounting the local-ssds under
the same path (/mnt/disks/ssdN).
Automatic merge from submit-queue
GCI: add support for network plugin
I had run e2e against a cluster with both master and nodes on GCI a couple of times. The PR auto tests will cover the hybrid cluster with just master on GCI.
cc/ @roberthbailey @fabioy @kubernetes/goog-image
Automatic merge from submit-queue
Exit image puller subshell
Exit the subshell with 0 so even if the last docker pull fails the pod doesn't end up in the error state.
This fixes an inadvertant search-replace error in #26617.
The error was missed then because the search-replace issue wasn't
present in the standalone controllers, but was in all the others.
Automatic merge from submit-queue
GCI: fix the issue #26379
This PR deletes docker0 explicitly to fix the issue. In some cases, coexistence of docker0 and cbr0 make troubles in GCI-based cluster instances.
I verified it in GKE. With the fix, fluentd-gcp pod shows no error. "curl google.com" can work inside a pod. Mark it as P0 to match the issue priority.
@a-robinson @roberthbailey @freehan @kubernetes/goog-image
Automatic merge from submit-queue
Enable support for memory eviction configuration via salt
Added evictions based on memory by default whenever the available memory is < 100Mi.
Updated GCE and GCI.
Automatic merge from submit-queue
Bump cluster autoscaler version and enable scale down by default
Follow up of https://github.com/kubernetes/contrib/pull/1148.
cc: @piosz @fgrzadkowski @jszczepkowski
Automatic merge from submit-queue
Re-enable node problem detector by default
Re-enable node problem detector started in gce cluster by default.
For now, in the master node, the node problem detector will be started and do nothing (see https://github.com/kubernetes/node-problem-detector/pull/13).
But in fact, in my test cluster, the master has no extra cpu to run the node problem detector, so node problem detector is started on all nodes except master, which is what we want but not expected...
@dchen1107
/cc @kubernetes/sig-node
/cc @andyzheng0831 for the gci script change.
[]()
Automatic merge from submit-queue
Don't run fluentd-es on GCI masters
It isn't run on containervm masters. It can't do anything on the master because the master doesn't have kube-proxy running to enable fluentd to talk to the elasticsearch service.
@andyzheng0831
Automatic merge from submit-queue
Add collection of the new glbc and cluster-autoscaler logs
I've incremented the version numbers by 2 to avoid conflicting with #26652. I'll make sure the potential conflict between the images gets resolved reasonably.
cc @piosz @bprashanth @aledbf
We want to continuously validate Docker releases (#25215), on GCI. This change
adds a new test config variable, `KUBE_GCI_DOCKER_VERSION`, through which we can
specify which version of Docker we want to run on the master and nodes. This
change also patches the Jenkins e2e-runner with the ability to fetch the latest
Docker (pre)release, and sets the aforementioned variable accordingly.
Automatic merge from submit-queue
GCI/Trusty: support the Docker registry mirror
@roberthbailey @zmerlynn please review it.
cc/ @fabioy @dchen1107 @kubernetes/goog-image FYI.
cc/ @ojarjur it is very straightforward to add support for GCI, which is pretty much like the change to ContainerVM's configure-vm.sh in your original PR #25841.
Automatic merge from submit-queue
GCI: correct the fix in #26363
This PR is mainly for correcting the fix to 'find' command in #26363. I added "-maxdepth 1" in an earlier change, and #26363 tried to fix it by changing the search path. This is potentially incorrect, when yaml files are in more than one layer deep. The real fix should be removing the "-maxdepth 1" flag from 'find' command. This PR also updates two minor places in the file configure-helper.sh introduced by two previous PR #26413 and #26048.
@roberthbailey @wonderfly
cc/ @dchen1107 @fabioy @kubernetes/goog-image
Automatic merge from submit-queue
Increase failure threshold for glbc liveness probe
This pod fails a liveness probe on occasion, probably because the failure thresholds are too strict. Simple enough that either reviewer can review.
Automatic merge from submit-queue
pin GCI version to milestone 52
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
@kubernetes/goog-image @dchen1107 @roberthbailey @andyzheng0831
This is mainly for pinning the 1.2 branch to GCI milestone 52
which contains correct docker and kubelet built in.
Doing this allows us to upgrade docker to v1.11 (issue #26455)
in GCI 53 without breaking the 1.2 release branch.
Automatic merge from submit-queue
Rebuild elasticsearch image to include changes since 1.2
Fixes#25360. I've pushed the image to GCR.
@jimmidyson @keontang @vishh
Automatic merge from submit-queue
Move the defaults setting of GCI to util.sh
fixes#26291
This change recovers some of the side effects of
https://github.com/kubernetes/kubernetes/pull/26197, i.e., keeps the defaults of
`NODE_IMAGE` and `NODE_IMAGE_PROJECT` to `MASTER_IMAGE` and
`MASTER_IMAGE_PROJECT`, for backward compatibility. Although it keeps
`OS_DISTRIBUTION` defaulting to `gci`, the default settings of these vars are
moved to `cluster/gce/util.sh` and conditioned on `OS_DISTRIBUTION==gci`.
@euank @roberthbailey Can you review?
Automatic merge from submit-queue
cluster/coreos: Update heapster addon to beta2
fixes#26616
As noted there, heapster was updated but not for gce/coreos which breaks anything that depends on heapster's new metrics API (i.e. autoscaling)
This change recovers some of the side effects of
https://github.com/kubernetes/kubernetes/pull/26197, i.e., keeps the defaults of
`NODE_IMAGE` and `NODE_IMAGE_PROJECT` to `MASTER_IMAGE` and
`MASTER_IMAGE_PROJECT`, for backward compatibility. Although it keeps
`OS_DISTRIBUTION` defaulting to `gci`, the default settings of these vars are
moved to `cluster/gce/util.sh` and conditioned on `OS_DISTRIBUTION==gci`.
Automatic merge from submit-queue
GCI: cherry-pick the fix in PR #25670
This PR simply copies the fix in #25670 into the GCI support.
cc/ @kubernetes/goog-image @dchen1107 @roberthbailey
Automatic merge from submit-queue
Switch DNS addons from skydns to kubedns
Change GCI and trusty cluster-helper scripts to use kubedns instead of skydns.
Unified skydns templates using a simple underscore based template and
added transform sed scripts to transform into salt and sed yaml
templates
Moved all content out of cluster/addons/dns into build/kube-dns and
saltbase/salt/kube-dns
Automatic merge from submit-queue
Support for cluster autoscaler in GCE Trusty and GCI images
Fixes: #26346
Ref: #26197
cc: @fgrzadkowski @vulpecula @piosz @jszczepkowski
Automatic merge from submit-queue
Prepull images in e2e
Quick and dirty image puller because the SQ stalled multiple times just *today* on image pull flake (https://github.com/kubernetes/kubernetes/issues/25277).
@kubernetes/sig-node @kubernetes/sig-testing wdyt?
Automatic merge from submit-queue
Make node-instance-group base names unique to prevent collisions
We create multiple IGMs for >1000 Node clusters. When we have a conflict on base name IGMs will fight over ownership of the VM that happen to have the name belonging to multiple IGMs.
This change will increase reliability of starting big clusters.
cc @wojtek-t @alex-mohr @roberthbailey @mikedanese
Automatic merge from submit-queue
Add node problem detector as an addon pod.
```release-note
Introduce a new add-on pod NodeProblemDetector.
NodeProblemDetector is a DaemonSet running on each node, monitoring node health and reporting
node problems as NodeCondition and Event. Currently it already supports kernel log monitoring, and
will support more problem detection in the future. It is enabled by default on gce now.
```
This PR enables NodeProblemDetector as an add-on pod.
/cc @mikedanese @kubernetes/sig-node
[]()
Automatic merge from submit-queue
Configuration for GCP webhook authentication and authorization
This PR adds configuration for GCP webhook authentication and authorization in ContainerVM and GCI. The change of configure-vm.sh and kube-apiserver.manifest is directly copied from @cjcullen's PR #25380 and #25296. The change in GCI script configure-helper.sh includes the support for webhook authentication and authorization, and also some code refactor to improve readability.
@cjcullen @roberthbailey @zmerlynn please review it. The original PRs are P1, please mark this as P1.
cc/ @fabioy @kubernetes/goog-image FYI.
I verified it by running e2e tests on GCI cluster. Without the GCI side change, cluster creation fails as being capture by GKE Jenkins tests. I don't test when the two env GCP_AUTHN_URL and GCP_AUTHZ_URL are set, because they are only set in GKE. After this PR is merged, @cjcullen will test in GKE.
Automatic merge from submit-queue
cluster/gce/coreos: Set service-cluster-ip-range
Broken by #19242
See also #26002
This is necessary to kube-up for me, but depending on how #26002 plays out, this PR might not be necessary. Happy to close this or merge or whatever depending on what's best.
cc @yifan-gu @sjpotter @mikedanese
Automatic merge from submit-queue
Updating CentOS image, adding heat back to the required cli tools.
[]()
Updated the CentOS cloudimage to the latest available, and also added heat to the required list of cli tools. This is an interim step to replacing all the commands with openstackclient.
Automatic merge from submit-queue
add CIDR allocator for NodeController
This PR:
* use pkg/controller/framework to watch nodes and reduce lists when allocate CIDR for node
* decouple the cidr allocation logic from monitoring status logic
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/19242)
<!-- Reviewable:end -->
Automatic merge from submit-queue
GCI: Fix the condition for using the default image
This PR revises the condition for using the default GCI image. The old logic is not convenient for manually run e2e tests in some cases (mainly for GCI team to test custom images). The new logic by this PR is very similar to the logic in using ContainerVM. When setting distro to "gci", if master or node image is unset, we use gci-dev for it. If either is set, we respect it.
@roberthbailey @zmerlynn @dchen1107 please review it, and we should cherry pick it in release-1.2 branch. Thanks!
cc/ @kubernetes/goog-image @adityakali FYI
Automatic merge from submit-queue
Fix hyperkube's layer caching, and remove --make-symlinks at build time
@david-mcmahon This is required before you release. Explanation in the code.
Automatic merge from submit-queue
AWS: More support for ap-northeast-2 region
Issue #24446
The new AWS region for Seoul, Korea (ap-northeast-2)
was launched in January 2016
https://aws.amazon.com/blogs/aws/now-open-aws-asia-pacific-seoul-region/
But it requires a few changes.
To test:
```
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=ap-northeast-2a
export MASTER_SIZE=t2.medium
export NODE_SIZE=t2.medium
export NUM_NODES=4
cluster/kube-up.sh
```
I assigned the AMIs by checking the specific version used from `ap-northeast-1`,
and finding the same image with the same datestamp.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/24464)
<!-- Reviewable:end -->
Automatic merge from submit-queue
GCI/Trusty: Fix an issue in using 'find' commands
This PR makes the logic of 'find' command consistent with the 'cp' command afterwards, i.e., only check one layer of a given dir. Without this fix, we have seen a recent breakage after PR #25309 added the file cluster/addons/fluentd-elasticsearch/es-image/template-k8s-logstash.json. The 'find' command discovers this json file, but the 'cp' command fails.
@roberthbailey @dchen1107 @zmerlynn please review this fix, and mark it as a cherry pick candidate. I already verified this fix can resolve the breakage.
cc/ @wonderfly @fabioy @kubernetes/goog-image FYI
Automatic merge from submit-queue
cluster/juju: Updated the url for the getting started doc
Minor change to update the URL pointing at the "Running Kubernetes locally via Docker" document
Automatic merge from submit-queue
GCI: Enable the log of upstart jobs
This PR enables the log of upstart jobs in master.yaml and node.yaml. By default, log of upstart jobs are enabled in Trusty and placed in /var/log/upstart, but not enabled in GCI. This change explicitly directs the log to the system logger. For trusty, they are in /var/log/syslog file. In GCI, we can check it using "journalctl". This change will be useful for debugging if cluster initialization fails.
@roberthbailey @maisem @dchen1107 please review it. This will be useful for issues like #23634. We should also cherry pick it in release-1.2
cc/ @fabioy @zmerlynn @wonderfly FYI.
Automatic merge from submit-queue
Automatically download the latest stable release version of Kubernetes.
The ubuntu version of download-release.sh included in the binary release downloads the released .tar.gz file again. Right now the version of the downloaded file is manually encoded within the script. This change fetches the released version automatically, similar to the shell script available on the main Kubernetes site below:
https://get.k8s.io/
Ideally the installation on bare metal ubuntu should work with the files available in the already downloaded package.
@mikedanese
Automatic merge from submit-queue
add index template for es aggregations
This index template helps us to do es aggregations of namespace_name, pod_name and container_name. Then after doing eggs, we will get the whole name not all the spilt pieces.
fix#25127
Automatic merge from submit-queue
Salt configuration for the new Cluster Autoscaler for GCE
Adds support for cloud autoscaler from contrib/cloud-autoscaler in kube-up.sh GCE script.
cc: @fgrzadkowski @piosz
Automatic merge from submit-queue
Use --format='value(name)' with gcloud instead of grep/awk/cut
Fixing our fragile parsing of `gcloud` is getting old (#24746, #25159, maybe others?).
Instead, let's just get the proper output out of `gcloud` in the first place.
Automatic merge from submit-queue
PSP admission
```release-note
Update PodSecurityPolicy types and add admission controller that could enforce them
```
Still working on removing the non-relevant parts of the tests but I wanted to get this open to start soliciting feedback.
- [x] bring PSP up to date with any new features we've added to SCC for discussion
- [x] create admission controller that is a pared down version of SCC (no ns based strategies, no user/groups/service account permissioning)
- [x] fix tests
@liggitt @pmorie - this is the simple implementation requested that assumes all PSPs should be checked for each requests. It is a slimmed down version of our SCC admission controller
@erictune @smarterclayton
Automatic merge from submit-queue
Remove myself from a bunch of OWNERS files
For the time being I am too overloaded to do non scheduler/admission related reviews that aren't explicitly assigned to me.
cc/ @brendandburns
Automatic merge from submit-queue
Change default clusterCIDRs from /16 to /14 in GCE configs allowing 1000 Node clusters by default.
cc @thockin @roberthbailey @wojtek-t @zmerlynn @davidopp
There are a lot of references to https://get.k8s.io/ over the internet.
Most of such references do not describe KUBERNETES_SKIP_DOWNLOAD env variable
and newbies can get into a situation described below:
- execute `wget -q -O - https://get.k8s.io | bash`
- receive a failure due too missed packages or some configs
- fix the issue
- try again `wget -q -O - https://get.k8s.io | bash`
In this case, get-kube.sh will not check that kubernetes directory already
exist and repeat download again.
Lets make get-kube.sh more user-friednly and check existence if kubernetes dir
Automatic merge from submit-queue
Openstack provider
Our pull request delivers solution to create Kubernetes cluster on the top of OpenStack. Heat OpenStack Orchestration engine describes the infrastructure for Kubernetes cluster. CentoOS images are used for Kubernetes host machines.
We tested our solution with DevStack and Citycloud provider.
We believe that our solution will fill the gap that which is on the market.
<!-- Reviewable:start -->
---
This change is [<img src="http://reviewable.k8s.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](http://reviewable.k8s.io/reviews/kubernetes/kubernetes/21737)
<!-- Reviewable:end -->
Apparently our cluster start time increased, to the point where users
are reporting spurious timeouts (#23623) and users are reporting that
increasing the timeout fixes the issue (thanks @paralin for the
suggestion and @jlfields for confirming).
Fix#23623
Automatic merge from submit-queue
GCI/Trusty: Fix the running of kube-addon-manager
This PR fixes the issue that kube-addon-master (added in #23600) is not started. Without this fix, no kube-system pods can be running correctly. As a result, the GCI-based Jenkins testing k8s head has been down for a couple of days. The root cause is that we stopped to use namespace.yaml, but configure-helper.sh still tries to copy it. This PR also gets rid of /var/cache/kubernetes-install/kube_env.yaml, as it is not needed anymore after #24108.
@mikedanese @roberthbailey @dchen1107 please review it. If possible please mark it as P1, as it blocks GCI-based Jenkins tests.
cc/ @kubernetes/goog-image @fabioy FYI
Automatic merge from submit-queue
Added vsphere support for vagrant
Since the native vsphere support (using govc library) requires admin permissions on ESX/vCenter, not everyone can have such permissions. So I'm adding a vsphere support using vagrant using vagrant-vsphere plugin
Automatic merge from submit-queue
Move godeps to vendor/
This is a first-step towards glide support, maybe we don't want or need to take this, but it was easy to try.
This fails to compile, not sure why:
```
# k8s.io/kubernetes/pkg/apis/extensions/v1beta1
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2703: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: extensions.ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2719: undefined: ClusterAutoscaler
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: extensions.ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: undefined: ClusterAutoscalerList
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3468: Convert_extensions_JobSpec_To_v1beta1_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:328
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:3845: Convert_extensions_ScaleStatus_To_v1beta1_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:98
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:4737: Convert_v1beta1_JobSpec_To_extensions_JobSpec redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:380
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:5186: Convert_v1beta1_ScaleStatus_To_extensions_ScaleStatus redeclared in this block
previous declaration at _output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion.go:120
_output/local/go/src/k8s.io/kubernetes/pkg/apis/extensions/v1beta1/conversion_generated.go:2723: too many errors
!!! Error in /home/thockin/tmp/godep-vendor/src/k8s.io/kubernetes/hack/lib/golang.sh:417
```
Automatic merge from submit-queue
cluster/images/hyperkube: create symlink for each server
Add a kubelet symlink so that the hyperkube image can appear as a kubelet image. https://github.com/kubernetes/kubernetes/issues/24510
Automatic merge from submit-queue
GCI: Add two GCI specific metadata pairs
This PR adds two GCI specific metadata pairs when using GCI image.
(1) "gci-update-strategy": by default the GCI in-place updater is enabled. It means that when a new image is released, the instance on the old image will be upgraded to the new image. In this change, we turn it off;
(2) "gci-ensure-gke-docker": GCI is built with two versions of docker. When this metadata is set to "true", the version satisfying kubernetes qualification will be used. Setting this metadata prevents from using incorrect docker version.
Automatic merge from submit-queue
Use v1.6.2-1 tag for build.
Is there any reason these don't use the VERSION file like everything else? cc @luxas @ixdy
Automatic merge from submit-queue
Add an entry to the salt config to allow Debian jessie on GCE.
```release-note
Add an entry to the salt config to allow Debian jessie on GCE.
As with the existing Wheezy image on GCE, docker is expected
to already be installed in the image.
```
[]()
Automatic merge from submit-queue
Convert internal types to use exact precision integers
This makes conversion more suitable for future optimizations, and we need to stop pretending for some of our internal types that the width of the int doesn't matter.
@wojtek-t
Automatic merge from submit-queue
Fix detect-node-names to not error out if there are no nodes
Fixes#21564.
Teardown was not working correctly in rare cases because `detect-node-names` was failing before any of the actual cleanup was run. I'm pretty sure the issue was that there was an instance group, but no instances in the instance group, so we bailed out when we tried to expand the bash array.
This PR adds a guard so we don't bail if the array is empty.
cc @jlowdermilk @spxtr
Automatic merge from submit-queue
Promote Pod Hostname & Subdomain to fields (were annotations)
Deprecating the podHostName, subdomain and PodHostnames annotations and created corresponding new fields for them on PodSpec and Endpoints types.
Annotation doc: #22564
Annotation code: #20688
CentOS 7 Core nodes running on OpenStack with an SSL-enabled API
endpoint results in the following error without this patch:
F0425 19:00:58.124520 5 server.go:100] Cloud provider could not be initialized: could not init cloud provider "openstack": Post https://my.openstack.cloud:5000/v2.0/tokens: x509: failed to load system roots and no roots provided
The root cause is that the ca-bundle.crt file is actually a symlink
which points to a directory which wasn't previously exposed.
[root@kubernetesstack-master ~]# ls -l /etc/ssl/certs/ca-bundle.crt
lrwxrwxrwx. 1 root root 49 18 nov 11:02 /etc/ssl/certs/ca-bundle.crt -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
[root@kubernetesstack-master ~]#
Fixed the order of fields for basic_auth.
This provider still needs to leverage common.sh for generating proper credentials though.
Also documented a pattern for how to get the SWIFT_SERVER_URL automatically
Hard-coded the enabling of the common addons:
- logging
- kube-dashboard
- monitoring
Will make it configureable in a subsequent PR.
Also need to enable configuration of basic_auth.csv
This makes it so that we download the OS image automatically.
Also contains other usability improvements:
- kubectl context created with heat stack name
- Bumped default minions to 3
Making the assumption that the person running kube-up has their
Openstack environment setup, those same variables are being passed
into heat, and then into openstack.conf.
The salt codebase was modified to add openstack as well.
If someone has an openrc as part of their profile, this will make kube-up work automatically.
The only things that have to be modified are in config-default.sh, either by editing the file or setting environment variables.
This assumes you have your environement variables set correctly.
When ENABLE_PROXY is set to true, it takes the current proxy
settings and applies them to the heat configuration.
Also modified the defaults system in config-default.sh
Automatic merge from submit-queue
Update Docker version after cockpit installation
Fixes https://github.com/kubernetes/kubernetes/issues/24530
The vagrant setup didn't worked for me because `cockpit cockpit-kubernetes` brings their own Docker version (1.7) which doesn't work and the master components doesn't come up. More information about this bug are in my [issue](https://github.com/kubernetes/kubernetes/issues/24530).
My test system:
```bash
$ uname -a
Darwin MyMacBook.local 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 22:08:05 PST 2016; root:xnu-3248.40.184~3/RELEASE_X86_64 x86_64
$ vagrant --version
Vagrant 1.8.1
$ VBoxManage --version
5.0.16r105871
```
Automatic merge from submit-queue
Add support for running clusters on GCI
Google Container-VM Image (GCI) is the next revision of Container-VM. See documentation at https://cloud.google.com/compute/docs/containers/vm-image/. This change adds support for starting a Kubernetes cluster using GCI.
With this change, users can start a kubernetes cluster using the latest kubelet and kubectl release binary built in the GCI image by running:
$ KUBE_OS_DISTRIBUTION="gci" cluster/kube-up.sh
Or run a testing cluster on GCI by running:
$ KUBE_OS_DISTRIBUTION="gci" go run hack/e2e.go -v --up
The commands above will choose the latest GCI image by default.
Added KUBE_CONFIG_CONTEXT environment variable to customize the
kubeconfig context created at the end of the aws kube-up script.
Signed-off-by: Christian Stewart <christian@paral.in>
Automatic merge from submit-queue
Switch to ABAC authorization from AllowAll
Switch from AllowAll to ABAC. All existing identities (that are created by deployment scripts) are given full permissions through ABAC. Manually created identities will need policies added to the `policy.jsonl` file on the master.
Automatic merge from submit-queue
don't source the kube-env in addon-manager
This was added in 2feb658ed7 which became unused after #23603 but wasn't removed
The UI didn't work with vSphere kube-up implementation. This fixes
that by making the following changes:
* Configure the apiserver with admission controls, especially
ServiceAccount. This will provide the token to the dashboard pod
that it needs to talk to the apiserver. This will also improve other
pods that require service accounts.
* Add routes to the master so it can communicate with the pods, so
hitting the https://MASTER/ui URL will allow it to contact the
pods.
* Add an extra subject for the cluster IP to the apiserver, so when
the dashboard communicates with the apiserver, the certificate
matches the IP address it's using.
Automatic merge from submit-queue
Trusty: Add debug supports for docker and kubelet
This PR adds debug support in two aspects: (1) For a test cluster, docker command will have "--debug" flag. Recently we noticed that this is very helpful in debug e2e test failures; (2) The kubelet command line will be put in /etc/default/kubelet. If a developer wants to test kubelet flags without recreating a cluster, she/he only needs to revise this file and then run "initctl restart kubelet". In addition, this PR fixes a couple of small things like comments and alignment.
Test result:
(1) Manually verified changing /etc/default/kubelet and run "initctl restart kubelet";
(2) Verified docker command line flag "--debug";
(3) e2e on pure trusty cluster and hybrid cluster all passed.
@roberthbailey @dchen1107 @zmerlynn please review it.
cc/ @yujuhong @fabioy @wonderfly FYI.
Automatic merge from submit-queue
Initial kube-up support for VMware's Photon Controller
This is for: https://github.com/kubernetes/kubernetes/issues/24121
Photon Controller is an open-source cloud management platform. More
information is available at:
http://vmware.github.io/photon-controller/
This commit provides initial support for Photon Controller. The
following features are tested and working:
- kube-up and kube-down
- Basic pod and service management
- Networking within the Kubernetes cluster
- UI and DNS addons
It has been tested with a Kubernetes cluster of up to 10
nodes. Further work on scaling is planned for the near future.
Internally we have implemented continuous integration testing and will
run it multiple times per day against the Kubernetes master branch
once this is integrated so we can quickly react to problems.
A few things have not yet been implemented, but are planned:
- Support for kube-push
- Support for test-build-release, test-setup, test-teardown
Assuming this is accepted for inclusion, we will write documentation
for the kubernetes.io site.
We have included a script to help users configure Photon Controller
for use with Kubernetes. While not required, it will help some
users get started more quickly. It will be documented.
We are aware of the kube-deploy efforts and will track them and
support them as appropriate.
Automatic merge from submit-queue
Trusty: Add retry in curl commands
This fix is for improving robustness in fetch critical metadata files when the metadata server is temporarily unreachable.
@roberthbailey @zmerlynn @dchen1107 please review it.
cc/ @fabioy @wonderfly FYI.
Automatic merge from submit-queue
Use kube-system namespace
Fixes#23153.
Sadly, kube-system isn't automatically created, so people need to make
sure to create it in their turnup scripts. Also after creating
kube-system it can take 10+ seconds for master and proxy to show up.
I tested the equivalent of these changes locally, but not these changes
themselves as I don't have a dev/build env up, so please read carefully
and maybe try them out!
This is for: https://github.com/kubernetes/kubernetes/issues/24121
Photon Controller is an open-source cloud management platform. More
information is available at:
http://vmware.github.io/photon-controller/
This commit provides initial support for Photon Controller. The
following features are tested and working:
- kube-up and kube-down
- Basic pod and service management
- Networking within the Kubernetes cluster
- UI and DNS addons
It has been tested with a Kubernetes cluster of up to 10
nodes. Further work on scaling is planned for the near future.
Internally we have implemented continuous integration testing and will
run it multiple times per day against the Kubernetes master branch
once this is integrated so we can quickly react to problems.
A few things have not yet been implemented, but are planned:
- Support for kube-push
- Support for test-build-release, test-setup, test-teardown
Assuming this is accepted for inclusion, we will write documentation
for the kubernetes.io site.
We have included a script to help users configure Photon Controller
for use with Kubernetes. While not required, it will help some
users get started more quickly. It will be documented.
We are aware of the kube-deploy efforts and will track them and
support them as appropriate.
E2e shows occasional kubectl failures here, so add some retries. We may want
to make this more general, but I think we should try it out in small scope
first.
Also clean up the retry loop so it doesn't process errors as successful runs
(discovered in testing).
Also simplify a bit of go template syntax.
Testing: I made kubectl randomly fail 50% of the time ($RANDOM%2 ==0) and
iterated until this gave me more helpful results. Still not perfect, but
better.
Automatic merge from submit-queue
jenkins: Allow configuration of release bucket
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
Automatic merge from submit-queue
Trusty: Handle the new var in kube-proxy manifest
This is to capture the kube-proxy manifest change in PR #24429.
@roberthbailey @fabioy @zmerlynn please review this change and mark it as cherry pick candidate. We need to catch up 1.2.3 release.
cc/ @dchen1107 @wonderfly @cjcullen FYI.
I have verified this fix. Without this fix, kube-proxy pod in Trusty nodes cannot be started correctly, i.e., the command line has an unhadled variable. And some other kube-system pods do not work correctly as kube-proxy is not working well. After applying this fix, kube-proxy can be started correctly, and all kube-system pods run successfully.
Automatic merge from submit-queue
Fix GLBC cluster addon README link
Fix the link to L7 load balancer controller in GLBC cluster addon README.
Fixed#24462.
Automatic merge from submit-queue
Strip comments from configure-vm.sh for gce
We are getting very close to the 32KiB limit on GCE metadata entry length. We used to strip comments before putting the value in metadata, but I think we removed it in a refactor because it wasn't absolutely necessary, and leaving it out made the scripts slightly cleaner. It's close to being necessary again.
Removing comments reduces the size from 31,609B to 27,221B: https://www.diffchecker.com/0xmmecvw.
Automatic merge from submit-queue
add HOME env variable for kube-addons service
Fix https://github.com/kubernetes/kubernetes/issues/23973.
Briefly, systemd service does not know the `HOME` environment variable which causes the kubectl write schema file into `/.kube` while it is expected to be `/root/.kube`.
Automatic merge from submit-queue
Add easy-rsa to hyperkube container
Otherwise gets downloaded a runtime, which kind of breaks the container model.
See [comment](https://github.com/kubernetes/kubernetes/issues/20514#issuecomment-195835786) in #20514 - this causes dockerized install of k8s to fail if you're behind a proxy. make-ca-cert.sh already looks for a local copy of easy-rsa.tar.gz before downloading it, so this drops the tarball in the expected place in the container.
Automatic merge from submit-queue
Fix spacing in usage_from_stdin and info_from_stdin (issue #24186).
If "a" is a bash array, then the syntax to append the contents of $line as a
new element to the array is a+=("$line"), not messages+=$line
Using the former syntax just seems to append to the first element, creating a
long string and thus losing newline information.
Fixing this allows us to drop some empty lines from invocations of
usage_from_stdin.
Automatic merge from submit-queue
add labels to kube component static pods
```
$ k --namespace=kube-system get po -l 'tier in (control-plane)'
NAME READY STATUS RESTARTS AGE
kube-apiserver-k-7-master 1/1 Running 2 1m
kube-controller-manager-k-7-master 1/1 Running 1 1m
kube-scheduler-k-7-master 1/1 Running 0 54s
$ k --namespace=kube-system get po -l 'tier in (node)'
NAME READY STATUS RESTARTS AGE
kube-proxy-k-7-minion-eheu 1/1 Running 0 1m
kube-proxy-k-7-minion-mwo9 1/1 Running 0 1m
kube-proxy-k-7-minion-xw6m 1/1 Running 0 1m
```
cc @bgrant0607 @thockin @gmarek
Fixes#21267
Automatic merge from submit-queue
add config-test.sh to cluster/centos so we can run e2e test on centos/fedora/rhel
so I can run e2e test on centos locally using the following command
```console
KUBERNETES_PROVIDER=centos KUBERNETES_CONFORMANCE_TEST=y ./cluster/test-e2e.sh
```
Automatic merge from submit-queue
don't ship kube-registry-proxy and pause images in tars.
pause is built into containervm. if it's not on the machine we should just pull
it. nobody that I'm aware of uses kube-registry-proxy and it makes build/deployment
more complicated and slower.
This allows others to leverage the existing E2E code to test some
patched kube binary by simply overriding the bucket and reusing many of
the existing scripts
If "a" is a bash array, then the syntax to append the contents of $line as a
new element to the array is a+=("$line"), not messages+=$line
Using the former syntax just seems to append to the first element, creating a
long string and thus losing newline information.
Fixing this allows us to drop some empty lines from invocations of
usage_from_stdin.
Automatic merge from submit-queue
Fix GKE kube-up to correctly find an IGM from a multi-zone cluster
I've confirmed that this successfully brings up a cluster, fixing the immediate issue with the new e2e test. Sorry about not properly vetting it in the original PR (#24075).
This does cause a warning message to be printed based on the handling of the NUM_NODES variable though, which I could fix if you guys think it's worth it:
```
Detected 6 ready nodes, found 6 nodes out of expected 3. Found more nodes than expected, your cluster may not behave correctly.
```
@quinton-hoole
Automatic merge from submit-queue
Fix log dump for new gcloud
`gcloud compute instance-groups managed list-instances` at CI has self-link for instance instead of just name. Fixes#24120
Automatic merge from submit-queue
Trusty: Fixes for running GKE master
This PR includes two fixes for running GKE master on our image:
(1) The kubelet command line assembly had a missing part for cbr0. We did not catch it because the code path is not covered by OSS k8s tests;
(2) Remove the "" from the variables in the cert files. It causes a parsing issue in GKE. Again, this code path is not covered by k8s tests.
This PR also refactors the code for assembling kubelet flag. I move all logic into a single function assemble_kubelet_flags in configure-helper.sh for better readability and also simplify node.yaml and master.yaml.
@roberthbailey @dchen1107 please review it, and mark it as cherrypick-candidate. This PR is verified by @maisem. Together with his CL for GKE, we can run GKE cluster with master on our image and nodes on ContainerVM.
cc/ @maisem @fabioy @wonderfly FYI
This only applies to gce kube-up. 60 seconds of open connection should
be sufficient for anything that we should be downloading. The release
tar is currently 255M.
Automatic merge from submit-queue
Make kube2sky and skydns docker images cross-platform
ARM tracking issue: #17981
Continues on: #19216
Make it possible to create `kube2sky` and `skydns` docker images for ARM and other architectures too
Build in a container, so `golang` isn't a dependency
I've preserved the original default behaviour:
- `skydns`: It just compiles with go on host
- `kube2sky`: Build an image
@brendandburns @dchen1107 @ArtfulCoder @thockin @fgrzadkowski
Automatic merge from submit-queue
Up to golang 1.6
A second attempt to upgrade go version above `go1.4`
Merge ASAP after you've cut the `release-1.2` branch and feel ready.
`go1.6` should perform slightly better than `go1.5`, so this time it might work
@gmarek @wojtek-t @zmerlynn @mikedanese @brendandburns @ixdy @thockin
pause is built into containervm. if it's not on the machine we should just pull
it. nobody that I'm aware of uses kube-registry-proxy and it makes build/deployment
more complicated and slower.
Automatic merge from submit-queue
Cross-build hyperkube and debian-iptables for ARM. Also add a flannel image
We have to be able to build complex docker images too on `amd64` hosts.
Right now we can't build Dockerfiles with `RUN` commands when building for other architectures e.g. ARM.
Resin has a tutorial about this here: https://resin.io/blog/building-arm-containers-on-any-x86-machine-even-dockerhub/
But it's a bit clumsy syntax.
The other alternative would be running this command in a Makefile:
```
# This registers in the kernel that ARM binaries should be run by /usr/bin/qemu-{ARCH}-static
docker run --rm --privileged multiarch/qemu-user-static:register --reset
```
and
```
ADD https://github.com/multiarch/qemu-user-static/releases/download/v2.5.0/x86_64_qemu-arm-static.tar.xz /usr/bin
```
Then the kernel will be able to differ ARM binaries from amd64. When it finds a ARM binary, it will invoke `/usr/bin/qemu-arm-static` first and lets `qemu` translate the ARM syscalls to amd64 ones.
Some code here: https://github.com/multiarch
WDYT is the best approach? If registering `binfmt_misc` in the kernels of the machines is OK, then I think we should go with that.
Otherwise, we'll have to wait for resin's patch to be merged into mainline qemu before we may use the code I have here now.
@fgrzadkowski @david-mcmahon @brendandburns @zmerlynn @ixdy @ihmccreery @thockin
Automatic merge from submit-queue
support NETWORK_PROVIDER=cni for KUBERNETES_PROVIDER=vagrant
While trying to develop CNI plugins for K8's, I found the docs referenced the support of --network-plugin=cni for kubelet, but this wasn't surfaced up via salt to support env NETWORK_PROVIDER=cni before a kube-up deployment.
This PR is my attempt at adding CNI support to the kube-up happy path, following a lot of similar work for NETWORK_PROVIDER=kubenet which already exists.
Also, I've added the ability to consume CNI plugin's (binaries) and configuration files from the local cluster/network-plugins directory into the necessary locations as referenced here for CNI:
http://kubernetes.io/docs/admin/network-plugins
This allows a local developer to easily work on CNI plugin development while following the existing kube-up.sh docs and process.
In general, i've struggled to find any authoritative information or answers to my questions in slack regarding CNI progress / correct integration, so comments encouraged here!
Files are taken from cluster/network-plugins/{bin,conf} to be consumed within a vagrant kube-up.sh environment.
Paths used for configuration files and the 'cni' name of the network provider are all from the kubernetes documentation, but the actual implementation in the salt automation doesn't seem to exist.
Use of NETWORK_PROVIDER=cni is documented as useable (as well as it's affects on the runtime args of kubelet),
however the actual implimentation in the salt automation doesnt seem to exist.
this change attempts to fix that for the vagrant usecase.
Automatic merge from submit-queue
use apply instead of create to setup namespaces and tokens in addon manager
when the addon manager restarts, it takes ~15 minutes (1000 seconds) to start the sync loop because it retries creation of namespace and tokens 100 times. Create fails if the tokens already exist. Just use apply.
Automatic merge from submit-queue
Juju kube up
I found some problems with the kube-up script that this pull request addresses. We didn't have the kubectl binary in the correct location.
Just changing where we download the package from the master, and fixing the kube-down.sh script to remove those files.
We rename it to EPHEMERAL_BLOCK_DEVICE_MAPPINGS, and we also change the value
so that it starts with a `,`, instead of always inserting a comma before it.
In this way the value can be empty.
Also, if the user sets the (currently experimental) KUBE_AWS_STORAGE
environment variable to be "ebs", then we will not mount any instance storage
which will cause the machines to use EBS storage instead.
format-disks used to run with non-strict bash semantics, but this changed in
1.2 as we now merge it into the GCE script, so pipefail and errexit are both
set.
However, the way we list the ephemeral disks, by piping to grep, would cause an
exit code of 2 if there were no ephemeral disks.
Tolerate failure here by add `|| true`. The metadata service call is unlikely
to fail, so we continue to ignore that possibility.
Automatic merge from submit-queue
Fix so setup-files don't recreate/invalidate certificates that already exist
Fixes: #23197 and a lot of other DNS and dashboard issues
This is quite critical for `docker`-based users and should be considered as a **cherrypick-candidate** as it makes a lot of people wonder why Dashboard and/or DNS doesn't work. Example: https://github.com/kubernetes/dashboard/issues/374
Earlier when you shut your `docker.md` cluster down and started it again, all ServiceAccounts became invalidated by `setup-files` that happily ran once again and replaced all files. That made `apiserver` and `controller-manager` pick up the new certs (or there was a race condition, they _could_ have picked up the old certs too, but that's unlikely) and the old certs were put into `/var/run/secrets` because the ServiceAccount's Secrets were stored in etcd, which `setup-files` didn't touch.
@fgrzadkowski @huggsboson @thockin @mikedanese @vishh @pwittrock @eparis @bgrant0607
Automatic merge from submit-queue
Trusty: Regional release .tar.gz support
@zmerlynn and @roberthbailey please review it. This change is to support the feature added in PR #22234. The entire logic is pretty much the same as in #22234, with only few minor changes in implementation.
I had manually run e2e tests with "export RELEASE_REGION_FALLBACK=true" on two clusters: (1) Trusty on master nodes on ContainerVM; (2) Master and nodes all on trusty. All tests are green. I don't figure out a way to simulate regional fallback. But I did test the function download_or_bust() out-of-box.
cc/ @wonderfly @dchen1107 @fabioy FYI.
Sadly, kube-system isn't automatically created, so people need to make
sure to create it in their turnup scripts. Also after creating
kube-system it can take 10+ seconds for master and proxy to show up.
I tested the equivalent of these changes locally, but not these changes
themselves as I don't have a dev/build env up, so please read carefully
and maybe try them out!
Use kubectl create ns
Automatic merge from submit-queue
Create a new Deployment in kube-system for every version.
It appears that version numbers have already been properly added to these files. Small change to delete an old deployment entirely, so we can make a new one per version (like replication controllers).
We'll want to change this back once the kube-addons support deployments in a later version.
Mostly doc updates and cruft removal
- describe conformance test policy and howto in e2e-tests.md
- rm e2e test info from testing.md in the name of DRY
- rm cluster/test-conformance.sh; unusable in release tar, not e2e.go
- update e2e test link in write-a-getting-started-guide.md
There are actually two `roles` setting in ubuntu installation scripts.
One is roles as string, which can be set as env and then used in scripts.
The other is roles as array, which is used by internal handling to
locate specific role by offset.
This patch tries to distinguish roles meaning by declearing the second
as roles_array, thus eliminating its ambiguity.
When using this flag, this error is shown:
Flag --api-version has been deprecated, flag is no longer respected and will be deleted in the next release
Stop using the flag in the validate-cluster.sh script and avoid the warning.
This commit imports the latest development focus from the Charmer team
working to deliver Kubernetes charms with Juju.
Notable Changes:
- The charm is now assembled from layers in $JUJU_ROOT/layers
- Prior, the juju provider would compile and fat-pack the charms, this
new approach delivers the entirety of Kubernetes via hyperkube.
- Adds Kubedns as part of `cluster/kube-up.sh` and verification
- Removes the hard-coded port 8080 for the Kubernetes Master
- Includes TLS validation
- Validates kubernetes config from leader charm
- Targets Juju 2.0 commands
This should allow allow the non_masquerade_cidr option to get configured
in /etc/salt/minion.d/grains.conf, allowing the flag to used by kubelet
in /etc/sysconfig/kubelet. Default configuration is set in pillar
If we deleted an ELB, we often fail to delete the security group,
because deleting the ELB is invisibly asynchronous.
Add a retry loop around delete-security-group to work around this.
Fix#21147
The only tested-working distros are vivid, wily & jessie.
vivid should not really be used because it is no longer supported, so
recommend wily or jessie instead.
For other distros, recommend jessie instead.
Fix#21218
The previous jessie image had a broken cloud-init, which would use an
Ubuntu-specific 'nobootwait' argument when mounting disks. We now
override that in the image.
Fix#22549
also:
- adds a mechanism to build and upload hyperkube for non-official
releases
- adds a mechanism for proxying azkube's traffic
- --no-cloud-provider for now
- support specifying the resource group for CI scenarios
Allow the gcr.io/google_containers registry to be overridden
regionally by just blasting a new KUBE_ADDON_REGISTRY out. Instead of
adding every addon to Salt and asking all of the other consumers
(Trusty, Juju, Mesos, etc) to change, just script the sed ourselves.
This is probably the 9th grossest thing I've ever done, but it works
well, and it works quickly. I kind of wish it didn't.
It includes some performance improvements for parsing JSON (which is
very important for us, since all Docker logs are JSON) as well as a
couple new settings, like forcing of a flush of multiline logs after a
time period rather than having to wait until a new log is seen before
feeling confident flushing the previous one.
-Remove CPU limits to enable CPU bursting once 1.2 begins enforcing CPU limits.
-Add a memory limit for fluentd-es to match fluentd-gcp.
-Explicitly set requests to match limits.
This change revises the way to provide kube-system manifests for clusters on Trusty. Originally, we maintained copies of some manifests under cluster/gce/trusty/kube-manifests, which is not scalable and hard to maintain. With this change, clusters on Trusty will use the same source of manifests as ContainerVM. This change also fixes some minor problems such as shell variables and comments to meet the style guidance better.
Starting docker through Salt has always been problematic. Kubelet or
the babysitter process should start it. We've kept it around primarily
so we have a `service: docker` node for the Salt DAG.
Instead, we enable (but do not start) the Docker service in Salt. This
lets us keep the DAG node, but won't start it.
There's another bug in Salt, where watches will start the service even
on `service.enabled`. So we remove the watches, and move them to our
existing Salt bug-fix script.
The kubelet flag "nosystem" was removed recently, which breaks kubelet in Trusty. This changes remove the flag usage accordingly. It also revises several aspects of Trusty support to make it in the same page as running on ContainerVM, such as new flags in kubelet and new logic in api-server and etcd pods.
The Docker 1.9.1 package on Debian is broken, and the service fails to
install when run unattended. This is treated as an installation failure
and causes everything to fail.
However, the service can be started by Salt once we're not installing
the package, and indeed we restart docker anyway.
So, on Debian, use a helper script to install the docker package. The
script sets up a policy-rc.d file to prevent the service starting, and
then cleanly removes it afterwards (this would be difficult to do in
Salt, I believe).
PR #22022 added a new variable "cpurequest" in kube-proxy.manifest. This makes kubelet in Trusty fail to start the kube-proxy pod as this variable value is not set.
* In kube-up.sh, create a staging bucket with a location nearest the
zone being created. If new variable RELEASE_REGION_FALLBACK is set
(default false), create multiple buckets and stage to fallback
URLs. (In open source, this path is primarily for testing.)
* In configure-vm.sh, split the URL env variables by comma (if any
extra are present) and retry on the fallback URLs. Also factor the
hash checking into this path rather than outside, since a corrupt
release in a particular geo can be retried in a different geo.
* Remove the local already-staged .tar.gz checks. They've caused
several issues along the way, and with this code path become virtually
unmaintainable. (I could add a sentinel for each bucket it's possibly
staged to, but ew.)
Configurations in config-default.sh should take default values if they
are set outside of the script. `roles` option is an exception. This
patch fix it to maintain consistency behavior with other options.
I didn't expect glog to split single log statements onto multiple lines,
but apparently it does if they're long enough. This groups them back
together appropriately.
Default distro is jessie, due to the support situation with Ubuntu
distros. Default ubuntu distro is wily.
Update the docs to reflect the recommended distros with kube-up, and to
encourage contributions for other distros.
Spot instances take a lot longer to run; wait up to 15 minutes for the
nodes to launch when we're using spot instances. (Previously we were
waiting 5 minutes).
for vsphere provider docker currently only supports 1.9.1 release.
The older versions of docker are failing on jessie due to issue https://github.com/docker/docker/issues/18793
and newer version 1.10.x is not properly tested.
Based on the official debian image, with the following changes:
* Switched extlinux -> grub, because we need to change kernel options
to enable the memory cgroup controller, and extlinux is harder and has
reboot problems
* Added packages that would otherwise be installed as part of the boot
(just an optimization)
* Also add the cloud-initramfs-growroot package; with it the root
volume will resize.
* We add panic=10 & oops=panic to kernel options
* We install the packages as per the base image, except we install
awscli from pip, because the repo version is really old.
Once we've built the master, we can build kubeconfig. By doing so, if
we time out waiting for the nodes, the system is still configured
correctly.
In particular, spot instances can be slow to launch.
Related to issue #21200
I think we should probably leave this undocumented for now, until we
have a better way to launch multiple sets of nodes, but it's great for
cost savings while testing!
Fix#21200
We run unattened-upgrades manually, and then reboot automatically if we
find /var/run/reboot-required; then we check if any services need
restarting and restart them automatically using the needrestart tool.
This should mean we don't _have_ to build new images on every security
update, though we can do so to avoid a reboot.
Issue #21382
m3.large for > 150 nodes.
t2.micro often runs out of memory. The t2 class has very
difficult-to-understand behaviour when it runs out of CPU. The
m3.medium is reasonably affordable, and avoids these problems.
Fix#21151
Issue #18975
Otherwise we risk services coming up on the master before the backing
volume is ready.
If we then see the master-pd is already mounted, don't try to remount
it.
Issue #21155
In commit 07d7cfd3, people add ${DEBUG} == "true" in file
cluster/saltbase/salt/generate-cert/make-ca-cert.sh
But the default value for DEBUG is not set. In that commit, it set the value
of DEBUG in cluster/ubuntu/util.sh where it call this script. When using this
script in saltstack to bring up cluster in other cloud platforms, it will fail
to generate the cert since we set set -o nounset in make-ca-cert.sh and var DEBUG
does not set. Set a default value for DEBUG here will fix this problem.
If the ephemeral volume is present and mounted, don't try to reinitialize
them.
Don't block the boot if the ephemeral volume is corrupt / missing -
this enables us to cope with a stop/start & presumably also corruption.
In this case, we'll reformat the ephemeral storage.
Fix#21157
This is so we have the same behaviour as on GCE.
This also lets us change the bootstrap script or the config, which is
nice. Instance data is immutable on AWS once it is booted.
Fix#21150
This is good because it removes an obstacle to using the
cluster/ubuntu scripting to install Kubernetes into a restricted
environment where the machines can not open connections to arbitrary
external locations.
Also add debuggability to make-ca-cert.sh
Resolves#21037Resolves#21092
We were assuming the PROJECT env var was set, which the e2e tests do.
But PROJECT is normally not set on AWS (it is set on GCE); this broke as
part of the harmonization.
Revert to the pre-existing behaviour here, where we use "aws_" as the
prefix.
Fix#21141
This change corrects how we determine the log level. Moreover, it explicitly redirects kubelet log to /var/log/kubelet.log, as we noticed it may miss sometimes.
This change moves the code of running and monitoring addon pods in a daemon type upstart job, so that addon manifest monitoring can be restarted automatically upon failure. Second, it updates the usage of "kube-ui" to "dashboard" to match the change in PR #20330.
Package manager "dnf" does not work correctly with Salt
(cf https://github.com/saltstack/salt/issues/31001)
It causes Salt to consider that some packages (python, git, curl, etc.) are not
installed, which breaks the Vagrant Kubernetes setup.
Updating dnf and dnf-plugins-core to their latest version solves the issue.
Additionally, I've added the "fastestmirror" to dnf, which is useful if a
RPM mirror is broken or very slow. (In my case, dnf used a broken mirror which
froze the Kubernetes setup).
Fix script for case when neeed to setup cluster
in an existen VPC and subnet with ip mask example: 10.0.0.0/8.
Fixed bug to detect ip of master if provided MASTER_RESERVED_IP.
For some reason detecting master ip was moved to volumes and only when MASTER_RESERVED_IP=auto.
If specify IPv4 for MASTER_RESERVED_IP like `52.1.1.1`, than we could
not detect ip even during last steps of setuping cluster.
step the KUBE_MASTER_IP is reseted because there are no tag for the
volume.
This change support running kubernetes master on Ubuntu Trusty.
It uses pure cloud-config and shell scripts, and completely gets
rid of saltstack or the release salt tarball.
In the e2e tests detect-master is called directly. In turn, it calls
find-tagged-master-ip, which assumed that find-master-pd has already already
been called. But this wasn't true in the e2e case.
We add a call to find-master-pd; it is idempotent.
Combine the fields that will be used for content transformation
(content-type, codec, and group version) into a single struct in client,
and then pass that struct into the rest client and request. Set the
content-type when sending requests to the server, and accept the content
type as primary.
Will form the foundation for content-negotiation via the client.
- wget is not installed by default on fedora 23. Use curl instead
since it is always available on recent Fedora.
- The repo url for cockpit resulted in an http redirect message being
saved as the repo file which broke deployment. Update the url to
url that was redirected to and ensure that future redirects will be
handled correctly.
- The main Fedora 23 repo includes salt packages, and there is no
salt repo for 23. The salt bootstrap still creates a repo file for
a nonexistent repo, though, and this change removes it to avoid
having dnf report an error on every update.
build-runtime-config was being called in verify-prereqs, which didn't
match how GCE called it, and didn't seem to actually work.
Instead call it just before the master configuration is built. Also
call it just before the node configuration is built, even though the
nodes don't _currently_ require the runtime_config.
To fix it, I just add openssl depedency on "generate-cert" state. It
should work on Debian-like and RedHat-Like systems. (and, Archlinux,
Opensuse, etc)
Fixed error :
$ sudo salt 'kubernetes-master' state.apply
----------
ID: kubernetes-cert
Function: cmd.script
Result: False
Comment: Command 'kubernetes-cert' run
Started: 06:57:06.634203
Duration: 208.719 ms
Changes:
----------
pid:
793
retcode:
1
stderr:
/tmpm24T3R.sh: line 22: openssl: command not found
chgrp: cannot access '/srv/kubernetes/server.key': No such file or directory
chgrp: cannot access '/srv/kubernetes/server.cert': No such file or directory
chmod: cannot access '/srv/kubernetes/server.key': No such file or directory
chmod: cannot access '/srv/kubernetes/server.cert': No such file or directory
stdout:
After applying my patch (success) :
----------
ID: kubernetes-cert
Function: cmd.script
Result: True
Comment: Command 'kubernetes-cert' run
Started: 07:17:04.172384
Duration: 1041.092 ms
Changes:
----------
pid:
1045
retcode:
0
stderr:
Generating a 4096 bit RSA private key
......................................................................++
...............................................................................++
writing new private key to '/srv/kubernetes/server.key'
-----
stdout:
----------
If we don't use an elastic IP, the IP address will be lost if we lose
the master for any reason, and a replacement master will not have the
same IP. But the master IP is set both in client kubeconfig files and
the master SSL certificate. Hence the default should be to allocate an
elastic IP for the master.
One complication: AWS doesn't allow tags on elastic IPs, so it is hard
to track the elastic IP so we can delete it as part of kube-down.
Instead, we take the master EBS volume with the elastic IP. This is a
little odd, but works because the master volume & the master elastic IP
really need to be assigned to the same machine, so might be thought of
as a pair.
Also, we now delete the master EBS volume as part of kube-down, as
people expect kube-down to clean-up everything it creates.
We adapt the existing code to work across all zones in a region.
We require a feature-flag to enable Ubernetes-Lite
Reasons:
* There are some behavioural changes if users create volumes with
the same name in two zones.
* We don't want to make one API call per zone if we're not running
Ubernetes-Lite.
* Ubernetes-Lite is still experimental.
There isn't a parallel flag implemented for AWS, because at the moment
there would be no behaviour changes from this.
This is for internal use at the moment, for testing Ubernetes Lite, but
arguably makes the code a little cleaner.
Also rename KUBE_SHARE_MASTER -> KUBE_USE_EXISTING_MASTER
The version of Salt we're running doesn't do a good job of detecting
systemd. Inspired by https://github.com/saltstack/salt/issues/13926,
I added a provider-force to the services.
With this change, salt-call -l debug state.highstate succeeds, even for
repeated invocations.
The issue was (probably) benign, but definitely caused noised (e.g. #11297)
I got the package name wrong before, which meant that salt was failing
on invocations after the first (the name apparently doesn't matter on
the first invocation).
We've had a lot of salt problems with systemd on AWS; we have a
workaround in place that we use everywhere else, we should use that for
kube-node-unpacker too.
Fixes#19386
Issue #19388
Some functionality in hack/lib is currently depended on by
cluster/common.sh so kube-up from the full release tar (which
does not include hack/) is currently broken. With this PR we
create cluster/lib/ and move the necessary bits from hack/
over to get kube-up working again.
Fixes: 96d1b8d1b2
Signed-off-by: Mike Danese <mikedanese@google.com>
- Move keygen image mesosphere/kubernetes-mesos-keygen -> mesosphere/kubernetes-keygen:v1.0.0
- Remove resolveip in favor of github.com/karlkfi/resolveip (resolveip.sh)
- Remove util-temp-dir.sh in favor of github.com/karlkfi/intemp (intemp.sh)
- Refactor bash code to use intemp (extract functions to scripts)
- Remove util-ssl.sh in favor of mesosphere/kubernetes-keygen
Implement a flag that defines the frequency at which a node's out of
disk condition can change its status. Use this flag to suspend out of
disk status changes in the time period specified by the flag, after
the status is changed once.
Set the flag to 0 in e2e tests so that we can predictably test out of
disk node condition.
Also, use util.Clock interface for all time related functionality in
the kubelet. Calling time functions in unversioned package or time
package such as unversioned.Now() or time.Now() makes it really hard
to test such code. It also makes the tests flaky and sometimes
unnecessarily slow due to time.Sleep() calls used to simulate the
time elapsed. So use util.Clock interface instead which can be faked
in the tests.
For AWS EBS, a volume can only be attached to a node in the same AZ.
The scheduler must therefore detect if a volume is being attached to a
pod, and ensure that the pod is scheduled on a node in the same AZ as
the volume.
So that the scheduler need not query the cloud provider every time, and
to support decoupled operation (e.g. bare metal) we tag the volume with
our placement labels. This is done automatically by means of an
admission controller on AWS when a PersistentVolume is created backed by
an EBS volume.
Support for tagging GCE PVs will follow.
Pods that specify a volume directly (i.e. without using a
PersistentVolumeClaim) will not currently be scheduled correctly (i.e.
they will be scheduled without zone-awareness).
To build the python image, BUILD_PYTHON_IMAGE should be set during make.
When the addon script is running, it will check if python is installed
on the machine, if not, it will use the python image that built previously.
- default the release to the value of latest_release instead of
the string 'latest_release'
- use curl -O when retrieving kubectl to write output to disk instead
of to the screen
In MacOS there is error during setup a new cluster:
```
+ sed -i -e 's/^[[:blank:]]*#.*$//' -e '/^[[:blank:]]*$/d' /sometmpfile
sed: -e: No such file or directory
```
Because sed version of MacOS does not support modern features.
Currently if a pod is being scheduled with no meta.RolesKey label
attached to it, per convention the first configured mesos (framework)
role is being used.
This is quite limiting and also lets e2e tests fail. This commit
introduces a new configuration option "--mesos-default-pod-roles" defaulting to
"*" which defines the default pod roles in case the meta.RolesKey pod
label is missing.
Currently when using a custom elastic IP, the ENV var `KUBE_MASTER_IP` gets
the output of `$(assign-elastic-ip $ip $master_id)` assigned.
This is wrong since the command returns a string:
`Attaching IP 99.999.999.999 to instance i-9999999`
This patch fixes the assignment by calling `get_instance_public_ip` again.
This change is to pick up the fix in PR #18178. It avoids confusing
cadvisor when systemd is present in an instance but does not act
as the init system.
When PROXY_SETTING is empty, you end up an empty
command of "", as witnessed by this bash debug
output when +x is enabled:
+ '' /home/ubuntu/kube/make-ca-cert.sh 10.0.0.232 IP:10.0.0.232,IP:192.168.3.1,DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc,DNS:kubernetes.default.svc.cluster.local
Given the example:
PROXY_SETTING="http_proxy=http://server:port https_proxy=https://server:port"
You would not want this quoted on the script executed
on the remote master or minion node.
Enabling +e, for additional tracing and to
abort on any failure in the remote SSH session.
Adding a DEBUG parameter into config-default.sh allowing additional
debug information to be present in the logs during node rollout, using
bash's "set -x" when DEBUG=true
This allows resource usage monitoring test to launch 100 test pods per node, in
addition to the add-on pods.
Also reduce the test time length since the results over the shorter period are
representative enough.
This change refactors the code of preparing kube-system manifests
for trusty based cluster. The manifests used by nodes do not contain
salt configuration, so we can simply copy them from the directory
cluster/saltbase/salt, make a tarball, and upload to Google Storage.
- document needed packages in hostexec image
- add RunHostCmdOrDie
- kube-proxy e2e: port from ssh to hostexec
- use preset NodeName to schedule test pods to different nodes
- parallel launch of pods
- port from ssh to hostexec
- add timeout because nc might block on udp
- delete test container without grace period
- PrivilegedPod e2e: port from ssh to hostexec
- NodePort e2e: port from ssh to hostexec
- cluster/mesos/docker: Enable privileged pods
Remove a comment that disabled the redirection of output destined for
`/etc/salt/minion.d/grains.conf`. Must have been a commented added to
debug the generation of the line, to view it on `STDOUT`.
The node.yaml has some logic that will be also used by the kubernetes
master on trusty work (issue #16702). This change moves the code
shared by the master and node configuration to a separate script, and
the master and node configuration can source it to use the code.
Moreover, this change stages the script for GKE use.
- add busybox static pod to mesos-docker cluster
- customize static pods with binding annotations
- code cleanup
- removed hacky podtask.And func; support minimal resources for static pods when resource accounting is disabled
- removed zip archive of static pods, changed to gzip of PodList json
- pod utilities moved to package podutil
- added e2e test
- merge watched mirror pods into the mesos pod config stream
We use the AWS CLI support for --query and --filter instead; should be
more reliable and clearer.
Also set the output format to text, so we don't have to set it every
time and don't risk problems if we forget to set it.
Fixes#16747
We do still have to use JSON parsing in one place: ELB does not support
--filter, so we have to use Python there.
Addresses #15968
This patch removes KUBE_ENABLE_EXPERIMENTAL_API and similar calls in
favor of specifying desired features in KUBE_RUNTIME_CONFIG. Changes
have also been made to e2e scripts to re-enable using
KUBE_RUNTIME_CONFIG rather than EXPERIMENTAL_API env vars.
This also introduces KUBE_ENABLE_DAEMONSETS and KUBE_ENABLE_DEPLOYMENTS.
Signed-off-by: Christian Stewart <christian@paral.in>
We can't tag ASGs, but we can see what instances are running in an ASG,
and we can match those by our tags.
So look for our running instances, and look for the ASGs that created
them, and delete those.
This can be defeated (most notably if users change the ASG size to 0),
but it is safer that other deletion methods.
By setting KUBE_SHARE_MASTER=true we reuse an existing master, rather
than creating a new one.
By setting KUBE_SUBNET_CIDR=172.20.1.0/24 you can specify the CIDR for a
new subnet, avoiding conflicts.
Both these options are documented only in kube-up and clearly marked as
'experimental' i.e. likely to change.
By combining these, you can kube-up a cluster normally, and then kube-up
a cluster in a different AZ, and the new nodes will attach to the same
master.
KUBE_SHARE_MASTER is also useful for addding a second node
auto-scaling-group, for example if you wanted to mix spot & on-demand
instances.
If the `hostname` commands used in the polling loop fail, their stdout
is going to be empty and so `getent hosts` command will actually
succeed. For the loop to work as expected, make sure the subcommands
return a string which is an invalid host name.
Allows loading existing auth from kubeconfig on kube-up if a
valid KUBE_CONTEXT is specified, instead of always force
regenerating auth (basic or token) when creating a new cluster.
When KUBE_E2E_STORAGE_TEST_ENVIRONMENT is set to 'true', kube-up.sh script
will:
- Install the right packages for all storage volumes.
- Use devicemapper as docker storage backend. 'aufs', the default one on
Debian, does not support extended attibutes required by Ceph RBD and Gluster
server containers.
Tested on GCE and Vagrant, e2e tests for storage volumes passes without any
additional configuration.
This ensures nfs-common is installed on GCE, and provides a more
functional explanation/example. I launched two replication controllers
so that there were busybox pods to poke around at the NFS volume, and
so that the later wget actually works (the original example would have
to work on the node, or need some other access to the container
network). After switching to two controllers, it actually makes more
sense to use PV claims, and it's probably a configuration that makes
more sense for indirection for NFS anyways.
Fixes AWS ubuntu deployment due to extra-$(uname) vs extra-virtual
package being installed. See issue #14162
Signed-off-by: Christian Stewart <christian@paral.in>
We want to match the version of netcat that is installed on GCE. We
were having problems with netcat-openbsd having slightly different
timeout behaviour (on UDP packets; when there was no listener).
The --log-level="\debug\" flag in DOCKER_OPTS may not be correctly
interpreted in some cases. We turn on this flag only for testing
clusters. In addition to fixing the docker flag, this change
also removes the confusing numbers from the lines of separating
upstart jobs.
- Refactor common and gce/upgrade.sh to use arbitrary published releases
- Update hack/get-build to use cluster/common code
- Use hack/get-build.sh in cluster upgrade test logic
This avoid that we either waste cpu resources due to rounding or that we report
to much to the kubelet such that the e2e tests think they can schedule more than
resources are available.
This fixes https://github.com/mesosphere/kubernetes-mesos/issues/437