kubernetes

Author	SHA1	Message	Date
Madhusudan.C.S	edef3af34f	Split federation-{up,down} from e2e-{up,down}.	2017-02-24 14:27:31 -08:00
Marco Ceppi	07ef43b630	Update owners file to reflect Juju/Charm knowledgeable reviewers	2017-02-24 11:57:19 -05:00
Kubernetes Submit Queue	8e13ee01d6	Merge pull request #41908 from chuckbutler/remove-ivan-from-juju Automatic merge from submit-queue Remove ivan4th from reviewers What this PR does / why we need it: Per @ivan4th request in #41351 he would like to be removed from the reviewers list in this directory tree. This commit addresses that request. Special notes for your reviewer: As Ivan has already investigated the PR in question under 41351 I would like to see that driven to landing before landing this OWNERS file change, unless another reviewer would like to step in and help land that open PR. Release note: ```release-note NONE ```	2017-02-23 22:10:48 -08:00
Kubernetes Submit Queue	84b74074a4	Merge pull request #41674 from ixdy/etcd-empty-dir-cleanup-busybox Automatic merge from submit-queue Base etcd-empty-dir-cleanup on busybox, run as nobody, and update to etcdctl 3.0.14 What this PR does / why we need it: since the `etcd-empty-dir-cleanup` image just uses a simple shell script and `etcdctl`, we can base it on busybox, which is a smaller target than alpine. I've also updated this to use an `etcdctl` from etcd 3.0.14, which matches the version of etcd we're running in 1.6 clusters (I believe), and changed the tag to match the `etcdctl` version. Tested in my own e2e cluster, where it seems to work. I haven't pushed the image yet, so e2e tests may fail. Tagging `do-not-merge`; if you think this looks good, I'll push the image and retest. Release note: ```release-note ``` cc @timstclair @mml @wojtek-t	2017-02-23 21:25:56 -08:00
Kubernetes Submit Queue	e70d23db2a	Merge pull request #41667 from mikedanese/certs Automatic merge from submit-queue (batch tested with PRs 41667, 41820, 40910, 41645, 41361) refactor certs in GCE to break up usages TODO: debian	2017-02-23 20:57:27 -08:00
Kubernetes Submit Queue	b799bbf0a8	Merge pull request #38816 from deads2k/rbac-23-switch-kubedns-sa Automatic merge from submit-queue move kube-dns to a separate service account Switches the kubedns addon to run as a separate service account so that we can subdivide RBAC permission for it. The RBAC permissions will need a little more refinement which I'm expecting to find in https://github.com/kubernetes/kubernetes/pull/38626 . @cjcullen @kubernetes/sig-auth since this is directly related to enabling RBAC with subdivided permissions @thockin @kubernetes/sig-network since this directly affects how kubedns is added. ```release-note `kube-dns` now runs using a separate `system:serviceaccount:kube-system:kube-dns` service account which is automatically bound to the correct RBAC permissions. ```	2017-02-23 12:06:13 -08:00
Mike Danese	192392bddd	refactor certs in GCE	2017-02-23 10:12:31 -08:00
Kubernetes Submit Queue	bb5fdff58b	Merge pull request #41567 from Crassirostris/fluentd-gcp-monitoring Automatic merge from submit-queue (batch tested with PRs 39855, 41433, 41567, 41887, 41652) Add fluentd monitoring to fluentd-gcp image Right now we are not able to monitor the state of fluentd in cluster, which may result in logging subsystem quietly failing. This PR tries to address that problem by introducing the fluentd container monitoring: * fluentd internal metrics, like number of buffers and number of data in buffers * `logging_line_count`, number of lines, read by fluentd from application containers' logs * Has `tag` label, corresponding to the fluentd tag of the entry * `logging_entry_count`, number of entries, emitted to the output plugin * With label `component` set to `container`, generated by application containers * With label `component` set to `system`, generated by system components like kubelet, docker, scheduler, etc. * Has `tag` label, corresponding to the fluentd tag of the entry CC @fabxc @igorpeshansky @edsiper	2017-02-23 09:36:33 -08:00
Wojciech Tyczynski	b70e392161	Update clusters to use 3.0.17 etcd	2017-02-23 10:08:50 +01:00
Wojciech Tyczynski	a7d2136ce1	Update etcd to 3.0.17 in integration tests	2017-02-23 10:08:50 +01:00
Kubernetes Submit Queue	a91cf1ed94	Merge pull request #41771 from cblecker/go-1.7.5 Automatic merge from submit-queue (batch tested with PRs 41812, 41665, 40007, 41281, 41771) Bump golang versions to 1.7.5 What this PR does / why we need it: While #41636 might not make it in until 1.7, this would bump current golang versions from 1.7.4 to 1.7.5 to integrate the fixes from that patch version. This would include, among other things, a fix to ensure cross-built binaries for darwin don't have certificate validation errors (golang/go#18688) Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): none Special notes for your reviewer: Release note: ```release-note Upgrade golang versions to 1.7.5 ```	2017-02-23 00:11:41 -08:00
Kubernetes Submit Queue	8fc311c96c	Merge pull request #41807 from shyamjvs/remove-fart-metrics Automatic merge from submit-queue (batch tested with PRs 41797, 41793, 41795, 41807, 41781) Remove unnecessary metrics (http/process/go) from being exposed by etcd-version-monitor Unregister metrics we do not want from the etcd version metrics handler. cc @wojtek-t @piosz	2017-02-22 22:06:35 -08:00
Kubernetes Submit Queue	e64835683b	Merge pull request #41795 from Crassirostris/fluentd-gcp-turn-supervisor-off Automatic merge from submit-queue (batch tested with PRs 41797, 41793, 41795, 41807, 41781) Turn fluentd supervisor off for fluentd-gcp By default, turn fluentd supervisor off so that when fluentd process fails, for example due to OOM, container fails completely and it would be easy to detect. CC @igorpeshansky @qingling128	2017-02-22 22:06:33 -08:00
Kubernetes Submit Queue	59f4c5911a	Merge pull request #41819 from dchen1107/master Automatic merge from submit-queue (batch tested with PRs 38957, 41819, 41851, 40667, 41373) Bump GCI to gci-stable-56-9000-84-2 Changelogs since gci-beta-56-9000-80-0: - Fixed google-accounts-daemon breaks on GCI when network is unavailable. - Fixed iptables-restore performance regression. cc/ @adityakali @Random-Liu @fabioy	2017-02-22 19:59:33 -08:00
Jeff Grafton	eeec939361	Don't fail if the grep fails to match any resources	2017-02-22 14:55:57 -08:00
Jeff Grafton	511bdc11ae	Bump etcd-empty-dir-cleanup to 3.0.14.0	2017-02-22 13:22:04 -08:00
Jeff Grafton	1f3ba7f484	Base etcd-empty-dir-cleanup on busybox, run as nobody, and update to etcdctl 3.0.14	2017-02-22 13:22:03 -08:00
Charles Butler	3c5009d00a	Remove ivan4th from reviewers Per ivans request in #41351 he would like to be removed from the reviewers list in this directory tree. This commit addresses that request.	2017-02-22 12:06:00 -06:00
Kubernetes Submit Queue	44aa1679c9	Merge pull request #41657 from bowei/update-dns Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657) Update dns ```release-note NONE ```	2017-02-22 08:12:48 -08:00
Kubernetes Submit Queue	fe34705f8a	Merge pull request #41587 from MrHohn/addon-manager-fix-hpa Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657) Update kubectl in addon-manager to use HPA in autoscaling/v1 Addon-manager is broken since HPA objects were removed from extensions api group. Came across the logs from [the latest addon-manager on Jenkins](https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce/4290/artifacts/bootstrap-e2e-master/kube-addon-manager.log): ``` INFO: == Entering periodical apply loop at 2017-02-16T17:33:37+0000 == error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:38+0000. 2 tries remaining. == error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:46+0000. 1 tries remaining. == error: error pruning namespaced object extensions/v1beta1, Kind=HorizontalPodAutoscaler: the server could not find the requested resource WRN: == Failed to execute /usr/local/bin/kubectl apply --namespace=kube-system -f /etc/kubernetes/addons --prune=true -l kubernetes.io/cluster-service=true --recursive >/dev/null at 2017-02-16T17:33:53+0000. 0 tries remaining. == WRN: == Kubernetes addon update completed with errors at 2017-02-16T17:33:58+0000 == ``` And notice this commit (`f66679a4e9`) came in two weeks ago, which removed HorizontalPodAutoscaler from extensions/v1beta1. 
Addon-manager is now only partially functional: it can successfully create and update addons, but fails to prune objects, which means upgrade tests may mostly fail. Pushed another version of addon-manager with kubectl v1.6.0-alpha.2 ([released 2 days ago](https://github.com/kubernetes/kubernetes/releases/tag/v1.6.0-alpha.2)) to fix this, including the images below: - gcr.io/google-containers/kube-addon-manager:v6.4-alpha.2 - gcr.io/google-containers/kube-addon-manager-amd64:v6.4-alpha.2 - gcr.io/google-containers/kube-addon-manager-arm:v6.4-alpha.2 - gcr.io/google-containers/kube-addon-manager-arm64:v6.4-alpha.2 - gcr.io/google-containers/kube-addon-manager-ppc64le:v6.4-alpha.2 - gcr.io/google-containers/kube-addon-manager-s390x:v6.4-alpha.2 @mikedanese cc @wojtek-t @shyamjvs	2017-02-22 08:12:46 -08:00
Kubernetes Submit Queue	b29bdee735	Merge pull request #41256 from mbruzek/mbruzek-juju-lint-fixes Automatic merge from submit-queue (batch tested with PRs 41349, 41532, 41256, 41587, 41657) Lint fixes for the master and worker Python code. What this PR does / why we need it: lint fixes for the python code. Which issue this PR fixes none Special notes for your reviewer: This is lint fixes for the Juju python code. Release note: ```release-note NONE ``` Please consider these changes so we can pass flake8 lint tests in our build process.	2017-02-22 08:12:43 -08:00
Shyam Jeedigunta	d5a28b3618	Remove unnecessary metrics (http/process/go) from being exposed by etcd-version-monitor	2017-02-22 13:11:00 +01:00
Christoph Blecker	c3de31c8d0	Bump golang versions to 1.7.5	2017-02-21 13:02:16 -08:00
Madhusudan.C.S	2cb2200847	Move kube-dns ConfigMap creation/deletion out of federated services e2e tests to federation-up.sh/federation-down.sh where the clusters are joined/unjoined.	2017-02-21 10:27:31 -08:00
Shyam JVS	746cc5d284	Merge pull request #41800 from shyamjvs/fix-hollow-node-logging Whitelist kubemark in node_ssh_supported_providers for log dump	2017-02-21 19:13:08 +01:00
Dawn Chen	3d510461a3	Bump GCI to gci-stable-56-9000-84-2	2017-02-21 10:03:14 -08:00
Kubernetes Submit Queue	409d7d0a91	Merge pull request #41326 from ncdc/ci-cache-mutation Automatic merge from submit-queue (batch tested with PRs 41364, 40317, 41326, 41783, 41782) Add ability to enable cache mutation detector in GCE Add the ability to enable the cache mutation detector in GCE. The current default behavior (disabled) is retained. When paired with https://github.com/kubernetes/test-infra/pull/1901, we'll be able to detect shared informer cache mutations in gce e2e PR jobs.	2017-02-21 07:45:42 -08:00
Shyam Jeedigunta	3bc6bf6b70	Whitelist kubemark in node_ssh_supported_providers for log dump	2017-02-21 14:02:17 +01:00
Mik Vyatskov	5d59d4d27b	Turn fluentd supervisor off for fluentd-gcp	2017-02-21 13:50:47 +01:00
Kubernetes Submit Queue	70c9eebd21	Merge pull request #41739 from shyamjvs/hollow-node-logs Automatic merge from submit-queue (batch tested with PRs 41706, 39063, 41330, 41739, 41576) [Kubemark] Add option to log hollow-node logs Ref https://github.com/kubernetes/kubernetes/issues/41613 Added an option to log kubemark hollow-node logs which includes kubelet, kubeproxy and npd logs for each hollow-node. Setting the env var `ENABLE_HOLLOW_NODE_LOGS=true` should now enable logging for tests. cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek @yujuhong @Random-Liu	2017-02-21 02:24:43 -08:00
Zihong Zheng	2c8e89820a	Update kubectl in addon-manager to use HPA in autoscaling/v1 instead of extensions/v1beta1	2017-02-20 10:49:10 -08:00
deads2k	36b586d5d7	move kube-dns to a separate service account	2017-02-20 07:35:08 -05:00
Shyam Jeedigunta	ed0ab3cd8e	[Kubemark] Add option to log hollow-node logs	2017-02-20 11:52:49 +01:00
Kubernetes Submit Queue	ff12e5688c	Merge pull request #40206 from Random-Liu/add-standalone-npd Automatic merge from submit-queue Add standalone npd on GCI. This PR added standalone NPD in GCE GCI cluster. I already verified the PR, and it should work. /cc @dchen1107 @fabioy @andyxning @kubernetes/sig-node-misc	2017-02-18 02:00:20 -08:00
Kubernetes Submit Queue	4b3a097ecd	Merge pull request #41525 from yujuhong/fix_output Automatic merge from submit-queue Fix the output of health-monitor.sh The script should print the errors/response of the health check, but not show the progress of `curl`.	2017-02-17 16:57:29 -08:00
Random-Liu	d40c0a7099	Add standalone npd on GCI.	2017-02-17 16:18:08 -08:00
Bowei Du	9f75db3c69	Update kube-dns image versions to the latest stable release	2017-02-17 11:12:25 -08:00
Kubernetes Submit Queue	6d5b2ef49e	Merge pull request #41080 from shyamjvs/etcd-version-monitor Automatic merge from submit-queue Added a basic monitor for providing etcd version related info Fixes #41071 This tool scrapes metrics partly from etcd's /version and /metrics endpoints and partly using etcdctl and exposes them as prometheus metrics at `http://localhost:9101/metrics` endpoint on the master. Here is a summary of the metrics it exposes (self-explanatory from the code): - etcdVersionFetchCount = prometheus.NewCounterVec( prometheus.CounterOpts{ Namespace: "etcd", Name: "version_info_fetch_count", Help: "Number of times etcd's version info was fetched, labeled by etcd's server binary and cluster version", }, []string{"serverversion", "clusterversion"}) - etcdGRPCRequestsTotal = prometheus.NewCounterVec( prometheus.CounterOpts{ Namespace: namespace, Name: "grpc_requests_total", Help: "Counter of received grpc requests, labeled by grpc method and grpc service names", }, []string{"grpc_method", "grpc_service"}) For further info on how to run this as a binary/docker-container/kubernetes-pod and checking the metrics, have a look at the README.md file. cc @fgrzadkowski @wojtek-t @piosz	2017-02-17 10:18:48 -08:00
Kubernetes Submit Queue	46cd8ec91b	Merge pull request #41637 from wojtek-t/expose_storage_format_as_env Automatic merge from submit-queue Expose storage media type as env variable Ref #40636 @mml	2017-02-17 08:15:27 -08:00
Andy Goldstein	688c19ec71	Allow cache mutation detector enablement by PRs Allow cache mutation detector enablement by PRs in an attempt to find mutations before they're merged in to the code base. It's just for the apiserver and controller-manager for now. If/when the other components start using a SharedInformerFactory, we should set them up just like this as well.	2017-02-17 10:03:13 -05:00
Kubernetes Submit Queue	3b14667afe	Merge pull request #41604 from shyamjvs/kubemark-num-nodes Automatic merge from submit-queue Reduce default value of kubemark's NUM_NODES to 10 Changing the default value of kubemark's NUM_NODES from 100 to 10, as it would then be possible to start kubemark on gce clusters that have been started using kube-up that uses the default config of three n1-standard-2 nodes. I've already been asked by a couple of people about why kubemark is not starting on their cluster because of this. More people shouldn't be facing this issue in future. cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek	2017-02-17 06:49:21 -08:00
Wojciech Tyczynski	3695e85b34	Expose storage media type as env variable	2017-02-17 14:16:55 +01:00
Shyam Jeedigunta	7e6b8ac26b	Added a basic monitor for watching etcd version and size related info	2017-02-17 12:52:54 +01:00
Shyam Jeedigunta	94d2ed5e34	Reduce default value of kubemark's NUM_NODES to 10	2017-02-16 23:35:39 +01:00
Matt Bruzek	3b29b6a9ef	Lint fixes for the master and worker Python code.	2017-02-16 14:01:30 -06:00
Mik Vyatskov	8d2d91070a	Add fluentd monitoring to fluentd-gcp image	2017-02-16 19:04:32 +01:00
Kubernetes Submit Queue	30e8953fad	Merge pull request #41564 from Crassirostris/fluentd-gcp-plugin-version-bump Automatic merge from submit-queue Bump fluentd-gcp google_cloud plugin version Bump the version of `fluent-plugin-google-cloud` in fluentd-gcp image, because it's broken for version `0.5.2`. Recently, gem `google-api-client` was updated to version `0.10.0`. The new version broke `fluent-plugin-google-cloud` which doesn't specify the upper version of `google-api-client` gem. I'm bumping the version used in our image to allow future changes in this release to be run and tested. This PR doesn't bump the version, since no effective changes have happened, leaving this for the next PR to do. CC @igorpeshansky	2017-02-16 09:20:12 -08:00
Mik Vyatskov	e8de31623f	Bump fluentd-gcp google_cloud plugin version	2017-02-16 16:49:16 +01:00
Kubernetes Submit Queue	627c6ce2b8	Merge pull request #41489 from Crassirostris/fluentd-add-toleration Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489) Add toleration to fluentd daemonset to make it run on master Because of https://github.com/kubernetes/kubernetes/pull/41172 fluentd pods stopped being allocated on master node. This PR introduces toleration for master taint for fluentd. CC @davidopp @janetkuo @kubernetes/sig-scheduling-bugs Unfortunately, we don't have e2e tests to ensure that master logs are being ingested. This problem is a great signal to work on https://github.com/kubernetes/kubernetes/issues/41411	2017-02-16 01:52:08 -08:00
Kubernetes Submit Queue	5ff9a72ea0	Merge pull request #41508 from Crassirostris/fluentd-dns-problem-fix Automatic merge from submit-queue (batch tested with PRs 40000, 41508, 41489) Make fluentd use default dns instead of cluster dns to make it work o… Fix https://github.com/kubernetes/kubernetes/issues/41415 Fluentd for Stackdriver requires external urls (e.g. `logging.googleapis.com`) to be available in order to work. If fluentd runs on master, it cannot access the service endpoint of cluster DNS. This change makes fluentd use default dns to fix this problem. CC @thockin @bowei	2017-02-16 01:52:06 -08:00

5283 Commits