Commit Graph

1111 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
f44608171a
Merge pull request #55715 from shyamjvs/fix-prom-to-sd-sidecar-in-metadata-proxy
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix prometheus-to-sd sidecar in metadata proxy

Ref https://github.com/kubernetes/kubernetes/issues/55695#issuecomment-344300188

This makes 2 changes:
- restores the resource requests and limits of the metadata-proxy container as they were before, and removes them from the prom-to-sd sidecar (best effort), as is done everywhere else
- passes pod name and namespace args to the prom-to-sd sidecar (because I just noticed they were missing)

/cc @ihmccreery @loburm @crassirostris - Does this make sense?
2017-11-14 19:28:54 -08:00
Mike Danese
962e1e2f6d gce: readd kubelet-bootstrap to kubelet user 2017-11-14 13:46:08 -08:00
Kubernetes Submit Queue
95b4312899
Merge pull request #55466 from x13n/addon-manager
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Use results of kube-controller-manager leader election in addon manager

**What this PR does / why we need it**:
This adds a leader-election-like mechanism to addon manager. Currently, in a multi-master setup, upgrading one master triggers a fight between the addon managers on different masters, each forcing its own versions of addons. This leads to pod unavailability until all masters are upgraded to the new version.

To avoid implementing leader election in bash, the results of leader election in kube-controller-manager are used. Long term, addon manager should probably be rewritten in a real programming language (probably Go), and real leader election should then be implemented there.
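A minimal sketch of the idea, not the PR's actual bash implementation: decide whether this master should sync addons by checking who currently holds the kube-controller-manager lock. It assumes the lock is recorded as JSON in the `control-plane.alpha.kubernetes.io/leader` annotation of an Endpoints object and that `holderIdentity` equals the master's hostname; the function names are hypothetical.

```python
import json
from typing import Optional

# Assumed annotation key for an Endpoints-based leader election lock.
LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def leader_identity(endpoints: dict) -> Optional[str]:
    """Return the holderIdentity stored in the leader annotation, or None."""
    metadata = endpoints.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    record = annotations.get(LEADER_ANNOTATION)
    if not record:
        return None
    try:
        data = json.loads(record)
    except json.JSONDecodeError:
        return None
    return data.get("holderIdentity") if isinstance(data, dict) else None


def should_reconcile_addons(endpoints: dict, hostname: str) -> bool:
    """Sync addons only on the master that holds the controller-manager lock."""
    # Assumption: the recorded identity is exactly this master's hostname.
    return leader_identity(endpoints) == hostname
```

With this check, non-leader addon managers skip their sync loop, so only one master's addon versions are applied at a time.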

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
I don't think there was an issue for this specifically, but this PR is related to https://github.com/kubernetes/kubernetes/issues/473

**Special notes for your reviewer**:

**Release note**:
```release-note
Addon manager supports HA masters.
```
2017-11-14 11:26:31 -08:00
Shyam Jeedigunta
6e50b1f90b Pass pod name and namespace args to prom-to-sd sidecar of metadata-proxy 2017-11-14 16:52:55 +01:00
Shyam Jeedigunta
13c235d31c Fix resource requests & limits of metadata-proxy 2017-11-14 16:51:15 +01:00
Kubernetes Submit Queue
b2125f5aa8
Merge pull request #55509 from tallclair/psp-addons
Automatic merge from submit-queue (batch tested with PRs 54602, 54877, 55243, 55509, 55128). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

PodSecurityPolicies for addons

**What this PR does / why we need it**:

1. Colocate addon PodSecurityPolicy config with the addons (in a `podsecuritypolicies` subdirectory). 
2. Add policies for addons that are currently missing policies (not in the default GCE suite)
3. Remove HostPath SSL certs from several heapster deployments, so that heapster doesn't require a special PSP

**Which issue(s) this PR fixes**:
#43538

**Release note**:
```release-note
- Add PodSecurityPolicies for cluster addons
- Remove SSL cert HostPath volumes from heapster addons
```
2017-11-14 03:03:30 -08:00
Daniel Kłobuszewski
ae6e506fdc
Merge branch 'master' into addon-manager 2017-11-14 09:36:20 +01:00
Kubernetes Submit Queue
4f91113075
Merge pull request #54826 from mindprince/addon-manager
Automatic merge from submit-queue (batch tested with PRs 54826, 53576, 55591, 54946, 54825). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached

- Instead of the old `Accelerators` feature that added the `alpha.kubernetes.io/nvidia-gpu` resource, use the new `DevicePlugins` feature that adds vendor-specific resources. (For nvidia GPUs it adds the `nvidia.com/gpu` resource.)

- Add a node label to GCE nodes with accelerators attached. This node label is the same as the one GKE attaches to node pools with accelerators attached. (For example, for an nvidia-tesla-p100 GPU, the label would be `cloud.google.com/gke-accelerator=nvidia-tesla-p100`.) This will help us target accelerator-specific daemonsets etc. to these nodes.

- Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached.

- Some minor documentation improvements in addon manager.
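As a rough illustration of how a workload might consume the new resource and node label, the sketch below builds a Pod manifest as a plain dictionary. The resource name and label key come from the description above; the helper name, pod name, and image are hypothetical.

```python
def gpu_pod(name: str, image: str, gpus: int = 1,
            accelerator: str = "nvidia-tesla-p100") -> dict:
    """Build a Pod manifest that requests GPUs via the device-plugin resource."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Schedule only onto nodes labelled with the matching accelerator.
            "nodeSelector": {"cloud.google.com/gke-accelerator": accelerator},
            "containers": [{
                "name": name,
                "image": image,
                # Vendor-specific resource exposed by the device plugin.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }


# Hypothetical usage: the image name is a placeholder.
manifest = gpu_pod("cuda-job", "example.invalid/cuda-job:latest", gpus=2)
```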

**Release note**:
```release-note
GCE nodes with NVIDIA GPUs attached now expose `nvidia.com/gpu` as a resource instead of `alpha.kubernetes.io/nvidia-gpu`.
```

/sig cluster-lifecycle
/sig scheduling
/area hw-accelerators

https://github.com/kubernetes/features/issues/368
2017-11-13 14:46:55 -08:00
Daniel Kłobuszewski
5e4692f784 Use results of kube-controller-manager leader election in addon manager 2017-11-13 14:54:37 +01:00
Kubernetes Submit Queue
f5c29f51fa
Merge pull request #55506 from Random-Liu/fix-cri-fluentd
Automatic merge from submit-queue (batch tested with PRs 54460, 55258, 54858, 55506, 55510). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Fix CRI fluentd config.

This should fix the cri-containerd stackdriver test failure:
```
Cluster level logging implemented by Stackdriver should ingest logs
```

I previously copied the pattern from a comment, but it doesn't actually work properly. `\b` only matches a word boundary, and in our case it seems to match the boundary of the previous word.

That's why we get the log with a leading space:
```
Nov 10 18:39:11.661: INFO: Unexpected error occurred: log entry ingested incorrectly, got --> <--I0101 00:00:00.000000       1 main.go:1] Text, want Text
```
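The effect can be reproduced outside fluentd. The sketch below uses Python's `re` module with a simplified, hypothetical log line and patterns (not the exact ones from the fluentd config) to show why a zero-width `\b` leaves the separating space inside the captured message, while matching a literal space does not.

```python
import re

# Hypothetical CRI-style log line: "<time> <stream> <message>".
LINE = "2017-11-10T18:39:11.661Z stdout Text"

# \b is zero-width: it asserts a word boundary but consumes no characters,
# so the separating spaces end up inside the neighbouring groups.
WITH_BOUNDARY = re.compile(
    r"^(?P<time>.+)\b(?P<stream>stdout|stderr)\b(?P<log>.*)$")

# Matching the literal separator keeps it out of the captured groups.
WITH_SPACE = re.compile(
    r"^(?P<time>.+) (?P<stream>stdout|stderr) (?P<log>.*)$")

bad = WITH_BOUNDARY.match(LINE)
good = WITH_SPACE.match(LINE)

print(repr(bad.group("log")))   # ' Text' (leading space kept)
print(repr(good.group("log")))  # 'Text'
```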

@kubernetes/sig-node-bugs @kubernetes/sig-instrumentation-bugs 

Signed-off-by: Lantao Liu <lantaol@google.com>

```release-note
none
```
2017-11-11 10:45:27 -08:00
Kubernetes Submit Queue
dad41f8526
Merge pull request #54215 from mrahbar/elasticsearch_logging_discovery
Automatic merge from submit-queue (batch tested with PRs 54987, 55221, 54099, 55144, 54215). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

extracted elasticsearch-logging service name as environment variable

**What this PR does / why we need it**:
Deploying the fluentd-elasticsearch cluster addon with customized resource definitions can cause Elasticsearch discovery to fail, because the service name `elasticsearch-logging` is hard-coded in cluster/addons/fluentd-elasticsearch/es-image/elasticsearch_logging_discovery.go.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
-> none yet

**Special notes for your reviewer**:
The name of the environment variable is ELASTICSEARCH_SERVICE_NAME. When none is given, the service name falls back to `elasticsearch-logging`.
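The fallback behaviour can be sketched as follows. The actual change is in the Go discovery program; this Python version only illustrates the lookup, and treating an empty value the same as an unset one is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_SERVICE_NAME = "elasticsearch-logging"


def elasticsearch_service_name(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the service name to use for Elasticsearch discovery."""
    env = os.environ if environ is None else environ
    # Assumption: an empty value is treated like an unset variable.
    return env.get("ELASTICSEARCH_SERVICE_NAME") or DEFAULT_SERVICE_NAME
```

Passing a mapping explicitly makes the lookup easy to exercise without touching the process environment.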

```release-note
[fluentd-elasticsearch addon] Elasticsearch service name can be overridden via env variable ELASTICSEARCH_SERVICE_NAME
```
2017-11-10 14:51:33 -08:00
Tim Allclair
2f0b930466
Remove SSL cert volumes from heapster addons 2017-11-10 13:57:35 -08:00
Tim Allclair
cd720c4759
Add optional addon PSPs 2017-11-10 13:57:33 -08:00
Tim Allclair
a1513161b3
Reorganize addon PodSecurityPolicies 2017-11-10 13:57:32 -08:00
Lantao Liu
53d7494b9e Fix CRI fluentd config.
Signed-off-by: Lantao Liu <lantaol@google.com>
2017-11-10 20:55:56 +00:00
mrahbar
4ecd54f47f extracted elasticsearch-logging service name as environment variable ELASTICSEARCH_SERVICE_NAME with fallback on default 2017-11-10 14:14:22 +01:00
Dr. Stefan Schimanski
bec617f3cc Update generated files 2017-11-09 12:14:08 +01:00
Dr. Stefan Schimanski
012b085ac8 pkg/apis/core: mechanical import fixes in dependencies 2017-11-09 12:14:08 +01:00
Kubernetes Submit Queue
8eb0b39afe
Merge pull request #53144 from mikedanese/kubelet-revoke
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

gce: revoke kubelet binding permissions

and move the binding addon to EnsureExists, so new clusters will pick up
the new binding and old clusters will keep the old binding. The binding
is no longer required now that we are migrating to node authorizer.

fixes https://github.com/kubernetes/kubernetes/issues/53151
2017-11-07 04:13:38 -08:00
Kubernetes Submit Queue
6a7b3892f7
Merge pull request #54852 from kawych/ms_config
Automatic merge from submit-queue (batch tested with PRs 53866, 54852, 55178, 55185, 55130). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Adjust resources for Metrics Server

**What this PR does / why we need it**:
This PR adjusts the resources set for Metrics Server by Pod Nanny to reduce resource usage by core Kubernetes components when Metrics Server is enabled. In Kubernetes 1.8, Metrics Server is used only by HPAv2; other use cases are covered by Heapster.

**Release note**:
```release-note
NONE
```
2017-11-06 22:20:24 -08:00
Kubernetes Submit Queue
f35c4a2b5f
Merge pull request #55015 from fasaxc/calico-disable-grace
Automatic merge from submit-queue (batch tested with PRs 53645, 54734, 54586, 55015, 54688). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Disable the grace termination period for the calico/node pod

**What this PR does / why we need it**:

Disable the termination grace period for the calico/node add-on DaemonSet.  The grace period is unnecessary for calico/node and it delays restart of a new calico/node pod to take over routing and policy updates.

Setting the grace period to 0 has the special meaning of doing a force deletion, which avoids a slow round-trip through the kubelet and API server.
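As an illustration of the change described above, the sketch below sets the grace period on a DaemonSet manifest held as a dictionary. `terminationGracePeriodSeconds` is the standard pod spec field; the helper name and the sample manifest are hypothetical.

```python
import copy


def disable_termination_grace_period(daemonset: dict) -> dict:
    """Return a copy of the DaemonSet whose pods are force-deleted on removal."""
    patched = copy.deepcopy(daemonset)
    pod_spec = (patched.setdefault("spec", {})
                       .setdefault("template", {})
                       .setdefault("spec", {}))
    # 0 requests immediate (forced) deletion instead of a graceful shutdown.
    pod_spec["terminationGracePeriodSeconds"] = 0
    return patched


# Hypothetical minimal manifest standing in for the calico/node DaemonSet.
original = {"kind": "DaemonSet", "spec": {"template": {"spec": {"containers": []}}}}
patched = disable_termination_grace_period(original)
```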

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

Fixes #55013

**Special notes for your reviewer**:

**Release note**:

```release-note
Disable the termination grace period for the calico/node add-on DaemonSet to reduce downtime during a rolling upgrade or deletion.
```
2017-11-06 15:33:47 -08:00
Isaac Hollander McCreery
be8aaf9ff8 Add prometheus-to-sd-exporter to metadata-proxy addon; bump proxy to v0.1.4 and e2e to v0.0.2; remove configmap 2017-11-03 10:23:05 -07:00
Kubernetes Submit Queue
63c409727c
Merge pull request #54996 from mwielgus/metadata-proxy
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Reduce metadata-proxy cpu requests to 30m

After the recent change enabling metadata-proxy in tests (https://github.com/kubernetes/kubernetes/pull/54150) we started seeing problems with scheduling cluster autoscaler on master. Metadata-proxy eats all of the available space leaving nothing for CA to run on. 

This PR reduces the cpu requests for metadata-proxy allowing other components to fit in.

cc: @kubernetes/sig-autoscaling-bugs
2017-11-02 18:08:10 -07:00
Rohit Agarwal
cf292754ba Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached. 2017-11-02 12:58:29 -07:00
Rohit Agarwal
3de7e5ab40 Remove redundant comment and improve documentation.
The comment is also present in lines 143-145 where it makes more sense.
2017-11-02 12:58:29 -07:00
Shaun Crampton
0cddb6b097 Disable the grace termination period for the calico/node pod
The grace period is unnecessary for calico/node and it delays restart of
a new calico/node pod to take over routing and policy updates.

Setting the grace period to 0 has the special meaning of doing a force deletion,
which avoids a slow round-trip through the kubelet and API server.

Fixes #55013
2017-11-02 17:31:35 +00:00
Marcin Wielgus
3c615b4b4d Reduce metadata-proxy cpu requests to 30m 2017-11-02 14:52:30 +01:00
Tim Allclair
368afc6217
Add GCP addon PodSecurityPolicies & Bindings 2017-11-01 14:03:05 -07:00
Karol Wychowaniec
5f5110c650 Adjust resources for Metrics Server 2017-10-31 10:42:00 +01:00
Lantao Liu
70a0cdfa8e Add CRI log format support in fluentd. 2017-10-30 06:25:52 +00:00
Kubernetes Submit Queue
1bc5f7cfa3
Merge pull request #54346 from zouyee/rbac
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

update rbac apiversion

**What this PR does / why we need it**:
update rbac apiversion to v1
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-10-28 22:02:35 -07:00
Kubernetes Submit Queue
949ec719c3
Merge pull request #54635 from loburm/prom-to-sd
Automatic merge from submit-queue (batch tested with PRs 54635, 54250, 54657, 54696, 54700). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bump version of prometheus-to-sd to 0.2.2.

Bump version of prometheus-to-sd to improve logging, add pod_name and
pod_namespace flags and remove deprecated flags.

Fixes #54583 

```release-note
NONE
```
2017-10-27 14:38:21 -07:00
Kubernetes Submit Queue
fc8bfe2d89 Merge pull request #54395 from crassirostris/fluentd-gcp-rollback-host-networking
Automatic merge from submit-queue (batch tested with PRs 50776, 54395). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Move fluentd-gcp out of host network

Since metadata proxy doesn't filter service account after all, make fluentd-gcp addon run in its own network

This will mitigate the problem with port collision

```release-note
[fluentd-gcp addon] Fluentd now runs in its own network, not in the host one.
```
2017-10-27 03:09:25 -07:00
Kubernetes Submit Queue
d945927077 Merge pull request #53545 from heschlie/calico-update
Automatic merge from submit-queue (batch tested with PRs 54419, 53545). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Updating Calico to v2.6.1

**What this PR does / why we need it**:

Updating Calico to the most recent release v2.6.1.

[Release page](https://docs.projectcalico.org/v2.6/releases/) and [blog post](https://www.projectcalico.org/project-calico-2-6-released/)

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
```
2017-10-27 00:14:22 -07:00
zouyee
ea93a04073 update rbac apiversion 2017-10-27 10:39:55 +08:00
Mike Danese
3f7e1cccd2 don't add kubelet legacy binding if we aren't registering the master kubelet 2017-10-26 13:30:59 -07:00
Mike Danese
8b3a8adb17 reorganize rbac addon dir into subdirectories 2017-10-26 13:26:52 -07:00
Marian Lobur
5b62eb29d2 Bump version of prometheus-to-sd to 0.2.2.
Bump version of prometheus-to-sd to improve logging, add pod_name and
pod_namespace flags and remove deprecated flags.
2017-10-26 15:54:54 +02:00
Kubernetes Submit Queue
7cadcd0558 Merge pull request #53993 from JonPulsifer/typha-rbac
Automatic merge from submit-queue (batch tested with PRs 53946, 53993, 54315, 54143, 54532). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

RBAC for Calico Typha Horizontal Autoscaler

**What this PR does / why we need it**:

On v1.8.0-gke.1 I noticed a number of RBAC failures for `default` in kube-system. Turns out the only container missing the serviceAccountName was the typha-horizontal-autoscaler.

**Special notes for your reviewer**:

cc @caseydavenport seems like this is up your alley 

**Release note**:

```release-note
NONE
```
2017-10-25 21:20:29 -07:00
Tim Allclair
b18edfec7a
Update fluentd-gcp DaemonSet
- Use a dedicated service account to run the fluentd-gcp DS
- Update prometheus-to-sd from v0.1.3 to v0.2.1
- Use the certificates in the prometheus-to-sd image rather than mounting the host certs
2017-10-25 13:11:35 -07:00
Kubernetes Submit Queue
ef100b12f6 Merge pull request #52003 from vfreex/mount-lib-modules
Automatic merge from submit-queue (batch tested with PRs 52003, 54559, 54518). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Load kernel modules automatically inside a kube-proxy pod

**What this PR does / why we need it**:
This change mounts `/lib/modules` from the host into the kube-proxy pod,
so that a kube-proxy pod can load kernel modules on demand
or when `modprobe <kmod>` is run inside the pod.

This will be convenient for kube-proxy running in IPVS mode.
Users won't have to run `modprobe ip_vs` on nodes before starting
a kube-proxy pod.
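The mount described above can be sketched as a transformation on a pod spec held as a dictionary. This is an illustration only: the volume name, the read-only flag, and the helper name are assumptions, not taken from the PR.

```python
import copy


def add_lib_modules_mount(pod_spec: dict, container_name: str = "kube-proxy") -> dict:
    """Return a copy of pod_spec with the host's /lib/modules mounted."""
    patched = copy.deepcopy(pod_spec)
    # hostPath volume exposing the node's kernel modules (name is hypothetical).
    patched.setdefault("volumes", []).append(
        {"name": "lib-modules", "hostPath": {"path": "/lib/modules"}})
    for container in patched.get("containers", []):
        if container.get("name") == container_name:
            container.setdefault("volumeMounts", []).append({
                "name": "lib-modules",
                "mountPath": "/lib/modules",
                # Assumption: modprobe only needs to read the module files.
                "readOnly": True,
            })
    return patched


spec = add_lib_modules_mount({"containers": [{"name": "kube-proxy"}]})
```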

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
The kube-proxy IPVS proxier checks whether the kernel supports IPVS; if it does not, it falls back to iptables or userspace mode. There is a false-negative condition in that check, which #51874 addresses.

**Release note**:

```release-note
Load kernel modules automatically inside a kube-proxy pod
```
2017-10-25 11:38:36 -07:00
Kubernetes Submit Queue
3e694c38e0 Merge pull request #54357 from zouyee/storage-class-1
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[addon/storage-class] update storageclass groupversion in storage-class

**What this PR does / why we need it**:
[addon/storage-class] update storageclass groupversion in storage-class
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2017-10-23 23:11:03 -07:00
Bowei Du
c7d6934433 Update kube-dns 1.14.7
```release-notes
* Logging cleanups
* Updates kube-dns to use client-go 3
* Updates containers to use alpine as the base image on all platforms
* Adds support for IPv6
```
2017-10-23 14:37:13 -07:00
Mik Vyatskov
d30af4d8a0 Move fluentd-gcp out of host network 2017-10-23 12:02:54 +02:00
zouyee
e594b2c121 [addon/storage-class] update storageclass groupversion in storage-class 2017-10-22 19:50:47 +08:00
André Martins
3e4b9fad6a addons/dns: changing probes for SRV record type
Signed-off-by: André Martins <aanm90@gmail.com>
2017-10-20 20:07:25 +02:00
Shyam JVS
607c3d6967 Revert "kube-dns-anti-affinity: kube-dns never-co-located-in-the-same-node" 2017-10-18 22:01:42 +02:00
Matt Farina
4327603573
Updated cluster/addons readme to match and point to docs 2017-10-18 10:36:24 -04:00
Kubernetes Submit Queue
ef87482923 Merge pull request #52193 from StevenACoffman/kube-dns-anti-affinity
Automatic merge from submit-queue (batch tested with PRs 53106, 52193, 51250, 52449, 53861). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

kube-dns-anti-affinity: kube-dns never-co-located-in-the-same-node

**What this PR does / why we need it**:

This is upstreaming the kubernetes/kops#2705 pull request by @jamesbucher that was originally against [kops](github.com/kubernetes/kops).
Please see kubernetes/kops#2705 for more details, including a lengthy discussion.

Briefly, given the constraints of how the system works today:

+ if you need multiple DNS pods primarily for availability, then requiredDuringSchedulingIgnoredDuringExecution makes sense because putting more than one DNS pod on the same node isn't useful
+ if you need multiple DNS pods primarily for performance, then preferredDuringSchedulingIgnoredDuringExecution makes sense because it will allow the DNS pods to schedule even if they can't be spread across nodes
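The two options above differ only in which field of `podAntiAffinity` is populated. A minimal sketch, assuming the kube-dns pods carry the label `k8s-app: kube-dns` and that spreading is per node via the `kubernetes.io/hostname` topology key; the helper name and the weight of 100 are illustrative choices.

```python
def kube_dns_anti_affinity(required: bool) -> dict:
    """Build an affinity stanza that keeps kube-dns pods on different nodes."""
    term = {
        "labelSelector": {
            "matchExpressions": [
                {"key": "k8s-app", "operator": "In", "values": ["kube-dns"]},
            ],
        },
        "topologyKey": "kubernetes.io/hostname",
    }
    if required:
        # Hard rule: a pod stays Pending rather than share a node (availability).
        rules = {"requiredDuringSchedulingIgnoredDuringExecution": [term]}
    else:
        # Soft rule: spread when possible, co-locate if necessary (performance).
        rules = {"preferredDuringSchedulingIgnoredDuringExecution": [
            {"weight": 100, "podAffinityTerm": term},
        ]}
    return {"podAntiAffinity": rules}
```

The release note for this PR describes the preferred (soft) form.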

**Which issue this PR fixes**

fixes kubernetes/kops#2693

**Release note**:


```release-note
Improve resilience by annotating kube-dns addon with podAntiAffinity to prefer scheduling on different nodes.
```
2017-10-16 14:47:20 -07:00
Jonathan Pulsifer
24e319c056
RBAC for Calico Typha Horizontal Autoscaler 2017-10-16 13:47:41 -04:00