Commit Graph

615 Commits

Author SHA1 Message Date
Michael Taufen
d4e48fd789 graduate DynamicKubeletConfig feature to beta 2018-05-24 09:59:29 -07:00
Marian Lobur
c1d0004013 Add environment variable to control truncating backend. 2018-05-18 15:52:47 +02:00
Maciej Borsz
128d6d3498 Add a way to pass extra arguments to etcd. 2018-05-17 10:48:13 +02:00
Kubernetes Submit Queue
7b8bb6e7d3
Merge pull request #63357 from Random-Liu/install-and-use-crictl
Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Install and use crictl in gce kube-up.sh

Download and use crictl in gce kube-up.sh.

This PR:
1. Downloads crictl `v1.0.0-beta.0` onto the node, which supports CRI v1alpha2. We'll upgrade it to `v1.0.0-beta.1` soon after the release is cut.
2. Change `kube-docker-monitor` to `kube-container-runtime-monitor`, and let it use `crictl` to do health monitoring.
3. Change `e2e-image-puller` to use `crictl`. Because of https://github.com/kubernetes/kubernetes/issues/63355, it doesn't work now. But in `crictl v1.0.0-beta.1`, we are going to statically link it, and the `e2e-image-puller` should work again.
4. Use `systemctl kill --kill-who=main` instead of `pkill`, the reason is that:
  a. `pkill docker` will send `SIGTERM` to all processes including `dockerd`, `docker-containerd`, `docker-containerd-shim`. This is not a problem for Docker 17.03 CE, because `containerd-shim` in containerd 0.2.x doesn't exit with SIGERM (see [code](https://github.com/containerd/containerd/blob/v0.2.x/containerd-shim/main.go#L123)). However, `containerd-shim` in containerd 1.0+ does exit with SIGTERM (see [code](https://github.com/containerd/containerd/blob/master/cmd/containerd-shim/main_unix.go#L200)). This means that `pkill docker` and `pkill containerd` will kill all shim processes for Docker 17.11+ and containerd 1.0+.
  b. We can use `pkill -x` instead. However, docker systemd service name is `docker`, but daemon process name is `dockerd`. We have to introduce another environment variable to specify "daemon process name". Given so, it seems easier to just use `systemctl kill` which only requires systemd service name. `systemctl kill --kill-who=main` will make sure only main process receives SIGTERM.

Signed-off-by: Lantao Liu <lantaol@google.com>

/cc @filbranden @yujuhong @feiskyer @mrunalp @kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews 

**Release note**:

```release-note
Kubernetes cluster on GCE have crictl installed now. Users can use it to help debug their node. The documentation of crictl can be found https://github.com/kubernetes-incubator/cri-tools/blob/master/docs/crictl.md.
```
2018-05-15 21:18:12 -07:00
Kubernetes Submit Queue
b617748f7b
Merge pull request #62905 from serathius/event-exporter-region
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

[fluentd-gcp addon] Pass region in seperate field

This PR makes location passed to event-exporter based on `MULTIZONE` env.

Fixes https://github.com/kubernetes/kubernetes/issues/62399
```release-note
NONE
```
/cc @loburm
2018-05-11 06:00:44 -07:00
Marek Siarkowicz
f351b00a99 [fluentd-gcp addon] Pass region in seperate field 2018-05-11 09:50:07 +02:00
yankaiz
3989ec66eb Add MAX_PODS_PER_NODE env allowing kubelet to be max-pods aware. 2018-05-04 11:09:55 -07:00
Lantao Liu
d94a2b39d9 Install and use crictl in gce kube-up.sh
Signed-off-by: Lantao Liu <lantaol@google.com>
2018-05-03 17:17:55 -07:00
Kubernetes Submit Queue
b5f61ac129
Merge pull request #62657 from matthyx/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update all script shebangs to use /usr/bin/env interpreter instead of /bin/interpreter

This is required to support systems where bash doesn't reside in /bin (such as NixOS, or the *BSD family) and allow users to specify a different interpreter version through $PATH manipulation.
https://www.cyberciti.biz/tips/finding-bash-perl-python-portably-using-env.html
```release-note
Use /usr/bin/env in all script shebangs to increase portability.
```
2018-05-02 19:44:32 -07:00
Karol Wychowaniec
6fb42aea4a Remove METADATA_AGENT_VERSION config option 2018-04-23 12:15:48 +02:00
Matthias Bertschy
9b15af19b2 Update all script to use /usr/bin/env bash in shebang 2018-04-19 13:20:13 +02:00
Kubernetes Submit Queue
bb8f58b6e6
Merge pull request #62195 from serathius/prometheus
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add prometheus cluster monitoring addon.

This PR adds new cluster monitoring addon based on prometheus.
It adds prometheus deployment with e2e tests.
Additional components will be added iterativly in future.
Manifests based on current Helm chart.
At current state it's not intended for production use.

cc @piosz @kawych @miekg
```release-note
Add prometheus cluster monitoring addon to kube-up
```
/sig instrumentation
/kind feature
/priority important-soon
2018-04-18 02:17:48 -07:00
Michael Taufen
420edc7b50 provision Kubelet config file for GCE
This PR extends the client-side startup scripts to provision a Kubelet
config file instead of legacy flags. This PR also extends the
master/node init scripts to install this config file from the GCE
metadata server, and provide the --config argument to the Kubelet.
2018-04-13 13:08:38 -07:00
Marek Siarkowicz
9544222e91 Test e2e prometheus addon 2018-04-13 11:12:10 +02:00
Kubernetes Submit Queue
72b7dacf07
Merge pull request #58178 from mikedanese/token-auth
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

enable token authentication for kubelets in GCE

```release-note
NONE
```
2018-04-12 15:06:07 -07:00
Mike Danese
23d02c8f07 enable token auth for kubelets in GCE 2018-04-12 09:31:00 -07:00
Shyam Jeedigunta
be2e5e65d3 Fix subnet cleanup logic when using IP-aliases with custom subnets 2018-04-11 15:44:28 +02:00
Shyam Jeedigunta
da01243af1 Fix IP-alias subnet creation logic 2018-04-06 13:23:38 +02:00
Kubernetes Submit Queue
4cfa2e4dfd
Merge pull request #60102 from satyasm/gcloud_net_flag
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fixes #54017, remove deprecated --mode flag

**What this PR does / why we need it**:

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #54017

**Special notes for your reviewer**:

**Release note**:

```release-note
remove deprecated --mode flag in check-network-mode
```
2018-04-05 17:53:00 -07:00
Supriya Garg
e350c46116 Update the stackdriver agents yaml to include a deployment for cluster level resources 2018-04-05 10:09:11 -04:00
Kubernetes Submit Queue
4685df26dd
Merge pull request #60590 from immutableT/enc_config_automation
Automatic merge from submit-queue (batch tested with PRs 60420, 60590). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Enable AESGCM encryption of secrets in etcd by default.

**What this PR does / why we need it**:
Enable encryption of secrets in etcd via AESGCM transform (as described here https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/) during kube-up.sh build of a cluster.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-03-28 23:53:06 -07:00
Kubernetes Submit Queue
e3af2374a6
Merge pull request #60801 from jingax10/gce_util_branch
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Suppress error message from grep when checking whether a subnet has a secondary range or not.

**What this PR does / why we need it**:

Get rid of stdrr caused by grep command when running cluster/kube-up.sh for GCE.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

N/A

**Special notes for your reviewer**:

No behavior change.

**Release note**:

```release-note
"NONE"
```
2018-03-25 02:40:33 -07:00
Kubernetes Submit Queue
053a12aee9
Merge pull request #60107 from wangzhen127/cos-audit-placeholder
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Update GCP fluentd configmap for COS audit logging on GKE node

**What this PR does / why we need it**:
This PR adds a placeholder in fluentd configmap for COS audit logging on GKE node.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:
NONE

**Release note**:

```release-note
NONE
```
2018-03-25 00:51:52 -07:00
immutablet
d08799ca09 Enable AESGCM encryption of secrets in etcd by default. 2018-03-22 13:51:09 -07:00
Kubernetes Submit Queue
e81965d456
Merge pull request #61065 from freehan/fix-gcloud-dev
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix validation for dev gcloud

```release-note
NONE
```
2018-03-22 13:15:12 -07:00
Zhen Wang
d5c2cdcbbb Update GCP fluentd configmap for GKE node journal logging 2018-03-22 12:04:11 -07:00
Kubernetes Submit Queue
130caab7d5
Merge pull request #61235 from yguo0905/client-2
Automatic merge from submit-queue (batch tested with PRs 61124, 59537, 61235, 61258, 61114). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Support new NODE_OS_DISTRIBUTION 'custom' on GCE

**What this PR does / why we need it**:

This PR allows us to run e2e tests against arbitrary OS images on GCE.

It will be cherry picked into 1.8, 1.9 and 1.10.

**Release note**:

```
Support new NODE_OS_DISTRIBUTION 'custom' on GCE.
```

/assign @dashpole
2018-03-21 08:39:23 -07:00
Jing Ai
384868e570 Suppress error message from grep by removing in the end as it is wrongly interpreted as a file. 2018-03-16 18:12:39 -07:00
Kubernetes Submit Queue
c6d77ee656
Merge pull request #61119 from mtaufen/fix-cluster-autoscaler
Automatic merge from submit-queue (batch tested with PRs 61284, 61119, 61201). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add AUTOSCALER_ENV_VARS to kube-env to hotfix cluster autoscaler

This provides a temporary way for the cluster autoscaler to get at
values that were removed from kube-env in #60020. Ideally this
information will eventually be available via e.g. the Cluster API,
because kube-env is an internal interface that carries no stability
guarantees.

This is the first half of the fix; the other half is that cluster autoscaler
needs to be modified to read from AUTOSCALER_ENV_VARS, if it is
available.

Since cluster autoscaler was also reading KUBELET_TEST_ARGS for the
kube-reserved flag, and we don't want to resurrect KUBELET_TEST_ARGS in kube-env,
we opted to create AUTOSCALER_ENV_VARS instead of just adding back
the old env vars. This also makes it clear that we have an ugly dependency
on kube-env.

```release-note
NONE
```
2018-03-16 16:56:00 -07:00
Michael Taufen
8cf3dc103e Add AUTOSCALER_ENV_VARS to kube-env to hotfix cluster autoscaler
This provides a temporary way for the cluster autoscaler to get at
values that were removed from kube-env in #60020. Ideally this
information will eventually be available via e.g. the Cluster API,
because kube-env is an internal interface that carries no stability
guarantees.
2018-03-16 11:43:41 -07:00
Yang Guo
518c6c1a37 Support new NODE_OS_DISTRIBUTION 'custom' on GCE 2018-03-15 14:05:15 -07:00
Ryan Hitchman
68f5d44865 Fix deprecated gcloud compute networks --mode switches.
"create --mode" becomes "create --subnet-mode", and switch-mode has been
folded into "update".

Create --mode was deprecated in October and will be removed in the next
gcloud release. It is already failing in staging tests.
2018-03-14 15:00:59 -07:00
Minhan Xia
ec77fe97ec fix validation for dev gcloud 2018-03-12 14:10:35 -07:00
Kubernetes Submit Queue
31b4719066
Merge pull request #60859 from verult/remount-kube-env
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Setting REMOUNT_VOLUME_PLUGIN_DIR for COS images in kube-env

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60725

**Special notes for your reviewer**: Not sure if it's the best place to set `REMOUNT_VOLUME_PLUGIN_DIR`.

/sig storage
/sig cluster-lifecycle
2018-03-12 10:54:31 -07:00
Mik Vyatskov
07905d6ee8 Make log audit backend configurable in GCE
Signed-off-by: Mik Vyatskov <vmik@google.com>
2018-03-08 14:09:32 +01:00
Cheng Xing
16ecc14017 Setting REMOUNT_VOLUME_PLUGIN_DIR for COS images in kube-env 2018-03-06 14:22:41 -08:00
Jing Ai
977252d4b2 Suppress error message from grep when checking whether a subnet has a secondary range or not. 2018-03-05 09:54:11 -08:00
Mike Danese
857690baf5 gce: add support for enabling TokenRequest feature 2018-02-27 18:54:03 -08:00
Robert Bailey
fe10c27ec0 Move kubelet flag generation from the node to the client, and
pass the kubelet flags through a new variable in kube-env
(KUBELET_ARGS).

Remove vars from kube-env that were only used for kubelet flags.

This will make it simpler to gradually migrate to dynamic kubelet
config, because we can gradually replace flags with config file
options in a single place without worrying about the plumbing to
move variables from the client onto the node.
2018-02-24 22:39:36 -08:00
Kubernetes Submit Queue
a85f7d9fff
Merge pull request #58090 from serathius/pass-location-to-event-exporter
Automatic merge from submit-queue (batch tested with PRs 60054, 60202, 60219, 58090, 60275). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Pass location parameter to event exporter.

**What this PR does / why we need it**:
This PR makes event-exporter export cluster location together with events.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-02-23 23:15:43 -08:00
Marek Siarkowicz
bbfcd681b5 Pass location parameter to event exporter.
Location passed based on ZONE from kube-env.
2018-02-21 12:54:29 +01:00
Satyadeep Musuvathy
59b1ff820c fixes #54017, remove deprecated --mode flag 2018-02-20 14:53:19 -08:00
Karol Wychowaniec
443fd11bb9 Add cluster-location to GCE instance attributes 2018-02-19 10:48:25 +01:00
Robert Bailey
49cb1024b7 Move code only used by gce out of common.sh and into gce/util.sh. 2018-02-15 21:31:12 -08:00
Tim Hockin
3586986416 Switch to k8s.gcr.io vanity domain
This is the 2nd attempt.  The previous was reverted while we figured out
the regional mirrors (oops).

New plan: k8s.gcr.io is a read-only facade that auto-detects your source
region (us, eu, or asia for now) and pulls from the closest.  To publish
an image, push k8s-staging.gcr.io and it will be synced to the regionals
automatically (similar to today).  For now the staging is an alias to
gcr.io/google_containers (the legacy URL).

When we move off of google-owned projects (working on it), then we just
do a one-time sync, and change the google-internal config, and nobody
outside should notice.

We can, in parallel, change the auto-sync into a manual sync - send a PR
to "promote" something from staging, and a bot activates it.  Nice and
visible, easy to keep track of.
2018-02-07 21:14:19 -08:00
Mike Danese
d6918bbbc0 cluster: remove kube-registry-proxy 2018-02-01 07:23:50 -08:00
Mike Danese
e420e0fca8 cluster: remove unused kubelet token 2018-02-01 07:23:50 -08:00
Mike Danese
02de75fb41 cluster: remove some cvm stuff 2018-02-01 07:23:50 -08:00
Mike Danese
4961065562 cluster: remove unused functions 2018-02-01 07:23:50 -08:00
Jesse Shieh
f9e43f3a6f
Fix master regex when running multiple clusters
I'm running two Kubernetes clusters on GCE. One for production and one for staging. The instance prefix I use for production is `kubernetes` and for staging it's `staging-kubernetes`. This caused a problem when running `kube-up.sh` for production because when it tries to find all instances which match `kubernetes(-...)?` it finds both the production and staging instances. This probably results in multiple problems, but the most noticeable one for me was that I`NITIAL_ETCD_CLUSTER` was incorrect and so etcd wouldn't start up correctly so the api server doesn't start up correctly so nothing else starts up. I tested this manually and it seems to work for me, but I didn't write an automated test.
2018-01-19 18:44:52 -08:00