4781 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
63ae7a02fa Merge pull request #36783 from mml/migrate-debug
Automatic merge from submit-queue

Add debug logging to all etcd migration operations.
2016-11-16 00:31:42 -08:00
Kubernetes Submit Queue
723690c5d9 Merge pull request #36822 from mtaufen/gci-not-default-yet-gce
Automatic merge from submit-queue

K8s 1.5 keeps container-vm as default node image on GCE

There is a concern that some GCE users may be running automation that
(a) turns up ephemeral clusters and (b) always uses the latest K8s
release. If any of these workloads fall outside the set supported on
GCI, cutting the release will break the automation. We are therefore
delaying this change until we have provided sufficient warning.

```release-note
K8s 1.5 keeps container-vm as the default node image on GCE for backwards compatibility reasons. Please beware that container-vm is officially deprecated and you should replace it with GCI if at all possible. You can review the migration guide here for more detail: https://cloud.google.com/container-engine/docs/node-image-migration
```

/cc @aronchick @vishh @roberthbailey
2016-11-15 22:39:00 -08:00
Kubernetes Submit Queue
fedf17826b Merge pull request #36738 from wojtek-t/fix_rollback_etcd3
Automatic merge from submit-queue

Remove v2 data before etcd rollback

Fix #36555
2016-11-15 16:09:15 -08:00
Matt Liggett
fd289c2d55 Add debug logging to all etcd migration operations. 2016-11-15 15:41:42 -08:00
Kubernetes Submit Queue
09a6da3207 Merge pull request #36741 from wojtek-t/fix_migration_ports
Automatic merge from submit-queue

Fix ports in migration script

This may fix problems with migration that you observed.
2016-11-15 12:07:31 -08:00
Michael Taufen
6c5b4761c8 K8s 1.5 keeps container-vm as default node image on GCE
There is a concern that some GCE users may be running automation that
(a) turns up ephemeral clusters and (b) always uses the latest K8s
release. If any of these workloads fall outside the set supported on
GCI, cutting the release will break the automation. We are therefore
delaying this change until we have provided sufficient warning.
2016-11-15 08:34:10 -08:00
Wojciech Tyczynski
2bccbafb6d Set --name flag in etcd migration script 2016-11-15 10:27:02 +01:00
Wojciech Tyczynski
c42729e967 Remove v2 data before etcd rollback 2016-11-15 09:03:49 +01:00
Wojciech Tyczynski
83d83ebb47 Fix ports in migration script 2016-11-14 12:17:34 +01:00
Kubernetes Submit Queue
5e52db2e4f Merge pull request #35895 from rf232/patch-1
Automatic merge from submit-queue

Update Dashboard UI version to 1.4.2

**What this PR does / why we need it**:

Dashboard 1.4.2 contains a fix for an XSS security bug, so I think it would be prudent to update the Dashboard version 'shipped' with kubernetes to this version

**Special notes for your reviewer**:

**Release note**:
- Updated dashboard version in addons to 1.4.2
2016-11-14 01:15:12 -08:00
Michael Taufen
a38c61395e Bump GCI version to gci-dev-56-8977-0-0 2016-11-11 16:00:18 -08:00
Kubernetes Submit Queue
52ca344cc8 Merge pull request #36261 from bowei/dnsmasq-metrics-in-dns-pod
Automatic merge from submit-queue

Add dnsmasq-metrics to the standard DNS pod
2016-11-10 11:09:55 -08:00
Kubernetes Submit Queue
a7870447cc Merge pull request #35516 from jszczepkowski/ha-etcd-certs
Automatic merge from submit-queue

SSL certificates for etcd cluster.

Added generation of SSL certificates for etcd cluster's internal communication.
Turned on for GCE (gci, trusty and debian).
2016-11-10 07:59:01 -08:00
Kubernetes Submit Queue
c34babc2b3 Merge pull request #36537 from rickypai/patch-1
Automatic merge from submit-queue

Fix Docker Registry image version to 2.5.1

`registry:2` is constantly being updated with new versions. This means there's a possibility that the image may be changed unintentionally. For example, when the Pod is rescheduled on nodes that do not already have the image, depending on the time of the pull, `registry:2` may result in different images.

Fix this to the latest `registry:2.5.1` instead to avoid this problem.
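
As a rough illustration of the concern above, a check that tells a floating tag from a fully versioned one could look like the following Python sketch. The helper names `tag_of` and `is_pinned` are hypothetical, and treating only `major.minor.patch` tags as pinned is an assumption made for this example, not a rule taken from this PR.

```python
import re

# Assumption for this sketch: a "pinned" tag has exactly three numeric parts.
_FULL_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def tag_of(image):
    """Return the tag of an image reference, or "latest" if none is given."""
    _, sep, tag = image.rpartition(":")
    # No colon at all, or the colon belongs to a registry host:port prefix.
    if not sep or "/" in tag:
        return "latest"
    return tag


def is_pinned(image):
    """True if the image reference names a full major.minor.patch version."""
    return bool(_FULL_VERSION.match(tag_of(image)))


for ref in ("registry:2", "registry:2.5.1"):
    print(ref, "pinned" if is_pinned(ref) else "floating")
```

With these assumptions, `registry:2` is reported as floating and `registry:2.5.1` as pinned.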

@uluyol @freehan
2016-11-10 07:22:54 -08:00
Jerzy Szczepkowski
ab7266bf19 SSL certificates for etcd cluster.
Added generation of SSL certificates for etcd cluster internal
communication. Turned on for gci & trusty.
2016-11-10 15:26:03 +01:00
Kubernetes Submit Queue
981304872c Merge pull request #36486 from wojtek-t/increase_master_disk_size
Automatic merge from submit-queue

Increase master disk size in large clusters

Ref #34911
2016-11-10 06:12:07 -08:00
Kubernetes Submit Queue
1014bc411a Merge pull request #36346 from jszczepkowski/ha-masterip
Automatic merge from submit-queue

Change master to advertise external IP in kubernetes service.

Change master to advertise external IP in kubernetes service.
In effect, in HA mode with multiple masters, the IP of the external load
balancer will be advertised in the kubernetes service.
2016-11-10 05:00:48 -08:00
Rob Franken
4981e0e37c Update used dashboard version to 1.4.2
Dashboard 1.4.2 contains a fix for an XSS security bug, so I think it would be prudent to update the Dashboard version 'shipped' with kubernetes to this version
2016-11-10 11:49:07 +01:00
Kubernetes Submit Queue
c98fc70195 Merge pull request #36008 from MrHohn/addon-rc-migrate
Automatic merge from submit-queue

Migrates addons from RCs to Deployments

Fixes #33698.

The following addons are being migrated:
- kube-dns
- GLBC default backend
- Dashboard UI
- Kibana

For the new deployments, the version suffixes are removed from their names. Version related labels are also removed because they are confusing and no longer needed given how Deployment and the new Addon Manager work.

The `replica` field in `kube-dns` Deployment manifest is removed for the incoming DNS horizontal autoscaling feature #33239.

The `replica` field in `Dashboard` Deployment manifest is also removed because the rescheduler e2e test is manually scaling it.

Some resource limit related fields in `heapster-controller.yaml` are removed, as they will be set up by the `addon resizer` containers. Detailed reasons in #34513.

Three e2e tests are modified:
- `rescheduler.go`: Changed to resize Dashboard UI Deployment instead of ReplicationController.
- `addon_update.go`: Some namespace related changes in order to make it compatible with the new Addon Manager.
- `dns_autoscaling.go`: Changed to examine kube-dns Deployment instead of ReplicationController.

Both of the above two tests passed on my own cluster. The upgrade process --- from old Addons with RCs to new Addons with Deployments --- was also tested and worked as expected.

The last commit upgrades Addon Manager to v6.0. It is still a work in progress and is currently waiting for #35220 to be finished. (The Addon Manager image in use comes from a non-official registry, but it mostly works except for some corner cases.)

@piosz @gmarek could you please review the heapster part and the rescheduler test?

@mikedanese @thockin 

cc @kubernetes/sig-cluster-lifecycle 

---

Notes:
- The kube-dns manifest still uses *-rc.yaml for the new Deployment. The stale file names are preserved here to get a faster review. May send out a PR to re-organize kube-dns's file names after this.
- The Heapster Deployment's name keeps the old style (with the `-v1.2.0` suffix) to avoid having to describe this upgrade transition explicitly. This way we don't need to attach fake apply labels to the old Deployments.
2016-11-10 02:36:38 -08:00
Bowei Du
9478c4b01f Add dnsmasq-metrics to the standard DNS pod
- Enables prometheus metrics on kube-dns
- Explicitly set v=0 logging for now
2016-11-10 00:08:14 -08:00
Kubernetes Submit Queue
a330acddee Merge pull request #36358 from Crassirostris/use-new-fluentd-gcp-config
Automatic merge from submit-queue

Use new fluentd-gcp image version

In #35618 we used a new version of the fluentd agent, which includes a new version of jemalloc, allowing us to use it.

Additionally, we came up with a hacky way to encourage the Ruby GC to be invoked more often by using the RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR variable.

@piosz
2016-11-09 21:50:53 -08:00
Kubernetes Submit Queue
0f082c6663 Merge pull request #36280 from rkouj/better-mount-error
Automatic merge from submit-queue

Better messaging for missing volume binaries on host

**What this PR does / why we need it**:
When mount binaries are not present on a host, the error returned is a generic one.
This change is to check the mount binaries before the mount and return a user-friendly error message.

This change is specific to GCI and the flag is experimental now.

https://github.com/kubernetes/kubernetes/issues/36098

**Release note**:
Introduces a flag `check-node-capabilities-before-mount` which, if set, enables a check (`CanMount()`) prior to mount operations to verify that the required components (binaries, etc.) to mount the volume are available on the underlying node. If the check is enabled and `CanMount()` returns an error, the mount operation fails. Implements the `CanMount()` check for NFS.

Sample output post change:

rkouj@rkouj0:~/go/src/k8s.io/kubernetes$ kubectl describe pods
Name:		sleepyrc-fzhyl
Namespace:	default
Node:		e2e-test-rkouj-minion-group-oxxa/10.240.0.3
Start Time:	Mon, 07 Nov 2016 21:28:36 -0800
Labels:		name=sleepy
Status:		Pending
IP:		
Controllers:	ReplicationController/sleepyrc
Containers:
  sleepycontainer1:
    Container ID:	
    Image:		gcr.io/google_containers/busybox
    Image ID:		
    Port:		
    Command:
      sleep
      6000
    QoS Tier:
      cpu:	Burstable
      memory:	BestEffort
    Requests:
      cpu:		100m
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Environment Variables:
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  data:
    Type:	NFS (an NFS mount that lasts the lifetime of a pod)
    Server:	127.0.0.1
    Path:	/export
    ReadOnly:	false
  default-token-d13tj:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-d13tj
Events:
  FirstSeen	LastSeen	Count	From						SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----						-------------	--------	------		-------
  7s		7s		1	{default-scheduler }						Normal		Scheduled	Successfully assigned sleepyrc-fzhyl to e2e-test-rkouj-minion-group-oxxa
  6s		3s		4	{kubelet e2e-test-rkouj-minion-group-oxxa}			Warning		FailedMount	Unable to mount volume kubernetes.io/nfs/32c7ef16-a574-11e6-813d-42010af00002-data (spec.Name: data) on pod sleepyrc-fzhyl (UID: 32c7ef16-a574-11e6-813d-42010af00002). Verify that your node machine has the required components before attempting to mount this volume type. Required binary /sbin/mount.nfs is missing
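
The pre-mount check described in the release note can be sketched roughly as follows. This is a Python illustration only, not the Kubernetes implementation: the names `REQUIRED_BINARIES`, `can_mount` and `mount` are hypothetical, and the only detail taken from the output above is that NFS requires `/sbin/mount.nfs`.

```python
import os

# Hypothetical table of helper binaries per volume type. Only the NFS entry
# comes from the sample output; anything else would be an assumption.
REQUIRED_BINARIES = {"nfs": ["/sbin/mount.nfs"]}


def can_mount(volume_type, exists=os.path.exists):
    """Return None if every required binary is present, else an error string."""
    for path in REQUIRED_BINARIES.get(volume_type, []):
        if not exists(path):
            return "Required binary {} is missing".format(path)
    return None


def mount(volume_type, check_before_mount, exists=os.path.exists):
    """Fail early with a readable error when the optional check is enabled."""
    if check_before_mount:
        err = can_mount(volume_type, exists)
        if err is not None:
            raise RuntimeError(err)
    # The real mount call would happen here; this sketch only reports success.
    return "mounted"
```

Passing `exists` as a parameter is a choice made for this sketch so that the check can be exercised without touching the host filesystem.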
2016-11-09 18:51:00 -08:00
Kubernetes Submit Queue
de2bec7691 Merge pull request #36550 from yujuhong/kern_timestamps
Automatic merge from submit-queue

Get kernel logs with timestamps
2016-11-09 18:13:06 -08:00
Kubernetes Submit Queue
b392910bc7 Merge pull request #36505 from Crassirostris/kibana-image-fix
Automatic merge from submit-queue

Fix startup script bug in kibana image

Big thanks to @lhopki01 for noticing this!

As mentioned in the discussion in https://github.com/kubernetes/kubernetes/pull/36103, the current image crashes if we don't want to work behind a proxy, because of string interpolation in bash.

@piosz
2016-11-09 17:33:58 -08:00
Kubernetes Submit Queue
9922489abc Merge pull request #36384 from Crassirostris/fluentd-es-rescheduler-config
Automatic merge from submit-queue

Add rescheduler logs to the fluentd-elasticsearch configuration

Same as https://github.com/kubernetes/kubernetes/pull/36359 for elasticsearch plugin

@piosz
2016-11-09 17:33:50 -08:00
Yu-Ju Hong
fac2aeb416 Get kernel logs with timestamps
Without the timestamps, the log is not very useful.
2016-11-09 17:23:33 -08:00
Kubernetes Submit Queue
986839e9fb Merge pull request #35886 from MrHohn/addon-manager-token
Automatic merge from submit-queue

Fixes token_found bug in addon manager

From #35832.

The above PR exposed addon manager's logs on Jenkins, and the following error was found in the gce e2e test artifacts:
```
Error from server: serviceaccounts "default" not found
error executing template "{{with index .secrets 0}}{{.name}}{{end}}": template: output:1:7: executing "output" at <index .secrets 0>: error calling index: index of untyped nil
== default service account in the kube-system namespace has token Error executing template: template: output:1:7: executing "output" at <index .secrets 0>: error calling index: index of untyped nil. Printing more information for debugging the template:
	template was:
		{{with index .secrets 0}}{{.name}}{{end}}
	raw data was:
		{"kind":"ServiceAccount","apiVersion":"v1","metadata":{"name":"default","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/serviceaccounts/default","uid":"de3f2f85-9d6a-11e6-9df3-42010af00002","resourceVersion":"48","creationTimestamp":"2016-10-29T00:01:40Z"}}
	object given to template engine was:
		map[apiVersion:v1 metadata:map[selfLink:/api/v1/namespaces/kube-system/serviceaccounts/default uid:de3f2f85-9d6a-11e6-9df3-42010af00002 resourceVersion:48 creationTimestamp:2016-10-29T00:01:40Z name:default namespace:kube-system] kind:ServiceAccount] ==
```

It seems the script failed to retrieve the service token on the first attempt and mistakenly used the error message as the token content. This is fixed by replacing `|| true` with an if condition.
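
The difference between the two patterns can be sketched in Python as below. This is only an analogy for the shell behaviour, not the addon manager script itself: `FAILING_LOOKUP` is a hypothetical stand-in for the token lookup, and the function names are made up for the example.

```python
import subprocess
import sys


def run(cmd):
    """Run cmd and return (exit_code, stripped stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def token_ignoring_failure(cmd):
    # Similar in spirit to `token=$(cmd || true)`: the exit status is thrown
    # away, so whatever the command printed is treated as the token.
    _, out = run(cmd)
    return out


def token_checked(cmd):
    # Similar in spirit to `if token=$(cmd); then ...`: the output is only
    # trusted when the command reports success.
    code, out = run(cmd)
    return out if code == 0 else ""


# Hypothetical failing lookup: prints an error on stdout and exits non-zero.
FAILING_LOOKUP = [
    sys.executable,
    "-c",
    "import sys; print('Error executing template'); sys.exit(1)",
]
```

With the first function the error text ends up where a token is expected; with the second the caller gets an empty value and can retry.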
2016-11-09 15:55:02 -08:00
Rajat Ramesh Koujalagi
d81e216fc6 Better messaging for missing volume components on host to perform mount 2016-11-09 15:16:11 -08:00
Ricky Pai
9c850044ae Fix Docker Registry image version to 2.5.1
https://hub.docker.com/r/library/registry/tags/

`registry:2` is constantly being updated with new versions. This means there's a possibility that the image may be changed unintentionally. For example, when the Pod is rescheduled on nodes that do not already have the image, depending on the time of the pull, `registry:2` may result in different images.

Fix this to the latest `registry:2.5.1` instead to avoid this problem.
2016-11-09 12:46:40 -08:00
Kubernetes Submit Queue
916f526811 Merge pull request #36435 from wojtek-t/fix_max_inflight_requests
Automatic merge from submit-queue

Increase max-requests-inflight in large clusters

Fix #35402
2016-11-09 09:27:02 -08:00
Zihong Zheng
fe3a0d2937 Changed kube-dns-autoscaler's target to Deployment/kube-dns 2016-11-09 09:20:51 -08:00
Zihong Zheng
e8c66d4aee Bumps up Addon Manager to v6.0-alpha.1 and updates related e2e test 2016-11-09 09:19:15 -08:00
Zihong Zheng
b26faae7fc Migrates addons from using ReplicationControllers to Deployments 2016-11-09 09:17:05 -08:00
Mik Vyatskov
94eeca8d2c Fixed startup script bug in kibana image 2016-11-09 16:35:34 +01:00
Wojciech Tyczynski
3a3031fd5b Increase master disk size in large clusters 2016-11-09 12:15:06 +01:00
Kubernetes Submit Queue
54274807d9 Merge pull request #35832 from MrHohn/addon-manager-logs
Automatic merge from submit-queue

Expose addon manager's log by logging to file

Fixes #35823.

Uses the same approach that [`kube-proxy`](https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/kube-proxy/kube-proxy.manifest) uses for logging. After this, we will be able to check Addon Manager's logs for Jenkins tests.

Would like to see the Jenkins test result to examine.

@mikedanese
2016-11-08 22:50:57 -08:00
Vishnu kannan
773ad9be29 Make gci mounter pre-fetch mounter image to reduce startup latency during runtime
Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-11-08 12:13:49 -08:00
Jing Xu
d07396f7c7 Update configure.sh
Update the gci-mounter sha1 number
2016-11-08 12:13:49 -08:00
Vishnu kannan
77218d361b Use a local file for rkt stage1 and gci-mounter docker image.
Added a make rule `make upload` to audit and automate release artifact
uploads to GCS.

Signed-off-by: Vishnu kannan <vishnuk@google.com>
2016-11-08 11:09:13 -08:00
Vishnu kannan
dd8ec911f3 Revert "Revert "Merge pull request #35821 from vishh/gci-mounter-scope""
This reverts commit 402116aed4.
2016-11-08 11:09:10 -08:00
Mik Vyatskov
279e20ed13 Fix fluentd-gcp image Dockerfile 2016-11-08 15:14:09 +01:00
Wojciech Tyczynski
75d7d1ad37 Increase max-requests-inflight in large clusters 2016-11-08 14:41:58 +01:00
Kubernetes Submit Queue
e5fb8ac226 Merge pull request #36431 from mwielgus/ca-0.4.0-b1
Automatic merge from submit-queue

Switch cluster autoscaler to 0.4.0-beta1

Switch Kubernetes to new 0.4.0-beta1 Cluster Autoscaler. The release contains mainly bugfixes:
* unschedulable nodes don't stop cluster autoscaler
* better logging
* events for deletions
* bulk delete for empty nodes

cc: @fgrzadkowski @piosz @jszczepkowski
2016-11-08 03:47:21 -08:00
Marcin
b6ef1a132e Switch cluster autoscaler to 0.4.0-beta1 2016-11-08 11:45:42 +01:00
Kubernetes Submit Queue
ece94c317a Merge pull request #36077 from mtaufen/upgrade-log-os-and-k8s-ver
Automatic merge from submit-queue

Print osImage and kubeletVersion for nodes before and after GCE upgrade

This will print, e.g.:
```
== Pre-Upgrade Node OS and Kubelet Versions ==
name: "e2e-test-mtaufen-master", osImage: "Google Container-VM Image", kubeletVersion: "v1.4.5-beta.0.45+90d209221ec8dc-dirty"
name: "e2e-test-mtaufen-minion-group-jo79", osImage: "Debian GNU/Linux 7 (wheezy)", kubeletVersion: "v1.4.5-beta.0.45+90d209221ec8dc-dirty"
name: "e2e-test-mtaufen-minion-group-ox5l", osImage: "Debian GNU/Linux 7 (wheezy)", kubeletVersion: "v1.4.5-beta.0.45+90d209221ec8dc-dirty"
name: "e2e-test-mtaufen-minion-group-qvbq", osImage: "Debian GNU/Linux 7 (wheezy)", kubeletVersion: "v1.4.5-beta.0.45+90d209221ec8dc-dirty"
```

Let me know what output format you prefer and I'll see if I can make it work; I have as much flexibility as jsonpath allows.
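
For reference, the same line format can be produced from a node list in Python, as in the sketch below. It assumes input shaped like `kubectl get nodes -o json` and reads `metadata.name`, `status.nodeInfo.osImage` and `status.nodeInfo.kubeletVersion`; the function name `describe_nodes` is hypothetical, and this is not the jsonpath template used in the PR.

```python
def describe_nodes(node_list):
    """Format one line per node with its name, OS image and kubelet version."""
    lines = []
    for node in node_list["items"]:
        info = node["status"]["nodeInfo"]
        lines.append(
            'name: "{}", osImage: "{}", kubeletVersion: "{}"'.format(
                node["metadata"]["name"],
                info["osImage"],
                info["kubeletVersion"],
            )
        )
    return lines
```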
2016-11-08 02:18:44 -08:00
Kubernetes Submit Queue
a0c34eee35 Merge pull request #33239 from MrHohn/dns-autoscaler
Automatic merge from submit-queue

Deploy kube-dns with cluster-proportional-autoscaler

This PR integrates [cluster-proportional-autoscaler](https://github.com/kubernetes-incubator/cluster-proportional-autoscaler) with kube-dns for DNS horizontal autoscaling. 

Fixes #28648 and #27781.
2016-11-07 19:31:31 -08:00
Kubernetes Submit Queue
465c6b749c Merge pull request #36370 from Crassirostris/flunetd-gcp-image-fix
Automatic merge from submit-queue

Fix config file names inside fluentd-gcp image

Need this in order to merge https://github.com/kubernetes/kubernetes/pull/36358

This is needed because on container-vm we rely on an implicitly used configuration file.

@piosz
2016-11-07 13:51:07 -08:00
Kubernetes Submit Queue
4ef95cd720 Merge pull request #36356 from jszczepkowski/exp-flag
Automatic merge from submit-queue

Removed EXPERIMENTAL from KUBE_REPLICATE_EXISTING_MASTER flag.
2016-11-07 12:45:31 -08:00
Mik Vyatskov
d478307106 Fix config file names inside fluentd-gcp image 2016-11-07 20:31:12 +01:00
Mik Vyatskov
800aafea9b Add rescheduler logs to the fluentd-elasticsearch configuration 2016-11-07 20:24:06 +01:00