Commit Graph

13 Commits

Author SHA1 Message Date
Matt Bruzek
258ee22858 Putting the nvidia-smi command in a try catch to avoid errors. 2017-04-14 10:45:33 -05:00
Rye Terrell
33fee22032 add support for kube-proxy cluster-cidr option 2017-04-14 10:45:23 -05:00
Jacek N
ebd2f88f6b Add registry action to the kubernetes-worker layer 2017-04-14 10:43:09 -05:00
Rye Terrell
ca4afd8773 Update CDK charms to use snaps 2017-04-14 10:43:00 -05:00
Kubernetes Submit Queue
3a3dc827e4 Merge pull request #43467 from tvansteenburgh/gpu-support
Automatic merge from submit-queue (batch tested with PRs 44047, 43514, 44037, 43467)

Juju: Enable GPU mode if GPU hardware detected

**What this PR does / why we need it**:

Automatically configures kubernetes-worker node to utilize GPU hardware when such hardware is detected.

layer-nvidia-cuda does the hardware detection, installs CUDA and Nvidia
drivers, and sets a state that the k8s-worker can react to.

When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.

When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.

The kube-control interface has subsumed the kube-dns interface
functionality.

An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".

**Special notes for your reviewer**:

Quickest test setup is as follows:
```bash
# Bootstrap. If your aws account doesn't have a default vpc, you'll need to
# specify one at bootstrap time so that juju can provision a p2.xlarge.
# Otherwise you can leave out the --config "vpc-id=vpc-xxxxxxxx" bit.
juju bootstrap --config "vpc-id=vpc-xxxxxxxx" --constraints "cores=4 mem=16G root-disk=64G" aws/us-east-1 k8s

# Deploy the bundle containing master and worker charms built from
# https://github.com/tvansteenburgh/kubernetes/tree/gpu-support/cluster/juju/layers
juju deploy cs:~tvansteenburgh/bundle/kubernetes-gpu-support-3

# Setup kubectl locally
mkdir -p ~/.kube
juju scp kubernetes-master/0:config ~/.kube/config
juju scp kubernetes-master/0:kubectl ./kubectl

# Download a gpu-dependent job spec
wget -O /tmp/nvidia-smi.yaml https://raw.githubusercontent.com/madeden/blogposts/master/k8s-gpu-cloud/src/nvidia-smi.yaml

# Create the job
kubectl create -f /tmp/nvidia-smi.yaml

# You should see a new nvidia-smi-xxxxx pod created
kubectl get pods

# Wait a bit for the job to run, then view logs; you should see the
# nvidia-smi table output
kubectl logs $(kubectl get pods -l name=nvidia-smi -o=name -a)
```

kube-control interface: https://github.com/juju-solutions/interface-kube-control
nvidia-cuda layer: https://github.com/juju-solutions/layer-nvidia-cuda
(Both are registered on http://interfaces.juju.solutions/)

**Release note**:
```release-note
Juju: Enable GPU mode if GPU hardware detected
```
2017-04-04 14:33:26 -07:00
Kubernetes Submit Queue
ff353231ec Merge pull request #42102 from timchenxiaoyu/kubltworderror
Automatic merge from submit-queue

kubelet word mistake
2017-03-24 10:25:06 -07:00
Tim Van Steenburgh
c87ac5ef2e Enable gpu mode if gpu hardware detected.
layer-nvidia-cuda does the hardware detection and sets a state that the
worker can react to.

When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.

When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.

The kube-control interface has subsumed the kube-dns interface
functionality.

An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".
2017-03-23 12:01:23 -04:00
Kubernetes Submit Queue
5b8d600d72 Merge pull request #41919 from Cynerva/gkk/kubelet-auth
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)

Juju: Disable anonymous auth on kubelet

**What this PR does / why we need it**:

This disables anonymous authentication on kubelet when deployed via Juju.

I've also adjusted a few other TLS options for kubelet and kube-apiserver. The end result is that:
1. kube-apiserver can now authenticate with kubelet
2. kube-apiserver now verifies the integrity of kubelet

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:

https://github.com/juju-solutions/bundle-canonical-kubernetes/issues/219

**Special notes for your reviewer**:

This is dependent on PR #41251, where the tactics changes are being merged in separately.

Some useful pages from the documentation:
* [apiserver -> kubelet](https://kubernetes.io/docs/admin/master-node-communication/#apiserver---kubelet)
* [Kubelet authentication/authorization](https://kubernetes.io/docs/admin/kubelet-authentication-authorization/)

**Release note**:

```release-note
Juju: Disable anonymous auth on kubelet
```
2017-03-03 16:44:37 -08:00
George Kraft
27504d8aca Juju: Disable anonymous auth on kubelet
Adds TLS verification between kube-apiserver and kubelet in both directions
2017-02-27 09:02:24 -06:00
axino
83766d2894 add nrpe-external-master relation to kubernetes-master and kubernetes-worker
For now, the checks are very basic and only check if the systemd
services are running properly.
2017-02-26 10:37:34 -06:00
timchenxiaoyu
34bf0bf1cd kubelet word mistake 2017-02-25 22:15:53 +08:00
Matt Bruzek
3b29b6a9ef Lint fixes for the master and worker Python code. 2017-02-16 14:01:30 -06:00
Matt Bruzek
3fcf279cfb Splitting master/node services into separate charm layers
This branch includes a rollup series of commits from a fork of the
kubernetes repository pre 1.5 release because we didn't make the code freeze.
This additional effort has been fully tested and has results submit into
the gubernator to enhance confidence in this code quality vs. the single
layer, posing as both master/node.

To reference the gubernator results, please see:
https://k8s-gubernator.appspot.com/builds/canonical-kubernetes-tests/logs/kubernetes-gce-e2e-node/

Apologies in advance for the large commit, however we did not want to
submit without having successful upstream automated testing results.

This commit includes:

 - Support for CNI networking plugins
 - Support for durable storage provided by ceph
 - Building from upstream templates (read: kubedns - no more template
 drift!)
 - An e2e charm-layer to make running validation tests much simpler/repeatable
 - Changes to support the 1.5.x series of kubernetes

Additional note: We will be targeting -all- future work against upstream
so large pull requests of this magnitude will not occur again.
2017-01-24 09:42:25 -06:00