Update docs. Add design principles. Fixes #6133. Fixes #4182.

2015-04-16 21:41:07 +00:00
parent 17181cbb81
commit f1cea092df
13 changed files with 149 additions and 189 deletions
--- a/docs/README.md
+++ b/docs/README.md
@@ -11,6 +11,6 @@

 * The [API object documentation](http://kubernetes.io/third_party/swagger-ui/) is a detailed description of all fields found in core API objects.

-* An overview of the [Design of Kubernetes](../DESIGN.md)
+* An overview of the [Design of Kubernetes](design)

 * There are example files and walkthroughs in the [examples](../examples) folder.
--- a/docs/api-conventions.md
+++ b/docs/api-conventions.md
@@ -1,11 +1,11 @@
 API Conventions
 ===============

-Updated: 4/14/2015
+Updated: 4/16/2015

 The conventions of the [Kubernetes API](api.md) (and related APIs in the ecosystem) are intended to ease client development and ensure that configuration mechanisms can be implemented that work across a diverse set of use cases consistently.

-The general style of the Kubernetes API is RESTful - clients create, update, delete, or retrieve a description of an object via the standard HTTP verbs (POST, PUT, DELETE, and GET) - and those APIs preferentially accept and return JSON. Kubernetes also exposes additional endpoints for non-standard verbs and allows alternative content types. All of the JSON accepted and returned by the server has a schema, identified by the "kind" and "apiVersion" fields.
+The general style of the Kubernetes API is RESTful - clients create, update, delete, or retrieve a description of an object via the standard HTTP verbs (POST, PUT, DELETE, and GET) - and those APIs preferentially accept and return JSON. Kubernetes also exposes additional endpoints for non-standard verbs and allows alternative content types. All of the JSON accepted and returned by the server has a schema, identified by the "kind" and "apiVersion" fields. Where relevant HTTP header fields exist, they should mirror the content of JSON fields, but the information should not be represented only in the HTTP header.

 The following terms are defined:

--- a/docs/design/README.md
+++ b/docs/design/README.md
@@ -0,0 +1,17 @@
+# Kubernetes Design Overview
+
+Kubernetes is a system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications. 
+
+Kubernetes establishes robust declarative primitives for maintaining the desired state requested by the user. We see these primitives as the main value added by Kubernetes. Self-healing mechanisms, such as auto-restarting, re-scheduling, and replicating containers, require active controllers, not just imperative orchestration.
+
+Kubernetes is primarily targeted at applications composed of multiple containers, such as elastic, distributed micro-services. It is also designed to facilitate migration of non-containerized application stacks to Kubernetes. It therefore includes abstractions for grouping containers in both loosely coupled and tightly coupled formations, and provides ways for containers to find and communicate with each other in relatively familiar ways.
+
+Kubernetes enables users to ask a cluster to run a set of containers. The system automatically chooses hosts to run those containers on. While Kubernetes's scheduler is currently very simple, we expect it to grow in sophistication over time. Scheduling is a policy-rich, topology-aware, workload-specific function that significantly impacts availability, performance, and capacity. The scheduler needs to take into account individual and collective resource requirements, quality of service requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, deadlines, and so on. Workload-specific requirements will be exposed through the API as necessary.
+
+Kubernetes is intended to run on a number of cloud providers, as well as on physical hosts.
+
+A single Kubernetes cluster is not intended to span multiple availability zones. Instead, we recommend building a higher-level layer to replicate complete deployments of highly available applications across multiple zones (see [the availability doc](../availability.md) and [cluster federation proposal](../proposals/federation.md) for more details).
+
+Finally, Kubernetes aspires to be an extensible, pluggable, building-block OSS platform and toolkit. Therefore, architecturally, we want Kubernetes to be built as a collection of pluggable components and layers, with the ability to use alternative schedulers, controllers, storage systems, and distribution mechanisms, and we're evolving its current code in that direction. Furthermore, we want others to be able to extend Kubernetes functionality, such as with higher-level PaaS functionality or multi-cluster layers, without modification of core Kubernetes source. Therefore, its API isn't just (or even necessarily mainly) targeted at end users, but at tool and extension developers. Its APIs are intended to serve as the foundation for an open ecosystem of tools, automation systems, and higher-level API layers. Consequently, there are no "internal" inter-component APIs. All APIs are visible and available, including the APIs used by the scheduler, the node controller, the replication-controller manager, Kubelet's API, etc. There's no glass to break -- in order to handle more complex use cases, one can just access the lower-level APIs in a fully transparent, composable manner.
+
+For more about the Kubernetes architecture, see [architecture](architecture.md).
--- a/docs/design/architecture.md
+++ b/docs/design/architecture.md
@@ -0,0 +1,44 @@
+# Kubernetes architecture
+
+A running Kubernetes cluster contains node agents (kubelet) and master components (APIs, scheduler, etc), on top of a distributed storage solution. This diagram shows our desired eventual state, though we're still working on a few things, like making kubelet itself (all our components, really) run within containers, and making the scheduler 100% pluggable.
+
+![Architecture Diagram](../architecture.png?raw=true "Architecture overview")
+
+## The Kubernetes Node
+
+When looking at the architecture of the system, we'll break it down into services that run on the worker node and services that compose the cluster-level control plane.
+
+The Kubernetes node has the services necessary to run application containers and be managed from the master systems.
+
+Each node runs Docker, of course.  Docker takes care of the details of downloading images and running containers.
+
+### Kubelet
+The **Kubelet** manages [pods](../pods.md) and their containers, their images, their volumes, etc. 
+
+### Kube-Proxy
+
+Each node also runs a simple network proxy and load balancer (see the [services FAQ](https://github.com/GoogleCloudPlatform/kubernetes/wiki/Services-FAQ) for more details).  This reflects `services` (see [the services doc](../services.md) for more details) as defined in the Kubernetes API on each node and can do simple TCP and UDP stream forwarding (round robin) across a set of backends.
+
+Service endpoints are currently found via [DNS](../dns.md) or through environment variables (both [Docker-links-compatible](https://docs.docker.com/userguide/dockerlinks/) and Kubernetes {FOO}_SERVICE_HOST and {FOO}_SERVICE_PORT variables are supported).  These variables resolve to ports managed by the service proxy.
+
+## The Kubernetes Control Plane
+
+The Kubernetes control plane is split into a set of components. Currently they all run on a single _master_ node, but that is expected to change soon in order to support high-availability clusters.  These components work together to provide a unified view of the cluster.
+
+### etcd
+
+All persistent master state is stored in an instance of `etcd`.  This provides a great way to store configuration data reliably.  With `watch` support, coordinating components can be notified very quickly of changes.
+
+### Kubernetes API Server
+
+The apiserver serves up the [Kubernetes API](../api.md). It is intended to be a CRUD-y server, with most/all business logic implemented in separate components or in plug-ins. It mainly processes REST operations, validates them, and updates the corresponding objects in `etcd` (and eventually other stores).
+
+### Scheduler
+
+The scheduler binds unscheduled pods to nodes via the `/binding` API. The scheduler is pluggable, and we expect to support multiple cluster schedulers and even user-provided schedulers in the future.
+
+### Kubernetes Controller Manager Server
+
+All other cluster-level functions are currently performed by the Controller Manager. For instance, `Endpoints` objects are created and updated by the endpoints controller, and nodes are discovered, managed, and monitored by the node controller. These could eventually be split into separate components to make them independently pluggable.
+
+The [`replicationController`](../replication-controller.md) is a mechanism that is layered on top of the simple [`pod`](../pods.md) API. We eventually plan to port it to a generic plug-in mechanism, once one is implemented.
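
The environment-variable form of service discovery described in the Kube-Proxy section of `architecture.md` can be sketched as follows. This is a minimal illustration, not code from the commit: the helper name `service_endpoint` is hypothetical, and the assumption that `{FOO}` is the service name upper-cased with dashes turned into underscores is not stated in the text above.

```python
import os


def service_endpoint(service_name, environ=None):
    """Look up a service's proxy address from {FOO}_SERVICE_HOST/_PORT.

    Assumption (not from the diff): {FOO} is the service name upper-cased,
    with dashes replaced by underscores.
    """
    if environ is None:
        environ = os.environ
    prefix = service_name.upper().replace("-", "_")
    host = environ[prefix + "_SERVICE_HOST"]
    port = int(environ[prefix + "_SERVICE_PORT"])
    return host, port


# Example with a hand-built environment instead of the real one.
example_env = {
    "REDIS_MASTER_SERVICE_HOST": "10.0.0.11",
    "REDIS_MASTER_SERVICE_PORT": "6379",
}
host, port = service_endpoint("redis-master", example_env)
```

A container would then connect to `host:port`, which the document describes as a port managed by the service proxy rather than a backend directly.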
--- a/docs/design/principles.md
+++ b/docs/design/principles.md
@@ -0,0 +1,55 @@
+# Design Principles
+
+Principles to follow when extending Kubernetes. 
+
+## API
+
+See also the [API conventions](../api-conventions.md).
+
+* All APIs should be declarative.
+* API objects should be complementary and composable, not opaque wrappers.
+* The control plane should be transparent -- there are no hidden internal APIs.
+* The cost of API operations should be proportional to the number of objects intentionally operated upon. Therefore, common filtered lookups must be indexed. Beware of patterns of multiple API calls that would incur quadratic behavior.
+* Object status must be 100% reconstructable by observation. Any history kept must be just an optimization and not required for correct operation.
+* Cluster-wide invariants are difficult to enforce correctly. Try not to add them. If you must have them, don't enforce them atomically in master components: doing so is contention-prone and provides no recovery path if a bug allows the invariant to be violated. Instead, provide a series of checks to reduce the probability of a violation, and make every component involved able to recover from an invariant violation.
+* Low-level APIs should be designed for control by higher-level systems. Higher-level APIs should be intent-oriented (think SLOs) rather than implementation-oriented (think control knobs).
+
+## Control logic
+
+* Functionality must be *level-based*, meaning the system must operate correctly given the desired state and the current/observed state, regardless of how many intermediate state updates may have been missed. Edge-triggered behavior must be just an optimization.
+* Assume an open world: continually verify assumptions and gracefully adapt to external events and/or actors. Example: we allow users to kill pods under control of a replication controller; it just replaces them.
+* Do not define comprehensive state machines for objects with behaviors associated with state transitions and/or "assumed" states that cannot be ascertained by observation. 
+* Don't assume that a component's decisions will not be overridden or rejected, or that the component will always understand why. For example, etcd may reject writes. Kubelet may reject pods. The scheduler may not be able to schedule pods. Retry, but back off and/or make alternative decisions.
+* Components should be self-healing. For example, if a component must keep some state (e.g., a cache), the content needs to be periodically refreshed, so that if an item is erroneously stored or a deletion event is missed, it will soon be fixed, ideally on timescales shorter than what will attract attention from humans.
+* Component behavior should degrade gracefully. Prioritize actions so that the most important activities can continue to function even when overloaded and/or in states of partial failure.
+
+## Architecture
+
+* Only the apiserver should communicate with etcd/store, and not other components (scheduler, kubelet, etc.).
+* Compromising a single node shouldn't compromise the cluster.
+* Components should continue to do what they were last told in the absence of new instructions (e.g., due to network partition or component outage).
+* All components should keep all relevant state in memory all the time. The apiserver should write through to etcd/store, other components should write through to the apiserver, and they should watch for updates made by other clients. 
+* Watch is preferred over polling.
+
+## Extensibility
+
+TODO: pluggability
+
+## Bootstrapping
+
+* [Self-hosting](https://github.com/GoogleCloudPlatform/kubernetes/issues/246) of all components is a goal.
+* Minimize the number of dependencies, particularly those required for steady-state operation.
+* Stratify the dependencies that remain via principled layering.
+* Break any circular dependencies by converting hard dependencies to soft dependencies.
+  * Also accept data that would normally come from other components from another source, such as local files, which can be manually populated at bootstrap time and then continuously updated once those other components are available.
+  * State should be rediscoverable and/or reconstructable.
+  * Make it easy to run temporary, bootstrap instances of all components in order to create the runtime state needed to run the components in the steady state; use a lock (master election for distributed components, file lock for local components like Kubelet) to coordinate handoff. We call this technique "pivoting".
+  * Have a solution to restart dead components. For distributed components, replication works well. For local components such as Kubelet, a process manager or even a simple shell loop works.
+
+## Availability
+
+TODO
+
+## General principles
+
+* [Eric Raymond's 17 UNIX rules](https://en.wikipedia.org/wiki/Unix_philosophy#Eric_Raymond.E2.80.99s_17_Unix_Rules)
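
The *level-based* principle from the "Control logic" section of `principles.md` can be sketched roughly as below. This is an illustrative sketch only: the function name `reconcile`, the action tuples, and the pod names are hypothetical and do not come from the Kubernetes code base. The point is that the function looks only at the desired state and the currently observed state, so it gives the same answer no matter how many intermediate updates were missed.

```python
def reconcile(desired_replicas, observed_pods):
    """Compute the actions that move the observed state toward the desired state.

    Level-based: the result depends only on the two states passed in,
    never on a history of events.
    """
    running = sorted(observed_pods)
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few pods: ask for the missing number of creations.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: delete the surplus (arbitrary but deterministic choice).
        return [("delete", name) for name in running[:-diff]]
    return []


# A user killed two of three pods; the controller just replaces them.
actions = reconcile(3, {"pod-a"})
```

Calling `reconcile` again after the actions have taken effect returns an empty list, which is what makes it safe to run the same function periodically as well as in response to watch events.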
--- a/docs/getting-started-guides/README.md
+++ b/docs/getting-started-guides/README.md
@@ -18,18 +18,18 @@ Bare-metal     | Ansible      | Fedora | flannel     | [docs](../../docs/getting
 AWS            | CoreOS       | CoreOS | flannel     | [docs](../../docs/getting-started-guides/coreos.md)    | Community                    | Uses K8s version 0.11.0
 GCE            | CoreOS       | CoreOS | flannel     | [docs](../../docs/getting-started-guides/coreos.md)    | Community (@kelseyhightower) | Uses K8s version 0.11.0
 Vagrant        | CoreOS       | CoreOS | flannel     | [docs](../../docs/getting-started-guides/coreos.md)    | Community (@pires)           | Uses K8s version 0.11.0
-CloudStack     | Ansible      | CoreOS | flannel     | [docs](../../docs/getting-started-guides/cloudstack.md)| Community (@sebgoa)          | Uses K8s version 0.9.1
+CloudStack     | Ansible      | CoreOS | flannel     | [docs](../../docs/getting-started-guides/cloudstack.md)| Community (@runseb)          | Uses K8s version 0.9.1
 Vmware         |              | Debian | OVS         | [docs](../../docs/getting-started-guides/vsphere.md)   | Community (@pietern)         | Uses K8s version 0.9.1
 AWS            | Saltstack    | Ubuntu | OVS         | [docs](../../docs/getting-started-guides/aws.md)       | Community (@justinsb)        | Uses K8s version 0.5.0
 Vmware         | CoreOS       | CoreOS | flannel     | [docs](../../docs/getting-started-guides/coreos.md)    | Community (@kelseyhightower) |
-Azure          | Saltstack    | Ubuntu | OpenVPN     | [docs](../../docs/getting-started-guides/azure.md)     | Community (@jeffmendoza)     |
+Azure          | Saltstack    | Ubuntu | OpenVPN     | [docs](../../docs/getting-started-guides/azure.md)     | Community                    |
 Bare-metal     | custom       | Ubuntu | _none_      | [docs](../../docs/getting-started-guides/ubuntu_single_node.md) | Community (@jainvipin)       |
 Bare-metal     | custom       | Ubuntu Cluster | flannel | [docs](../../docs/getting-started-guides/ubuntu_multinodes_cluster.md) | Community (@resouer @WIZARD-CXY) | Uses K8s version 0.12.0
 Docker Single Node        | custom       | N/A    | local       | [docs](docker.md) | Project (@brendandburns) | Tested @ 0.14.1 |
 Docker Multi Node        | Flannel| N/A    | local       | [docs](docker-multinode.md) | Project (@brendandburns) | Tested @ 0.14.1 |
 Local          |              |        | _none_      | [docs](../../docs/getting-started-guides/locally.md)   | Community (@preillyme)                     |
-Ovirt          |              |        |             | [docs](../../docs/getting-started-guides/ovirt.md)     | Inactive                     |
-Rackspace      | CoreOS       | CoreOS | Rackspace   | [docs](../../docs/getting-started-guides/rackspace.md) | Inactive                     |
+Ovirt          |              |        |             | [docs](../../docs/getting-started-guides/ovirt.md)     | Inactive (@simon3z)          |
+Rackspace      | CoreOS       | CoreOS | Rackspace   | [docs](../../docs/getting-started-guides/rackspace.md) | Inactive (@doubleerr)        |
 Bare-metal     | custom       | CentOS | _none_      | [docs](../../docs/getting-started-guides/centos/centos_manual_config.md) | Community (@coolsvap)   | Uses K8s v0.9.1
 libvirt/KVM    | CoreOS       | CoreOS | libvirt/KVM | [docs](../../docs/getting-started-guides/libvirt-coreos.md) | Community (@lhuard1A)   |
 AWS            | Juju         | Ubuntu | flannel     | [docs](../../docs/getting-started-guides/juju.md)      | [Community](https://github.com/whitmo/bundle-kubernetes) ( [@whit](https://github.com/whitmo), [@matt](https://github.com/mbruzek), [@chuck](https://github.com/chuckbutler) ) | [Tested](http://reports.vapour.ws/charm-tests-by-charm/kubernetes) K8s v0.8.1
--- a/docs/glossary.md
+++ b/docs/glossary.md
@@ -35,6 +35,9 @@ for easy scaling of replicated systems, and handles restarting of a Pod when the
 **Resource**
 : CPU, memory, and other things that a pod can request.   See [resources](resources.md).

+**Secret**
+: An object containing sensitive information, such as authentication tokens, which can be made available to containers upon request. See [secrets](secrets.md).
+
 **Selector**
 : An expression that matches Labels.  Can identify related objects, such as pods which are replicas in a load-balanced
 service.  See [labels](labels.md).
--- a/docs/overview.md
+++ b/docs/overview.md
@@ -1,8 +1,8 @@
 # Kubernetes User Documentation

-Kubernetes is an open-source system for managing containerized applications (currently Docker containers) across multiple hosts in a cluster. It provides mechanisms for application deployment, scheduling, updating, maintenance, and scaling. A key feature of Kubernetes is that it actively manages the containers to ensure that the state of the cluster continually matches the user's intentions.
+Kubernetes is an open-source system for managing containerized applications across multiple hosts in a cluster. It provides mechanisms for application deployment, scheduling, updating, maintenance, and scaling. A key feature of Kubernetes is that it actively manages the containers to ensure that the state of the cluster continually matches the user's intentions.

-Today, Kubernetes focuses on continuously-running stateless (e.g. web server or in-memory object cache) and "cloud native" stateful applications (e.g. NoSQL datastores), but in the near future it will support all the other workload types commonly found in production cluster environments, such as batch, stream processing, and traditional databases.
+Today, Kubernetes supports just [Docker](http://www.docker.io) containers, but other container image formats and container runtimes will be supported in the future (e.g., [Rocket](https://coreos.com/blog/rocket/) support is in progress). Similarly, while Kubernetes currently focuses on continuously-running stateless (e.g. web server or in-memory object cache) and "cloud native" stateful applications (e.g. NoSQL datastores), in the near future it will support all the other workload types commonly found in production cluster environments, such as batch, stream processing, and traditional databases. 

 In Kubernetes, all containers run inside [pods](pods.md). A pod can host a single container, or multiple cooperating containers; in the latter case, the containers in the pod are guaranteed to be co-located on the same machine and can share resources. A pod can also contain zero or more [volumes](volumes.md), which are directories that are private to a container or shared across containers in a pod. For each pod the user creates, the system finds a machine that is healthy and that has sufficient available capacity, and starts up the corresponding container(s) there. If a container fails it can be automatically restarted by Kubernetes' node agent, called the Kubelet. But if the pod or its machine fails, it is not automatically moved or restarted unless the user also defines a [replication controller](replication-controller.md), which we discuss next.

--- a/docs/pods.md
+++ b/docs/pods.md
@@ -30,7 +30,7 @@ In addition to defining the application containers that run in the pod, the pod

 ### Management

-Pods also simplify application deployment and management by providing a higher-level abstraction than the raw, low-level container interface. Pods serve as units of deployment and horizontal scaling/replication. Co-location, fate sharing, coordinated replication, resource sharing, and dependency management are handled automatically.
+Pods also simplify application deployment and management by providing a higher-level abstraction than the raw, low-level container interface. Pods serve as units of deployment and horizontal scaling/replication. Co-location (co-scheduling), fate sharing, coordinated replication, resource sharing, and dependency management are handled automatically.

 ## Uses of pods