back off restarts of crashlooping containers

Signed-off-by: Sam Abed <samabed@gmail.com>
This commit is contained in:
Sam Abed
2015-08-13 22:59:15 +10:00
parent c92cc62145
commit 995cb15bb6
14 changed files with 505 additions and 34 deletions

View File

@@ -77,7 +77,7 @@ More detailed information about the current (and previous) container statuses ca
## RestartPolicy
The possible values for RestartPolicy are `Always`, `OnFailure`, or `Never`. If RestartPolicy is not set, the default value is `Always`. RestartPolicy applies to all containers in the pod. RestartPolicy only refers to restarts of the containers by the Kubelet on the same node. As discussed in the [pods document](pods.md#durability-of-pods-or-lack-thereof), once bound to a node, a pod will never be rebound to another node. This means that some kind of controller is necessary in order for a pod to survive node failure, even if just a single pod at a time is desired.
The possible values for RestartPolicy are `Always`, `OnFailure`, or `Never`. If RestartPolicy is not set, the default value is `Always`. RestartPolicy applies to all containers in the pod. RestartPolicy only refers to restarts of the containers by the Kubelet on the same node. Failed containers that are restarted by Kubelet, are restarted with an exponential back-off delay, the delay is in multiples of sync-frequency 0, 1x, 2x, 4x, 8x ... capped at 5 minutes and is reset after 10 minutes of successful execution. As discussed in the [pods document](pods.md#durability-of-pods-or-lack-thereof), once bound to a node, a pod will never be rebound to another node. This means that some kind of controller is necessary in order for a pod to survive node failure, even if just a single pod at a time is desired.
The only controller we have today is [`ReplicationController`](replication-controller.md). `ReplicationController` is *only* appropriate for pods with `RestartPolicy = Always`. `ReplicationController` should refuse to instantiate any pod that has a different restart policy.