When etcd is down today we don't specifically handle the error involved,
which means clients get a generic 500 error. This commit adds a formal
error type internally for both WatchExpired and EtcdUnreachable, and
then converts them to api/errors before returning to the client. It also
upgrades the client to retry on any 429 or 5xx error that has a
Retry-After header, instead of just 429.
In combination, this allows the apiserver to exert backpressure on
controllers that are hotlooping. Picked 2 seconds by default, but we
could potentially ramp that up even further in a future iteration.
Avoid TTL by deleting pods immediately when they aren't
scheduled, and letting the Kubelet delete them otherwise.
Ensure the Kubelet uses pod.Spec.TerminationGracePeriodSeconds
when no pod.DeletionGracePeriodSeconds is available.
Change the signature of GuaranteedUpdate so that TTL can
be more easily preserved. Allow a simpler (no ttl) and
more complex (response and node directly available, set ttl)
path for GuaranteedUpdate. Add some tests to ensure this
doesn't blow up again.
This commit adds support to core resources to enable deferred deletion
of resources. Clients may optionally specify a time period after which
resources must be deleted via an object sent with their DELETE. That
object may define an optional grace period in seconds, or allow the
default "preferred" value for a resource to be used. Once the object
is marked as pending deletion, the deletionTimestamp field will be set
and an etcd TTL will be in place.
Clients should assume resources that have deletionTimestamp set will
be deleted at some point in the future. Other changes will come later
to enable graceful deletion on a per resource basis.
In order to support graceful deletion, the resource object will
need access to the TTL value in etcd. Also, in the future we
may want to get the creation index (distinct from modifiedindex)
and expose it to clients. Change EtcdResourceVersioner to be
more type specific (objects vs lists) and provide a default
implementation that relies on the internal API convention.
Also, rename etcd_tools.go to etcd_helper.go and split a few
things up.