Provide backpressure to clients when etcd goes down
When etcd is down today, we don't handle the resulting error specifically, so clients get a generic 500 error. This commit adds formal internal error types for both WatchExpired and EtcdUnreachable and converts them to api/errors before returning to the client. It also upgrades the client to retry on any 429 or 5xx response that has a Retry-After header, instead of only on 429. In combination, this allows the apiserver to exert backpressure on controllers that are hotlooping. Picked 2 seconds as the default, but we could ramp that up in a future iteration.
@@ -215,6 +215,12 @@ const (
// Status code 500
StatusReasonInternalError = "InternalError"

// StatusReasonExpired indicates that the request is invalid because the content you are requesting
// has expired and is no longer available. It is typically associated with watches that can't be
// serviced.
// Status code 410 (gone)
StatusReasonExpired = "Expired"

// StatusReasonServiceUnavailable means that the request itself was valid,
// but the requested service is unavailable at this time.
// Retrying the request after some time might succeed.