Provide backpressure to clients when etcd goes down
When etcd is down today, we don't handle the resulting error specifically, so clients get a generic 500 error. This commit adds formal internal error types for both WatchExpired and EtcdUnreachable and converts them to api/errors before returning to the client. It also upgrades the client to retry on any 429 or 5xx response that has a Retry-After header, instead of only 429. In combination, this allows the apiserver to exert backpressure on controllers that are hotlooping.

The default delay is 2 seconds for now; we could raise it in a future iteration.
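As a rough illustration of the server-side half described above, the sketch below maps an internal storage error to an HTTP status plus a Retry-After delay. Everything in it is hypothetical: the sentinel errors `errEtcdUnreachable` and `errWatchExpired`, the helpers `statusFor` and `writeError`, and the status codes chosen are stand-ins, not the types or api/errors constructors this commit actually uses. Only the 2-second default comes from the message.

```go
package main

import (
	"errors"
	"net/http"
	"strconv"
)

// Hypothetical sentinel errors standing in for the commit's internal error types.
var (
	errEtcdUnreachable = errors.New("etcd unreachable")
	errWatchExpired    = errors.New("watch expired")
)

// defaultRetryAfterSeconds mirrors the 2-second default from the commit message.
const defaultRetryAfterSeconds = 2

// statusFor maps an internal error to an HTTP status and a Retry-After delay
// in seconds (0 means "send no header"). The status codes are illustrative
// assumptions, not necessarily the ones the apiserver returns.
func statusFor(err error) (status int, retryAfter int) {
	switch {
	case errors.Is(err, errEtcdUnreachable):
		return http.StatusServiceUnavailable, defaultRetryAfterSeconds
	case errors.Is(err, errWatchExpired):
		return http.StatusGatewayTimeout, defaultRetryAfterSeconds
	default:
		// Anything unrecognized stays a generic 500 with no retry hint.
		return http.StatusInternalServerError, 0
	}
}

// writeError sends the mapped status and adds Retry-After when a delay applies.
func writeError(w http.ResponseWriter, err error) {
	status, retryAfter := statusFor(err)
	if retryAfter > 0 {
		w.Header().Set("Retry-After", strconv.Itoa(retryAfter))
	}
	http.Error(w, err.Error(), status)
}
```

Keeping the mapping in a pure function (`statusFor`) separate from the response writer makes the policy easy to test without an HTTP server.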
@@ -839,7 +839,10 @@ func isTextResponse(resp *http.Response) bool {
 // checkWait returns true along with a number of seconds if the server instructed us to wait
 // before retrying.
 func checkWait(resp *http.Response) (int, bool) {
-	if resp.StatusCode != errors.StatusTooManyRequests {
+	switch r := resp.StatusCode; {
+	// any 500 error code and 429 can trigger a wait
+	case r == errors.StatusTooManyRequests, r >= 500:
+	default:
 		return 0, false
 	}
 	i, ok := retryAfterSeconds(resp)
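To see the client-side check from the hunk above in isolation, here is a minimal standalone sketch using only the standard library. The helper `waitSeconds` is hypothetical, and the header parsing is an assumed stand-in for `retryAfterSeconds`, whose body is not shown in this diff. It handles only the delay-in-seconds form of Retry-After.

```go
package main

import (
	"net/http"
	"strconv"
)

// waitSeconds is a hypothetical pure helper: given a status code and the raw
// Retry-After header value, it reports whether and how long to wait.
func waitSeconds(status int, retryAfter string) (int, bool) {
	switch {
	// 429 and any status of 500 or above may carry a wait instruction.
	case status == http.StatusTooManyRequests, status >= 500:
	default:
		return 0, false
	}
	// Assumed parsing: only integer seconds are accepted; a missing,
	// negative, or HTTP-date value means "no wait".
	seconds, err := strconv.Atoi(retryAfter)
	if err != nil || seconds < 0 {
		return 0, false
	}
	return seconds, true
}

// checkWait has the same signature as the function in the diff and
// delegates to waitSeconds.
func checkWait(resp *http.Response) (int, bool) {
	return waitSeconds(resp.StatusCode, resp.Header.Get("Retry-After"))
}
```

With this shape, a 503 carrying `Retry-After: 2` yields a 2-second wait, while a 5xx without the header or any 4xx other than 429 yields no wait, so the caller falls back to its normal error handling.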