kubernetes

Files

Kubernetes Submit Queue bc7ccfe93b Merge pull request #50106 from julia-stripe/improve-scheduler-error-handling

Automatic merge from submit-queue

Retry scheduling pods after errors more consistently in scheduler

**What this PR does / why we need it**:

This fixes 2 places in the scheduler where pods can get stuck in Pending forever.  In both these places, errors happen and `sched.config.Error` is not called afterwards. This is a problem because `sched.config.Error` is responsible for requeuing pods to retry scheduling when there are issues (see [here](2540b333b2/plugin/pkg/scheduler/factory/factory.go (L958))), so if we don't call `sched.config.Error` then the pod will never get scheduled (unless the scheduler is restarted).

One of these (where it returns when `ForgetPod` fails instead of continuing and reporting an error) is a regression from [this refactor](https://github.com/kubernetes/kubernetes/commit/ecb962e6585#diff-67f2b61521299ca8d8687b0933bbfb19L234), and with the [old behavior](80f26fa8a8/plugin/pkg/scheduler/scheduler.go (L233-L237)) the error was reported correctly. As far as I can tell changing the error handling in that refactor wasn't intentional.

When AssumePod fails there's never been an error reported but I think adding this will help the scheduler recover when something goes wrong instead of letting pods possibly never get scheduled.

This will help prevent issues like https://github.com/kubernetes/kubernetes/issues/49314 in the future.

**Release note**:

```release-note
Fix incorrect retry logic in scheduler
```

2017-08-07 01:35:17 -07:00

admission

update generated deepcopy code

2017-07-31 22:33:00 +08:00

auth

Merge pull request #46685 from xilabao/fix-err-message-in-namespace_policy

2017-08-03 23:59:05 -07:00

scheduler

Merge pull request #50106 from julia-stripe/improve-scheduler-error-handling

2017-08-07 01:35:17 -07:00