Kubernetes Submit Queue 6de28fab7d Merge pull request #42942 from vishh/gpu-cont-fix (2017-03-14 10:19:17 -07:00)
Automatic merge from submit-queue (batch tested with PRs 42942, 42935)

[Bug] Handle container restarts and avoid using runtime pod cache while allocating GPUs

Fixes #42412

**Background**
Support for multiple GPUs is an experimental feature in v1.6. 
Container restarts were handled incorrectly, which resulted in GPUs being stranded.
The kubelet was also incorrectly using the runtime cache to track running pods, which can lead to race conditions (as it has in other parts of the kubelet). This can result in the same GPU being assigned to multiple pods.

**What does this PR do**
This PR tracks the assignment of GPUs to containers and, on container restart, returns the previously allocated GPUs instead of (incorrectly) allocating new ones.
The GPU manager is updated to consume a list of active pods derived from the apiserver cache instead of the runtime cache.
The node e2e suite has been extended to cover this failure scenario.
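
For illustration, a minimal sketch of the per-container bookkeeping described above. The `podGPUs` type, field names, and `allocate` signature here are illustrative assumptions, not the actual kubelet GPU manager API:

```go
// Sketch: remember which devices were already handed to a container so a
// restart reuses them instead of stranding them and allocating new ones.
package gpu

import (
	"fmt"
	"sync"
)

// podGPUs tracks device assignments keyed by pod UID and container name.
type podGPUs struct {
	mu          sync.Mutex
	assignments map[string]map[string][]string // podUID -> container -> device paths
}

func newPodGPUs() *podGPUs {
	return &podGPUs{assignments: map[string]map[string][]string{}}
}

// allocate returns the devices previously assigned to the container if any
// exist (e.g. the container is restarting); otherwise it takes the requested
// number of devices from the free list and records the assignment.
func (p *podGPUs) allocate(podUID, container string, free []string, n int) ([]string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if prev, ok := p.assignments[podUID][container]; ok {
		// Container restart: reuse the existing assignment.
		return prev, nil
	}
	if len(free) < n {
		return nil, fmt.Errorf("requested %d GPUs, only %d free", n, len(free))
	}
	devs := append([]string(nil), free[:n]...)
	if p.assignments[podUID] == nil {
		p.assignments[podUID] = map[string][]string{}
	}
	p.assignments[podUID][container] = devs
	return devs, nil
}
```

In this sketch the free list would be computed from the set of active pods supplied by the apiserver cache, so the GPU manager never has to consult the runtime cache.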

**Risk**
Minimal/none, since GPU support is an experimental feature that is turned off by default. The change is also isolated to the GPU manager in the kubelet.

**Workarounds**
In the absence of this PR, users can mitigate the original issue by setting `RestartPolicyNever` in their pods (see the sketch below).
There is, however, no workaround for the race condition caused by using the runtime cache.
Hence it is worth including this fix in v1.6.0.
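
A sketch of that workaround, assuming current `k8s.io/api` package paths (at the time of this PR the types lived under client-go); the pod name and image are illustrative:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// gpuPod builds a pod whose containers are never restarted in place, so a
// crash cannot trigger a second (stranding) GPU allocation.
func gpuPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "cuda-job"},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:  "cuda",
				Image: "nvidia/cuda",
			}},
		},
	}
}

func main() {
	fmt.Println(gpuPod().Spec.RestartPolicy) // prints "Never"
}
```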

cc @jianzhangbjz @seelam @kubernetes/sig-node-pr-reviews 

Replaces #42560