kubernetes

Author	SHA1	Message	Date
David Porter	e1a951afe5	Fix COS GPU driver installation * Rely on the built-in GPU driver installer in COS as recommended in public docs - https://cloud.google.com/container-optimized-os/docs/how-to/run-gpus * Run `nvidia-smi` after installation to verify installation	2021-10-28 17:49:50 -07:00
Danielle Lancashire	0cc8af82a1	e2e_node: use upstream gpu installer The current GPU installer was built in 2017, from source that no longer exists in Kubernetes ([adding commit][1]). The image was built on 2017-06-13. Unfortunately, this installer no longer appears to work. When debugging on the same node type as used by test-infra, it failed to build the driver as the kernel sha was no longer available. This led to needing to find a new way to install GPUs. The smallest logical change was switching to [cos-gpu-installer][2]. There is a newer version of this available on [googlesource][3] that I have not yet tested as it's not clear what the state of the project is, as I couldn't find docs outside of the source itself. We install things to the same location as previously to avoid needing extra downstream changes. There are a couple of weird issues here however, like needing to run the container twice to correctly update the LD Cache. [1]: `1e77594958/cluster/gce/gci/nvidia-gpus/Dockerfile` [2]: https://github.com/GoogleCloudPlatform/cos-gpu-installer [3]: https://cos.googlesource.com/cos/tools/+/refs/heads/master/src/cmd/cos_gpu_installer/	2021-08-26 14:09:45 +02:00
Tim Hockin	3586986416	Switch to k8s.gcr.io vanity domain This is the 2nd attempt. The previous was reverted while we figured out the regional mirrors (oops). New plan: k8s.gcr.io is a read-only facade that auto-detects your source region (us, eu, or asia for now) and pulls from the closest. To publish an image, push k8s-staging.gcr.io and it will be synced to the regionals automatically (similar to today). For now the staging is an alias to gcr.io/google_containers (the legacy URL). When we move off of google-owned projects (working on it), then we just do a one-time sync, and change the google-internal config, and nobody outside should notice. We can, in parallel, change the auto-sync into a manual sync - send a PR to "promote" something from staging, and a bot activates it. Nice and visible, easy to keep track of.	2018-02-07 21:14:19 -08:00
Tim Hockin	e9dd8a68f6	Revert k8s.gcr.io vanity domain This reverts commit `eba5b6092a`. Fixes https://github.com/kubernetes/kubernetes/issues/57526	2017-12-22 14:36:16 -08:00
Tim Hockin	eba5b6092a	Use k8s.gcr.io vanity domain for container images	2017-12-18 09:18:34 -08:00
zouyee	68c5ce19b8	[test/e2e_node]Redirect dl.k8s.io to the kubernetes-release GCS bucket	2017-11-02 12:18:50 +08:00
Rohit Agarwal	9c0bf19f80	Use cos-stable-59-9460-60-0 and newer installer for GPU node e2e tests.	2017-06-13 15:36:20 -07:00
Rohit Agarwal	4a5badfafa	Move the nvidia installer to the beginning. When the installer runs for the first time, it disables loadpin and restarts the node. So, it is better to run it in the beginning so that we can avoid redoing the later steps. One of the later steps includes downloading a tar file and untarring it. Doing that only once saves around 1m30s in test runtime for the gci image.	2017-06-08 09:55:14 -07:00
Vishnu kannan	d45286c575	update cos kernel sha for node e2e GPU installer Signed-off-by: Vishnu kannan <vishnuk@google.com>	2017-06-01 17:09:18 -07:00
Vishnu kannan	1e77594958	Adding an installer script that installs Nvidia drivers in Container Optimized OS. Packaged the script as a docker container stored in gcr.io/google-containers. A daemonset deployment is included to make it easy to consume the installer. A cluster e2e has been added to test the installation daemonset along with verifying installation by using a sample CUDA application. Node e2e for GPUs updated to avoid running on nodes without GPU devices. Signed-off-by: Vishnu kannan <vishnuk@google.com>	2017-05-20 21:17:19 -07:00

10 Commits