All images used by e2e tests must use templates in order to allow
relocation. In addition, this is hitting Docker Hub, which will be
getting throttled soon.
Updates sig-scheduling e2e Nvidia GPU tests to install drivers using
a local manifest by default. Currently the DaemonSet is fetched from the
GoogleCloudPlatform/container-engine-accelerators repo by default.
Using a local manifest allows for manually specifying the
cos-gpu-installer image rather than always using latest. A remote
manifest can still be fetched by setting the
NVIDIA_DRIVER_INSTALLER_DAEMONSET env var.
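
The selection between local and remote manifest could look roughly like
the sketch below. It is illustrative only: the localManifest path and the
manifestSource helper are hypothetical names, not the ones used in the
e2e framework. Only the env var name comes from this change.

```go
package main

import (
	"fmt"
	"os"
)

// localManifest is a hypothetical path to the manifest shipped with the tests.
const localManifest = "testing-manifests/scheduling/nvidia-driver-installer.yaml"

// manifestSource reports where the driver installer DaemonSet should be
// loaded from: the override URL if one is set, otherwise the local file.
func manifestSource(override string) (source string, remote bool) {
	if override != "" {
		return override, true
	}
	return localManifest, false
}

func main() {
	src, remote := manifestSource(os.Getenv("NVIDIA_DRIVER_INSTALLER_DAEMONSET"))
	fmt.Printf("manifest=%s remote=%t\n", src, remote)
}
```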
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Copy csi-hostpath driver manifests from
kubernetes-csi/csi-driver-host-path. This bumps the version of all images to
the releases shipped alongside Kubernetes 1.18.
The mock driver gets instructed to return a ResourceExhausted error
for the first CreateVolume invocation via the storage class
parameters.
How this should be handled depends on the situation: for normal
volumes, we just want external-scheduler to retry. For late binding,
we want to reschedule the pod. It also depends on topology support.
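
The "fail the first call, then retry" case for normal volumes can be
simulated as in the sketch below. This is not the mock driver's code:
mockDriver, createWithRetry and errResourceExhausted are hypothetical
stand-ins, and a plain Go error replaces the gRPC ResourceExhausted
status. Rescheduling for late binding is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// errResourceExhausted stands in for the gRPC ResourceExhausted status code.
var errResourceExhausted = errors.New("ResourceExhausted")

// mockDriver fails the first `failures` CreateVolume calls, then succeeds.
type mockDriver struct {
	failures int
	calls    int
}

func (d *mockDriver) CreateVolume(name string) (string, error) {
	d.calls++
	if d.calls <= d.failures {
		return "", errResourceExhausted
	}
	return "vol-" + name, nil
}

// createWithRetry retries CreateVolume while the driver reports exhaustion,
// giving up after maxAttempts calls. Any other error is returned immediately.
func createWithRetry(d *mockDriver, name string, maxAttempts int) (string, error) {
	var lastErr error
	for i := 0; i < maxAttempts; i++ {
		id, err := d.CreateVolume(name)
		if err == nil {
			return id, nil
		}
		if !errors.Is(err, errResourceExhausted) {
			return "", err
		}
		lastErr = err
	}
	return "", fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	d := &mockDriver{failures: 1}
	id, err := createWithRetry(d, "pvc-1", 3)
	fmt.Println(id, err, d.calls)
}
```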
Especially related to "uncertain" global mounts. A large refactoring of the CSI
mock tests was necessary:
- to be able to script the driver to return errors as required by the test
- to parse the CSI driver logs to check that kubelet made the right CSI calls
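
Checking the calls from the driver log can be sketched as below. It
assumes the log contains lines of the form "GRPC call: <method>"; the
grpcCalls helper and the sample log are made up for illustration, and the
parsing in the actual tests may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// marker precedes the method name in each logged call (assumed format).
const marker = "GRPC call: "

// grpcCalls returns the method names that follow the marker, in log order.
func grpcCalls(log string) []string {
	var calls []string
	for _, line := range strings.Split(log, "\n") {
		if i := strings.Index(line, marker); i >= 0 {
			calls = append(calls, strings.TrimSpace(line[i+len(marker):]))
		}
	}
	return calls
}

func main() {
	// Sample log lines, invented for this example.
	log := `I1028 13:25:19.937846 1 server.go:117] GRPC call: /csi.v1.Controller/CreateVolume
I1028 13:25:19.938083 1 server.go:118] GRPC request: {}
I1028 13:25:21.000000 1 server.go:117] GRPC call: /csi.v1.Node/NodeStageVolume`
	for _, c := range grpcCalls(log) {
		fmt.Println(c)
	}
}
```

A test can then compare the returned slice against the sequence of calls
it expects kubelet to have made.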
When deleting fails, the tests should be considered as failed,
too. Ignoring the error caused a wrong return code in the CSI mock
driver to go unnoticed (see
https://github.com/kubernetes-csi/csi-test/pull/250). The v3.1.0
release of the CSI mock driver fixes that.
On systems with SELinux enabled, non-privileged containers can't access
data of privileged containers. Since the CSI driver socket is exposed
by a privileged container, all sidecars must be privileged too.
- add csi pd driver manifests
- modify snapshottable test case
- fix tests: for a delay-binding PVC the pod has to be created first, otherwise the PVC won't be bound
The redis version has been bumped to version 5.0.5, but the maximum version supported on
Windows is 3.2. This can lead to failing tests, as the output and behaviour can be different
(see #80516). In order to prevent such failures, the number of times the Redis image is
used can be reduced.
This commit uses the previously added agnhost guestbook subcommand as a replacement for the
Guestbook application created by the test "should create and stop a working application".
Adds AgnhostPrivate to test/utils/image/manifest. Some tests are trying to pull
the agnhost image from the private registry, meaning that we would need to
always build and push the agnhost image to both e2e and private registry
whenever we bump its version. Decoupling them would mean that we only need
to push the image to the e2e registry.
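
The decoupling can be pictured with the simplified sketch below. The
imageConfig type, the registry hosts and the version numbers are
placeholders for illustration, not the contents of
test/utils/image/manifest.

```go
package main

import "fmt"

// imageConfig is a hypothetical, simplified description of a test image.
type imageConfig struct {
	registry string
	name     string
	version  string
}

// ref renders the full image reference as registry/name:version.
func (c imageConfig) ref() string {
	return fmt.Sprintf("%s/%s:%s", c.registry, c.name, c.version)
}

func main() {
	// Two independent entries: bumping the e2e version does not
	// require pushing a new image to the private registry.
	agnhost := imageConfig{"e2e.example.com", "agnhost", "2.0"}
	agnhostPrivate := imageConfig{"private.example.com", "agnhost", "1.0"}
	fmt.Println(agnhost.ref())
	fmt.Println(agnhostPrivate.ref())
}
```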
This is required for promoting volume limits to GA. The new version of
the driver reports the max number of volumes it supports. This number
should be specified as a CLI argument when starting the driver.
This is needed for raw block volumes. It mirrors a change made in the upstream
deployment in https://github.com/kubernetes-csi/csi-driver-host-path/pull/109
Raw block volumes use loop devices under the hood. "losetup --find
--show" uses LOOP_CTL_GET_FREE to get a free loop device. It then
expects to have the corresponding /dev/loopX already available. When
/dev inside the container is a static tmpfs which doesn't already have
those /dev/loop* devices (*) the new device fails to show up,
resulting in:
I1028 13:25:19.937846 1 server.go:117] GRPC call: /csi.v1.Controller/CreateVolume
I1028 13:25:19.938083 1 server.go:118] GRPC request: {"accessibility_requirements":{"preferred":[{"segments":{"topology.hostpath.csi/node":"pmem-csi-pmem-govm-worker3"}}],"requisite":[{"segments":{"topology.hostpath.csi/node":"pmem-csi-pmem-govm-worker3"}}]},"capacity_range":{"required_bytes":5368709120},"name":"pvc-24985a49-5638-4bf6-b789-bb99a28d1073","volume_capabilities":[{"AccessType":{"Block":{}},"access_mode":{"mode":1}}]}
I1028 13:25:19.961124 1 volume_path_handler_linux.go:41] Creating device for path: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004
I1028 13:25:20.391472 1 volume_path_handler_linux.go:75] Failed device create command for path: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004 exit status 1 losetup: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004: failed to set up loop device: No such file or directory
E1028 13:25:20.392916 1 server.go:121] GRPC error: rpc error: code = Internal desc = failed to create volume 635c6569-f986-11e9-baa6-0242ac110004: failed to attach device /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004: exit status 1
(*) It seems that the static tmpfs gets populated by Docker based on
what's currently on the host when the container starts. That would
explain why it worked in the Kubernetes Prow testing - the host must
have had enough loop devices already defined.
This updates to the releases meant to be used with Kubernetes 1.16
except for external-snapshotter, which is kept at the more recent
2.0.0-rc1 which targets 1.17.
The new external-attacher v2.0.0 needs updated RBAC rules, copied
verbatim from the v2.0.0 release.