Ensure resources are created in zones with schedulable
nodes. For example, if we have 4 zones, with 3 zones
having worker nodes and 1 zone having only master nodes (unschedulable
for workloads), we should not create resources like PVs, PVCs, or
pods in that zone.
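A minimal sketch of how such a zone filter could look with client-go, assuming the standard topology label; the helper name `schedulableZones` is hypothetical and not part of the actual change:

```go
package storage

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/sets"
	"k8s.io/client-go/kubernetes"
)

// schedulableZones returns the zones that contain at least one node which is
// schedulable for ordinary workloads, i.e. not cordoned and not tainted with
// NoSchedule/NoExecute (as master-only zones typically are).
func schedulableZones(ctx context.Context, cs kubernetes.Interface) (sets.String, error) {
	zones := sets.NewString()
	nodes, err := cs.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	for _, node := range nodes.Items {
		if node.Spec.Unschedulable || hasNoScheduleTaint(&node) {
			continue
		}
		if zone, ok := node.Labels[v1.LabelTopologyZone]; ok {
			zones.Insert(zone)
		}
	}
	return zones, nil
}

// hasNoScheduleTaint reports whether the node repels normal workloads.
func hasNoScheduleTaint(node *v1.Node) bool {
	for _, t := range node.Spec.Taints {
		if t.Effect == v1.TaintEffectNoSchedule || t.Effect == v1.TaintEffectNoExecute {
			return true
		}
	}
	return false
}
```

Tests can then pick zones from this set when creating PVs, PVCs, and pods.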
We're running the ubernetes test
`should only be allowed to provision PDs in zones
where nodes exist`
on GCP and GKE. While the test is useful for exercising
the scenario of identifying an extra zone and
creating a node in it, not every Kubernetes
distribution uses the same approach to create a node.
Furthermore, even if there is an extra zone, we cannot
guarantee that the zone has enough quota. There can also
be other GCP-specific edge cases, not all of which can be
covered within this test. So, we are removing the test,
as agreed upon with the storage team.
This test verifies an implementation detail in the in-tree gcepd
plugin. The behavior is not implemented in the gcepd CSI driver
and therefore the test will be obsolete after CSI migration.
Some CSI drivers can't clone a volume into another topology segment (e.g. a
cloud availability zone). The scheduler does not know about these
restrictions and schedules pods with PVCs that clone a volume more or less
randomly.
Run all volume cloning tests in the same topology segment, if such a segment
is available and has at least one schedulable node.
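A hedged sketch of how a test could pin its pods (and, with a `WaitForFirstConsumer` StorageClass, the cloned PVCs bound to them) to one segment via the standard zone label; `pinToZone` is an illustrative helper, not the actual test code:

```go
package storage

import v1 "k8s.io/api/core/v1"

// pinToZone restricts a pod to a single topology segment by selecting on the
// well-known zone label. With delayed (WaitForFirstConsumer) binding, the
// source volume and its clone are then provisioned in the same segment.
func pinToZone(pod *v1.Pod, zone string) {
	if pod.Spec.NodeSelector == nil {
		pod.Spec.NodeSelector = map[string]string{}
	}
	pod.Spec.NodeSelector[v1.LabelTopologyZone] = zone
}
```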
The previous approach of grabbing via an nginx proxy had some
drawbacks:
- it did not work when the pods only listened on localhost (as
  configured by kubeadm) and the proxy got deployed on a different
  node
- starting the proxy raced with starting the pods, causing
  sporadic test failures because the proxy was not set up
  properly unless it saw all pods when e2e.test started
- the proxy was always started, whether it was needed or not
- the proxy was left running after a test, and the next
  test run then triggered potentially confusing messages when
  it failed to create objects for the proxy
The new approach is similar to "kubectl port-forward" + "kubectl get
--raw". It uses the port forwarding feature to establish a TCP
connection via a custom dialer, then lets client-go handle TLS and
credentials.
For some reason, verifying the server certificate did not work. As this
shouldn't be a big concern for E2E testing, certificate checking gets
disabled on the client side instead of investigating this further.
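A rough sketch of the client-side setup under these assumptions; the port-forward dialer itself is left abstract and `newForwardedClientConfig` is a hypothetical helper, not the actual implementation:

```go
package storage

import (
	"context"
	"net"

	"k8s.io/client-go/rest"
)

// newForwardedClientConfig returns a REST config whose TCP connections go
// through the given dialer (e.g. one that opens a port-forward stream to the
// pod), while client-go still handles TLS and credentials. Server certificate
// verification is disabled, matching the approach described above.
func newForwardedClientConfig(base *rest.Config, dial func(ctx context.Context, network, addr string) (net.Conn, error)) *rest.Config {
	cfg := rest.CopyConfig(base)
	cfg.Dial = dial
	// Insecure requires that no CA data is configured.
	cfg.TLSClientConfig.Insecure = true
	cfg.TLSClientConfig.CAFile = ""
	cfg.TLSClientConfig.CAData = nil
	return cfg
}
```

A clientset built from such a config can then issue ordinary raw GET requests, with the connection itself running over the forwarded stream.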
This reverts commit
c15fd76ee9. Most (all?) of the hostpath
tests and several other tests started to fail again in
gce-scale-master-correctness after re-enabling the controller. This
shows that it was not just the obsolete agent that caused the scalability
problems, but also the controller.
It has to be disabled until the scalability problems are addressed.
This makes sure that test cases which cannot run locally are skipped
instead of failing with a misleading error message.
Signed-off-by: Dave Chen <dave.chen@arm.com>
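A minimal sketch of such a guard, assuming the e2e framework's skipper helpers; the provider check is only an example of a condition that cannot be satisfied locally:

```go
package storage

import (
	"k8s.io/kubernetes/test/e2e/framework"
	e2eskipper "k8s.io/kubernetes/test/e2e/framework/skipper"
)

// skipIfLocal skips the current test case instead of letting it fail later
// with a misleading error message when running against a local cluster.
func skipIfLocal() {
	if framework.TestContext.Provider == "local" {
		e2eskipper.Skipf("this test requires a cloud provider and cannot run locally")
	}
}
```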
CSI drivers need to pass special mount options to an XFS filesystem to be able to
mount a volume together with its clone or its restored snapshot on the same node.
Add a test to exhibit this behavior.
The test is optional for now, giving CSI drivers time to fix it.
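For XFS, the special option is typically `nouuid`. A sketch of a StorageClass that a driver's test deployment might use; the name and parameters are illustrative:

```go
package storage

import (
	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// xfsCloneStorageClass returns an example StorageClass that formats volumes
// with XFS and passes "nouuid", so that a volume and its clone or restored
// snapshot (which share the filesystem UUID) can be mounted on the same node.
func xfsCloneStorageClass(provisioner string) *storagev1.StorageClass {
	binding := storagev1.VolumeBindingWaitForFirstConsumer
	return &storagev1.StorageClass{
		ObjectMeta:        metav1.ObjectMeta{Name: "xfs-nouuid"},
		Provisioner:       provisioner,
		Parameters:        map[string]string{"csi.storage.k8s.io/fstype": "xfs"},
		MountOptions:      []string{"nouuid"},
		VolumeBindingMode: &binding,
	}
}
```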
These tests have been flaky for a long time, with a relatively low
rate of flakes. Nonetheless it seems better to extend the timeouts to
reduce the flakiness.
It was disabled together with the agent to avoid test failures in
gce-master-scale-correctness (https://github.com/kubernetes/kubernetes/issues/102452). That
solved the problem, but we still need to check whether the controller
alone works.
They are not needed for any of the tests and may be causing too much
overhead (see
https://github.com/kubernetes/kubernetes/issues/102452#issuecomment-854452816).
We already disabled them earlier and then re-enabled them again
because it wasn't clear how much overhead they were causing. A recent
change in how the sidecars get
deployed (https://github.com/kubernetes/kubernetes/pull/102282) seems
to have made the situation worse again. There's no logical explanation
for that yet, though.
(cherry picked from commit 0c2cee5676e64976f9e767f40c4c4750a8eeb11f)
As seen in https://github.com/kubernetes/kubernetes/issues/102452, we
currently don't have pod events for the CSI driver pods because of the
different namespace and would need them to determine whether the
driver gets evicted.
Previously, only changes to the pods were logged. Perhaps even more
interesting are the events in the namespace.
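A rough sketch of watching those namespace events with client-go, in addition to the existing pod watch; `WatchNamespaceEvents` is an illustrative name, not the actual helper:

```go
package storage

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// WatchNamespaceEvents prints all events in the given namespace (e.g. the one
// the CSI driver pods run in) until the context is canceled, so that evictions
// and scheduling problems show up in the test output.
func WatchNamespaceEvents(ctx context.Context, cs kubernetes.Interface, ns string) error {
	watcher, err := cs.CoreV1().Events(ns).Watch(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	go func() {
		defer watcher.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case ev, ok := <-watcher.ResultChan():
				if !ok {
					return
				}
				fmt.Printf("event in namespace %s: %v\n", ns, ev.Object)
			}
		}
	}()
	return nil
}
```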