Refactor the code related to creating an internal type load balancer in the e2e tests for network load balancers. The modification removes the check for the "azure" provider and updates it to only check for "gke" and "gce" providers. This change ensures that the test only runs when the cluster is using "gke" or "gce" as the provider. The counterpart test is in the out-of-tree cloud provider azure.
This moves adding a pod to ReservedFor out of the main scheduling cycle into
PreBind. There it is done concurrently in different goroutines. For claims
which were specifically allocated for a pod (the most common case), that
usually makes no difference because the claim is already reserved.
It starts to matter when that pod then cannot be scheduled for other reasons,
because then the claim gets unreserved to allow deallocating it. It also
matters for claims that are created separately and then get used multiple times
by different pods.
Because multiple pods might get added to the same claim rapidly and independently
of each other, it makes sense to do all claim status updates via patching:
then it is no longer necessary to have an up-to-date copy of the claim because
the patch operation will succeed if (and only if) the patched claim is valid.
Server-side-apply cannot be used for this because a client always has to send
the full list of all entries that it wants to be set, i.e. it cannot add one
entry unless it knows the full list.
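A minimal sketch of why a patch does not need the full list: the body below mentions only the one entry to add. The `consumer` type, the `reservePatch` helper and the strategic-merge-style body keyed by `uid` are assumptions made for illustration; the text above does not say which patch type is used.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// consumer is a hypothetical stand-in for one ReservedFor entry.
type consumer struct {
	Resource string `json:"resource"`
	Name     string `json:"name"`
	UID      string `json:"uid"`
}

// reservePatch builds a patch body that contains only the entry to add.
// The caller does not need a current copy of the claim or of the other entries.
func reservePatch(podName, podUID string) ([]byte, error) {
	patch := map[string]any{
		"status": map[string]any{
			"reservedFor": []consumer{{Resource: "pods", Name: podName, UID: podUID}},
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := reservePatch("my-pod", "1234")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Whether the server merges the entry into the existing list depends on the patch type and on how the list is declared in the API, which is outside the scope of this sketch.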
* put var in local
Signed-off-by: husharp <jinhao.hu@pingcap.com>
* revert gomod
ginkgo.GinkgoHelper is a recent addition to ginkgo which allows functions to
mark themselves as helpers. This then changes which callstack gets reported for
failures. It makes sense to support the same mechanism also for logging.
There's also no reason why framework.Logf should produce output that is in a
different format than klog log entries. Having time stamps formatted
differently makes it hard to read test output which uses a mixture of both.
Another user-visible advantage is that the error log entry from
framework.ExpectNoError now references the test source code.
With textlogger there is a simple replacement for klog that can be reconfigured
to let the caller handle stack unwinding. klog itself doesn't support that
and should be modified to support it (feature freeze).
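The idea of letting the caller handle stack unwinding can be sketched with the standard library alone. This is not the framework's implementation: `markHelper`, `callSite`, `expectReady` and `testBody` are hypothetical names, and the helper registry is a simplification of what ginkgo.GinkgoHelper does.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

var (
	mu      sync.Mutex
	helpers = map[string]bool{} // names of functions that marked themselves as helpers
)

// stackFrom returns the call stack, beginning skip frames above its own caller.
func stackFrom(skip int) *runtime.Frames {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(skip+2, pcs)
	return runtime.CallersFrames(pcs[:n])
}

// markHelper records the calling function as a helper.
func markHelper() {
	frame, _ := stackFrom(1).Next()
	mu.Lock()
	defer mu.Unlock()
	helpers[frame.Function] = true
}

// callSite returns the short name of the nearest caller that is not a helper.
func callSite() string {
	frames := stackFrom(1)
	for {
		frame, more := frames.Next()
		mu.Lock()
		isHelper := helpers[frame.Function]
		mu.Unlock()
		if !isHelper || !more {
			return frame.Function[strings.LastIndex(frame.Function, ".")+1:]
		}
	}
}

// expectReady is a helper: a log entry should be attributed to its caller.
func expectReady() string {
	markHelper()
	return callSite()
}

// testBody plays the role of the test source code that calls the helper.
func testBody() string {
	return expectReady()
}

func main() {
	fmt.Println(testBody()) // reports testBody, not expectReady
}
```

A logger that receives the resolved call site can then print the file and line of the test instead of the helper.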
Emitting printf-style output via that logger would work, but become less
readable because the message string would get quoted instead of printing it
verbatim as before. So instead, the traditional klog header gets reproduced
in the framework code. In this example, the first line is from klog, the second
from Logf:
I0111 11:00:54.088957 332873 factory.go:193] Registered Plugin "containerd"
...
I0111 11:00:54.987534 332873 util.go:506] >>> kubeConfig: /var/run/kubernetes/admin.kubeconfig
Indentation is a bit different because the initial output is printed before
installing the logger which writes through ginkgo.GinkgoWriter.
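Reproducing the header by hand could look roughly like the sketch below. The `klogLine` and `exampleLine` functions are hypothetical, and the width used for the process ID is an assumption, which may be one reason for the slightly different indentation.

```go
package main

import (
	"fmt"
	"time"
)

// klogLine formats an info entry with a klog-style header: severity letter,
// month and day, time with microseconds, process ID, file:line and "]".
// The message is produced by fmt.Sprintf and printed verbatim, without quoting.
func klogLine(now time.Time, pid int, file string, line int, format string, args ...any) string {
	msg := fmt.Sprintf(format, args...)
	return fmt.Sprintf("I%s %7d %s:%d] %s\n", now.Format("0102 15:04:05.000000"), pid, file, line, msg)
}

// exampleLine rebuilds the Logf line from the example above with fixed inputs.
func exampleLine() string {
	now := time.Date(2024, time.January, 11, 11, 0, 54, 987534000, time.UTC)
	return klogLine(now, 332873, "util.go", 506, ">>> kubeConfig: %s", "/var/run/kubernetes/admin.kubeconfig")
}

func main() {
	fmt.Print(exampleLine())
}
```

Because the format string is passed to fmt.Sprintf unchanged, a wrapper written this way stays visible to printf-style static checks.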
One welcome side effect is that now "go vet" detects mismatched parameters for
framework.Logf because fmt.Sprintf is called without mangling the format
string. Some of the calls were incorrect.
A stand-alone binary shouldn't import the test/e2e/framework, which is targeted
towards usage in a Ginkgo test suite. This currently works, but will break once
test/e2e/framework becomes more opinionated about how to configure logging.
The simplest solution is to duplicate the one short function that the binary
was calling in the framework.
Now that we have it (8a89a1f5a5), let's also make sure that
the new WithFlaky is used everywhere instead of [Flaky]. This way it can be
used for filtering by label.
Using klog.Fatal to abort a test leads to a poor user experience because the
output is buffered in ginkgo.GinkgoWriter and not flushed before killing the
process. The output is also different from other failures. Using the normal
error checking is better.
Before:
$ KUBECONFIG=/no/such/config go test -v ./test/e2e/
Jan 19 10:06:58.475: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
=== RUN TestE2E
I0119 10:06:58.475844 99472 e2e.go:109] Starting e2e run "5303f626-ae0e-44d7-abf1-b4956d910ef4" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /nvme/gopath/src/k8s.io/kubernetes/test/e2e
=================================================================================
Random Seed: 1705655217 - will randomize all specs
Will run 4678 of 7421 specs
goroutine 817 [running]:
k8s.io/klog/v2/internal/dbg.Stacks(0x0)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/internal/dbg/dbg.go:35 +0x85
k8s.io/klog/v2.(*loggingT).output(0x9d92b20, 0x3, 0x0, 0xc00069d7a0, 0x2, {0x834c6e8?, 0x9d91c80?}, 0x300000060?, 0x0)
...
k8s.io/klog/v2.Fatal(...)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1652
k8s.io/kubernetes/test/e2e.setupSuite({0x7fb49064c078, 0xc003072360})
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:187 +0x125
...
FAIL k8s.io/kubernetes/test/e2e 0.759s
FAIL
After:
$ KUBECONFIG=/no/such/config go test -v ./test/e2e/
Jan 19 10:12:58.889: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
=== RUN TestE2E
I0119 10:12:58.889224 106019 e2e.go:109] Starting e2e run "bed5a77a-f595-42d0-b512-5f601067444b" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /nvme/gopath/src/k8s.io/kubernetes/test/e2e
=================================================================================
Random Seed: 1705655578 - will randomize all specs
Will run 4678 of 7421 specs
------------------------------
[SynchronizedBeforeSuite] [FAILED] [0.001 seconds]
[SynchronizedBeforeSuite]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:69
Timeline >>
Jan 19 10:12:59.063: INFO: >>> kubeConfig: /no/such/config
Jan 19 10:12:59.063: INFO: Unexpected error: Error loading client:
<*errors.errorString | 0xc00182c130>:
error creating client: error loading KubeConfig: open /no/such/config: no such file or directory
{
s: "error creating client: error loading KubeConfig: open /no/such/config: no such file or directory",
}
[FAILED] in [SynchronizedBeforeSuite] - /nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186 @ 01/19/24 10:12:59.064
<< Timeline
[FAILED] Error loading client: error creating client: error loading KubeConfig: open /no/such/config: no such file or directory
In [SynchronizedBeforeSuite] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186 @ 01/19/24 10:12:59.064
------------------------------
Summarizing 1 Failure:
[FAIL] [SynchronizedBeforeSuite]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186
Ran 0 of 7421 Specs in 0.001 seconds
FAIL! -- A BeforeSuite node failed so all tests were skipped.
--- FAIL: TestE2E (0.18s)
FAIL
FAIL k8s.io/kubernetes/test/e2e 0.769s
FAIL
This replaces the klog formatting and message routing with a simpler
implementation that uses less code. The main difference is that we skip the
entire unused message routing.
Instead, the same split output streams as for JSON get implemented in the
io.Writer implementation that gets passed to the textlogger.
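A minimal sketch of such a writer, assuming that the split is between info entries and everything else and that each Write call carries exactly one entry with a klog-style header. The `splitWriter` type and the sample entries are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// splitWriter routes each entry by the severity letter that starts its header.
type splitWriter struct {
	info io.Writer // receives entries starting with 'I'
	errs io.Writer // receives everything else (warnings, errors)
}

// Write assumes that it is called once per complete log entry.
func (w splitWriter) Write(entry []byte) (int, error) {
	if len(entry) > 0 && entry[0] == 'I' {
		return w.info.Write(entry)
	}
	return w.errs.Write(entry)
}

// demo writes one info and one error entry and returns what each stream got.
func demo() (info, errs string) {
	var infoBuf, errBuf bytes.Buffer
	w := splitWriter{info: &infoBuf, errs: &errBuf}
	io.WriteString(w, "I0119 10:12:58.889224 1 example.go:1] starting\n")
	io.WriteString(w, "E0119 10:12:58.889300 1 example.go:2] failed\n")
	return infoBuf.String(), errBuf.String()
}

func main() {
	info, errs := demo()
	fmt.Printf("info stream: %q\nerror stream: %q\n", info, errs)
}
```

In a real program the two fields would typically be os.Stdout and os.Stderr.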
The dead code was found with:
deadcode -test -filter=k8s.io/kubernetes/test/e2e/framework/... ./test/e2e ./test/e2e_node ./test/e2e_node ./test/e2e_kubeadm
See https://go.dev/blog/deadcode for an introduction.
Only dead code which is clearly not needed anymore (glog logging),
questionable (skipping based on feature gates) or
redundant (WaitForPodSuccessInNamespaceSlow) gets removed for now. More
removals might make sense in the future.
While the benchmark is focused on encoding, it becomes a bit more realistic
when actually passing the encoded data to the Linux kernel. Features like
output buffering are more likely to have a visible effect when invoking
syscalls.
Add ready conditions to the Endpoints of the self-generated
EndpointSlice tests so that the readiness is not ambiguous and it will
work across CNIs that filter for ready endpoints.
The original logic always guaranteed the NodePort's value if one was already set. The NodePort should be allowed to be set to 0 if the Service has type LoadBalancer with AllocateLoadBalancerNodePorts=false.
EndpointSlices and Endpoints usually become ready pretty fast, but the
test always waited 5s before performing every check and it performed the
check 4 times in total, which unnecessarily extended the test by 20s.
The commit changes the poll function to perform a check before waiting,
and reduces the interval to 2 seconds to align with other EndpointSlice
tests. It reduces the test duration from 30s to 4s.
Signed-off-by: Quan Tian <qtian@vmware.com>
Some of the SELinux relabeling metrics got a new label with the volume plugin
in 1.29. Add the label to metrics scraping in the SELinux e2e tests.
I had to remove the check that all metrics were collected, because metrics with
the volume plugin label will start to exist only after an event that raises the
metric happens. They're missing in the initial metric grab.
This yaml file uses `docker.io/alpine/socat:1.7.4.3-r0`. Either we figure out
how to replace the image or we just eliminate the yaml itself if it is not
being used for testing anything in this repository.
Found this when running `e2e.test --list-images`: the dockerhub image reference
above shows up, which gives the false impression that we depend on this image
for our testing purposes. Also, we should NOT depend on a dockerhub image anyway!
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Removes kube-proxy specific proxy type detection and globally increases
the timeout for session affinity testing so that it works for more
use-cases by default (notably including IPVS).
Developers who are unaware of the Ginkgo wrappers in the framework might end up
passing the label decorators directly to Ginkgo. Previously, this led to an
error that was hard to understand without background knowledge:
Unknown Decorator
ginkgo.It("must deallocate on non graceful node shutdown", f.WithSerial(), f.WithDisruptive(), f.WithSlow(), func(ctx context.Context) {
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:527
[It] node was passed an unknown decorator:
'framework.label{parts:[]string{"Serial"}, extra:""}'
Learn more at: http://onsi.github.io/ginkgo/#node-decorators-overview
When including a special field that Ginkgo dumps, the message gets a bit better:
Unknown Decorator
ginkgo.It("must deallocate on non graceful node shutdown", f.WithSerial(), f.WithDisruptive(), f.WithSlow(), func(ctx context.Context) {
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:527
[It] node was passed an unknown decorator:
'framework.label{parts:[]string{"Serial"}, extra:"", explanation:"If you see
this as part of an \"Unknown Decorator\" error from Ginkgo, then you need to
replace the ginkgo.It/Context/Describe call with the corresponding
framework.It/Context/Describe or (if available) f.It/Context/Describe."}'
Learn more at: http://onsi.github.io/ginkgo/#node-decorators-overview
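The trick relies on the value being dumped with all of its fields, including unexported ones. A minimal sketch, assuming the dump behaves like fmt's `%#v` (the `label` type here is a local copy for illustration and the explanation text is shortened):

```go
package main

import (
	"fmt"
	"strings"
)

// label mirrors the shape shown in the error message above.
type label struct {
	parts       []string
	extra       string
	explanation string // never read by code, only there to show up in dumps
}

// dump formats a value in Go syntax, which includes unexported fields.
func dump(v any) string {
	return fmt.Sprintf("%#v", v)
}

// dumpMentionsExplanation reports whether the hint survives in the dump.
func dumpMentionsExplanation() bool {
	l := label{
		parts:       []string{"Serial"},
		explanation: "use framework.It instead of ginkgo.It",
	}
	return strings.Contains(dump(l), `explanation:"use framework.It instead of ginkgo.It"`)
}

func main() {
	fmt.Println(dumpMentionsExplanation())
}
```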
Nov 29 22:56:18.559: INFO: Waited 371.058378ms for the sample-apiserver to be ready to handle requests.
Nov 29 22:56:49.503: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: { } Scheduled: Successfully assigned e2e-openapiv3-5878/sample-apiserver-deployment-58dfd44dd-gp8nd to ip-10-0-80-91.us-west-2.compute.internal
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set sample-apiserver-deployment-58dfd44dd to 1
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd: {replicaset-controller } SuccessfulCreate: Created pod: sample-apiserver-deployment-58dfd44dd-gp8nd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {multus } AddedInterface: Add eth0 [10.129.2.137/23] from ovn-kubernetes
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulling: Pulling image "quay.io/openshift/community-e2e-images:e2e-3-registry-k8s-io-e2e-test-images-sample-apiserver-1-17-7-F6raNs0YQ76APdUD"
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulling: Pulling image "quay.io/openshift/community-e2e-images:e2e-11-registry-k8s-io-etcd-3-5-7-0-C5nYFPeT0lxaFCao"
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Started: Started container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Created: Created container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulled: Successfully pulled image "quay.io/openshift/community-e2e-images:e2e-3-registry-k8s-io-e2e-test-images-sample-apiserver-1-17-7-F6raNs0YQ76APdUD" in 3.068195738s (3.068206328s including waiting)
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulled: Successfully pulled image "quay.io/openshift/community-e2e-images:e2e-11-registry-k8s-io-etcd-3-5-7-0-C5nYFPeT0lxaFCao" in 9.1810668s (9.18107711s including waiting)
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Created: Created container etcd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Started: Started container etcd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:48 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Killing: Stopping container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:48 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Killing: Stopping container etcd
Nov 29 22:56:49.570: INFO: POD NODE PHASE GRACE CONDITIONS
Nov 29 22:56:49.570: INFO:
Nov 29 22:56:49.702: INFO: skipping dumping cluster info - cluster too large
k8s.io/kubernetes/test/e2e/apimachinery.glob..func19.3({0x7f470406fbf8, 0xc001c4b530})
k8s.io/kubernetes@v1.27.1/test/e2e/apimachinery/openapiv3.go:161 +0x6bf
fail [runtime/panic.go:260]: Test Panicked: runtime error: invalid memory address or nil pointer dereference
Ginkgo exit error 1: exit with code 1
Signed-off-by: Dan Williams <dcbw@redhat.com>
The e2e test patches the status of a ResourceQuota resource and tries to
verify that the controller resets its status. However, the controller ignores
the updates and only reconciles the objects at a predefined interval,
by default 5 minutes.
Since the test polls for 5 minutes, there are some edge cases in which the
time the reconcile loop takes to reconcile the object is greater than 5
minutes, failing the test.
To take into account both the time to reconcile the objects and the reconcile
loop period, we increase the poll loop by one minute.
Change-Id: I30f7fda36cdfb47c543b5b2b120e39f7d6c2442d
Because labels are currently typically added also to the spec texts, we don't
need to write them separately.
This redundancy got introduced in f2cfbf44b1 when registering all inline tags
also as labels.
- Remove redundant tests
- Fix formatting of the query command by using fmt.Sprintf to
prevent spurious characters from being introduced
- Fix running of the journalctl command on the node by adding the
default options
- Restrict running the tests on a single node
kube-dns as an alternative DNS addon to CoreDNS hasn't been supported
since 1.22 when kubeadm's v1beta3 API was added.
Remove the related tests from the e2e_kubeadm test framework.
Add Azure to the list of providers that support accessing nodes
using SSH.
Note: This will require a follow up PR adding the required
environment variables, AZURE_SSH_KEY, KUBE_SSH_BASTION to the test
configuration.
Add a test that checks if the CRB (kubeadm:cluster-admins)
used for binding admin.conf file users (part of the
kubeadm:cluster-admins Group) to the "cluster-admins"
ClusterRole exists in kubeadm clusters.
It does that only for versions newer than the version
when this feature was added.
This checks that the With* label functions are used instead of the previous
inline tags. To catch strings passed to Ginkgo directly instead of the
framework wrapper functions, the final test specs are checked.
This changes the text registration so that, for tags for which the framework
has a dedicated API (features, feature gates, slow, serial, etc.), those APIs
are used.
Arbitrary, custom tags are still left in place for now.
Since ServiceCIDR and IPAddresses are mostly API driven, integration
tests will give us good coverage, exercising real use cases like
the migration from one ServiceCIDR range to a new range.
* It is observed in some of the periodic job results that the kubelet log along with a few other logs
are not getting copied to the artifacts directory once the node e2e tests are executed
* Following is a sample error log that is displayed once the tests are run
```
I1031 13:15:49.056897 40204 ssh.go:146] Running the command ssh, with args: [-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR -i /home/svanka/.ssh/google_compute_engine core@35.185.108.51 -- sudo ls core@35.185.108.51:/tmp/node-e2e-20231031T125637/results/*.log]
E1031 13:16:15.346641 40204 ssh.go:149] failed to run SSH command: out: ls: cannot access 'core@35.185.108.51:/tmp/node-e2e-20231031T125637/results/*.log': No such file or directory
, err: exit status 2
```
* This change fixes the above issue and helps in gathering the required test artifacts once the test execution is completed
Signed-off-by: Sai Ramesh Vanka <svanka@redhat.com>
Add retry logic to the `assertConsistentConnectivity` function from
the `test/e2e/windows/hybrid_network.go` file.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
This test depends on CDI support in a runtime and doesn't work
with the out-of-the-box Containerd. Marking it as a NodeSpecialFeature
should fix Containerd CI job failures.
update pod replacement policy feature flag comment and refactor the e2e test for pod replacement policy
minor fixes for pod replacement policy and e2e test
fix wrong assertions for pod replacement policy e2e test
more fixes to pod replacement policy e2e test
refactor PodReplacementPolicy e2e test to use finalizers
fix unit tests when pod replacement policy feature flag is promoted to beta
fix podgc controller unit tests when pod replacement feature is enabled
fix lint issue in pod replacement policy e2e test
assert no error in defer function for removing finalizer in pod replacement policy e2e test
implement test using a sh trap for pod replacement policy
reduce sleep after SIGTERM in pod replacement policy e2e test to 5s
Linting together with an upcoming klog update finds this problem:
test/images/sample-device-plugin/sampledeviceplugin.go:165:4: printf: k8s.io/klog/v2.Errorf does not support error-wrapping directive %w (govet)
klog.Errorf("Failed to add watch to %q: %w", triggerPath, err)
^
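The effect of `%w` outside of fmt.Errorf can be reproduced with fmt.Sprintf, which handles printf-style arguments in the same manner as klog.Errorf is documented to. This is a sketch with made-up values; the bad format is kept in a variable so that the example itself is not flagged by go vet.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// badFormat uses %w, which is only meaningful for fmt.Errorf.
var badFormat = "Failed to add watch to %q: %w"

// messages formats the same error once with %w and once with %v.
func messages() (bad, good string) {
	err := errors.New("no such file")
	bad = fmt.Sprintf(badFormat, "/tmp/trigger", err)
	good = fmt.Sprintf("Failed to add watch to %q: %v", "/tmp/trigger", err)
	return bad, good
}

// badVerbReported checks for the marker that fmt emits for an unsupported verb.
func badVerbReported(msg string) bool {
	return strings.Contains(msg, "%!w(")
}

func main() {
	bad, good := messages()
	fmt.Println(bad)
	fmt.Println(good)
}
```

Replacing `%w` with `%v` keeps the error text in the log message without the bad-verb marker.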
It looks like the test or the branch is never executed, because it wouldn't
pass: a []v1.HostIP value is never the same as a []string. Found by the
upcoming ginkgolinter update.
ERROR: test/e2e_node/pod_host_ips.go:167:45: ginkgo-linter: use Equal with different types: Comparing []k8s.io/api/core/v1.HostIP with []string; either change the expected value type if possible, or use the BeEquivalentTo() matcher, instead of Equal() (ginkgolinter)
ERROR: gomega.Expect(p.Status.HostIPs).Should(gomega.Equal(nodeIPs))
ERROR: ^
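Why such a comparison can never succeed is easy to show with reflect.DeepEqual, which is what gomega's Equal matcher uses. The local `HostIP` type below only mimics the shape of the API type and `toStrings` is a hypothetical conversion helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// HostIP is a local stand-in: a struct that wraps an IP string.
type HostIP struct {
	IP string
}

// toStrings converts the struct slice into the type of the expected value.
func toStrings(ips []HostIP) []string {
	out := make([]string, 0, len(ips))
	for _, ip := range ips {
		out = append(out, ip.IP)
	}
	return out
}

// compare returns the result of the direct and of the converted comparison.
func compare() (direct, converted bool) {
	hostIPs := []HostIP{{IP: "10.0.0.1"}}
	nodeIPs := []string{"10.0.0.1"}
	direct = reflect.DeepEqual(hostIPs, nodeIPs)               // different types: always false
	converted = reflect.DeepEqual(toStrings(hostIPs), nodeIPs) // same type: compares contents
	return direct, converted
}

func main() {
	fmt.Println(compare())
}
```

Converting one side to the other's type is one possible fix; the linter output above also suggests the BeEquivalentTo() matcher.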
It looks like the test is never executed, because it wouldn't pass: an int32
value is never the same as an int 0. Found by the upcoming ginkgolinter update.
Container runtimes like CRI-O actually show the image identifier in the
`ImageID` field rather than the repo digest. For the digest we already
have the `Image` field. We still allow the digest in the `ImageID` field
for historic reasons.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>