ginkgo.GinkgoHelper is a recent addition to ginkgo which allows functions to
mark themselves as helpers. This then changes which callstack gets reported for
failures. It makes sense to support the same mechanism also for logging.
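A minimal sketch of the mechanism (helper name and assertion are made up for
illustration):

    import (
        "github.com/onsi/ginkgo/v2"
        "github.com/onsi/gomega"
        v1 "k8s.io/api/core/v1"
    )

    // expectPodRunning marks itself as a helper, so a failing assertion is
    // attributed to the test that called it instead of to this function.
    func expectPodRunning(pod *v1.Pod) {
        ginkgo.GinkgoHelper()
        gomega.Expect(pod.Status.Phase).To(gomega.Equal(v1.PodRunning))
    }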
There's also no reason why framework.Logf should produce output that is in a
different format than klog log entries. Having time stamps formatted
differently makes it hard to read test output which uses a mixture of both.
Another user-visible advantage is that the error log entry from
framework.ExpectNoError now references the test source code.
With textlogger there is a simple replacement for klog that can be reconfigured
to let the caller handle stack unwinding. klog itself doesn't support that, and
modifying it to add support is not an option right now (feature freeze).
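A rough sketch of the approach (the option names and how the framework wires
them up are assumptions here, not the exact framework code):

    import (
        "github.com/onsi/ginkgo/v2"
        "k8s.io/klog/v2"
        "k8s.io/klog/v2/textlogger"
    )

    // installLogger replaces the global klog logger with a textlogger that
    // writes through Ginkgo and asks the caller-provided unwind function for
    // the source code location to report.
    func installLogger(unwind func(skip int) (filename string, line int)) {
        config := textlogger.NewConfig(
            textlogger.Output(ginkgo.GinkgoWriter),
            textlogger.Backtrace(unwind),
        )
        klog.SetLogger(textlogger.NewLogger(config))
    }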
Emitting printf-style output via that logger would work, but would become less
readable because the message string would get quoted instead of being printed
verbatim as before. So instead, the traditional klog header gets reproduced
in the framework code. In this example, the first line is from klog, the second
from Logf:
I0111 11:00:54.088957 332873 factory.go:193] Registered Plugin "containerd"
...
I0111 11:00:54.987534 332873 util.go:506] >>> kubeConfig: /var/run/kubernetes/admin.kubeconfig
Indentation is a bit different because the initial output is printed before
installing the logger that writes through ginkgo.GinkgoWriter.
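A sketch of what reproducing the header amounts to (illustrative only; the real
framework code resolves the call site via the helper mechanism described above):

    import (
        "fmt"
        "os"
        "path/filepath"
        "runtime"
        "time"

        "github.com/onsi/ginkgo/v2"
    )

    // logf formats the message with fmt.Sprintf and prepends a klog-style
    // "Immdd hh:mm:ss.uuuuuu pid file:line]" header before writing it verbatim.
    func logf(format string, args ...interface{}) {
        now := time.Now()
        _, file, line, _ := runtime.Caller(1)
        fmt.Fprintf(ginkgo.GinkgoWriter, "I%02d%02d %02d:%02d:%02d.%06d %7d %s:%d] %s\n",
            int(now.Month()), now.Day(), now.Hour(), now.Minute(), now.Second(),
            now.Nanosecond()/1000, os.Getpid(), filepath.Base(file), line,
            fmt.Sprintf(format, args...))
    }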
One welcome side effect is that now "go vet" detects mismatched parameters for
framework.Logf because fmt.Sprintf is called without mangling the format
string. Some of the calls were incorrect.
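For example, a (hypothetical) call like this now gets flagged:

    // vet recognizes Logf as a printf wrapper and reports the extra argument:
    framework.Logf("waiting for pod %s to become ready", pod.Name, err)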
A stand-alone binary shouldn't import test/e2e/framework, which is targeted
towards usage in a Ginkgo test suite. This currently works, but will break once
test/e2e/framework becomes more opinionated about how to configure logging.
The simplest solution is to duplicate the one short framework function that the
binary was calling.
Now that we have it (8a89a1f5a5), let's also make sure that
the new WithFlaky is used everywhere instead of [Flaky]. This way it can be
used for filtering by label.
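Sketch of the change in a spec (f is a framework.Framework pointer; the exact
helper signatures are assumptions):

    // before: the tag only lives in the spec text
    ginkgo.It("should eventually reach the desired state [Flaky]", func(ctx context.Context) {
        // ...
    })

    // after: WithFlaky attaches a Ginkgo label, so the spec can be selected
    // or excluded with --label-filter="Flaky" / --label-filter="!Flaky"
    f.It("should eventually reach the desired state", framework.WithFlaky(), func(ctx context.Context) {
        // ...
    })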
Using klog.Fatal to abort a test leads to a poor user experience because the
output is buffered in ginkgo.GinkgoWriter and not flushed before killing the
process. The output is also different from other failures. Using the normal
error checking is better.
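A sketch of the corresponding code change (helper names inferred from the output
shown below); the full test output before and after follows:

    // The check in the suite setup changes roughly from
    //
    //     if err != nil {
    //         klog.Fatal("Error loading client: ", err)
    //     }
    //
    // to a normal assertion that goes through Ginkgo's failure handling:
    c, err := framework.LoadClientset()
    framework.ExpectNoError(err, "Error loading client")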
Before:
$ KUBECONFIG=/no/such/config go test -v ./test/e2e/
Jan 19 10:06:58.475: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
=== RUN TestE2E
I0119 10:06:58.475844 99472 e2e.go:109] Starting e2e run "5303f626-ae0e-44d7-abf1-b4956d910ef4" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /nvme/gopath/src/k8s.io/kubernetes/test/e2e
=================================================================================
Random Seed: 1705655217 - will randomize all specs
Will run 4678 of 7421 specs
goroutine 817 [running]:
k8s.io/klog/v2/internal/dbg.Stacks(0x0)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/internal/dbg/dbg.go:35 +0x85
k8s.io/klog/v2.(*loggingT).output(0x9d92b20, 0x3, 0x0, 0xc00069d7a0, 0x2, {0x834c6e8?, 0x9d91c80?}, 0x300000060?, 0x0)
...
k8s.io/klog/v2.Fatal(...)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1652
k8s.io/kubernetes/test/e2e.setupSuite({0x7fb49064c078, 0xc003072360})
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:187 +0x125
...
FAIL k8s.io/kubernetes/test/e2e 0.759s
FAIL
After:
$ KUBECONFIG=/no/such/config go test -v ./test/e2e/
Jan 19 10:12:58.889: INFO: The --provider flag is not set. Continuing as if --provider=skeleton had been used.
=== RUN TestE2E
I0119 10:12:58.889224 106019 e2e.go:109] Starting e2e run "bed5a77a-f595-42d0-b512-5f601067444b" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /nvme/gopath/src/k8s.io/kubernetes/test/e2e
=================================================================================
Random Seed: 1705655578 - will randomize all specs
Will run 4678 of 7421 specs
------------------------------
[SynchronizedBeforeSuite] [FAILED] [0.001 seconds]
[SynchronizedBeforeSuite]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:69
Timeline >>
Jan 19 10:12:59.063: INFO: >>> kubeConfig: /no/such/config
Jan 19 10:12:59.063: INFO: Unexpected error: Error loading client:
<*errors.errorString | 0xc00182c130>:
error creating client: error loading KubeConfig: open /no/such/config: no such file or directory
{
s: "error creating client: error loading KubeConfig: open /no/such/config: no such file or directory",
}
[FAILED] in [SynchronizedBeforeSuite] - /nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186 @ 01/19/24 10:12:59.064
<< Timeline
[FAILED] Error loading client: error creating client: error loading KubeConfig: open /no/such/config: no such file or directory
In [SynchronizedBeforeSuite] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186 @ 01/19/24 10:12:59.064
------------------------------
Summarizing 1 Failure:
[FAIL] [SynchronizedBeforeSuite]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/e2e.go:186
Ran 0 of 7421 Specs in 0.001 seconds
FAIL! -- A BeforeSuite node failed so all tests were skipped.
--- FAIL: TestE2E (0.18s)
FAIL
FAIL k8s.io/kubernetes/test/e2e 0.769s
FAIL
This replaces the klog formatting and message routing with a simpler
implementation that uses less code. The main difference is that we skip the
entire unused message routing.
Instead, the same splitting of output streams as for the JSON log format gets
implemented in the io.Writer implementation that gets passed to the textlogger.
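A minimal sketch of such a writer, assuming each Write call carries one complete
log entry with the usual klog severity prefix (names are illustrative):

    import "io"

    // severitySplitter sends error and fatal entries ('E', 'F') to one stream
    // and everything else to the other, mirroring the JSON split-stream behavior.
    type severitySplitter struct {
        out, errOut io.Writer
    }

    func (w severitySplitter) Write(p []byte) (int, error) {
        if len(p) > 0 && (p[0] == 'E' || p[0] == 'F') {
            return w.errOut.Write(p)
        }
        return w.out.Write(p)
    }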
The dead code was found with:
deadcode -test -filter=k8s.io/kubernetes/test/e2e/framework/... ./test/e2e ./test/e2e_node ./test/e2e_kubeadm
See https://go.dev/blog/deadcode for an introduction.
Only dead code which is clearly not needed anymore (glog logging),
questionable (skipping based on feature gates) or
redundant (WaitForPodSuccessInNamespaceSlow) gets removed for now. More
removals might make sense in the future.
While the benchmark is focused on encoding, it becomes a bit more realistic
when actually passing the encoded data to the Linux kernel. Features like
output buffering are more likely to have a visible effect when invoking
syscalls.
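A sketch of what that can look like (the encoding step below is just a stand-in
for whatever the benchmark already measures):

    import (
        "fmt"
        "os"
        "testing"
    )

    func BenchmarkEncodeAndWrite(b *testing.B) {
        // Writing to /dev/null still issues real write syscalls, unlike
        // discarding the encoded data in memory.
        devNull, err := os.OpenFile(os.DevNull, os.O_WRONLY, 0)
        if err != nil {
            b.Fatal(err)
        }
        defer devNull.Close()
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            entry := fmt.Appendf(nil, "log entry %d\n", i) // stand-in for the real encoder
            if _, err := devNull.Write(entry); err != nil {
                b.Fatal(err)
            }
        }
    }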