The logging instrumentation for contextual logging that was added for 1.29
slowed down the scheduler (i.e. logging verbosity <= 3) by a significant
percentage (-28.66% for SchedulingBasic/5000Nodes at -v3) if (and only if!)
contextual logging was enabled.
Retrieving the logger from the context causes no measurable slowdown, it's only
the various WithName/WithValues calls which cause this.
By being more careful about when to use those, the performance impact can be
avoided:
- At -v3 or lower, only `WithValues("pod")` is used once per scheduling cycle.
This has the intended effect that all log messages for the cycle include the
pod information. Once contextual logging is GA, "pod" key/value pairs can
be removed from all log calls.
- At -v4 or higher, richer log entries get produced where `WithValues` is also
used for the node (when applicable) and `WithName` is used for the current
operation and plugin.
With these changes, enabling contextual logging causes no measurable slowdown
at -v3 or lower. At -v4, the slowdown depends on the test case (-30.51%
throughput for SchedulingBasic/5000Nodes, no change for
SchedulingCSIPVs/5000Nodes). For some unknown reason (measuring bias?),
SchedulingCSIPVs/500Nodes has a ~3& *higher* throughput with contextual
logging.