
# Scheduler Performance Test

## Motivation

We already have a performance testing system -- Kubemark. However, Kubemark requires setting up and bootstrapping a whole cluster, which takes a lot of time.

We want a standard way to reproduce scheduling latency metrics and to benchmark the scheduler as simply and quickly as possible. We have the following goals:

- Save time on testing
  - The test and benchmark can be run on a single machine; we only set up the components necessary for scheduling, without booting up a whole cluster.
- Profile runtime metrics to find bottlenecks
  - Write scheduler integration tests that focus on performance measurement. Take advantage of the Go profiling tools to collect fine-grained metrics such as CPU, memory, and block profiles.
- Reproduce test results easily
  - We want a known place to run performance-related tests for the scheduler. Developers should be able to run a single script to collect all the information they need.

Currently the test suite has the following:

- density test (by adding a new Go test)
  - schedules 30k pods on 1000 (fake) nodes and 3k pods on 100 (fake) nodes
  - prints the scheduling rate every second
  - shows how the scheduling rate changes as the number of scheduled pods grows
- benchmark
  - makes use of `go test -bench` and reports ns/op
  - schedules `b.N` pods when the cluster has `N` nodes and `P` scheduled pods; since one round takes a relatively long time to finish, `b.N` is kept small: 10 - 100 (see the direct `go test` sketch below)
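
Under the hood these are ordinary Go benchmarks, so they can also be invoked with `go test` directly. A minimal sketch, assuming the integration test framework can find an etcd binary on your `PATH` (for example via `hack/install-etcd.sh`):

```shell
# In Kubernetes root path; assumes etcd is on PATH, e.g.:
#   hack/install-etcd.sh && export PATH="$(pwd)/third_party/etcd:${PATH}"
go test ./test/integration/scheduler_perf -run='^$' -benchtime=1ns -bench=BenchmarkPerfScheduling
```

Note that when invoking `go test` directly there is no make variable expansion, so `^$` does not need the `$$` escaping used in the `make` invocations below.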

## How To Run

### Density tests

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=true -logtostderr=true -run=." KUBE_TIMEOUT="--timeout=60m" SHORT="--short=false"
```
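
To run only one of the density tests, narrow the `-run` regular expression. The test name below is illustrative, not a guaranteed name; see `scheduler_test.go` for the tests that actually exist:

```shell
# In Kubernetes root path
# NOTE: the test name passed to -run= is a hypothetical example; check scheduler_test.go
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=true -logtostderr=true -run=TestSchedule100Node3KPods" KUBE_TIMEOUT="--timeout=60m" SHORT="--short=false"
```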

### Benchmark tests

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling"
```

The benchmark suite runs all the tests specified under `config/performance-config.yaml`.

Once the benchmark has finished, a JSON file with the collected metrics is available in the current directory (`test/integration/scheduler_perf`). Look for `BenchmarkPerfScheduling_YYYY-MM-DDTHH:MM:SSZ.json`. You can use the `-data-items-dir` flag to generate the metrics file elsewhere, as shown below.
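
For example, to collect the metrics files under `/tmp/scheduler-perf` instead (the destination directory is an arbitrary choice for this sketch):

```shell
# In Kubernetes root path; /tmp/scheduler-perf is just an example destination
mkdir -p /tmp/scheduler-perf
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling -data-items-dir=/tmp/scheduler-perf"
```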

In case you want to run a specific test in the suite, you can select it through the `-bench` flag, which takes a regular expression matched against the slash-separated benchmark name:

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling/SchedulingBasic/5000Nodes/5000InitPods/1000PodsToSchedule"
```

Also, the bench time is explicitly set to 1ns (the `-benchtime=1ns` flag) so that each test is run only once. Otherwise, the Go benchmark framework would try to run a test more than once if it ran for less than 1s.

To produce a CPU profile:

```shell
# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TIMEOUT="-timeout=3600s" KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling -cpuprofile ~/cpu-profile.out"
```
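
The resulting profile can be inspected with the standard Go tooling, for example:

```shell
# Open an interactive view of the collected CPU profile;
# "top" lists the hottest functions, "web" renders a call graph (requires graphviz)
go tool pprof ~/cpu-profile.out
```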
