kubernetes/test/integration/scheduler_perf
Davanum Srinivas 9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
..
config updates pause image references 2021-08-29 21:50:05 -07:00
.gitignore scheduler_perf: refactor to allow arbitrary workloads 2020-09-17 19:22:20 +00:00
main_test.go test/integration: skip etcd startup for -help flag 2021-09-24 11:51:58 +02:00
OWNERS Check in OWNERS modified by update-yamlfmt.sh 2021-12-09 21:31:26 -05:00
README.md Update docs and fix redundant logic of scheduler perf 2020-11-06 23:45:09 -08:00
scheduler_perf_legacy_test.go Remove //lint:ignore pragmas that aren't being used anymore 2021-11-17 08:56:54 +01:00
scheduler_perf_test.go Add the metric data for different extension points 2021-08-23 13:43:48 +08:00
scheduler_perf_types.go # This is a combination of 2 commits. 2017-07-19 00:28:40 -04:00
scheduler_test.go Enable scheduler_perf to support scheduler config file 2021-03-16 23:13:40 +08:00
util.go remove scheduler component config v1beta1 2021-09-28 13:13:17 +08:00

Scheduler Performance Test

Motivation

We already have a performance testing system -- Kubemark. However, Kubemark requires setting up and bootstrapping a whole cluster, which takes a lot of time.

We want to have a standard way to reproduce scheduling latency metrics result and benchmark scheduler as simple and fast as possible. We have the following goals:

  • Save time on testing
    • The test and benchmark can be run in a single box. We only set up components necessary to scheduling without booting up a cluster.
  • Profiling runtime metrics to find out bottleneck
    • Write scheduler integration test but focus on performance measurement. Take advantage of go profiling tools and collect fine-grained metrics, like cpu-profiling, memory-profiling and block-profiling.
  • Reproduce test result easily
    • We want to have a known place to do the performance related test for scheduler. Developers should just run one script to collect all the information they need.

Currently the test suite has the following:

  • density test (by adding a new Go test)
    • schedule 30k pods on 1000 (fake) nodes and 3k pods on 100 (fake) nodes
    • print out scheduling rate every second
    • let you learn the rate changes vs number of scheduled pods
  • benchmark
    • make use of go test -bench and report nanosecond/op.
    • schedule b.N pods when the cluster has N nodes and P scheduled pods. Since it takes relatively long time to finish one round, b.N is small: 10 - 100.

How To Run

Density tests

# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=true -logtostderr=true -run=." KUBE_TIMEOUT="--timeout=60m" SHORT="--short=false"

Benchmark tests

# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling"

The benchmark suite runs all the tests specified under config/performance-config.yaml.

Once the benchmark is finished, JSON file with metrics is available in the current directory (test/integration/scheduler_perf). Look for BenchmarkPerfScheduling_YYYY-MM-DDTHH:MM:SSZ.json. You can use -data-items-dir to generate the metrics file elsewhere.

In case you want to run a specific test in the suite, you can specify the test through -bench flag:

Also, bench time is explicitly set to 1ns (-benchtime=1ns flag) so each test is run only once. Otherwise, the golang benchmark framework will try to run a test more than once in case it ran for less than 1s.

# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling/SchedulingBasic/5000Nodes/5000InitPods/1000PodsToSchedule"

To produce a cpu profile:

# In Kubernetes root path
make test-integration WHAT=./test/integration/scheduler_perf KUBE_TIMEOUT="-timeout=3600s" KUBE_TEST_VMODULE="''" KUBE_TEST_ARGS="-alsologtostderr=false -logtostderr=false -run=^$$ -benchtime=1ns -bench=BenchmarkPerfScheduling -cpuprofile ~/cpu-profile.out"

How to configure benchmark tests

Configuration file located under config/performance-config.yaml contains a list of templates. Each template allows to set:

  • node manifest
  • manifests for initial and testing pod
  • number of nodes, number of initial and testing pods
  • templates for PVs and PVCs
  • feature gates

See op data type implementation in scheduler_perf_test.go for available operations to build WorkloadTemplate.

Initial pods create a state of a cluster before the scheduler performance measurement can begin. Testing pods are then subject to performance measurement.

The configuration file under config/performance-config.yaml contains a default list of templates to cover various scenarios. In case you want to add your own, you can extend the list with new templates. It's also possible to extend op data type, respectively its underlying data types to extend configuration of possible test cases.