Commit Graph

289 Commits

Author SHA1 Message Date
Mark Rossetti
40f3e624a6 Switching everything to use pause:3.8
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2022-07-21 14:53:15 -07:00
Kensei Nakada
b0d47cb380
scheduler_perf: allow users to specify default pod and node specs (#101799)
* scheduler_perf: default pod and node spec

* Fix: un-support DefaultNodeTemplatePath
2022-06-29 11:44:07 -07:00
Davanum Srinivas
50bea1dad8
Move from k8s.gcr.io to registry.k8s.io
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-05-31 10:16:53 -04:00
Kubernetes Prow Robot
570f1092f4
Merge pull request #109542 from sanposhiho/fix-test-case-scheduler-perf
scheduler_perf: Remove test cases for Preemption which always fail
2022-05-07 03:33:29 -07:00
Kubernetes Prow Robot
71df3e819b
Merge pull request #109545 from sanposhiho/fix-nun-on-scheduler_perf
Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN
2022-05-05 11:53:45 -07:00
Kensei Nakada
4af3c5efeb Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN 2022-05-05 16:11:22 +00:00
Kubernetes Prow Robot
3bef1692ef
Merge pull request #109696 from Huang-Wei/rm-sched-perf-legacy
Cleanup legacy scheduler perf tests
2022-05-04 02:35:43 -07:00
Kubernetes Prow Robot
629706e0fe
Merge pull request #109546 from sanposhiho/replace-metrics
Replace scheduler_e2e_scheduling_duration_seconds with scheduler_scheduling_attempt_duration_seconds in scheduler_perf
2022-05-04 01:29:22 -07:00
sanposhiho
1c2c20e6bd Change test cases for Preemption to create fewer Pods 2022-05-04 07:47:46 +00:00
Kubernetes Prow Robot
f0cd3725d3
Merge pull request #101835 from sanposhiho/scheduler_perf/feature/op-sleep
scheduler_perf: create sleep operation
2022-05-03 17:17:11 -07:00
Wei Huang
846ebf7814
Cleanup legacy scheduler perf tests 2022-04-27 09:57:17 -07:00
sanposhiho
b7b94b6b39 scheduler_perf: create sleep operation 2022-04-25 23:02:09 +00:00
sanposhiho
6e0da69632 Replace scheduler_e2e_scheduling_duration_seconds with scheduler_scheduling_attempt_duration_seconds in scheduler_perf 2022-04-20 00:48:12 +09:00
Davanum Srinivas
f7ad09c447
Switch to pause 3.7
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-03-29 15:36:38 -04:00
Kubernetes Prow Robot
546e4fa1ef
Merge pull request #107771 from sanposhiho/fix-tiny
make scheduler_perf stable
2022-03-04 17:22:52 -08:00
sanposhiho
4c3a1000c7 fix by gofmt 2022-02-25 00:23:01 +09:00
sanposhiho
1080c2d717 Make scheduler_perf stable 2022-02-24 01:29:38 +09:00
Kubernetes Prow Robot
21c0f6f6ff
Merge pull request #107677 from pohly/scheduler-integration-benchmark
scheduler integration benchmark improvements
2022-02-14 01:23:28 -08:00
Patrick Ohly
e1e84c8e5f scheduler_perf: run with -v=0 by default
This provides a mechanism for overriding the forced increase of the klog
verbosity to 4 when starting the apiserver and uses that for the scheduler_perf
benchmark. Other tests run as before.

A global variable was used because adding an explicit parameter to several
helper functions would have caused a lot of code churn (test ->
integration/util.StartApiserver ->
integration/framework.RunAnAPIServerUsingServer ->
integration/framework.startAPIServerOrDie).
2022-02-11 16:58:33 +01:00
Patrick Ohly
c62d7407c8 scheduler_perf: dump test data when writing it failed
Occasionally, writing as JSON failed because a NaN float couldn't be
encoded. The extended log message helps understand where that comes from, for
example:

F0120 20:24:45.515745  511835 scheduler_perf_test.go:540] BenchmarkPerfScheduling: unable to write measured data {Version:v1 DataItems:[{Data:map[Average:35.714285714285715 Perc50:2 Perc90:36 Perc95:412 Perc99:412] Unit:pods/s Labels:map[Metric:SchedulingThroughput Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:27.863967530999993 Perc50:13.925925925925926 Perc90:30.06711409395973 Perc95:31.85682326621924 Perc99:704] Unit:ms Labels:map[Metric:scheduler_e2e_scheduling_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:11915.651577744 Perc50:15168.796680497926 Perc90:19417.759336099585 Perc95:19948.87966804979 Perc99:20373.77593360996] Unit:ms Labels:map[Metric:scheduler_pod_scheduling_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:1.1865832049999983 Perc50:0.7636363636363637 Perc90:2.891903719912473 Perc95:3.066958424507659 Perc99:5.333333333333334] Unit:ms Labels:map[Metric:scheduler_framework_extension_point_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2 extension_point:Filter]} {Data:map[Average:NaN Perc50:NaN Perc90:NaN Perc95:NaN Perc99:NaN] Unit:ms Labels:map[Metric:scheduler_framework_extension_point_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2 extension_point:Score]}]}: json: unsupported value: NaN
2022-02-07 08:59:19 +01:00
Patrick Ohly
8d44b819b3 scheduler_perf: avoid ambiguous test names
"-bench=PerfScheduling/Preemption/500Nodes" ran both the
PerfScheduling/Preemption/500Nodes and the
PerfScheduling/PreemptionPVs/500Nodes benchmark.

This can be avoided by choosing names where none is the prefix of another.
2022-02-07 08:59:19 +01:00
Patrick Ohly
259a8ad0b7 test: allow controlling etcd log level
When running an integration test that measures performance, like for example
test/integration/scheduler_perf, running etcd with debug level output is
undesirable because it creates additional load on the system and isn't
realistic.

The default is still "debug", but ETCD_LOGLEVEL=warn can be used to override
that.
2022-02-07 08:59:19 +01:00
ahrtr
fe95aa614c io/ioutil has already been deprecated in golang 1.16, so replace all ioutil with io and os 2022-02-03 05:32:12 +08:00
sanposhiho
d8840405e2 Create namespace for Pod not to occur error log of namespace not-found 2022-01-26 00:39:12 +09:00
Kubernetes Prow Robot
ca4af7a981
Merge pull request #104716 from sanposhiho/feature/scheduler_perf/unused-template-params
test/integration/scheduler_perf: check for unused template parameters
2022-01-10 16:21:16 -08:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
Hanna Lee
07a883d8e6 Remove //lint:ignore pragmas that aren't being used anymore 2021-11-17 08:56:54 +01:00
Hanna Lee
c8fde197f5 Add more //nolint:staticcheck for failures caught in PR tests 2021-11-17 08:56:02 +01:00
kerthcet
75a255d2ed remove scheduler component config v1beta1
Signed-off-by: kerthcet <kerthcet@gmail.com>
2021-09-28 13:13:17 +08:00
Kubernetes Prow Robot
86d23cf441
Merge pull request #105206 from pohly/test-integration-help
test/integration: skip etcd startup for -help flag
2021-09-24 10:29:23 -07:00
Patrick Ohly
81b4a695b3 test/integration: skip etcd startup for -help flag
By parsing flags in the test's main function before starting etcd we bail out
early without ever starting etcd when the test was invoked with -help.

Otherwise etcd must be available, gets started and then hangs because
flag.Parse itself exits when called by testing.go. This bypasses the code in
EtcdMain which normally stops etcd.
2021-09-24 11:51:58 +02:00
Kubernetes Prow Robot
857d4c107c
Merge pull request #104808 from chendave/indent
Format json file with proper indentation
2021-09-21 19:14:00 -07:00
sanposhiho
1318f74609 Fix: use Fatalf and list all unused params in one error 2021-09-09 07:34:30 +09:00
sanposhiho
6bf6e424a1 Fix: rename getParams→get 2021-09-09 07:11:36 +09:00
sanposhiho
24643c67d5 Fix: make struct un-exported 2021-09-09 07:10:37 +09:00
Dave Chen
dda8090037 Format json file with proper indentation
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-09-07 16:14:34 +08:00
sanposhiho
cc846c9d33 Feature: check for unused template parameters 2021-09-02 01:46:34 +09:00
Claudiu Belu
18936d4785 updates pause image references
The pause:3.6 image has been published.

Also updates older / incorrect references.
2021-08-29 21:50:05 -07:00
Dave Chen
63b4710f38 Don't expose struct from prometheus client library 2021-08-27 22:21:24 +08:00
Dave Chen
58ab18bc1e Add the metric data for different extension points
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-08-23 13:43:48 +08:00
Kubernetes Prow Robot
4ab9c950d9
Merge pull request #102007 from vaibhav2107/perf-config
Update the typo in values of pods in performance-config.yaml
2021-08-12 13:59:50 -07:00
Wei Huang
55765f1b49
sched: support HistogramVec in scheduler performance test 2021-07-26 20:27:37 -07:00
Mengjiao Liu
4eab19ae7d Clean up the master term in test/integration comments 2021-06-18 16:31:05 +08:00
Sascha Grunert
b167fc24d7
Update pause image to v3.5
Update dependencies and the test images to use pause 3.5. We also
provide a changelog entry for the new container image version.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-05-25 09:04:46 +02:00
vaibhav
a1e56b4f6d Update the typo in values of pods in performance-config.yaml 2021-05-14 17:16:48 +05:30
Abdullah Gharaibeh
6988653457 Added benchmarks for pod affinity namespaceselector 2021-04-23 14:14:38 -04:00
Kubernetes Prow Robot
6d130d3b97
Merge pull request #100557 from chendave/validation_cleanup
Validate plugin config for KubeSchedulerConfiguration
2021-04-14 18:20:01 -07:00
Dave Chen
c6e65079c7 Validate plugin config for KubeSchedulerConfiguration
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-04-14 09:30:20 +08:00
Kubernetes Prow Robot
ed3e0d302f
Merge pull request #100644 from Huang-Wei/sched-fwk-config
Surface kube config in scheduler framework handle
2021-04-12 19:12:49 -07:00
Nicolas Mitchell
0e994e9481 return error with non-unique workload name in scheduler_perf_test 2021-04-06 10:24:04 -04:00
Nicolas Mitchell
338b06fb69 validate test/workload names in validateTestCases 2021-04-04 14:18:39 -04:00
Wei Huang
e7f67b1a63
Surface kube config in scheduler framework handle 2021-03-30 11:54:59 -07:00
Kubernetes Prow Robot
c78b5497ae
Merge pull request #99638 from chendave/perf_config
Enable scheduler_perf to support scheduler config file
2021-03-16 14:49:03 -07:00
Dave Chen
d50c0aeb5f Enable scheduler_perf to support scheduler config file
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-03-16 23:13:40 +08:00
Wei Huang
68ff3168b8
sched: fix a bug that literal 'p99' is mapped to 95th-percentile 2021-03-12 12:03:12 -08:00
Wei Huang
b93b4a2c96
sched: fix a bug that metrics of init or collected pods are re-collected 2021-03-11 10:28:39 -08:00
Kubernetes Prow Robot
823fa75643
Merge pull request #98900 from Huang-Wei/churn-cluster-op
Introduce a churnOp to scheduler perf testing framework
2021-03-11 02:00:24 -08:00
Kubernetes Prow Robot
23af91b293
Merge pull request #97779 from tiloso/staticcheck-test-integration-gs
Fix staticcheck in test/integration/{garbagecollector,scheduler_perf}
2021-03-10 16:04:23 -08:00
Kubernetes Prow Robot
841cb4adc4
Merge pull request #99844 from minbaev/scheduler-test-perf-optimization
add if check for number of scheduled pods to be greater than 0
2021-03-08 20:47:37 -08:00
David Eads
b8194cf77c switch most e2e tests to storage/v1 over v1beta1 2021-03-08 13:04:24 -05:00
Alexander Minbaev
359116f525 add if check for number of scheduled pods to be greater than 0 2021-03-05 09:05:42 -06:00
Kubernetes Prow Robot
0d4924e371
Merge pull request #99439 from minbaev/fix-typos
Fix typo in util.go
2021-03-04 11:00:22 -08:00
Wei Huang
1e5878b910
Introduce a churnOp to scheduler perf testing framework
- support two modes: recreate and create
- use DynmaicClient to create API objects
2021-03-03 06:51:53 -08:00
Benjamin Elder
56e092e382 hack/update-bazel.sh 2021-02-28 15:17:29 -08:00
Alexander Minbaev
5b73122105 Fix typo in util.go 2021-02-25 00:54:37 -06:00
Wei Huang
983272ce6a
sched: create dataItemsDir during a performance test if not exist 2021-02-17 12:44:16 -08:00
tiloso
e1ceac0783 Fix staticcheck in test/integration/{scheduler_perf,garbagecollector} 2021-02-10 10:55:09 +01:00
Kubernetes Prow Robot
2b7c61b1bb
Merge pull request #98205 from pacoxu/build/pauses
update pause image to 3.4.1 and also update the change log
2021-02-08 18:20:58 -08:00
pacoxu
0c152cbbbe update pause to 3.4.1 for tests(e2e)
Signed-off-by: pacoxu <paco.xu@daocloud.io>
2021-02-05 21:32:53 +08:00
Adhityaa Chandrasekar
b5808c6df9 scheduler_perf: remove implicit barrier at the end
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2021-02-03 12:49:28 +00:00
Kubernetes Prow Robot
6d43e2b3bb
Merge pull request #96834 from chendave/fix_race
Add performance benchmark for the preemption with volume
2020-12-15 07:13:49 -08:00
Dave Chen
ebcca92771 Add performance benchmark for the preemption with volume
This will help to reveal the potential issues when the
volume is in place.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-12-15 10:54:01 +08:00
Kubernetes Prow Robot
bd4d197b52
Merge pull request #96447 from chendave/bind_postfilter
Remove the deprecated metrics from scheduler
2020-12-14 06:31:28 -08:00
Dave Chen
5144e2ec78 Remove the deprecated metrics from scheduler
Deprecated metrics are removed and suggest to use the Histogram
metrics got from scheduler extension points.

Signed-off-by: Dave Chen <dave.chen@arm.com>
Co-authored-by: wawa0210 <xiaozhang0210@hotmail.com>
2020-12-14 11:31:50 +08:00
Kubernetes Prow Robot
65d57211e3
Merge pull request #97068 from chendave/selectors
Add constraint selector to pod template
2020-12-08 22:01:19 -08:00
Dave Chen
58142288a5 Add constraint selector to pod template
PodTopologySpread plugin will only count the existing pod when that
pod's label matches with `constraint.Selector`, which means all pods
could be scheduled to one topology zone when the constraint does not
have any selector defined.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-12-04 18:04:26 +08:00
Tim Hockin
4068402459 Change trivial topology labels
In these cases the actual label key is incidental.
2020-11-12 11:21:37 -08:00
Wei Huang
267acdbe81
Update docs and fix redundant logic of scheduler perf 2020-11-06 23:45:09 -08:00
Tim Hockin
819ff9b087
Use topology labels instead of old beta names (#96033)
* Rename const for topology.../zone

* Rename const for topology.../region

* Rename const for failure-domain.../zone

* Rename const for failure-domain.../region

* Restore old names for compat
2020-11-05 20:26:50 -08:00
tangwz
518c502f54 scheduler_perf: use time.Ticker in throughput measurement 2020-09-19 09:36:17 +08:00
Adhityaa Chandrasekar
71bc9ce9c2 scheduler_perf: refactor to allow arbitrary workloads
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2020-09-17 19:22:20 +00:00
Adhityaa Chandrasekar
59a6e86c0c scheduler_perf: label nodes for pod affinity
3 commits:
  label nodes for pod affinity
  use topology.kubernetes.io/zone label
  use 3 zones in pod affinity rules
2020-08-10 16:07:46 +00:00
hasheddan
a70b78e49a
Skip failing scheduler integration tests until they can be run successfully
The IPAM and scheduler performance tests are currently causing
integration-master job to fail because of timeouts. They were not
previously running as part of integration-master, so we can disable them
without loss of test coverage. They should be re-enabled as part of fix
for #93112.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
2020-07-15 10:28:35 -05:00
Dave Chen
ae735a1189 scheduler_perf: fix the nil pointer dereference
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-06-16 13:23:05 +08:00
Abdullah Gharaibeh
36a3ad8752 Added a benchmark to evaluate overhead of unschedulable pods 2020-06-05 15:19:07 -04:00
Abdullah Gharaibeh
d650b57141 Added Preemption benchmark 2020-05-28 14:05:52 -04:00
Davanum Srinivas
07d88617e5
Run hack/update-vendor.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:33 -04:00
Davanum Srinivas
442a69c3bd
switch over k/k to use klog v2
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2020-05-16 07:54:27 -04:00
Kubernetes Prow Robot
d73907fc25
Merge pull request #89380 from alculquicondor/perf-spreading
Add perf test case for Topology Spreading
2020-03-28 02:43:52 -07:00
Dave Chen
49283364bf Decouple yaml based integration test from legacy test
- Move utilities or constants out so that both of them should be able
to run independently.
- Rename the legacy test so that it can eventually be deleted when the
perf dash changes is done
2020-03-27 08:45:59 +08:00
Aldo Culquicondor
9b2ff544ed Fix pod affinity performance test configuration
Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-26 15:52:44 -04:00
Aldo Culquicondor
671cd33986 Add perf test cases for topology spreading
Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-25 10:06:09 -04:00
Aldo Culquicondor
c9314dde59 Add support for multiple label values in test cases
They are assigned in a round robin fashion

Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-25 10:06:09 -04:00
Kubernetes Prow Robot
7e7c4d1021
Merge pull request #89272 from alculquicondor/perf-mixed-pods
Add multiple init pods to scheduler perf test cases
2020-03-25 00:29:02 -07:00
Aldo Culquicondor
e902e70d0d Use sqrt(n) chunk size in pod affinity and core scheduler 2020-03-24 10:29:59 -04:00
Kubernetes Prow Robot
2ab6357df0
Merge pull request #88528 from ingvagabund/doc-how-to-extend-scheduler-perf-tests
[doc] scheduler_perf: describe suite configuration in more detail
2020-03-23 17:10:47 -07:00
Aldo Culquicondor
5adc4c41e3 Add multiple init pods to perf test cases
Add test case with several init pods with affinity or antiaffinity.

Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-23 14:55:12 -04:00
Aldo Culquicondor
0e66e56e70 Use b.Fatal instead of klog.Fatal in scheduler perf tests
The test tool doesn't work properly with klog.Fatal

Signed-off-by: Aldo Culquicondor <acondor@google.com>
2020-03-23 14:53:15 -04:00
Dave Chen
eedfb593a9 Respect flags of testing package
`go test -c -o "perf.test"`
`./perf.test --help` doesn't understand "help" flag without
calling `flag.Parse` explicitly.
2020-03-16 10:24:15 +08:00
Jan Chaloupka
b09676921c [doc] scheduler_perf: describe suite configuration in more detail
The configuration file was design as a yaml file on purpose.
To easily extend the test cases without a need to modify
the testing binary. Also, it's possible to extend the configuration
itself to enrich individual test cases.
2020-03-05 11:42:05 +01:00