Patrick Ohly
aa73f06e56
scheduler perf: allow creating 0 items
...
It makes sense to define a test where, depending on the parameters, some
operation creates zero pods, namespaces or nodes. The validation previously
rejected that because of how it was implemented, although the underlying
code works fine with zero as count.
2023-04-11 09:59:16 +02:00
Patrick Ohly
49bbf7c268
scheduler_perf: fix race condition
...
collector.collect got called without ensuring that collector.run had
terminated, so it could have happened that collector.run adds another sample
while collector.collect is reading them.
2023-04-11 09:46:34 +02:00
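The race described in 49bbf7c268 can be illustrated with a minimal sketch. The names below (sampleCollector, runThenCollect) are hypothetical and are not the scheduler_perf types; the sketch only assumes the pattern from the commit message: a run loop that appends samples in a goroutine, and a collect call that must not start before run has returned.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// sampleCollector is a hypothetical stand-in for a metrics collector.
type sampleCollector struct {
	samples []int
}

// run appends one sample per tick until ctx is cancelled.
func (c *sampleCollector) run(ctx context.Context) {
	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			c.samples = append(c.samples, len(c.samples))
		}
	}
}

// collect returns the samples. It has no locking of its own, so it is only
// safe to call once run has terminated.
func (c *sampleCollector) collect() []int {
	return c.samples
}

// runThenCollect samples for roughly ms milliseconds, stops run, waits for it
// to terminate and only then reads the samples.
func runThenCollect(ms int) []int {
	c := &sampleCollector{}
	ctx, cancel := context.WithCancel(context.Background())

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		c.run(ctx)
	}()

	time.Sleep(time.Duration(ms) * time.Millisecond)
	cancel()
	// Cancelling alone is not enough: without this wait, run could still be
	// appending a sample while collect reads the slice.
	wg.Wait()
	return c.collect()
}

func main() {
	samples := runThenCollect(20)
	fmt.Println("collected", len(samples), "samples")
}
```

Running a variant without the wg.Wait() call under "go run -race" is one way to make this kind of race visible.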
Patrick Ohly
a869a89825
scheduler perf: remove cleanup func
...
b.Cleanup may as well get called inside the function instead
of leaving that to the caller.
2023-04-11 09:43:45 +02:00
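The idea in a869a89825 can be sketched as follows, assuming a hypothetical setup helper (startServer) and a hypothetical counter (liveServers) that are not part of scheduler_perf. The helper registers its own teardown through Cleanup of testing.TB, so the caller has no cleanup function to remember. The sketch uses testing.Init and testing.Benchmark so that it also runs outside of "go test".

```go
package main

import (
	"fmt"
	"testing"
)

// liveServers counts hypothetical servers that were started but not stopped.
var liveServers int

// startServer is a hypothetical setup helper. It registers its own teardown
// with tb.Cleanup instead of returning a cleanup function to the caller.
func startServer(tb testing.TB) {
	tb.Helper()
	liveServers++
	tb.Cleanup(func() { liveServers-- })
}

// leakedServers runs a trivial benchmark that calls startServer and reports
// how many servers are still counted as live afterwards.
func leakedServers() int {
	// Init is needed when Benchmark is called without "go test";
	// it has no effect if the flags were already registered.
	testing.Init()
	testing.Benchmark(func(b *testing.B) {
		startServer(b) // no cleanup function to call or defer here
		for i := 0; i < b.N; i++ {
			_ = i
		}
	})
	return liveServers
}

func main() {
	fmt.Println("servers still running:", leakedServers())
}
```

The design choice is that setup and teardown stay in one place: a caller cannot forget the teardown, and the testing package runs it when the benchmark function finishes.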
sarab
8d18ae6fc2
Use the generic Set in scheduler
2023-04-09 11:34:17 +05:30
Wei Huang
c9bc2f98d0
fix: remove SchedulingMigratedInTreePVs feature gate in sched perf test
2023-03-08 08:34:44 -08:00
Patrick Ohly
cc4bcd1d8e
scheduler_perf: report data items as benchmark results
...
This replaces the pretty useless us/op metric (useless because it includes
setup and teardown times) with the same values that also get stored in the JSON
file.
The main advantage is that benchstat can be used to analyze and compare
results.
2023-02-28 23:08:23 +01:00
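A minimal sketch of the mechanism that cc4bcd1d8e relies on, assuming the standard ReportMetric method of testing.B. The measured value and the helper names (measureThroughput, benchmarkWithDataItems) are placeholders, not scheduler_perf code. According to the Go testing documentation, reporting 0 for the unit "ns/op" suppresses the built-in per-operation time.

```go
package main

import (
	"fmt"
	"testing"
)

// measureThroughput is a hypothetical stand-in for a real measurement.
func measureThroughput() float64 {
	return 35.5
}

// benchmarkWithDataItems runs a trivial benchmark that reports a custom
// metric and hides the default per-operation time.
func benchmarkWithDataItems() testing.BenchmarkResult {
	// Init is needed when Benchmark is called without "go test".
	testing.Init()
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = i
		}
		// Custom metrics show up in the benchmark output as "<value> <unit>",
		// which is the format that benchstat parses.
		b.ReportMetric(measureThroughput(), "pods/s")
		// Suppress the built-in time per operation.
		b.ReportMetric(0, "ns/op")
	})
}

func main() {
	res := benchmarkWithDataItems()
	fmt.Println("pods/s:", res.Extra["pods/s"])
}
```

With output in this shape, two runs saved to files can be compared with benchstat, for example "benchstat old.txt new.txt" (the file names are placeholders).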
Patrick Ohly
961129c5f1
scheduler_perf: add logging flags
...
This enables testing of different real production configurations (JSON
vs. text, different log levels, contextual logging).
2023-02-28 23:08:17 +01:00
Kante Yin
3d0894fabf
Fix failure (context canceled) in scheduler_perf benchmark (#114843)
...
* Fix failure in scheduler_perf benchmark
Signed-off-by: Kante Yin <kerthcet@gmail.com>
* Fatal when error in cleaning up nodes in scheduler perf tests
Signed-off-by: Kante Yin <kerthcet@gmail.com>
* Use derived context to better organize the code
Signed-off-by: Kante Yin <kerthcet@gmail.com>
* Change log level to 2 in scheduler perf-test
Signed-off-by: Kante Yin <kerthcet@gmail.com>
---------
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-01-30 16:21:00 -08:00
Kensei Nakada
e8092cc885
cleanup(scheduler_perf): remove all removed feature gates
2023-01-04 01:07:47 +00:00
Patrick Ohly
2f6c4f5eab
e2e: use Ginkgo context
...
All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
2022-12-16 20:14:04 +01:00
Mark Rossetti
534f052a8d
Updating pause image references to 3.9
...
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2022-11-14 10:24:54 -08:00
kerthcet
d6ffb47832
Replace klog with benchmark log in scheduler_perf
...
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-11-09 09:11:55 +08:00
Kubernetes Prow Robot
73f6b96f0a
Merge pull request #113615 from kerthcet/feat/add-benchmark-tests
...
Add nodeInclusionPolicy benchmark tests to scheduler_perf
2022-11-07 09:18:28 -08:00
kerthcet
bc15aca26d
Refactor SchedulerConfigFile
...
Rename to SchedulerConfigPath and make it a pointer
to be consistent with other fields
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-11-05 00:30:34 +08:00
kerthcet
48f2c9ec20
Add benchmark tests for nodeInclusionPolicy
...
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-11-05 00:13:43 +08:00
kerthcet
cfc53ee524
Refactor code and annotations for readability
...
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-11-01 17:44:45 +08:00
kerthcet
21e8a69a22
Use operationCode instead of string directly
...
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-11-01 17:01:22 +08:00
Patrick Ohly
41619ace15
stop using deprecated klog flags
...
Some scripts and tools still relied on the deprecated flags, the ones
which are about to be removed.
This is intentionally not a complete removal of all those flags in the entire
repo. This would lead to much more code churn also in places where commands
still accept the flags because they use klog directly.
2022-09-04 21:02:43 +02:00
Davanum Srinivas
a9593d634c
Generate and format files
...
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-07-26 13:14:05 -04:00
Wojciech Tyczyński
5b042f0bf4
Remove RunAnAPIServer from integration tests
2022-07-25 17:52:31 +02:00
Mark Rossetti
40f3e624a6
Switching everything to use pause:3.8
...
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2022-07-21 14:53:15 -07:00
Kensei Nakada
b0d47cb380
scheduler_perf: allow users to specify default pod and node specs (#101799)
...
* scheduler_perf: default pod and node spec
* Fix: un-support DefaultNodeTemplatePath
2022-06-29 11:44:07 -07:00
Davanum Srinivas
50bea1dad8
Move from k8s.gcr.io to registry.k8s.io
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-05-31 10:16:53 -04:00
Kubernetes Prow Robot
570f1092f4
Merge pull request #109542 from sanposhiho/fix-test-case-scheduler-perf
...
scheduler_perf: Remove test cases for Preemption which always fail
2022-05-07 03:33:29 -07:00
Kubernetes Prow Robot
71df3e819b
Merge pull request #109545 from sanposhiho/fix-nun-on-scheduler_perf
...
Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN
2022-05-05 11:53:45 -07:00
Kensei Nakada
4af3c5efeb
Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN
2022-05-05 16:11:22 +00:00
Kubernetes Prow Robot
3bef1692ef
Merge pull request #109696 from Huang-Wei/rm-sched-perf-legacy
...
Cleanup legacy scheduler perf tests
2022-05-04 02:35:43 -07:00
Kubernetes Prow Robot
629706e0fe
Merge pull request #109546 from sanposhiho/replace-metrics
...
Replace scheduler_e2e_scheduling_duration_seconds with scheduler_scheduling_attempt_duration_seconds in scheduler_perf
2022-05-04 01:29:22 -07:00
sanposhiho
1c2c20e6bd
Change test cases for Preemption to create fewer Pods
2022-05-04 07:47:46 +00:00
Kubernetes Prow Robot
f0cd3725d3
Merge pull request #101835 from sanposhiho/scheduler_perf/feature/op-sleep
...
scheduler_perf: create sleep operation
2022-05-03 17:17:11 -07:00
Wei Huang
846ebf7814
Cleanup legacy scheduler perf tests
2022-04-27 09:57:17 -07:00
sanposhiho
b7b94b6b39
scheduler_perf: create sleep operation
2022-04-25 23:02:09 +00:00
sanposhiho
6e0da69632
Replace scheduler_e2e_scheduling_duration_seconds with scheduler_scheduling_attempt_duration_seconds in scheduler_perf
2022-04-20 00:48:12 +09:00
Davanum Srinivas
f7ad09c447
Switch to pause 3.7
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2022-03-29 15:36:38 -04:00
Kubernetes Prow Robot
546e4fa1ef
Merge pull request #107771 from sanposhiho/fix-tiny
...
make scheduler_perf stable
2022-03-04 17:22:52 -08:00
sanposhiho
4c3a1000c7
fix by gofmt
2022-02-25 00:23:01 +09:00
sanposhiho
1080c2d717
Make scheduler_perf stable
2022-02-24 01:29:38 +09:00
Kubernetes Prow Robot
21c0f6f6ff
Merge pull request #107677 from pohly/scheduler-integration-benchmark
...
scheduler integration benchmark improvements
2022-02-14 01:23:28 -08:00
Patrick Ohly
e1e84c8e5f
scheduler_perf: run with -v=0 by default
...
This provides a mechanism for overriding the forced increase of the klog
verbosity to 4 when starting the apiserver and uses that for the scheduler_perf
benchmark. Other tests run as before.
A global variable was used because adding an explicit parameter to several
helper functions would have caused a lot of code churn (test ->
integration/util.StartApiserver ->
integration/framework.RunAnAPIServerUsingServer ->
integration/framework.startAPIServerOrDie).
2022-02-11 16:58:33 +01:00
Patrick Ohly
c62d7407c8
scheduler_perf: dump test data when writing it failed
...
Occasionally, writing as JSON failed because a NaN float couldn't be
encoded. The extended log message helps understand where that comes from, for
example:
F0120 20:24:45.515745 511835 scheduler_perf_test.go:540] BenchmarkPerfScheduling: unable to write measured data {Version:v1 DataItems:[{Data:map[Average:35.714285714285715 Perc50:2 Perc90:36 Perc95:412 Perc99:412] Unit:pods/s Labels:map[Metric:SchedulingThroughput Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:27.863967530999993 Perc50:13.925925925925926 Perc90:30.06711409395973 Perc95:31.85682326621924 Perc99:704] Unit:ms Labels:map[Metric:scheduler_e2e_scheduling_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:11915.651577744 Perc50:15168.796680497926 Perc90:19417.759336099585 Perc95:19948.87966804979 Perc99:20373.77593360996] Unit:ms Labels:map[Metric:scheduler_pod_scheduling_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2]} {Data:map[Average:1.1865832049999983 Perc50:0.7636363636363637 Perc90:2.891903719912473 Perc95:3.066958424507659 Perc99:5.333333333333334] Unit:ms Labels:map[Metric:scheduler_framework_extension_point_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2 extension_point:Filter]} {Data:map[Average:NaN Perc50:NaN Perc90:NaN Perc95:NaN Perc99:NaN] Unit:ms Labels:map[Metric:scheduler_framework_extension_point_duration_seconds Name:BenchmarkPerfScheduling/PreemptionPVs/500Nodes/namespace-2 extension_point:Score]}]}: json: unsupported value: NaN
2022-02-07 08:59:19 +01:00
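The failure quoted in c62d7407c8 comes from encoding/json, which refuses to encode NaN. The sketch below reproduces the error and shows one way to avoid it by dropping items that contain NaN, in the spirit of the "Skip adding data" change listed further up. The dataItem type and the helper names are simplified assumptions, not the actual scheduler_perf types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// dataItem is a simplified, hypothetical version of a measured data item.
type dataItem struct {
	Data map[string]float64 `json:"data"`
	Unit string             `json:"unit"`
}

// hasNaN reports whether any value of the item is NaN.
func hasNaN(item dataItem) bool {
	for _, v := range item.Data {
		if math.IsNaN(v) {
			return true
		}
	}
	return false
}

// encodeSkippingNaN encodes all items without NaN values and returns the
// JSON together with the number of skipped items.
func encodeSkippingNaN(items []dataItem) ([]byte, int, error) {
	kept := make([]dataItem, 0, len(items))
	for _, item := range items {
		if hasNaN(item) {
			continue
		}
		kept = append(kept, item)
	}
	out, err := json.Marshal(kept)
	return out, len(items) - len(kept), err
}

// sampleItems returns one encodable item and one item with NaN values.
func sampleItems() []dataItem {
	return []dataItem{
		{Data: map[string]float64{"Average": 35.714285714285715, "Perc50": 2}, Unit: "pods/s"},
		{Data: map[string]float64{"Average": math.NaN(), "Perc50": math.NaN()}, Unit: "ms"},
	}
}

func main() {
	items := sampleItems()
	// Encoding the unfiltered items fails because of the NaN values.
	if _, err := json.Marshal(items); err != nil {
		fmt.Println("unfiltered:", err)
	}
	out, skipped, err := encodeSkippingNaN(items)
	fmt.Println("filtered:", string(out), "skipped:", skipped, "err:", err)
}
```

Skipping silently hides the fact that a metric had no samples, so logging the skipped items, as this commit does for the failing case, remains useful.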
Patrick Ohly
8d44b819b3
scheduler_perf: avoid ambiguous test names
...
"-bench=PerfScheduling/Preemption/500Nodes" ran both the
PerfScheduling/Preemption/500Nodes and the
PerfScheduling/PreemptionPVs/500Nodes benchmark.
This can be avoided by choosing names where none is the prefix of another.
2022-02-07 08:59:19 +01:00
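The ambiguity described in 8d44b819b3 follows from how "go test -bench" interprets its argument: the pattern is split at slashes and each element is used as an unanchored regular expression against the corresponding element of the benchmark name. The function below (matchesBench, a hypothetical name) is a simplified approximation of that matching for illustration only, not the implementation in the testing package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchesBench approximates the selection done by -bench: every
// slash-separated pattern element must match the name element at the same
// position, and the match is not anchored.
func matchesBench(pattern, name string) bool {
	patterns := strings.Split(pattern, "/")
	elems := strings.Split(name, "/")
	if len(elems) < len(patterns) {
		return false
	}
	for i, p := range patterns {
		ok, err := regexp.MatchString(p, elems[i])
		if err != nil || !ok {
			return false
		}
	}
	return true
}

func main() {
	names := []string{
		"BenchmarkPerfScheduling/Preemption/500Nodes",
		"BenchmarkPerfScheduling/PreemptionPVs/500Nodes",
	}
	for _, pattern := range []string{
		"PerfScheduling/Preemption/500Nodes",   // unanchored: selects both
		"PerfScheduling/^Preemption$/500Nodes", // anchored: selects only one
	} {
		for _, name := range names {
			fmt.Printf("%-40s %-50s %v\n", pattern, name, matchesBench(pattern, name))
		}
	}
}
```

Anchoring each element with ^ and $ is an alternative for callers; choosing names where none is a prefix of another, as this commit does, removes the need for it.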
Patrick Ohly
259a8ad0b7
test: allow controlling etcd log level
...
When running an integration test that measures performance, like for example
test/integration/scheduler_perf, running etcd with debug level output is
undesirable because it creates additional load on the system and isn't
realistic.
The default is still "debug", but ETCD_LOGLEVEL=warn can be used to override
that.
2022-02-07 08:59:19 +01:00
ahrtr
fe95aa614c
io/ioutil has already been deprecated in golang 1.16, so replace all ioutil with io and os
2022-02-03 05:32:12 +08:00
sanposhiho
d8840405e2
Create namespace for Pod to avoid namespace not-found error logs
2022-01-26 00:39:12 +09:00
Kubernetes Prow Robot
ca4af7a981
Merge pull request #104716 from sanposhiho/feature/scheduler_perf/unused-template-params
...
test/integration/scheduler_perf: check for unused template parameters
2022-01-10 16:21:16 -08:00
Davanum Srinivas
9405e9b55e
Check in OWNERS modified by update-yamlfmt.sh
...
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2021-12-09 21:31:26 -05:00
Hanna Lee
07a883d8e6
Remove //lint:ignore pragmas that aren't being used anymore
2021-11-17 08:56:54 +01:00
Hanna Lee
c8fde197f5
Add more //nolint:staticcheck for failures caught in PR tests
2021-11-17 08:56:02 +01:00
kerthcet
75a255d2ed
remove scheduler component config v1beta1
...
Signed-off-by: kerthcet <kerthcet@gmail.com>
2021-09-28 13:13:17 +08:00
Kubernetes Prow Robot
86d23cf441
Merge pull request #105206 from pohly/test-integration-help
...
test/integration: skip etcd startup for -help flag
2021-09-24 10:29:23 -07:00
Patrick Ohly
81b4a695b3
test/integration: skip etcd startup for -help flag
...
By parsing flags in the test's main function before starting etcd we bail out
early without ever starting etcd when the test was invoked with -help.
Otherwise etcd must be available, gets started and then hangs because
flag.Parse itself exits when called by testing.go. This bypasses the code in
EtcdMain which normally stops etcd.
2021-09-24 11:51:58 +02:00
Kubernetes Prow Robot
857d4c107c
Merge pull request #104808 from chendave/indent
...
Format json file with proper indentation
2021-09-21 19:14:00 -07:00
sanposhiho
1318f74609
Fix: use Fatalf and list all unused params in one error
2021-09-09 07:34:30 +09:00
sanposhiho
6bf6e424a1
Fix: rename getParams→get
2021-09-09 07:11:36 +09:00
sanposhiho
24643c67d5
Fix: make struct un-exported
2021-09-09 07:10:37 +09:00
Dave Chen
dda8090037
Format json file with proper indentation
...
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-09-07 16:14:34 +08:00
sanposhiho
cc846c9d33
Feature: check for unused template parameters
2021-09-02 01:46:34 +09:00
Claudiu Belu
18936d4785
updates pause image references
...
The pause:3.6 image has been published.
Also updates older / incorrect references.
2021-08-29 21:50:05 -07:00
Dave Chen
63b4710f38
Don't expose struct from prometheus client library
2021-08-27 22:21:24 +08:00
Dave Chen
58ab18bc1e
Add the metric data for different extension points
...
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-08-23 13:43:48 +08:00
Kubernetes Prow Robot
4ab9c950d9
Merge pull request #102007 from vaibhav2107/perf-config
...
Fix the typo in values of pods in performance-config.yaml
2021-08-12 13:59:50 -07:00
Wei Huang
55765f1b49
sched: support HistogramVec in scheduler performance test
2021-07-26 20:27:37 -07:00
Mengjiao Liu
4eab19ae7d
Clean up the master term in test/integration comments
2021-06-18 16:31:05 +08:00
Sascha Grunert
b167fc24d7
Update pause image to v3.5
...
Update dependencies and the test images to use pause 3.5. We also
provide a changelog entry for the new container image version.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2021-05-25 09:04:46 +02:00
vaibhav
a1e56b4f6d
Fix the typo in values of pods in performance-config.yaml
2021-05-14 17:16:48 +05:30
Abdullah Gharaibeh
6988653457
Added benchmarks for pod affinity namespaceselector
2021-04-23 14:14:38 -04:00
Kubernetes Prow Robot
6d130d3b97
Merge pull request #100557 from chendave/validation_cleanup
...
Validate plugin config for KubeSchedulerConfiguration
2021-04-14 18:20:01 -07:00
Dave Chen
c6e65079c7
Validate plugin config for KubeSchedulerConfiguration
...
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-04-14 09:30:20 +08:00
Kubernetes Prow Robot
ed3e0d302f
Merge pull request #100644 from Huang-Wei/sched-fwk-config
...
Surface kube config in scheduler framework handle
2021-04-12 19:12:49 -07:00
Nicolas Mitchell
0e994e9481
return error with non-unique workload name in scheduler_perf_test
2021-04-06 10:24:04 -04:00
Nicolas Mitchell
338b06fb69
validate test/workload names in validateTestCases
2021-04-04 14:18:39 -04:00
Wei Huang
e7f67b1a63
Surface kube config in scheduler framework handle
2021-03-30 11:54:59 -07:00
Kubernetes Prow Robot
c78b5497ae
Merge pull request #99638 from chendave/perf_config
...
Enable scheduler_perf to support scheduler config file
2021-03-16 14:49:03 -07:00
Dave Chen
d50c0aeb5f
Enable scheduler_perf to support scheduler config file
...
Signed-off-by: Dave Chen <dave.chen@arm.com>
2021-03-16 23:13:40 +08:00
Wei Huang
68ff3168b8
sched: fix a bug that literal 'p99' is mapped to 95th-percentile
2021-03-12 12:03:12 -08:00
Wei Huang
b93b4a2c96
sched: fix a bug that metrics of init or collected pods are re-collected
2021-03-11 10:28:39 -08:00
Kubernetes Prow Robot
823fa75643
Merge pull request #98900 from Huang-Wei/churn-cluster-op
...
Introduce a churnOp to scheduler perf testing framework
2021-03-11 02:00:24 -08:00
Kubernetes Prow Robot
23af91b293
Merge pull request #97779 from tiloso/staticcheck-test-integration-gs
...
Fix staticcheck in test/integration/{garbagecollector,scheduler_perf}
2021-03-10 16:04:23 -08:00
Kubernetes Prow Robot
841cb4adc4
Merge pull request #99844 from minbaev/scheduler-test-perf-optimization
...
add if check for number of scheduled pods to be greater than 0
2021-03-08 20:47:37 -08:00
David Eads
b8194cf77c
switch most e2e tests to storage/v1 over v1beta1
2021-03-08 13:04:24 -05:00
Alexander Minbaev
359116f525
add if check for number of scheduled pods to be greater than 0
2021-03-05 09:05:42 -06:00
Kubernetes Prow Robot
0d4924e371
Merge pull request #99439 from minbaev/fix-typos
...
Fix typo in util.go
2021-03-04 11:00:22 -08:00
Wei Huang
1e5878b910
Introduce a churnOp to scheduler perf testing framework
...
- support two modes: recreate and create
- use DynamicClient to create API objects
2021-03-03 06:51:53 -08:00
Benjamin Elder
56e092e382
hack/update-bazel.sh
2021-02-28 15:17:29 -08:00
Alexander Minbaev
5b73122105
Fix typo in util.go
2021-02-25 00:54:37 -06:00
Wei Huang
983272ce6a
sched: create dataItemsDir during a performance test if not exist
2021-02-17 12:44:16 -08:00
tiloso
e1ceac0783
Fix staticcheck in test/integration/{scheduler_perf,garbagecollector}
2021-02-10 10:55:09 +01:00
Kubernetes Prow Robot
2b7c61b1bb
Merge pull request #98205 from pacoxu/build/pauses
...
update pause image to 3.4.1 and also update the change log
2021-02-08 18:20:58 -08:00
pacoxu
0c152cbbbe
update pause to 3.4.1 for tests(e2e)
...
Signed-off-by: pacoxu <paco.xu@daocloud.io>
2021-02-05 21:32:53 +08:00
Adhityaa Chandrasekar
b5808c6df9
scheduler_perf: remove implicit barrier at the end
...
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2021-02-03 12:49:28 +00:00
Kubernetes Prow Robot
6d43e2b3bb
Merge pull request #96834 from chendave/fix_race
...
Add performance benchmark for the preemption with volume
2020-12-15 07:13:49 -08:00
Dave Chen
ebcca92771
Add performance benchmark for the preemption with volume
...
This will help to reveal the potential issues when the
volume is in place.
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-12-15 10:54:01 +08:00
Kubernetes Prow Robot
bd4d197b52
Merge pull request #96447 from chendave/bind_postfilter
...
Remove the deprecated metrics from scheduler
2020-12-14 06:31:28 -08:00
Dave Chen
5144e2ec78
Remove the deprecated metrics from scheduler
...
Deprecated metrics are removed; the suggested replacement is the histogram
metrics from the scheduler extension points.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Co-authored-by: wawa0210 <xiaozhang0210@hotmail.com>
2020-12-14 11:31:50 +08:00
Kubernetes Prow Robot
65d57211e3
Merge pull request #97068 from chendave/selectors
...
Add constraint selector to pod template
2020-12-08 22:01:19 -08:00
Dave Chen
58142288a5
Add constraint selector to pod template
...
PodTopologySpread plugin will only count the existing pod when that
pod's label matches with `constraint.Selector`, which means all pods
could be scheduled to one topology zone when the constraint does not
have any selector defined.
Signed-off-by: Dave Chen <dave.chen@arm.com>
2020-12-04 18:04:26 +08:00
Tim Hockin
4068402459
Change trivial topology labels
...
In these cases the actual label key is incidental.
2020-11-12 11:21:37 -08:00
Wei Huang
267acdbe81
Update docs and fix redundant logic of scheduler perf
2020-11-06 23:45:09 -08:00
Tim Hockin
819ff9b087
Use topology labels instead of old beta names (#96033)
...
* Rename const for topology.../zone
* Rename const for topology.../region
* Rename const for failure-domain.../zone
* Rename const for failure-domain.../region
* Restore old names for compat
2020-11-05 20:26:50 -08:00
tangwz
518c502f54
scheduler_perf: use time.Ticker in throughput measurement
2020-09-19 09:36:17 +08:00