kubernetes

Author	SHA1	Message	Date
Patrick Ohly	6b01ece580	scheduler-perf: fix perfdash display problem perfdash expects all data items to have the same set of labels. It then renders drop-down buttons for each label with all values found for each label. Previously, data items that didn't have a label didn't match any label filter in perfdash and couldn't get selected because perfdash doesn't have "unset" in its drop-down menus. To avoid that, scheduler-perf now collects all labels and then adds missing labels with "not applicable" as value: { "data": { "Average": 939.7071223010004, "Perc50": 927.7987421383649, "Perc90": 2166.153846153846, "Perc95": 2363.076923076923, "Perc99": 2520.6153846153848 }, "unit": "ms", "labels": { "Metric": "scheduler_pod_scheduling_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "not applicable", "result": "not applicable" } }, ... { "data": { "Average": 1.1172570650000004, "Perc50": 1.1418367346938776, "Perc90": 1.5500000000000003, "Perc95": 1.6410256410256412, "Perc99": 3.7333333333333334 }, "unit": "ms", "labels": { "Metric": "scheduler_framework_extension_point_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "Score", "result": "not applicable" } },	2023-07-03 21:16:53 +02:00
Patrick Ohly	cecebe8ea2	scheduler_perf: add TestScheduling integration test This runs workloads that are labeled as "integration-test". The apiserver and scheduler are only started once per unique configuration, followed by each workload using that configuration. This makes execution faster. In contrast to benchmarking, we care less about starting with a clean slate for each test.	2023-06-28 09:22:25 +02:00
Patrick Ohly	dfd646e0a8	scheduler_perf: fix namespace deletion Merely deleting the namespace is not enough: - Workloads might rely on the garbage collector to get rid of obsolete objects, so we should run it to be on the safe side. - Pods must be force-deleted because kubelet is not running. - Finally, the namespace controller is needed to get rid of deleted namespaces.	2023-06-28 09:22:25 +02:00
kerthcet	0616d15712	Fix perf-test by increasing the error margin Signed-off-by: kerthcet <kerthcet@gmail.com>	2023-05-17 12:14:06 +08:00
Kubernetes Prow Robot	8b33eaa0a7	Merge pull request #116207 from pohly/dra-scheduler-perf scheduler_perf: dynamic resource allocation test cases	2023-05-10 10:58:59 -07:00
Patrick Ohly	034528a9f0	scheduler perf: add DynamicResourceAllocation test cases The default scheduler configuration must be based on the v1 API where the plugin is enabled by default. Then if (and only if) the DynamicResourceAllocation feature gate for a test is set, the corresponding API group also gets enabled. The normal dynamic resource claim controller is started if needed to create ResourceClaims from ResourceClaimTemplates. Without the upcoming optimizations in the scheduler, scheduling with dynamic resources is fairly slow. The new test cases take around 15 minutes wall clock time on my desktop.	2023-05-04 13:08:06 +02:00
Kante Yin	a7035f5459	Pass Context to StartTestServer Signed-off-by: Kante Yin <kerthcet@gmail.com>	2023-05-04 10:25:09 +08:00
Kubernetes Prow Robot	47f1bd9f80	Merge pull request #117649 from SataQiu/scheduler-remove-v1beta2-20230427 scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration component config	2023-05-03 09:54:41 -07:00
Kubernetes Prow Robot	aece6838e8	Merge pull request #117232 from pohly/scheduler-perf-code-cleanups scheduler_perf: code cleanups	2023-05-03 09:54:13 -07:00
SataQiu	1f7c07f355	scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration	2023-05-03 21:43:19 +08:00
Patrick Ohly	b3e0bc8864	scheduler_perf: let the test decide which informers are needed This will change when adding dynamic resource allocation test cases. Instead of changing mustSetupScheduler and StartScheduler for that, let's return the informer factory and create informers as needed in the test.	2023-04-27 15:31:40 +02:00
Patrick Ohly	78b8af9fed	scheduler_perf: update throughputCollector The previous solution had some shortcomings: - It was based on the assumption that the goroutine gets woken up at regular intervals. This is not actually guaranteed. Now the code keeps track of the actual start and end of an interval and verifies that assumption. - If no pod was scheduled (unlikely, but could happen), then "0 pods/s" got recorded. In such a case, the metric was always either zero or >= 1. A better solution is to extend the interval until some pod gets scheduled. With the larger time interval it is then possible to also track, for example, 0.5 pods/s.	2023-04-26 08:11:50 +02:00
Kubernetes Prow Robot	2bfaaf21c1	Merge pull request #117197 from pohly/scheduler-perf-cleanup scheduler perf: remove cleanup func	2023-04-11 21:17:57 -07:00
Patrick Ohly	a869a89825	scheduler perf: remove cleanup func b.Cleanup may as well get called inside the function instead of leaving that to the caller.	2023-04-11 09:43:45 +02:00
sarab	8d18ae6fc2	Use the generic Set in scheduler	2023-04-09 11:34:17 +05:30
Kante Yin	3d0894fabf	Fix failure(context canceled) in scheduler_perf benchmark (#114843) * Fix failure in scheduler_perf benchmark Signed-off-by: Kante Yin <kerthcet@gmail.com> * Fatal when error in cleaning up nodes in scheduler perf tests Signed-off-by: Kante Yin <kerthcet@gmail.com> * Use derived context to better organize the codes Signed-off-by: Kante Yin <kerthcet@gmail.com> * Change log level to 2 in scheduler perf-test Signed-off-by: Kante Yin <kerthcet@gmail.com> --------- Signed-off-by: Kante Yin <kerthcet@gmail.com>	2023-01-30 16:21:00 -08:00
kerthcet	d6ffb47832	Replace klog with benchmark log in scheduler_perf Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-11-09 09:11:55 +08:00
Wojciech Tyczyński	5b042f0bf4	Remove RunAnAPIServer from integration tests	2022-07-25 17:52:31 +02:00
Kensei Nakada	4af3c5efeb	Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN	2022-05-05 16:11:22 +00:00
Kubernetes Prow Robot	21c0f6f6ff	Merge pull request #107677 from pohly/scheduler-integration-benchmark scheduler integration benchmark improvements	2022-02-14 01:23:28 -08:00
Patrick Ohly	e1e84c8e5f	scheduler_perf: run with -v=0 by default This provides a mechanism for overriding the forced increase of the klog verbosity to 4 when starting the apiserver and uses that for the scheduler_perf benchmark. Other tests run as before. A global variable was used because adding an explicit parameter to several helper functions would have caused a lot of code churn (test -> integration/util.StartApiserver -> integration/framework.RunAnAPIServerUsingServer -> integration/framework.startAPIServerOrDie).	2022-02-11 16:58:33 +01:00
ahrtr	fe95aa614c	io/ioutil has already been deprecated in golang 1.16, so replace all ioutil with io and os	2022-02-03 05:32:12 +08:00
kerthcet	75a255d2ed	remove scheduler component config v1beta1 Signed-off-by: kerthcet <kerthcet@gmail.com>	2021-09-28 13:13:17 +08:00
Dave Chen	dda8090037	Format json file with proper indentation Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-09-07 16:14:34 +08:00
Dave Chen	63b4710f38	Don't expose struct from prometheus client library	2021-08-27 22:21:24 +08:00
Dave Chen	58ab18bc1e	Add the metric data for different extension points Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-08-23 13:43:48 +08:00
Wei Huang	55765f1b49	sched: support HistogramVec in scheduler performance test	2021-07-26 20:27:37 -07:00
Mengjiao Liu	4eab19ae7d	Clean up the master term in test/integration comments	2021-06-18 16:31:05 +08:00
Wei Huang	e7f67b1a63	Surface kube config in scheduler framework handle	2021-03-30 11:54:59 -07:00
Kubernetes Prow Robot	c78b5497ae	Merge pull request #99638 from chendave/perf_config Enable scheduler_perf to support scheduler config file	2021-03-16 14:49:03 -07:00
Dave Chen	d50c0aeb5f	Enable scheduler_perf to support scheduler config file Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-03-16 23:13:40 +08:00
Wei Huang	68ff3168b8	sched: fix a bug that literal 'p99' is mapped to 95th-percentile	2021-03-12 12:03:12 -08:00
Wei Huang	b93b4a2c96	sched: fix a bug that metrics of init or collected pods are re-collected	2021-03-11 10:28:39 -08:00
Kubernetes Prow Robot	823fa75643	Merge pull request #98900 from Huang-Wei/churn-cluster-op Introduce a churnOp to scheduler perf testing framework	2021-03-11 02:00:24 -08:00
Alexander Minbaev	359116f525	add if check for number of scheduled pods to be greater than 0	2021-03-05 09:05:42 -06:00
Wei Huang	1e5878b910	Introduce a churnOp to scheduler perf testing framework - support two modes: recreate and create - use DynamicClient to create API objects	2021-03-03 06:51:53 -08:00
Alexander Minbaev	5b73122105	Fix typo in util.go	2021-02-25 00:54:37 -06:00
Wei Huang	983272ce6a	sched: create dataItemsDir during a performance test if not exist	2021-02-17 12:44:16 -08:00
tangwz	518c502f54	scheduler_perf: use time.Ticker in throughput measurement	2020-09-19 09:36:17 +08:00
Adhityaa Chandrasekar	71bc9ce9c2	scheduler_perf: refactor to allow arbitrary workloads Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>	2020-09-17 19:22:20 +00:00
Dave Chen	ae735a1189	scheduler_perf: fix the nil pointer dereference Signed-off-by: Dave Chen <dave.chen@arm.com>	2020-06-16 13:23:05 +08:00
Abdullah Gharaibeh	d650b57141	Added Preemption benchmark	2020-05-28 14:05:52 -04:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Dave Chen	49283364bf	Decouple yaml based integration test from legacy test - Move utilities or constants out so that both of them should be able to run independently. - Rename the legacy test so that it can eventually be deleted when the perf dash changes are done	2020-03-27 08:45:59 +08:00
Jan Chaloupka	5b3b4de972	scheduler_perf: do not override throughput labels Throughput labels are currently initialized with a "Name" label. So we need to append to the map instead of creating a new one.	2020-02-28 16:10:50 +01:00
Kubernetes Prow Robot	fe9073b8c1	Merge pull request #88318 from mborsz/bench Add BenchmarkSchedulingWaitForFirstConsumerPVs benchmark	2020-02-25 07:52:49 -08:00
Maciej Borsz	bd8ed0a2a7	Add BenchmarkSchedulingWaitForFirstConsumerPVs benchmark	2020-02-25 14:41:14 +01:00
Cong Liu	7f56c753b3	Make MetricCollector configurable for scheduler benchmark tests	2020-02-18 14:02:57 -08:00
Jan Chaloupka	7b5534021c	Collect some of scheduling metrics and scheduling throughput In addition to getting overall performance measurements from golang benchmark, collect metrics that provide information about the insides of the scheduler itself. This is a first step towards improving what we collect about the scheduler. Metrics in question: - scheduler_scheduling_algorithm_predicate_evaluation_seconds - scheduler_scheduling_algorithm_priority_evaluation_seconds - scheduler_binding_duration_seconds - scheduler_e2e_scheduling_duration_seconds Scheduling throughput is computed on the fly inside perfScheduling.	2020-02-13 13:32:09 +01:00
Mike Danese	38ecb30c58	Revert "Collect some of scheduling metrics and scheduling throughput"	2020-02-06 10:18:00 -08:00
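
The label handling described in commit 6b01ece580 (collect every label seen across all data items, then give each item the labels it lacks with "not applicable" as value) could look roughly like the sketch below. This is an illustration under stated assumptions, not the scheduler_perf code: the DataItem type, the fillMissingLabels function and the notApplicable constant are hypothetical names invented here. Only the "not applicable" value and the label names are taken from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// DataItem is a simplified, hypothetical stand-in for one perfdash data item.
type DataItem struct {
	Data   map[string]float64
	Unit   string
	Labels map[string]string
}

// notApplicable is the placeholder value named in the commit message.
const notApplicable = "not applicable"

// fillMissingLabels (hypothetical helper) first collects every label key
// used by any item, then adds the keys an item lacks with the placeholder
// value, so that all items end up with the same set of labels.
func fillMissingLabels(items []DataItem) {
	keys := map[string]struct{}{}
	for _, item := range items {
		for k := range item.Labels {
			keys[k] = struct{}{}
		}
	}
	for i := range items {
		if items[i].Labels == nil {
			items[i].Labels = map[string]string{}
		}
		for k := range keys {
			if _, ok := items[i].Labels[k]; !ok {
				items[i].Labels[k] = notApplicable
			}
		}
	}
}

func main() {
	items := []DataItem{
		{Unit: "ms", Labels: map[string]string{
			"Metric": "scheduler_pod_scheduling_duration_seconds",
		}},
		{Unit: "ms", Labels: map[string]string{
			"Metric":          "scheduler_framework_extension_point_duration_seconds",
			"extension_point": "Score",
		}},
	}
	fillMissingLabels(items)
	for _, item := range items {
		// Sort the keys so the printed order is stable.
		names := make([]string, 0, len(item.Labels))
		for k := range item.Labels {
			names = append(names, k)
		}
		sort.Strings(names)
		for _, k := range names {
			fmt.Printf("%s=%q ", k, item.Labels[k])
		}
		fmt.Println()
	}
}
```

Doing this in two passes keeps the helper independent of the order in which items were collected: a label that first appears on the last item is still back-filled on all earlier ones.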

99 Commits
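
Commit 4af3c5efeb skips NaN data to avoid the "json: unsupported value: NaN" failure. The sketch below shows the general idea with Go's standard encoding/json package, which rejects NaN floats when marshalling. The dropNaN helper and the sample values are assumptions made for illustration and are not taken from the scheduler_perf sources.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// dropNaN (hypothetical helper) returns a copy of data without NaN
// entries, so that the result can be passed to json.Marshal.
func dropNaN(data map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(data))
	for k, v := range data {
		if math.IsNaN(v) {
			continue // NaN cannot be represented in JSON
		}
		out[k] = v
	}
	return out
}

func main() {
	data := map[string]float64{"Average": 1.5, "Perc99": math.NaN()}

	// Marshalling the raw map fails because of the NaN entry.
	if _, err := json.Marshal(data); err != nil {
		fmt.Println("raw data:", err)
	}

	// After filtering, the remaining entries marshal normally.
	b, err := json.Marshal(dropNaN(data))
	fmt.Println(string(b), err)
}
```

Filtering before marshalling drops only the affected values, so the rest of a data item can still be written.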