kubernetes

Author	SHA1	Message	Date
kerthcet	1ffa1e17cd	Remove noisy log in scheduler_perf Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-06-12 11:53:35 +08:00
Kensei Nakada	c72b688e12	support `scheduler_plugin_execution_duration_seconds` in scheduler_perf	2024-04-27 08:22:53 +00:00
Patrick Ohly	c46ae1b26a	scheduler_perf: use ktesting.TContext + staging StartTestServer ktesting.TContext combines several different interfaces. This makes the code simpler because fewer parameters need to be passed around. An intentional side effect is that the apiextensions client interface becomes available, which makes it possible to use CRDs. This will be needed for future DRA tests. Support for CRDs depends on starting the apiserver via k8s.io/kubernetes/cmd/kube-apiserver/app/testing because only that enables the CRD extensions. As discussed on Slack, the long-term goal is to replace the in-tree StartTestServer with the one in staging, so this is going in the right direction.	2024-02-11 10:51:38 +01:00
Kensei Nakada	5310abe14a	make scheduler_perf usable from other repositories	2023-12-01 12:43:08 +00:00
Patrick Ohly	c74d045c4b	scheduler_perf: show name of one pending pod in error message If pods get stuck, then giving the name of one makes it possible to search for it in the log output. Without the name it's hard to figure out which pods got stuck.	2023-09-04 09:54:26 +02:00
Patrick Ohly	6b01ece580	scheduler-perf: fix perfdash display problem perfdash expects all data items to have the same set of labels. It then renders drop-down buttons for each label with all values found for each label. Previously, data items that didn't have a label didn't match any label filter in perfdash and couldn't get selected because perfdash doesn't have "unset" in its drop-down menus. To avoid that, scheduler-perf now collects all labels and then adds missing labels with "not applicable" as value: { "data": { "Average": 939.7071223010004, "Perc50": 927.7987421383649, "Perc90": 2166.153846153846, "Perc95": 2363.076923076923, "Perc99": 2520.6153846153848 }, "unit": "ms", "labels": { "Metric": "scheduler_pod_scheduling_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "not applicable", "result": "not applicable" } }, ... { "data": { "Average": 1.1172570650000004, "Perc50": 1.1418367346938776, "Perc90": 1.5500000000000003, "Perc95": 1.6410256410256412, "Perc99": 3.7333333333333334 }, "unit": "ms", "labels": { "Metric": "scheduler_framework_extension_point_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "Score", "result": "not applicable" } },	2023-07-03 21:16:53 +02:00
Patrick Ohly	cecebe8ea2	scheduler_perf: add TestScheduling integration test This runs workloads that are labeled as "integration-test". The apiserver and scheduler are only started once per unique configuration, followed by each workload using that configuration. This makes execution faster. In contrast to benchmarking, we care less about starting with a clean slate for each test.	2023-06-28 09:22:25 +02:00
Patrick Ohly	dfd646e0a8	scheduler_perf: fix namespace deletion Merely deleting the namespace is not enough: - Workloads might rely on the garbage collector to get rid of obsolete objects, so we should run it to be on the safe side. - Pods must be force-deleted because kubelet is not running. - Finally, the namespace controller is needed to get rid of deleted namespaces.	2023-06-28 09:22:25 +02:00
kerthcet	0616d15712	Fix perf-test by increasing the error margin Signed-off-by: kerthcet <kerthcet@gmail.com>	2023-05-17 12:14:06 +08:00
Kubernetes Prow Robot	8b33eaa0a7	Merge pull request #116207 from pohly/dra-scheduler-perf scheduler_perf: dynamic resource allocation test cases	2023-05-10 10:58:59 -07:00
Patrick Ohly	034528a9f0	scheduler perf: add DynamicResourceAllocation test cases The default scheduler configuration must be based on the v1 API where the plugin is enabled by default. Then if (and only if) the DynamicResourceAllocation feature gate for a test is set, the corresponding API group also gets enabled. The normal dynamic resource claim controller is started if needed to create ResourceClaims from ResourceClaimTemplates. Without the upcoming optimizations in the scheduler, scheduling with dynamic resources is fairly slow. The new test cases take around 15 minutes wall clock time on my desktop.	2023-05-04 13:08:06 +02:00
Kante Yin	a7035f5459	Pass Context to StartTestServer Signed-off-by: Kante Yin <kerthcet@gmail.com>	2023-05-04 10:25:09 +08:00
Kubernetes Prow Robot	47f1bd9f80	Merge pull request #117649 from SataQiu/scheduler-remove-v1beta2-20230427 scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration component config	2023-05-03 09:54:41 -07:00
Kubernetes Prow Robot	aece6838e8	Merge pull request #117232 from pohly/scheduler-perf-code-cleanups scheduler_perf: code cleanups	2023-05-03 09:54:13 -07:00
SataQiu	1f7c07f355	scheduler: remove deprecated v1beta2 KubeSchedulerConfiguration	2023-05-03 21:43:19 +08:00
Patrick Ohly	b3e0bc8864	scheduler_perf: let the test decide which informers are needed This will change when adding dynamic resource allocation test cases. Instead of changing mustSetupScheduler and StartScheduler for that, let's return the informer factory and create informers as needed in the test.	2023-04-27 15:31:40 +02:00
Patrick Ohly	78b8af9fed	scheduler_perf: update throughputCollector The previous solution had some shortcomings: - It was based on the assumption that the goroutine gets woken up at regular intervals. This is not actually guaranteed. Now the code keeps track of the actual start and end of an interval and verifies that assumption. - If no pod was scheduled (unlikely, but could happen), then "0 pods/s" got recorded. In such a case, the metric was always either zero or >= 1. A better solution is to extend the interval until some pod gets scheduled. With the larger time interval it is then possible to also track, for example, 0.5 pods/s.	2023-04-26 08:11:50 +02:00
Kubernetes Prow Robot	2bfaaf21c1	Merge pull request #117197 from pohly/scheduler-perf-cleanup scheduler perf: remove cleanup func	2023-04-11 21:17:57 -07:00
Patrick Ohly	a869a89825	scheduler perf: remove cleanup func b.Cleanup may as well get called inside the function instead of leaving that to the caller.	2023-04-11 09:43:45 +02:00
sarab	8d18ae6fc2	Use the generic Set in scheduler	2023-04-09 11:34:17 +05:30
Kante Yin	3d0894fabf	Fix failure(context canceled) in scheduler_perf benchmark (#114843) * Fix failure in scheduler_perf benchmark Signed-off-by: Kante Yin <kerthcet@gmail.com> * Fatal when error in cleaning up nodes in scheduler perf tests Signed-off-by: Kante Yin <kerthcet@gmail.com> * Use derived context to better organize the codes Signed-off-by: Kante Yin <kerthcet@gmail.com> * Change log level to 2 in scheduler perf-test Signed-off-by: Kante Yin <kerthcet@gmail.com> --------- Signed-off-by: Kante Yin <kerthcet@gmail.com>	2023-01-30 16:21:00 -08:00
kerthcet	d6ffb47832	Replace klog with benchmark log in scheduler_perf Signed-off-by: kerthcet <kerthcet@gmail.com>	2022-11-09 09:11:55 +08:00
Wojciech Tyczyński	5b042f0bf4	Remove RunAnAPIServer from integration tests	2022-07-25 17:52:31 +02:00
Kensei Nakada	4af3c5efeb	Skip adding data to avoid "json: unsupported value: NaN" panic when data is NaN	2022-05-05 16:11:22 +00:00
Kubernetes Prow Robot	21c0f6f6ff	Merge pull request #107677 from pohly/scheduler-integration-benchmark scheduler integration benchmark improvements	2022-02-14 01:23:28 -08:00
Patrick Ohly	e1e84c8e5f	scheduler_perf: run with -v=0 by default This provides a mechanism for overriding the forced increase of the klog verbosity to 4 when starting the apiserver and uses that for the scheduler_perf benchmark. Other tests run as before. A global variable was used because adding an explicit parameter to several helper functions would have caused a lot of code churn (test -> integration/util.StartApiserver -> integration/framework.RunAnAPIServerUsingServer -> integration/framework.startAPIServerOrDie).	2022-02-11 16:58:33 +01:00
ahrtr	fe95aa614c	io/ioutil has already been deprecated in golang 1.16, so replace all ioutil with io and os	2022-02-03 05:32:12 +08:00
kerthcet	75a255d2ed	remove scheduler component config v1beta1 Signed-off-by: kerthcet <kerthcet@gmail.com>	2021-09-28 13:13:17 +08:00
Dave Chen	dda8090037	Format json file with proper indentation Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-09-07 16:14:34 +08:00
Dave Chen	63b4710f38	Don't expose struct from prometheus client library	2021-08-27 22:21:24 +08:00
Dave Chen	58ab18bc1e	Add the metric data for different extension points Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-08-23 13:43:48 +08:00
Wei Huang	55765f1b49	sched: support HistogramVec in scheduler performance test	2021-07-26 20:27:37 -07:00
Mengjiao Liu	4eab19ae7d	Clean up the master term in test/integration comments	2021-06-18 16:31:05 +08:00
Wei Huang	e7f67b1a63	Surface kube config in scheduler framework handle	2021-03-30 11:54:59 -07:00
Kubernetes Prow Robot	c78b5497ae	Merge pull request #99638 from chendave/perf_config Enable scheduler_perf to support scheduler config file	2021-03-16 14:49:03 -07:00
Dave Chen	d50c0aeb5f	Enable scheduler_perf to support scheduler config file Signed-off-by: Dave Chen <dave.chen@arm.com>	2021-03-16 23:13:40 +08:00
Wei Huang	68ff3168b8	sched: fix a bug that literal 'p99' is mapped to 95th-percentile	2021-03-12 12:03:12 -08:00
Wei Huang	b93b4a2c96	sched: fix a bug that metrics of init or collected pods are re-collected	2021-03-11 10:28:39 -08:00
Kubernetes Prow Robot	823fa75643	Merge pull request #98900 from Huang-Wei/churn-cluster-op Introduce a churnOp to scheduler perf testing framework	2021-03-11 02:00:24 -08:00
Alexander Minbaev	359116f525	add if check for number of scheduled pods to be greater than 0	2021-03-05 09:05:42 -06:00
Wei Huang	1e5878b910	Introduce a churnOp to scheduler perf testing framework - support two modes: recreate and create - use DynamicClient to create API objects	2021-03-03 06:51:53 -08:00
Alexander Minbaev	5b73122105	Fix typo in util.go	2021-02-25 00:54:37 -06:00
Wei Huang	983272ce6a	sched: create dataItemsDir during a performance test if not exist	2021-02-17 12:44:16 -08:00
tangwz	518c502f54	scheduler_perf: use time.Ticker in throughput measurement	2020-09-19 09:36:17 +08:00
Adhityaa Chandrasekar	71bc9ce9c2	scheduler_perf: refactor to allow arbitrary workloads Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>	2020-09-17 19:22:20 +00:00
Dave Chen	ae735a1189	scheduler_perf: fix the nil pointer dereference Signed-off-by: Dave Chen <dave.chen@arm.com>	2020-06-16 13:23:05 +08:00
Abdullah Gharaibeh	d650b57141	Added Preemption benchmark	2020-05-28 14:05:52 -04:00
Davanum Srinivas	442a69c3bd	switch over k/k to use klog v2 Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2020-05-16 07:54:27 -04:00
Dave Chen	49283364bf	Decouple yaml based integration test from legacy test - Move utilities or constants out so that both of them should be able to run independently. - Rename the legacy test so that it can eventually be deleted when the perf dash changes are done	2020-03-27 08:45:59 +08:00
Jan Chaloupka	5b3b4de972	scheduler_perf: do not override throughput labels Throughput labels are currently initialized with a "Name" label. So we need to append to the map instead of creating a new one.	2020-02-28 16:10:50 +01:00


104 Commits