This is in preparation for revamping the resource.k8s.io API group completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.
Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.
Only source code where the version really matters (like API registration)
retains the versioned import.
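For illustration, a sketch of the new import alias (the exact package path
below is an assumption and depends on the release):

    import (
        resourceapi "k8s.io/api/resource/v1alpha3"
    )

    // Usage sites refer to resourceapi.ResourceClaim etc., so a later bump to
    // v1beta1 only has to touch the import line, not every usage.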
The cancellation of the context happened after the cleanup of the apiserver, so
clients using that context were kept running. That wasn't the intent and caused
a slow shutdown because the apiserver delays its shutdown while it still has
active clients.
The fix is to create a new cancellation context and to use that for the
clients. The automatic cancellation of it then happens before the apiserver
cleanup.
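A minimal sketch of the fix, assuming Go test cleanup semantics (the helper and
variable names are not the actual code):

    import (
        "context"
        "testing"
    )

    // startClients registers the apiserver cleanup first and the cancellation
    // of the client context second. testing.TB cleanups run in LIFO order, so
    // the clients stop before the apiserver gets cleaned up.
    func startClients(tb testing.TB, apiServerCtx context.Context, stopAPIServer func()) context.Context {
        tb.Cleanup(stopAPIServer)
        clientCtx, cancel := context.WithCancel(apiServerCtx)
        tb.Cleanup(cancel)
        return clientCtx
    }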
ktesting.TContext combines several different interfaces. This makes the code
simpler because fewer parameters need to be passed around.
An intentional side effect is that the apiextensions client interface becomes
available, which makes it possible to use CRDs. This will be needed for future
DRA tests.
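Roughly, and with hypothetical helper names, the simplification looks like
this:

    import (
        "context"
        "testing"

        "k8s.io/client-go/kubernetes"
        "k8s.io/kubernetes/test/utils/ktesting"
    )

    // Before: every helper needed the testing.T, a context, and the clients
    // as separate parameters.
    func setupBefore(t *testing.T, ctx context.Context, client kubernetes.Interface) {}

    // After: a single ktesting.TContext carries all of them.
    func setupAfter(tCtx ktesting.TContext) {}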
Support for CRDs depends on starting the apiserver via
k8s.io/kubernetes/cmd/kube-apiserver/app/testing because only that enables the
CRD extensions. As discussed on Slack, the long-term goal is to replace the
in-tree StartTestServer with the one in staging, so this is going in the right
direction.
If pods get stuck, then reporting the name of one of them makes it possible
to search for it in the log output. Without the name it's hard
to figure out which pods got stuck.
perfdash expects all data items to have the same set of labels. It then
renders a drop-down button for each label, listing all values found for that
label. Previously, data items that didn't have a certain label didn't match
any filter for it and couldn't get selected, because perfdash doesn't offer
"unset" in its drop-down menus.
To avoid that, scheduler-perf now collects all labels and then adds the
missing ones with "not applicable" as the value:
{
  "data": {
    "Average": 939.7071223010004,
    "Perc50": 927.7987421383649,
    "Perc90": 2166.153846153846,
    "Perc95": 2363.076923076923,
    "Perc99": 2520.6153846153848
  },
  "unit": "ms",
  "labels": {
    "Metric": "scheduler_pod_scheduling_duration_seconds",
    "Name": "SchedulingBasic/5000Nodes/namespace-2",
    "extension_point": "not applicable",
    "result": "not applicable"
  }
},
...
{
  "data": {
    "Average": 1.1172570650000004,
    "Perc50": 1.1418367346938776,
    "Perc90": 1.5500000000000003,
    "Perc95": 1.6410256410256412,
    "Perc99": 3.7333333333333334
  },
  "unit": "ms",
  "labels": {
    "Metric": "scheduler_framework_extension_point_duration_seconds",
    "Name": "SchedulingBasic/5000Nodes/namespace-2",
    "extension_point": "Score",
    "result": "not applicable"
  }
},
This runs workloads that are labeled as "integration-test". The apiserver and
scheduler are only started once per unique configuration, followed by each
workload using that configuration. This makes execution faster. In contrast to
benchmarking, we care less about starting with a clean slate for each test.
Merely deleting the namespace is not enough:
- Workloads might rely on the garbage collector to get rid of obsolete objects,
so we should run it to be on the safe side.
- Pods must be force-deleted because kubelet is not running (see the sketch
after this list).
- Finally, the namespace controller is needed to get rid of
deleted namespaces.
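A minimal sketch of the pod force-deletion mentioned above, assuming a
client-go clientset (names are illustrative, not the actual test code):

    import (
        "context"
        "testing"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // forceDeletePods removes all pods in the namespace with a zero grace
    // period: without a running kubelet, a graceful deletion never completes.
    func forceDeletePods(tb testing.TB, ctx context.Context, client kubernetes.Interface, namespace string) {
        gracePeriod := int64(0)
        err := client.CoreV1().Pods(namespace).DeleteCollection(ctx,
            metav1.DeleteOptions{GracePeriodSeconds: &gracePeriod},
            metav1.ListOptions{})
        if err != nil {
            tb.Fatalf("force-deleting pods in %s: %v", namespace, err)
        }
    }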
The default scheduler configuration must be based on the v1 API, where the
plugin is enabled by default. Then, if (and only if) the
DynamicResourceAllocation feature gate is set for a test, the corresponding
API group also gets enabled.
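Conceptually (the exact resource.k8s.io version is release-dependent and an
assumption here), enabling the API group for such a test corresponds to extra
apiserver flags along these lines:

    // draAPIServerFlags is a hypothetical helper: the flags are only added
    // when a test sets the DynamicResourceAllocation feature gate.
    func draAPIServerFlags() []string {
        return []string{
            "--feature-gates=DynamicResourceAllocation=true",
            "--runtime-config=resource.k8s.io/v1alpha3=true",
        }
    }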
The normal dynamic resource claim controller is started if needed to create
ResourceClaims from ResourceClaimTemplates.
Without the upcoming optimizations in the scheduler, scheduling with dynamic
resources is fairly slow. The new test cases take around 15 minutes wall clock
time on my desktop.
This will change when adding dynamic resource allocation test cases. Instead of
changing mustSetupScheduler and StartScheduler for that, let's return the
informer factory and create informers as needed in the test.
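As a sketch (names assumed, not the actual scheduler_perf code), a test can
then build exactly the informers it needs from the returned factory:

    import (
        "context"

        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
    )

    // startPodInformer creates only the pod informer from a shared factory,
    // starts it, and waits for the cache to sync.
    func startPodInformer(ctx context.Context, client kubernetes.Interface) informers.SharedInformerFactory {
        factory := informers.NewSharedInformerFactory(client, 0 /* no resync */)
        _ = factory.Core().V1().Pods().Informer() // create on demand
        factory.Start(ctx.Done())
        factory.WaitForCacheSync(ctx.Done())
        return factory
    }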
The previous solution had some shortcomings:
- It was based on the assumption that the goroutine gets woken up at regular
intervals. This is not actually guaranteed. Now the code keeps track of the
actual start and end of an interval and verifies that assumption.
- If no pod was scheduled during an interval (unlikely, but it could happen),
then "0 pods/s" got recorded, so the metric was always either
zero or >= 1 pods/s. A better solution is to extend the interval
until some pod gets scheduled. With the larger time interval
it is then possible to also track, for example, 0.5 pods/s
(see the sketch after this list).
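A minimal sketch of the improved sampling loop (function and variable names
are assumptions, not the actual scheduler_perf code):

    import (
        "context"
        "time"
    )

    // collectThroughput samples scheduling throughput. It measures the real
    // elapsed time per sample instead of assuming a fixed tick, and it keeps
    // extending the interval until at least one pod was scheduled, so that
    // fractional rates like 0.5 pods/s can be recorded.
    func collectThroughput(ctx context.Context, scheduledPods func() int) []float64 {
        var samples []float64
        ticker := time.NewTicker(time.Second)
        defer ticker.Stop()

        lastCount := scheduledPods()
        lastTime := time.Now()
        for {
            select {
            case <-ctx.Done():
                return samples
            case <-ticker.C:
            }
            count := scheduledPods()
            if count == lastCount {
                continue // nothing scheduled yet, extend the interval
            }
            now := time.Now()
            samples = append(samples, float64(count-lastCount)/now.Sub(lastTime).Seconds())
            lastCount, lastTime = count, now
        }
    }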
This provides a mechanism for overriding the forced increase of the klog
verbosity to 4 when starting the apiserver and uses that for the scheduler_perf
benchmark. Other tests run as before.
A global variable was used because adding an explicit parameter to several
helper functions would have caused a lot of code churn (test ->
integration/util.StartApiserver ->
integration/framework.RunAnAPIServerUsingServer ->
integration/framework.startAPIServerOrDie).
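A minimal sketch of the global-override approach (variable and function names
are assumptions, not the actual integration framework code):

    // apiServerVerbosity is consulted by the apiserver start helpers instead
    // of a hard-coded 4, so a benchmark can lower it without threading an
    // extra parameter through the whole helper chain.
    var apiServerVerbosity = 4

    // SetAPIServerVerbosity lets a test or benchmark override the default
    // before the helpers start the apiserver.
    func SetAPIServerVerbosity(v int) {
        apiServerVerbosity = v
    }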