kubernetes

Author	SHA1	Message	Date
Patrick Ohly	bde9b64cdf	DRA: remove "source" indirection from v1 Pod API This makes the API nicer: resourceClaims: - name: with-template resourceClaimTemplateName: test-inline-claim-template - name: with-claim resourceClaimName: test-shared-claim Previously, this was: resourceClaims: - name: with-template source: resourceClaimTemplateName: test-inline-claim-template - name: with-claim source: resourceClaimName: test-shared-claim A more long-term benefit is that other, future alternatives might not make sense under the "source" umbrella. This is a breaking change. It's justified because DRA is still alpha and will have several other API breaks in 1.31.	2024-06-27 17:53:24 +02:00
Maciej Skoczeń	7532e74117	Don't fail on churn delete in scheduler_perf tests when context canceled	2024-06-19 08:08:13 +00:00
Maciej Skoczeń	05b2c14d64	Measure performance of scheduling when many gated pods	2024-06-18 12:39:21 +00:00
Maciej Skoczeń	c09440c691	Add possibility to delete pods at specified frequency in scheduler_perf tests	2024-06-18 09:40:50 +00:00
Kubernetes Prow Robot	5df8e15a84	Merge pull request #125562 from pohly/scheduler-perf-default-verbosity scheduler_perf: fix setting default verbosity	2024-06-18 02:16:07 -07:00
Kubernetes Prow Robot	3b90ae4f58	Merge pull request #124548 from pohly/dra-scheduler-perf-structured-parameters scheduler_perf: add DRA structured parameters test with shared claims	2024-06-18 02:15:58 -07:00
Patrick Ohly	381c28407e	scheduler_perf: fix setting default verbosity It needs to be set twice, once for ktesting+klog, once for component-base/logs. The latter was not done before and thus quite a bit of log output was produced with verbosity 0.	2024-06-18 08:44:16 +02:00
Patrick Ohly	d88a153086	scheduler_perf: add DRA structured parameters test with shared claims Several pods sharing the same claim is not common, but can be useful and thus should get tested. Before, createPods and createAny operations were not able to do this because each generated object was the same. What we need are different, predictable names of the claims (from createAny) and different references to those in the pods (from createPods). Now text/template processing, with the index number of the pod or claim as input, is used to inject these varying fields. A "div" function is needed to use the same claim in several different pods. While at it, some existing test cases get cleaned up a bit (removal of incorrect comments, adding comments for testing with queuing hints).	2024-06-17 10:13:22 +02:00
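The templating idea in d88a153086 can be illustrated with a minimal sketch. It is not the scheduler_perf implementation: the `renderName` helper, the `.Index` field name, and the exact signature of `div` are assumptions made for illustration. It only shows how integer division of an index lets several consecutive pods refer to the same claim name.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"text/template"
)

// div is a hypothetical template helper doing integer division.
// Returning an error lets the template engine report division by zero.
func div(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// renderName expands a name template for the object with the given index.
// The ".Index" key is an assumption, not taken from scheduler_perf.
func renderName(tmpl string, index int) (string, error) {
	t, err := template.New("name").Funcs(template.FuncMap{"div": div}).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]int{"Index": index}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Two pods share each claim: pods 0 and 1 use claim-0, pods 2 and 3 use claim-1.
	for i := 0; i < 4; i++ {
		name, err := renderName("claim-{{div .Index 2}}", i)
		if err != nil {
			panic(err)
		}
		fmt.Printf("pod-%d -> %s\n", i, name)
	}
}
```

Changing the divisor changes how many pods share one claim; a divisor of 1 gives every pod its own claim.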
Kubernetes Prow Robot	0fd6746b2a	Merge pull request #125518 from pohly/scheduler-perf-cleanup-fix scheduler_perf: shut down apiserver clients before apiserver	2024-06-16 10:03:29 -07:00
kerthcet	1ffa1e17cd	Remove noisy log in scheduler_perf Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-06-12 11:53:35 +08:00
Patrick Ohly	246e2aedf5	scheduler_perf: shut down apiserver clients before apiserver The cancellation of the context happened after the cleanup of the apiserver, so clients using that context were kept running. That wasn't the intent and causes a slow shutdown because the apiserver delays its shutdown when it has active clients. The fix is to create a new cancellation context and to use that for the clients. The automatic cancellation of it then happens before the apiserver cleanup.	2024-06-05 11:00:46 +02:00
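The ordering problem described in 246e2aedf5 can be sketched with the standard library alone. This is an illustration under assumptions, not the actual fix: the `cleanups` type is a hypothetical stand-in for last-in-first-out test cleanup, and the "server" and "client" are placeholders. The point is that a dedicated cancellation context, registered after the server cleanup, gets cancelled before the server is stopped.

```go
package main

import (
	"context"
	"fmt"
)

// cleanups is a hypothetical stand-in for LIFO cleanup registration.
type cleanups []func()

func (c *cleanups) add(f func()) { *c = append(*c, f) }

// run executes the registered functions in reverse order of registration.
func (c cleanups) run() {
	for i := len(c) - 1; i >= 0; i-- {
		c[i]()
	}
}

// shutdownOrder returns the order in which the placeholder components stop.
func shutdownOrder() []string {
	var order []string
	var cs cleanups

	// Registered first, therefore runs last: stand-in for stopping the apiserver.
	cs.add(func() { order = append(order, "server stopped") })

	// Separate cancellation context for the clients, registered afterwards
	// so that it is cancelled before the server cleanup runs.
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-ctx.Done() // the placeholder client runs until its context is cancelled
	}()
	cs.add(func() {
		cancel()
		<-done // wait until the client has really stopped
		order = append(order, "client stopped")
	})

	cs.run()
	return order
}

func main() {
	fmt.Println(shutdownOrder())
}
```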
Kensei Nakada	ef9e14db79	scheduler_perf: measure the degradation of daemonset scheduling	2024-06-05 02:36:31 +00:00
kerthcet	e678496c6e	reorganize the scheduler_perf testcases Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-05-31 16:47:19 +08:00
Lubomir I. Ivanov	5e290ebc90	switch k/k to pause version 3.10	2024-05-24 10:02:51 +03:00
Kubernetes Prow Robot	ade0d2140a	Merge pull request #124578 from sanposhiho/scheduler_perf_scheduler_plugin_execution_duration_seconds support `scheduler_plugin_execution_duration_seconds` in scheduler_perf	2024-05-05 06:40:44 -07:00
Kensei Nakada	c72b688e12	support `scheduler_plugin_execution_duration_seconds` in scheduler_perf	2024-04-27 08:22:53 +00:00
Marek Siarkowicz	3ee8178768	Cleanup defer from SetFeatureGateDuringTest function call	2024-04-24 20:25:29 +02:00
Patrick Ohly	a0add8d2c7	dra api: NodeResourceModel -> ResourceModel When renaming NodeResourceSlice to ResourceSlice, the embedded [Node]ResourceModel also should have been renamed.	2024-03-14 18:07:36 +01:00
Patrick Ohly	0b6a0d686a	dra api: rename NodeResourceSlice -> ResourceSlice While currently those objects only get published by the kubelet for node-local resources, this could change once we also support network-attached resources. Dropping the "Node" prefix enables such a future extension. The NodeName in ResourceSlice and StructuredResourceHandle then becomes optional. The kubelet still needs to provide one and it must match its own node name, otherwise it doesn't have permission to access ResourceSlice objects.	2024-03-07 22:22:55 +01:00
Patrick Ohly	4ed2b3eaeb	scheduler_perf: test DRA with structured parameters	2024-03-07 22:21:58 +01:00
Kubernetes Prow Robot	55d1518126	Merge pull request #123588 from pohly/scheduler-perf-any-cleanup scheduler_perf: automatically delete created objects	2024-03-04 04:49:12 -08:00
Patrick Ohly	eb6abf0462	scheduler_perf: automatically delete created objects This is not relevant for namespaced objects, but matters for the cluster-scoped ResourceClass during unit testing. This works right now because there is only one such unit test, but will fail when adding a second one. Instead of passing a boolean flag down into all functions where it might be needed, it's now a context value.	2024-03-04 09:54:38 +01:00
Patrick Ohly	d6851ec735	scheduler_perf: fail when input YAML is invalid The YAML files get decoded into an unstructured object, without validation, and then sent to the apiserver with a generic client. The default behavior is to issue a warning to the client, which gets logged by client-go. What we want instead is an error that causes the test to fail in a clean way right at the beginning.	2024-02-29 09:53:16 +01:00
Patrick Ohly	da0c9a93ae	scheduler_perf: use dynamic client to create arbitrary objects With a dynamic client and a rest mapper it is possible to load arbitrary YAML files and create the object defined by it. This is simpler than adding specific Go code for each supported type. Because the version now matters, incorrect versions in the DRA YAMLs were found and fixed.	2024-02-11 10:51:38 +01:00
Patrick Ohly	c46ae1b26a	scheduler_perf: use ktesting.TContext + staging StartTestServer ktesting.TContext combines several different interfaces. This makes the code simpler because fewer parameters need to be passed around. An intentional side effect is that the apiextensions client interface becomes available, which makes it possible to use CRDs. This will be needed for future DRA tests. Support for CRDs depends on starting the apiserver via k8s.io/kubernetes/cmd/kube-apiserver/app/testing because only that enables the CRD extensions. As discussed on Slack, the long-term goal is to replace the in-tree StartTestServer with the one in staging, so this is going in the right direction.	2024-02-11 10:51:38 +01:00
Kensei Nakada	f29d6970c9	doc(scheduler_perf): enrich the documentation	2024-01-15 08:50:08 +00:00
Kensei Nakada	74a6a4581f	fix by linters	2023-12-02 09:58:34 +00:00
Kensei Nakada	5310abe14a	make scheduler_perf usable from other repositories	2023-12-01 12:43:08 +00:00
Kubernetes Prow Robot	4b9e15e0fe	Merge pull request #120873 from pohly/dra-e2e-test-driver-enhancements e2e dra: enhance test driver	2023-10-10 13:32:55 +02:00
Kubernetes Prow Robot	44cfd556b3	Merge pull request #120339 from pohly/scheduler-perf-dra-driver-names scheduler_perf: use different log names for different DRA drivers	2023-10-02 06:32:56 -07:00
Kubernetes Prow Robot	5cc92713d1	Merge pull request #120335 from pohly/scheduler-perf-pod-name scheduler_perf: show name of one pending pod in log	2023-10-02 06:32:45 -07:00
Patrick Ohly	36146ad686	e2e dra: enhance test driver Several enhancements: - `--resource-config` is now listed under `controller` options instead of `leader election`: merely a cosmetic change - The driver name can be configured as part of the resource config. The command line flag overrides the config, but only when set explicitly. This makes it possible to pre-define complete driver setups where the name is associated with certain resource availability. This will be used for testing cluster autoscaling. - The set of nodes where resources are available can optionally be specified via node labels. This will be used for testing cluster autoscaling.	2023-09-25 19:50:33 +02:00
Junhao Zou	43c05e98ca	cleanup: Replace the deprecated NewMemCacheClient with memory.NewMemCacheClient	2023-09-08 11:57:46 +08:00
Patrick Ohly	c74d045c4b	scheduler_perf: show name of one pending pod in error message If pods get stuck, then giving the name of one makes it possible to search for it in the log output. Without the name it's hard to figure out which pods got stuck.	2023-09-04 09:54:26 +02:00
Patrick Ohly	78f3b76390	scheduler_perf: use different log names for different DRA drivers This helps when using -feature-gate=ContextualLogging=true and running the SchedulingWithMultipleResourceClaims test case because then output from the two driver instances is easy to distinguish.	2023-09-01 09:25:09 +02:00
Kubernetes Prow Robot	8428655308	Merge pull request #119963 from pohly/dra-scheduler-perf-multiple-claims dra: scheduler_perf test case with multiple claims per pod	2023-08-29 00:25:34 -07:00
SataQiu	5524f1651a	using wait.PollUntilContextTimeout instead of deprecated wait.Poll/PollWithContext/PollImmediate/PollImmediateWithContext methods for scheduler	2023-08-24 18:35:59 +08:00
Patrick Ohly	1e961af858	scheduler_perf: test case for DRA with multiple claims The new test case covers pods with multiple claims from multiple drivers. This leads to different behavior (scheduler waits for information from all drivers instead of optimistically selecting one node right away) and to more concurrent updates of the PodSchedulingContext objects. The test case is currently not enabled for unit testing or integration testing. It can be used manually with: -bench=BenchmarkPerfScheduling/SchedulingWithMultipleResourceClaims/2000pods_100nodes ... -perf-scheduling-label-filter=	2023-08-16 08:32:36 +02:00
Patrick Ohly	0331e98957	scheduler_perf: fix installing DRA test driver multiple times The driver name configuration option was ignored, so a second driver would have used the same name.	2023-08-16 08:32:36 +02:00
Heba Elayoty	224087abfa	Add Pod Scheduling SLI Duration metric (#119049) Signed-off-by: Heba Elayoty <hebaelayoty@gmail.com> Co-authored-by: Aldo Culquicondor <1299064+alculquicondor@users.noreply.github.com>	2023-08-15 15:17:41 -07:00
Jordan Liggitt	a164005cc0	Fix non-test code relying on test-code	2023-07-24 11:37:57 -04:00
Patrick Ohly	6b01ece580	scheduler-perf: fix perfdash display problem perfdash expects all data items to have the same set of labels. It then renders drop-down buttons for each label with all values found for each label. Previously, data items that didn't have a label didn't match any label filter in perfdash and couldn't get selected because perfdash doesn't have "unset" in its drop-down menus. To avoid that, scheduler-perf now collects all labels and then adds missing labels with "not applicable" as value: { "data": { "Average": 939.7071223010004, "Perc50": 927.7987421383649, "Perc90": 2166.153846153846, "Perc95": 2363.076923076923, "Perc99": 2520.6153846153848 }, "unit": "ms", "labels": { "Metric": "scheduler_pod_scheduling_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "not applicable", "result": "not applicable" } }, ... { "data": { "Average": 1.1172570650000004, "Perc50": 1.1418367346938776, "Perc90": 1.5500000000000003, "Perc95": 1.6410256410256412, "Perc99": 3.7333333333333334 }, "unit": "ms", "labels": { "Metric": "scheduler_framework_extension_point_duration_seconds", "Name": "SchedulingBasic/5000Nodes/namespace-2", "extension_point": "Score", "result": "not applicable" } },	2023-07-03 21:16:53 +02:00
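The label normalization described in 6b01ece580 amounts to two passes over the data items: collect every label key, then fill in the missing ones. The following is a minimal sketch of that idea, not the scheduler-perf code; the `DataItem` type and the `fillMissingLabels` function are assumptions modelled on the JSON shown in the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DataItem is a hypothetical type mirroring the JSON in the commit message.
type DataItem struct {
	Data   map[string]float64 `json:"data"`
	Unit   string             `json:"unit"`
	Labels map[string]string  `json:"labels"`
}

// fillMissingLabels gives every item the same set of label keys, using
// "not applicable" as the value for labels an item did not have.
func fillMissingLabels(items []DataItem) {
	// First pass: collect all label keys used by any item.
	keys := map[string]struct{}{}
	for _, item := range items {
		for k := range item.Labels {
			keys[k] = struct{}{}
		}
	}
	// Second pass: add the keys that an item is missing.
	for i := range items {
		if items[i].Labels == nil {
			items[i].Labels = map[string]string{}
		}
		for k := range keys {
			if _, ok := items[i].Labels[k]; !ok {
				items[i].Labels[k] = "not applicable"
			}
		}
	}
}

func main() {
	items := []DataItem{
		{Unit: "ms", Labels: map[string]string{"Metric": "scheduler_pod_scheduling_duration_seconds"}},
		{Unit: "ms", Labels: map[string]string{
			"Metric":          "scheduler_framework_extension_point_duration_seconds",
			"extension_point": "Score",
		}},
	}
	fillMissingLabels(items)
	out, err := json.MarshalIndent(items, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```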
Patrick Ohly	29e5771aa4	scheduler-perf: shorten "Name" label in metrics Because the JSON file gets written at the end of the top-level benchmark, all data items had `BenchmarkPerfScheduling/` as prefix in the `Name` label. This is redundant and makes it harder to see the actual name. Now that common prefix gets removed.	2023-07-03 21:15:16 +02:00
Patrick Ohly	0d41d509d2	scheduler_perf: replace gomega.Eventually with wait.PollUntilContextTimeout This is done for the sake of consistency. The failure message becomes less useful.	2023-06-28 09:22:26 +02:00
Patrick Ohly	cecebe8ea2	scheduler_perf: add TestScheduling integration test This runs workloads that are labeled as "integration-test". The apiserver and scheduler are only started once per unique configuration, followed by each workload using that configuration. This makes execution faster. In contrast to benchmarking, we care less about starting with a clean slate for each test.	2023-06-28 09:22:25 +02:00
Patrick Ohly	dfd646e0a8	scheduler_perf: fix namespace deletion Merely deleting the namespace is not enough: - Workloads might rely on the garbage collector to get rid of obsolete objects, so we should run it to be on the safe side. - Pods must be force-deleted because kubelet is not running. - Finally, the namespace controller is needed to get rid of deleted namespaces.	2023-06-28 09:22:25 +02:00
Patrick Ohly	d9c16a1ced	scheduler_perf: fix goroutine leak in runWorkload This becomes relevant when doing more fine-grained leak checking.	2023-06-28 08:14:34 +02:00
Patrick Ohly	c91c578795	scheduler_perf: skip expensive cleanup during benchmarks Each benchmark test case runs with a fresh etcd instance. Therefore it is not necessary to delete objects after a run. A future unit test might reuse etcd, therefore cleanup is optional.	2023-06-22 08:56:14 +02:00
Kubernetes Prow Robot	2057a48ee5	Merge pull request #114771 from sanposhiho/scheduling_perf_scheduler_scheduling_attempt_duration_seconds feature(scheduler_perf): distinguish result in scheduler_scheduling_attempt_duration_seconds metric result	2023-06-07 06:18:13 -07:00
Kensei Nakada	a4ea058cc7	feature(scheduler_perf): distinguish result in scheduler_scheduling_attempt_duration_seconds metric result	2023-06-02 14:45:55 +00:00

332 Commits