67 Commits

Author SHA1 Message Date
Shyam Jeedigunta
203664933d Add etcd DB size monitoring in density test 2018-08-30 14:40:59 +02:00
foxyriver
4baeb09f6c need ExpectNoError check 2018-08-01 18:10:14 +08:00
Krzysztof Siedlecki
e5c9383b59 Collecting etcd histogram metrics 2018-07-16 14:32:54 +02:00
Kubernetes Submit Queue
a8777c26fa
Merge pull request #64695 from krzysied/etcd_metrics
Automatic merge from submit-queue (batch tested with PRs 64695, 65982, 65908). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Collecting etcd metrics

**What this PR does / why we need it**:
Adding etcd metrics to performance test log.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #64030

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-07-10 08:55:03 -07:00
Kubernetes Submit Queue
6c847f3e7a
Merge pull request #65307 from shyamjvs/fix-scheduler-reset-metrics-bug
Automatic merge from submit-queue (batch tested with PRs 65301, 65291, 65307, 63845, 65313). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix scheduler reset metrics bug in testinfra

/cc @krzysied 

```release-note
NONE
```
2018-06-22 03:08:13 -07:00
Shyam Jeedigunta
b9ae20c99e Split scheduler latency metric to fine-grained steps 2018-06-21 14:19:39 +02:00
Shyam Jeedigunta
cd1a5353eb Fix scheduler reset metrics bug in testinfra 2018-06-21 13:50:59 +02:00
Krzysztof Siedlecki
e32910a544 Readding summary metrics 2018-06-14 15:05:12 +02:00
Krzysztof Siedlecki
0547bbf744 Revert "Fixing scheduling latency metrics"
This reverts commit 0e833bfc83.
2018-06-14 14:50:12 +02:00
Kubernetes Submit Queue
65a5e68147
Merge pull request #64521 from shyamjvs/compute-scheduler-throughput-avg
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Compute avg and quantiles of scheduler throughput in density test

Based on my comment here - https://github.com/kubernetes/kubernetes/pull/64266#issuecomment-393189953

/sig scheduling
/kind cleanup
/priority important-soon
/milestone v1.11
/cc @wojtek-t 

```release-note
NONE
```
2018-06-13 14:23:51 -07:00
Shyam Jeedigunta
979a8d73e1 Compute avg and quantiles of scheduler throughput in density test 2018-06-12 18:40:52 +02:00
Krzysztof Siedlecki
aa022310a4 Collecting etcd metrics 2018-06-04 16:23:08 +02:00
Krzysztof Siedlecki
0e833bfc83 Fixing scheduling latency metrics 2018-05-30 11:20:12 +02:00
Shyam Jeedigunta
f363f549c0 Measure scheduler throughput in density test 2018-05-25 14:49:11 +02:00
Shyam Jeedigunta
0f0c754eb4 Get rid of duplicate VerifyPodStartupLatency util in node density tests 2018-03-21 16:58:31 +01:00
Shyam Jeedigunta
b0dd166fa3 Capture different parts of pod-startup latency as metrics 2018-03-21 16:58:25 +01:00
Kubernetes Submit Queue
d4724d7e43
Merge pull request #55056 from porridge/typo-percentil
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix a typo.

**Release note**:
```release-note
NONE
```
2017-11-20 01:40:50 -08:00
xiangpengzhao
32675e6f62 Remove check for SubResourcePodProxyVersion and SubResourceServiceAndNodeProxyVersion 2017-11-03 23:11:09 +08:00
Marcin Owsiany
c2ab5c8246 Fix a typo. 2017-11-03 13:43:32 +01:00
Kevin
4c8539cece use core client with explicit version globally 2017-10-27 15:48:32 +08:00
Shyam Jeedigunta
f373645865 Increase api latency threshold for cluster-scoped list calls 2017-09-21 13:33:22 +02:00
Clayton Coleman
30a92a8f0a
Report scope in e2e test metrics 2017-09-11 22:13:55 -04:00
Kubernetes Submit Queue
fdf14b8218 Merge pull request #50913 from shyamjvs/list-call-slo
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)

Increase latency threshold for list api calls

This is only a short-term solution to make our density test green. In the long-term, we should measure as per our new SLIs.
From @wojtek-t's [doc](https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw) on the new SLIs/SLOs, we have the following SLO for list calls:

```
SLO1: In default Kubernetes installation, 99th percentile of SLI2 per cluster-day:
<= 1s if total number of objects of the same type as resource in the system <= X
<= 5s if total number of objects of the same type as resource in the system <= Y
<= 30s if total number of objects of the same type as resource in the system <= Z
```

I would guess that 170,000 pods would fall into the 2nd bracket (at least) and hence the new value of 5s. WDYT?
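
The bracketed SLO quoted above reads as a lookup from object count to latency threshold. A minimal Python sketch of that reading, assuming the bounds X, Y, and Z are passed in as parameters: the excerpt does not give their values, so the function name and the numbers in the usage line are hypothetical and only illustrate the shape of the rule.

```python
from typing import Optional


def list_latency_threshold_seconds(
    object_count: int, x: int, y: int, z: int
) -> Optional[float]:
    """Return the 99th-percentile latency bound (in seconds) for a list call.

    x, y, z stand in for the unspecified bounds X, Y, Z in the quoted SLO.
    Returns None when the object count exceeds z, because the excerpt
    defines no bound for that case.
    """
    if not (x <= y <= z):
        raise ValueError("expected x <= y <= z")
    if object_count <= x:
        return 1.0
    if object_count <= y:
        return 5.0
    if object_count <= z:
        return 30.0
    return None


# Hypothetical bounds, chosen only to illustrate the lookup for 170,000 pods.
print(list_latency_threshold_seconds(170_000, x=100_000, y=1_000_000, z=10_000_000))
```

With these made-up bounds, 170,000 objects land in the second bracket, which matches the guess above; real bounds could place it elsewhere.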

cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-08-22 05:31:07 -07:00
Shyam Jeedigunta
70123e71bb Increase latency threshold for list api calls 2017-08-19 00:55:35 +02:00
Kubernetes Submit Queue
b67b0ad7eb Merge pull request #50768 from shyamjvs/fix-scheduler-metric-in-gke
Automatic merge from submit-queue (batch tested with PRs 50550, 50768)

Don't SSH to master for metrics in case of GKE

cc @kubernetes/sig-scalability-misc @crassirostris
2017-08-17 03:13:59 -07:00
Shyam Jeedigunta
a938c000e3 Don't SSH to master for metrics in case of GKE 2017-08-16 15:24:50 +02:00
Aleksandra Malinowska
55682f2a55 add grabbing CA metrics in e2e tests 2017-08-10 11:22:45 +02:00
Mik Vyatskov
e79a228a78 Move the sig-instrumentation test to a dedicated folder 2017-08-07 10:33:03 +02:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
gmarek
55880e6b4b Move metrics_grabber to test/e2e 2017-07-07 13:13:44 +02:00
Shyam Jeedigunta
04822a9672 Increase threshold for LIST apicall latencies to 2s 2017-06-13 15:49:01 +02:00
Kubernetes Submit Queue
40dcbc4eb3 Merge pull request #46461 from ncdc/e2e-suite-metrics
Automatic merge from submit-queue

Support grabbing test suite metrics

**What this PR does / why we need it**:
Add support for grabbing metrics that cover the entire test suite's execution.

Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.

If you enable `--gather-suite-metrics-at-teardown`, the metrics are written to a file with a name such as `MetricsForE2ESuite_2017-05-25T20:25:57Z.json` in the `--report-dir`. If you don't specify `--report-dir`, the metrics are written to the test log output.
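
The output behaviour described above can be sketched as follows. This is a minimal Python illustration, not the e2e framework's code: the function name `write_suite_metrics` and the sample payload are hypothetical, and only the `MetricsForE2ESuite_<timestamp>.json` naming and the fallback to log output are taken from the description.

```python
import json
import os
from datetime import datetime, timezone
from typing import Optional


def write_suite_metrics(
    metrics: dict,
    report_dir: Optional[str],
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Write metrics as JSON to report_dir, or print them if no dir is given.

    Returns the path of the written file, or None when the metrics were
    printed instead. `now` is expected to be a UTC timestamp.
    """
    now = now or datetime.now(timezone.utc)
    payload = json.dumps(metrics, indent=2, sort_keys=True)
    if not report_dir:
        # No report directory: fall back to the test log output.
        print(payload)
        return None
    # File name pattern taken from the example in the description.
    name = "MetricsForE2ESuite_" + now.strftime("%Y-%m-%dT%H:%M:%SZ") + ".json"
    path = os.path.join(report_dir, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(payload)
    return path


# Hypothetical payload; with no report directory it is printed to the log.
write_suite_metrics({"example_metric": 1}, report_dir=None)
```

Note that the colons in the timestamp make this file name invalid on some filesystems (for example on Windows), which a real implementation may need to account for.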

I'd like to enable this for some of the `pull-*` CI jobs, which will require a separate PR to test-infra.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

@kubernetes/sig-testing-pr-reviews @smarterclayton @wojtek-t @gmarek @derekwaynecarr @timothysc
2017-05-30 16:41:49 -07:00
Shyam Jeedigunta
e897b21506 Fix minor bugs in setting API call metrics with subresource 2017-05-29 15:04:52 +02:00
Andy Goldstein
41345418cb Support grabbing test suite metrics
Update the "interesting" controller-manager metrics to match the
current names for the garbage collector, and add namespace controller
metrics to the list.
2017-05-26 11:21:27 -04:00
Clayton Coleman
ad431c454c
Subresources are not included in apiserver prometheus metrics
Subresources are very often completely different code paths and errors
generated on those code paths are important to distinguish.
2017-05-24 16:23:50 -04:00
Shyam Jeedigunta
48688fa70d Print pod startup latency metric as perfdata 2017-05-12 14:31:18 +02:00
gmarek
6dcbdfaf58 Print API latency metrics as perfdata 2017-05-12 08:51:17 +02:00
gmarek
f68b884a9d Move rest of performance data gathered by tests to Summaries 2017-05-10 14:50:38 +02:00
Kubernetes Submit Queue
01321936b6 Merge pull request #45021 from shyamjvs/add-request-count
Automatic merge from submit-queue (batch tested with PRs 45033, 44961, 45021, 45097, 44938)

Add request count to APICall metric

Ref https://github.com/kubernetes/kubernetes/issues/44701

This should add the count of requests alongside the API call latencies.

cc @wojtek-t @gmarek
2017-04-28 13:16:42 -07:00
Shyam Jeedigunta
d77378a688 Add request count to APICall metric 2017-04-27 15:48:51 +02:00
NickrenREN
d4376599ba Cleanup: replace some hardcoded codes and remove unused functions 2017-04-25 09:38:25 +08:00
gmarek
7ad55c8a47 Output some spam to files instead of main log files 2017-04-20 16:13:40 +02:00
gmarek
be987ac247 Allow summaries to be printed out to ReportDir instead of stdout 2017-04-19 16:17:36 +02:00
Chao Xu
31cb266340 tests 2017-02-28 23:05:41 -08:00
deads2k
bf30b0c71b add WATCH to list of excluded verbs for latency metrics 2017-02-23 15:47:28 -05:00
Jordan Liggitt
88a876b1d0
Update to use proxy subresource consistently 2017-02-13 22:05:00 -05:00
Clayton Coleman
469df12038
refactor: move ListOptions references to metav1 2017-01-23 17:52:46 -05:00
Kubernetes Submit Queue
4811ad0231 Merge pull request #38592 from krousey/client-context
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)

Add optional per-request context to restclient

**What this PR does / why we need it**: It adds per-request contexts to restclient's API, and uses them to add timeouts to all proxy calls in the e2e tests. An entire e2e shouldn't hang for hours on a single API call.

**Which issue this PR fixes**: #38305

**Special notes for your reviewer**:

This adds an entirely optional feature to the low-level rest client's request handling. It doesn't affect any requests that don't use it. The API of the generated clients does not change, and they currently don't take advantage of this.

I intend to patch this into 1.5 as a mostly test-only change, since it's not going to affect any controller, generated client, or user of the generated client.


cc @kubernetes/sig-api-machinery 
cc @saad-ali
2017-01-16 10:37:38 -08:00
NickrenREN
a12dea14e0 fix redundant alias clientset 2017-01-12 10:21:05 +08:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00