Commit Graph

4823 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
406d2dfe61 Merge pull request #119250 from pohly/controller-contextual-logging
kube-controller-manager: finish conversion to contextual logging
2023-07-12 18:59:30 -07:00
Kubernetes Prow Robot
4af23c157c Merge pull request #119242 from carlory/add-logger
change the QueueingHintFn to pass a logger
2023-07-12 13:03:31 -07:00
Kubernetes Prow Robot
2ec4e14bfa Merge pull request #118812 from serathius/storage-metric
Improve apiserver storage size metric
2023-07-12 10:57:26 -07:00
carlory
0599b3caa0 change the QueueingHintFn to pass a logger 2023-07-13 00:56:41 +08:00
Patrick Ohly
7d064812bb kube-controller-manager: finish conversion to contextual logging
This removes all exceptions and fixes the remaining unconverted log calls.
2023-07-12 14:57:29 +02:00
Marek Siarkowicz
7a63997c8a Improve apiserver storage size metric to allow it's graduation
Change name to make it compliant with prometheus guidelines.
Calculate it on demand instead of periodic to comply with prometheus standards.
Replace "endpoint" with "server" label to make it semantically consistent with storage factory
2023-07-12 14:33:10 +02:00
Mengjiao Liu
19869478c1 Migrate /pkg/controller/disruption to structured and contextual logging 2023-07-12 11:30:45 +08:00
Kubernetes Prow Robot
98e7c2a751 Merge pull request #119237 from jpbetz/jpbetz-apiserver-integration-owner
Add jpbetz as approver of apiserver integration tests
2023-07-11 20:03:18 -07:00
Kubernetes Prow Robot
6ffca50136 Merge pull request #116443 from benluddy/secondary-authz-decision-caching
Cache authz decisions within the scope of validating policy admission.
2023-07-11 12:41:11 -07:00
Joe Betz
6d6595d0f6 Add jpbetz as approver of apiserver integration tests 2023-07-11 14:36:45 -04:00
Kubernetes Prow Robot
8f1852bb44 Merge pull request #115295 from Namanl2001/pkg/controller/endpointslice
Migrated `pkg/controller/endpointslice` and `pkg/controller/endpointslicemirroring` to contextual logging
2023-07-11 03:19:12 -07:00
carlory
f443c458af move non-graceful node shutdown to GA 2023-07-11 13:51:51 +08:00
Kubernetes Prow Robot
ad72319ece Merge pull request #115122 from r-erema/110782-oidc-test-coverage
add integration tests for OIDC authenticator
2023-07-10 15:29:10 -07:00
Naman
645cb90732 migrated pkg/controller/endpointslicemirroring to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
2023-07-11 01:43:30 +05:30
Naman
09849b09cf migrated pkg/controller/endpointslice to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
2023-07-11 01:28:22 +05:30
Kubernetes Prow Robot
d653dcab5a Merge pull request #119048 from pohly/scheduler-perf-metrics-for-perfdash
scheduler-perf: metrics for perfdash
2023-07-09 09:27:04 -07:00
Kubernetes Prow Robot
19a25bac05 Merge pull request #119159 from alculquicondor/fix-job-uncounted
Only declare job as finished after removing all finalizers
2023-07-08 01:55:03 -07:00
kerthcet
47ef977ddd Direct reference to the packages
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-08 12:03:46 +08:00
Aldo Culquicondor
f7a1fb76f4 Only declare job as finished after removing all finalizers
Change-Id: Id4b01b0e6fabe24134e57e687356e0fc613cead4
2023-07-07 14:08:19 -04:00
kerthcet
c0eb0caf4a Support fine-gained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
Kubernetes Prow Robot
b07a843cb5 Merge pull request #119046 from kerthcet/fix/handle-unschedule-plugins
Fix fitError in Permit plugin not handled perfectly
2023-07-06 21:01:03 -07:00
kerthcet
278a8376e1 Fix: fiterror in permit plugin not handled perfectly
We only added failed plulgins, but actually this will not work unless
we make the status with a fitError because we only copy the failured plugins
to podInfo if it is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
6f9d1d38d8 Merge pull request #118817 from pohly/dra-delete-claims
DRA: improve handling of completed pods
2023-07-06 10:15:15 -07:00
Kubernetes Prow Robot
8c1bf4f461 Merge pull request #116930 from fatsheep9146/contextual-logging-cleanup
contextual logging cleanup
2023-07-06 07:39:03 -07:00
Kubernetes Prow Robot
e5efa0a5ee Merge pull request #117108 from pohly/test-integration-race-detection-component-base-logs
component-base/logs: improve handling of re-applying a configuration
2023-07-05 21:29:08 -07:00
Kubernetes Prow Robot
cd32adebd9 Merge pull request #118386 from Richabanker/enhance-storage-version
Add servedVersions info in StorageVersion API
2023-07-05 19:23:02 -07:00
Ziqi Zhao
dfc1838379 Migrated pkg/controller/volume|util|replicaset|nodeipam to contextual logging
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-07-06 07:39:52 +08:00
Patrick Ohly
02efe09abe component-base/logs: improve handling of re-applying a configuration
Normal binaries should never have to do this. It's not safe when there are
already some goroutines running which might do logging. Therefore the new
default is to return an error when a binary accidentally re-applies.

A few unit ensure that there are no goroutines and have to call the functions
more then once. The new ResetForTest API gets used by those to enable changing the
logging settings more than once in the same process.

Integration tests use the same code as the normal binaries. To make reuse of
that code safe, component-base/logs can be configured to silently ignore any
additional calls. This addresses data races that were found when enabling -race
for integration tests. To catch cases where the integration test does want
to modify the config, the old and new config get compared and an error is
raised when it's not the same.

To avoid having to modify all integration tests which start test servers,
reconfiguring component-base/logs is done by the test server packages.
2023-07-05 19:08:54 +02:00
Patrick Ohly
7f5a02fc7e dra resourceclaim controller: enhance logging
Adding logging to event handlers makes it more obvious why (or why not) claims
and pods need to be processed.
2023-07-05 16:10:20 +02:00
Kubernetes Prow Robot
a9a7a3730e Merge pull request #118994 from pohly/test-integration-race-detection-grpc-logger
integration testing: configure gRPC logging during init
2023-07-05 02:58:55 -07:00
roman
18f2e9055f Add OIDC integration tests 2023-07-04 08:04:53 +03:00
Patrick Ohly
6b01ece580 scheduler-perf: fix perfdash display problem
perfdash expects all data items to have the same set of labels.  It then
renders drop-down buttons for each label with all values found for each
label. Previously, data items that didn't have a label didn't match any label
filter in perfdash and couldn't get selected because perfdash doesn't have
"unset" in it's drop-down menus.

To avoid that, scheduler-perf now collects all labels and then adds missing
labels with "not applicable" as value:

    {
      "data": {
        "Average": 939.7071223010004,
        "Perc50": 927.7987421383649,
        "Perc90": 2166.153846153846,
        "Perc95": 2363.076923076923,
        "Perc99": 2520.6153846153848
      },
      "unit": "ms",
      "labels": {
        "Metric": "scheduler_pod_scheduling_duration_seconds",
        "Name": "SchedulingBasic/5000Nodes/namespace-2",
        "extension_point": "not applicable",
        "result": "not applicable"
      }
    },
    ...
    {
      "data": {
        "Average": 1.1172570650000004,
        "Perc50": 1.1418367346938776,
        "Perc90": 1.5500000000000003,
        "Perc95": 1.6410256410256412,
        "Perc99": 3.7333333333333334
      },
      "unit": "ms",
      "labels": {
        "Metric": "scheduler_framework_extension_point_duration_seconds",
        "Name": "SchedulingBasic/5000Nodes/namespace-2",
        "extension_point": "Score",
        "result": "not applicable"
      }
    },
2023-07-03 21:16:53 +02:00
Patrick Ohly
29e5771aa4 scheduler-perf: shorten "Name" label in metrics
Because the JSON file gets written at the end of the top-level benchmark, all
data items had `BenchmarkPerfScheduling/` as prefix in the `Name` label. This
is redundant and makes it harder to see the actual name. Now that common prefix
gets removed.
2023-07-03 21:15:16 +02:00
Patrick Ohly
ea34d03925 integration testing: configure gRPC logging during init
Doing the initialization once was not good enough because it was not guaranteed
that RunCustomEtcd gets called early enough, before there are other goroutines
which use gRPC. The data race for
test/integration/apiserver.TestWatchCacheUpdatedByEtcd was:

WARNING: DATA RACE
Read at 0x00000cfffb90 by goroutine 140052:
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.V()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/grpclog.go:41 +0x30
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.(*componentData).V()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/component.go:103 +0x4e
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*http2Client).Close()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:955 +0xca
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*http2Client).reader()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:1619 +0xbfb
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.newHTTP2Client.func11()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:394 +0x47

Previous write at 0x00000cfffb90 by goroutine 145643:
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.SetLoggerV2()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/loggerv2.go:75 +0x104
  k8s.io/kubernetes/test/integration/framework.RunCustomEtcd.func2()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:157 +0x33
  sync.(*Once).doSlow()
      /usr/local/go/src/sync/once.go:74 +0x101
  sync.(*Once).Do()
      /usr/local/go/src/sync/once.go:65 +0x46
  k8s.io/kubernetes/test/integration/framework.RunCustomEtcd()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:156 +0xb97
  k8s.io/kubernetes/test/integration/apiserver.multiEtcdSetup()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:41 +0xc4
  k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:92 +0xa9
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1629 +0x47
2023-06-30 09:52:57 +02:00
Richa Banker
1c48b7ec14 Add servedVersions info in StorageVersion API 2023-06-29 15:40:54 -07:00
Ben Luddy
f1700e4b95 Cache authz decisions within validating policy admission.
This avoids the surprise of identical authorization checks within a
policy evaluating to different decisions during the same admission
pass, and reduces the overhead of repeatedly referencing the same
authorization check.
2023-06-28 15:30:04 -04:00
Kubernetes Prow Robot
c78204dc06 Merge pull request #118202 from pohly/scheduler-perf-unit-test
scheduler-perf: run as integration tests
2023-06-28 06:24:31 -07:00
Kubernetes Prow Robot
ddbf3575a7 Merge pull request #116729 from AxeZhan/handlers_sync
[Scheduler] Make sure handlers have synced before scheduling
2023-06-28 01:26:31 -07:00
Patrick Ohly
0d41d509d2 scheduler_perf: replace gomega.Eventually with wait.PollUntilContextTimeout
This is done for the sake of consistency. The failure message becomes less
useful.
2023-06-28 09:22:26 +02:00
Patrick Ohly
cecebe8ea2 scheduler_perf: add TestScheduling integration test
This runs workloads that are labeled as "integration-test". The apiserver and
scheduler are only started once per unique configuration, followed by each
workload using that configuration. This makes execution faster. In contrast to
benchmarking, we care less about starting with a clean slate for each test.
2023-06-28 09:22:25 +02:00
Patrick Ohly
dfd646e0a8 scheduler_perf: fix namespace deletion
Merely deleting the namespace is not enough:
- Workloads might rely on the garbage collector to get rid of obsolete objects,
  so we should run it to be on the safe side.
- Pods must be force-deleted because kubelet is not running.
- Finally, the namespace controller is needed to get rid of
  deleted namespaces.
2023-06-28 09:22:25 +02:00
Patrick Ohly
d9c16a1ced scheduler_perf: fix goroutine leak in runWorkload
This becomes relevant when doing more fine-grained leak checking.
2023-06-28 08:14:34 +02:00
Patrick Ohly
2e7f37353c test/integration: avoid errors in fake PC controller during shutdown
Once the context is canceled, the controller can stop processing
events. Without this change it prints errors when the apiserver is already
down.
2023-06-28 08:14:34 +02:00
Kubernetes Prow Robot
960830bc66 Merge pull request #118102 from RomanBednar/retro-sc-assignment-ga
graduate RetroactiveDefaultStorageClass feature to GA in 1.28
2023-06-27 20:46:32 -07:00
Aldo Culquicondor
a4519665fe Skip terminal Pods with a deletion timestamp from the Daemonset sync (#118716)
* Skip terminal Pods with a deletion timestamp from the Daemonset sync

Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742

* Review comments and fix integration test

Change-Id: I3eb5ec62bce8b4b150726a1e9b2b517c4e993713

* Include deleted terminal pods in history

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301
2023-06-27 08:56:33 -07:00
kidddddddddddddddddddddd
9c7166ff63 wait for eventhandlers to sync before run scheduler 2023-06-27 23:19:34 +08:00
Kubernetes Prow Robot
6dbb1c6cf0 Merge pull request #118902 from pohly/staticcheck-cleanup
Cleanup staticcheck workarounds, improve gomega calls, update to golangci-lint 1.53.3
2023-06-27 07:54:32 -07:00
Kevin Hannon
ecd727e4c7 Fix PodGC test when PodDisruptionConditions disabled (#118805)
* test comment should match the code in podgc

* Update test/integration/podgc/podgc_test.go

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>

* test comment should match the code in podgc

---------

Co-authored-by: Michał Woźniak <mimowo@users.noreply.github.com>
2023-06-27 05:50:29 -07:00
Madhav Jivrajani
bdbf07525f test: remove exception comments in discovery tests
The exception comments were added due to a false positive in
staticcheck. This has since been rectified.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2023-06-27 14:20:41 +02:00
Dr. Stefan Schimanski
764da8a01d FIXUP: cmd/kube-apiserver/app/options: split apart controlplane part 2023-06-26 21:50:38 +02:00