Commit Graph

4237 Commits

Author SHA1 Message Date
Jefftree
735be024cf Make CRDs built and aggregated lazily for oasv2 2023-07-18 04:49:56 +00:00
Kensei Nakada
c7e7eee554 feature(scheduling_queue): track events per Pods (#118438)
* feature(sscheduling_queue): track events per Pods

* fix typos

* record events in one slice and make each in-flight Pod to refer it

* fix: use Pop() in test before AddUnschedulableIfNotPresent to register in-flight Pods

* eliminate MakeNextPodFuncs

* call Done inside the scheduling queue

* fix comment

* implement done() not to require lock in it

* fix UTs

* improve the receivedEvents implementation based on suggestions

* call DonePod when we don't call AddUnschedulableIfNotPresent

* fix UT

* use queuehint to filter out events for in-flight Pods

* fix based on suggestion from aldo

* fix based on suggestion from Wei

* rename lastEventBefore → previousEvent

* fix based on suggestion

* address comments from aldo

* fix based on the suggestion from Abdullah

* gate in-flight Pods logic by the SchedulingQueueHints feature gate
2023-07-17 15:53:07 -07:00
Kubernetes Prow Robot
8633adbb07 Merge pull request #119342 from A-Hilaly/api-server/webhooks/match-conditions-integration-tests
Add integration tests for `MatchConditions` feature gate enablement
2023-07-17 12:47:23 -07:00
Amine
00de051729 Make matchConditionsFeatureGateInitiallyEnabled a boolean instead 2023-07-17 18:34:42 +01:00
Aohan Yang
b1850497b4 Integration tests for IP mode field 2023-07-17 16:03:02 +08:00
Amine
6b3ce3004d Add integration tests for match conditions feature gate enablement 2023-07-16 01:06:08 +01:00
Cici Huang
13172cba5c ValidatingAdmissionPolicy: support namespace access (#118267)
* Support namespace access from cel expression in validatingadmissionpolicy.

* Whitelist the exposed fields in namespace object and add test

* better handling of cluster-scoped resources.

* [API REVIEW] namespaceObject in Expression doc.

* compatibility with composition.

* generated: ./hack/update-codegen.sh && ./hack/update-openapi-spec.sh

* workaround namespace of namespace is unexpectedly set.

* basic test coverage for namespaceObject.

---------

Co-authored-by: Jiahui Feng <jhf@google.com>
2023-07-14 17:53:08 -07:00
Kubernetes Prow Robot
47aeec63a8 Merge pull request #119272 from deads2k/resources
add list of served versions to storage version
2023-07-14 13:22:41 -07:00
David Eads
90ab7580aa add list of served versions to storage version 2023-07-14 13:47:19 -04:00
Kubernetes Prow Robot
1e21da87b8 Merge pull request #118988 from nilekhc/hash-keyid
[KMSv2] chore: hashes keyID being logged
2023-07-13 15:47:48 -07:00
Kubernetes Prow Robot
be2cfc9697 Merge pull request #118228 from carlory/move-non-graceful-node-shutdown-to-GA
move non-graceful node shutdown to GA
2023-07-13 15:47:37 -07:00
Kubernetes Prow Robot
bea27f82d3 Merge pull request #118209 from pohly/dra-pre-scheduled-pods
dra: pre-scheduled pods
2023-07-13 14:43:37 -07:00
Nilekh Chaudhari
131216fa8f chore: hashes keyID
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
2023-07-13 20:42:09 +00:00
Jiahui Feng
049614f884 ValidatingAdmissionPolicy controller for Type Checking (#117377)
* [API REVIEW] ValidatingAdmissionPolicyStatucController config.

worker count.

* ValidatingAdmissionPolicyStatus controller.

* remove CEL typechecking from API server.

* fix initializer tests.

* remove type checking integration tests

from API server integration tests.

* validatingadmissionpolicy-status options.

* grant access to VAP controller.

* add defaulting unit test.

* generated: ./hack/update-codegen.sh

* add OWNERS for VAP status controller.

* type checking test case.
2023-07-13 13:41:50 -07:00
Patrick Ohly
80ab8f0542 dra: handle scheduled pods in kube-controller-manager
When someone decides that a Pod should definitely run on a specific node, they
can create the Pod with spec.nodeName already set. Some custom scheduler might
do that. Then kubelet starts to check the pod and (if DRA is enabled) will
refuse to run it, either because the claims are still waiting for the first
consumer or the pod wasn't added to reservedFor. Both are things the scheduler
normally does.

Also, if a pod got scheduled while the DRA feature was off in the
kube-scheduler, a pod can reach the same state.

The resource claim controller can handle these two cases by taking over for the
kube-scheduler when nodeName is set. Triggering an allocation is simpler than
in the scheduler because all it takes is creating the right
PodSchedulingContext with spec.selectedNode set. There's no need to list nodes
because that choice was already made, permanently. Adding the pod to
reservedFor also isn't hard.

What's currently missing is triggering de-allocation of claims to re-allocate
them for the desired node. This is not important for claims that get created
for the pod from a template and then only get used once, but it might be
worthwhile to add de-allocation in the future.
2023-07-13 21:27:11 +02:00
Kubernetes Prow Robot
406d2dfe61 Merge pull request #119250 from pohly/controller-contextual-logging
kube-controller-manager: finish conversion to contextual logging
2023-07-12 18:59:30 -07:00
Kubernetes Prow Robot
4af23c157c Merge pull request #119242 from carlory/add-logger
change the QueueingHintFn to pass a logger
2023-07-12 13:03:31 -07:00
Kubernetes Prow Robot
2ec4e14bfa Merge pull request #118812 from serathius/storage-metric
Improve apiserver storage size metric
2023-07-12 10:57:26 -07:00
carlory
0599b3caa0 change the QueueingHintFn to pass a logger 2023-07-13 00:56:41 +08:00
Patrick Ohly
7d064812bb kube-controller-manager: finish conversion to contextual logging
This removes all exceptions and fixes the remaining unconverted log calls.
2023-07-12 14:57:29 +02:00
Marek Siarkowicz
7a63997c8a Improve apiserver storage size metric to allow it's graduation
Change name to make it compliant with prometheus guidelines.
Calculate it on demand instead of periodic to comply with prometheus standards.
Replace "endpoint" with "server" label to make it semantically consistent with storage factory
2023-07-12 14:33:10 +02:00
Mengjiao Liu
19869478c1 Migrate /pkg/controller/disruption to structured and contextual logging 2023-07-12 11:30:45 +08:00
Kubernetes Prow Robot
98e7c2a751 Merge pull request #119237 from jpbetz/jpbetz-apiserver-integration-owner
Add jpbetz as approver of apiserver integration tests
2023-07-11 20:03:18 -07:00
Kubernetes Prow Robot
6ffca50136 Merge pull request #116443 from benluddy/secondary-authz-decision-caching
Cache authz decisions within the scope of validating policy admission.
2023-07-11 12:41:11 -07:00
Joe Betz
6d6595d0f6 Add jpbetz as approver of apiserver integration tests 2023-07-11 14:36:45 -04:00
Kubernetes Prow Robot
8f1852bb44 Merge pull request #115295 from Namanl2001/pkg/controller/endpointslice
Migrated `pkg/controller/endpointslice` and `pkg/controller/endpointslicemirroring` to contextual logging
2023-07-11 03:19:12 -07:00
carlory
f443c458af move non-graceful node shutdown to GA 2023-07-11 13:51:51 +08:00
Kubernetes Prow Robot
ad72319ece Merge pull request #115122 from r-erema/110782-oidc-test-coverage
add integration tests for OIDC authenticator
2023-07-10 15:29:10 -07:00
Naman
645cb90732 migrated pkg/controller/endpointslicemirroring to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
2023-07-11 01:43:30 +05:30
Naman
09849b09cf migrated pkg/controller/endpointslice to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
2023-07-11 01:28:22 +05:30
Kubernetes Prow Robot
d653dcab5a Merge pull request #119048 from pohly/scheduler-perf-metrics-for-perfdash
scheduler-perf: metrics for perfdash
2023-07-09 09:27:04 -07:00
Kubernetes Prow Robot
19a25bac05 Merge pull request #119159 from alculquicondor/fix-job-uncounted
Only declare job as finished after removing all finalizers
2023-07-08 01:55:03 -07:00
kerthcet
47ef977ddd Direct reference to the packages
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-08 12:03:46 +08:00
Aldo Culquicondor
f7a1fb76f4 Only declare job as finished after removing all finalizers
Change-Id: Id4b01b0e6fabe24134e57e687356e0fc613cead4
2023-07-07 14:08:19 -04:00
kerthcet
c0eb0caf4a Support fine-gained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
Kubernetes Prow Robot
b07a843cb5 Merge pull request #119046 from kerthcet/fix/handle-unschedule-plugins
Fix fitError in Permit plugin not handled perfectly
2023-07-06 21:01:03 -07:00
kerthcet
278a8376e1 Fix: fiterror in permit plugin not handled perfectly
We only added failed plulgins, but actually this will not work unless
we make the status with a fitError because we only copy the failured plugins
to podInfo if it is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
6f9d1d38d8 Merge pull request #118817 from pohly/dra-delete-claims
DRA: improve handling of completed pods
2023-07-06 10:15:15 -07:00
Kubernetes Prow Robot
8c1bf4f461 Merge pull request #116930 from fatsheep9146/contextual-logging-cleanup
contextual logging cleanup
2023-07-06 07:39:03 -07:00
Kubernetes Prow Robot
e5efa0a5ee Merge pull request #117108 from pohly/test-integration-race-detection-component-base-logs
component-base/logs: improve handling of re-applying a configuration
2023-07-05 21:29:08 -07:00
Kubernetes Prow Robot
cd32adebd9 Merge pull request #118386 from Richabanker/enhance-storage-version
Add servedVersions info in StorageVersion API
2023-07-05 19:23:02 -07:00
Ziqi Zhao
dfc1838379 Migrated pkg/controller/volume|util|replicaset|nodeipam to contextual logging
Signed-off-by: Ziqi Zhao <zhaoziqi9146@gmail.com>
2023-07-06 07:39:52 +08:00
Patrick Ohly
02efe09abe component-base/logs: improve handling of re-applying a configuration
Normal binaries should never have to do this. It's not safe when there are
already some goroutines running which might do logging. Therefore the new
default is to return an error when a binary accidentally re-applies.

A few unit ensure that there are no goroutines and have to call the functions
more then once. The new ResetForTest API gets used by those to enable changing the
logging settings more than once in the same process.

Integration tests use the same code as the normal binaries. To make reuse of
that code safe, component-base/logs can be configured to silently ignore any
additional calls. This addresses data races that were found when enabling -race
for integration tests. To catch cases where the integration test does want
to modify the config, the old and new config get compared and an error is
raised when it's not the same.

To avoid having to modify all integration tests which start test servers,
reconfiguring component-base/logs is done by the test server packages.
2023-07-05 19:08:54 +02:00
Patrick Ohly
7f5a02fc7e dra resourceclaim controller: enhance logging
Adding logging to event handlers makes it more obvious why (or why not) claims
and pods need to be processed.
2023-07-05 16:10:20 +02:00
Kubernetes Prow Robot
a9a7a3730e Merge pull request #118994 from pohly/test-integration-race-detection-grpc-logger
integration testing: configure gRPC logging during init
2023-07-05 02:58:55 -07:00
roman
18f2e9055f Add OIDC integration tests 2023-07-04 08:04:53 +03:00
Patrick Ohly
6b01ece580 scheduler-perf: fix perfdash display problem
perfdash expects all data items to have the same set of labels.  It then
renders drop-down buttons for each label with all values found for each
label. Previously, data items that didn't have a label didn't match any label
filter in perfdash and couldn't get selected because perfdash doesn't have
"unset" in it's drop-down menus.

To avoid that, scheduler-perf now collects all labels and then adds missing
labels with "not applicable" as value:

    {
      "data": {
        "Average": 939.7071223010004,
        "Perc50": 927.7987421383649,
        "Perc90": 2166.153846153846,
        "Perc95": 2363.076923076923,
        "Perc99": 2520.6153846153848
      },
      "unit": "ms",
      "labels": {
        "Metric": "scheduler_pod_scheduling_duration_seconds",
        "Name": "SchedulingBasic/5000Nodes/namespace-2",
        "extension_point": "not applicable",
        "result": "not applicable"
      }
    },
    ...
    {
      "data": {
        "Average": 1.1172570650000004,
        "Perc50": 1.1418367346938776,
        "Perc90": 1.5500000000000003,
        "Perc95": 1.6410256410256412,
        "Perc99": 3.7333333333333334
      },
      "unit": "ms",
      "labels": {
        "Metric": "scheduler_framework_extension_point_duration_seconds",
        "Name": "SchedulingBasic/5000Nodes/namespace-2",
        "extension_point": "Score",
        "result": "not applicable"
      }
    },
2023-07-03 21:16:53 +02:00
Patrick Ohly
29e5771aa4 scheduler-perf: shorten "Name" label in metrics
Because the JSON file gets written at the end of the top-level benchmark, all
data items had `BenchmarkPerfScheduling/` as prefix in the `Name` label. This
is redundant and makes it harder to see the actual name. Now that common prefix
gets removed.
2023-07-03 21:15:16 +02:00
Patrick Ohly
ea34d03925 integration testing: configure gRPC logging during init
Doing the initialization once was not good enough because it was not guaranteed
that RunCustomEtcd gets called early enough, before there are other goroutines
which use gRPC. The data race for
test/integration/apiserver.TestWatchCacheUpdatedByEtcd was:

WARNING: DATA RACE
Read at 0x00000cfffb90 by goroutine 140052:
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.V()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/grpclog.go:41 +0x30
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.(*componentData).V()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/component.go:103 +0x4e
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*http2Client).Close()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:955 +0xca
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*http2Client).reader()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:1619 +0xbfb
  k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.newHTTP2Client.func11()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:394 +0x47

Previous write at 0x00000cfffb90 by goroutine 145643:
  k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.SetLoggerV2()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/loggerv2.go:75 +0x104
  k8s.io/kubernetes/test/integration/framework.RunCustomEtcd.func2()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:157 +0x33
  sync.(*Once).doSlow()
      /usr/local/go/src/sync/once.go:74 +0x101
  sync.(*Once).Do()
      /usr/local/go/src/sync/once.go:65 +0x46
  k8s.io/kubernetes/test/integration/framework.RunCustomEtcd()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:156 +0xb97
  k8s.io/kubernetes/test/integration/apiserver.multiEtcdSetup()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:41 +0xc4
  k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd()
      /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:92 +0xa9
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1629 +0x47
2023-06-30 09:52:57 +02:00
Richa Banker
1c48b7ec14 Add servedVersions info in StorageVersion API 2023-06-29 15:40:54 -07:00