Commit Graph

25669 Commits

Author SHA1 Message Date
Sergey Kanzhelev
62f96d2748 set AllocatedResourcesStatus in the Pod Status 2024-07-24 00:29:35 +00:00
Kubernetes Prow Robot
1353c08110 Merge pull request #126298 from vinayakankugoyal/apparmortest
Update AppArmor e2e tests to use both containers[*].securityContext.appArmorProfile field and annotations.
2024-07-23 15:45:29 -07:00
Cici Huang
a48a92c72e Allowing direct CEL reserved keyword usage in CRD (#126188)
* automatically escape reserved keywords for direct usage

* Add reserved keyword support in a ratcheting way, add tests.

---------

Co-authored-by: Wenxue Zhao <ballista01@outlook.com>
2024-07-23 15:45:20 -07:00
Kubernetes Prow Robot
fa4b8f32ac Merge pull request #125935 from gjkim42/fix-125880
Terminate restartable init containers ignoring not-started containers
2024-07-23 15:45:11 -07:00
Kubernetes Prow Robot
2a372a99bc Merge pull request #126290 from tenzen-y/use-type-parameters-instead-of-casting
Job: Use type parameters instead of type casting for the ptr libraries
2024-07-23 14:40:28 -07:00
Kubernetes Prow Robot
320f1ab30d Merge pull request #126182 from sohankunkerkar/fix-procmount
test/e2e/windows: drop securityContext test for ProcMount
2024-07-23 14:39:51 -07:00
cici37
ac2c450da7 Update with stdlib errors 2024-07-23 21:16:53 +00:00
Kubernetes Prow Robot
c2fdeca4ab Merge pull request #126145 from carlory/kep-3751-api
[KEP-3751] Promote VolumeAttributesClass to beta
2024-07-23 13:31:05 -07:00
Kubernetes Prow Robot
107f621462 Merge pull request #126108 from gnufied/changes-volume-recovery
Reduce state changes when expansion fails and mark certain failures as infeasible
2024-07-23 13:30:56 -07:00
Drew Sirenko
16c2ad5b84 Add labels to PVCollector bound/unbound PVC metrics for VolumeAttributesClass Feature (#126166)
* Add labels to PVCollector bound/unbound PVC metrics

* fixup! Add labels to PVCollector bound/unbound PVC metrics

* wip: Fix 'Unknown
    Decorator'

* fixup! Add labels to PVCollector bound/unbound PVC metrics
2024-07-23 12:21:29 -07:00
Kubernetes Prow Robot
e83fca8dd9 Merge pull request #124530 from sttts/sttts-controlplane-plumbing-split
Step 12 - Add generic controlplane example
2024-07-23 12:21:02 -07:00
Kubernetes Prow Robot
05bb5f71f8 Merge pull request #120611 from pohly/dra-resource-quotas
DRA: resource quotas
2024-07-23 12:20:44 -07:00
Yuki Iwai
25c2731399 Job: Use type parameters instead of type casting for the ptr libraries
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-07-24 03:38:18 +09:00
Kubernetes Prow Robot
9c2302dd3e Merge pull request #126201 from aroradaman/revert-debug-steps
Revert debug steps and logs for #123760
2024-07-23 11:02:38 -07:00
Kubernetes Prow Robot
67c7e77044 Merge pull request #126047 from cpanato/upgrade-go-123
[go] Bump images, dependencies and versions to go 1.23rc2
2024-07-23 11:02:29 -07:00
Kubernetes Prow Robot
a00181d4d4 Merge pull request #121902 from carlory/kep-3751-pv-controller
[kep-3751] pvc bind pv with vac
2024-07-23 11:02:13 -07:00
Sohan Kunkerkar
c5b01a30d3 test/e2e/windows: drop securityContext test for ProcMount
Fixes https://github.com/kubernetes/kubernetes/issues/126180

As the ProcMountType feature is disabled by default in beta and relies
on the UserNamespacesSupport feature, which is also set to false in beta,
running this test is unnecessary.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2024-07-23 13:45:29 -04:00
Vinayak Goyal
b580eb1864 Update AppArmor e2e tests to use Pod field instead of annotations.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2024-07-23 17:03:17 +00:00
Patrick Ohly
299ecde5cc DRA quota: add ResourceClaim v1.ResourceQuota limits
Dynamic resource allocation is similar to storage in the sense that users
create ResourceClaim objects to request resources, same as with persistent
volume claims. The actual resource usage is only known when allocating claims,
but some limits can already be enforced at admission time:

- "count/resourceclaims.resource.k8s.io" limits the number of ResourceClaim objects in
  a namespace; this is a generic feature that is already supported also without
  this commit.

- "resourceclaims" is *not* an alias - use "count/resourceclaims.resource.k8s.io"
  instead.

- <device-class-name>.deviceclass.resource.k8s.io/devices limits the number of
  ResourceClaim objects in a namespace such that the number of devices
  requested through those objects with that class does not exceed the limit.

A single request may cause the allocation of multiple devices. For exact
counts, the quota limit is based on the sum of those exact counts. For requests
asking for "all" matching devices, the maximum number of allocated devices per
claim is used as a worst-case upper bound.

Requests asking for "admin access" contribute to the quota.

DRA quota: remove admin mode exception
2024-07-23 18:52:34 +02:00
Patrick Ohly
b5c94966bd DRA e2e: fix the quota name
The actual name has the k8s.io suffix.
2024-07-23 18:52:33 +02:00
Antonio Ojea
046e976bab cap the num of nodes on the noSNAT test and remove slow and NoSNAT tag
run NoSNAT network test between pods without any feature tag
2024-07-23 16:27:11 +00:00
Kubernetes Prow Robot
77c3859aee Merge pull request #126270 from stlaz/aggroapi-refactor
integration tests: split Wardle aggregation test API server running
2024-07-23 09:21:37 -07:00
Kubernetes Prow Robot
a4f9910c51 Merge pull request #126014 from PannagaRao/kep-ephemeral-storage-quota
pkg/volume/*: Enable quotas in user namespace
2024-07-23 09:21:02 -07:00
Kubernetes Prow Robot
7590cb7adf Merge pull request #125257 from vinayakankugoyal/armor
KEP-24: Update AppArmor feature gates to GA stage.
2024-07-23 09:20:52 -07:00
Connor Catlett
796ae44c08 Return new PVC in WaitForVolumeModification to prevent stale comparison
Signed-off-by: Connor Catlett <conncatl@amazon.com>
2024-07-23 14:34:34 +00:00
Kubernetes Prow Robot
1854839ff0 Merge pull request #126067 from tenzen-y/implement-job-success-policy-e2e
Graduate the JobSuccessPolicy to Beta
2024-07-23 06:14:23 -07:00
Yuki Iwai
0d4f18bd5b Job: Implement E2E tests for the JobSuccessPolicy
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-07-23 21:05:50 +09:00
Stanislav Láznička
18f4fa0f1a cosmetic - test/integration/examples/apiserver_test.go - put test functions first
The file is too big, test functions should be put first for clarity.
2024-07-23 13:01:32 +02:00
Stanislav Láznička
5a15ae03f2 test:integration: split Wardle test server run
Split running the Wardle aggregated API into preparation and
running phase. This allows reusing the prepared options and
makes it possible for us to introduce additional hooks into
the server authorization flow.
2024-07-23 13:00:53 +02:00
Maciej Skoczeń
c15cdf7431 Init etcd and apiserver per test case in scheduler_perf integration tests 2024-07-23 09:10:01 +00:00
carlory
3a6a4830df pvc bind pv with vac 2024-07-23 15:04:11 +08:00
Dr. Stefan Schimanski
17970b291a generic-controlplane: add generic-controlplane apiserver sample
Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>

generic

Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>
2024-07-23 08:38:33 +02:00
carlory
0260c7d023 Promote VolumeAttributesClass to beta 2024-07-23 13:58:14 +08:00
Cici Huang
5420b2fe9a Hot fix for panic on schema conversion. (#126167) 2024-07-22 19:43:45 -07:00
Yuki Iwai
551931c6a8 Graduate the JobSuccessPolicy to beta
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-07-23 09:29:06 +09:00
Yuki Iwai
6e8dc2c250 Job: Extend the jobs_finished_total metric reason label with SuccessPolicy and CompletionsReached
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-07-23 09:29:02 +09:00
Kubernetes Prow Robot
3d78fe25a7 Merge pull request #121849 from carlory/add-e2e-vac
vac add e2e test
2024-07-22 16:53:03 -07:00
Kubernetes Prow Robot
3e9a73d558 Merge pull request #126058 from AnishShah/patch-2
Deflake kubernetes-node-swap-fedora-serial jobs
2024-07-22 15:48:42 -07:00
Kubernetes Prow Robot
233bc735b5 Merge pull request #126056 from googs1025/refactor_namespace
use ktesting.NewTestContext(t) ctx instead of context.TODO() for namespace integration
2024-07-22 14:25:56 -07:00
Kubernetes Prow Robot
6e52e705d0 Merge pull request #125374 from pwschuurman/kep-3335-stable
Promote StatefulSetStartOrdinal to stable in 1.31
2024-07-22 14:25:49 -07:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Kubernetes Prow Robot
f458a749e7 Merge pull request #125277 from iholder101/swap/skip_critical_pods
[KEP-2400]: Restrict access to swap for containers in high priority Pods
2024-07-22 11:45:48 -07:00
Kubernetes Prow Robot
887def08b6 Merge pull request #126237 from cici37/promoteMetrics
Promote metrics for VAP and CRD validation rules to beta.
2024-07-22 10:17:49 -07:00
Patrick Ohly
d11b58efe6 DRA kubelet: refactor gRPC call timeouts
Some of the E2E node tests were flaky. Their timeout apparently was chosen
under the assumption that kubelet would retry immediately after a failed gRPC
call, with a factor of 2 as safety margin. But according to
0449cef8fd,
kubelet has a different, higher retry period of 90 seconds, which was exactly
the test timeout. The test timeout has to be higher than that.

As the tests don't use the gRPC call timeout anymore, it can be made
private. While at it, the name and documentation gets updated.
2024-07-22 18:09:34 +02:00
Patrick Ohly
357a2926a1 DRA e2e: update VAP for a kubelet plugin
This fixes the message (node name and "cluster-scoped" were switched) and
simplifies the VAP:
- a single matchCondition short circuits completely unless they're a user
  we care about
- variables to extract the userNodeName and objectNodeName once
  (using optionals to gracefully turn missing claims and fields into empty strings)
- leaves very tiny concise validations

Co-authored-by: Jordan Liggitt <liggitt@google.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
9f36c8d718 DRA: add DRAControlPlaneController feature gate for "classic DRA"
In the API, the effect of the feature gate is that alpha fields get dropped on
create. They get preserved during updates if already set. The
PodSchedulingContext registration is *not* restricted by the feature gate.
This enables deleting stale PodSchedulingContext objects after disabling
the feature gate.

The scheduler checks the new feature gate before setting up an informer for
PodSchedulingContext objects and when deciding whether it can schedule a
pod. If any claim depends on a control plane controller, the scheduler bails
out, leading to:

    Status:       Pending
    ...
      Warning  FailedScheduling             73s   default-scheduler  0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.

The rest of the changes prepare for testing the new feature separately from
"structured parameters". The goal is to have base "dra" jobs which just enable
and test those, then "classic-dra" jobs which add DRAControlPlaneController.
2024-07-22 18:09:34 +02:00
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
c526d7796e DRA e2e: use VAP to control "admin access" permissions
The advantages of using a validation admission policy (VAP) are that no changes
are needed in Kubernetes and that admins have full flexibility if and how they
want to control which users are allowed to use "admin access" in their
requests.

The downside is that without admins taking actions, the feature is enabled
out-of-the-box in a cluster. Documentation for DRA will have to make it very
clear that something needs to be done in multi-tenant clusters.

The test/e2e/testing-manifests/dra/admin-access-policy.yaml shows how to do
this. The corresponding E2E tests ensures that it actually works as intended.

For some reason, adding the namespace to the message expression leads to a
type check errors, so it's currently commented out.
2024-07-22 18:09:34 +02:00
Patrick Ohly
0b62bfb690 DRA e2e: adapt to v1alpha3 API 2024-07-22 18:09:34 +02:00
Patrick Ohly
91d7882e86 DRA: new API for 1.31
This is a complete revamp of the original API. Some of the key
differences:
- refocused on structured parameters and allocating devices
- support for constraints across devices
- support for allocating "all" or a fixed amount
  of similar devices in a single request
- no class for ResourceClaims, instead individual
  device requests are associated with a mandatory
  DeviceClass

For the sake of simplicity, optional basic types (ints, strings) where the null
value is the default are represented as values in the API types. This makes Go
code simpler because it doesn't have to check for nil (consumers) and values
can be set directly (producers). The effect is that in protobuf, these fields
always get encoded because `opt` only has an effect for pointers.

The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new
"request" field. This is considered acceptable because the entire `claims`
field in the pod spec is still alpha.

The implementation is complete enough to bring up the apiserver.
Adapting other components follows.
2024-07-22 18:09:34 +02:00