Commit Graph

49698 Commits

Author SHA1 Message Date
Patrick Ohly
a0add8d2c7 dra api: NodeResourceModel -> ResourceModel
When renaming NodeResourceSlice to ResourceSlice, the embedded
[Node]ResourceModel also should have been renamed.
2024-03-14 18:07:36 +01:00
Akihiro Suda
8963e73f12 kubelet: fix mixing up runtime classes with runtime handlers
Fix issue 123906

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-14 08:14:48 +09:00
Akihiro Suda
1dc05009fe api: NodeStatus: rename RuntimeClasses to RuntimeHandlers
The runtime classes are apiserver's concept, while the handlers are kubelet's concept.
For NodeStatus, it makes more sense to return the latter ones here.

This commit modifies the following files:

- pkg/apis/core/types.go
- staging/src/k8s.io/api/core/v1/types.go
- pkg/kubelet/nodestatus/setters.go
- pkg/kubelet/kubelet_node_status.go
- pkg/registry/core/node/strategy.go
- test/e2e_node/mount_rro_linux_test.go

Other changes were auto-generated by running `make update`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-14 08:06:39 +09:00
Akihiro Suda
4a776f66ec kubelet: silence "unknown runtime class" errors when unsupported
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-14 07:08:42 +09:00
Antonio Ojea
7ab1ef644e Revert "Implement a field selector for ClusterIP on Services" 2024-03-12 12:20:27 +00:00
Kubernetes Prow Robot
3ec6a38795 Merge pull request #123828 from klueska/non-nil-parameters
dra scheduler: ensure that we never have nil claim/class parameters
2024-03-11 14:35:57 -07:00
Kubernetes Prow Robot
57c89abb45 Merge pull request #123792 from mimowo/propose-api-comments-fix
Adjust the Job field API comments and validation to the current state
2024-03-11 11:26:04 -07:00
Kevin Klues
21a0dd1d70 dra scheduler: create default claim/class parameters instead of nil
Without this, the scheduler was crashing in newClaimController() in
pkg/scheduler/framework/plugins/dynamicresources/structuredparameters.go

The code in newClaimController() assumes that the parameters are not nil.
Furthermore it assumes that there is at least one DriverRequest populated in
order to allocate any resources to a claim.

This PR adds logic to define default claim/class parameters that will allow
allocation to proceed even if an end user doesn't provide any class or claim
parameters themselves.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-03-11 13:57:16 +00:00
Kevin Klues
fc2134c84c dra kubelet: fix error log
Previously we were returning the error string from 'err' (which is nil), when
we should have been returning it from result.Error. Without this it is hard to
debug issues with NodeUnprepareResources.

Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-03-11 13:51:29 +00:00
Kubernetes Prow Robot
b3926d137c Merge pull request #123831 from klueska/fix-unprepare-resources
Add StructuredResourceModel to UnprepareResources call
2024-03-11 03:25:14 -07:00
Kubernetes Prow Robot
611dbaa055 Merge pull request #122790 from carlory/fix-121696
Fix flaky test: Test_Run_OneVolumeDetachFailNodeWithReadWriteOnce
2024-03-10 19:23:40 -07:00
Kubernetes Prow Robot
8f80e01467 Merge pull request #123719 from enj/enj/f/authn_config_beta
Mark StructuredAuthenticationConfiguration feature gate as beta
2024-03-09 17:09:56 -08:00
Anish Ramasekar
62ac88b9ea Add metrics for authentication config reload
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2024-03-09 14:40:22 -08:00
Kubernetes Prow Robot
77ecfb7800 Merge pull request #123525 from enj/enj/f/authn_config_reload
Add dynamic reload support for authentication configuration
2024-03-09 14:13:37 -08:00
Monis Khan
b4935d910d Add dynamic reload support for authentication configuration
Signed-off-by: Monis Khan <mok@microsoft.com>
2024-03-09 14:29:33 -05:00
Kevin Klues
13a6dcc21c dra kubelet: add StructuredResourceModel to UnprepareResources call
Signed-off-by: Kevin Klues <kklues@nvidia.com>
2024-03-09 18:08:14 +00:00
Akihiro Suda
c7f52b34f3 kubelet: KEP-3857: Recursive Read-only (RRO) mounts
See <https://kep.k8s.io/3857>.

An example manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rro
spec:
  volumes:
    - name: mnt
      hostPath:
        # tmpfs is mounted on /mnt/tmpfs
        path: /mnt
  containers:
    - name: busybox
      image: busybox
      args: ["sleep", "infinity"]
      volumeMounts:
        # /mnt-rro/tmpfs is not writable
        - name: mnt
          mountPath: /mnt-rro
          readOnly: true
          mountPropagation: None
          recursiveReadOnly: IfPossible
        # /mnt-ro/tmpfs is writable
        - name: mnt
          mountPath: /mnt-ro
          readOnly: true
        # /mnt-rw/tmpfs is writable
        - name: mnt
          mountPath: /mnt-rw
```

Requirements:
- Feature gate "RecursiveReadOnlyMounts" to be enabled
- Linux kernel >= 5.12
- runc >= 1.1

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Akihiro Suda
6f12e1d8e5 kubelet: expose containerStatuses.volumeMounts
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Akihiro Suda
dd0882a83e kubelet: expose node.status.runtimeClasses
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 03:00:59 +09:00
Akihiro Suda
8db07446f1 api: validate RecursiveReadOnlyMounts
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-10 02:59:30 +09:00
Akihiro Suda
8828530fd5 node: dropDisabledFields: recognize RecursiveReadOnlyMounts gate
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:13 +09:00
Akihiro Suda
ce1918875f pod: dropDisabledFields: recognize RecursiveReadOnlyMounts
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:12 +09:00
Akihiro Suda
d940886d0a api: KEP-3857: Recursive Read-only (RRO) mounts
This commit modifies the following files:

- pkg/apis/core/types.go
- staging/src/k8s.io/api/core/v1/types.go

Other changes were auto-generated by running `make update`.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:12 +09:00
Akihiro Suda
0b1a507b00 pkg/features: add RecursiveReadOnlyMounts
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:10 +09:00
Akihiro Suda
76081a10c2 kubelet: RuntimeHandler: add SupportsRecursiveReadOnlyMounts
For KEP-3857: Recursive Read-only (RRO) mounts

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:09 +09:00
Akihiro Suda
27f24a62e3 kubelet: change map[string]RuntimeHandler to []RuntimeHandler
The map is changed to an array so as to retain the order of the original array
propagated from the CRI runtime.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2024-03-09 09:48:07 +09:00
Kubernetes Prow Robot
d3d06c3c7e Merge pull request #123826 from tenzen-y/use-fake-client-job-unit
Job: Use the fake clock in TestTrackJobStatusAndRemoveFinalizers
2024-03-08 15:11:13 -08:00
Yuki Iwai
f2508df279 Job: Use the fake clock in TestTrackJobStatusAndRemoveFinalizers
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-03-09 06:09:05 +09:00
Nilekh Chaudhari
9161302e7f feat: implements svm controller
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
2024-03-08 19:25:10 +00:00
Michal Wozniak
79fe37537c Adjust the validation to the current state 2024-03-08 17:43:24 +01:00
Michal Wozniak
1163c7ed9c Adjust the API comments to the current state 2024-03-08 17:29:49 +01:00
Nilekh Chaudhari
91a7708cdc feat: implements Storage Version Migration API in-tree
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
2024-03-08 04:18:56 +00:00
Kubernetes Prow Robot
7ea3d0245a Merge pull request #123516 from pohly/dra-structured-parameters
DRA: structured parameters
2024-03-07 19:24:48 -08:00
Kubernetes Prow Robot
9ad2aabc64 Merge pull request #123520 from haircommander/proc-mount-rely-userns-2
KEP-4265: Update Unmasked ProcMountType to fail validation without a pod level user namespace
2024-03-07 18:21:08 -08:00
Kubernetes Prow Robot
b1741c004b Merge pull request #123811 from tallclair/apparmor-ga
Keep providing the deprecated AppArmor CRI API for runtimes that haven't migrated
2024-03-07 16:18:44 -08:00
Tim Allclair
04ac13b6b7 Keep providing the deprecated AppArmor CRI API for runtimes that haven't migrated 2024-03-07 15:00:07 -08:00
Kubernetes Prow Robot
364ef335db Merge pull request #123412 from tenzen-y/add-new-jobsuccesspolicy-api
Job: Support for the SuccessPolicy
2024-03-07 14:49:20 -08:00
Patrick Ohly
6a361e1f36 dra api: enable new CEL features by faking their version
There are two approaches for making new versioned CEL features available in the
release where they get introduced:
- Always use the environment for "StoredExpressions".
- Use an older version (typically 1.0) and only bump it up later.

The second approach was used before, so this is now also done here.
2024-03-07 22:26:20 +01:00
Patrick Ohly
251b3859b0 dra scheduler: consider in-flight allocation for resource calculation
Storing a modified claim with allocation and the original resource version in
the assume cache was not reliable: if an update was received, it replaced the
modified claim and the resource that was reserved for the claim might have been
used for some other claim.

To fix this, the in-flight claims are now stored in the map instead of just a
boolean and the status stored there overrides whatever is in the assume cache.

Logging got extended to diagnose this problem better. It started to occur in
E2E tests after splitting the claim update so that first the finalizer is set
and then the status, because setting the finalizer triggered an update.
2024-03-07 22:26:16 +01:00
Patrick Ohly
0b6a0d686a dra api: rename NodeResourceSlice -> ResourceSlice
While currently those objects only get published by the kubelet for node-local
resources, this could change once we also support network-attached
resources. Dropping the "Node" prefix enables such a future extension.

The NodeName in ResourceSlice and StructuredResourceHandle then becomes
optional. The kubelet still needs to provide one and it must match its own node
name, otherwise it doesn't have permission to access ResourceSlice objects.
2024-03-07 22:22:55 +01:00
Patrick Ohly
42ee56f093 dra api: implement semver attribute value type
This adds support for semantic version comparison to the CEL support in the
"named resources" structured parameter model. For example, it can be used to
check that an instance supports a certain API level.

To minimize the risk, the new "semver" type is only defined in the CEL
environment for DRA expressions, not in the base library. See
https://github.com/kubernetes/kubernetes/pull/123664 for a PR which
adds it to the base library.

Validation of semver strings is done with the regular expression from
semver.org. The actual evaluation at runtime then uses semver/v4.
2024-03-07 22:22:13 +01:00
Patrick Ohly
d59676a545 dra kubelet: publish NodeResourceSlices
The information is received from the DRA driver plugin through a new gRPC
streaming interface. This is backwards compatible with old DRA driver kubelet
plugins, their gRPC server will return "not implemented" and that can be
handled by kubelet. Therefore no API break is needed.

However, DRA drivers need to be updated because the Go API changed. They can
return
    status.New(codes.Unimplemented, "no node resource support").Err()
if they don't support the new ListAndWatchResources method and
structured parameters.

The controller in kubelet then synchronizes this information from the driver
with NodeResourceSlice objects, creating, updating and deleting them as needed.
2024-03-07 22:22:13 +01:00
Patrick Ohly
3de376ecf6 dra controller: support structured parameters
When allocation was done by the scheduler, the controller needs to do the
deallocation because there is no control-plane controller which could react to
"DeallocationRequested".
2024-03-07 22:22:13 +01:00
Patrick Ohly
6f1ddfcd2e kubelet: support structured parameters for preparing resources
If the resource handle has data from a structured parameter model, then we need
to pass that to the DRA driver kubelet plugin. Because Kubernetes uses
gogo/protobuf, we cannot use "optional" for that new optional field and have to
resort to "repeated" with a single repetition if present.

This is a new, backwards-compatible field.

That extending the resource.k8s.io changes the checksum of a kubelet checkpoint
is unfortunate. Updating the test cases is a stop-gap measure, the actual
solution will have to be something else before beta.
2024-03-07 22:22:13 +01:00
Patrick Ohly
d4d5ade7f5 dra: add "named resources" structured parameter model
Like the current device plugin interface, a DRA driver using this model
announces a list of resource instances. In contrast to device plugins, this
list is made available to the scheduler together with attributes that can be
used to select suitable instances when they are not all alike.

Because this is the first structured parameter model, some checks that
previously were not possible, in particular "is one structured parameter field
set", now gets enabled. Adding another structured parameter model will be
similar.

The applyconfigs code generator assumes that all types in an API are defined in
a single package. If it wasn't for that, it would be possible to place the
"named resources" types in separate packages, which makes their names in the Go
code more natural and provides an indication of their stability level because
the package name could include a version.
2024-03-07 22:21:16 +01:00
Patrick Ohly
096e948905 dra scheduler: support structured parameters
When a claim uses structured parameters, as indicated by the resource class
flag, the scheduler is responsible for allocating it. To do this it needs to
gather information about available node resources by watching
NodeResourceSlices and then match the in-tree claim parameters against those
resources.
2024-03-07 22:21:04 +01:00
Peter Hunt
23706cb90c api validation: validate proc mount against user namespace
fail if container uses proc mount unmasked but pod does not use user namespace

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-03-07 15:56:06 -05:00
Yuki Iwai
e216742672 Job: Support for the JobSuccessPolicy (alpha)
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-03-08 05:49:09 +09:00
Kubernetes Prow Robot
cc6d9b3037 Merge pull request #123789 from tallclair/apparmor-warnings
Warn on deprecated AppArmor annotation use
2024-03-07 11:53:54 -08:00
Tim Allclair
7bd78b06e9 Warn on deprecated AppArmor annotation use 2024-03-07 09:51:48 -08:00