Commit Graph

467 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
c53ac4cee4
Merge pull request #123157 from jsafrane/selinux-rwx
Add SELinuxMount feature gate
2024-02-26 12:06:39 -08:00
Jan Safranek
2e92036576 Rename "new" reconstruction just to reconstruction
There is no "old" reconstruction, so remove "_new" from the file names and
function names.
2024-02-22 13:20:38 +01:00
Jan Safranek
2a2542289f Remove usage of NewVolumeManagerReconstruction feature gate
This removes lot of code related to "old" VolumeManage reconstruction.
2024-02-22 10:21:13 +01:00
Jan Safranek
d7028a8ed5 Add SELinuxMount feature gate
The feature gate enables mounting with -o context=XYZ mount option for all
volume types, not only ReadWriteOncePod.

All SELinux label tracking & error reporting infrastructure is already in
place from SELinuxMountReadWriteOncePod feature gate. This is just a
trivial extension to all access modes.
2024-02-20 15:40:21 +01:00
Hemant Kumar
d190fa3e7d Fix race condition between external-resizer and kubelet
This fixes the race condition that could happen because
resize controller just finished volume expansiona and has only
finished marking PV and yet to mark PVC.

The workaround proposed here should not be necessary once
RecoverVolumeExpansionFailure goes GA/beta.
2024-01-31 12:23:56 -05:00
Kubernetes Prow Robot
c633ea71ed
Merge pull request #122211 from gnufied/fix-uncertain-raw-block-devices
Fix device uncertain errors on reboot
2023-12-15 15:42:40 +01:00
Hemant Kumar
e706b6ba14 Use a separate function for checking if device was reconstructed 2023-12-14 11:37:47 -05:00
Kubernetes Prow Robot
8cc47b64b2
Merge pull request #121795 from carlory/cleanup-after-blockvolume-featuregate-removed
cleanup todo after feature.BlockVolume gate was removed
2023-12-13 23:54:43 +01:00
Kubernetes Prow Robot
26e2cc5299
Merge pull request #119923 from cvvz/fix-119921
fix: Mount point may become local without calling `NodePublishVolume` after node rebooting
2023-12-13 21:25:51 +01:00
Hemant Kumar
56dd5ab10f Add tests for checking of uncertain device paths 2023-12-11 17:15:16 -05:00
Hemant Kumar
ed0facacfa Fix device uncertain errors on reboot 2023-12-06 22:19:14 -05:00
carlory
1c0044594d cleanup todo after feature.BlockVolume gate was removed 2023-11-09 10:01:24 +08:00
weizhichen
b91f07008c add ut 2023-11-06 08:20:42 +00:00
Jan Safranek
e511edf11f Fix SELinux unit tests
Use device mountable volume, to make it impossible to share the same global
mount with different SELinux contexts.

And fix pod2Name to actually refer to pod2.
2023-10-25 10:43:29 +02:00
Jan Safranek
2f5903b4cf Move SELinux warning metric to be counted once per pod
volume_manager_selinux_volume_context_mismatch_warnings_total should be
counted only once per volume + pod. The previous location is evaluated
periodically, so bump the metric only when a new pod is added to volume.
2023-10-25 10:43:29 +02:00
Kubernetes Prow Robot
8453eb0c24
Merge pull request #121069 from jsafrane/ocp-add-plugin-label
Add volume plugin label to SELinux metrics
2023-10-25 08:13:20 +02:00
Chris Henzie
2dbd405583 Graduate ReadWriteOncePod to GA 2023-10-20 10:40:39 -07:00
Jan Safranek
0be5fdb5ce Add volume plugin label to SELinux metrics
Record volume plugin name when a volume in a Pod needs a different
"mount -o context" value than the actually mounted one.

We expect that NFS, CIFS and CephFS volumes would be able to mount such
volumes just fine with multiple "-o context" values.

We know that the block-volume based ones (ext4, xfs, btrfs, ...) cannot do
that.

Therefore want to distinguish the volume plugin in metrics, anything
block-volume based could break an existing application.
2023-10-09 11:18:39 +02:00
cvvz
03126c5465 add comment 2023-08-29 10:46:31 +08:00
cvvz
94d03ccc83 Squashed commit of the following:
commit d623614de31fe411f1dcb1e784472135f3ca0c5e
Merge: 8054af3b303 91344b4008
Author: cvvz <ftdchenwz@gmail.com>
Date:   Mon Aug 28 18:43:49 2023 +0800

    Merge branch 'master' of https://github.com/kubernetes/kubernetes into fix-volumemanager-logs

commit 8054af3b303e10e7b74b1ba4d3c4035f488cbdad
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 22:03:08 2023 +0800

    fix

commit b414972831c4e4030162ee385d8f600e1e0257ac
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 21:41:36 2023 +0800

    fix

commit ebea00a8dd50eb3d8859a912b464bbda5548b1d4
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 20:54:40 2023 +0800

    123

commit 9f6f1dbbe717fa34e1c13fec645f4c474cbf99a0
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 20:53:16 2023 +0800

    add MarshalLog

commit d7d2878409343df937c770d6796f8c125e18ce7a
Author: cvvz <ftdchenwz@gmail.com>
Date:   Tue Aug 8 23:57:47 2023 +0800

    fix volumemanager logs
2023-08-28 18:44:40 +08:00
cvvz
56c241783e fix 2023-08-25 19:56:54 +08:00
cvvz
ab1f97bd6e fix 2023-08-25 19:55:56 +08:00
Patrick Ohly
2472291790 api: introduce separate VolumeResourceRequirements struct
PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To avoid such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.

The `Claims` field gets removed because risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI. Such files
have never made sense and should be fixed.

Code that uses the struct definitions needs to be updated.
2023-08-21 15:31:28 +02:00
cvvz
e40d00cf53 fix: 119921 2023-08-13 15:52:25 +08:00
Hemant Kumar
e011187114 Update code to use new generic allocatedResourceStatus field 2023-07-17 15:30:35 -04:00
Jan Safranek
354b6c409f Rename updateReconstructedFromAPIServer
to be in sync with volumesNeedUpdateFromNodeStatus.
2023-07-11 11:25:43 +02:00
Jan Safranek
1903f5aa2a Rename volumesNeedDevicePath
To volumesNeedUpdateFromNodeStatus - because both devicePath and uncertain
attach-ability needs to be fixed from node status.
2023-07-11 11:15:24 +02:00
Jan Safranek
7cd60df4aa Update volumesInUse after attachability is confirmed
node.status.volumesInUse should report only attachable volumes, therefore
it needs to wait for the reconciler to update uncertain attachability of
volumes from the API server.
2023-07-11 10:32:22 +02:00
Jan Safranek
0a2272dc68 Add uncertain state of volume attach-ability
During CSI volume reconstruction it's not possible to tell, if the volume
is attachable or not - CSIDriver instance may not be available, because
kubelet may not have connection to the API server at that time.

Adding uncertain state during reconstruction + adding a correct state when
the API server is available.
2023-07-11 10:32:22 +02:00
Jan Safranek
45aa59946a Refactor FindAttachablePluginBySpec out of CSI code path
reconstructVolume() is called when kubelet may not have connection to the
API server yet, therefore it cannot get CSIDriver instances to figure out
if a CSI volume is attachable or not.

Refactor reconstructVolume(), so it does not need
FindAttachablePluginBySpec for CSI volumes, because all of them are
deviceMountable (i.e. FindDeviceMountablePluginBySpec always returns the
CSI volume plugin).
2023-06-23 12:28:15 +02:00
Kubernetes Prow Robot
89bfdf0276
Merge pull request #117079 from qingwave/sort-volumes
kubelet/volumemanager: sort unmounted volumes in error message
2023-06-07 18:52:12 -07:00
Clayton Coleman
1f16d71185
kubelet: Rename PodManager DeletePod to RemovePod
RemovePod is more consistent within the kubelet to be the opposite
of AddPod, and the pod is not being deleted just "removed" from
tracking.
2023-05-12 12:57:27 -04:00
Clayton Coleman
166256f73e
kubelet: Reduce the interface pod.Manager consumers accept
Every component that uses a pod.Manager should use a stub interface
(like we do for podWorker) that explicitly describes what methods
they use. This will allow podWorker to implement the minimum set
of manager interfaces.
2023-05-12 12:57:27 -04:00
Clayton Coleman
bb568844b6
kubelet: Separate the MirrorClient from the PodManager
The two are not coupled except accidentally. Separate them and
update callsites. This will reduce the scope of PodManager interface
to make exposing the pod worker cleaner.
2023-05-12 12:57:26 -04:00
Todd Neal
453f81d1ca
kubelet: pass context to VolumeManager.WaitFor*
This allows us to return with a timeout error as soon as the
context is canceled.  Previously in cases where the mount will
never succeed pods can get stuck deleting for 2 minutes.

In the Sync*Pod methods that call VolumeManager.WaitFor*, we
must filter out wait.Interrupted errors from being logged as
they are part of control flow, not runtime problems. Any
early interruption should result in exiting the Sync*Pod method
as quickly as possible without logging intermediate errors.
2023-04-17 11:53:28 -05:00
Kubernetes Prow Robot
f46626364f
Merge pull request #116833 from mpatlasov/fix-memleak-in-kubelet-volumemanager
Fix memory leak in kubelet volume_manager populator processedPods
2023-04-11 18:19:58 -07:00
Kubernetes Prow Robot
4893c66a48
Merge pull request #116134 from cvvz/fix-111933
fix: After a Node is down and take some time to get back to up again, the mount point of the evicted Pods cannot be cleaned up successfully.
2023-04-11 15:35:41 -07:00
qingwave
e1bcfd47da Sort unmounted volumes message in volume manager 2023-04-04 06:44:17 +00:00
Maxim Patlasov
fbf33e32e6 Fix memory leak in kubelet volume_manager populator processedPods
`findAndRemoveDeletedPods()` processes only pods from volume_manager cache: `dswp.desiredStateOfWorld.GetVolumesToMount()`. `podWorker` calls volume_manager `WaitForUnmount()` asynchronously. If it happens after populator cleaned up resources, an entry is added to `processedPods` and will never be seen. Let's cleanup such entries if they don't have a pod and marked for deletion.
2023-03-21 20:16:02 -07:00
Paco Xu
5134520a3b add lock in volume manager reconciler to avoid data race
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2023-03-17 21:29:10 +08:00
Kubernetes Prow Robot
49649c89ea
Merge pull request #113584 from yangjunmyfm192085/volume-contextual-logging
volume: use contextual logging
2023-03-14 10:40:16 -07:00
Kubernetes Prow Robot
aa49f001bc
Merge pull request #114701 from goushicui/vlm
update comment
2023-03-14 09:38:53 -07:00
Jan Safranek
c4f8c3f628 Fix volume reconstruction in standalone mode
Kubelet in standalone mode won't have kubeclient, it cannot get node.status
and get devices from it. Such a kubelet cannot mount attachable volumes
anyway.
2023-03-14 12:32:21 +01:00
杨军10092085
361e4ff0fa volume: use contextual logging 2023-03-14 08:37:30 +08:00
Kubernetes Prow Robot
1f2d49972c
Merge pull request #116424 from jsafrane/add-selinux-metric-test
Add e2e tests for SELinux metrics
2023-03-10 12:41:06 -08:00
Jan Safranek
05cd2ba863 Don't bump nr. of admitted volumes on retry
AddPodToVolume is called periodically, it does not make sense to bump
volume_manager_selinux_volumes_admitted_total on each call.
2023-03-10 15:03:56 +01:00
Jan Safranek
48ea6a3f3a Fix SELinux mismatch metrics
DesiredStateOfWorld must remember both
- the effective SELinux label to apply as a mount option (non-empty for
  RWOP volumes, empty otherwise)
- and the label that _would_ be used if the mount option would be used by
  all access modes.

Mismatch warning metrics must be generated from the second label.
2023-03-10 15:03:56 +01:00
Todd Neal
4096c9209c dedupe pod resource request calculation 2023-03-09 17:15:53 -06:00
weizhichen
a6ffbb41f8 Squashed commit of the following:
commit 1b3ae27e7af577372d5aaaf28ea401eb33d1c4df
Author: weizhichen <weizhichen@microsoft.com>
Date:   Thu Mar 9 08:39:04 2023 +0000

    fix

commit 566e139308e3cec4c9d4765eb4ccc3a735346c2e
Author: weizhichen <weizhichen@microsoft.com>
Date:   Thu Mar 9 08:36:32 2023 +0000

    fix unit test

commit 13a58ebd25b824dcf854a132e9ac474c8296f0bf
Author: weizhichen <weizhichen@microsoft.com>
Date:   Thu Mar 2 03:32:39 2023 +0000

    add unit test

commit c984e36e37c41bbef8aec46fe3fe81ab1c6a2521
Author: weizhichen <weizhichen@microsoft.com>
Date:   Tue Feb 28 15:25:56 2023 +0000

    fix imports

commit 58ec617e0ff1fbd209ca0af3237017679c3c0ad7
Author: weizhichen <weizhichen@microsoft.com>
Date:   Tue Feb 28 15:24:21 2023 +0000

    delete CheckVolumeExistenceOperation

commit 0d8cf0caa78bdf1f1f84ce011c4cc0e0de0e8707
Author: weizhichen <weizhichen@microsoft.com>
Date:   Tue Feb 28 14:29:37 2023 +0000

    fix 111933
2023-03-09 09:53:38 +00:00
Kubernetes Prow Robot
2c8f63f693
Merge pull request #115268 from jsafrane/split-reconstruction
Split volume reconstruction refactoring from SELinuxMountReadWriteOncePod
2023-03-07 10:44:34 -08:00