Prior to this commit, `readOnly` volumes were not recursively read-only and
could result in compromise of data;
e.g., even if `/mnt` was mounted as read-only, its submounts such as
`/mnt/usbstorage` were not read-only.
This commit utilizes runc's "rro" bind mount option to make read-only bind
mounts literally read-only. The "rro" bind mount options is implemented by
calling `mount_setattr(2)` with `MOUNT_ATTR_RDONLY` and `AT_RECURSIVE`.
The "rro" bind mount options requires kernel >= 5.12, with runc >= 1.1 or
a compatible runtime such as crun >= 1.4.
When the "rro" bind mount options is not available, containerd falls back
to the legacy non-recursive read-only mounts by default.
The behavior is configurable via `/etc/containerd/config.toml`:
```toml
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
# treat_ro_mounts_as_rro ("Enabled"|"IfPossible"|"Disabled")
# treats read-only mounts as recursive read-only mounts.
# An empty string means "IfPossible".
# "Enabled" requires Linux kernel v5.12 or later.
# This configuration does not apply to non-volume mounts such as "/sys/fs/cgroup".
treat_ro_mounts_as_rro = ""
```
Replaces:
- kubernetes/enhancements issue 3857
- kubernetes/enhancements PR 3858
Note: this change does not affect non-CRI clients such as ctr, nerdctl, and Docker/Moby.
RRO mounts have been supported since nerdctl v0.14 (containerd/nerdctl PR 511)
and Docker v25 (moby/moby PR 45278).
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>