These host paths have a well known location under /tmp/hostpath_pv
and are therefore safe to be labeled with the shared SELinux label.
Without this label, the mounted volumes cannot be accessed by the
container processes.
Signed-off-by: Mrunal Patel <mpatel@redhat.com>
After the userns PR got merged:
https://github.com/kubernetes/kubernetes/pull/111090
gnufied decided it might be safer if we feature gate this part of the
code, due to the kubelet volume host type assertion.
That is a great catch and this patch just moves the code inside the
feature gate if.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
github.com/opencontainers/selinux/go-selinux needs OS that supports SELinux
and SELinux enabled in it to return useful data, therefore add an interface
in front of it, so we can mock its behavior in unit tests.
In theory the check is not necessary, but for sake of robustness and
completenes, let's check SELinuxMountReadWriteOncePod feature gate before
assuming anything about SELinux labels.
Add a new call to VolumePlugin interface and change all its
implementations.
Kubelet's VolumeManager will be interested whether a volume supports
mounting with -o conext=XYZ or not to hanle SetUp() / MountDevice()
accordingly.
Let volume plugins decide if they want to mount volumes with "-o
context=XYZ" or let the container runtime relabel the volume on container
startup.
Using NewMounter, as it's the call where a volume plugin gets the other MountOptions.
This commit only changes the UID/GID if user namespaces is enabled. When
it is enabled, it changes it so the hostUID and hostGID that are mapped
to the currently used UID/GID. This is needed so volumes are created
with the hostUID/hostGID and the user inside the container can read
them.
If user namespaces are disabled for this pod, this is a no-op: there is
no user namespace mapping, so the hostUID/hostGID are the same as inside
the container.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
In future commits we will need this to set the user/group of supported
volumes of KEP 127 - Phase 1.
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Flocker storage plugin removed from k8s codebase.
Flocker, an early external storage plugin in k8s,
has not been in maintenance and their business is
down. As far as I know, the plugin is not being
used anymore.
This PR removes the whole flocker dependency and
codebase from core k8s to reduce potential security
risks and reduce maintenance work from the sig-storage community.
Currently, there are some unit tests that are failing on Windows due to
various reasons:
- volume mounting is a bit different on Windows: Mount will create the
parent dirs and mklink at the volume path later (otherwise mklink will
raise an error).
- os.Chmod is not working as intended on Windows.
- path.Dir() will always return "." on Windows, and filepath.Dir()
should be used instead (which works correctly).
- on Windows, you can't typically run binaries without extensions. If
the file C:\\foo.bat exists, we can still run C:\\foo because Windows
will append one of the supported file extensions ($env:PATHEXT) to it
and run it.
- Windows file permissions do not work the same way as the Linux ones.
- /tmp directory being used, which might not exist on Windows. Instead,
the OS-specific Temp directory should be used.
Fixes a few other issues:
- rbd.go: Return error in a case in which an error is encountered. This
will prevent "rbd: failed to setup" and "rbd: successfully setup" log
messages to be logged at the same time.
GlusterFS is one of the first dynamic provisioner which made into
Kubernetes release v1.4.
https://github.com/kubernetes/kubernetes/pull/30888
When CSI plugins/drivers to start appear, glusterfs' CSI driver
came into existence, however this project is not maintianed at
present and the last release happened few years back.
https://github.com/gluster/gluster-csi-driver/releases/tag/v0.0.9
The possibilities of migration to compatible CSI driver was also
discussed https://github.com/kubernetes/kubernetes/issues/100897
and consensus was to start the deprecation in v1.25.
This commit start the deprecation process of glusterfs plugin from
in-tree drivers.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
For example, we have two filesystems, one is embedded into another:
/a/test # first filesystem with a directory "/a/test/b2"
/a/test/b2 # not auto mounted yet second filesystem, notice "/a/test/b2" is
# a new directory on this filesystem after this filesystem is mounted
For subpath mount "/a/test/b2", `openat("/a/test", "b2")` gets directory "b2" on the first
filesystem, then "mount -c" will use this wrong directory as source directory.
`fstatat("/a/test", "b2/")` forces triggering auto mount of second filesystem, so
`openat("/a/test", "b2")` gets correct source directory for "mount -c".
This fixes issue https://github.com/kubernetes/kubernetes/issues/110818#issuecomment-1175736550
References:
1. https://man7.org/linux/man-pages/man2/openat.2.html
If pathname refers to an automount point that has not yet
been triggered, so no other filesystem is mounted on it,
then the call returns a file descriptor referring to the
automount directory without triggering a mount.
2. https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html
name_to_handle_at() does not trigger a mount when the final
component of the pathname is an automount point. When a
filesystem supports both file handles and automount points, a
name_to_handle_at() call on an automount point will return with
error EOVERFLOW without having increased handle_bytes. This can
happen since Linux 4.13 with NFS when accessing a directory which
is on a separate filesystem on the server. In this case, the
automount can be triggered by adding a "/" to the end of the
pathname.
Before:
findDisk()
fcPathExp := "^(pci-.*-fc|fc)-0x" + wwn + "-lun-" + lun
After:
findDisk()
fcPathExp := "^(pci-.*-fc|fc)-0x" + wwn + "-lun-" + lun + "$"
fc path may have the same wwns but different luns.for example:
pci-0000:41:00.0-fc-0x500a0981891b8dc5-lun-1
pci-0000:41:00.0-fc-0x500a0981891b8dc5-lun-12
Function findDisk() may mismatch the fc path, return the wrong device and wrong associated devicemapper parent.
This may cause a disater that pods attach wrong disks. Accutally it happended in my testing environment before.
Addresses in the Kubernetes API objects (PV, Pod) have `[]` around IPv6
addresses, while addresses in /dev/ and /sys/ have addresses without them.
Add/remove `[]` as needed.
This resolves a couple of issues for CSI volume reconstruction.
1. IsLikelyNotMountPoint is known not to work for bind mounts and was
causing problems for subpaths and hostpath volumes.
2. Inline volumes were failing reconstruction due to calling
GetVolumeName, which only works when there is a PV spec.
v1.43.0 marked grpc.WithInsecure() deprecated so this commit moves to use
what is the recommended replacement:
grpc.WithTransportCredentials(insecure.NewCredentials())
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
On IPv6 clusters, one of the most frequent problems I encounter is
assumptions that one can build a URL with a host and port simply by
using Sprintf, like this:
```go
fmt.Sprintf("http://%s:%d/foo", host, port)
```
When `host` is an IPv6 address, this produces an invalid URL as it must
be bracketed, like this:
```
http://[2001:4860:4860::8888]:9443
```
This change fixes the occurences of joining a host and port with the
purpose built `net.JoinHostPort` function.
I encounter this problem often enough that I started to [write a linter
for it](https://github.com/stbenjam/go-sprintf-host-port). I don't
think the linter is quite ready for wide use yet, but I did run it
against the Kube codebase and found these. While the host portion in
some of these changes may always be an FQDN or IPv4 IP today, it's an
easy thing that can break later on.
CSI spec 1.5 enhanced the spec to add optional secrets field to
NodeExpandVolumeRequest. This commit adds NodeExpandSecret to the
CSI PV source and also derive the expansion secret in csiclient to
send it out as part of the nodeexpand request.
Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
Signed-off-by: zhucan <zhucan.k8s@gmail.com>
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations where found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
klog.Infof expects a format string as first parameter and then
expands format specifies inside it. What gets passed here
is the final string that must be logged as-is, therefore
klog.Info has to be used.
Signed-off-by: yuswift <yuswift2018@gmail.com>
This patch aims to simplify decoupling "pkg/scheduler/framework/plugins"
from internal "k8s.io/kubernetes" packages. More described in
issue #89930 and PR #102953.
Some helpers from "k8s.io/kubernetes/pkg/controller/volume/persistentvolume"
package moved to "k8s.io/component-helpers/storage/volume" package:
- IsDelayBindingMode
- GetBindVolumeToClaim
- IsVolumeBoundToClaim
- FindMatchingVolume
- CheckVolumeModeMismatches
- CheckAccessModes
- GetVolumeNodeAffinity
Also "CheckNodeAffinity" from "k8s.io/kubernetes/pkg/volume/util"
package moved to "k8s.io/component-helpers/storage/volume" package
to prevent diamond dependency conflict.
Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>
If we time out waiting for volume to be attached the message given
to user is not informative enough:
"Attach timeout for volume vol-123"
It would be better if we provide more information on what's going on
and even include name of the driver that's causing the problem, e.g.:
"timed out waiting for external-attacher of ebs.csi.aws.com CSI driver to attach volume vol-123"
The field in fact says that the container runtime should relabel a volume
when running a container with it, it does not say that the volume supports
SELinux. For example, NFS can support SELinux, but we don't want NFS
volumes relabeled, because they can be shared among several Pods.
The package says:
> the libcontainer SELinux package is only built for Linux, so it is
> necessary to have a NOP wrapper which is built for non-Linux platforms
This is not true, Kubernetes now imports
github.com/opencontainers/selinux/go-selinux and it has proper
multiplatform support (i.e. NOOP on non-Linux platforms).
Removing the whole package and calling go-selinux directly.
Currently, the Kubelet panics when it's started with the
`enable-controller-attach-detach=false` option set and a
volume is attached to a node.
This patch adds a check that prevents the Kubelet from trying to attach
or detach CSI volumes and report a proper error warning the user that this
use case is not supported.
If unmount device succeeds but somehow unmount operation
fails because device was in-use elsewhere, we should mark the
device mount as uncertain because we can't use the global
mount point at this point.
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
if klog.V(5).Enabled() {
klog.Info("hello world")
}
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.