This change does a couple things to remove some cruft/unused functionality
in the Windows snapshotter, as well as add a way to specify the rootfs
size in bytes for a Windows container via a new field added in the CRI api in
k8s 1.24. Setting the rootfs/scratch volume size was assumed to be working
prior to this but turns out not to be the case.
Previously I'd added a change to pass any annotations in the containerd
snapshot form (containerd.io/snapshot/*) as labels for the container's
rootfs snapshot. This was added as a means for a client to be able to provide
containerd.io/snapshot/io.microsoft.container.storage.rootfs.size-gb as an
annotation and have that be translated to a label and ultimately set the
size for the scratch volume in Windows. However, this actually only worked if
interfacing with the CRI API directly (crictl), as Kubernetes itself
rejects annotation keys that split into more than two parts on "/",
which the snapshot labels do (containerd.io / snapshot / foobarbaz).
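To make the failure mode concrete, here is a minimal sketch of the shape check
described above. It is not the Kubernetes validation code; `tooManyParts` is a
hypothetical helper that only counts the "/"-separated parts of a key.

```go
package main

import (
	"fmt"
	"strings"
)

// tooManyParts reports whether key splits into more than two parts on "/".
// Hypothetical helper for illustration only; the real Kubernetes validator
// performs additional checks that are not modeled here.
func tooManyParts(key string) bool {
	return len(strings.Split(key, "/")) > 2
}

func main() {
	keys := []string{
		"example.com/foo",
		"containerd.io/snapshot/io.microsoft.container.storage.rootfs.size-gb",
	}
	for _, key := range keys {
		fmt.Printf("%s rejected=%v\n", key, tooManyParts(key))
	}
}
```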
With this in mind, passing the annotations and filtering to
containerd.io/snapshot/* is moot, so I've removed this code in favor of
a new `snapshotterOpts()` function that will return platform specific
snapshotter options if ones exist. Now on Windows we can just check if
RootfsSizeInBytes is set on the WindowsContainerResources struct and
then return a snapshotter option that sets the right label.
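As a rough sketch of the Windows path described above, assuming the size is
carried as a decimal string in the new label: `windowsResources` and
`snapshotLabels` are hypothetical stand-ins, not the actual CRI types or the
real `snapshotterOpts()` signature. Only the label name is taken from this
change.

```go
package main

import (
	"fmt"
	"strconv"
)

// Label name as given in this commit message.
const rootfsSizeInBytesLabel = "containerd.io/snapshot/windows/rootfs.sizebytes"

// windowsResources is a stand-in for the CRI WindowsContainerResources
// struct; only the field relevant here is modeled (assumption).
type windowsResources struct {
	RootfsSizeInBytes int64
}

// snapshotLabels returns the labels to attach to the rootfs snapshot, or nil
// when no size was requested. Hypothetical helper for illustration.
func snapshotLabels(res *windowsResources) map[string]string {
	if res == nil || res.RootfsSizeInBytes <= 0 {
		return nil
	}
	return map[string]string{
		rootfsSizeInBytesLabel: strconv.FormatInt(res.RootfsSizeInBytes, 10),
	}
}

func main() {
	// Request a 25 GiB scratch volume.
	labels := snapshotLabels(&windowsResources{RootfsSizeInBytes: 25 << 30})
	fmt.Println(labels)
}
```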
So all in all this change:
- Gets rid of code to pass CRI annotations as labels down to snapshotters.
- Gets rid of the functionality to create a 1GB sized scratch disk if
the client provided a size < 20GB. This code is not used currently and
has a few logical shortcomings as it won't be able to create the disk
if a container is already running and using the same base layer. The
underlying problem is that WCIFS (the driver that handles the unioning of
Windows container layers) holds open handles to some files that we need to
delete to create the 1GB scratch disk.
- Deprecates the containerd.io/snapshot/io.microsoft.container.storage.rootfs.size-gb
label in favor of a new containerd.io/snapshot/windows/rootfs.sizebytes label.
The previous label/annotation wasn't being used by us, and from a cursory
github search wasn't being used by anyone else either. Now that there is a CRI
field to specify the size, this should just be a field that users can set
on their pod specs without needing to concern themselves with what it
eventually gets translated to, but non-CRI clients can still use the new
label or the deprecated label as usual.
- Adds a test to the CRI integration suite to validate expanding the rootfs size.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
HostProcess containers require every container in the pod to be a
host process container and have the corresponding field set. The Kubelet
usually enforces this, so we'd error before even getting here, but we recently
found a bug in this logic, so it is better to be safe than sorry.
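A minimal sketch of such a defensive check, under the assumption that each
container exposes a boolean HostProcess setting. The `container` type and
`validateHostProcess` function are hypothetical and do not mirror the real
CRI config types.

```go
package main

import "fmt"

// container is a simplified, hypothetical view of a container config.
type container struct {
	Name        string
	HostProcess bool
}

// validateHostProcess returns an error if the pod is a HostProcess pod but
// any of its containers does not have the HostProcess field set.
func validateHostProcess(podHostProcess bool, ctrs []container) error {
	if !podHostProcess {
		return nil
	}
	for _, c := range ctrs {
		if !c.HostProcess {
			return fmt.Errorf("container %q must have HostProcess set in a HostProcess pod", c.Name)
		}
	}
	return nil
}

func main() {
	err := validateHostProcess(true, []container{
		{Name: "agent", HostProcess: true},
		{Name: "sidecar", HostProcess: false},
	})
	fmt.Println(err)
}
```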
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
This test takes advantage of the fact that when you tell Windows to
mount the GUID_DEVINTERFACE_DISPLAY_ADAPTER class, it will also mount
the host's device store into the container, even if there is no real GPU
on the host.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
Adds an equivalent TestSandboxRemoveWithoutIPLeakage for Windows, in which
we assert that the IPs are not leaked when a Pod's HNS namespace disappears
and the Pod is deleted afterwards.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
This offers a more reliable way of killing a process. The /IM flag
allows us to specify the "image name" of the process we're killing.
This means we can use wildcards and force kill a process along with all the
child processes it may have spawned.
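For illustration, the argument list for such a call might be assembled as
below. `/IM`, `/T` (terminate the process tree) and `/F` (force) are standard
`taskkill` flags; the `taskkillArgs` helper and the image name used in `main`
are hypothetical and not taken from this change.

```go
package main

import (
	"fmt"
	"strings"
)

// taskkillArgs builds the arguments to force kill every process matching
// imageName together with its child processes. Hypothetical helper.
func taskkillArgs(imageName string) []string {
	return []string{"/IM", imageName, "/T", "/F"}
}

func main() {
	// The image name is an example only. The resulting slice could be passed
	// to exec.Command("taskkill", args...) on a Windows host.
	args := taskkillArgs("containerd.exe")
	fmt.Println("taskkill", strings.Join(args, " "))
}
```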
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Windows HostProcess containers can run containerized workloads on a Windows host.
These containers operate as normal processes but have access to the host network
namespace, storage, and devices when given the appropriate user privileges.
HostProcess containers support the ability to run as one of the following Windows
service accounts: LocalSystem, LocalService, NetworkService.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
The test sets the container's Linux.SecurityContext.NamespaceOptions.Pid = NamespaceMode_CONTAINER,
which will ensure that the container keeps running even if the sandbox container dies. We do
not have that option on Windows.
Adds additional logging in the test, so it is easier to figure out which assertion failed.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
Go 1.15.7 contained a security fix for CVE-2021-3115, which allowed arbitrary
code to be executed at build time when using cgo on Windows. This issue also
affects Unix users who have “.” listed explicitly in their PATH and are running
“go get” outside of a module or with module mode disabled.
This issue is not limited to the go command itself, and can also affect binaries
that use `exec.Command`, `exec.LookPath`, etc.
From the related blogpost (https://blog.golang.org/path-security):
> Are your own programs affected?
>
> If you use exec.LookPath or exec.Command in your own programs, you only need to
> be concerned if you (or your users) run your program in a directory with untrusted
> contents. If so, then a subprocess could be started using an executable from dot
> instead of from a system directory. (Again, using an executable from dot happens
> always on Windows and only with uncommon PATH settings on Unix.)
>
> If you are concerned, then we’ve published the more restricted variant of os/exec
> as golang.org/x/sys/execabs. You can use it in your program by simply replacing
This patch replaces all uses of `os/exec` with `golang.org/x/sys/execabs`. While
some uses of `os/exec` should not be problematic (e.g. part of tests), it is
probably good to be consistent, in case code gets moved around.
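The idea behind the stricter lookup can be sketched with the standard library
alone. This is a simplified illustration loosely modeled on the behaviour
described in the blogpost, not the `execabs` implementation; `lookPathAbs` is
a hypothetical name.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// lookPathAbs resolves file via PATH but refuses a result that is relative
// to the current directory when file is a bare command name.
// Simplified sketch; not the golang.org/x/sys/execabs code.
func lookPathAbs(file string) (string, error) {
	path, err := exec.LookPath(file)
	if err != nil {
		return "", err
	}
	if filepath.Base(file) == file && !filepath.IsAbs(path) {
		return "", fmt.Errorf("%s resolves to %q, which is relative to the current directory", file, path)
	}
	return path, nil
}

func main() {
	path, err := lookPathAbs("go")
	if err != nil {
		fmt.Println("lookup refused or failed:", err)
		return
	}
	fmt.Println("resolved to:", path)
}
```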
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Most of the tests create and use Pod Sandboxes in the same way. We can create
a common function that will do that for us, so we can have less duplicated code,
and make it easier to add new tests in the future.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
There was a known issue regarding how the symlink files mounted as
volumes were being handled on Windows. This commit adds tests that
will check against that issue to ensure there won't be any
regressions.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
Currently, the cri-integration tests do not work on Windows for various reasons.
One reason is that all the tests use Linux-specific images.
Previous commits refactored the image pulling / usage in the cri-integration tests,
making it easier to update, and easier to configure a custom registry to pull images
with Windows support.
For Windows runs, custom registries can be created, which will also contain Windows
images, and the cri-integration tests can be configured to use those registries by
specifying the "--repo-list" argument, a YAML file which will contain an alternative
mapping of the default registries. This is similar to how E2E tests are handled for
Windows runs in Kubernetes.
Some of the tests are skipped, as they do not pass yet on Windows.
Windows does not collect inodes used stats, thus, the tests that were expecting non-zero
inodes stats were failing.
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>
Currently, the cri-integration tests do not work on Windows for various reasons.
One reason is that all the tests use Linux-specific images. This
commit refactors the image pulling / usage in the cri-integration tests, making it
easier to update, and easier to configure a custom registry to pull those images
from.
For Windows runs, custom registries can be created, which will also contain Windows
images, and the cri-integration tests can be configured to use those registries by
specifying the "--image-list" argument, a TOML file which will contain an alternative
mapping of the default images.
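Conceptually, the alternative mapping overlays the defaults. The sketch below
shows that overlay with plain maps; TOML parsing is omitted, and the
`resolveImages` helper, the keys, and the image references are all
hypothetical rather than taken from the test suite. Ignoring unknown keys is
a design choice of this sketch, not a documented behaviour.

```go
package main

import "fmt"

// resolveImages returns a copy of defaults in which every key that also
// appears in overrides is replaced by the override value. Keys that are not
// present in defaults are ignored. Hypothetical helper for illustration.
func resolveImages(defaults, overrides map[string]string) map[string]string {
	out := make(map[string]string, len(defaults))
	for name, ref := range defaults {
		out[name] = ref
	}
	for name, ref := range overrides {
		if _, known := defaults[name]; known {
			out[name] = ref
		}
	}
	return out
}

func main() {
	// Example values only.
	defaults := map[string]string{
		"BusyBox": "docker.io/library/busybox:latest",
		"Pause":   "registry.example.com/pause:latest",
	}
	overrides := map[string]string{
		"BusyBox": "registry.example.com/windows/busybox:latest",
	}
	images := resolveImages(defaults, overrides)
	fmt.Println(images["BusyBox"])
	fmt.Println(images["Pause"])
}
```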
Signed-off-by: Claudiu Belu <cbelu@cloudbasesolutions.com>