OCI hooks aren't implemented on Windows. The test will, and has been,
actuallyrunning fine on Windows because the Github runners seem to have
a 'ps' binary in the users PATH, but there's not any actual hook
functionality being tested as any of the OCI fields are ignored for
Windows containers.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
The CRI-plugin will setup watcher for each container after
StartContainer or RunPodSandbox. It will cleanup task(container/sandbox)
if received the exit event from watcher.
The original test design is to `Delete` sandbox container to get
NOT_READY state and expect to receive NotFound error. It depends on that
CRI-plugin cleanups container after `Delete` API. If not, the shim will
be cleanup and test code will receive `ttrpc: closed: unknown` or other
unknown error. It is flaky.
In this patch, the test will only send the kill signal and wait for the
exit event. When sandbox exits, the state will and must be NOT_READY.
```plain
// test fail log
=== RUN TestContainerdRestart
restart_test.go:92: Make sure no sandbox is running before test
restart_test.go:97: Start test sandboxes and containers
common.go:115: Image "k8s.gcr.io/pause:3.6" already exists, not pulling.
common.go:115: Image "k8s.gcr.io/pause:3.6" already exists, not pulling.
restart_test.go:139:
Error Trace: restart_test.go:139
Error: Should be true
Test: TestContainerdRestart
Messages: delete should return not found error but returned failed to delete task: ttrpc: closed: unknown
--- FAIL: TestContainerdRestart (4.25s)
// containerd log
&TaskExit{ContainerID:4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184,ID:4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184,Pid:28430,ExitStatus:137,ExitedAt:2021-12-12 07:56:01.400753012 +0000 UTC,XXX_unrecognized:[],}"
time="2021-12-12T07:56:01.401120516Z" level=debug msg="event forwarded" ns=k8s.io topic=/tasks/exit type=containerd.events.TaskExit
time="2021-12-12T07:56:01.418934208Z" level=debug msg="event forwarded" ns=k8s.io topic=/tasks/delete type=containerd.events.TaskDelete
time="2021-12-12T07:56:01.419192910Z" level=info msg="shim disconnected" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184
time="2021-12-12T07:56:01.419235911Z" level=warning msg="cleaning up after shim disconnected" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184 namespace=k8s.io
time="2021-12-12T07:56:01.419247711Z" level=info msg="cleaning up dead shim"
time="2021-12-12T07:56:01.419235311Z" level=error msg="failed sending message on channel" error="write unix /run/containerd/s/18afde7fcde70236eb31b9f43f3bd92af1dc1186583c501aa1396255f87f95d4->@: write: broken pipe"
time="2021-12-12T07:56:01.419354712Z" level=debug msg="failed to delete task" error="ttrpc: closed" id=4b4c1d1d303c14a2cc759631d163f153ba8536e9ea6821744a509e4a17346184
```
CI Link: `https://pipelines.actions.githubusercontent.com/G4SighzWVVZ6vsyiz7FFMFjLjRzveJHseEnVyibkSq87Cl2x4O/_apis/pipelines/1/runs/9501/signedlogcontent/76?urlExpires=2021-12-12T08%3A42%3A08.0765750Z&urlSigningMethod=HMACV1&urlSignature=pH93isMSFdZUo1ndnZynJpZbPGrEyvt12MO03fgUU7I%3D`
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Skip this test until this error can be evaluated and the appropriate
test fix or environment configuration can be determined.
Signed-off-by: Derek McGowan <derek@mcg.dev>
The OCI image spec did a v1.0.2 security release for CVE-2021-41190, however
commit 09c9270fee, depends on MediaTypes that
have not yet been released by the OCI image-spec, so using current "main" instead.
full diff: 5ad6f50d62...693428a734
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change enables the TestVolumeOwnership on Windows. The test
assumes that the volume-ownership image is built on Windows, thus
ensuring that Windows file security info (ACLs and ownership info)
are attached to the C:\volumes\test_dir path.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
I noticed we were using some different versions of the same test
images, so changing them to be the same (can help with find/replace
if we need to update them).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It seems that the default ACLs inherited from the parent folder
on Windows Server 2022, does not include "CREATOR OWNER" as it
does on Windows Server 2019. This sets explicit ACLs on test
files.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
GA for ws2022 github actions VMs launched a couple weeks ago so seems like
it's time to try out the CI on this new SKU.
This involved adding new ws2022 runs for the OS matrices in the CI, fixing up
a test in the platforms package and adding a mapping for the ws2022 container image in
integration/client.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
With the ghcr images now built and working, switch over to
use these new images and update the default name.
Signed-off-by: Derek McGowan <derek@mcg.dev>
This change skips the TestExportAndImportMultiLayer in integration/client
for the time being. It seems the image was updated recently and no longer
has a Windows entry in the manifest so the test will always fail. This should
be reverted when we figure out what happened to the image, but this is to
unblock PRs for the time being.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
* Adds Windows dockerfile for volume-ownership image
* Build volume-copy-up on Windows
* Adds a helper tool that fetches the owner username and SID of
a file or folder
* Adds README
* Remove 2004 from Windows versions
* Add ltsc2022 to Windows versions
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This tag contains some changes for the Windows shim for retrying
stdio named pipe connections if containerd restarts. It also is built with v1.1.0 of
ttrpc which has some fixes for a deadlock we'd observed on Windows.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
This tag contains a fix for a deadlock observed when there are multiple
simultaneous requests from the same client connection.
Signed-off-by: Daniel Canter <dcanter@microsoft.com>
Use the time for the last non-running status to determine
whether the restart did not occur as expected. The
current timestamp only accounts for when the running
status was seen, however, the restart would have always
occurred in between the previous check and latest check.
Therefore, it makes more sense to use the previous check
to determine whether a failure was seen from the restart
monitor not restarting as expected.
Signed-off-by: Derek McGowan <derek@mcg.dev>
* Bump k8s.io/cri-api to latest version - v0.23.0-alpha.4
* Vendor github.com/vishvananda/netlink for network stats
Signed-off-by: David Porter <porterdavid@google.com>
restart monitor test was failing due to occasionally taking past the
deadline on windows tests. Add a small additional grace period to
deflake the test.
Signed-off-by: David Porter <porterdavid@google.com>