21522 Commits

Author SHA1 Message Date
Antonio Ojea
2b822161f0 agnhost: fix sigterm shutdown 2022-05-25 12:50:23 +02:00
Lukasz Szaszkiewicz
c4e337c57c hardens TestAggregatedAPIServer
Since ClientCAs are provided by the "client-ca::kube-system::extension-apiserver-authentication::client-ca-file" controller,
we need to wait until it picks up the configmap (via a lister) before checking the CAs; otherwise the response might contain an empty result.
2022-05-25 12:41:26 +02:00
Kubernetes Prow Robot
e1d92980e3 Merge pull request #107419 from sanposhiho/non-need-e2e-queue-move
Delete non-need `AddUnschedulableIfNotPresent` calling in `TestCoreResourceEnqueue`
2022-05-24 17:06:43 -07:00
Tim Allclair
702ab97722 Run common pod E2Es as restricted 2022-05-24 16:10:16 -07:00
Tim Allclair
ccc69b1e9a Add MixinRestrictedPodSecurity e2e util 2022-05-24 16:10:16 -07:00
Kubernetes Prow Robot
f8c77fda0c Merge pull request #110176 from deads2k/try-new-image
update to new level of agnhost
2022-05-24 14:27:25 -07:00
Hemant Kumar
50f1e16e4d Log new size and old sizes 2022-05-24 14:57:29 -04:00
Kubernetes Prow Robot
1ad8613d5c Merge pull request #109790 from p0lyn0mial/users-watchtools-informerwatcher
users of watchtools.NewIndexerInformerWatcher should wait for the informer to sync
2022-05-24 08:52:05 -07:00
David Eads
c1a891c661 update to the latest agnhost image 2022-05-24 11:12:37 -04:00
Kubernetes Prow Robot
78a4ba6af8 Merge pull request #110174 from deads2k/readyz-agnhost
add readyz handling to netexec
2022-05-24 07:26:18 -07:00
Kubernetes Prow Robot
c3d550d4e7 Merge pull request #110101 from MikeSpreitzer/rename-observers
Give apf metrics abstractions more familiar names
2022-05-24 07:26:06 -07:00
Lukasz Szaszkiewicz
59a5c1a6ea hardens integration job tests
The job controller used by the tests must wait for the caches to sync.
Since the tests don't check /readyz, they have no way
to tell that it is safe to call the server and that requests won't be rejected.
2022-05-24 13:47:38 +02:00
Lukasz Szaszkiewicz
4a7845b485 users of watchtools.NewIndexerInformerWatcher should wait for the informer to sync
Previously, users of this method relied on the fact that a call to LIST was made.
Instead, users should use the dedicated `HasSynced` method.
2022-05-24 13:05:40 +02:00
Hemant Kumar
fb9db79d3f Enable volume expansion tests for generic ephemeral volumes 2022-05-23 21:58:34 -04:00
Kubernetes Prow Robot
fdb2d54475 Merge pull request #108210 from jlsong01/update_kubectl_warning
coordinate the kubectl warning style
2022-05-23 15:57:09 -07:00
Mike Spreitzer
7d64a93a14 Give apf metrics abstractions more familiar names
The logic is similar to Prometheus gauges and vectors,
so adopt that terminology.
2022-05-23 16:09:43 -04:00
David Eads
566394467e add readyz handling to netexec 2022-05-23 14:26:09 -04:00
Kubernetes Prow Robot
1131fb95fc Merge pull request #110125 from wojtek-t/fix_resource_quota_shutdown
Fix resource quota shutdown
2022-05-23 07:18:03 -07:00
jlsong01
272e245f06 add a warning printer in cli-runtime to coordinate warning style
modified:   staging/src/k8s.io/kubectl/pkg/cmd/auth/auth.go
2022-05-23 19:10:15 +08:00
Kubernetes Prow Robot
e9f1c9cc7c Merge pull request #110138 from wojtek-t/fix_leaking_goroutines_in_kubelet_test
Fix leaking goroutines in kubelet integration test
2022-05-23 04:06:01 -07:00
Wojciech Tyczyński
f8211d7e44 Fix ResourceQuota admission shutdown 2022-05-23 12:34:50 +02:00
Wojciech Tyczyński
0d41d2921e Fix leaking goroutines in kubelet integration test 2022-05-23 11:50:29 +02:00
Wantong Jiang
93692ef57d enhance assertions in test/e2e/common/node 2022-05-23 00:39:09 +00:00
sanposhiho
bbd5f19497 Delete non-need AddUnschedulableIfNotPresent in e2e 2022-05-22 06:54:49 +00:00
Kubernetes Prow Robot
4ab90ccebb Merge pull request #109719 from stlaz/e2e_nodeauthn_nosasecret
auth e2e: node_authn test: don't expect a SA secret
2022-05-19 15:13:53 -07:00
Kubernetes Prow Robot
a608fba48c Merge pull request #110055 from brianpursley/vol-limit-flake
Increase csiNodeInfoTimeout from 1 minute to 2 minutes
2022-05-19 09:21:20 -07:00
Kubernetes Prow Robot
a1c8e9386a Merge pull request #110090 from wojtek-t/shutdown_broadcaster_in_controllers
Fix event broadcaster shutdown in multiple controllers
2022-05-18 03:38:53 -07:00
Kubernetes Prow Robot
84c8afeba3 Merge pull request #110095 from neolit123/1.25-update-master-label-taint
kubeadm: cleanup the "master" taint on CP nodes during upgrade
2022-05-18 00:52:54 -07:00
Kevin Delgado
91c016e4d5 Add unknown metadata field validation tests (#109316)
* add unknown metadata validation e2e tests

* Address PR Feedback

* explicitly check for unexpected nil errors or namespace errors
2022-05-17 15:04:30 -07:00
Wojciech Tyczyński
11b679c66a Fix event broadcaster shutdown in multiple controllers 2022-05-17 22:14:19 +02:00
Lubomir I. Ivanov
ddd046f3dd kubeadm: cleanup the "master" taint on CP nodes during upgrade
- initconfiguration.go: stop applying the "master" taint
for new clusters; update related unit tests in _test.go
- apply.go: Remove logic related to cleanup of the "master" label
during upgrade
- apply.go: Add cleanup of the "master" taint on CP nodes
during upgrade
- controlplane_nodes_test.go: remove test for old "master" taint
on nodes (this needs backport to 1.24, because we have a kubeadm
1.25 vs kubernetes test suite 1.24 e2e test)
2022-05-17 19:21:49 +03:00
Francesco Romani
f3e157d168 e2e: node: re-enable the device plugin tests
Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:05:13 +02:00
Francesco Romani
48b5af49e0 e2e: node: reorder imports
trivial cleanup

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:04:01 +02:00
Francesco Romani
98eb6db7c0 e2e: node: fix plugins directory
Previously, the e2e test was overriding the plugins socket directory to
"/var/lib/kubelet/plugins_registry". This seems wrong, and with that
setting the e2e test was already failing, because the registration
process was timing out, in turn because the kubelet was trying to call
back the device plugin in the wrong place (see below for details).

I can't explain why it worked before - or if it worked at all - but
it really seems that `pluginapi.DevicePluginPath` is the right
setting here.

+++

In a nutshell, the device plugin registration process works like this:

1. The kubelet runs and creates the device plugin socket registration
   endpoint:
	KubeletSocket = DevicePluginPath + "kubelet.sock"
	DevicePluginPath = "/var/lib/kubelet/device-plugins/"
2. Each device plugin will listen on an ENDPOINT the kubelet will connect
   back to.  IOW the kubelet will act like a client to each device plugin,
   to perform allocation requests (and more).
   Each device plugin will serve from an endpoint.
   The endpoint name is plugin-specific, but they all must be inside a
   well-known directory: pluginapi.DevicePluginPath
3. The kubelet creates the device plugin pod, like any other pod
4. During the startup, each device plugin wants to register itself in the
   kubelet. So it sends a request through
   the registration endpoint. Key details:
	grpc.Dial(kubelet registration socket)
	registration request
	reqt := &pluginapi.RegisterRequest{
		Version:      pluginapi.Version,
		Endpoint:     endpointSocket,	<- socket relative to pluginapi.DevicePluginPath
		ResourceName: resourceName, 	<- resource name to be exposed
	}
5. While handling the registration request, the kubelet dials back the
   device plugin on socketDir + req.Endpoint.
   But socketDir is hardcoded in the device manager code to
   pluginapi.KubeletSocket

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:03:50 +02:00
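The path arithmetic behind the registration flow in commit 98eb6db7c0 can be sketched as follows. This is a standalone illustration, not kubelet code: the two constants are copied from the commit message, `registerRequest` is a hypothetical local stand-in for `pluginapi.RegisterRequest`, and `dialBackPath` is a made-up helper. It assumes that socketDir is the directory part of the kubelet registration socket path, and the version and resource name in `main` are placeholders.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Values as quoted in the commit message.
const (
	devicePluginPath = "/var/lib/kubelet/device-plugins/"
	kubeletSocket    = devicePluginPath + "kubelet.sock"
)

// registerRequest is a hypothetical stand-in carrying the fields
// shown in the commit message.
type registerRequest struct {
	Version      string
	Endpoint     string // socket name relative to the device plugin directory
	ResourceName string
}

// dialBackPath (hypothetical helper) computes where the kubelet would
// connect back to the plugin: socketDir + req.Endpoint.
// Assumption: socketDir is the directory containing the kubelet socket.
func dialBackPath(req registerRequest) string {
	socketDir, _ := filepath.Split(kubeletSocket)
	return filepath.Join(socketDir, req.Endpoint)
}

func main() {
	req := registerRequest{
		Version:      "v1beta1",              // placeholder value
		Endpoint:     "sample.sock",          // placeholder socket name
		ResourceName: "example.com/resource", // placeholder resource name
	}
	fmt.Println(dialBackPath(req))
}
```

Under these assumptions, a plugin that serves its socket from any other directory (for example a plugins registry directory) would never be reached by the dial-back, which matches the registration timeout described in the commit.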
Francesco Romani
23147ff4b3 e2e: node: devplugin: tolerate node readiness flip
In the AfterEach check of the e2e node device plugin tests,
the tests really want to clean up after themselves:
- delete the sample device plugin
- restart the kubelet again
- ensure that after the restart, no stale sample devices
  (provided by the sample device plugin) are reported anymore.

We observed that in the AfterEach block of these e2e tests
the kubelet readiness state flip/flops quite reliably,
possibly related to a race with a slow runtime/PLEG check.

What happens is that the kubelet readiness state is true,
but goes false for a quick interval and then goes true again
and it's pretty stable after that (observed by adding more logs
to the check loop).

The key factor here is that the function `getLocalNode` aborts the
test (as in `framework.ExpectNoError`) if the node state is
not ready. So any occurrence of this scenario, even if it
is transient, will cause a test failure. I believe this
makes the e2e test unnecessarily fragile without making it more
correct.

For the purpose of the test we can tolerate this kind of glitch,
with the kubelet flip/flopping the ready state, provided that we
eventually meet the final desired condition in which the node reports
ready AND reports no sample devices present - which was the condition
the code was trying to check.

So, we add a variant of `getLocalNode`, which just fetches the
node object the e2e_node framework created, alongside a flag
reporting the node readiness. The new helper does not
implicitly abort the test if the node is not ready; it just bubbles
up this information.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 14:22:25 +02:00
Francesco Romani
56c539bff0 e2e: node: deviceplug: deepcopy the pod dev template
Let's avoid unexpected side effects

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 14:22:24 +02:00
Kubernetes Prow Robot
22a21f974f Merge pull request #110063 from wojtek-t/cleanup_testing_namespaces_in_integration
Simplify Create/Delete-TestingNamespace functions
2022-05-16 01:50:18 -07:00
Wojciech Tyczyński
deef9e40de Simplify Create/Delete-TestingNamespace functions 2022-05-15 23:06:26 +02:00
Brian Pursley
db9d5c1e77 Increase csiNodeInfoTimeout from 1 minute to 2 minutes 2022-05-14 16:29:28 -04:00
cpanato
90871a0b2f Update Go to 1.18.2
Signed-off-by: cpanato <ctadeu@gmail.com>
2022-05-14 00:57:15 +02:00
Kubernetes Prow Robot
bbdcce6a9e Merge pull request #109880 from Jefftree/patch-4
Remove warning log for crd merging
2022-05-13 15:31:54 -07:00
Kubernetes Prow Robot
47bb8c6d0c Merge pull request #108777 from pjo256/recursive-rollout-status
feat(kubectl rollout): support multiple resources for rollout status
2022-05-13 13:15:55 -07:00
Jefftree
fad5353ef8 Integration test for openapi scale & status 2022-05-13 11:45:31 -07:00
Kubernetes Prow Robot
1a6adee3d6 Merge pull request #109753 from matthyx/109577
do not install docker with curl
2022-05-13 07:33:49 -07:00
Matthias Bertschy
d42321dc05 recommend containerd instead of docker, cleanup 2022-05-13 15:25:15 +02:00
Kubernetes Prow Robot
9720d130e4 Merge pull request #110030 from wojtek-t/clean_shutdown_2
Minor cleanups in integration test shutdown
2022-05-13 03:43:48 -07:00
Wojciech Tyczyński
492e7111a0 Minor cleanup of apply tests 2022-05-13 11:37:22 +02:00
Wojciech Tyczyński
89549142c0 Stop leaking apiserver in tracing test 2022-05-13 09:47:24 +02:00
Kubernetes Prow Robot
340ba56567 Merge pull request #109989 from tallclair/image-type
Use typed ImageID for imageutils images
2022-05-13 00:13:48 -07:00
Kubernetes Prow Robot
1be1ec4aa3 Merge pull request #109970 from stevekuznetsov/skuznets/isolate-versioner
storage: move the APIObjectVersioner definition to storage
2022-05-12 12:32:44 -07:00