Implement DoS prevention by wiring a global rate limit for the podresources
API. The goal here is not to introduce a general rate-limiting solution
for the kubelet (we need more research and discussion to get there),
but rather to prevent misuse of the API. A sketch of the wiring follows
the known limitations listed below.
Known limitations:
- the rate limit values (QPS, BurstTokens) are hardcoded to
"high enough" values.
Enabling user configuration would require more discussion
and sweeping changes to the other kubelet endpoints, so it
is postponed for now.
- the rate limiting is global. Malicious clients can starve other
clients by consuming the QPS quota.
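For illustration only, the sketch below shows one way a shared limiter
could be wired in front of a gRPC API such as podresources, using
golang.org/x/time/rate with made-up QPS/burst constants; the actual
kubelet wiring and values differ.

```go
package podresources

import (
	"context"

	"golang.org/x/time/rate"
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Illustrative "high enough" values; the real hardcoded values live in the kubelet.
var limiter = rate.NewLimiter(rate.Limit(100), 10)

// LimitUnary rejects requests once the shared (global) token bucket is empty,
// which is also why a misbehaving client can starve the others.
func LimitUnary(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler) (interface{}, error) {
	if !limiter.Allow() {
		return nil, status.Error(codes.ResourceExhausted, "rate limit exceeded, try again later")
	}
	return handler(ctx, req)
}
```

Such an interceptor would be installed with grpc.UnaryInterceptor(LimitUnary)
when the gRPC server is created.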
Add an e2e test to exercise the flow, because the wiring itself
is mostly boilerplate and API adaptation.
We have quite a few podresources e2e tests and, as the feature
progresses to GA, we should consider moving them to NodeConformance.
Unfortunately most of them require Linux-specific features, not in the
tests themselves but in the test prelude (fixture), to check or create the
node conditions (e.g. presence or absence of devices, online CPUs...) to be
verified in the test proper.
For this reason we promote only a single test for starters.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Since we can't rely on the test runner and the hosts under test being
on the same machine, we write to the termination log from each
container and concatenate the results.
If a CRI error occurs during the terminating phase after a pod is
force-deleted (API or static), then the housekeeping loop will not
deliver updates to the pod worker, which prevents the pod's state
machine from progressing. The pod will remain in the terminating
phase but no further attempts to terminate or clean up will occur
until the kubelet is restarted.
The pod worker now maintains a store of the pods state that it is
attempting to reconcile and uses that to resync unknown pods when
SyncKnownPods() is invoked, so that failures in sync methods for
unknown pods no longer hang forever.
The pod worker's store tracks desired updates and the last update
applied on podSyncStatuses. Each goroutine now synchronizes to
acquire the next work item, context, and whether the pod can start.
This synchronization moves the pending update to the stored last
update, which will ensure third parties accessing pod worker state
don't see updates before the pod worker begins synchronizing them.
As a consequence, the update channel becomes a simple notifier
(struct{}) so that SyncKnownPods can coordinate with the pod worker
to create a synthetic pending update for unknown pods (i.e. no one
besides the pod worker has data about those pods). Otherwise the
pending update info would be hidden inside the channel.
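Purely as an illustration of the notifier pattern described above (the
types and names below are hypothetical, not the actual pod worker code):

```go
package podworker

import (
	"sync"

	"k8s.io/apimachinery/pkg/types"
)

// UpdatePodOptions stands in for the real update payload.
type UpdatePodOptions struct{ /* pod, update type, ... */ }

type podSyncStatus struct {
	pendingUpdate *UpdatePodOptions // desired update, not yet picked up
	activeUpdate  *UpdatePodOptions // last update handed to a sync method
}

type podWorkers struct {
	mu       sync.Mutex
	statuses map[types.UID]*podSyncStatus
	work     map[types.UID]chan struct{} // pure notifier: carries no data
}

// UpdatePod records the desired state under the lock and signals the per-pod
// goroutine; that goroutine later moves pendingUpdate to activeUpdate when it
// acquires the work item, so observers never see an update before the worker
// starts synchronizing it.
func (w *podWorkers) UpdatePod(uid types.UID, opts *UpdatePodOptions) {
	w.mu.Lock()
	status, ok := w.statuses[uid]
	if !ok {
		status = &podSyncStatus{}
		w.statuses[uid] = status
		w.work[uid] = make(chan struct{}, 1)
	}
	status.pendingUpdate = opts
	ch := w.work[uid]
	w.mu.Unlock()

	select {
	case ch <- struct{}{}:
	default: // a notification is already queued
	}
}
```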
In order to properly track pending updates, we have to be very
careful not to mix RunningPods (which are calculated from the
container runtime and are missing all spec info) and config-
sourced pods. Update the pod worker to avoid using ToAPIPod()
and instead require the pod worker to directly use
update.Options.Pod or update.Options.RunningPod for the
correct methods. Add a new SyncTerminatingRuntimePod to prevent
accidental invocations with runtime-only pod data.
Finally, fix SyncKnownPods to replay the last valid update for
undesired pods which drives the pod state machine towards
termination, and alter HandlePodCleanups to:
- terminate runtime pods that aren't known to the pod worker
- launch admitted pods that aren't known to the pod worker
Any started pods receive a replay until they reach the finished
state, and then are removed from the pod worker. When a desired
pod is detected as not being in the worker, the usual cause is
that the pod was deleted and recreated with the same UID (almost
always a static pod since API UID reuse is statistically
unlikely). This simplifies the previous restartable pod support.
We are careful to filter for active pods (those not already
terminal or those which have been previously rejected by
admission). We also force a refresh of the runtime cache to
ensure we don't see an older version of the state.
Future changes will allow other components that need to view the
pod worker's actual state (not the desired state the podManager
represents) to retrieve that info from the pod worker.
Several bugs in pod lifecycle have been undetectable at runtime
because the kubelet does not clearly describe the number of pods
in use. To report this better, add the following metrics (a sample
declaration is sketched after the list):
kubelet_desired_pods: Pods the pod manager sees
kubelet_active_pods: "Admitted" pods that gate new pods
kubelet_mirror_pods: Mirror pods the kubelet is tracking
kubelet_working_pods: Breakdown of pods from the last sync in
each phase, orphaned state, and static or not
kubelet_restarted_pods_total: A counter for pods that saw a
CREATE before the previous pod with the same UID was finished
kubelet_orphaned_runtime_pods_total: A counter for pods detected
at runtime that were not known to the kubelet. Will be
populated at Kubelet startup and should never be incremented
after.
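A sample declaration with k8s.io/component-base/metrics might look
roughly like this; the options and labels here are illustrative, not the
actual kubelet definitions.

```go
package example

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

// desiredPods is a sketch of the kubelet_desired_pods series.
var desiredPods = metrics.NewGaugeVec(
	&metrics.GaugeOpts{
		Subsystem:      "kubelet",
		Name:           "desired_pods",
		Help:           "The number of pods the kubelet is being instructed to run.",
		StabilityLevel: metrics.ALPHA,
	},
	[]string{"static"},
)

func init() {
	legacyregistry.MustRegister(desiredPods)
}
```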
Add a metric check to our e2e tests that verifies the values are
captured correctly during a serial test, and then verify them in
detail in unit tests.
Adds 23 series to the kubelet /metrics endpoint.
Add some additional init container tests that monitor container
lifetime via logs written to a common file. This makes it easier to
write assertions about the container lifetimes with respect to one
another.
6f2cd1b5bd swapped the order of cancel() and
closeFn() so that closeFn got called first when the test was done. This caused
it to block while waiting for goroutines which themselves were waiting for
the context cancellation. The test still shut down; it just took ~86s
instead of ~30s.
The fix is to register the cancel twice: once as soon as the context is
created (to clean up in case of an unexpected panic) and once after
closeFn (because then it'll get called first, as before).
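A minimal sketch of the intended ordering, with a hypothetical
startTestServer standing in for the real integration test setup:

```go
package example

import (
	"context"
	"testing"
)

// startTestServer stands in for the real setup: it spawns a goroutine that
// only exits once the context is cancelled, and returns a closeFn that
// blocks until that goroutine has finished.
func startTestServer(ctx context.Context, t *testing.T) (closeFn func()) {
	done := make(chan struct{})
	go func() { <-ctx.Done(); close(done) }()
	return func() { <-done }
}

func TestShutdownOrdering(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // safety net in case of an unexpected panic during setup

	closeFn := startTestServer(ctx, t)
	defer closeFn()
	defer cancel() // registered last, so it runs first: goroutines stop before closeFn waits
}
```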
The test creates a Service exposing two protocols on the same port
and a backend that replies on both protocols (see the sketch after
the steps below).
1. Test that the Service works for both protocols
2. Update Service to expose only the TCP port
3. Verify that TCP works and UDP does not work
4. Update Service to expose only the UDP port
5. Verify that TCP does not work and UDP does work
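For illustration, the dual-protocol ports of such a Service could look
roughly like this (names and port number are made up):

```go
package example

import corev1 "k8s.io/api/core/v1"

// Two ServicePorts sharing the same port number but using different protocols.
var servicePorts = []corev1.ServicePort{
	{Name: "tcp-port", Protocol: corev1.ProtocolTCP, Port: 80},
	{Name: "udp-port", Protocol: corev1.ProtocolUDP, Port: 80},
}
```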
Change-Id: Ic4f3a6509e332aa5694d20dfc3b223d7063a7871
Test 2 scenarios:
- a pod can connect to a terminating pod
- terminating pod can connect to other pods
Change-Id: Ia5dc4e7370cc055df452bf7cbaddd9901b4d229d
v1.Container is still changing a lot, which caused the test to fail each time
a new field was added. To test loading, it's better to use something that is
unlikely to change. The runtimev1.VersionResponse gets logged by the kubelet
and seems to be stable.
The benchmarks and unit tests were written so that they used custom APIs for
each log format. This made them less realistic because there were subtle
differences between the benchmark and a real Kubernetes component. Now all
logging configuration is done with the official
k8s.io/component-base/logs/api/v1.
To make the different test cases more comparable, "messages/s" is now reported
instead of the generic "ns/op".
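As a sketch of the mechanism (emitTestMessage is a hypothetical stand-in
for producing one log message), Go's testing package lets a benchmark
report a custom unit via b.ReportMetric:

```go
package example

import (
	"testing"
	"time"
)

func emitTestMessage() { /* stand-in for emitting one log message */ }

func BenchmarkLogging(b *testing.B) {
	start := time.Now()
	for i := 0; i < b.N; i++ {
		emitTestMessage()
	}
	// Report throughput instead of relying only on the default ns/op.
	b.ReportMetric(float64(b.N)/time.Since(start).Seconds(), "messages/s")
}
```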
When trying again with recent log files from the CI job, it was found that some
JSON messages get split across multiple lines, both in container logs and in
the systemd journal:
2022-12-21T07:09:47.914739996Z stderr F {"ts":1671606587914.691,"caller":"rest/request.go:1169","msg":"Response ...
2022-12-21T07:09:47.914984628Z stderr F 70 72 6f 78 79 10 01 1a 13 53 ... \".|\n","v":8}
Note the different time stamp on the second line. That first line is
long (17384 bytes). This seems to happen because the data must pass through a
stream-oriented pipe and thus may get split up by the Linux kernel.
The implication is that lines must get merged whenever the JSON decoder
encounters an incomplete line. The benchmark loader now supports that. To
simplify this, stripping the non-JSON line prefixes must be done before using
a log as test data.
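A simplified sketch of that merging strategy, assuming the non-JSON line
prefixes were already stripped (function name is illustrative):

```go
package example

import (
	"bufio"
	"encoding/json"
	"io"
)

// readJSONLines accumulates physical lines until the buffer parses as JSON,
// which handles messages that were split across multiple pipe writes.
func readJSONLines(r io.Reader) ([]map[string]interface{}, error) {
	var out []map[string]interface{}
	scanner := bufio.NewScanner(r)
	scanner.Buffer(make([]byte, 1024*1024), 1024*1024) // allow very long lines
	var pending []byte
	for scanner.Scan() {
		pending = append(pending, scanner.Bytes()...)
		var msg map[string]interface{}
		if err := json.Unmarshal(pending, &msg); err != nil {
			continue // incomplete message, keep accumulating
		}
		out = append(out, msg)
		pending = nil
	}
	return out, scanner.Err()
}
```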
The updated README explains how to do that when downloading a CI job
result. The amount of manual work gets reduced by committing symlinks under
data to the expected location under ci-kubernetes-kind-e2e-json-logging and
ignoring them when the data is not there.
Support for symlinks gets removed and path/filepath is used instead of path
because it has better Windows support.
A Service can use multiple EndpointSlices for its backend. When
using custom EndpointSlices, the data plane should forward traffic
to any of the endpoints in the EndpointSlices that belong to the
Service.
Change-Id: I80b42522bf6ab443050697a29b94d8245943526f
* migrating pkg/controller/serviceaccount to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
* small nit
Signed-off-by: Naman <namanlakhwani@gmail.com>
* capitalising first letter of error
Signed-off-by: Naman <namanlakhwani@gmail.com>
* addressed review comments
Signed-off-by: Naman <namanlakhwani@gmail.com>
* small nit to add key
Signed-off-by: Naman <namanlakhwani@gmail.com>
---------
Signed-off-by: Naman <namanlakhwani@gmail.com>
* migrated controller/replicaset to contextual logging
Signed-off-by: Naman <namanlakhwani@gmail.com>
* small nits
Signed-off-by: Naman <namanlakhwani@gmail.com>
* addressed changes
Signed-off-by: Naman <namanlakhwani@gmail.com>
* small nit
Signed-off-by: Naman <namanlakhwani@gmail.com>
* taking t as input
Signed-off-by: Naman <namanlakhwani@gmail.com>
---------
Signed-off-by: Naman <namanlakhwani@gmail.com>
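For reference, the contextual-logging pattern these migrations apply
generally looks like the sketch below (illustrative function, not the
actual controller code):

```go
package example

import (
	"context"

	"k8s.io/klog/v2"
)

func syncReplicaSet(ctx context.Context, key string) error {
	// Retrieve the logger carried by the context instead of calling klog directly.
	logger := klog.FromContext(ctx)
	logger.Info("Syncing replica set", "key", key)
	return nil
}
```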
Bump the timeout, as the previous timeout was sometimes too short,
resulting in the pod status update not being sent. Also, fix a typo
from a previous refactor.
Signed-off-by: David Porter <david@porter.me>
Breakdown of the steps implemented as part of this e2e test is as follows:
1. Create a file `registration` at path `/var/lib/kubelet/device-plugins/sample/`
2. Create the sample device plugin with the environment variable
`REGISTER_CONTROL_FILE=/var/lib/kubelet/device-plugins/sample/registration` so
that it waits for a client to delete the control file.
3. Trigger plugin registration by deleting the above-mentioned directory.
4. Create a test pod requesting devices exposed by the device plugin.
5. Stop kubelet.
6. Remove pods using CRI to ensure new pods are created after kubelet restart.
7. Restart kubelet.
8. Wait for the sample device plugin pod to be running. In this case,
the registration is not triggered.
9. Ensure that resource capacity/allocatable exported by the device plugin is zero.
10. The test pod should fail with `UnexpectedAdmissionError`.
11. Delete the test pod.
12. Delete the sample device plugin pod.
13. Remove `/var/lib/kubelet/device-plugins/sample/` and its contents (the
directory created to control registration).
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This commit reuses e2e tests implemented as part of https://github.com/kubernetes/kubernetes/pull/110729.
The commit is borrowed from the aforementioned PR as-is to preserve
authorship. A subsequent commit will update the end-to-end test to
simulate the problem this PR is trying to solve by reproducing
issue 109595.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Add a node e2e test to verify that static pods can be started after a
previous static pod with the same config temporarily failed termination.
The scenario is:
1. Static pod is started
2. Static pod is deleted
3. Static pod termination fails (internally `syncTerminatedPod` fails)
4. At later time, pod termination should succeed
5. New static pod with the same config is (re)-added
6. New static pod is expected to start successfully
To reproduce this scenario, set up a pod using an NFS mount. The NFS
server is stopped, which results in volumes failing to unmount and
`syncTerminatedPod` failing. The NFS server is later started, allowing
the volume to unmount successfully.
xref:
1. https://github.com/kubernetes/kubernetes/pull/113145#issuecomment-1289587988
2. https://github.com/kubernetes/kubernetes/pull/113065
3. https://github.com/kubernetes/kubernetes/pull/113093
Signed-off-by: David Porter <david@porter.me>
Add a node e2e test to verify that if a static pod is terminated while
the container runtime or CRI returns an error, the pod is eventually
terminated successfully.
This test serves as a regression test for k8s.io/issue/113145 which
fixes an issue where force-deleted pods may not be terminated if the
container runtime fails during `syncTerminatingPod`.
To test this behavior, start a static pod, stop the container runtime,
and later start the container runtime. The static pod is expected to
eventually terminate successfully.
To start and stop the container runtime, we need to find the container
runtime systemd unit name. Introduce a util function
`findContainerRuntimeServiceName` which finds the unit name by getting
the pid of the container runtime from the existing
`ContainerRuntimeProcessName` flag passed into node e2e and using
systemd dbus `GetUnitNameByPID` function to convert the pid of the
container runtime to a unit name. Using the unit name, introduce helper
functions to start and stop the container runtime.
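Roughly, and only as a sketch (error handling trimmed, names
illustrative), the helpers could look like this with
github.com/coreos/go-systemd/v22/dbus:

```go
package example

import (
	"context"

	sd "github.com/coreos/go-systemd/v22/dbus"
)

// restartContainerRuntime resolves the runtime's systemd unit from its pid
// and stops, then restarts, that unit.
func restartContainerRuntime(ctx context.Context, runtimePid int) error {
	conn, err := sd.NewWithContext(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	// Resolve e.g. "containerd.service" from the runtime's pid.
	unit, err := conn.GetUnitNameByPID(ctx, uint32(runtimePid))
	if err != nil {
		return err
	}

	done := make(chan string, 1)
	if _, err := conn.StopUnitContext(ctx, unit, "replace", done); err != nil {
		return err
	}
	<-done // wait for the stop job to complete

	if _, err := conn.StartUnitContext(ctx, unit, "replace", done); err != nil {
		return err
	}
	<-done
	return nil
}
```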
Signed-off-by: David Porter <david@porter.me>
The existing path is incorrect (it is missing the `sample-device-plugin`
directory) and thus causes test failures. The full path should be
`test/e2e/testing-manifests/sample-device-plugin/sample-device-plugin.yaml`.
Signed-off-by: David Porter <david@porter.me>
Update go-jose from v2.2.2 to v2.6.0.
This is to make the Kubernetes code compatible with newer go-jose versions that have a small breaking change (`jwt.NewNumericDate()` returns a pointer).
Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
This improves performance of the text formatting and ktesting.
Because ktesting no longer buffers messages by default, one unit
test needs to ask for that explicitly.
When running as part of the scheduler_perf benchmarks, we want to print
less information by default, so we should use V to limit verbosity.
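For example (logCreation is a hypothetical helper; the logger would come
from ktesting or the context):

```go
package example

import "k8s.io/klog/v2"

func logCreation(logger klog.Logger, count int) {
	// Only emitted when the benchmark runs with -v=4 or higher.
	logger.V(4).Info("created pods for benchmark", "count", count)
}
```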
Pretty-printing doesn't belong in "application" code. I am moving that into
the ktesting formatting (https://github.com/kubernetes/kubernetes/pull/116180).
Update the sample device plugin to enable the e2e node tests (or any
other entity with full access to the node filesystem) to control the
registration process. We add a new environment variable `REGISTER_CONTROL_FILE`.
The value of this variable must be a file; while that file is present,
the plugin will not register itself. Once the file is removed, the
plugin will go on and complete the registration. The plugin will
automatically detect the parent directory in which the file resides and
watch for deletions, unblocking the registration process. If the file
is specified but inaccessible, the plugin will fail. If the file is not
specified, the registration process will proceed as usual and never
pause. The plugin will need read access to the parent directory.
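A rough sketch of that wait loop, assuming github.com/fsnotify/fsnotify
(function name is illustrative):

```go
package example

import (
	"os"
	"path/filepath"

	"github.com/fsnotify/fsnotify"
)

// waitForControlFileRemoval blocks until the control file is deleted.
func waitForControlFileRemoval(controlFile string) error {
	if _, err := os.Stat(controlFile); os.IsNotExist(err) {
		return nil // no control file: register right away
	}
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		return err // file specified but cannot be watched: fail
	}
	defer watcher.Close()

	// Watch the parent directory so that deleting either the file or the
	// directory itself unblocks registration.
	if err := watcher.Add(filepath.Dir(controlFile)); err != nil {
		return err
	}
	for {
		select {
		case event := <-watcher.Events:
			if event.Op&fsnotify.Remove != 0 {
				return nil
			}
		case err := <-watcher.Errors:
			return err
		}
	}
}
```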
This feature is useful because it is not possible to control the order
in which the pods are recovered after node reboot/kubelet restart.
In this approach, the testing environment will create a directory and
then an empty file to pause the registration process of the plugin.
Once pointed to that file, the plugin will start and wait for it to
be deleted. Only after the directory has been deleted will the plugin
proceed with registration.
This feature is used in #114640, where an e2e test is implemented to
simulate scenarios where application pods requesting devices come up
before the device plugin pod after node reboot / kubelet restart.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This replaces the pretty useless us/op metric (useless because it includes
setup and teardown times) with the same values that also get stored in the JSON
file.
The main advantage is that benchstat can be used to analyze and compare
results.
The upstream ktesting has to be very flexible to accommodate different ways of
using it. In Kubernetes, we can be opinionated and make certain choices, like
using klog flags, and only those.
In contrast to EtcdMain, it can be called by individual tests or benchmarks and
each caller will get a fresh etcd instance. However, it uses the same
underlying code and the same port for all instances, so tests cannot run in
parallel.