kubernetes

Author	SHA1	Message	Date
Kubernetes Prow Robot	1814c9c7fb	Merge pull request #105926 from 249043822/br-flakytest1 Fix:Flaky test] [sig-node] Kubelet should correctly account for terminated pods after restart	2021-10-28 10:20:34 -07:00
Kubernetes Prow Robot	e450e3331f	Merge pull request #105482 from endocrimes/dani/kubeletconfig e2e_node: remove unnecessary dynamic config changes	2021-10-28 07:04:27 -07:00
KeZhang	257efda87a	Fix:Flaky test] [sig-node] Kubelet should correctly account for terminated pods after restart	2021-10-28 08:31:14 +08:00
Kubernetes Prow Robot	fa6bb7cad0	Merge pull request #105921 from SergeyKanzhelev/setHostnameAsFQDNIsNodeConformance setHostnameAsFQDN is a GA feature that does not depend on environment	2021-10-26 21:57:26 -07:00
Kubernetes Prow Robot	7c715dbc68	Merge pull request #105637 from Namanl2001/ssh adding `--ssh-key` and `--ssh-user` for kubetest2	2021-10-26 16:33:45 -07:00
Sergey Kanzhelev	cf0a387774	setHostnameAsFQDN is a GA feature that does not depend on environment	2021-10-26 00:24:12 +00:00
Kubernetes Prow Robot	18104ecf1f	Merge pull request #105405 from verb/1.23-ec-beta Promote EphemeralContainers to beta	2021-10-20 09:24:10 -07:00
Martin Schimandl	c9edee165a	Cleanup FeatureGate skippers (#105428) * Cleanup FeatureGate skippers * Perform changes requested by review * some more review related changes * Rename skipper functions to make code more readable * add utilfeature back in	2021-10-20 01:47:57 -07:00
Kubernetes Prow Robot	b17bf879a4	Merge pull request #105697 from fromanirh/e2e-kubelet-restart-cleanups node: e2e: clarify findKubeletService	2021-10-19 20:43:57 -07:00
Lee Verberne	40e7689f0e	Move ephemeral container e2e to common	2021-10-19 23:02:09 -04:00
Lee Verberne	ba649b97b7	Add ephemeral container checks to volume e2e tests	2021-10-19 23:02:09 -04:00
Kubernetes Prow Robot	712840904a	Merge pull request #104540 from wzshiming/fix/node-shutdown-e2e Fix nodeShutdownReason for node shutdown e2e	2021-10-19 18:39:57 -07:00
Lee Verberne	6f4b8da9a3	Promote EphemeralContainers feature to beta	2021-10-19 08:47:57 -04:00
Namanl2001	85d16760f0	adding defaultGceKey in remote/ssh.go Signed-off-by: Namanl2001 <namanlakhwani@gmail.com>	2021-10-19 00:07:36 +05:30
Francesco Romani	baa55935f3	node: e2e: clarify findKubeletService Add docstrings to findKubeletService and restartKubelet, fix typos along the way. xref: https://github.com/kubernetes/kubernetes/pull/105516#pullrequestreview-780230582 Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 11:19:03 +02:00
Kubernetes Prow Robot	fe62fcc9b4	Merge pull request #105516 from fromanirh/e2e-kubelet-restart-improvements e2e: node: kubelet restart improvements	2021-10-14 17:58:54 -07:00
Kubernetes Prow Robot	548e37278c	Merge pull request #105313 from giuseppe/fix-flaky-pagefaults-summary-test test, cgroupv2: adjust pagefaults test	2021-10-14 00:13:30 -07:00
Kubernetes Prow Robot	894ceb63d0	Merge pull request #105003 from swatisehgal/getallocatable-to-beta podresource-api: getAllocatableResources to Beta	2021-10-13 17:43:27 -07:00
Kubernetes Prow Robot	63f66e6c99	Merge pull request #105012 from fromanirh/cpumanager-policy-options-beta node: graduate CPUManagerPolicyOptions to beta	2021-10-08 07:32:59 -07:00
Kubernetes Prow Robot	2face135c7	Merge pull request #97415 from AlexeyPerevalov/ExcludeSharedPoolFromPodResources Return only isolated cpus in podresources interface	2021-10-08 05:58:58 -07:00
Kubernetes Prow Robot	dd650bd41f	Merge pull request #105527 from rphillips/fixes/filter_terminated_pods kubelet: set terminated podWorker status for terminated pods	2021-10-07 22:19:51 -07:00
Ryan Phillips	3982fcae64	go fmt	2021-10-07 20:13:43 -05:00
Elana Hashman	c771698de3	Add e2e test to verify kubelet restart behaviour Succeeded pods should not be counted as running on restart.	2021-10-07 18:30:17 -05:00
Francesco Romani	d15bff2839	e2e: node: expose the `running` flag Each e2e test knows whether it wants to restart a running kubelet or a non-running kubelet. The vast majority of times, we want to restart a running kubelet (e.g. to change config or to check some properties hold across kubelet crashes/restarts), but sometimes we stop the kubelet, do some actions and only then restart. To accommodate both use cases, we just expose the `running` boolean flag to the e2e tests. Having `restartKubelet` explicitly restart a running kubelet helps us to troubleshoot e2e failures in which the kubelet was supposed to be running, while it was not; attempting a restart in such cases only muddied the waters further, making the troubleshooting and the eventual fix harder. In the happy path, no expected change in behaviour. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-07 22:15:28 +02:00
Francesco Romani	e878c20ac7	e2e: node: improve error logging In the `restartKubelet` helper, we use `exec.Command`, whose return value is the output of the command, but as `[]byte`. We logged the output of the command as a raw value, making the output, meant to be human readable, unnecessarily hard to read. We fix this annoying behaviour by converting the output to a string before logging it, making the outcome of the command obvious. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-07 22:13:49 +02:00
Swati Sehgal	5043b431b4	excludesharedpool: e2e tests: Test cases for pods with non-integral CPUs Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-10-07 15:39:41 +01:00
Swati Sehgal	42dd01aa3f	excludesharedpool: e2e tests: code refactor to handle non-integral CPUs This patch changes cpuCount to cpuRequest in order to cater to cases where guaranteed pods make non-integral CPU Requests. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-10-07 15:39:40 +01:00
Kubernetes Prow Robot	c4d802b0b5	Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology podresources: do not export empty NUMA topology	2021-10-07 07:11:46 -07:00
Swati Sehgal	9337902648	podresource: move the checkForTopology logic inline As per the recommendation here: https://github.com/kubernetes/kubernetes/pull/103289#pullrequestreview-766949859 we move the check inline. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-10-06 11:31:48 +01:00
Giuseppe Scrivano	f23e2a8c7f	test, cgroupv2: adjust pagefaults test on cgroup v2 the reported metric is recursive and it includes all the sub cgroups. Closes: https://github.com/kubernetes/kubernetes/issues/105301 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>	2021-10-05 18:00:57 +02:00
Kubernetes Prow Robot	c5ad58d8a1	Merge pull request #103372 from verb/1.22-e2e-node Create node_e2e test for ephemeral containers	2021-10-05 05:41:09 -07:00
Danielle Lancashire	742d3d36f5	e2e_node: cleanup features in podresources	2021-10-05 14:39:59 +02:00
Danielle Lancashire	f28dd90810	e2e_node: NodeGracefulShutdown is a Beta feature	2021-10-05 14:39:59 +02:00
Danielle Lancashire	71e6d9cbe0	e2e_node: remove no-op config change from critical_pod_test	2021-10-05 10:36:32 +02:00
Danielle Lancashire	8b1b06c507	e2e_node: Remove KubeletPodResources enablement as it is a default gate	2021-10-05 10:26:10 +02:00
Kubernetes Prow Robot	9eaabb6b2e	Merge pull request #104304 from endocrimes/dani/eviction [Failing Test] Fix Kubelet Storage Eviction Tests	2021-10-04 15:16:40 -07:00
Lee Verberne	2a82228e33	Apply suggestions from code review Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>	2021-10-04 15:07:37 +02:00
Danielle Lancashire	7b91337068	e2e_node: eviction: Include names of pending-eviction pods in error	2021-10-04 13:07:40 +02:00
Danielle Lancashire	b5c2d3b389	e2e_node: eviction: Memory-backed Volumes separation This commit fixes the LocalStorageCapacityIsolationEviction test by acknowledging that in its default configuration kubelet will no longer evict memory-backed volume pods as they cannot use more than their assigned limit with SizeMemoryBackedVolumes enabled. To account for the old behaviour, we also add a test that explicitly disables the feature to test the behaviour of memory backed local volumes in those scenarios. That test can be removed when/if the feature gate is removed.	2021-10-04 13:07:40 +02:00
Danielle Lancashire	a8168ed543	e2e_node: Fix LocalStorage and PriorityLocalStorage eviction tests Currently the storage eviction tests fail for a few reasons: - They re-enter storage exhaustion after pulling the images during cleanup (increasing test storage reqs, and adding verification for future diagnosis) - They were timing out, as in practice it seems that eviction takes just over 10 minutes on an n1-standard in many cases. I'm raising these to 15 to provide some padding. This should ideally bring these tests to passing on CI, as they've now passed locally for me several times with the remote GCE env. Follow up work involves diagnosing why these take so long, and restructuring them to be less finicky.	2021-10-04 13:07:40 +02:00
Swati Sehgal	01dacd0463	podresource-api: getAllocatableResources to Beta Signed-off-by: Swati Sehgal <swsehgal@redhat.com>	2021-10-01 16:48:29 +01:00
Patrick Ohly	21d1bcd6b8	initialize logging after flag parsing It wasn't documented that InitLogs already uses the log flush frequency, so some commands have called it before parsing (for example, kubectl in the original code for logs.go). The flag never had an effect in such commands. Fixing this turned into a major refactoring of how commands set up flags and run their Cobra command: - component-base/logs: implicitly registering flags during package init is an anti-pattern that makes it impossible to use the package in commands which want full control over their command line. Logging flags must be added explicitly now, something that the new cli.Run does automatically. - component-base/logs: AddFlags would have crashed in kubectl-convert if it had been called because it relied on the global pflag.CommandLine. This has been fixed and kubectl-convert now has the same --log-flush-frequency flag as other commands. - component-base/logs/testinit: an exception are tests where flag.CommandLine has to be used. This new package can be imported to add flags to that once per test program. - Normalization of the klog command line flags was inconsistent. Some commands unintentionally didn't normalize to the recommended format with hyphens. This gets fixed for sample programs, but not for production programs because it would be a breaking change. This refactoring has the following user-visible effects: - The validation error for `go run ./cmd/kube-apiserver --logging-format=json --add-dir-header` now references `add-dir-header` instead of `add_dir_header`. - `staging/src/k8s.io/cloud-provider/sample` uses flags with hyphen instead of underscore. - `--log-flush-frequency` is not listed anymore in the --logging-format flag's `non-default formats don't honor these flags` usage text because it will also work for non-default formats once it is needed. 
- `cmd/kubelet`: the description of `--logging-format` uses hyphens instead of underscores for the flags, which now matches what the command is using. - `staging/src/k8s.io/component-base/logs/example/cmd`: added logging flags. - `apiextensions-apiserver` no longer prints a useless stack trace for `main` when command line parsing raises an error.	2021-09-30 13:46:49 +02:00
Francesco Romani	077c0aa1be	node: graduate CPUManagerPolicyOptions to beta We graduate the `CPUManagerPolicyOptions` feature to beta in the 1.23 cycle, and we add new experimental feature gates to guard new options which are planned in the 1.23 and in the following cycles. We introduce additional feature gate called `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions`. The basic idea is to avoid the cumbersome process of adding a feature gate for each option, and to have feature gates which track the maturity level of _groups_ of options. Besides this change, the graduation process, and the process in general, for adding new policy options is still unchanged. The `full-pcpus-only` option added in the 1.22 cycle is intentionally moved into the beta policy options For more details: - KEP: https://github.com/kubernetes/enhancements/pull/2933 - sig-arch discussion: https://groups.google.com/u/1/g/kubernetes-sig-architecture/c/Nxsc7pfe5rw Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-29 11:40:03 +02:00
Lee Verberne	da8ddb7485	Create node_e2e test for ephemeral containers	2021-09-27 13:13:33 +02:00
Kubernetes Prow Robot	e5c4defa8e	Merge pull request #103370 from verb/1.22-cleanup-shareprocesses-e2e Remove ShareProcessNamespace tags from e2e_node tests	2021-09-23 10:11:14 -07:00
Elana Hashman	47086a6623	Add test for recreating a static pod	2021-09-15 14:01:48 -04:00
Francesco Romani	54c7d8fbb1	e2e: TM: add option to fail instead of skip The Topology Manager e2e tests want to run on a real multi-NUMA system and want to consume real devices supported by device plugins; SRIOV devices happen to be the most commonly available of such devices. CI machines aren't multi-NUMA nor expose SRIOV devices, so the biggest portion of the tests will just skip, and we need to keep it like this until we figure out how to enable these features. However, some organizations can and want to run the testsuite on bare metal; in this case, the current test will skip (not fail) with misconfigured boxes, and this reports a misleading result. It will be much better to fail if the test preconditions aren't met. To satisfy both needs, we add an option, controlled by an environment variable, to fail (not skip) if the machine on which the test runs doesn't meet the expectations (multi-NUMA, 4+ cores per NUMA cell, exposed SRIOV VFs). We keep the old behaviour as default to keep being CI friendly. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-09-13 13:23:36 +02:00
Kubernetes Prow Robot	5261433627	Merge pull request #104606 from endocrimes/dani/device-driver-deflake [Failing Test] Fix GPU Device Driver test in kubelet-serial	2021-09-10 04:20:00 -07:00
Danielle Lancashire	b970bb5fe0	e2e_node: Update GPU tests to reflect reality In older versions of Kubernetes (at least pre-0.19, it's the earliest this test will run unmodified on), Pods that depended on devices could be restarted after the device plugin had been removed. Currently however, this isn't possible, as during ContainerManager.GetResources(), we attempt to DeviceManager.GetDeviceRunContainerOptions() which fails as there's no cached endpoint information for the plugin type. This commit therefore breaks apart the existing test into two: - One active test that validates that assignments are maintained across restarts - One skipped test that validates the behaviour after GPUs have been removed, in case we decide that this is a bug that should be fixed in the future.	2021-09-06 19:03:15 +02:00
Danielle Lancashire	3884dcb909	e2e_node: run gpu pod long enough to become ready	2021-08-26 14:24:23 +02:00


2087 Commits