2109 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
cda360c59f Merge pull request #104613 from ravisantoshgudimetla/reconcile-labels
[kubelet]: Reconcile OS and arch labels periodically
2021-11-08 14:15:19 -08:00
Artyom Lukianov
117141eee3 e2e_node: fix tests after Kubelet dynamic configuration removal
- CPU manager
- Memory Manager
- Topology Manager

Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-08 09:42:24 +02:00
ravisantoshgudimetla
3af5d37be7 [node][e2e test]: Make sure reconcile labels is working fine 2021-11-06 19:21:58 -04:00
Kubernetes Prow Robot
adcd2feb5e Merge pull request #104153 from cynepco3hahue/e2e_node_provide_static_kubelet_config
e2e node: provide static kubelet config
2021-11-04 17:11:53 -07:00
Kubernetes Prow Robot
27d3a9ec57 Merge pull request #104481 from AlexeyPerevalov/E2eIsKubeletConfiguration
e2e_node: Properly check for DynamicKubeletConfig
2021-11-04 16:11:53 -07:00
Artyom Lukianov
50fdcdfc59 e2e_node: refactor code to use a single method to update the kubelet config
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:44:35 +02:00
Artyom Lukianov
ca35bdb403 e2e_node: remove DynamicKubeletConfig tests from serial lane
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:26:19 +02:00
Artyom Lukianov
b6211657bf e2e_node: drop usage of DynamicKubeletConfig
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:26:19 +02:00
Artyom Lukianov
a5ed6c824a e2e_node: provide methods to update kubelet config via file
Signed-off-by: Artyom Lukianov <alukiano@redhat.com>
2021-11-04 15:26:19 +02:00
David Porter
ddd0d8a3da test: fixes for graceful node shutdown test
* Bump the pod status and node status update timeouts to avoid flakes
* Add a small delay after the dbus restart so that dbus has enough time to
  start up before the shutdown signal is sent
* Change the check for a pod being terminated by graceful shutdown.
  Previously, the test checked that the pod phase was `Failed` and that the
  pod reason string matched. This logic has to change because the 1.22
  graceful node shutdown change introduced in PR #102344 no longer puts the
  pods into a failed phase. Instead, the test now checks that containers
  are not ready, and that the pod status message and reason are set
  appropriately.

Signed-off-by: David Porter <david@porter.me>
2021-11-03 18:40:26 -07:00
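
The revised check described in ddd0d8a3da could look roughly like the sketch below. It is not the code from the test: `podStatus`, `containerStatus` and `isTerminatedByShutdown` are hypothetical stand-ins for the real Kubernetes API types and helpers, and the reason and message strings are placeholders rather than the values the kubelet actually sets.

```go
package main

import "fmt"

// containerStatus and podStatus are hypothetical, simplified stand-ins for
// the pod status fields the test inspects.
type containerStatus struct {
	Name  string
	Ready bool
}

type podStatus struct {
	Phase             string
	Reason            string
	Message           string
	ContainerStatuses []containerStatus
}

// isTerminatedByShutdown reports whether a pod looks like it was stopped by
// graceful node shutdown: no container is ready, and the status reason and
// message match the expected values. The pod phase is deliberately ignored,
// because the pod is no longer moved to a failed phase.
func isTerminatedByShutdown(s podStatus, wantReason, wantMessage string) bool {
	for _, c := range s.ContainerStatuses {
		if c.Ready {
			return false
		}
	}
	return s.Reason == wantReason && s.Message == wantMessage
}

func main() {
	// Placeholder strings (assumptions); the real values come from the kubelet.
	const reason = "ExampleShutdownReason"
	const message = "example shutdown message"

	s := podStatus{
		Phase:             "Running",
		Reason:            reason,
		Message:           message,
		ContainerStatuses: []containerStatus{{Name: "app", Ready: false}},
	}
	fmt.Println(isTerminatedByShutdown(s, reason, message))
}
```

Passing the expected reason and message as parameters keeps the sketch independent of the exact strings, which changed between releases.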
Kubernetes Prow Robot
b489b03946 Merge pull request #105575 from endocrimes/dani/cleanup-launcher
Allow the e2e_node runner to receive a KubeletConfiguration rather than requiring flags
2021-11-02 18:00:10 -07:00
Kubernetes Prow Robot
359b722c19 Merge pull request #102882 from fromanirh/device-manager-checkpoints
devicemanager: checkpoint: support pre-1.20 data
2021-11-02 16:56:57 -07:00
Danielle Lancashire
4ae64bd799 e2e_node: add a default kubeletconfig fallback 2021-11-02 15:10:29 +01:00
Danielle Lancashire
a4cf3a90a2 e2e_node: support passing kubelet-config-file to local runs 2021-11-02 15:10:29 +01:00
Danielle Lancashire
6e9e436026 e2e_node: kubelet config: move to file where possible 2021-11-02 15:10:28 +01:00
Danielle Lancashire
4097a3d472 e2e_node: allow customizing the base kubeletconfig
This commit forces Kubelet Configuration files to always be generated
and, when possible, uses the kubeletconfig file provided by the test
orchestrator.
2021-11-02 15:09:56 +01:00
Danielle Lancashire
f1deb0ba2e e2e_node: remote: add kubeletconfig to archive
This commit enables the remote runner to provide a KubeletConfiguration
file to the test suite when uploading it to a remote host. The test
runner will then use this configuration to run the Kubelet.
2021-11-02 15:08:39 +01:00
Danielle Lancashire
26980cf701 e2e_node: cleanup entrypoint 2021-11-02 15:08:39 +01:00
Danielle Lancashire
7dbbfe38e1 e2e_node: remote runner: junitFilePrefix -> junitFileName 2021-11-02 15:08:39 +01:00
David Porter
e1a951afe5 Fix COS GPU driver installation
* Rely on the built in GPU driver installer in COS as recommended in
  public docs - https://cloud.google.com/container-optimized-os/docs/how-to/run-gpus
* Run `nvidia-smi` after installation to verify installation
2021-10-28 17:49:50 -07:00
Kubernetes Prow Robot
1814c9c7fb Merge pull request #105926 from 249043822/br-flakytest1
Fix:Flaky test] [sig-node] Kubelet should correctly account for terminated pods after restart
2021-10-28 10:20:34 -07:00
Kubernetes Prow Robot
e450e3331f Merge pull request #105482 from endocrimes/dani/kubeletconfig
e2e_node: remove unnecessary dynamic config changes
2021-10-28 07:04:27 -07:00
KeZhang
257efda87a Fix:Flaky test] [sig-node] Kubelet should correctly account for terminated pods after restart 2021-10-28 08:31:14 +08:00
Kubernetes Prow Robot
fa6bb7cad0 Merge pull request #105921 from SergeyKanzhelev/setHostnameAsFQDNIsNodeConformance
setHostnameAsFQDN is a GA feature that does not depend on environment
2021-10-26 21:57:26 -07:00
Kubernetes Prow Robot
7c715dbc68 Merge pull request #105637 from Namanl2001/ssh
adding `--ssh-key` and `--ssh-user` for kubetest2
2021-10-26 16:33:45 -07:00
Francesco Romani
b382b6cd0a node: e2e: add test for the checkpoint recovery
Add an e2e test to exercise the checkpoint recovery flow.
This means we need to actually create an old (V1, pre-1.20) checkpoint,
but if we do it only in the e2e test, it's still fine.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-10-26 09:55:11 +02:00
Sergey Kanzhelev
cf0a387774 setHostnameAsFQDN is a GA feature that does not depend on environment 2021-10-26 00:24:12 +00:00
Kubernetes Prow Robot
18104ecf1f Merge pull request #105405 from verb/1.23-ec-beta
Promote EphemeralContainers to beta
2021-10-20 09:24:10 -07:00
Martin Schimandl
c9edee165a Cleanup FeatureGate skippers (#105428)
* Cleanup FeatureGate skippers

* Perform changes requested by review

* some more review related changes

* Rename skipper functions to make code more readable

* add utilfeature back in
2021-10-20 01:47:57 -07:00
Kubernetes Prow Robot
b17bf879a4 Merge pull request #105697 from fromanirh/e2e-kubelet-restart-cleanups
node: e2e: clarify findKubeletService
2021-10-19 20:43:57 -07:00
Lee Verberne
40e7689f0e Move ephemeral container e2e to common 2021-10-19 23:02:09 -04:00
Lee Verberne
ba649b97b7 Add ephemeral container checks to volume e2e tests 2021-10-19 23:02:09 -04:00
Kubernetes Prow Robot
712840904a Merge pull request #104540 from wzshiming/fix/node-shutdown-e2e
Fix nodeShutdownReason for node shutdown e2e
2021-10-19 18:39:57 -07:00
Lee Verberne
6f4b8da9a3 Promote EphemeralContainers feature to beta 2021-10-19 08:47:57 -04:00
Namanl2001
85d16760f0 adding defaultGceKey in remote/ssh.go
Signed-off-by: Namanl2001 <namanlakhwani@gmail.com>
2021-10-19 00:07:36 +05:30
Francesco Romani
baa55935f3 node: e2e: clarify findKubeletService
Add docstrings to findKubeletService and restartKubelet,
fix typos along the way.
xref: https://github.com/kubernetes/kubernetes/pull/105516#pullrequestreview-780230582

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-10-15 11:19:03 +02:00
Kubernetes Prow Robot
fe62fcc9b4 Merge pull request #105516 from fromanirh/e2e-kubelet-restart-improvements
e2e: node: kubelet restart improvements
2021-10-14 17:58:54 -07:00
Kubernetes Prow Robot
548e37278c Merge pull request #105313 from giuseppe/fix-flaky-pagefaults-summary-test
test, cgroupv2: adjust pagefaults test
2021-10-14 00:13:30 -07:00
Kubernetes Prow Robot
894ceb63d0 Merge pull request #105003 from swatisehgal/getallocatable-to-beta
podresource-api: getAllocatableResources to Beta
2021-10-13 17:43:27 -07:00
Kubernetes Prow Robot
63f66e6c99 Merge pull request #105012 from fromanirh/cpumanager-policy-options-beta
node: graduate CPUManagerPolicyOptions to beta
2021-10-08 07:32:59 -07:00
Kubernetes Prow Robot
2face135c7 Merge pull request #97415 from AlexeyPerevalov/ExcludeSharedPoolFromPodResources
Return only isolated cpus in podresources interface
2021-10-08 05:58:58 -07:00
Kubernetes Prow Robot
dd650bd41f Merge pull request #105527 from rphillips/fixes/filter_terminated_pods
kubelet: set terminated podWorker status for terminated pods
2021-10-07 22:19:51 -07:00
Ryan Phillips
3982fcae64 go fmt 2021-10-07 20:13:43 -05:00
Elana Hashman
c771698de3 Add e2e test to verify kubelet restart behaviour
Succeeded pods should not be counted as running on restart.
2021-10-07 18:30:17 -05:00
Francesco Romani
d15bff2839 e2e: node: expose the running flag
Each e2e test knows it wants to restart a running kubelet or a
non-running kubelet. The vast majority of times, we want to
restart a running kubelet (e.g. to change config or to check
some properties hold across kubelet crashes/restarts), but sometimes
we stop the kubelet, do some actions and only then restart.

To accommodate both use cases, we just expose the `running` boolean
flag to the e2e tests.

Having `restartKubelet` explicitly restart a running kubelet helps us
troubleshoot e2e failures in which the kubelet was supposed to be
running but was not; attempting a restart in such cases only muddied
the waters further, making both the troubleshooting and the eventual
fix harder.

In the happy path, no expected change in behaviour.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-10-07 22:15:28 +02:00
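
One possible reading of the `running` flag from d15bff2839, as a minimal sketch. The `kubeletService` interface, `restartKubeletSketch`, `fakeService` and `errNotRunning` are all hypothetical names introduced for illustration; the real helper talks to the service manager on the node and may behave differently.

```go
package main

import (
	"errors"
	"fmt"
)

// kubeletService is a hypothetical abstraction over the node's service manager.
type kubeletService interface {
	IsRunning() bool
	Restart() error
}

var errNotRunning = errors.New("kubelet was expected to be running, but it is not")

// restartKubeletSketch restarts the kubelet. When running is true the caller
// asserts that the kubelet is currently up; if it is not, the function fails
// fast instead of hiding the problem behind a restart.
func restartKubeletSketch(svc kubeletService, running bool) error {
	if running && !svc.IsRunning() {
		return errNotRunning
	}
	return svc.Restart()
}

// fakeService is an in-memory implementation used only for this example.
type fakeService struct {
	up       bool
	restarts int
}

func (f *fakeService) IsRunning() bool { return f.up }

func (f *fakeService) Restart() error {
	f.restarts++
	f.up = true
	return nil
}

func main() {
	stopped := &fakeService{up: false}

	// The test believed the kubelet was running: surface the mismatch.
	fmt.Println(restartKubeletSketch(stopped, true))

	// The test stopped the kubelet on purpose: restarting is fine.
	fmt.Println(restartKubeletSketch(stopped, false))
}
```

Returning an error in the first case keeps an unexpected kubelet outage visible in the test log, which is the troubleshooting benefit the commit message describes.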
Francesco Romani
e878c20ac7 e2e: node: improve error logging
In the `restartKubelet` helper, we use `exec.Command`, which
returns the output of the command as `[]byte`.
We logged that output as a raw value, making output that is
meant to be human readable unnecessarily hard to read.

We fix this annoying behaviour by converting the output to a string
before logging it, which makes the outcome of the command obvious.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2021-10-07 22:13:49 +02:00
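
The difference described in e878c20ac7 can be illustrated with a short sketch. The helper names `formatAsValue` and `formatAsString` are hypothetical, and `echo` is only a stand-in command (it assumes a Unix-like system); the actual helper runs a different command and uses the test framework's logger.

```go
package main

import (
	"fmt"
	"os/exec"
)

// formatAsValue mimics logging the raw []byte with %v, which prints the
// slice as a list of numeric byte values.
func formatAsValue(out []byte) string {
	return fmt.Sprintf("output: %v", out)
}

// formatAsString converts the bytes to a string first, so the logged text
// is the command's output as a human would read it.
func formatAsString(out []byte) string {
	return fmt.Sprintf("output: %s", string(out))
}

func main() {
	// CombinedOutput returns stdout and stderr together as a []byte.
	out, err := exec.Command("echo", "ok").CombinedOutput()
	if err != nil {
		fmt.Println("command failed:", err)
		return
	}
	fmt.Println(formatAsValue(out))  // numeric byte values, hard to read
	fmt.Print(formatAsString(out)) // readable text
}
```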
Swati Sehgal
5043b431b4 excludesharedpool: e2e tests: Test cases for pods with non-integral CPUs
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-07 15:39:41 +01:00
Swati Sehgal
42dd01aa3f excludesharedpool: e2e tests: code refactor to handle non-integral CPUs
This patch changes cpuCount to cpuRequest in order to cater to cases
where guaranteed pods make non-integral CPU Requests.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-07 15:39:40 +01:00
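
Why a request rather than a count is needed in 42dd01aa3f can be sketched as follows. This is an illustration only: `cpuRequest`, `isIntegral` and `expectedExclusiveCPUs` are hypothetical names, and the sketch assumes the static CPU manager policy, under which only guaranteed pods with whole-number CPU requests receive exclusive CPUs.

```go
package main

import "fmt"

// cpuRequest is a CPU request in millicores (1000 = one full CPU).
// A plain integer count cannot express requests such as 1500m.
type cpuRequest int64

// isIntegral reports whether the request is a whole number of CPUs.
func (r cpuRequest) isIntegral() bool {
	return r > 0 && r%1000 == 0
}

// expectedExclusiveCPUs returns how many exclusive CPUs a guaranteed pod
// with this request is expected to hold. Non-integral requests are served
// from the shared pool, so no exclusive CPUs are expected for them.
func expectedExclusiveCPUs(r cpuRequest) int64 {
	if !r.isIntegral() {
		return 0
	}
	return int64(r) / 1000
}

func main() {
	for _, r := range []cpuRequest{2000, 1500, 500} {
		fmt.Printf("%dm -> %d exclusive CPUs\n", r, expectedExclusiveCPUs(r))
	}
}
```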
Kubernetes Prow Robot
c4d802b0b5 Merge pull request #103289 from AlexeyPerevalov/DoNotExportEmptyTopology
podresources: do not export empty NUMA topology
2021-10-07 07:11:46 -07:00
Swati Sehgal
9337902648 podresource: move the checkForTopology logic inline
As per the recommendation here: https://github.com/kubernetes/kubernetes/pull/103289#pullrequestreview-766949859
we move the check inline.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2021-10-06 11:31:48 +01:00