Commit Graph

23007 Commits

Author SHA1 Message Date
Amim Knabben
3fd3a76eb9 Poll for stats until Windows kubelet present it in the stats endpoint 2023-03-01 17:17:23 -03:00
Han Kang
0199276f85 include beta metrics in documentation and update docs for metrics 2023-03-01 11:32:19 -08:00
Kubernetes Prow Robot
60eefa8066
Merge pull request #115425 from pohly/scheduler-perf-benchstat
scheduler perf: benchstat support
2023-03-01 11:19:29 -08:00
Kubernetes Prow Robot
fe671737ec
Merge pull request #116181 from pohly/dra-test-driver-update
e2e: dra test driver update
2023-03-01 10:10:39 -08:00
Patrick Ohly
961819a4d0 dependencies: update klog v2.90.1
This improves performance of the text formatting and ktesting.

Because ktesting no longer buffers messages by default, one unit
test needs to ask for that explicitly.
2023-03-01 19:03:50 +01:00
Anish Ramasekar
c52ac0d59d
[KMSv2] update ci script to create cluster and gather metrics
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-03-01 18:03:37 +00:00
Patrick Ohly
74785074c6 e2e dra: update logging
When running as part of the scheduler_perf benchmark testing, we want to print
less information by default, so we should use V to limit verbosity

Pretty-printing doesn't belong into "application" code. I am moving that into
the ktesting formatting (https://github.com/kubernetes/kubernetes/pull/116180).
2023-03-01 15:02:03 +01:00
Patrick Ohly
106fce6fae e2e dra: improve goroutine handling
There is an API now to wait for informer factory goroutine termination.
While at it, an incorrect comment for mutex locking gets removed.
2023-03-01 15:00:30 +01:00
Justin SB
50a025acdb e2e: Remove dead code in tests
We were building a local pod variable that we were no longer using.

Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
2023-03-01 08:08:33 -05:00
Kubernetes Prow Robot
9ef145d3a7
Merge pull request #116127 from pacoxu/negative-grace-period
retry for negative TerminationGracePeriodSeconds update
2023-03-01 04:29:16 -08:00
Swati Sehgal
7ea35d0cd8 node: device-mgr: sample device plugin: manifest to avoid registration
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-01 10:01:34 +00:00
Swati Sehgal
2c8fc26b89 node: device-mgr: sample device plugin: control registration process
Update the sample device plugin to enable the e2e node tests (or any
other entity with full access to the node filesystem) to control the
registration process. We add a new environment variable `REGISTER_CONTROL_FILE`.
The value of this variable must be a file which prevents the plugin
to register itself while it's present. Once removed, the plugin will
go on and complete the registration. The plugin will automatically
detect the parent directory on which the file resides and detect
deletions, unblocking the registration process. If the file is specified
but unaccessible, the plugin will fail. If the file is not specified,
the registration process will progress as usual and never pause.
The plugin will need read access to the parent directory.

This feature is useful because it is not possible to control the order
in which the pods are recovered after node reboot/kubelet restart.

In this approach, the testing environment will create a directory and
then a empty file to pause the registration process of the plugin.
Once pointed to that file, the plugin will start and wait for it to
be deleted. Only after the directory has been deleted,
the plugin would proceed to registration.

This feature is used in #114640 where e2e test is implemented to
simulate scenarios where application pods requesting devices come up before
the device plugin pod on node reboot/ kubelet restart.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-01 10:00:52 +00:00
Paco Xu
7d8437933e retry on conflict for negative TerminationGracePeriodSeconds update 2023-03-01 12:55:58 +08:00
Kubernetes Prow Robot
93a5181871
Merge pull request #116022 from nilekhc/reference-implementation-provider
[kmsv2] feat: add kms mock plugin for e2e tests
2023-02-28 17:57:17 -08:00
Nilekh Chaudhari
43acba8084
feat: kms base64 plugin for e2e tests
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>
2023-03-01 00:11:17 +00:00
Kubernetes Prow Robot
9b213330f5
Merge pull request #116153 from alexzielenski/podsecurity-featuregate-re-enable
skip special features in TestPodSecurityGAOnly
2023-02-28 16:07:23 -08:00
Kubernetes Prow Robot
6e202d6fdb
Merge pull request #116116 from ahg-g/ahg-mutable-job-ga
Graduate JobMutableNodeSchedulingDirectives feature to GA
2023-02-28 14:53:52 -08:00
Kubernetes Prow Robot
0469455ff7
Merge pull request #116082 from mimowo/fix-oomkiller-test
Fix the flaky OOMKiller test by sleep at start
2023-02-28 14:53:37 -08:00
Patrick Ohly
cc4bcd1d8e scheduler_perf: report data items as benchmark results
This replaces the pretty useless us/op metric (useless because it includes
setup and teardown times) with the same values that also get stored in the JSON
file.

The main advantage is that benchstat can be used to analyze and compare
results.
2023-02-28 23:08:23 +01:00
Patrick Ohly
961129c5f1 scheduler_perf: add logging flags
This enables testing of different real production configurations (JSON
vs. text, different log levels, contextual logging).
2023-02-28 23:08:17 +01:00
Patrick Ohly
00d1459530 test/utils: extend ktesting
The upstream ktesting has to be very flexible to accommodate different ways of
using it. In Kubernetes, we can be opinionated and make certain choices, like
using klog flags, and only those.
2023-02-28 23:06:00 +01:00
Patrick Ohly
c008732948 test/integration: add StartEtcd
In contrast to EtcdMain, it can be called by individual tests or benchmarks and
each caller will get a fresh etcd instance. However, it uses the same
underlying code and the same port for all instances, so tests cannot run in
parallel.
2023-02-28 23:05:17 +01:00
Alexander Zielenski
9ef1fc543f skip special features in TestPodSecurityGAOnly
was causing some alpha/beta features to be disabled after running sometimes
2023-02-28 13:21:35 -08:00
torredil
42909af615
Add windows nodeSelector to provisioning functions
Signed-off-by: torredil <torredil@amazon.com>
2023-02-28 21:21:29 +00:00
ahg-g
2ecd24011a Graduate JobMutableNodeSchedulingDirectives feature to GA 2023-02-28 15:47:13 +00:00
Kubernetes Prow Robot
6f68a13696
Merge pull request #115961 from pohly/e2e-framework-deprecate-gomega-wrappers
e2e framework: deprecate gomega wrappers
2023-02-28 06:27:29 -08:00
Kubernetes Prow Robot
04e7021d06
Merge pull request #114625 from Divya063/feature-private-image-registry
[E2E] Add support for pulling images from private registry
2023-02-28 06:27:17 -08:00
Kubernetes Prow Robot
806b215cce
Merge pull request #115987 from yuanchen8911/cleanup
Replace closures in test packages
2023-02-28 01:47:29 -08:00
Divya Rani
a8b1e57246 add support for pulling images from private registry 2023-02-28 00:40:51 -08:00
David Porter
e001884594 test: Add some default flags ginkgo flags for node e2e
Add the following ginkgo flags for each node e2e similar to the
existing hack/ginkgo-e2e.sh script.

* --no-color, colors aren't rendered properly in prow and make examining
  the log in text editors more difficult, so let's disable them.
  `hack/ginkgo-e2e.sh` (used for kind e2e tests) also disables them
  already.
* -v, enable verbose logs. This is needed so we get more detailed info
  even when the tests pass. This is useful so we can compare successful
  runs to failed runs.

Signed-off-by: David Porter <david@porter.me>
2023-02-28 00:24:40 -08:00
David Porter
e9ecdf3534 test: Emit ginkgo log for each node e2e
When running multiple node e2e with multiple machine images, the tests
are run separately for each node. The final build log has all of the
results for each of the hosts combined together which make debugging the
log difficult. To make it easier, emit a log for each host that was run.
This log will be written to the results directory and uploaded as an
artifact in prow jobs.

Signed-off-by: David Porter <david@porter.me>
2023-02-28 00:21:34 -08:00
David Porter
0980f026c9 test: Remove tests argument from node e2e image config
This was never being used, the only config that used it was deleted in
https://github.com/kubernetes/test-infra/pull/26017 so we don't need
this anymore, so let's delete it.

Signed-off-by: David Porter <david@porter.me>
2023-02-28 00:21:03 -08:00
Kubernetes Prow Robot
b9fd1802ba
Merge pull request #102884 from vinaykul/restart-free-pod-vertical-scaling
In-place Pod Vertical Scaling feature
2023-02-27 22:53:15 -08:00
Kevin Delgado
9828f62237 turn field validation e2e tests into conformance tests 2023-02-27 14:39:21 -08:00
Kubernetes Prow Robot
12ceec47aa
Merge pull request #115977 from elmiko/fix-lb-test
remove aws from e2e loadbalancer udp conntrack tests
2023-02-27 10:26:50 -08:00
Yuan Chen
a24aef6510 Replace a function closure
Replace more closures with pointer conversion

Replace deprecated Int32Ptr to Int32
2023-02-27 09:13:36 -08:00
Kubernetes Prow Robot
015e2fa20c
Merge pull request #115953 from pohly/lint-gomega
test: fixing + linting gomega usage
2023-02-27 00:56:20 -08:00
Michal Wozniak
36eef0600d Fix the flaky OOMKiller test by sleep at start 2023-02-27 08:15:46 +01:00
Yuan Chen
b929908f3b Add symlink check to e2e statefulset app
Fix a bug
2023-02-26 18:37:58 -08:00
Khachatur Ashotyan
1033cf80df e2e: promote gRPC probe tests to conformance 2023-02-25 08:54:51 +04:00
Kubernetes Prow Robot
70fee660fb
Merge pull request #115854 from kerthcet/cleanup/apiserver-cleanup
Cleanup resources when initializing error in integration
2023-02-24 15:58:05 -08:00
Chen Wang
7db339dba2 This commit contains the following:
1. Scheduler bug-fix + scheduler-focussed E2E tests
2. Add cgroup v2 support for in-place pod resize
3. Enable full E2E pod resize test for containerd>=1.6.9 and EventedPLEG related changes.

Co-Authored-By: Vinay Kulkarni <vskibum@gmail.com>
2023-02-24 18:21:21 +00:00
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
Vinay Kulkarni
76962b0fa7 In-place Pod Vertical Scaling - API changes
1. Define ContainerResizePolicy and add it to Container struct.
 2. Add ResourcesAllocated and Resources fields to ContainerStatus struct.
 3. Define ResourcesResizeStatus and add it to PodStatus struct.
 4. Add InPlacePodVerticalScaling feature gate and drop disabled fields.
 5. ResizePolicy validation & defaulting and Resources mutability for CPU/Memory.
 6. Various fixes from code review feedback (originally committed on Apr 12, 2022)
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources
2023-02-24 17:18:04 +00:00
Pavel Beschetnov
e25badc5de Reduce possible number of scale steps to improve test stability 2023-02-24 13:19:05 +00:00
Kubernetes Prow Robot
35f3fc59c1
Merge pull request #115236 from danielvegamyhre/scalable-indexed-job
Support for elastic Indexed Jobs
2023-02-23 14:57:34 -08:00
Kubernetes Prow Robot
10cdaefc1f
Merge pull request #116005 from gjkim42/fix-createStaticPod
Fix createStaticPod to not use container.RestartPolicy
2023-02-23 12:33:35 -08:00
Kubernetes Prow Robot
3bf57e3ded
Merge pull request #115989 from yuanchen8911/closure
Replace a function argument in statefulset e2e framework
2023-02-23 09:52:37 -08:00
Kubernetes Prow Robot
f3bb101f54
Merge pull request #115964 from atwamahmoud/fix-scale-crd-target-ref
Update ExistsInDiscovery to ignore 404 errors in autoscaling utils framework
2023-02-23 08:47:46 -08:00
Kubernetes Prow Robot
aa98f6f4da
Merge pull request #115606 from wzshiming/fix/termination_grace_period_seconds
`pod.spec.terminationGracePeriodSeconds` is a negative then convert to 1
2023-02-23 07:35:35 -08:00
Mahmoud Atwa
5df798f22c Fix import alias issue for errors package 2023-02-23 12:22:32 +00:00
Gunju Kim
f690a0ce41
Fix createStaticPod to not use container.RestartPolicy 2023-02-23 21:18:24 +09:00
Kubernetes Prow Robot
3702411ef9
Merge pull request #115926 from ffromani/e2e-node-remove-kubevirt-device-plugin
e2e: node remove: kubevirt device plugin
2023-02-23 04:13:35 -08:00
Patrick Ohly
181fc50f8e e2e framework: deprecate gomega wrappers
All wrappers except for ExpectNoError are identical to their gomega
counterparts. The only advantage that they have is that their invocations are
shorter.

That advantage does not outweigh their disadvantages:
- cannot be used in combination with gomega.Eventually/Consistently
- not a full replacement for gomega, so we just end up using both
- don't support passing a stack offset and thus cannot be used in helper
  functions
- ginkgolinter does not work for them, so sub-optimal calls like this one
  are not reported:

     framework.ExpectEqual(len(items), 0)
     ->
     gomega.Expect(items).To(gomega.BeEmpty())
- developers try to make do with what's available in the framework, leading
  to sub-optimal checks like this:

    framework.ExpectEqual(true, strings.Contains(event.Message, expectedEventError), "Event error should indicate non-root policy caused container to not start")
    ->
    gomega.Expect(event.Message).To(gomega.ContainSubstring(expectedEventError), "Event error should indicate non-root policy caused container to not start")

So let's remove these wrappers. As a first step they get marked as deprecated.
This enables stricter
linting (https://github.com/kubernetes/kubernetes/pull/109728), once enabled,
to report new code which uses them.
2023-02-23 09:51:42 +01:00
Yuan Chen
214bb8e573 Remove a closure function in statefulset e2e 2023-02-22 19:30:13 -08:00
Daniel Vega-Myhre
c63f448451 change test names and address other comments 2023-02-23 03:25:17 +00:00
Daniel Vega-Myhre
b0b0959b92 address comments 2023-02-23 03:25:16 +00:00
Daniel Vega-Myhre
d41302312e update validation logic so completions is mutable iff completions is modified in tandem with parallelsim so completions == parallelism 2023-02-23 03:25:16 +00:00
michael mccune
3a76c95f8e remove aws from e2e loadbalancer udp conntrack tests
These tests are not working as expected on AWS due to the requirement
for the user to add an annotation to the UDP Service created.
2023-02-22 17:08:57 -05:00
Patrick Ohly
41f23f52d0 test: fix ginkgolinter issues
All of these issues were reported by https://github.com/nunnatsa/ginkgolinter.
Fixing these issues is useful (several expressions get simpler, using
framework.ExpectNoError is better because it has additional support for
failures) and a necessary step for enabling that linter in our golangci-lint
invocation.
2023-02-22 19:36:05 +01:00
Mahmoud Atwa
13d25acdfa Remove version check & 403 ignore 2023-02-22 16:03:18 +00:00
cc
d49bff855f
refine: the server-side http Request Body is always non-nil (#115908)
* refine: the server-side http Request Body is always non-nil

* revert changes under vendor

* Update staging/src/k8s.io/pod-security-admission/cmd/webhook/server/server.go

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>

* Update main.go

---------

Co-authored-by: Jordan Liggitt <jordan@liggitt.net>
2023-02-22 07:18:09 -08:00
Mahmoud Atwa
e4c25afa59 Remove unneeded edit in error formatting 2023-02-22 13:51:01 +00:00
Mahmoud Atwa
b1980b2f6b Update ExistsInDiscovery to ignore 404 & 403 errors in autoscaling utils framework 2023-02-22 13:41:46 +00:00
Francesco Romani
00b41334bf e2e: node: podresources: fix restart wait
Fix the waiting logic in the e2e test loop to wait
for resources to be reported again instead of making logic on the
timestamp. The idea is that waiting for resource availability
is the canonical way clients should observe the desired state,
and it should also be more robust than comparing timestamps,
especially on CI environments.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:55 +01:00
Francesco Romani
92e00203e0 e2e: node: unify sample device plugin utilities
Start to consolidate the sample device plugin utility
and constants in a central place, because we need
to use it in different e2e tests.

Having a central dependency is better than a maze of
entangled e2e tests depending on each other helpers.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:55 +01:00
Francesco Romani
871201ba64 e2e: node: remove kubevirt device plugin
The podresources e2e tests want to exercise the case on which a device
plugin doesn't report topology affinity. The only known device plugin
which had this requirement and didn't depend on specialized hardware
was the kubevirt device plugin, which was however deprecated after
we started using it.

So the e2e tests are now broken, and in any case they can't depend on
unmaintained and liable to be obsolete code.

To unblock the state and preserve some e2e signal, we switch to the
sample device plugin, which is a stub implementation and which is
managed in-tree, so we can maintain it and ensure it fits the e2e test
usecase.

This is however a regression, because e2e tests should try their hardest
to use real devices and avoid any mocking or faking.
The upside is that using a OS-neutral device plugin for the tests enables
us to run on all the supported platform (windows!) so this could allow
us to transition these tests to conformance.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 14:04:22 +01:00
Francesco Romani
aa1a0385e2 e2e: node: podresources: internal cleanup
rename getPodResources for clarity. Allow to return error
(and not use ginkgo expectations), so it can actually be used
as intended inside `Eventually` blocks without blow up at the
first failure.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-02-22 13:49:13 +01:00
Shiming Zhang
b70d1377bb Add test for Pod TerminationGracePeriodSeconds is negative 2023-02-22 13:36:16 +08:00
Anish Ramasekar
c9b8ad6a55
[KMSv2] restructure kms staging dir
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-02-21 22:40:25 +00:00
Kubernetes Prow Robot
06b6644fcf
Merge pull request #115815 from Huang-Wei/pod-scheduling-readiness-beta
Graduate PodSchedulingReadiness to beta
2023-02-21 14:24:32 -08:00
Kubernetes Prow Robot
edea44c82e
Merge pull request #113205 from mimowo/oomkiller-e2e-node-test
Add e2e_node test for oom killed container reason
2023-02-21 14:23:55 -08:00
Kubernetes Prow Robot
487c443239
Merge pull request #115710 from pohly/e2e-import-restrictions
e2e framework: revise import restrictions
2023-02-20 17:17:48 -08:00
Kubernetes Prow Robot
b6582ffcd5
Merge pull request #115863 from jsafrane/remove-vsphere-test-global
Remove global vSphere framework variable
2023-02-20 11:09:48 -08:00
cpanato
a2c5863adc
update distroless iptables to v0.2.1
Signed-off-by: cpanato <ctadeu@gmail.com>
2023-02-20 13:44:09 +01:00
Jan Safranek
ba099644b2 Remove global framework variable
`f framework.Framework` does not need to be global, it's used only on a few
places.

This fixes vSphereDriver.PrepareTest() in in_tree.go that schedules
ginkgo.DeferCleanup() that uses the global `f` variable, but its value is not
valid at the time of ginkgo cleanup.
2023-02-20 11:00:12 +01:00
Michal Wozniak
fd28f69ca4 Add e2e_node test for oom killed container reason 2023-02-20 08:15:45 +01:00
Arda Güçlü
6c346e6cc9 Re-enable label selector 2023-02-20 09:10:51 +03:00
Arda Güçlü
6e8a1beda7 Add integration test for diff --prune --selector
This PR adds new integration tests for `kubectl diff --prune -l` to
catch possible regressions in the future.
2023-02-20 09:10:50 +03:00
Kubernetes Prow Robot
d6fe718e19
Merge pull request #115800 from shogohida/switch-image-in-grpc-probe-tests-to-agnhost
Switch image in gRPC probe tests to agnhost
2023-02-18 10:19:36 -08:00
Shogo Hida
7cbf007e47 Fix port number
Signed-off-by: Shogo Hida <shogo.hida@gmail.com>
2023-02-18 20:44:10 +09:00
Kubernetes Prow Robot
70b2e4aa3e
Merge pull request #113312 from jiahuif-forks/feature/cel/builtins
OpenAPI-based CEL type library
2023-02-18 00:31:36 -08:00
Wei Huang
72863f65d6
Graduate PodSchedulingReadiness to beta 2023-02-17 18:45:20 -08:00
Shogo Hida
26f95f475a Fix arguments
Signed-off-by: Shogo Hida <shogo.hida@gmail.com>
2023-02-18 09:51:17 +09:00
Kante Yin
ad55d0cbc9 Use context instead when cleaning up
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-02-17 17:13:35 +08:00
Kante Yin
014be8444a Make sure resoruces will be cleaned up when initializing error
Signed-off-by: Kante Yin <kerthcet@gmail.com>
2023-02-17 17:10:38 +08:00
Davanum Srinivas
4ecb4670cc
Remove unnecessary ETCD_UNSUPPORTED_ARCH for arm64
we should only use this env var for `arm`, since `arm64` is fully
supported by etcd folks, let us drop this!

(ex - https://github.com/etcd-io/etcd/releases/tag/v3.5.6)

ppc64le comment should be dropped as well

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2023-02-16 21:29:13 -05:00
Kubernetes Prow Robot
5c09c9de29
Merge pull request #115828 from cpanato/go1201
[go] Bump images, dependencies and versions to go 1.20.1
2023-02-16 09:55:56 -08:00
Kubernetes Prow Robot
d004f324d3
Merge pull request #114447 from yulng/elseyw
fix:Optimize code for else logic
2023-02-16 08:33:40 -08:00
cpanato
65230338ad
[go] Bump images, dependencies and versions to go 1.20.1
Signed-off-by: cpanato <ctadeu@gmail.com>
2023-02-16 13:38:32 +01:00
Kubernetes Prow Robot
1e84987bac
Merge pull request #115799 from pohly/test-util-data-race
test/utils: avoid data race during parallel create
2023-02-16 01:53:38 -08:00
Patrick Ohly
501a7678b3 test/utils: avoid data race during parallel create
The client-go Create call writes into the object that it gets passed. Each call
therefore needs its own copy when invoked in parallel.

Seen in

   go test -v -timeout=0 -bench=.*/SchedulingBasic/5000Nodes -race ./test/integration/scheduler_perf

WARNING: DATA RACE
Read at 0x00c003fa5b00 by goroutine 45227:
  k8s.io/apimachinery/pkg/apis/meta/v1.(*TypeMeta).GroupVersionKind()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/meta.go:126 +0x84
  k8s.io/apimachinery/pkg/runtime.WithVersionEncoder.Encode()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/helper.go:231 +0x176
  k8s.io/apimachinery/pkg/runtime.(*WithVersionEncoder).Encode()
      <autogenerated>:1 +0xfb
  k8s.io/apimachinery/pkg/runtime.Encode()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/codec.go:50 +0xb3
  k8s.io/client-go/rest.(*Request).Body()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/rest/request.go:469 +0x884
  k8s.io/client-go/kubernetes/typed/core/v1.(*pods).Create()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/kubernetes/typed/core/v1/pod.go:126 +0x264
  k8s.io/kubernetes/test/utils.CreatePodWithRetries.func1()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:61 +0x111
  k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:222 +0x30
  k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:262 +0x7b
  k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:255 +0x5c
  k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:431 +0x67
  k8s.io/kubernetes/test/utils.RetryWithExponentialBackOff()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:53 +0x1be
  k8s.io/kubernetes/test/utils.CreatePodWithRetries()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:70 +0x1bf
  k8s.io/kubernetes/test/utils.makeCreatePod()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/runners.go:1339 +0x68
  k8s.io/kubernetes/test/utils.CreatePod.func1()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/runners.go:1349 +0xab
  k8s.io/client-go/util/workqueue.ParallelizeUntil.func1()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue/parallelizer.go:90 +0x1c1

Previous write at 0x00c003fa5b00 by goroutine 45250:
  k8s.io/apimachinery/pkg/apis/meta/v1.(*TypeMeta).SetGroupVersionKind()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/meta.go:121 +0x1cc
  k8s.io/apimachinery/pkg/runtime.WithVersionEncoder.Encode()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/helper.go:241 +0x408
  k8s.io/apimachinery/pkg/runtime.(*WithVersionEncoder).Encode()
      <autogenerated>:1 +0xfb
  k8s.io/apimachinery/pkg/runtime.Encode()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/runtime/codec.go:50 +0xb3
  k8s.io/client-go/rest.(*Request).Body()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/rest/request.go:469 +0x884
  k8s.io/client-go/kubernetes/typed/core/v1.(*pods).Create()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/kubernetes/typed/core/v1/pod.go:126 +0x264
  k8s.io/kubernetes/test/utils.CreatePodWithRetries.func1()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:61 +0x111
  k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:222 +0x30
  k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:262 +0x7b
  k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtection()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:255 +0x5c
  k8s.io/apimachinery/pkg/util/wait.ExponentialBackoff()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:431 +0x67
  k8s.io/kubernetes/test/utils.RetryWithExponentialBackOff()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:53 +0x1be
  k8s.io/kubernetes/test/utils.CreatePodWithRetries()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/create_resources.go:70 +0x1bf
  k8s.io/kubernetes/test/utils.makeCreatePod()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/runners.go:1339 +0x68
  k8s.io/kubernetes/test/utils.CreatePod.func1()
      /nvme/gopath/src/k8s.io/kubernetes/test/utils/runners.go:1349 +0xab
  k8s.io/client-go/util/workqueue.ParallelizeUntil.func1()
      /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/util/workqueue/parallelizer.go:90 +0x1c1
2023-02-16 08:44:42 +01:00
Kubernetes Prow Robot
908803081f
Merge pull request #115811 from danwinship/e2e-userspace-cleanup
Remove checks for userspace proxy mode in e2e tests
2023-02-15 18:03:49 -08:00
Kubernetes Prow Robot
292450717c
Merge pull request #115394 from ritazh/kmsv2-metrics
kmsv2: add metrics
2023-02-15 18:03:37 -08:00
Kubernetes Prow Robot
a25834cb5a
Merge pull request #115802 from logicalhan/webhook-metrics
webhook metrics top out at 2.5s but default timeout is 10s
2023-02-15 15:29:11 -08:00
Rita Zhang
bd0f7f8ee8
kmsv2: add metrics
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
2023-02-15 15:08:24 -08:00
Dan Winship
9283429f22 Remove checks for userspace proxy mode in e2e tests
It's gone
2023-02-15 16:30:58 -05:00
Han Kang
7b823002f3 add 25s bucket 2023-02-15 10:31:12 -08:00
Kubernetes Prow Robot
fa3d5730a4
Merge pull request #115797 from pohly/dra-test-driver-resource-limit-fix
e2e dra: fix resource limits in a mixed cluster
2023-02-15 09:58:33 -08:00
Han Kang
20b5205dad use 10 seconds as the biggest bucket for webhook metrics otherwise charts will top out at 2.5s for webhook latencies 2023-02-15 09:17:41 -08:00
Shogo Hida
58ae449604 Change etcd to agnhost
Signed-off-by: Shogo Hida <shogo.hida@gmail.com>
2023-02-16 00:49:28 +09:00
Kubernetes Prow Robot
e18fa74551
Merge pull request #115590 from swatisehgal/topology-mgr-duration-metrics
node: topology-mgr: Add metric to measure topology manager admission latency
2023-02-15 07:12:25 -08:00
Patrick Ohly
20d7fa2771 e2e dra: fix resource limits in a mixed cluster
The check for "resources available on a node" must treat nodes that are not
listed as "no resources available". The previous logic only worked because all
nodes were listed during E2E testing. The upcoming integration testing is
covering additional scenarios and triggered this broken case.
2023-02-15 15:12:19 +01:00
Swati Sehgal
cf21dcef51 node: topology-mgr: e2e: changes to validate admission latency metrics
The component was previously incorrect. This patch updates to
the correct component name.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-02-15 13:59:56 +00:00
Kubernetes Prow Robot
390ddafe9e
Merge pull request #114494 from chrishenzie/readwriteoncepod-beta
Graduate ReadWriteOncePod to beta, updated e2e test
2023-02-14 16:35:42 -08:00
Kubernetes Prow Robot
731238fb41
Merge pull request #115739 from ii/update-ineligible-yaml-debug-endpoints
Update ineligible endpoints yaml to include debug endpoints
2023-02-14 12:46:02 -08:00
Chris Henzie
f855c90c1e chore: Update hostpath driver to v1.11.0
This version enforces the new SINGLE_NODE_SINGLE_WRITER CSI access mode
in NodePublishVolume.

See for more details:
https://github.com/kubernetes-csi/csi-driver-host-path/pull/381
2023-02-14 10:09:58 -08:00
Chris Henzie
0e47d90dd1 test: e2e test for ReadWriteOncePod preemption 2023-02-14 10:09:57 -08:00
Kubernetes Prow Robot
4cf352c4bb
Merge pull request #115456 from pohly/goroutine-leak-check
test/integration: goroutine leak check
2023-02-14 08:31:31 -08:00
Andy Goldstein
71ec5ed81d
resourcequota: use contexual logging (#113315)
Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
2023-02-14 07:19:31 -08:00
Patrick Ohly
f131cabfa0 test: use go-uber/goleak for strict leak checking
It provides more readable output and has additional APIs for using it inside a
unit test. goleak.IgnoreCurrent is needed to filter out the goroutine that gets
started when importing go.opencensus.io/stats/view.

In order to handle background goroutines that get created on demand and cannot
be stopped (like the one for LogzHealth), a helper function ensures that those
are running before calling goleak.IgnoreCurrent. Keeping those goroutines
running is not a problem and thus not worth the effort of adding new APIs to
stop them.

Other goroutines are genuine leaks for which no fix is available. Those get
suppressed via IgnoreTopFunction, which works as long as that function
is unique enough.

Example output for the leak fixed in https://github.com/kubernetes/kubernetes/pull/115423:

    E0202 09:30:51.641841   74789 etcd.go:205] "EtcdMain goroutine check" err=<
        found unexpected goroutines:
        [Goroutine 4889 in state chan receive, with k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop on top of the stack:
        goroutine 4889 [chan receive]:
        k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop(0xc0076183c0)
        	/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/watch/mux.go:268 +0x65
        created by k8s.io/apimachinery/pkg/watch.NewBroadcaster
        	/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/watch/mux.go:77 +0x116
    >
2023-02-14 12:11:37 +01:00
Kubernetes Prow Robot
8b2545efa3
Merge pull request #115730 from ravisantoshgudimetla/remove-cgo
Remove cgo dependency
2023-02-13 12:49:38 -08:00
Stephen Heywood
4d2611cf58 Update ineligible endpoints yaml
Adding the following endpoints
- connectCoreV1GetNamespacedPodPortforward
- connectCoreV1GetNamespacedPodAttach
- connectCoreV1PostNamespacedPodAttach
2023-02-14 09:00:44 +13:00
Kubernetes Prow Robot
b8b18ecd85
Merge pull request #114051 from chrishenzie/rwop-preemption
[scheduler] Support preemption of pods using ReadWriteOncePod PVCs
2023-02-13 11:45:30 -08:00
ravisantoshgudimetla
d65262d1f9 Remove cgo dependency 2023-02-13 11:16:39 -05:00
Kubernetes Prow Robot
bf79066749
Merge pull request #115714 from aramase/aramase/f/kubernetes#115595
[KMSv2] Add kind cluster and encryption config for e2e
2023-02-13 05:43:42 -08:00
Kubernetes Prow Robot
4933005b38
Merge pull request #115697 from aojea/lbds
don't run loadbalancer tests on large environments
2023-02-13 05:43:30 -08:00
Kubernetes Prow Robot
8ee0d3b6e8
Merge pull request #115584 from pbeschetnov/master
[HPA e2e] Calculate more precise consumed CPU usage for N replicas
2023-02-13 03:27:29 -08:00
Anish Ramasekar
4e6d5dddfb
[KMSv2] Add kind cluster and encryption config for e2e
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-02-13 06:42:54 +00:00
Patrick Ohly
3e760310b2 e2e: revise import restrictions
- test/e2e/framework/*.go should have very minimal dependencies.
  We can enforce that via import-boss.

- What each test/e2e/framework/* sub-package uses is less relevant,
  although ideally it also should be as minimal as possible in each case.

Enforcing this via import-boss ensures that new dependencies get flagged as
problem and thus will get additional scrutiny. It might be okay to add them,
but it needs to be considered.
2023-02-12 14:56:45 +01:00
Antonio Ojea
244d7449ce don't run loadbalancer tests on large environments
Change-Id: Id987e9469e563c0837c6437a44a65889cec2e202
2023-02-11 10:28:25 +00:00
David Porter
826472c99d test: e2e node shutdown test logging improvements
Since the pod names are reused across the test, searching the logs is
currently difficult.

Use a uuid for each pod name to make grepping the logs easier. Also,
always include the pod name and pod namespace in any logs or error
messages to make debugging easier.

Signed-off-by: David Porter <david@porter.me>
2023-02-10 16:54:31 -08:00
Kubernetes Prow Robot
0424a530a4
Merge pull request #115678 from pohly/e2e-full-reports
e2e: revise complete report creation
2023-02-10 15:07:29 -08:00
Kubernetes Prow Robot
1749bb2991
Merge pull request #115579 from ardaguclu/fix-wait-sh-timeout
flaky test wait.sh: Add deployment assertion before running wait
2023-02-10 13:59:29 -08:00
Patrick Ohly
3e2b26ce52 e2e: revise complete report creation
The previous approach was based on the observation that some Prow jobs use the
--report-dir parameter instead of the E2E_REPORT_DIR env variable. Parsing the
command line was necessary to use the --json-report and --junit-report
parameters.

But that is complex and can be avoided by triggering the creation of complete
reports in the E2E test suite. The paths are hard-coded and relative to the
report directory to keep the code simple.

There was a report that k8s-triage started processing more data after
6db4b741dd was merged. It's unclear whether
that was because of the new <report-dir>/ginkgo_report.xml file. To avoid
this potential problem, the reports are now in a "ginkgo" sub-directory.

While at it, error checking gets enhanced:
- Create directories at the start of
  the suite and bail out early if that fails.
- *All* e2e suites using the framework do this, not just test/e2e.
- Added missing error checking of truncated JUnit report writing.
2023-02-10 10:20:20 +01:00
Arda Güçlü
e0fedec69d (kubectl debug): Support debugging via files
Currently `kubectl debug` only supports passing names in command line.
However, users might want to pass resources in files by passing `-f` flag like
in all other kubectl commands.

This PR adds this ability.
2023-02-10 10:21:30 +03:00
Kubernetes Prow Robot
b2f8c8f00d
Merge pull request #115635 from bobbypage/npd-time-fix
test: Simplify NPD start timestamp calculation
2023-02-09 18:37:31 -08:00
Kubernetes Prow Robot
0698d9eb82
Merge pull request #115649 from aramase/grpc-metrics
[KMSv2] Add metrics for grpc service
2023-02-09 15:50:45 -08:00
Kubernetes Prow Robot
6e2e61bb3c
Merge pull request #115657 from saschagrunert/inject-base64
Allow SSH e2e node base64 key injection
2023-02-09 14:45:06 -08:00
Kubernetes Prow Robot
95c65ca3a0
Merge pull request #115454 from dgrisonnet/promote-pod-resource-metrics
Promote pod resource metrics to stable
2023-02-09 12:36:16 -08:00
Damien Grisonnet
49da8a1d4a scheduler: promote pod resource metrics to stable
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2023-02-09 20:30:45 +01:00
Anish Ramasekar
de3b2d525b
[KMSv2] Add metrics for grpc service
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2023-02-09 18:51:37 +00:00
Sascha Grunert
85106dc327
Allow SSH e2e node base64 key injection
With the change of the CRI-O jobs to use butane, we now have a
verification for base64 data urls in place. This means that the
following URL is invalid:

```
data:text/plain;base64,GCE_SSH_PUBLIC_KEY_FILE_CONTENT
```

This means we have to pass valid base64 to the URL. To fix that, we now
allow to inject SSH key values with both, the
`GCE_SSH_PUBLIC_KEY_FILE_CONTENT` field and its base64 encoded variant.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-02-09 16:17:11 +01:00
vaibhav2107
6ab8a8fbec Updated the change in registry 2023-02-09 09:37:44 +05:30
David Porter
7fe371a974 test: Simplify NPD start timestamp calculation
The NPD test checks when NPD started to determine if it is needed to
check the kubelet start event. The current logic requires parsing the
journalctl logs which is quite fragile and is broken now because of
systemd changing the expected log format.

Newer versions of systemd do not print "end at" or "logs begin at" and
instead may print "No entries", which will result in the test panicking.

```
$ journalctl -u foo.service
-- No entries --
```

For units started, it will not print "end at" or "logs begin at":

```
root@ubuntu-jammy:~# journalctl -u foo.service
Feb 08 22:02:19 ubuntu-jammy systemd[1]: Started /usr/bin/sleep 1s.
Feb 08 22:02:20 ubuntu-jammy systemd[1]: foo.service: Deactivated successfully.
```

To avoid relying on log parsing which is fragile, let's instead directly
ask systemd when the NPD service started and parse the resulting
timestamp.

Signed-off-by: David Porter <david@porter.me>
2023-02-08 13:58:45 -08:00
Kubernetes Prow Robot
468ce59183
Merge pull request #115557 from MikeSpreitzer/cleanup-path-hack
Simplify construction of /metrics request
2023-02-08 09:28:58 -08:00
Matthew Cary
69808b74ec Remove obsolete GKE local SSD test
Change-Id: I156bd03ac740c2ebe394081d3106851f7182269f
2023-02-07 17:33:32 -08:00
Riaan Kleinhans
999e9f14f7
remove conformance tested endpoints 2023-02-08 11:55:44 +13:00
Kubernetes Prow Robot
090025f5e6
Merge pull request #115548 from pohly/e2e-wait-for-pods-with-gomega
e2e: wait for pods with gomega, II
2023-02-07 07:01:21 -08:00
Kubernetes Prow Robot
22b88dea36
Merge pull request #115315 from enj/enj/i/kas_kubelet_conn_close
kubelet/client: collapse transport wiring onto standard approach
2023-02-07 07:01:14 -08:00
Kubernetes Prow Robot
4f321041bd
Merge pull request #115537 from MadhavJivrajani/bump-tools-deps-go120
*: Bump golangci-lint version and adapt to new linters
2023-02-07 05:53:12 -08:00
Pavel Beschetnov
456de495ef Calculate more precise usage for replicas 2023-02-07 12:41:36 +00:00
Arda Güçlü
60401d35d1 flaky test wait.sh: Add deployment assertion before running wait
There is a test in wait.sh integration suite which is checking the
given timeout value(passed by user) is equal to actual happened timeout value.

However, this test rarely gets `no matching resources found` error and
causes flakyiness. The reason is we are running wait command, immediately
after applying deployment. In reality, timeout test does not care about
deployment, since it is testing the timeout by passing invalid configurations.
But we need this deployment to not get `no matching resources found` error.

That's why, this PR adds deployment assertion before executing wait command.
2023-02-07 14:19:34 +03:00
Madhav Jivrajani
5e1f440d0a *: Fix linter warnings
Adapt to newly improved linters in golangci-lint v1.51.1

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2023-02-07 13:01:41 +05:30
Kubernetes Prow Robot
e944fc28ca
Merge pull request #115443 from torredil/master
Add windows nodeSelector to e2e storage testing pods
2023-02-06 18:27:09 -08:00
Monis Khan
754cb3d601
kubelet/client: collapse transport wiring onto standard approach
Signed-off-by: Monis Khan <mok@microsoft.com>
2023-02-06 20:34:49 -05:00
Mike Spreitzer
e9973979d0 Simplify construction of /metrics request 2023-02-06 16:20:34 -05:00
torredil
25389ee0ee
Add nodeSelector to e2e storage testing pods
Signed-off-by: torredil <torredil@amazon.com>
2023-02-06 16:00:51 +00:00
Patrick Ohly
136f89dfc5 e2e: use error wrapping with %w
The recently introduced failure handling in ExpectNoError depends on error
wrapping: if an error prefix gets added with `fmt.Errorf("foo: %v", err)`, then
ExpectNoError cannot detect that the root cause is an assertion failure and
then will add another useless "unexpected error" prefix and will not dump the
additional failure information (currently the backtrace inside the E2E
framework).

Instead of manually deciding on a case-by-case basis where %w is needed, all
error wrapping was updated automatically with

    sed -i "s/fmt.Errorf\(.*\): '*\(%s\|%v\)'*\",\(.* err)\)/fmt.Errorf\1: %w\",\3/" $(git grep -l 'fmt.Errorf' test/e2e*)

This may be unnecessary in some cases, but it's not wrong.
2023-02-06 15:39:13 +01:00
Patrick Ohly
9878e735dd e2e pod: unit test for pod status + API error
This covers new behavior in gomega.
2023-02-06 15:39:13 +01:00