This commit reuses e2e tests implemented as part of https://github.com/kubernetes/kubernetes/pull/110729.
The commit is borrowed from the aforementioned PR as is to preserve
authorship. A subsequent commit will update the end-to-end test to
simulate the problem this PR is trying to solve by reproducing
issue 109595.
Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
This will change when adding dynamic resource allocation test cases. Instead of
changing mustSetupScheduler and StartScheduler for that, let's return the
informer factory and create informers as needed in the test.
Entire test cases and workloads can have labels attached to them. The union of
these must match the label filter which works as in GitHub. The benchmark by
default runs the tests that are labeled "performance", which is the same as
before.
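A minimal sketch of how such a filter could be evaluated against the union of test case and workload labels. The filter syntax here (comma-separated, `+name` or `name` required, `-name` forbidden) and the names `matchesFilter` and `union` are assumptions for illustration, not necessarily what the benchmark implements.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesFilter reports whether the label set satisfies the filter.
// Assumed syntax (hypothetical): a comma-separated list in which "name"
// or "+name" requires a label and "-name" forbids it.
func matchesFilter(labels []string, filter string) bool {
	have := make(map[string]bool, len(labels))
	for _, l := range labels {
		have[l] = true
	}
	for _, term := range strings.Split(filter, ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		if strings.HasPrefix(term, "-") {
			if have[term[1:]] {
				return false // a forbidden label is present
			}
			continue
		}
		if !have[strings.TrimPrefix(term, "+")] {
			return false // a required label is missing
		}
	}
	return true
}

// union merges the labels of a test case with those of one of its workloads.
func union(testCase, workload []string) []string {
	return append(append([]string{}, testCase...), workload...)
}

func main() {
	labels := union([]string{"performance"}, []string{"fast"})
	fmt.Println(matchesFilter(labels, "performance"))       // true
	fmt.Println(matchesFilter(labels, "performance,-fast")) // false
}
```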
Capture explicitly a test case pertaining to kubelet restart
but with no pod restart and device plugin re-registration.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
The sleep interval depends on whether the test case requires a pod restart,
so we define constants for the two sleep intervals that can be used in the
corresponding test cases.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Explicitly state that the test involves kubelet restart and device plugin
re-registration (no pod restart)
We remove the part of the code where we wait for the pod to restart as this
test case should no longer involve pod restart.
In addition to that, we use `waitForNodeReady` instead of `WaitForAllNodesSchedulable`
for ensuring that the node is ready for pods to be scheduled on it.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Rather than testing both pod restart and kubelet restart,
we change the tests to handle only the pod restart scenario.
Clarify the test purpose and add an extra check to tighten the test.
We will add additional tests to cover kubelet restart scenarios
in subsequent commits.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Signed-off-by: Francesco Romani <fromani@redhat.com>
With this change the error messages are more helpful and make test
failures easier to troubleshoot.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
We rename to make the intent more explicit.
We make it global so that the value can be reused across the module
(e.g. to check the node readiness) later on.
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
Rather than only returning a string forcing us to log failure with
`framework.Fail`, we return a string and error to handle error cases
more conventionally. This enables us to use the `parseLog` function
inside `Eventually` and `Consistently` blocks, or in general to delegate
the error processing and enable better composability.
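A rough sketch of why the `(string, error)` shape composes better. This `parseLog` takes the log text directly and `retry` stands in for a polling block such as `Eventually`; both signatures are simplified assumptions, not the helpers used in the test.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// parseLog returns the first capture group of pattern found in logs.
// Hypothetical simplification: the log text is passed in directly.
func parseLog(logs, pattern string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	m := re.FindStringSubmatch(logs)
	if len(m) < 2 {
		return "", fmt.Errorf("no match for %q in logs", pattern)
	}
	return m[1], nil
}

// retry polls fn until it succeeds or the attempts are used up. Because fn
// returns an error instead of failing the test, the caller decides what a
// failed attempt means.
func retry(attempts int, interval time.Duration, fn func() (string, error)) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		out, err := fn()
		if err == nil {
			return out, nil
		}
		lastErr = err
		time.Sleep(interval)
	}
	return "", fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}

func main() {
	logs := ""
	calls := 0
	out, err := retry(3, time.Millisecond, func() (string, error) {
		calls++
		if calls == 2 {
			logs = "stub devices: Dev-1\n" // the container logged in the meantime
		}
		return parseLog(logs, `stub devices: (Dev-[0-9]+)`)
	})
	fmt.Println(out, err)
}
```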
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Co-authored-by: Francesco Romani <fromani@redhat.com>
The previous solution had some shortcomings:
- It was based on the assumption that the goroutine gets woken up at regular
  intervals. This is not actually guaranteed. Now the code keeps track of the
  actual start and end of an interval and verifies that assumption.
- If no pod was scheduled (unlikely, but could happen), then
  "0 pods/s" got recorded. As a result, the metric was always either
  zero or >= 1. A better solution is to extend the interval
  until some pod gets scheduled. With the larger time interval
  it is then possible to also track, for example, 0.5 pods/s.
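The two points above can be sketched as follows. The type `throughputTracker` and its fields are hypothetical names for illustration; the sketch only shows measuring the real elapsed time and keeping an interval open while nothing was scheduled.

```go
package main

import (
	"fmt"
	"time"
)

// throughputTracker records pods/s samples over intervals of varying length.
type throughputTracker struct {
	intervalStart time.Time // actual start of the currently open interval
	lastCount     int       // pods scheduled when that interval started
	samples       []float64
}

// observe is called whenever the sampling goroutine wakes up. now is the
// actual wake-up time, scheduled the total number of pods scheduled so far.
func (t *throughputTracker) observe(now time.Time, scheduled int) {
	newPods := scheduled - t.lastCount
	elapsed := now.Sub(t.intervalStart).Seconds()
	if newPods <= 0 || elapsed <= 0 {
		// Nothing scheduled yet: keep the interval open instead of
		// recording "0 pods/s".
		return
	}
	t.samples = append(t.samples, float64(newPods)/elapsed)
	t.intervalStart = now
	t.lastCount = scheduled
}

// demo feeds three wake-ups into the tracker and returns the samples.
func demo() []float64 {
	start := time.Unix(0, 0)
	t := &throughputTracker{intervalStart: start}
	t.observe(start.Add(1*time.Second), 0) // no pod: interval is extended
	t.observe(start.Add(2*time.Second), 1) // 1 pod in 2s
	t.observe(start.Add(3*time.Second), 3) // 2 pods in 1s
	return t.samples
}

func main() {
	fmt.Println(demo()) // [0.5 2]
}
```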
This allows us to return with a timeout error as soon as the
context is canceled. Previously, in cases where the mount would
never succeed, pods could get stuck deleting for 2 minutes.
In the Sync*Pod methods that call VolumeManager.WaitFor*, we
must filter out wait.Interrupted errors from being logged as
they are part of control flow, not runtime problems. Any
early interruption should result in exiting the Sync*Pod method
as quickly as possible without logging intermediate errors.
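A standard-library sketch of that filtering. `isInterrupted` is a stand-in for `wait.Interrupted` that only checks the context errors, and `syncPod` and `waitForVolumes` are hypothetical simplifications of the real methods.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// isInterrupted is a stand-in for wait.Interrupted (assumption: only the
// context cancellation and deadline errors are considered here).
func isInterrupted(err error) bool {
	return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
}

// waitForVolumes returns as soon as the volumes are mounted or ctx is done.
func waitForVolumes(ctx context.Context, mounted <-chan struct{}) error {
	select {
	case <-mounted:
		return nil
	case <-ctx.Done():
		return fmt.Errorf("waiting for volumes: %w", ctx.Err())
	}
}

// syncPod exits early on any error but logs only genuine runtime problems.
func syncPod(ctx context.Context, mounted <-chan struct{}, logf func(string)) error {
	if err := waitForVolumes(ctx, mounted); err != nil {
		if !isInterrupted(err) {
			logf("unable to mount volumes: " + err.Error())
		}
		return err
	}
	return nil
}

// demo cancels the context before syncing a pod whose mount never succeeds.
func demo() (logged int, interrupted bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	err := syncPod(ctx, make(chan struct{}), func(string) { logged++ })
	return logged, isInterrupted(err)
}

func main() {
	fmt.Println(demo()) // 0 true
}
```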
Copied and modified RemoveString function from
k/k/pkg/util/slice/slice.go to e2e/framework/pod/pod_client.go
This is the last dependency from the e2e framework on k/k/pkg/util.
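For illustration, a local helper of this kind can be as small as the sketch below. The exact signature of the copied function is an assumption here.

```go
package main

import "fmt"

// removeString returns a new slice with every occurrence of s removed.
// The input slice is left untouched.
func removeString(slice []string, s string) []string {
	result := make([]string, 0, len(slice))
	for _, item := range slice {
		if item == s {
			continue
		}
		result = append(result, item)
	}
	return result
}

func main() {
	fmt.Println(removeString([]string{"a", "b", "a"}, "a")) // [b]
}
```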
Copied and modified pod format function from
k/k/pkg/kubelet/util/format/pod.go to e2e/framework/pod/pod_client.go
This is the last dependency from the e2e framework on k/k/pkg/kubelet.
It's conceptually wrong to have dependencies on k/k/pkg in
the e2e framework code. They should be moved to corresponding
packages, in this particular case to the test/e2e_node.
* Convert file but warn user about impossible conversions
* Only continuing for NotRegisteredErrors. Using iostreams for warning user instead of stdError
* Formatting, correct tests to use valid DNS-1035.
By generating the unique name in advance, the label also can be set to a
matching value directly in the Create request. This makes test startup in
test/integration/scheduler_perf a bit faster because the extra patching can be
avoided.
It also leads to a better label because previously, the unique label value
didn't match the node name. This is required for simulating dynamic resource
allocation, which relies on the label to track where an allocated claim is
available.
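A minimal sketch of the idea, using a hypothetical `node` type and label key. The real code works with API objects and may generate names differently.

```go
package main

import "fmt"

// nodeNameLabel is a hypothetical label key used only in this sketch.
const nodeNameLabel = "example.com/node-name"

// node is a stand-in for the API object.
type node struct {
	Name   string
	Labels map[string]string
}

// newNode picks the unique name before the object is created, so the label
// can be set to the same value in the Create request and no follow-up patch
// is needed.
func newNode(prefix string, index int, template node) node {
	n := node{
		Name:   fmt.Sprintf("%s%d", prefix, index),
		Labels: make(map[string]string, len(template.Labels)+1),
	}
	for k, v := range template.Labels {
		n.Labels[k] = v
	}
	n.Labels[nodeNameLabel] = n.Name
	return n
}

func main() {
	n := newNode("node-", 3, node{Labels: map[string]string{"zone": "a"}})
	fmt.Println(n.Name, n.Labels[nodeNameLabel])
}
```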
Add a regression test for https://issues.k8s.io/116925. The test
exercises the following:
1) Start a pod with restart policy Never, which will exit with the
`v1.PodSucceeded` phase.
2) Start a graceful deletion of the pod (set a deletion timestamp)
3) Restart the kubelet as soon as the kubelet reports the pod is
terminal (but before the pod is deleted).
4) Verify that after kubelet restart, the pod is deleted.
As of v1.27, there is a delay between the pod being marked with a terminal
phase and the status manager deleting the pod. If the kubelet is
restarted in the middle, after starting up again, the kubelet needs to
ensure the pod will be deleted on the API server.
Signed-off-by: David Porter <david@porter.me>
It makes sense to define a test where, depending on the parameters, some
operation creates zero pods, namespaces or nodes. The validation didn't allow
that previously because of how it was implemented, although the underlying
code works fine with zero as the count.
collector.collect got called without ensuring that collector.run had
terminated, so it could have happened that collector.run adds another sample
while collector.collect is reading them.
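One way to get that ordering, sketched with a hypothetical `collector` type: the caller cancels the context, waits for `run` to return, and only then calls `collect`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// collector is a hypothetical stand-in for the benchmark's data collector.
type collector struct {
	samples []int
}

// run appends samples until ctx is canceled.
func (c *collector) run(ctx context.Context, ticks <-chan int) {
	for {
		select {
		case <-ctx.Done():
			return
		case v := <-ticks:
			c.samples = append(c.samples, v)
		}
	}
}

// collect must only be called after run has returned.
func (c *collector) collect() []int {
	return append([]int(nil), c.samples...)
}

// demo feeds three samples, stops the collector and reads the result.
func demo() []int {
	c := &collector{}
	ticks := make(chan int) // unbuffered: a send returns once run received it
	ctx, cancel := context.WithCancel(context.Background())

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		c.run(ctx, ticks)
	}()

	for i := 1; i <= 3; i++ {
		ticks <- i
	}
	cancel()
	wg.Wait() // run has terminated, so reading the samples is now safe
	return c.collect()
}

func main() {
	fmt.Println(demo()) // [1 2 3]
}
```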
test-e2e-node for AWS is out-of-tree so that we won't need to vendor
in AWS-related packages. For this to work, some of the scripts and Go
code need to know where the k8s tree has been cloned.
So let's add an option to look up an env var, so that we can then
change to the specified directory to run some make commands.
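A sketch of such a lookup. The variable name `EXAMPLE_K8S_ROOT` and the helpers `resolveRoot` and `makeCommand` are placeholders invented for this example; setting `cmd.Dir` is used here instead of changing the process working directory.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// rootEnvVar is a placeholder name, not the variable used by the real code.
const rootEnvVar = "EXAMPLE_K8S_ROOT"

// resolveRoot returns the directory from the environment if it is set and
// non-empty, otherwise the fallback. lookup is injectable for testing.
func resolveRoot(lookup func(string) (string, bool), fallback string) string {
	if dir, ok := lookup(rootEnvVar); ok && dir != "" {
		return dir
	}
	return fallback
}

// makeCommand prepares a make invocation that runs inside root.
func makeCommand(root string, args ...string) *exec.Cmd {
	cmd := exec.Command("make", args...)
	cmd.Dir = root
	return cmd
}

func main() {
	root := resolveRoot(os.LookupEnv, ".")
	cmd := makeCommand(root, "--version")
	// The command is only printed here, not executed.
	fmt.Println("would run", cmd.Args, "in", cmd.Dir)
}
```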
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
grpclog.SetLoggerV2 is not thread-safe and may only be called before code starts
using GRPC. Calling RunCustomEtcd multiple times, for example in
k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd,
causes a data race:

    WARNING: DATA RACE
    Read at 0x00000c8e8d20 by goroutine 135612:
      k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.V()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/grpclog.go:41 +0x30
      k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.(*componentData).V()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/component.go:103 +0x4e
      k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.(*loopyWriter).run.func1()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:528 +0xf1
      runtime.deferreturn()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/runtime/panic.go:476 +0x32
      k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport.newHTTP2Client.func6()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/internal/transport/http2_client.go:442 +0x112

    Previous write at 0x00000c8e8d20 by goroutine 140228:
      k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog.SetLoggerV2()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/google.golang.org/grpc/grpclog/loggerv2.go:76 +0xc6a
      k8s.io/kubernetes/test/integration/framework.RunCustomEtcd()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/framework/etcd.go:153 +0xb89
      k8s.io/kubernetes/test/integration/apiserver.multiEtcdSetup()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:40 +0xac
      k8s.io/kubernetes/test/integration/apiserver.TestWatchCacheUpdatedByEtcd()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/apiserver/watchcache_test.go:88 +0x4a
      testing.tRunner()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/testing/testing.go:1576 +0x216
      testing.(*T).Run.func1()
          /home/prow/go/src/k8s.io/kubernetes/_output/local/.gimme/versions/go1.20.2.linux.amd64/src/testing/testing.go:1629 +0x47

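One common way to make such a process-wide setter safe to reach from repeated calls is `sync.Once`. The sketch below assumes that approach; `installLogger` is a stand-in for the real logger setup and this `runCustomEtcd` is not the framework function.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	loggerOnce  sync.Once
	loggerCalls int // counts how often the global logger was installed
)

// installLogger stands in for the non-thread-safe global setter.
func installLogger() {
	loggerCalls++
}

// runCustomEtcd may be called many times, also concurrently, but the global
// logger is installed only by the first call.
func runCustomEtcd() {
	loggerOnce.Do(installLogger)
	// ... start the etcd instance here ...
}

// demo calls runCustomEtcd from several goroutines and reports how many
// times the logger got installed.
func demo() int {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			runCustomEtcd()
		}()
	}
	wg.Wait()
	return loggerCalls
}

func main() {
	fmt.Println(demo()) // 1
}
```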
Users can pass a resource to the `kubectl events` command via the `--for` flag
if they only want events for the resource they specify.
However, `kubectl events` currently does not support fully qualified
names (e.g. `replicasets.apps`, `cronjobs.v1.batch`, etc.). This PR adds support
for this.
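To illustrate the ambiguity such names carry, here is a hedged sketch that splits an argument both as `resource.version.group` and as `resource.group`. The type `resourceRef` and function `parseResourceArg` are local to this example and are not kubectl's code; which interpretation applies would still have to be checked against the API server.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceRef is a hypothetical holder for the parts of a qualified name.
type resourceRef struct {
	Resource, Version, Group string
}

// parseResourceArg returns both possible readings of arg:
//   - withVersion: "resource.version.group", nil if there are fewer than
//     three dot-separated parts
//   - withoutVersion: "resource.group", where the group may contain dots
func parseResourceArg(arg string) (withVersion *resourceRef, withoutVersion resourceRef) {
	if parts := strings.SplitN(arg, ".", 3); len(parts) == 3 {
		withVersion = &resourceRef{Resource: parts[0], Version: parts[1], Group: parts[2]}
	}
	parts := strings.SplitN(arg, ".", 2)
	withoutVersion = resourceRef{Resource: parts[0]}
	if len(parts) == 2 {
		withoutVersion.Group = parts[1]
	}
	return withVersion, withoutVersion
}

func main() {
	for _, arg := range []string{"replicasets.apps", "cronjobs.v1.batch"} {
		gvr, gr := parseResourceArg(arg)
		if gvr != nil {
			fmt.Printf("%s: versioned=%+v unversioned=%+v\n", arg, *gvr, gr)
		} else {
			fmt.Printf("%s: unversioned=%+v\n", arg, gr)
		}
	}
}
```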
The newly added `MirrorPodWithGracePeriod when create a mirror pod and
the container runtime is temporarily down during pod termination` test
is currently flaking because in some cases when it is run there are
other pods from other tests that are still in the process of being
terminated. This results in the test failing because it asserts metrics
that assume that there is only one pod running on the node.
To fix the flake, prior to starting the test, verify that no pods exist
in the API server other than the newly created mirror pod.
Signed-off-by: David Porter <david@porter.me>
This check needs to go after any mutations. After the mutating admission chain, rest.BeforeUpdate (which is responsible for reverting updates to immutable timestamp fields, among other things) is called in the store.Update function. Without moving this check, an object can be written to etcd with only a change to its managed fields timestamp.