Contextual logging cannot be enabled manually because there is no feature gate
flag. Enabling the feature unconditionally:
- should be low risk for E2E testing
- if it fails, we want to know
- is useful to get better log output from code which already supports it
We don't want klog to print to anything other than GinkgoWriter, but it still
used os.Stderr in addition to GinkgoWriter when printing log entries with
severity >= error. Changing "stderrthreshold" fixes that.
The unit test for framework output handling didn't test klog behavior. Now it
does:
- os.Stderr is redirected, should be empty
- a new test invokes klog
The WaitFor* refactoring in 07c34eb400 had an oversight what timeout parameter
is used for calling WaitForAllPodsCondition() in WaitForPodsWithLabelRunningReady()
so the calls to WaitForPodsWithLabelRunningReady() ended up ignoring the user
provided timeout. Fix that.
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
Followup on https://github.com/kubernetes/kubernetes/pull/111846. This
particular test was left out from that PR because once it was enabled it
started failing. It was desired to merge
https://github.com/kubernetes/kubernetes/pull/111846 irrespective of
this particular test.
The failure in the test was caused due to the
`createFSGroupRequestPreHook` mock CSI driver hook function assuming
that the request object passed to it is an instance of the respective
struct, but it's actually a pointer instead. This resulted in the hook
function not fulfilling its purpose, and the so the test failed.
Fixes instances of #98213 (to ultimately complete #98213 linting is
required).
This commit fixes a few instances of a common mistake done when writing
parallel subtests or Ginkgo tests (basically any test in which the test
closure is dynamically created in a loop and the loop doesn't wait for
the test closure to complete).
I'm developing a very specific linter that detects this king of mistake
and these are the only violations of it it found in this repo (it's not
airtight so there may be more).
In the case of Ginkgo tests, without this fix, only the last entry in
the loop iteratee is actually tested. In the case of Parallel tests I
think it's the same problem but maybe a bit different, iiuc it depends
on the execution speed.
Waiting for the CI to confirm the tests are still passing, even after
this fix - since it's likely it's the first time those test cases are
executed - they may be buggy or testing code that is buggy.
Another instance of this is in `test/e2e/storage/csi_mock_volume.go` and
is still failing so it has been left out of this commit and will be
addressed in a separate one
The main purpose of this change is to update the e2e Netpol tests to use
the srandard CreateNamespace function from the Framework. Before this
change, a custom Namespace creation function was used, with the
following consequences:
* Pod security admission settings had to be enforced locally (not using
the centralized mechanism)
* the custom function was brittle, not waiting for default Namespace
ServiceAccount creation, causing tests to fail in some infrastructures
* tests were not benefiting from standard framework capabilities:
Namespace name generation, automatic Namespace deletion, etc.
As part of this change, we also do the following:
* clearly decouple responsibilities between the Model, which defines the
K8s objects to be created, and the KubeManager, which has access to
runtime information (actual Namespace names after their creation by
the framework, Service IPs, etc.)
* simplify / clean-up tests and remove as much unneeded logic / funtions
as possible for easier long-term maintenance
* remove the useFixedNamespaces compile-time constant switch, which
aimed at re-using existing K8s resources across test cases. The
reasons: a) it is currently broken as setting it to true causes most
tests to panic on the master branch, b) it is not a good idea to have
some switch like this which changes the behavior of the tests and is
never exercised in CI, c) it cannot possibly work as different test
cases have different Model requirements (e.g., the protocols list can
differ) and hence different K8s resource requirements.
For #108298
Signed-off-by: Antonin Bas <abas@vmware.com>
Introduce networking/v1alpha1 api group.
Add `ClusterCIDR` type to networking/v1alpha1 api group, this type
will enable the NodeIPAM controller to support multiple ClusterCIDRs.
Updates predicate to check for a length >=2 to avoid
the index out of bounds panic.
Signed-off-by: Edwin Xie <exie@vmware.com>
Co-authored-by: Tyler Schultz <tschultz@vmware.com>
- add feature gate
- add encrypted object and run generated_files
- generate protobuf for encrypted object and add unit tests
- move parse endpoint to util and refactor
- refactor interface and remove unused interceptor
- add protobuf generate to update-generated-kms.sh
- add integration tests
- add defaulting for apiVersion in kmsConfiguration
- handle v1/v2 and default in encryption config parsing
- move metrics to own pkg and reuse for v2
- use Marshal and Unmarshal instead of serializer
- add context for all service methods
- check version and keyid for healthz
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
Including the full information for successful tests makes the resulting XML
file too large for the 200GB limit in Spyglass when running large jobs (like
scale testing).
The original solution from https://github.com/kubernetes/kubernetes/pull/111627
broke JUnit reporting in other test suites, in particular
test/e2e_node. Keeping the code inside the framework ensures that all test
suites continue to have the JUnit reporting.
AfterReadingAllFlags is a good place to set this up because all test suites
using the test context are expected to call it before running tests and after
parsing flags.
Removing the ReportEntries added by ginkgo.By from all test reports usually
avoids the `system-err` part in the JUnit file, which in Spyglass avoids
the extra "open stdout" button.
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Co-authored-by: Dave Chen <dave.chen@arm.com>
It is used to request that a pod runs in a unique user namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Co-authored-by: Rodrigo Campos <rodrigoca@microsoft.com>
Including the full information for successful tests makes the resulting XML
file too large for the 200GB limit in Spyglass when running large jobs (like
scale testing).
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Co-authored-by: Dave Chen <dave.chen@arm.com>
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)PreemptedByScheduler (Pod preempted by kube-scheduler)
We now partly drop the support for seccomp annotations which is planned
for v1.25 as part of the KEP:
https://github.com/kubernetes/enhancements/issues/135
Pod security policies are not touched by this change and therefore we
have to keep the annotation key constants.
This means we only allow the usage of the annotations for backwards
compatibility reasons while the synchronization of the field to
annotation is no longer supported. Using the annotations for static pods
is also not supported any more.
Making the annotations fully non-functional will be deferred to a
future release.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
this is specifically so that we have more structured information when
the metric is parsed and stored as a stable metric. This change does not
change the name of the actual metrics.
Change-Id: I861482401ad9a0ae12306b93abf91d6f76d7a407
In order to promote kubectl alpha events to beta,
it should at least support flags which is already
supported by kubectl get events as well as new flags.
This PR adds;
--output: json|yaml support and does essential refactorings to
integrate other printing options easier in the future.
--no-headers: kubectl get events can hide headers when this flag is set for default printing.
Adds this ability to hide headers also for kubectl alpha events.
This flag has no effect when output is json or yaml for both commands.
--types: This will be used to filter certain events to be printed and
discard others(default behavior is same with --event=Normal,Warning).
Also add entrypoints for verifying and updating a test file for easier
debugging. This is considerably faster than running the stablity checks
against the entire Kubernetes codebase.
Change-Id: I5d5e5b3abf396ebf1317a44130f20771a09afb7f
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The "pause" pods that are being run in the scheduling tests are
sometimes launched in system namespaces. Therefore even if a test
is considered to be running on a "baseline" Pod Security admission
level, its "baseline" pods would fail to run if the global PSa
enforcement policy is set to "restricted" - the system namespaces
have no PSa labels.
The "pause" pods run by this test can actually easily run with
"restricted" security context, and so this patch turns them
into just that.
The test breaks the controllers that depend on api services to be resolvable,
per example, the namespace controller, that is heavily used by the e2e framework to clean the environment
NPD can run in "standalone" or "daemonset" mode in cluster GCE tests.
For standalone mode, NPD runs as a systemd unit, daemonset mode it runs
as pod.
If NPD is running in standalone mode, as soon as we detect that the npd
service does not exist with `systemctl status`, we should skip the rest
of the checks such as looking at the journalctl logs or pid.
Signed-off-by: David Porter <david@porter.me>
Fix rollout history bug where the latest revision was
always shown when requesting a specific revision and
specifying an output.
Add unit and integration tests for rollout history.
As what suggested by Ginkgo migration guide, `Measure` node was
deprecated and replaced with `It` node which creates `gmeasure.Experiment`.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Some operations with Azure Disk volumes can be arbitrarily slow
sometimes. This causes CI flakes that are hard to debug.
Bumping the timeouts has sown to improve or eliminate this issue.
Ginkgo is now writing the JUnit file itself. The -report-dir parameter is used
as fallback for enabling JUnit output in case that users haven't migrated to
the new -junit-report parameter.
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: Dave Chen <dave.chen@arm.com>
Besides, the using of method might lead to a `concurrent map writes`
issue per the discussion here: https://github.com/onsi/ginkgo/issues/970
Signed-off-by: Dave Chen <dave.chen@arm.com>
Default timeout setting has been reduced from `24h` down to `1h` in
Ginkgo V2, but for some long running test this is too short.
How long to abort the test was controlled by the the linux command `timeout`
in V1. e.g. `'timeout -k 30s 150m ...`, and is configured in the file
like `sig-network-misc.yaml`.
Set the timeout manually for Ginkgo V2 to avoid the early aborting.
Signed-off-by: Dave Chen <dave.chen@arm.com>
`FullStackTrace` is not available in v2 if no exception found
with test execution.
The change is needed for conformance test's spec validation.
pls see: https://github.com/onsi/ginkgo/issues/960 for details.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The change is needed to verify the conformance test spec, as this
is verified in `verify-conformance-yaml.sh`.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Full stack traces are on by default. The approach for collecting results is
different. Tests run in their own goroutine, therefore runTests is no longer
part of their callstack. To cover stack traces with more than one entry, a new
test case gets added with a separate helper function.
Gomega object formatting now includes the type.
This removes the last remaining reference to Ginkgo v1.
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: Dave Chen <dave.chen@arm.com>
- update all the import statements
- run hack/pin-dependency.sh to change pinned dependency versions
- run hack/update-vendor.sh to update go.mod files and the vendor directory
- update the method signatures for custom reporters
Signed-off-by: Dave Chen <dave.chen@arm.com>
Seemingly on slow connections if the response to /configz request was
chunked the kubectl proxy was terminated before the response body was
received and read, causing unexpected EOF errors. This patch changes the
configz polling code so that the whole response body is read before
closing the proxy connection.
The test validates the following endpoints
- createAppsV1NamespacedControllerRevision
- deleteAppsV1CollectionNamespacedControllerRevision
- deleteAppsV1NamespacedControllerRevision
- listAppsV1ControllerRevisionForAllNamespaces
- patchAppsV1NamespacedControllerRevision
- readAppsV1NamespacedControllerRevision
- replaceAppsV1NamespacedControllerRevision
As outlined in the KEP, we now graduate the Kubelet feature to beta
which means that it is enabled by default. The corresponding Kubelet
flag still defaults to `false`, but we now have the chance to e2e test
the feature by using a new serial test case.
KEP: https://github.com/kubernetes/enhancements/issues/2413
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
e2e test validates the following 3 extra endpoints
- deleteCoreV1CollectionNamespacedResourceQuota
- listCoreV1ResourceQuotaForAllNamespaces
- patchCoreV1NamespacedResourceQuota
As described in 8c76845b03 ("test/e2e/network: fix a bug in the hostport e2e
test") if we have two pods with the same hostPort, hostIP, but different
protocols, a CNI may be buggy and decide to forward all traffic only to one of
these pods. Add a check that we receiving requests from different pods.
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Making the LoggingConfiguration part of the versioned component-base/config API
had the theoretic advantage that components could have offered different
configuration APIs with experimental features limited to alpha versions (for
example, sanitization offered only in a v1alpha1.KubeletConfiguration). Some
components could have decided to only use stable logging options.
In practice, this wasn't done. Furthermore, we don't want different components
to make different choices regarding which logging features they offer to
users. It should always be the same everywhere, for the sake of consistency.
This can be achieved with a saner Go API by dropping the distinction between
internal and external LoggingConfiguration types. Different stability levels of
indidividual fields have to be covered by documentation (done) and potentially
feature gates (not currently done).
Advantages:
- everything related to logging is under component-base/logs;
previously this was scattered across different packages and
different files under "logs" (why some code was in logs/config.go
vs. logs/options.go vs. logs/logs.go always confused me again
and again when coming back to the code):
- long-term config and command line API are clearly separated
into the "api" package underneath that
- logs/logs.go itself only deals with legacy global flags and
logging configuration
- removal of separate Go APIs like logs.BindLoggingFlags and
logs.Options
- LogRegistry becomes an implementation detail, with less code
and less exported functionality (only registration needs to
be exported, querying is internal)
The hostport e2e test (sonobuoy run --e2e-focus 'validates that there is no
conflict between pods with same hostPort but different hostIP and protocol')
checks, in particular, that two pods with the same hostPort, the same hostIP,
but different L4 protocols can coexist on one node.
In order to do this, the test creates two pods with the same hostIP:hostPort,
one TCP-based, another UDP-based. However, both pods listen on both protocols:
netexec --http-port=8080 --udp-port=8080
This can happen that a CNI which doesn't distinguish between TCP and UDP
hostPorts forwards all traffic, TCP or UDP, to the same pod. As this pod
listens on both protocols it will reply to both requests, and the test
will think that everything works properly while the second pod is indeed
disconnected. Fix this by executing different commands in different pods:
TCP: netexec --http-port=8080 --udp-port=-1
UDP: netexec --http-port=8008 --udp-port=8080
The TCP pod now doesn't listen on UDP, and the UDP pod doesn't listen on TCP on
the target hostPort. The UDP pod still needs to listen on TCP on another port
so that a pod readiness check can be made.
Use a watch to detect invalid pod status updates in graceful node
shutdown node e2e test. By using a watch, all pod updates will be
captured while the previous logic required polling the api-server which
could miss some intermediate updates.
Signed-off-by: David Porter <david@porter.me>
This is useful in environments where the Deployment image is replaced
by another image from an internal registry (via fixture). In that case,
the populator running in populate mode should use the same image as the
populator running in controller mode.