The `make` rules which auto-generate some of our API stuff are
incredibly baroque, and hard to maintain. They were originally added on
the assumption that we would stop checking generated files into git.
Since then we have moved away from that goal, and the worst problems
with generated files have been resolved.
Reasons to kill this:
* It is slow on every build, as opposed to just being slow when running
the generators. It is even slow to calculate that there's nothing to
update.
* Most development work doesn't involve changing APIs.
* It only covers about half (or less) of the generated code, and making
it cover more would be even slower.
* Approximately 1 person knows how this all works.
* We have CI to make sure changes do not get merged without updating
this code.
* We have corner cases where this does the WRONG thing and tracking
those down is ugly and hard in perpetuity.
So this commit puts all the same logic that WAS in the
Makefile.generated_files into update-codegen.sh.
I do not love this script, especially WRT sub-packages, but I am trying
not to boil the ocean. I hope to follow up with some more cleanups over
time.
I have tested this manually and with the scripts and it still seems to
catch errors properly.
This includes a change to kube::util::read-array to make it not unset
variables and not over-write non-array variables.
A staging repo which just got created with only the doc.go file in it won't
have any dependencies yet, which caused the script to fail because the
dependency files didn't get created:
+++ [0926 14:33:22] go.mod: tidying
cat: /tmp/update-vendor.1VTv/group_replace.ZbIT/go.mod.require_direct.tmp: No such file or directory
!!! [0926 14:33:23] Call tree:
!!! [0926 14:33:23] 1: hack/update-vendor.sh:354 group_directives(...)
- Moves kms proto apis to the staging repo
- Updates generate and verify kms proto scripts to check staging repo
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
The new support in ginkgo for progress reports while a test runs dumps
information about where a test is stuck when it runs too long. This can provide
additional insights into what the test is waiting for.
For the Kubernetes jobs using ginkgo-e2e.sh, such dumps are now enabled after
300 seconds and then get repeated every 20 seconds. The initial delay is
intentionally the same as for warning about a slow test. The rationale is that
such test runtimes are unexpected and may need further information to diagnose
why they are slow.
With -ginkgo.source-root, Ginkgo is able to locate the Kubernetes source code
and display small source code snippets for functions that are related to the
test, determined through a heuristic that assumes that all files under the test
suite are for the tests in it.
Set intercept mode to none will help to reveal more information when the
test hangs,
- https://github.com/onsi/ginkgo/issues/970
or circumvent cases where the code grabbing the stdout/stderr pipe
is not under the framework control and may cause hangs,
- https://github.com/onsi/ginkgo/issues/851
The flag `output-interceptor-mode` is set to `none` as we were trying to
figure out of the rootcase of the test flaky, it's only intended for debugging.
- https://github.com/kubernetes/kubernetes/issues/111086
But this set also has some side effect, since it will turn off stdout/stderr
capture completely, any output to stdout/stderr will be lost.
Now that the root cause is not caused by Ginkgo bump nor how the intercept
mode was set, we'd better to follow the default value.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Some scripts and tools still relied on the deprecated flags, the ones
which are about to be removed.
This is intentionally not a complete removal of all those flags in the entire
repo. This would lead to much more code churn also in places where commands
still accept the flags because they use klog directly.
Introduce networking/v1alpha1 api group.
Add `ClusterCIDR` type to networking/v1alpha1 api group, this type
will enable the NodeIPAM controller to support multiple ClusterCIDRs.
- add feature gate
- add encrypted object and run generated_files
- generate protobuf for encrypted object and add unit tests
- move parse endpoint to util and refactor
- refactor interface and remove unused interceptor
- add protobuf generate to update-generated-kms.sh
- add integration tests
- add defaulting for apiVersion in kmsConfiguration
- handle v1/v2 and default in encryption config parsing
- move metrics to own pkg and reuse for v2
- use Marshal and Unmarshal instead of serializer
- add context for all service methods
- check version and keyid for healthz
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
This change is to promote local storage capacity isolation feature to GA
At the same time, to allow rootless system disable this feature due to
unable to get root fs, this change introduced a new kubelet config
"localStorageCapacityIsolation". By default it is set to true. For
rootless systems, they can set this configuration to false to disable
the feature. Once it is set, user cannot set ephemeral-storage
request/limit because capacity and allocatable will not be set.
Change-Id: I48a52e737c6a09e9131454db6ad31247b56c000a
This applies to all jobs using hack/ginkgo-e2e.sh. This is done because
Spyglass does not render the escape sequences, making test output harder to
read.
It is done here because then we don't need to set GINKGO_NO_COLOR in all the
different Prow job configs.
Ginkgo v1 had a much longer default test timeout, in v2 this
switched to being 1 hour. This is not long enough to run many of our
suites.
Here we copy the backwards compatibility that is used by
hack/gingo-e2e.sh to unbreak serial pipelines.
Ginkgo has been migrated to V2, add this to unwanted dependencies
so that it won't be shown up as a dep again in the future.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The alias for vendor/github.com/onsi/ginkgo/ginkgo ensures that code like
30e99cb2a9/experiment/kind-conformance-image-e2e.sh (L110)
continues to work. The one without "vendor/" is there just in case that it
was used because it also worked.
Long term, "ginkgo" is a nicer, version independent alias. It gets used
internally to avoid future churn and gets documented also publicly in the
Makefile help.
The caveat is that there's no guarantee that a future v3 CLI will be compatible
with current invocations. But the most common usage is through
hack/ginkgo-e2e.sh, which can deal with such differences.
Default timeout setting has been reduced from `24h` down to `1h` in
Ginkgo V2, but for some long running test this is too short.
How long to abort the test was controlled by the the linux command `timeout`
in V1. e.g. `'timeout -k 30s 150m ...`, and is configured in the file
like `sig-network-misc.yaml`.
Set the timeout manually for Ginkgo V2 to avoid the early aborting.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The change is needed for `verify-e2e-test-ownership.sh`.
The `jq` is re-defined since the structure of test spec
is different with v1 and the stacktrace related validation
is not available, e.g. `package` and `func`.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The test/e2e directory contains several unit tests that should run as part of
"make test":
./test/e2e/chaosmonkey/chaosmonkey_test.go
./test/e2e/storage/external/external_test.go
./test/e2e/storage/utils/utils_test.go
./test/e2e/framework/log_test.go
./test/e2e/framework/testfiles/testfiles_test.go
./test/e2e/framework/timer/timer_test.go
./test/e2e/framework/node/wait_test.go
./test/e2e/framework/pod/resource_test.go
./test/e2e/framework/config/config_test.go
./test/e2e/framework/ingress/ingress_utils_test.go
./test/e2e/framework/providers/gce/firewall_test.go
Because they were excluded by "./test/e2e/*", some of them became outdated.
./test/e2e/e2e_test.go is the only test that needs to be excluded because it is
the E2E test suite that depends on a functional cluster.
The following investigation occurred during development.
Add TimingHistogram impl that shares lock with WeightedHistogram
Benchmarking and profiling shows that two layers of locking is
noticeably more expensive than one.
After adding this new alternative, I now get the following benchmark
results.
```
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogram$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogram-16 22232037 52.79 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 1.404s
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogram$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogram-16 22190997 54.50 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 1.435s
```
and
```
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogramDirect$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramDirect-16 28863244 40.99 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 1.890s
(base) mspreitz@mjs12 kubernetes %
(base) mspreitz@mjs12 kubernetes %
(base) mspreitz@mjs12 kubernetes % go test -benchmem -run=^$ -bench ^BenchmarkTimingHistogramDirect$ k8s.io/component-base/metrics/prometheusextension
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramDirect-16 27994173 40.37 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 1.384s
```
So the new implementation is roughly 20% faster than the original.
Add overlooked exception, rename timingHistogram to timingHistogramLayered
Use the direct (one mutex) style of TimingHistogram impl
This is about a 20% gain in CPU speed on my development machine, in
benchmarks without lock contention. Following are two consecutive
trials.
(base) mspreitz@mjs12 prometheusextension % go test -benchmem -run=^$ -bench Histogram .
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramLayered-16 21650905 51.91 ns/op 0 B/op 0 allocs/op
BenchmarkTimingHistogramDirect-16 29876860 39.33 ns/op 0 B/op 0 allocs/op
BenchmarkWeightedHistogram-16 49227044 24.13 ns/op 0 B/op 0 allocs/op
BenchmarkHistogram-16 41063907 28.82 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 5.432s
(base) mspreitz@mjs12 prometheusextension % go test -benchmem -run=^$ -bench Histogram .
goos: darwin
goarch: amd64
pkg: k8s.io/component-base/metrics/prometheusextension
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkTimingHistogramLayered-16 22483816 51.72 ns/op 0 B/op 0 allocs/op
BenchmarkTimingHistogramDirect-16 29697291 39.39 ns/op 0 B/op 0 allocs/op
BenchmarkWeightedHistogram-16 48919845 24.03 ns/op 0 B/op 0 allocs/op
BenchmarkHistogram-16 41153044 29.26 ns/op 0 B/op 0 allocs/op
PASS
ok k8s.io/component-base/metrics/prometheusextension 5.044s
Remove layered implementation of TimingHistogram
This commit cleans up references to the old kubernetes-node-e2e-images
project. In the process it removes the `LIST_IMAGES` mode as listing
large numbers of public cloud projects is not particularly useful, and
has been somewhat broken for a long period of time - as we defaulted
launching a VM to a different project than listing.
This commit undoes the GODEBUG=x509sha1=1 workaround.
The problem should be fixed in Go 1.18.1 now.
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
* Introduce networking/v1alpha1 api, ClusterCIDRConfig type
Introduce networking/v1alpha1 api group.
Add `ClusterCIDRConfig` type to networking/v1alpha1 api group, this type
will enable the NodeIPAM controller to support multiple ClusterCIDRs.
* Change ClusterCIDRConfig.NodeSelector type in api
* Fix review comments for API
* Update ClusterCIDRConfig API Spec
Introduce PerNodeHostBits field, remove PerNodeMaskSize
Over time the size of our junit xml has exploded to the point where
test-grid fails to process them. We still have the original/full
*.stdout files from where the junit xml files are generated from so the
junit xml files need NOT have the fill/exact output for
processing/display. So let us prune the large messages with an
indicator that we have "[... clipped...]" some of the content so folks
can see that they have to consult the full *.stdout files.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Running logcheck as part of golangci-lint has several advantages:
- faster checking because finding files and parsing is shared
with other linters
- gets rid of the complex and buggy
hack/verify-structured-logging.sh (https://github.com/kubernetes/kubernetes/issues/106746)
- support for // nolint:logcheck
- works with Go 1.18
This should fix the following error when running
./hack/update-generated-stable-metrics.sh:
'go get' is no longer supported outside a module.
To build and install a command, use 'go install' with a version,
like 'go install example.com/cmd@latest'
For more information, see https://golang.org/doc/go-get-install-deprecation
or run 'go help get' or 'go help install'.
Using `go get` to download gopkg.in/yaml.v2 package into
KUBE_EXTRA_GOPATH directory no longer works. Interestingly, main repo
already has gopkg.in/yaml.v2@v2.4.0, same version that was installed by
that go get.
I guess that GOPATH with multiple elements no longer works either,
and since this code was the only user of KUBE_EXTRA_GOPATH, let's remove
it as well.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
- Also update test-cmd.sh to pass a signing ca to the kube controller
manager, so CSRs work properly in integration tests.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This has somewhat subtle implications. For a concrete example, this
changes the `-trimpath` behavior from only affecting the named pkg to
affecting all pkgs, which broke ginkgo, which seems to try to strip its
own `pwd` prefix. But since that runs in run-in-gopath, and not in
KUBE_ROOT, it fails to strip anything.
e.g.
before this, strings in the binary would be like
/home/user/kube/_output/local/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/...
Ginkgo would find its own root as
/home/user/kube/_output/local/go/src/k8s.io/kubernetes/
so it would produce
vendor/github.com/onsi/ginkgo/...
in logs.
after this, strings in the binary strip the KUBE_ROOT and be like:
_output/local/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/...
Ginkgo would find its own root as
/home/user/kube/_output/local/go/src/k8s.io/kubernetes/
so it would not strip anything, and produce
_output/local/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/...
in logs.
before:
```
$ make generated_files
+++ [0226 13:42:17] Building go targets for linux/amd64:
hack/make-rules/helpers/go2make
> non-static build: k8s.io/kubernetes/hack/make-rules/helpers/go2make
```
after:
```
$ make generated_files
+++ [0226 14:30:08] Building go targets for linux/amd64
k8s.io/kubernetes/hack/make-rules/helpers/go2make (non-static)
```
When running an integration test that measures performance, like for example
test/integration/scheduler_perf, running etcd with debug level output is
undesirable because it creates additional load on the system and isn't
realistic.
The default is still "debug", but ETCD_LOGLEVEL=warn can be used to override
that.
Since removing dockershim, `make test-e2e-node` will fail by default as
there is no provided container runtime endpoint.
This commit defaults us to using containerd's default socket path as the
local test target, rather than failing hard.
When CRDs are deleted, discovery local cache is not invalidated.
This brings about `resource not found` error when new CRD with same name is created
with different fields(ie. changing scope from cluster-wide to namespaced).
Because this already deleted CRD still stays in serverresources.json and kubectl tries to use it.
This local cached files have 10 minutes TTL. After deletion, if user waits 10 minutes,
files will be expired and deleted and there will be no errors. However, 10 minutes is a long time
and cache needs to be invalidated after deletion occurs.
This PR adds a document into delete command by noting that there might be a need to invalidate discovery
cache when CRD is deleted. In addition to that this PR adds a test to catch this behavior.