It provides more readable output and has additional APIs for using it inside a
unit test. goleak.IgnoreCurrent is needed to filter out the goroutine that gets
started when importing go.opencensus.io/stats/view.
In order to handle background goroutines that get created on demand and cannot
be stopped (like the one for LogzHealth), a helper function ensures that those
are running before calling goleak.IgnoreCurrent. Keeping those goroutines
running is not a problem and thus not worth the effort of adding new APIs to
stop them.
Other goroutines are genuine leaks for which no fix is available. Those get
suppressed via IgnoreTopFunction, which works as long as that function
is unique enough.
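
The filtering idea described above can be sketched with the standard library
alone. This is not how goleak is implemented; it is a rough approximation that
assumes the text format printed by runtime.Stack. The names goroutineInfo,
listGoroutines, snapshotIDs, findLeaks and blockForever are hypothetical.
snapshotIDs plays the role of "ignore what is already running", and the
ignoredTop parameter plays the role of suppressing by top function.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// goroutineInfo is a hypothetical, simplified record of one goroutine.
type goroutineInfo struct {
	id      string // numeric ID as printed by the runtime
	topFunc string // function on top of the stack
}

// listGoroutines parses the runtime.Stack dump into one record per goroutine.
// Assumption: the dump fits into 1 MiB; a larger dump would be truncated.
func listGoroutines() []goroutineInfo {
	buf := make([]byte, 1<<20)
	buf = buf[:runtime.Stack(buf, true)]
	var out []goroutineInfo
	for _, block := range strings.Split(strings.TrimSpace(string(buf)), "\n\n") {
		lines := strings.Split(block, "\n")
		fields := strings.Fields(lines[0]) // e.g. "goroutine 7 [chan receive]:"
		if len(lines) < 2 || len(fields) < 2 || fields[0] != "goroutine" {
			continue
		}
		top := lines[1]
		if i := strings.LastIndex(top, "("); i >= 0 {
			top = top[:i] // strip the argument list
		}
		out = append(out, goroutineInfo{id: fields[1], topFunc: top})
	}
	return out
}

// snapshotIDs records the goroutines that exist right now so that they can
// be ignored later.
func snapshotIDs() map[string]bool {
	ids := map[string]bool{}
	for _, g := range listGoroutines() {
		ids[g.id] = true
	}
	return ids
}

// findLeaks returns goroutines that are neither part of the baseline nor
// have one of the ignored functions on top of their stack.
func findLeaks(baseline map[string]bool, ignoredTop []string) []goroutineInfo {
	var leaks []goroutineInfo
	for _, g := range listGoroutines() {
		if baseline[g.id] {
			continue
		}
		ignored := false
		for _, f := range ignoredTop {
			if g.topFunc == f {
				ignored = true
			}
		}
		if !ignored {
			leaks = append(leaks, g)
		}
	}
	return leaks
}

// blockForever stands in for a background goroutine that cannot be stopped.
func blockForever(started chan<- struct{}, stop <-chan struct{}) {
	close(started)
	<-stop
}

func main() {
	baseline := snapshotIDs()
	started, stop := make(chan struct{}), make(chan struct{})
	go blockForever(started, stop)
	<-started
	for _, g := range findLeaks(baseline, nil) {
		fmt.Printf("unexpected goroutine %s with %s on top of the stack\n", g.id, g.topFunc)
	}
	close(stop)
}
```

Matching on the top function alone is coarse, which is why the suppression
only works as long as that function is unique enough.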
Example output for the leak fixed in https://github.com/kubernetes/kubernetes/pull/115423:
E0202 09:30:51.641841 74789 etcd.go:205] "EtcdMain goroutine check" err=<
found unexpected goroutines:
[Goroutine 4889 in state chan receive, with k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop on top of the stack:
goroutine 4889 [chan receive]:
k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop(0xc0076183c0)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/watch/mux.go:268 +0x65
created by k8s.io/apimachinery/pkg/watch.NewBroadcaster
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/watch/mux.go:77 +0x116
>
Requests can accumulate errors with no obvious indication, e.g. if
their primary purpose is to construct a URL: URL() itself doesn't
return an error if r.err is non-nil.
Instead of changing URL() to return an error, which has quite a large
impact, add an Error() function and indicate on URL() that it should
be checked.
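
A minimal sketch of the pattern, assuming a stripped-down builder: the request
type, newRequest and Name below are hypothetical stand-ins, not the client-go
implementation. Only the relationship between URL() and Error() follows the
description above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// request is a hypothetical request builder that accumulates errors
// instead of returning them from each builder call.
type request struct {
	base *url.URL
	name string
	err  error
}

func newRequest(base string) *request {
	u, err := url.Parse(base)
	return &request{base: u, err: err}
}

// Name sets the resource name. An invalid name is recorded in r.err,
// but the call itself gives no indication of failure.
func (r *request) Name(name string) *request {
	if r.err != nil {
		return r
	}
	if name == "" {
		r.err = errors.New("resource name may not be empty")
		return r
	}
	r.name = name
	return r
}

// URL returns the URL built so far. It does not report accumulated
// errors, so callers should check Error() before using the result.
func (r *request) URL() *url.URL {
	if r.base == nil {
		return &url.URL{}
	}
	u := *r.base
	if r.name != "" {
		u.Path += "/" + r.name
	}
	return &u
}

// Error returns any error accumulated while building the request.
func (r *request) Error() error {
	return r.err
}

func main() {
	r := newRequest("https://example.invalid/api").Name("")
	if err := r.Error(); err != nil {
		fmt.Println("not using URL:", err)
		return
	}
	fmt.Println(r.URL())
}
```

Keeping the URL() signature unchanged means existing callers still compile;
the cost is that the check is opt-in.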
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Currently `kubectl debug` only supports passing resource names on the command line.
However, users might want to pass resources in files with the `-f` flag, as in
other kubectl commands.
This PR adds this ability.
* feat(debug): add more profiles
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
* feat(debug): implement several debugging profiles
Including `general`, `baseline` and `restricted`.
I plan to add more profiles afterwards, but I'd like to get early
reviews.
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
* test: add some basic tests
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
* chore: add some helper functions
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
* ensure pod copies always get their probes cleared
not wanting probes to be present is something we want
for all the debug profiles; so an easy place to implement
this is at the time of pod copy generation.
* ensure debug container in pod copy is added before the profile application
The way that the container list modification was deferred causes the
debug container to be added after the profile applier runs. We now
make sure to have the container list modification happen before
the profile applier runs.
* make switch over pod copy, ephemeral, or node more clear
* use helper functions
added a helper function to modify a container out of a list that
matches the provided container name.
also added a helper function that adds capabilities to container
security.
* add tests for the debug profiles
* document new debugging profiles in command line help text
* add file header to profiles_test.go
* remove URL to KEP from help text
* move probe removal to the profiles
* remove mustNewProfileApplier in tests
* remove extra blank line from import block
* remove isPodCopy helper func
* switch baselineProfile to using the modifyEphemeralContainer helper
* rename addCap to addCapability, and don't do deep copy
* fix godoc on modifyEphemeralContainer
* export DebugOptions.Applier for extensibility
* fix unit test
* fix spelling on overriden
* remove debugStyle facilities
* inline setHostNamespace helper func
* remove modifyContainer, modifyEphemeralContainer, and remove probes
their logic has been in-lined at call sites
* remove DebugApplierFunc convenience facility
* fix baseline profile implementation
it shouldn't have SYS_PTRACE, based on
https://github.com/kubernetes/enhancements/tree/master/keps/sig-cli/1441-kubectl-debug#profile-baseline
* remove addCapability helper, in-lining at call sites
* address Arda's code review comments
1 use Bool instead of BoolPtr (now deprecated)
2 tweak for loop to continue when container name is not what we expect
3 use our knowledge on how the debug container is generated to simplify
our modification to the security context
4 use our knowledge on how the pod for node debugging is generated to no
longer explicitly set the pod's HostNetwork, HostPID and HostIPC fields to
false
* remove tricky defer in generatePodCopyWithDebugContainer
* provide helper functions to make debug profiles more readable
* add note to remind people about updating --profile's help text when adding new profiles
* Implement helper functions with names that improve readability
* add styleUnsupported to replace debugStyle(-1)
* fix godoc on modifyContainer
* drop style prefix from debugStyle values
* put VisitContainers in podutils & use that from debug
* cite source for ContainerType and VisitContainers
* pull in AllContainers ContainerType value
* have VisitContainer take pod spec rather than pod
* in-line modifyContainer
* unexport helper funcs
* put debugStyle at top of file
* merge profile_applier.go into profile.go
* tweak dropCapabilities
* fix allowProcessTracing & add a test for it
* drop mask param from helper funcs, since we can already unambiguously identify the container by name
* fix grammar in code comment
---------
Signed-off-by: Jian Zeng <anonymousknight96@gmail.com>
Co-authored-by: Jian Zeng <anonymousknight96@gmail.com>
And also in the terminating-namespace log output. This makes it
easier to track down drain-blocking pods, without having to hunt
around in earlier logs for 'evicting pod ...' messages. Before this
change, caller logs might look like:
evicting pod {namespace}/{name}
...
error when waiting for pod "{name}" terminating: global timeout reached: 20s
With this change, they will look like:
evicting pod {namespace}/{name}
...
error when waiting for pod "{name}" in namespace "{namespace}" to terminate: global timeout reached: 20s
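
A small sketch of how such a message can be built while keeping the underlying
cause available to callers. The helper name waitError and the variable
errTimeout are hypothetical and not taken from the drain code; only the
message text follows the example above.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout stands in for the underlying cause (assumed for illustration).
var errTimeout = errors.New("global timeout reached: 20s")

// waitError is a hypothetical helper that includes both the pod name and
// its namespace in the message and wraps the cause with %w.
func waitError(namespace, name string, cause error) error {
	return fmt.Errorf("error when waiting for pod %q in namespace %q to terminate: %w",
		name, namespace, cause)
}

func main() {
	err := waitError("default", "web", errTimeout)
	fmt.Println(err)
	// The wrapped cause can still be detected by callers.
	fmt.Println(errors.Is(err, errTimeout))
}
```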