Automatic merge from submit-queue
Fix resync goroutine leak in ListAndWatch
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, read our contributor guidelines https://github.com/kubernetes/kubernetes/blob/master/CONTRIBUTING.md and developer guide https://github.com/kubernetes/kubernetes/blob/master/docs/devel/development.md
2. If you want *faster* PR reviews, read how: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/faster_reviews.md
3. Follow the instructions for writing a release note: https://github.com/kubernetes/kubernetes/blob/master/docs/devel/pull-requests.md#release-notes
-->
**What this PR does / why we need it**:
This PR fixes resync goroutine leak in ListAndWatch function
**Which issue this PR fixes** _(optional, in `fixes #<issue number>(, #<issue_number>, ...)` format, will close that issue when PR gets merged)_: fixes #
fixes#35015
**Special notes for your reviewer**:
**Release note**:
<!-- Steps to write your release note:
1. Use the release-note-* labels to set the release note state (if you have access)
2. Enter your extended release note in the below block; leaving it blank means using the PR title as the release note. If no release note is required, just write `NONE`.
-->
``` release-note
```
Previously, we had no way to stop resync goroutine when ListAndWatch
returned. goroutine leaked every time ListAndWatch returned, for
example, with error. This commit adds another channel to signal that
resync goroutine should exit when ListAndWatch returns.
Previously, we had no way to stop resync goroutine when ListAndWatch
returned. goroutine leaked every time ListAndWatch returned, for
example, with error. This commit adds another channel to signal that
resync goroutine should exit when ListAndWatch returns.
A recent change tries to separate resync and relist. The motivation
was to avoid triggering relist when a resync is required.
However, the change is not effective since it stops the watcher. As hongchao
mentioned in the original comment, today's storage interface will not deliever
any progress notification to the watch chan. So any watcher that does not receive
events for the last few seconds will not be able to catch up from the previous
index after a hard close since the index of the last received event is out of
the cache window inside etcd2.
This pull request tries to fix this issue by not stoping watcher when a resync is
required.
Reflectors started from goroutines are broken because Go doesn't allow
runtime.Callers to see the spawning goroutine. Do a best effort parse of
the call stack for now.
Add logging so that we can easily see which reflectors processes launch,
and measure in logs the frequency of sync intervals.
Support namespacing in cache.Store by framing the interface functions
around interface{} and providing a key function to each Store implementation.
Implementation of a fix for #2294.
Watch depends on long running connections, which intervening proxies
may break without the control of the remote server. Specific errors
handled are io.EOF, io.EOF wrapped by *url.Error, http connection
reset errors (caused by race conditions in golang http code), and
connection reset by peer (simply tolerated).
Watches are often established via long-running HTTP GET requests which
will inevitably time out during the normal course of operations. When
the watches time out, an io.EOF or an io.ErrUnexpectedEOF will bubble
up to client components such as StreamWatcher and Reflector. Treat EOF
as a clean watch termination. Treat ErrUnexpectedEOF as a less-clean
but non-fatal watch termination and log the event at the warning level.
This greatly reduces the amount of log noise generated during what is
ultimately normal operation, and adds the flexibility for the operator
to make a distinction between the EOF conditions if so desired (by
adjusting the logging level).