Automatic merge from submit-queue (batch tested with PRs 38181, 38128, 36711)
etcd2: have prefix always prepended
The prefix issue is discussed in #36290.
This PR fixes the etcd2 behavior separately.
**release note**:
```
etcd2: have prefix always prepended
```
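A minimal sketch of what "always prepended" means, assuming the key is built by joining the configured prefix and the caller's key. `prefixKey` is a hypothetical name for illustration, not the helper touched by this PR:

```go
package main

import (
	"fmt"
	"path"
)

// prefixKey is a hypothetical helper. It always joins prefix and key,
// even when key already happens to start with prefix, so the result
// never depends on what the caller's key looks like.
func prefixKey(prefix, key string) string {
	return path.Join(prefix, key)
}

func main() {
	// Ordinary key: the prefix is prepended once.
	fmt.Println(prefixKey("/registry", "/pods/default/nginx"))
	// Key that already starts with the prefix: still prepended.
	fmt.Println(prefixKey("/registry", "/registry/pods"))
}
```

With this rule a resource whose own path begins with the prefix string is stored under the prefix like any other key, instead of being treated as already prefixed.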
Automatic merge from submit-queue (batch tested with PRs 37032, 38119, 38186, 38200, 38139)
etcd2: remove unnecessary PrevValue in SetOption
ref: https://github.com/kubernetes/kubernetes/issues/37994
Summary:
- PrevValue is set in the HTTP header, and a large value (>1MB) could exceed the size check limit.
- We don't actually need PrevValue, since we already set PrevIndex in SetOptions and each PrevIndex corresponds to exactly one PrevValue.
I don't think we need extra tests for this. There is already a test for GuaranteedUpdate covering its use cases.
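To illustrate why the index alone is enough, here is a small in-memory sketch of compare-and-swap on a modification index. The `store` type and its `set` method are hypothetical stand-ins, not the etcd client API. The point is that the precondition is one integer, so its size does not grow with the stored value:

```go
package main

import (
	"errors"
	"fmt"
)

// node is one stored value plus the index at which it was last written.
type node struct {
	value         string
	modifiedIndex uint64
}

// store is a toy, single-threaded stand-in for a versioned key-value store.
type store struct {
	index uint64
	nodes map[string]node
}

var errCompareFailed = errors.New("compare failed")

func newStore() *store {
	return &store{nodes: make(map[string]node)}
}

// set writes value under key. A non-zero prevIndex makes the write
// conditional: it succeeds only if the key exists and was last modified
// at exactly that index. No previous value is needed for the check.
func (s *store) set(key, value string, prevIndex uint64) (uint64, error) {
	cur, ok := s.nodes[key]
	if prevIndex != 0 && (!ok || cur.modifiedIndex != prevIndex) {
		return 0, errCompareFailed
	}
	s.index++
	s.nodes[key] = node{value: value, modifiedIndex: s.index}
	return s.index, nil
}

func main() {
	s := newStore()
	first, _ := s.set("k", "a", 0)     // unconditional create
	_, err := s.set("k", "b", first)   // index matches, update succeeds
	fmt.Println(err)
	_, err = s.set("k", "c", first)    // stale index, update is rejected
	fmt.Println(err)
}
```

Because every successful write bumps the index, a matching index implies a matching previous value, which is the argument made in the summary above.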
Automatic merge from submit-queue
Provide flags to use etcd3 backed storage
ref: #24405
What's in this PR?
- Add a new flag "storage-backend" to choose "etcd2" or "etcd3". By default (i.e. empty), it's "etcd2".
- Move the etcd config code into a standalone package and let it create an etcd2 or etcd3 storage backend based on user input.
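A rough sketch of the flag handling described above, using the standard `flag` package. Only the flag name "storage-backend" and its values come from this PR. `parseBackend` and `resolveBackend` are hypothetical names, and the real code builds a storage backend rather than returning a string:

```go
package main

import (
	"flag"
	"fmt"
)

// resolveBackend maps the flag value to a backend name.
// An empty value falls back to "etcd2", matching the default described above.
func resolveBackend(name string) (string, error) {
	switch name {
	case "", "etcd2":
		return "etcd2", nil
	case "etcd3":
		return "etcd3", nil
	default:
		return "", fmt.Errorf("unknown storage backend %q", name)
	}
}

// parseBackend reads the storage-backend flag from args and resolves it.
func parseBackend(args []string) (string, error) {
	fs := flag.NewFlagSet("apiserver", flag.ContinueOnError)
	backend := fs.String("storage-backend", "", `storage backend to use: "etcd2" or "etcd3"`)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return resolveBackend(*backend)
}

func main() {
	for _, args := range [][]string{nil, {"--storage-backend=etcd3"}} {
		b, err := parseBackend(args)
		fmt.Println(b, err)
	}
}
```

Keeping the selection in one function means callers never need to know which backends exist; they pass the user's input through and get either a backend or an error.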
Automatic merge from submit-queue
Bump up etcd dependency to fix data race
ref: https://github.com/kubernetes/kubernetes/pull/23694
What this PR does
- Bump the etcd godep to fix a data race in the etcd watcher. Without this change, watcher PR builds will fail race detection.
- Small changes to fix builds after upgrade
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. A cache size of 50 gives me a 270MB max footprint. A size of 50K caused my apiserver to run out of memory once it exceeded 2GB. I believe that number is far too large for most people's use cases.
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (the cache is keyed by modifiedIndex and doesn't remove the old object when it gets updated)
- The cache isn't LRU, so there's no guarantee it remains hot. This makes its performance difficult to predict, and is more of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323
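For reviewers, a simplified sketch of the behavior described above: a size-bounded cache keyed by modifiedIndex that evicts an arbitrary entry when full. `indexCache` and its methods are hypothetical and much simpler than the real cache, and the arbitrary-eviction policy is an assumption made for illustration:

```go
package main

import "fmt"

// indexCache is a toy cache keyed by modifiedIndex with a configurable bound.
type indexCache struct {
	maxSize int
	entries map[uint64]string
}

func newIndexCache(maxSize int) *indexCache {
	return &indexCache{maxSize: maxSize, entries: make(map[uint64]string)}
}

// add stores obj under its modifiedIndex. When the cache is full, one
// arbitrary entry is evicted (not the least recently used one).
func (c *indexCache) add(modifiedIndex uint64, obj string) {
	if _, ok := c.entries[modifiedIndex]; !ok && len(c.entries) >= c.maxSize {
		for k := range c.entries {
			delete(c.entries, k)
			break
		}
	}
	c.entries[modifiedIndex] = obj
}

func (c *indexCache) get(modifiedIndex uint64) (string, bool) {
	obj, ok := c.entries[modifiedIndex]
	return obj, ok
}

func main() {
	c := newIndexCache(3)
	// Three updates to the same object produce three distinct keys,
	// so the older versions stay cached until something evicts them.
	c.add(1, "pod-a v1")
	c.add(2, "pod-a v2")
	c.add(3, "pod-a v3")
	fmt.Println(len(c.entries))
}
```

The sketch shows both points from the list: updates never replace the stale entry because each update has a new key, and eviction gives no guarantee that the entries left behind are the ones still in use.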