Use checksums instead of fsyncs to manage discovery cache corruption

Part of the API discovery cache uses an HTTP RoundTripper that
transparently caches responses to disk. The upstream implementation of
the disk cache is hard coded to call Sync() on every file it writes.
This has noticably poor performance on modern Macs, which ask their disk
controllers to flush all the way to persistant storage because Go uses
the `F_FULLFSYNC` fnctl. Apple recommends minimizing this behaviour in
order to avoid degrading performance and increasing disk wear.

The content of the discovery cache is not critical; it is indeed just a
cache and can be recreated by hitting the API servers' discovery
endpoints. This commit replaces upstream httpcache's diskcache
implementation with a similar implementation that can use CRC-32
checksums to detect corrupted cache entries at read-time. When such an
entry is detected (e.g. because it was only partially flushed to
permanent storage before the host lost power) the cache will report a
miss. This causes httpcache to fall back to its underlying HTTP
transport (i.e. the real API server) and re-cache the resulting value.

Apart from adding CRC-32 checksums and avoiding calling fsync this
implementation differs from upstream httpcache's diskcache package in
that it uses FNV-32a hashes rather than MD5 hashes of cache keys in
order to generate filenames.

Signed-off-by: Nic Cope <nicc@rk0n.org>
This commit is contained in:
Nic Cope
2022-06-28 19:15:49 -07:00
parent eace469065
commit 7a2c6a432f
4 changed files with 203 additions and 69 deletions

View File

@@ -1,61 +0,0 @@
// Package diskcache provides an implementation of httpcache.Cache that uses the diskv package
// to supplement an in-memory map with persistent storage
//
package diskcache
import (
"bytes"
"crypto/md5"
"encoding/hex"
"github.com/peterbourgon/diskv"
"io"
)
// Cache is an implementation of httpcache.Cache that supplements the in-memory map with persistent storage
type Cache struct {
d *diskv.Diskv
}
// Get returns the response corresponding to key if present
func (c *Cache) Get(key string) (resp []byte, ok bool) {
key = keyToFilename(key)
resp, err := c.d.Read(key)
if err != nil {
return []byte{}, false
}
return resp, true
}
// Set saves a response to the cache as key
func (c *Cache) Set(key string, resp []byte) {
key = keyToFilename(key)
c.d.WriteStream(key, bytes.NewReader(resp), true)
}
// Delete removes the response with key from the cache
func (c *Cache) Delete(key string) {
key = keyToFilename(key)
c.d.Erase(key)
}
func keyToFilename(key string) string {
h := md5.New()
io.WriteString(h, key)
return hex.EncodeToString(h.Sum(nil))
}
// New returns a new Cache that will store files in basePath
func New(basePath string) *Cache {
return &Cache{
d: diskv.New(diskv.Options{
BasePath: basePath,
CacheSizeMax: 100 * 1024 * 1024, // 100MB
}),
}
}
// NewWithDiskv returns a new Cache using the provided Diskv as underlying
// storage.
func NewWithDiskv(d *diskv.Diskv) *Cache {
return &Cache{d}
}