Commit Graph

522 Commits

Adam Rutkowski
81fc7ab5c5 Parallel eviction
Eviction changes allowing cachelines to be evicted (remapped) while
holding the hash bucket write lock instead of the global metadata
write lock.

As eviction (replacement) is now tightly coupled with the request,
each request uses an eviction size equal to the number of its
unmapped cachelines.

Evicting without the global metadata write lock is possible
thanks to the fact that remapping is always performed while
exclusively holding the cacheline (read or write) lock. So for
a cacheline on the LRU list we acquire the cacheline lock, safely
resolve its hash and consequently write-lock the hash bucket.
Since the cacheline lock is acquired under the hash bucket lock
(everywhere except for the new eviction implementation), we are
certain that no one acquires the cacheline lock behind our back.
Concurrent eviction threads are eliminated by holding the eviction
list lock for the duration of the critical locking operations.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:47 +01:00
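
A minimal C sketch of the locking order this commit describes
(all helper names are hypothetical stand-ins, not OCF's actual API):

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t ocf_cache_line_t;

    /* hypothetical primitives standing in for OCF internals */
    extern void evict_list_lock(void);
    extern void evict_list_unlock(void);
    extern bool cline_trylock_wr(ocf_cache_line_t cline);
    extern void cline_unlock_wr(ocf_cache_line_t cline);
    extern uint32_t resolve_hash(ocf_cache_line_t cline);
    extern void hash_bucket_lock_wr(uint32_t hash);
    extern void hash_bucket_unlock_wr(uint32_t hash);

    /* Remap a single cacheline taken from the LRU list without
     * holding the global metadata write lock. */
    static bool remap_under_hash_lock(ocf_cache_line_t cline)
    {
        uint32_t hash;

        /* serialize concurrent eviction threads for the critical
         * locking sequence */
        evict_list_lock();

        /* remap is always done under an exclusively held cacheline
         * lock, so take that lock first */
        if (!cline_trylock_wr(cline)) {
            evict_list_unlock();
            return false;
        }

        /* with the cacheline locked, its hash can be resolved
         * safely and the hash bucket write-locked */
        hash = resolve_hash(cline);
        hash_bucket_lock_wr(hash);

        evict_list_unlock();

        /* ... perform the remap here ... */

        hash_bucket_unlock_wr(hash);
        cline_unlock_wr(cline);
        return true;
    }
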
Adam Rutkowski
1411314678 Add getter function for cache->device->concurrency.cache_line
The purpose of this change is to facilitate unit testing.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:47 +01:00
Adam Rutkowski
ce2ff14150 Move request engine callbacks to req structure
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:47 +01:00
Adam Rutkowski
0e699fc982 Refactor ocf_engine_remap
.. so that the main part, responsible strictly for mapping a
given LBA to a given collision index, is encapsulated in
a function ocf_map_cache_line with external linkage.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
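
The commit names only ocf_map_cache_line; the prototype below is an
illustrative guess at the shape such a declaration could take:

    struct ocf_request;
    typedef unsigned int ocf_cache_line_t;

    /* Map a single core line (LBA) of the request to the given
     * collision index. External linkage so that eviction code can
     * call it directly. */
    void ocf_map_cache_line(struct ocf_request *req, unsigned int idx,
                            ocf_cache_line_t cache_line);
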
Adam Rutkowski
3bd0f6b6c4 Change sequential request detection logic
Changing sequential request detection so that a miss request is
recognized as sequential after the needed cachelines are evicted
and mapped to the request in sequential order.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
056217d103 Rename cleaner attribute cache_line_lock to lock_cacheline
.. to make it clear that true means the cleaner must lock
cachelines, rather than that the lock is already held.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
c40e36456b Add missing hash bucket lock in cleaner
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
69c0f20b6e Remove global metadata lock from cleaner metadata update step
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
a80eea454f Add function to determine hash collisions
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
07d1079baa Add LOOKUP_REMAPPED status to allow iterative cacheline lock
Allowing the request cacheline lock to be called on a partially
locked request. This is going to be useful for upcoming eviction
improvements, where the request will first have evicted
(LOOKUP_REMAPPED) cachelines assigned to it in a locked state,
followed by the standard request cacheline lock call in order to
lock previously inserted (LOOKUP_HIT) or freshly mapped from the
freelist (LOOKUP_INSERTED) cachelines.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
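
A sketch of the per-cacheline lookup states and the iterative lock
pass this enables; LOOKUP_HIT, LOOKUP_INSERTED and LOOKUP_REMAPPED are
named in the commit, while LOOKUP_MISS, the enum layout and the loop
are illustrative assumptions:

    typedef unsigned int ocf_cache_line_t;

    enum lookup_status {
        LOOKUP_MISS,      /* not present in cache (assumed name) */
        LOOKUP_HIT,       /* already present */
        LOOKUP_INSERTED,  /* just mapped from the freelist */
        LOOKUP_REMAPPED,  /* evicted + remapped, lock already held */
    };

    struct map_entry {
        ocf_cache_line_t cache_line;
        enum lookup_status status;
    };

    struct request {
        struct map_entry *map;
        unsigned int count;
    };

    extern void cacheline_lock(ocf_cache_line_t cline);

    /* Lock the request's cachelines, skipping those already acquired
     * in a locked state (LOOKUP_REMAPPED) during eviction. */
    static void req_lock_cachelines(struct request *req)
    {
        for (unsigned int i = 0; i < req->count; i++) {
            if (req->map[i].status == LOOKUP_REMAPPED)
                continue;
            cacheline_lock(req->map[i].cache_line);
        }
    }
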
Adam Rutkowski
b34f5fd721 Rename LOOKUP_MAPPED to LOOKUP_INSERTED
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
Adam Rutkowski
a09587f521 Introduce ocf_cache_line_is_locked_exclusively
The function returns true if the cacheline is locked (read
or write) by exactly one entity with no waiters.

This is useful for eviction. Assuming the caller holds the
hash bucket write lock, holding an exclusive cacheline
lock (either read or write) allows the holder to remap the
cacheline safely. Typically during eviction the hash
bucket is unknown until resolved under the cacheline lock,
so locking the cacheline exclusively (instead of locking it
and then checking for an exclusive lock) is not possible.

More specifically, this is the flow for synchronizing a
cacheline remap using ocf_cache_line_is_locked_exclusively:
1. acquire a cacheline (read or write) lock
2. resolve the hash bucket
3. write-lock the hash bucket
4. verify that the cacheline lock is exclusive

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-03-05 11:20:46 +01:00
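
The four-step flow above, sketched in C; only the function name
ocf_cache_line_is_locked_exclusively comes from the commit, and its
signature here is simplified (all other helpers are hypothetical):

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t ocf_cache_line_t;

    extern void cline_lock_rd(ocf_cache_line_t cline);
    extern void cline_unlock_rd(ocf_cache_line_t cline);
    extern uint32_t resolve_hash(ocf_cache_line_t cline);
    extern void hash_bucket_lock_wr(uint32_t hash);
    extern void hash_bucket_unlock_wr(uint32_t hash);
    extern bool ocf_cache_line_is_locked_exclusively(ocf_cache_line_t cline);

    static bool try_remap(ocf_cache_line_t cline)
    {
        uint32_t hash;

        cline_lock_rd(cline);          /* 1. cacheline (read) lock  */
        hash = resolve_hash(cline);    /* 2. resolve hash bucket    */
        hash_bucket_lock_wr(hash);     /* 3. write-lock hash bucket */

        /* 4. remap is safe only if we are the sole lock holder and
         * nobody is waiting on the cacheline lock */
        if (!ocf_cache_line_is_locked_exclusively(cline)) {
            hash_bucket_unlock_wr(hash);
            cline_unlock_rd(cline);
            return false;
        }

        /* ... remap the cacheline ... */

        hash_bucket_unlock_wr(hash);
        cline_unlock_rd(cline);
        return true;
    }
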
Robert Baldyga
ac9bd5b094
Merge pull request #453 from arutk/no_cl_gl_lock
Skip cacheline concurrency global lock in fast path
2021-03-04 12:33:50 +01:00
Michal Mielewczyk
f1012b020b Validate ioclass config
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-03-02 14:46:57 +01:00
Michal Mielewczyk
95d756de91 Remove ioclass min_size from public API
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-03-02 14:46:57 +01:00
Michal Mielewczyk
f61472c3f4 Validate seq cutoff threshold value
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-02-26 08:24:41 -05:00
Adam Rutkowski
c7fc4fff39 Change cacheline concurrency constructor params
Provide the number of cachelines as the cacheline concurrency
constructor param instead of reading it from the cache.

The purpose of this change is to improve testability.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-24 17:29:45 -06:00
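
A hypothetical before/after of such a constructor change (names and
parameters are assumptions, only the idea comes from the commit); with
the line count as a parameter, a unit test can construct the object
without a fully initialized cache:

    struct ocf_cache;
    struct ocf_cache_line_concurrency;

    /* before: number of cachelines read from the cache internally
     *   int cacheline_concurrency_init(struct ocf_cache *cache);
     */

    /* after: number of cachelines supplied by the caller */
    int cacheline_concurrency_init(struct ocf_cache_line_concurrency **self,
                                   unsigned int num_clines,
                                   struct ocf_cache *cache);
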
Adam Rutkowski
cf5f82b253 Use cline concurrency ctx instead of cache
Cacheline concurrency functions have their interface changed
so that the cacheline concurrency private context is
explicitly on the parameter list, rather than being taken
from cache->device->concurrency.cache_line.

The cache pointer is no longer provided as a parameter to these
functions. The cacheline concurrency context now holds a pointer
to the cache structure (for logging purposes only).

The purpose of this change is to facilitate unit testing.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-24 17:29:39 -06:00
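
An illustrative before/after of this interface shift (parameter lists
are assumptions, not the exact OCF prototypes):

    #include <stdbool.h>

    typedef unsigned int ocf_cache_line_t;

    struct ocf_cache;
    struct ocf_cache_line_concurrency {
        struct ocf_cache *cache; /* kept only for logging */
        /* ... lock state ... */
    };

    /* before:
     *   bool ocf_cache_line_try_lock_rd(struct ocf_cache *cache,
     *                                   ocf_cache_line_t line);
     * with the context fetched internally from
     * cache->device->concurrency.cache_line
     */

    /* after: the private context is passed explicitly */
    bool ocf_cache_line_try_lock_rd(struct ocf_cache_line_concurrency *c,
                                    ocf_cache_line_t line);
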
Adam Rutkowski
0f34e46375 Fix error handling in cacheline concurrency init
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-24 17:29:36 -06:00
Adam Rutkowski
d8f25d2742 Skip cacheline concurrency global lock in fast path
The main purpose of the cacheline concurrency global lock
is to eliminate the possibility of deadlocks when
locking multiple cachelines.

The cacheline lock fast path does not need to acquire
this lock, as it only opportunistically attempts
to lock all clines without waiting. There is no risk
of deadlock, as:
 * a concurrent fast path will also only try_lock
   cachelines, releasing all acquired locks if it fails
   to immediately acquire the lock for any cacheline
 * a concurrent slow path is guaranteed to have
   precedence in lock acquisition when the conditions
   for deadlock occur (both the slow path and the fast
   path have acquired some locks required by the other
   thread). This is because the fast-path thread will
   back off (release acquired locks) if any one of the
   cacheline locks is not acquired.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-24 17:29:28 -06:00
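
A sketch of that opportunistic fast path: try-lock every cacheline
and back off completely on the first failure (helper names are
hypothetical):

    #include <stdbool.h>

    typedef unsigned int ocf_cache_line_t;

    struct request {
        ocf_cache_line_t *clines;
        unsigned int count;
    };

    extern bool cline_trylock_wr(ocf_cache_line_t cline);
    extern void cline_unlock_wr(ocf_cache_line_t cline);

    /* Try-lock every cacheline of the request; on the first failure,
     * release everything acquired so far and report failure so the
     * caller can fall back to the slow path under the global lock. */
    static bool req_fastpath_lock(struct request *req)
    {
        unsigned int i;

        for (i = 0; i < req->count; i++) {
            if (!cline_trylock_wr(req->clines[i])) {
                while (i--)
                    cline_unlock_wr(req->clines[i]);
                return false;
            }
        }
        return true; /* all locks taken, global lock never touched */
    }
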
Adam Rutkowski
c95f6358ab Get rid of status bits lock
All status bits operations are now protected by
hash bucket locks.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-18 15:05:53 -06:00
Adam Rutkowski
cd9e42f987 Properly lock hash bucket for status bits operations
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-18 15:02:50 -06:00
Robert Baldyga
75baec5aa5
Merge pull request #456 from arutk/aalru
Relax LRU list ordering to minimize list updates
2021-02-18 13:48:54 +01:00
Michal Mielewczyk
7f3f2ad115 Evict from overflown pinned ioclass
If an ioclass is pinned but has exceeded its occupancy limit,
cachelines should be evicted from it anyway.

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-02-16 04:06:07 -05:00
Adam Rutkowski
0748f33a9d Align each global metadata lock to 64B
.. in order to place primitives intended to be accessed
concurrently on separate CPU cache lines.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-15 11:27:49 -06:00
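
A sketch of the alignment technique (structure and field names are
illustrative): each lock is aligned to 64 bytes so that two locks
never share a CPU cache line, avoiding false sharing between cores:

    #include <pthread.h>

    /* one lock per 64-byte cache line */
    struct aligned_lock {
        pthread_rwlock_t lock;
    } __attribute__((aligned(64)));

    /* each instance lands on its own cache line */
    static struct aligned_lock global_metadata_lock[4];
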
Adam Rutkowski
05780c98ed Split global metadata lock
Divide the single global lock instance into 4 to reduce contention
in multiple-read-lock scenarios.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-15 11:27:49 -06:00
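
One classic way to realize such a split is the "big reader lock"
pattern, sketched below under the assumption that readers take a
single instance while a writer takes all four (the commit does not
spell out the exact scheme):

    #include <pthread.h>

    #define GLOBAL_LOCK_COUNT 4

    static pthread_rwlock_t global_md_lock[GLOBAL_LOCK_COUNT] = {
        PTHREAD_RWLOCK_INITIALIZER, PTHREAD_RWLOCK_INITIALIZER,
        PTHREAD_RWLOCK_INITIALIZER, PTHREAD_RWLOCK_INITIALIZER,
    };

    /* readers spread across the 4 instances, cutting contention */
    static void md_read_lock(unsigned int thread_id)
    {
        pthread_rwlock_rdlock(&global_md_lock[thread_id % GLOBAL_LOCK_COUNT]);
    }

    static void md_read_unlock(unsigned int thread_id)
    {
        pthread_rwlock_unlock(&global_md_lock[thread_id % GLOBAL_LOCK_COUNT]);
    }

    /* a writer must own every instance to exclude all readers */
    static void md_write_lock(void)
    {
        for (int i = 0; i < GLOBAL_LOCK_COUNT; i++)
            pthread_rwlock_wrlock(&global_md_lock[i]);
    }

    static void md_write_unlock(void)
    {
        for (int i = GLOBAL_LOCK_COUNT - 1; i >= 0; i--)
            pthread_rwlock_unlock(&global_md_lock[i]);
    }
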
Adam Rutkowski
10c3c3de36 Renaming hash bucket locking functions
1. new abbreviated prefix: ocf_hb (HB stands for hash bucket)
2. clear distinction between functions requiring the caller to
   hold the metadata shared global lock ("naked") vs the ones
   which acquire the global lock on their own ("prot" for protected)
3. clear distinction between hash bucket locking functions
   accepting a hash bucket id ("id"), a core line and lba ("cline"),
   and an entire request ("req").

Resulting naming scheme:
ocf_hb_(id/cline/req)_(prot/naked)_(lock/unlock/trylock)_(rd/wr)

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-12 18:08:15 -06:00
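
A few names spelled out from the scheme above (parameter lists are
illustrative, not the exact OCF prototypes):

    struct ocf_cache;
    struct ocf_request;

    /* lock bucket by id; caller already holds the shared global lock */
    void ocf_hb_id_naked_lock_wr(struct ocf_cache *cache, unsigned int id);

    /* lock bucket for a core line + lba; takes the global lock itself */
    void ocf_hb_cline_prot_lock_rd(struct ocf_cache *cache,
                                   unsigned int core_id, unsigned long lba);

    /* trylock all buckets needed by a request */
    int ocf_hb_req_prot_trylock_wr(struct ocf_request *req);
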
Adam Rutkowski
c822c953ed Fix return status from hash bucket trylock wr
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-11 15:02:06 -06:00
Robert Baldyga
af8177d2ba
Merge pull request #458 from mmichal10/fix-cleaning
Fix updating hot cachelines cleaning list
2021-02-11 11:30:07 +01:00
Robert Baldyga
d03ea719cd
Merge pull request #451 from arutk/exact_evict_count
only request evict size equal to request unmapped count
2021-02-11 10:47:12 +01:00
Michal Mielewczyk
fa41d4fc88 Fix updating hot cachelines cleaning list
Update the cacheline's timestamp each time it is written.

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-02-10 10:02:57 -05:00
Adam Rutkowski
9e98eec361 Only acquire read lock to verify lru elem hotness
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 16:23:01 -06:00
Adam Rutkowski
c04bfa3962 Add macros to read lock eviction list
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 16:23:01 -06:00
Adam Rutkowski
9690f13bef Change eviction spin lock to RW lock
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 16:23:01 -06:00
Adam Rutkowski
b4daac11c2 Track hot items on LRU list
Cachelines in the top 50% (most recently used half) of the LRU
list are not promoted to the list head upon access. Only after
a cacheline drops into the bottom 50% is it considered a
candidate for promotion to the list head.

The purpose of this change is to reduce the overhead of
LRU list maintenance for hot cachelines.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 16:22:55 -06:00
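
A sketch of this "hot item" rule; the names and the position-tracking
scheme are assumptions, only the skip-if-hot behavior comes from the
commit:

    #include <stdbool.h>

    struct lru_elem {
        bool hot; /* currently within the most recently used half */
    };

    struct lru_list; /* details omitted */

    extern void lru_move_to_head(struct lru_list *list,
                                 struct lru_elem *elem);
    extern void lru_update_hot_boundary(struct lru_list *list);

    /* Promote an accessed element only if it has fallen into the
     * colder half of the list; hot elements skip the update. */
    static void lru_touch(struct lru_list *list, struct lru_elem *elem)
    {
        if (elem->hot)
            return; /* still in the top 50%: no list manipulation */

        lru_move_to_head(list, elem);
        elem->hot = true;

        /* the element pushed out of the front half loses hot status */
        lru_update_hot_boundary(list);
    }
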
Adam Rutkowski
746b32c47d Evict from overflown partitions first
Overflown partitions now have precedence over others during
eviction, regardless of IO class priorities.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 12:51:39 -06:00
Adam Rutkowski
5538a5a95d Only request evict size equal to request unmapped count
Removing the logic for opportunistic partition overflow
reduction, which evicted more cachelines than actually
required by the request being serviced.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-02-09 11:11:15 -06:00
Michal Mielewczyk
93eccc862a Reset per-partition counters when adding core
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-02-03 06:18:44 -05:00
Robert Baldyga
3543f5c5cc
Merge pull request #443 from rafalste/update_copyright
Update copyright statements (2021)
2021-02-03 11:59:39 +01:00
Michal Mielewczyk
3a7b55c4c2 Don't evict on hit
If a request is a hit, simply try to acquire the cachelines instead
of verifying whether the target partition's size is exceeded.

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-01-29 17:15:32 -05:00
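
A sketch of that hit shortcut, with all helper names assumed: the
occupancy check only gates requests that need to map new cachelines:

    #include <stdbool.h>

    struct request;

    extern bool req_is_hit(struct request *req);
    extern bool partition_over_limit(struct request *req);
    extern void evict_to_make_room(struct request *req);
    extern int req_lock_cachelines(struct request *req);
    extern int req_map_and_lock(struct request *req);

    static int engine_prepare(struct request *req)
    {
        /* hit: all cachelines already mapped, just lock them */
        if (req_is_hit(req))
            return req_lock_cachelines(req);

        /* miss: ensure the target partition has room, evicting if
         * necessary, before mapping new cachelines */
        if (partition_over_limit(req))
            evict_to_make_room(req);

        return req_map_and_lock(req);
    }
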
Rafal Stefanowski
6ed4cf8a24 Update copyright statements (2021)
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
2021-01-21 13:17:34 +01:00
Adam Rutkowski
012438c279 Add missing collision page lock in cleaner
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-01-20 19:28:41 -06:00
Robert Baldyga
5a88ab2d61 Flush metadata collision segment on core remove
If there is any dirty data on the cache associated with the removed
core, we must flush the collision metadata after removing the core
to make the metadata persistent in case of a dirty shutdown.

This fixes the problem where the recovery procedure erroneously
interprets cache lines that belonged to the removed core as valid
ones.

This also fixes the problem where, after removing a core containing
dirty data, another core is added, and then the recovery procedure
following a dirty shutdown assigns cache lines from the removed core
to the new one, effectively leading to data corruption.

Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
2021-01-19 13:34:28 +01:00
Adam Rutkowski
f206c64ff6 Fine granularity lock in cache_mngt_core_deinit_attached_meta
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-01-15 03:11:46 -05:00
Michal Mielewczyk
6d962b38e9 API for cacheline write trylock
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-01-15 02:10:42 -05:00
Adam Rutkowski
bd20d6119b External linkage for function to sparse single cline
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-01-15 02:10:42 -05:00
Adam Rutkowski
93bda499c7 Add functions to lock specific hash bucket
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2021-01-12 15:42:21 -05:00
Robert Baldyga
fd88c2c3a4
Merge pull request #436 from mmichal10/metadata-assert
Metadata assert
2021-01-08 10:15:08 +01:00
Michal Mielewczyk
fcef130919 Bug on metadata access error
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-01-07 18:10:44 -05:00
Michal Mielewczyk
d0225ef1cb Prevent uint32_t overflow
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
2021-01-07 02:45:05 -05:00