New structure ocf_part is added to contain all the data common for both
user partitions and freelist partition: part_runtime and part_id.
ocf_user_part now contains ocf_part structure as well as pointer to
cleaning partition runtime metadata (moved out from part_runtime) and
user partition config (no change here).
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Eviction changes allowing to evict (remap) cachelines while
holding hash bucket write lock instead of global metadata
write lock.
As eviction (replacement) is now tightly coupled with request,
each request uses eviction size equal to number of its
unmapped cachelines.
Evicting without global metadata write lock is possible
thanks to the fact that remaping is always performed
while exclusively holding cacheline (read or write) lock.
So for a cacheline on LRU list we acquire cacheline lock,
safely resolve hash and consequently write-lock hash bucket.
Since cacheline lock is acquired under hash bucket (everywhere
except for new eviction implementation), we are certain that
noone acquires cacheline lock behind our back. Concurrent
eviction threads are eliminated by holding eviction list
lock for the duration of critial locking operations.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Reformat function that calculates how long cache/core is dirty
Update `dirty_for` types in functional tests
Values stored in info structs fields (both in cache and core structs)
are unsigned 64-bits ints but `dirty_for`s were unsigned 32-bits ints.
Use existing function to transform returned value to seconds.
Replace seconds stored in metadata with seconds.
Replacement was done if old value of replaced field was equal to zero.
Acquiring monotonic high precission timestamp is potentially
slow and it makes sense to compare the field's value
to zero before calling atomic function.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Cacheline concurrency functions have their interface changed
so that the cacheline concurrency private context is
explicitly on the parameter list, rather than being taken
from cache->device->concurrency.cache_line.
Cache pointer is no longer provided as a parameter to these
functions. Cacheline concurrency context now has a pointer
to cache structure (for logging purposes only).
The purpose of this change is to facilitate unit testing.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
.. in order to move primitives intended to be accessed
concurrently in separate CPU cache line.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Divide single global lock instance into 4 to reduce contention
in multiple read-locks scenario.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
1. new abbreviated previx: ocf_hb (HB stands for hash bucket)
2. clear distinction between functions requiring caller to
hold metadata shared global lock ("naked") vs the ones
which acquire global lock on its own ("prot" for protected)
3. clear distinction between hash bucket locking functions
accepting hash bucket id ("id"), core line and lba ("cline")
and entire request ("req").
Resulting naming scheme:
ocf_hb_(id/cline/req)_(prot/naked)_(lock/unlock/trylock)_(rd/wr)
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
If there is any dirty data on the cache associated with removed core,
we must flush collision metadata after removing core to make metadata
persistent in case of dirty shutdown.
This fixes the problem when recovery procedure erroneously interprets
cache lines that belonged to removed core as valid ones.
This also fixes the problem, when after removing core containing dirty
data another core is added, and then recovery procedure following dirty
shutdown assigns cache lines from removed core to the new one, effectively
leading to data corruption.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Min and max values, keept as an explicit number of cachelines, are tightly
coupled with particular cache. This might lead to errors and mismatches after
reattaching cache of different size.
To prevent those errors, min and max should be calculated dynamically.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Moving metadata implementation out of obsolete metadata_hash.c
to .c files corresponding to function declaration header files.
This requires adding shared header for metadata implementation
metadata_internal.h. Some metadata header files did not have
a corresponding .c file - in this case it is added in this
commit.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Metadata wrapper functions (calling iface->func) in header
files are changed to be declarations only. Hash interface
implementation functions in metadata_hash.c are given an
external linkage and are renamed to drop "hash" prefix.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Locks acquired in ocf_metadata_flush(/load)_all are
acquired only for the duration of queueing asynch
service for flush/load, no actual metadata accesses
are performed there.
Also flush/load all are always performed with metadata
marked as deinitialized (metadata reference counter freezed),
so no I/O is reading nor writing the metadata. The only source
of potential concurrent metadata accesses are other management
operations, which should be synchronized using management lock.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
After writing metadata configuration to disk we must
send a flush request to make sure configuration sections
are commited to non-volatile storage.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
During recovery procedure there is no guarantee that checksums
of runtime sections were flushed correctly before dirty shutdown.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
a_req is allocated using env_vmalloc() so we need to free it
using env_vfree(), not env_free().
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
In case of cleaning metadata used to be flushed only when status of whole cache
line changed to clean.
This patch ensures that metadata flush is triggered after changin status of each
single sector is cache line.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
After second dirty write to cache line which was already dirty, metadata flush
was not triggered. In case of dirty shutdown, this led to data corruption.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
It allows to modify and retrieve particular PP params event if it isn't active
and store values between cache stop and load.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding locks to assure partition list consistency. Partition
lists is typically modified under cacheline or hash bucket lock.
This is not sufficient synchronization as adding/removing cacheline
from partition list affects neighbouring cachelines state as well.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Since ocf requires loading cache with the same name as it was stopped, it should
also allow to read name from metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding synchronization around metadata collision segment pages.
This part of metadata is modified when cacheline is mapped/unmapped
and when dirty status changes.
Synchronization on page level is required on top of cacheline
and hash bucket locks to assure metadata flush always reads
consistent state when copying entire collision table memory
page.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
There is one RW lock per hash bucket. Write lock is required
to map cacheline, read lock is sufficient for traversing.
Hash bucket locks are always acquired under global metadata
read lock. This assures mutual exclusion with eviction and
management paths, where global metadata write lock is held.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Global free cacheline list is divided into a set of freelists, one
per execution context. When attempting to map addres to cache, first
the freelist for current execution context is considered (fast path).
If current execution context freelist is empty (fast path failure),
mapping function attempts to get freelist from other execution context
list (slow path).
The purpose of this change is improve concurrency in freelist access.
It is part of fine granularity metadata lock implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Fix problem introduced by increasing partition name size to 1024 bytes,
which effectively made superblock bigger than one page. Due to this
flushing superblock required more than one io, which in case of dirty
shutdown between these ios resulted in CRC missmatch and made cache
recovery impossible.
Moving parts metadata to separate sections makes superblock fitting
in one page, effectively solving described problem.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Metadata layout (seq / striping) is already encapsulated
in ocf_metadata_layout_iface, no need to double this logic
in freelist initialization.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Environment should provide calls for destroying primitives (i.e. env_mutex_destroy()) and OCF should call these functions in its cleanup paths.
Signed-off-by: Firas Medini <mdnfiras@yahoo.com>