Eviction changes allowing to evict (remap) cachelines while
holding hash bucket write lock instead of global metadata
write lock.
As eviction (replacement) is now tightly coupled with request,
each request uses eviction size equal to number of its
unmapped cachelines.
Evicting without global metadata write lock is possible
thanks to the fact that remaping is always performed
while exclusively holding cacheline (read or write) lock.
So for a cacheline on LRU list we acquire cacheline lock,
safely resolve hash and consequently write-lock hash bucket.
Since cacheline lock is acquired under hash bucket (everywhere
except for new eviction implementation), we are certain that
noone acquires cacheline lock behind our back. Concurrent
eviction threads are eliminated by holding eviction list
lock for the duration of critial locking operations.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Reformat function that calculates how long cache/core is dirty
Update `dirty_for` types in functional tests
Values stored in info structs fields (both in cache and core structs)
are unsigned 64-bits ints but `dirty_for`s were unsigned 32-bits ints.
Use existing function to transform returned value to seconds.
Replace seconds stored in metadata with seconds.
Replacement was done if old value of replaced field was equal to zero.
Acquiring monotonic high precission timestamp is potentially
slow and it makes sense to compare the field's value
to zero before calling atomic function.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Cacheline concurrency functions have their interface changed
so that the cacheline concurrency private context is
explicitly on the parameter list, rather than being taken
from cache->device->concurrency.cache_line.
Cache pointer is no longer provided as a parameter to these
functions. Cacheline concurrency context now has a pointer
to cache structure (for logging purposes only).
The purpose of this change is to facilitate unit testing.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
.. in order to move primitives intended to be accessed
concurrently in separate CPU cache line.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Divide single global lock instance into 4 to reduce contention
in multiple read-locks scenario.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
1. new abbreviated previx: ocf_hb (HB stands for hash bucket)
2. clear distinction between functions requiring caller to
hold metadata shared global lock ("naked") vs the ones
which acquire global lock on its own ("prot" for protected)
3. clear distinction between hash bucket locking functions
accepting hash bucket id ("id"), core line and lba ("cline")
and entire request ("req").
Resulting naming scheme:
ocf_hb_(id/cline/req)_(prot/naked)_(lock/unlock/trylock)_(rd/wr)
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
If there is any dirty data on the cache associated with removed core,
we must flush collision metadata after removing core to make metadata
persistent in case of dirty shutdown.
This fixes the problem when recovery procedure erroneously interprets
cache lines that belonged to removed core as valid ones.
This also fixes the problem, when after removing core containing dirty
data another core is added, and then recovery procedure following dirty
shutdown assigns cache lines from removed core to the new one, effectively
leading to data corruption.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Min and max values, keept as an explicit number of cachelines, are tightly
coupled with particular cache. This might lead to errors and mismatches after
reattaching cache of different size.
To prevent those errors, min and max should be calculated dynamically.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Moving metadata implementation out of obsolete metadata_hash.c
to .c files corresponding to function declaration header files.
This requires adding shared header for metadata implementation
metadata_internal.h. Some metadata header files did not have
a corresponding .c file - in this case it is added in this
commit.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Metadata wrapper functions (calling iface->func) in header
files are changed to be declarations only. Hash interface
implementation functions in metadata_hash.c are given an
external linkage and are renamed to drop "hash" prefix.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Locks acquired in ocf_metadata_flush(/load)_all are
acquired only for the duration of queueing asynch
service for flush/load, no actual metadata accesses
are performed there.
Also flush/load all are always performed with metadata
marked as deinitialized (metadata reference counter freezed),
so no I/O is reading nor writing the metadata. The only source
of potential concurrent metadata accesses are other management
operations, which should be synchronized using management lock.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
After writing metadata configuration to disk we must
send a flush request to make sure configuration sections
are commited to non-volatile storage.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
During recovery procedure there is no guarantee that checksums
of runtime sections were flushed correctly before dirty shutdown.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
a_req is allocated using env_vmalloc() so we need to free it
using env_vfree(), not env_free().
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
In case of cleaning metadata used to be flushed only when status of whole cache
line changed to clean.
This patch ensures that metadata flush is triggered after changin status of each
single sector is cache line.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
After second dirty write to cache line which was already dirty, metadata flush
was not triggered. In case of dirty shutdown, this led to data corruption.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
It allows to modify and retrieve particular PP params event if it isn't active
and store values between cache stop and load.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding locks to assure partition list consistency. Partition
lists is typically modified under cacheline or hash bucket lock.
This is not sufficient synchronization as adding/removing cacheline
from partition list affects neighbouring cachelines state as well.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Since ocf requires loading cache with the same name as it was stopped, it should
also allow to read name from metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding synchronization around metadata collision segment pages.
This part of metadata is modified when cacheline is mapped/unmapped
and when dirty status changes.
Synchronization on page level is required on top of cacheline
and hash bucket locks to assure metadata flush always reads
consistent state when copying entire collision table memory
page.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
There is one RW lock per hash bucket. Write lock is required
to map cacheline, read lock is sufficient for traversing.
Hash bucket locks are always acquired under global metadata
read lock. This assures mutual exclusion with eviction and
management paths, where global metadata write lock is held.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Global free cacheline list is divided into a set of freelists, one
per execution context. When attempting to map addres to cache, first
the freelist for current execution context is considered (fast path).
If current execution context freelist is empty (fast path failure),
mapping function attempts to get freelist from other execution context
list (slow path).
The purpose of this change is improve concurrency in freelist access.
It is part of fine granularity metadata lock implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Fix problem introduced by increasing partition name size to 1024 bytes,
which effectively made superblock bigger than one page. Due to this
flushing superblock required more than one io, which in case of dirty
shutdown between these ios resulted in CRC missmatch and made cache
recovery impossible.
Moving parts metadata to separate sections makes superblock fitting
in one page, effectively solving described problem.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Metadata layout (seq / striping) is already encapsulated
in ocf_metadata_layout_iface, no need to double this logic
in freelist initialization.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Environment should provide calls for destroying primitives (i.e. env_mutex_destroy()) and OCF should call these functions in its cleanup paths.
Signed-off-by: Firas Medini <mdnfiras@yahoo.com>
Don't try to remove invalid cores
If valid cache metadata was read, but environment has changed (i.e. number of cache lines has changed) ocf (in error handling path) was trying to close cores which were not opened. It happened due to cores were marked in cache metadata as added, but any cache inserting operation didn't take place.
In this patch 'added' flag in cache metadata was replaced with more meaningful 'valid' - it is set if given core is stored in cache metadata. Moreover, new 'added' flag was added to core run-time metadata and it is set if given core is added to cache.
Modify ocf_metadata_hash_func to return consecutive (modulo @hash_table_entries)
values for consecutive @core_line_num. This way it is trivial to sort all
core lines within a single request according to their hash value. This kind
of sorting will be required to assure that future hash bucket metadata locks
are always acquired in fixed order, eliminating the risk of dead locks.
This change is part of fine granularity metadata lock implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Promotion policy is supposed to perform ALRU noise filtering by
eliminating one-hit wonders being added to cache and polluting it.
Signed-off-by: Jan Musial <jan.musial@intel.com>
ocf_request has always been first class citizen in OCF,
so lets place it along with another essential objects.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies code by allowing to express programmer intent
explicitly and helps to avoid missing return statements (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies cases when we want to call completion callback
and immediately return from void-returning function, by allowing
to explicitly express programmers intent. That way we can avoid
cases when return statement is missing by mistake (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Don't reassign value of cache without any previus use.
It produced warnings when analyzing with scanbuild.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Adapter can opt to take additional steps to securely allocate
memory used by OCF to store cache metadata. Typically this would
involve mlocking pages and zeroing memory before deallocation.
Memory allocated using secure_alloc is not expected to be zeroed
or physically continous.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Before async metadata we used to return 0 on flushing volatile
containers, this way we didn't have to special-case them.
This change brings back this behavior.
Signed-off-by: Jan Musial <jan.musial@intel.com>
NOTE: This is still not the real asynchronism. Metadata interfaces
are still not fully asynchronous.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
- Add cache trylock and read trylock functions.
- Introduce new error code -OCF_ERR_NO_LOCK.
- Change trylock functions in env to return this code in case of
lock contention.
[ENV CHANGES REQUIRED]
Following functions should return 0 on success or -OCF_ERR_NO_LOCK
in case of lock contention:
- env_mutex_trylock()
- env_rwsem_up_read_trylock()
- env_rwsem_up_write_trylock()
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
- Queue allocation is now separated from starting cache.
- Queue can be created and destroyed in runtime.
- All queue ops accept queue handle instead of queue id.
- Cache stores queues as list instead of array.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This looks like during change from DIV_ROUND_UP to OCF_DIV_ROUND_UP
this one occurrence has been missed.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>