Divide the single global lock instance into 4 to reduce contention
in scenarios with multiple concurrent read locks.
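A minimal sketch of the approach, using pthread in place of the OCF env
primitives (names here are illustrative, not the actual OCF symbols):
readers spread across lock instances, while a writer must take all of them.

    #include <pthread.h>

    #define MD_LOCK_INSTANCES 4 /* number of global lock instances */

    /* Illustrative striped global lock; real OCF uses its env rwsem. */
    static pthread_rwlock_t md_lock[MD_LOCK_INSTANCES] = {
        PTHREAD_RWLOCK_INITIALIZER, PTHREAD_RWLOCK_INITIALIZER,
        PTHREAD_RWLOCK_INITIALIZER, PTHREAD_RWLOCK_INITIALIZER
    };

    /* Readers pick one instance (e.g. by CPU id) so concurrent readers
     * do not all bounce a single lock's cacheline. */
    static void md_lock_rd(unsigned cpu)
    {
        pthread_rwlock_rdlock(&md_lock[cpu % MD_LOCK_INSTANCES]);
    }

    /* A writer acquires every instance for full exclusion. */
    static void md_lock_wr(void)
    {
        for (int i = 0; i < MD_LOCK_INSTANCES; i++)
            pthread_rwlock_wrlock(&md_lock[i]);
    }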
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
1. New abbreviated prefix: ocf_hb (HB stands for hash bucket).
2. Clear distinction between functions requiring the caller to
hold the shared global metadata lock ("naked") and the ones
which acquire the global lock on their own ("prot" for protected).
3. Clear distinction between hash bucket locking functions
accepting a hash bucket id ("id"), a core line and LBA ("cline"),
and an entire request ("req").
Resulting naming scheme:
ocf_hb_(id/cline/req)_(prot/naked)_(lock/unlock/trylock)_(rd/wr)
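For illustration, some names this scheme produces (sketched from the
description above, not copied from the code):

    ocf_hb_id_naked_lock_wr     - lock bucket by hash id; global lock held by caller
    ocf_hb_cline_naked_lock_rd  - lock bucket for a given core line and LBA
    ocf_hb_req_prot_lock_rd     - lock buckets for a request, taking the global lock itself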
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
If there is any dirty data on the cache associated with the removed core,
we must flush collision metadata after removing the core to make metadata
persistent in case of a dirty shutdown.
This fixes the problem where the recovery procedure erroneously interprets
cache lines that belonged to the removed core as valid ones.
This also fixes the problem where, after removing a core containing dirty
data, another core is added, and then the recovery procedure following a dirty
shutdown assigns cache lines from the removed core to the new one, effectively
leading to data corruption.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Min and max values, kept as an explicit number of cachelines, are tightly
coupled with a particular cache. This might lead to errors and mismatches after
reattaching a cache of a different size.
To prevent those errors, min and max should be calculated dynamically.
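A sketch of the intended direction (the helper name and the percentage
representation are hypothetical): keep the limits as a fraction of the
cache size and convert to cachelines on demand, so they remain valid
after reattach.

    #include <stdint.h>

    /* Hypothetical helper: derive a cacheline limit from the current
     * cache size instead of storing an absolute cacheline count. */
    static uint64_t limit_in_cachelines(uint64_t cache_size_clines,
            uint8_t percent)
    {
        return cache_size_clines * percent / 100;
    }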
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Load properties before checking memory needs and obtain the cache line size
from the context rather than from the cache state.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Rather than passing whole structs, supply
_ocf_mngt_calculate_ram_needed() with just the values it actually uses.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Fail the `ocf_mngt_cache_load` function with the `OCF_ERR_INVAL`
error code when the force flag is in use.
Log an error message.
Closes #361
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
There is no need to constantly hold the metadata global lock
while collecting cachelines to clean. Since we are past
freezing the dirty request counter, we know for sure that the
number of dirty cache lines will not increase. So the worst
case is that when the loop relaxes and releases the lock,
a concurrent I/O to CAS is serviced in WT mode, possibly
inserting and/or evicting cachelines. This does not interfere
with scanning for dirty cachelines, and the lower layer will
handle synchronization with concurrent I/O by acquiring
an asynchronous read lock on each cleaned cacheline.
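A sketch of the relax pattern (illustrative, with pthread standing in for
the OCF env lock): the shared lock is dropped and re-acquired periodically
during the scan, letting concurrent I/O through.

    #include <pthread.h>
    #include <stdint.h>

    #define RELAX_PERIOD 1024 /* illustrative relax interval */

    /* Scan for dirty cachelines, relaxing the shared metadata lock
     * every RELAX_PERIOD entries so concurrent WT I/O can proceed. */
    void collect_dirty(pthread_rwlock_t *md_lock, uint64_t n_clines,
            int (*is_dirty)(uint64_t), void (*push)(uint64_t))
    {
        pthread_rwlock_rdlock(md_lock);
        for (uint64_t i = 0; i < n_clines; i++) {
            if (is_dirty(i))
                push(i);
            if ((i + 1) % RELAX_PERIOD == 0) {
                pthread_rwlock_unlock(md_lock);
                /* concurrent I/O may be serviced here */
                pthread_rwlock_rdlock(md_lock);
            }
        }
        pthread_rwlock_unlock(md_lock);
    }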
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
In the current implementation, in case of fast media, a flushing
container may starve all concurrently flushing containers
due to continuous rescheduling of the offending requests to the
front of the I/O queue. Pushing requests to the back of the I/O
queue ensures FIFO handling and removes the possibility of
starvation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
The lower layer is prepared to handle used cachelines by
acquiring an asynchronous read lock. It is very likely that
by the time the cacheline is actually cleaned, its lock
state changes. So checking the lock at the moment of
constructing the dirty cachelines list makes little sense.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Moving _ocf_mngt_flush_containers outside the global metadata
critical section. All this function does is sort core lines
and add a queue request.
This fixes stalls reported by the Linux scheduler due to
IO threads waiting on the global metadata RW semaphore for
several minutes.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This way debug prints during the metadata init phase won't cause a crash
(the temporary cache object does not have a proper ctx set, hence it
does not have a logger object).
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
Change 'OCF_ERR_START_CACHE_FAIL' to 'OCF_ERR_NO_MEM' when CAS fails due to lack of memory on the device.
Add a new error code for the case when the device doesn't satisfy CAS requirements: 'OCF_ERR_INVAL_CACHE_DEV'.
Use 'OCF_ERR_INVAL_CACHE_DEV' in the code.
Update the error code match in the test.
Closes #317
Signed-off-by: Ostrokrzew <slawomir.jankowski@intel.com>
To eliminate the possibility of an allocation error in cache stop, the pipeline is
allocated on attach.
Due to this change, the only possible non-zero status of ocf_mngt_cache_stop() is
just a warning, and the cache is always stopped after executing it.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since we do not expect that incrementing the cache's reference counter
during cache init will fail under any condition, it can be changed
to an assert instead of error handling.
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
pointer type to an array which is 32 characters long;
**core.py**: Add missing import and modify class' field type
to keep consistency;
**ocf_mngt_core**: Remove local variable 'name';
remove env_vmalloc for 'name' - it is no longer needed;
remove initialization of 'name' - as above;
remove env_vfree for context->cfg.name - the variable is no longer
allocated in memory;
check if cfg->name exists;
change label in goto from the deleted err_name to the closest err_pipeline.
Signed-off-by: Slawomir_Jankowski <slawomir.jankowski@intel.com>
It allows modifying and retrieving particular PP params even if it isn't active,
and stores values between cache stop and load.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Cache lock waiters hold the cache refcount. Because of that,
if there were some waiters, deinitialization of the cache
lock on the last put never happened, and putting the
cache was effectively impossible.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This change allows checking whether the specified cache name is unique. To prevent
adding a cache instance with the same name, the context lock is acquired until the
name is set.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding synchronization around metadata collision segment pages.
This part of metadata is modified when a cacheline is mapped/unmapped
and when its dirty status changes.
Synchronization on the page level is required on top of cacheline
and hash bucket locks to assure that metadata flush always reads a
consistent state when copying an entire collision table memory
page.
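A sketch of the layering (illustrative, not the actual OCF structures):
updaters, already serialized per entry by cacheline and hash bucket locks,
take the page lock shared; flush takes it exclusive to copy a consistent
snapshot of the whole page.

    #include <pthread.h>
    #include <string.h>
    #include <stdint.h>

    #define MD_PAGE_SIZE 4096

    struct collision_page {
        pthread_rwlock_t lock;      /* page-level lock */
        uint8_t data[MD_PAGE_SIZE];
    };

    /* Update path: entry-level exclusion comes from cacheline/hash
     * bucket locks; the shared page lock only fences off flush. */
    void entry_update(struct collision_page *pg, size_t off,
            const void *entry, size_t len)
    {
        pthread_rwlock_rdlock(&pg->lock);
        memcpy(pg->data + off, entry, len);
        pthread_rwlock_unlock(&pg->lock);
    }

    /* Flush path: the exclusive lock guarantees the copied page is
     * internally consistent. */
    void page_snapshot(struct collision_page *pg, uint8_t *dst)
    {
        pthread_rwlock_wrlock(&pg->lock);
        memcpy(dst, pg->data, MD_PAGE_SIZE);
        pthread_rwlock_unlock(&pg->lock);
    }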
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Modifying _ocf_mngt_io_class_configure and _ocf_mngt_io_class_remove
to never return the -OCF_ERR_IO_CLASS_NOT_EXIST error code. This
return code was ignored by the caller anyway. In _ocf_mngt_io_class_remove,
-OCF_ERR_IO_CLASS_NOT_EXIST indicated that the IO class was already
removed, which is not an error. In _ocf_mngt_io_class_configure,
-OCF_ERR_IO_CLASS_NOT_EXIST indicated an empty IO class name, which
is actually invalid input. This change makes it possible to remove
the erroneous error handling for the -OCF_ERR_IO_CLASS_NOT_EXIST case in
ocf_mngt_cache_io_classes_configure.
This change fixes IO class configuration.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
There is one RW lock per hash bucket. A write lock is required
to map a cacheline; a read lock is sufficient for traversing.
Hash bucket locks are always acquired under the global metadata
read lock. This assures mutual exclusion with the eviction and
management paths, where the global metadata write lock is held.
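Sketch of the resulting lock ordering (illustrative names, with pthread in
place of the env locks):

    #include <pthread.h>
    #include <stdint.h>

    #define HASH_BUCKETS 4096 /* illustrative bucket count */

    struct md_locks {
        pthread_rwlock_t global;               /* global metadata lock */
        pthread_rwlock_t bucket[HASH_BUCKETS]; /* one RW lock per hash bucket */
    };

    /* Mapping a cacheline: global shared, then bucket exclusive.
     * Unlock happens in reverse order. */
    void lock_for_map(struct md_locks *l, uint32_t hash)
    {
        pthread_rwlock_rdlock(&l->global);
        pthread_rwlock_wrlock(&l->bucket[hash % HASH_BUCKETS]);
    }

    /* Traversing: global shared plus bucket shared is sufficient. */
    void lock_for_traverse(struct md_locks *l, uint32_t hash)
    {
        pthread_rwlock_rdlock(&l->global);
        pthread_rwlock_rdlock(&l->bucket[hash % HASH_BUCKETS]);
    }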
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
The global free cacheline list is divided into a set of freelists, one
per execution context. When attempting to map an address to cache, first
the freelist for the current execution context is considered (fast path).
If the current execution context's freelist is empty (fast path failure),
the mapping function attempts to get a freelist from another execution
context's list (slow path).
The purpose of this change is to improve concurrency in freelist access.
It is part of the fine granularity metadata lock implementation.
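A minimal sketch of the fast/slow path (illustrative; per-list locking
omitted for brevity):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>

    #define N_CTX 4       /* number of execution contexts, illustrative */
    #define PER_CTX 1024  /* capacity per freelist, illustrative */

    struct freelist {
        uint64_t lines[PER_CTX];
        size_t count;
    };

    static struct freelist freelists[N_CTX];

    static bool fl_pop(struct freelist *fl, uint64_t *out)
    {
        if (!fl->count)
            return false;
        *out = fl->lines[--fl->count];
        return true;
    }

    bool get_free_cacheline(unsigned ctx, uint64_t *cline)
    {
        if (fl_pop(&freelists[ctx], cline))    /* fast path */
            return true;
        for (unsigned i = 1; i < N_CTX; i++)   /* slow path: steal */
            if (fl_pop(&freelists[(ctx + i) % N_CTX], cline))
                return true;
        return false; /* all freelists empty: caller falls back to eviction */
    }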
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Fix a problem introduced by increasing the partition name size to 1024 bytes,
which effectively made the superblock bigger than one page. Due to this,
flushing the superblock required more than one io, which in case of a dirty
shutdown between these ios resulted in a CRC mismatch and made cache
recovery impossible.
Moving parts of the metadata to separate sections makes the superblock fit
in one page, effectively solving the described problem.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
The environment should provide calls for destroying primitives (e.g. env_mutex_destroy()) and OCF should call these functions in its cleanup paths.
Signed-off-by: Firas Medini <mdnfiras@yahoo.com>
Don't try to remove invalid cores
If valid cache metadata was read, but the environment has changed (e.g. the number of cache lines has changed), OCF (in the error handling path) was trying to close cores which were not opened. This happened because cores were marked in cache metadata as added, while no cache insert operation actually took place.
In this patch the 'added' flag in cache metadata is replaced with the more meaningful 'valid' - it is set if a given core is stored in cache metadata. Moreover, a new 'added' flag is added to core run-time metadata, and it is set if a given core is added to the cache.
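Sketch of the resulting split (illustrative structs, not the actual OCF
metadata layout):

    #include <stdbool.h>

    struct core_meta_config {
        bool valid;  /* set if the core is stored in cache metadata */
        /* ... */
    };

    struct core_meta_runtime {
        bool added;  /* set if the core is added to the running cache */
        /* ... */
    };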
If the cleaning policy didn't have an init() function,
'_ocf_mngt_init_instance_load_complete' returned early and the promotion policy
wasn't initialized at all.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Promotion policy is supposed to perform ALRU noise filtering by
preventing one-hit wonders from being added to the cache and polluting it.
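A minimal sketch of such a filter (hypothetical threshold logic in the
spirit of an NHIT-style policy):

    #include <stdbool.h>
    #include <stdint.h>

    #define PROMOTE_AFTER_HITS 3 /* hypothetical threshold */

    /* Count requests per address; promote to cache only after the
     * address proves it is not a one-hit wonder. */
    static bool should_promote(uint32_t *hit_count)
    {
        if (++(*hit_count) >= PROMOTE_AFTER_HITS)
            return true;
        return false; /* filtered out as noise for now */
    }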
Signed-off-by: Jan Musial <jan.musial@intel.com>
When the cache is detached, we cannot assume a management
queue has been created. This change introduces a simplified cache stop
path, performing all the necessary deinit without using
IO queues.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This reverts commit 5ad5c521df.
This change broke setting IO classes with allocation. We use max as a
special value to indicate that the partition should use the cache's global
caching mode.
Write-only cache mode is similar to writeback; however, read
operations do not promote data to cache. Reads are mostly serviced
by the core device; only dirty sectors are fetched from the cache.
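Illustrative per-sector source selection for WO reads (not the actual OCF
code):

    #include <stdbool.h>

    enum data_src { SRC_CORE, SRC_CACHE };

    /* In WO mode only dirty sectors must be read from cache; clean or
     * unmapped sectors come from core, and the read promotes nothing. */
    static enum data_src wo_read_source(bool mapped, bool dirty)
    {
        return (mapped && dirty) ? SRC_CACHE : SRC_CORE;
    }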
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
ocf_request has always been a first-class citizen in OCF,
so let's place it along with other essential objects.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
ocf_kick_cleaner() allows performing cleaning immediately.
The nop cleaning policy now returns the new 'OCF_CLEANER_DISABLE' macro, which
indicates that cleaning shouldn't be performed. To enable it back,
ocf_kick_cleaner() should be called.
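A sketch of the contract (the macro's actual value is an assumption here):
the cleaner callback returns the time until its next run, or
OCF_CLEANER_DISABLE to stay idle until kicked.

    #include <stdint.h>

    #define OCF_CLEANER_DISABLE ((uint32_t)-1) /* assumed sentinel value */

    /* Nop policy: report nothing to clean and disable rescheduling;
     * ocf_kick_cleaner() re-enables the cleaner later. */
    static uint32_t nop_cleaner_perform(void)
    {
        return OCF_CLEANER_DISABLE;
    }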
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
If cfg->core_id is OCF_CORE_MAX and that core matches the UUID of
an existing, not opened one, then set cfg->core_id to the id of that core.
This is useful when loading a cache from metadata: if the user does not store
the ids of cores but relies on OCF to assign them, there is no need for the
user to pass them explicitly on load.
The previous behavior for the case when cfg->core_id != id of the core with
a matching UUID is maintained.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Passing an int constant directly to the OCF_PL_NEXT_ON_SUCCESS_RET() macro caused
the following compilation error (on GCC 7.4.0):
src/ocf/mngt/ocf_mngt_core.c:599:33: error: ?:
using integer constants in boolean context [-Werror=int-in-bool-context]
error ? -OCF_ERR_WRITE_CACHE : 0);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
src/ocf/mngt/../utils/utils_pipeline.h:145:6: note:
in definition of macro ‘OCF_PL_NEXT_ON_SUCCESS_RET’
if (error) \
^~~~~
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
As non-interruptible flushes are no longer triggered from OCF
internals, we can get rid of "interruption" argument and let
adapters handle it themselves.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies the code by allowing programmer intent to be expressed
explicitly, and helps to avoid missing return statements (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies cases when we want to call a completion callback
and immediately return from a void-returning function, by allowing
the programmer's intent to be expressed explicitly. That way we can avoid
cases when a return statement is missing by mistake (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
_ocf_mngt_cache_unplug context is now provided by the caller.
This way _ocf_mngt_cache_unplug returns only non-critical (cache write)
errors, allowing stop/detach operation to always proceed and optionally
finish with error. This eliminates the need for rolling back previous
stop/detach operations, which might turn out to be impossible e.g.
under memory pressure.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Errors not related to cache disk I/O failure should force
cache stop to return with error without deinitializing cache
instance.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This pointer is used to provide optional volume specific data
from the user down to bottom volume open callback. volume_params
is provided to OCF in ocf_mngt_cache_device_config.volume_params.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
The cache is used during pipeline_destroy, which means that
doing put_cache before destroying the pipeline may result in
accessing memory that was freed.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
For flush/purge entry points to be fully asynchronous, we still
need to rework the flush mutex and the waiting for outstanding dirty
requests.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This allows reusing the same step functions by giving them different
parameters at each step.
Additionally, move the pipeline to utils to make it accessible to other
subsystems of OCF (e.g. metadata).
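A minimal self-contained sketch of the mechanism (illustrative, not the
actual utils_pipeline API): each step descriptor carries its own argument,
so one function can serve several steps.

    #include <stdio.h>

    struct pl_step {
        void (*fn)(void *arg);
        void *arg; /* per-step parameter */
    };

    static void print_stage(void *arg)
    {
        printf("stage: %s\n", (const char *)arg);
    }

    int main(void)
    {
        /* The same step function reused with different parameters. */
        struct pl_step steps[] = {
            { print_stage, "attach" },
            { print_stage, "load" },
        };
        for (size_t i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
            steps[i].fn(steps[i].arg);
        return 0;
    }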
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
NOTE: This is still not the real asynchronism. Metadata interfaces
are still not fully asynchronous.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
NOTE: This patch only changes the API to appear asynchronous.
Most management operations are still performed synchronously.
The real asynchronism will be introduced in the next patches.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
PyOCF is a tool written with testing OCF functionality in mind.
It is a Python 3 (version 3.6 required) package which wraps OCF
by providing Python objects in place of OCF objects (volumes, queues,
etc). A thin layer of translation between OCF objects and PyOCF objects
enables customized behaviors for OCF primitives by subclassing
PyOCF classes.
This initial version implements only WT and WI modes and a single,
synchronously operating Queue.
TO DO:
- Queues/Cleaner/MetadataUpdater implemented as Python threads
- Loading of caches from PyOCF Volumes (fix bugs in OCF)
- Make sure it works multi-threaded for more sophisticated tests
Co-authored-by: Jan Musial <jan.musial@intel.com>
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
Signed-off-by: Jan Musial <jan.musial@intel.com>
- Add cache trylock and read trylock functions.
- Introduce new error code -OCF_ERR_NO_LOCK.
- Change trylock functions in env to return this code in case of
lock contention.
[ENV CHANGES REQUIRED]
The following functions should return 0 on success or -OCF_ERR_NO_LOCK
in case of lock contention:
- env_mutex_trylock()
- env_rwsem_down_read_trylock()
- env_rwsem_down_write_trylock()
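An illustrative caller pattern for the new trylock (header names assumed
to be the standard OCF public headers):

    #include "ocf/ocf.h"
    #include "ocf/ocf_mngt.h"
    #include <stdbool.h>

    static bool try_manage(ocf_cache_t cache)
    {
        int ret = ocf_mngt_cache_trylock(cache);
        if (ret == -OCF_ERR_NO_LOCK)
            return false; /* contended: back off and retry later */
        if (ret)
            return false; /* other error */
        /* ... management operation under the cache lock ... */
        ocf_mngt_cache_unlock(cache);
        return true;
    }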
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
- Queue allocation is now separated from starting cache.
- Queue can be created and destroyed in runtime.
- All queue ops accept queue handle instead of queue id.
- Cache stores queues as list instead of array.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Instead of switching the write policy to pass-through, the barrier is raised
by incrementing a counter in the ocf_cache_t structure.
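A sketch of the counter-based barrier (illustrative): while any flush holds
the barrier, the I/O path behaves as if in pass-through, without actually
changing the configured cache mode.

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Illustrative flush barrier: raised by incrementing a counter
     * instead of switching the cache mode to pass-through. */
    static atomic_int flush_barrier;

    void barrier_raise(void) { atomic_fetch_add(&flush_barrier, 1); }
    void barrier_drop(void)  { atomic_fetch_sub(&flush_barrier, 1); }

    bool must_bypass_cache(void)
    {
        /* While any flush holds the barrier, writes bypass the cache. */
        return atomic_load(&flush_barrier) > 0;
    }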
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Replace assembler code with a C equivalent and slightly simplify
the logic searching for the first free core.
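An illustrative portable replacement for the kind of bit scan the assembler
performed (not the actual OCF code):

    #include <stdint.h>

    /* Return the index of the first zero bit in 'word' (e.g. a bitmap
     * of used core ids), or 64 if every bit is set. */
    static unsigned first_zero_bit(uint64_t word)
    {
        for (unsigned i = 0; i < 64; i++)
            if (!(word & ((uint64_t)1 << i)))
                return i;
        return 64;
    }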
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>