Cleaning policy callbacks are typically called with hash buckets or
cache lines locked. However cleaning policies maintain structures
which are shared across multiple cache lines (e.g. ALRU list).
Additional synchronization is required for theese structures to
maintain integrity.
ACP already implements hash bucket locks. This patch is adding
ALRU list mutex.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Adding locks to assure partition list consistency. Partition
lists is typically modified under cacheline or hash bucket lock.
This is not sufficient synchronization as adding/removing cacheline
from partition list affects neighbouring cachelines state as well.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Since ocf requires loading cache with the same name as it was stopped, it should
also allow to read name from metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
This change allows to check if specified cache name name is unique. To prevent
adding cache instance with the same name, context lock is acquired until name
isn't set.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adding synchronization around metadata collision segment pages.
This part of metadata is modified when cacheline is mapped/unmapped
and when dirty status changes.
Synchronization on page level is required on top of cacheline
and hash bucket locks to assure metadata flush always reads
consistent state when copying entire collision table memory
page.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Modifying _ocf_mngt_io_class_configure and _ocf_mngt_io_class_remove
to never return -OCF_ERR_IO_CLASS_NOT_EXIST error code. This
return code was ignored by the caller anyway. In _ocf_mngt_io_class_remove
-OCF_ERR_IO_CLASS_NOT_EXIST indicated the IO class is already
removed, which is not an error. In _ocf_mngt_io_class_configure
-OCF_ERR_IO_CLASS_NOT_EXIST indicated empty IO class name, which
is actualy invalid input. This change made it possible to remove
erroneous error handling for -OCF_ERR_IO_CLASS_NOT_EXIST case in
ocf_mngt_cache_io_classes_configure.
This change fixes IO class configuration.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
The ocf_async_lock may be used in atomic context, thus we need
to replace synchronization primitives to non-sleeping variants.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Hash bucket read/write lock is sufficient to safely attempt
cacheline trylock/lock. This change removes cacheline lock
global RW semaprhore and moves cacheline trylock/lock under
hash bucket read/write lock respectively.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This temporarily increases amount of boiler-plate code, but
this is going to be mitigated in the following commits.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
There is one RW lock per hash bucket. Write lock is required
to map cacheline, read lock is sufficient for traversing.
Hash bucket locks are always acquired under global metadata
read lock. This assures mutual exclusion with eviction and
management paths, where global metadata write lock is held.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Core is not assigned to request in cleaner, so to increase it's stats it has to
be retrieved from mapping.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Inactive core stats should be caluculated and returned to adapter in unified
from, just like all stats are.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since stats builder is implemented for retrieving cache, core and ioclass stats,
adapters should use it instead.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Global free cacheline list is divided into a set of freelists, one
per execution context. When attempting to map addres to cache, first
the freelist for current execution context is considered (fast path).
If current execution context freelist is empty (fast path failure),
mapping function attempts to get freelist from other execution context
list (slow path).
The purpose of this change is improve concurrency in freelist access.
It is part of fine granularity metadata lock implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Fix problem introduced by increasing partition name size to 1024 bytes,
which effectively made superblock bigger than one page. Due to this
flushing superblock required more than one io, which in case of dirty
shutdown between these ios resulted in CRC missmatch and made cache
recovery impossible.
Moving parts metadata to separate sections makes superblock fitting
in one page, effectively solving described problem.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This change refactors the code in order to prepare for removing
global concurrency lock, which won't be needed after per-bucket
metadata locking is in place.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Metadata layout (seq / striping) is already encapsulated
in ocf_metadata_layout_iface, no need to double this logic
in freelist initialization.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Environment should provide calls for destroying primitives (i.e. env_mutex_destroy()) and OCF should call these functions in its cleanup paths.
Signed-off-by: Firas Medini <mdnfiras@yahoo.com>
Don't try to remove invalid cores
If valid cache metadata was read, but environment has changed (i.e. number of cache lines has changed) ocf (in error handling path) was trying to close cores which were not opened. It happened due to cores were marked in cache metadata as added, but any cache inserting operation didn't take place.
In this patch 'added' flag in cache metadata was replaced with more meaningful 'valid' - it is set if given core is stored in cache metadata. Moreover, new 'added' flag was added to core run-time metadata and it is set if given core is added to cache.
If cleaning policy didn't have init() function,
'_ocf_mngt_init_instance_load_complete' returned early and promotion policy
wasn't initialized at all.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Adapter can provide ops for data type deinitialization, if any additional
operations are required on deinitialization procedure.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Modify ocf_metadata_hash_func to return consecutive (modulo @hash_table_entries)
values for consecutive @core_line_num. This way it is trivial to sort all
core lines within a single request according to their hash value. This kind
of sorting will be required to assure that future hash bucket metadata locks
are always acquired in fixed order, eliminating the risk of dead locks.
This change is part of fine granularity metadata lock implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Promotion policy is supposed to perform ALRU noise filtering by
eliminating one-hit wonders being added to cache and polluting it.
Signed-off-by: Jan Musial <jan.musial@intel.com>
When cache is detached we cannot assume there is a management
queue created. This change introduces simplified cache stop
path, performing all the necessary deinit without using
IO queues.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This reverts commit 5ad5c521df.
This change broke setting IO-classes with allocation. We use max as a
special value to indicate that the partition should use cache global
caching mode.
WO cache mode should not repartition cachelines nor affect cacheline
status in any way when servicing read. Reading data from the cache
is just an internal optimization. Also WO cache mode is designed to
be used with partitioning based on write life-time hints and read
requests do not carry write lifetime hint by definition.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Refactoring ocf_submit_cache_reqs to make it clear that
req->map is accessed at index derived from offset argument,
not necesarily starting at 0.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Static code analyzers fail to understand that this variable
is always assigned to before usage.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Write-only cache mode is similar to writeback, however read
operations do not promote data to cache. Reads are mostly serviced
by the core device, only dirty sectors are fetched from the cache.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
ocf_request has always been first class citizen in OCF,
so lets place it along with another essential objects.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
ocf_kick_cleaner() allows to perfom cleaning immediately.
Nop cleaning policy now returns new 'OCF_CLEANER_DISABLE' macro which indicates
that cleaing shouldn't be performed. To enable it back, ocf_kick_cleaner()
should be called.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
If cfg->core_id is OCF_CORE_MAX and that core matches the UUID of
existing not opened one, then set cfg->core_id to id of that core.
This is useful when loading cache from metadata: if user does not store
the ids of cores but relies on OCF to assign them, there is no need to
not reassign them on load.
Previus behaviur when cfg->core_id != id of core with matching UUID is
maintained.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Passing int constant directly to OCF_PL_NEXT_ON_SUCCESS_RET() macro caused
following compilation error (on GCC 7.4.0):
src/ocf/mngt/ocf_mngt_core.c:599:33: error: ?:
using integer constants in boolean context [-Werror=int-in-bool-context]
error ? -OCF_ERR_WRITE_CACHE : 0);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
src/ocf/mngt/../utils/utils_pipeline.h:145:6: note:
in definition of macro ‘OCF_PL_NEXT_ON_SUCCESS_RET’
if (error) \
^~~~~
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
As non-interruptible flushes are no longer triggered from OCF
internals, we can get rid of "interruption" argument and let
adapters handle it themselves.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies code by allowing to express programmer intent
explicitly and helps to avoid missing return statements (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This simplifies cases when we want to call completion callback
and immediately return from void-returning function, by allowing
to explicitly express programmers intent. That way we can avoid
cases when return statement is missing by mistake (this patch
fixes at least one bug related to this).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
_ocf_mngt_cache_unplug context is now provided by the caller.
This way _ocf_mngt_cache_unplug returns only non-critical (cache write)
errors, allowing stop/detach operation to always proceed and optionally
finish with error. This eliminates the need for rolling back previous
stop/detach operations, which might turn out to be impossible e.g.
under memory pressure.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Don't reassign value of cache without any previus use.
It produced warnings when analyzing with scanbuild.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Adapter can opt to take additional steps to securely allocate
memory used by OCF to store cache metadata. Typically this would
involve mlocking pages and zeroing memory before deallocation.
Memory allocated using secure_alloc is not expected to be zeroed
or physically continous.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Volume close should not close underlying device until all
I/O targeting this volume are deallocated. To achieve this
a reference counter is added to volume. Counter value
matches number of I/O objects associated with volume. Counter
is freezed when volume is closed, blocking allocation of new
I/O objects.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This is useful when reference counter is initialized in non-zeroed
memory (or assuming atomic variable is not properly initialized by
memseting to zero).
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Due the aggresive security checks in compiler 'printf' might be substituded with
'__printf_chk'. However it does not differentiate whether substituted string is
library function call whether field in structure.
By renaming field we prevent it to be unintentionally subustituted by the
preprocessor.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Errors not related to cache disk I/O failure should force
cache stop to return with error without deinitializing cache
instance.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Before async metadata we used to return 0 on flushing volatile
containers, this way we didn't have to special-case them.
This change brings back this behavior.
Signed-off-by: Jan Musial <jan.musial@intel.com>
Core volume I/O must not be queued on management queue - this would
break I/O accounting code, resulting in use-after-free type of errors
after cache detach, core remove etc.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
There is a risk that without uuid.data pointer denitialization it will
be freed during original volume deinit, zeroing uuid.data pointer
prevents that.
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
This pointer is used to provide optional volume specific data
from the user down to bottom volume open callback. volume_params
is provided to OCF in ocf_mngt_cache_device_config.volume_params.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
In order to synchronize management operations with I/O OCF
maintains in-flight request counters. For example such ref
counters are used during ocf_mngt_detach to drain requests
accessing cache metadata (cache requests counter) and in
ocf_mngt_flush where we wait for outstanding requests sent
in write back mode (dirty requests counter).
Typically I/O threads increment cache/dirty counter when
creating request and decrement counter on request completion.
Management thread sets atomic variable to signal the start of
management operation. I/O threads react to this by changing
I/O requests mode so that the cache/dirty reference counter
is not incremented. As a result reference counter keeps getting
decremented. Management thread waits for the counter to drop to 0
and proceeds with management operation with assumption that no
cache/dirty requests are in progress.
This patch introduces a handy utility for requests reference
counting logic. ocf_refcnt_inc / dec are used to increment/
decrement counter. ocf_refcnt_freeze() makes subsequent
ocf_refcnt_inc() calls to return false, indicating that counter
cannot be incremented at this moment. ocf_refcnt_register_zero_cb
can be used to asynchronously wait for counter to drop to 0.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Cache is used during pipeline_destroy which means that
doing put_cache before destroying pipeline may result in
accessing memory that was freed.
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
For flush/purge entry points to be fully asynchronous we still
need to rework flush mutex and waiting for outstanding dirty
requests.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Send cleaner IOs with appropriate queue set
This solves the issue of bottom adapter getting NULL in io->io_queue
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
This allows to reuse same step functions giving them different parameters
on each step.
Additionally move pipeline to utils, to make it accessible to other
subsystems of OCF (e.g. metadata).
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
NOTE: This is still not the real asynchronism. Metadata interfaces
are still not fully asynchronous.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
NOTE: This patch only changes API that pretends to be asynchronous.
Most of management operations are still performed synchronously.
The real asynchronism will be introduced in the next patches.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Instead of calling flush separatly for each IO class, it is called after
collecting number of dirty cache lines defined by user or after iterating
through all IO classes.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Unlocking cache and putting queue are perormed in cleaning completion, so all
cleaning policies has to call completion.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
PyOCF is a tool written with testing OCF functionality in mind.
It is a Python3 (3.6 version required) package which wraps OCF
by providing Python objects in place of OCF objects (volumes, queues,
etc). Thin layer of translation between OCF objects and PyOCF objects
enables using customized behaviors for OCF primitives by subclassing
PyOCF classes.
This initial version implements only WT and WI modes and single,
synchronously operating Queue.
TO DO:
- Queues/Cleaner/MetadataUpdater implemented as Python threads
- Loading of caches from PyOCF Volumes (fix bugs in OCF)
- Make sure it works multi-threaded for more sophisticated tests
Co-authored-by: Jan Musial <jan.musial@intel.com>
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
Signed-off-by: Jan Musial <jan.musial@intel.com>
- Add cache trylock and read trylock functions.
- Introduce new error code -OCF_ERR_NO_LOCK.
- Change trylock functions in env to return this code in case of
lock contention.
[ENV CHANGES REQUIRED]
Following functions should return 0 on success or -OCF_ERR_NO_LOCK
in case of lock contention:
- env_mutex_trylock()
- env_rwsem_up_read_trylock()
- env_rwsem_up_write_trylock()
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>