Because of metadata flapping, capturing those sections in flight in standby
mode is much more complicated, so we read them directly from the cache
volume during activation.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Move the error print to where it belongs, preventing this message from
popping up when the same error code is reported elsewhere for another reason.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
To allow the fastest possible switch from passive-standby to active mode, the
runtime metadata must be kept 100% in sync with the metadata on the drive and
in RAM, so recovery is required after each collision section update.
To avoid a long-lasting recovery of all the cachelines each time the collision
section is updated, the passive update procedure recovers only those
cachelines that have their metadata entries on the updated pages.
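A minimal sketch of the page-to-cacheline mapping, assuming collision entries
are packed contiguously with a fixed number of entries per page. The names
`cline_range`, `updated_pages_to_clines` and `entries_per_page` are
hypothetical and are not part of the OCF API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical half-open range of cacheline indices: [first, end). */
struct cline_range {
	uint64_t first;
	uint64_t end;
};

/*
 * Map a run of updated collision pages to the cachelines whose metadata
 * entries live on those pages. Assumes (for illustration only) that the
 * entries are packed contiguously, entries_per_page per page, from page 0.
 */
static struct cline_range updated_pages_to_clines(uint64_t first_page,
		uint64_t page_count, uint64_t entries_per_page,
		uint64_t total_clines)
{
	struct cline_range r;

	assert(entries_per_page > 0);

	r.first = first_page * entries_per_page;
	r.end = (first_page + page_count) * entries_per_page;

	/* The last page may be only partially used; clamp to the cache size. */
	if (r.first > total_clines)
		r.first = total_clines;
	if (r.end > total_clines)
		r.end = total_clines;

	return r;
}
```

Only the cachelines inside the returned range would then be recovered, which
keeps the cost proportional to the size of the update rather than to the size
of the cache.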
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Starting a cache in standby mode requires access to a valid cleaning policy
type. If the policy is stored only in the superblock, it may be overwritten by
one of the passive metadata updates.
To prevent losing this information, it should be stored in the cache's runtime
metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Initializing the cleaning policy is very time consuming. To reduce the time
required for activating a cache instance, the initialization should be done
during passive start.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since part of the recovery is done during `standby init`, the correct shutdown
status has to be set.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The unsafe mode is useful if the metadata of added cores is incomplete.
Such a scenario is possible when starting a cache in standby mode from
partially valid metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Make sure all the invalid cachelines have their status bits reset. This makes
it easy to recognize invalid cachelines during populate.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Recovery during passive start is based on the assumption that the metadata
collision section stored on disk might be partially valid. Resetting this data
would make rebuilding the metadata impossible.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Decide whether to promote sequential cutoff stream
to global structures when threshold is reached
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Add an error for an invalid cache operation while in passive mode
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Error name correction
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
API changes for passive cache mode
Moved the passive cache error return source to the API for flush and
set_param
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Further API changes for passive cache mode
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Passive API - review changes
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Cache name is needed for logging in passive mode, when config metadata
is still not accessible.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
To prevent deinitializing the cleaner context (e.g. while switching the policy)
while requests are being processed, access to the cleaner should be protected
with a reference counter.
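A minimal sketch of such a guard using C11 atomics. The `demo_refcnt_*` names
are hypothetical and only illustrate the get/put/freeze pattern; they are not
the OCF refcnt utility.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter guarding a cleaner context. */
struct demo_refcnt {
	atomic_int counter;  /* requests currently using the cleaner */
	atomic_bool frozen;  /* set once deinitialization has started */
};

static void demo_refcnt_init(struct demo_refcnt *rc)
{
	atomic_init(&rc->counter, 0);
	atomic_init(&rc->frozen, false);
}

/* Take a reference; fails once the counter has been frozen. */
static bool demo_refcnt_get(struct demo_refcnt *rc)
{
	atomic_fetch_add(&rc->counter, 1);
	if (atomic_load(&rc->frozen)) {
		/* Back off: the cleaner context is about to go away. */
		atomic_fetch_sub(&rc->counter, 1);
		return false;
	}
	return true;
}

/* Drop a reference taken with demo_refcnt_get(). */
static void demo_refcnt_put(struct demo_refcnt *rc)
{
	int prev = atomic_fetch_sub(&rc->counter, 1);

	assert(prev > 0);
	(void)prev;
}

/* Stop handing out new references (first step of a policy switch). */
static void demo_refcnt_freeze(struct demo_refcnt *rc)
{
	atomic_store(&rc->frozen, true);
}

/* True when no request holds a reference. */
static bool demo_refcnt_idle(struct demo_refcnt *rc)
{
	return atomic_load(&rc->counter) == 0;
}
```

In this sketch the request path wraps every cleaner access in a get/put pair,
while the policy switch freezes the counter and deinitializes the cleaner only
after the counter has dropped to zero.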
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Making the operation asynchronous will allow the refcnt utility to be used as
a synchronization mechanism between processing cachelines and deinitializing
the cleaning policy.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The change should unify access to cleaning policy resources and facilitate
synchronization when switching cleaning policies.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
src/eviction/lru.c -> src/ocf_lru.c
src/eviction/lru.h -> src/ocf_lru.h
src/eviction/lru_structs.h -> src/ocf_lru_structs.h
src/eviction/eviction.c -> src/ocf_space.c
src/eviction/eviction.h -> src/ocf_space.h
... as well as corresponding UT files.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
... in UT as well
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
New structure ocf_part is added to contain all the data common for both
user partitions and freelist partition: part_runtime and part_id.
ocf_user_part now contains ocf_part structure as well as pointer to
cleaning partition runtime metadata (moved out from part_runtime) and
user partition config (no change here).
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This allows access to it in ctx_metadata_updater_init, which is
done in the same call stack during initialization.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
If the user wanted to remove inactive cores after detaching a core, the
cleaning policy data would not be initialized, which would cause a bug on the
next core add.
This check was incorrect, as cleaning policy core metadata lifetime is
not bound to core volume being open or not.
Signed-off-by: Jan Musial <jan.musial@intel.com>
Eviction changes that allow cachelines to be evicted (remapped) while
holding the hash bucket write lock instead of the global metadata
write lock.
As eviction (replacement) is now tightly coupled with request,
each request uses eviction size equal to number of its
unmapped cachelines.
Evicting without the global metadata write lock is possible
thanks to the fact that remapping is always performed
while exclusively holding the cacheline (read or write) lock.
So for a cacheline on LRU list we acquire cacheline lock,
safely resolve hash and consequently write-lock hash bucket.
Since the cacheline lock is acquired under the hash bucket lock (everywhere
except for the new eviction implementation), we are certain that
no one acquires the cacheline lock behind our back. Concurrent
eviction threads are eliminated by holding the eviction list
lock for the duration of critical locking operations.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
... to make it clear that true means the cleaner must lock
cachelines rather than that the lock is already being held.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Cacheline concurrency functions have their interface changed
so that the cacheline concurrency private context is
explicitly on the parameter list, rather than being taken
from cache->device->concurrency.cache_line.
Cache pointer is no longer provided as a parameter to these
functions. Cacheline concurrency context now has a pointer
to cache structure (for logging purposes only).
The purpose of this change is to facilitate unit testing.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Divide single global lock instance into 4 to reduce contention
in multiple read-locks scenario.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
1. new abbreviated prefix: ocf_hb (HB stands for hash bucket)
2. clear distinction between functions requiring caller to
hold metadata shared global lock ("naked") vs the ones
which acquire global lock on their own ("prot" for protected)
3. clear distinction between hash bucket locking functions
accepting hash bucket id ("id"), core line and lba ("cline")
and entire request ("req").
Resulting naming scheme:
ocf_hb_(id/cline/req)_(prot/naked)_(lock/unlock/trylock)_(rd/wr)
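For illustration, the scheme can be expanded mechanically. The helper below is
hypothetical and only composes names from the pattern; it does not imply that
every combination exists in the code base.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The four variable parts of the naming scheme, in order. */
static const char *const hb_target[] = { "id", "cline", "req" };
static const char *const hb_mode[] = { "prot", "naked" };
static const char *const hb_op[] = { "lock", "unlock", "trylock" };
static const char *const hb_access[] = { "rd", "wr" };

/*
 * Compose one function name following the scheme. Returns the value of
 * snprintf(), i.e. the length the full name needs (excluding the NUL).
 */
static int hb_name(char *buf, size_t len, unsigned target, unsigned mode,
		unsigned op, unsigned access)
{
	assert(target < 3 && mode < 2 && op < 3 && access < 2);

	return snprintf(buf, len, "ocf_hb_%s_%s_%s_%s", hb_target[target],
			hb_mode[mode], hb_op[op], hb_access[access]);
}
```

For example, target `cline`, mode `prot`, operation `lock` and access `rd`
compose to `ocf_hb_cline_prot_lock_rd`: a read lock on the hash bucket of
a given core line that takes the global lock on its own.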
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
If there is any dirty data on the cache associated with removed core,
we must flush collision metadata after removing core to make metadata
persistent in case of dirty shutdown.
This fixes the problem when recovery procedure erroneously interprets
cache lines that belonged to removed core as valid ones.
This also fixes the problem, when after removing core containing dirty
data another core is added, and then recovery procedure following dirty
shutdown assigns cache lines from removed core to the new one, effectively
leading to data corruption.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Min and max values, kept as an explicit number of cachelines, are tightly
coupled with a particular cache. This might lead to errors and mismatches after
reattaching a cache of a different size.
To prevent those errors, min and max should be calculated dynamically.
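A minimal sketch of a dynamic calculation, assuming the limits are kept as
percentages of the cache size (that representation and the name
`demo_part_limit_clines` are assumptions made for illustration).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a percentage limit into a number of cachelines for the cache
 * that is currently attached. Computed on demand, so the result follows
 * the cache size instead of being fixed at configuration time.
 */
static uint64_t demo_part_limit_clines(uint64_t cache_clines, uint32_t percent)
{
	assert(percent <= 100);

	/*
	 * Equivalent to floor(cache_clines * percent / 100), split into
	 * quotient and remainder parts to avoid overflowing the product.
	 */
	return (cache_clines / 100) * percent +
			((cache_clines % 100) * percent) / 100;
}
```

With a 25% limit, a cache of 1000 cachelines yields 250, and the same
configuration on a cache of 2000 cachelines yields 500 without any stored
value having to change.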
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Load properties before checking memory needs and obtain cache line size
from context rather than from cache state.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Rather than passing whole structs, supply
_ocf_mngt_calculate_ram_needed() with just the values it actually uses.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Fail `ocf_mngt_cache_load` function with `OCF_ERR_INVAL`
error code when force flag is in use.
Log error message.
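A minimal sketch of such a check. `demo_load_config`, `demo_cache_load_check`
and `DEMO_ERR_INVAL` are hypothetical stand-ins for the real OCF config
structure and the `OCF_ERR_INVAL` code; the negative return convention and the
message text are assumptions as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Placeholder value; the real OCF_ERR_INVAL is defined by OCF itself. */
#define DEMO_ERR_INVAL 1

/* Hypothetical subset of the load configuration. */
struct demo_load_config {
	bool force;
};

/* Reject the force flag before any load work is started. */
static int demo_cache_load_check(const struct demo_load_config *cfg)
{
	assert(cfg != NULL);

	if (cfg->force) {
		/* Illustrative message, not the one printed by OCF. */
		fprintf(stderr, "force flag is not allowed for load\n");
		return -DEMO_ERR_INVAL;
	}

	return 0;
}
```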
Closes #361
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>