Request submitted in fast path may be freed before the sequential cutoff stats
are updated. Increment request reference counter to prevent it.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Move error print to where it belongs, preventing this message to
pop up when same error code is reported elsewhere for other reason.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
This patch fixes the issue 988 (and 997) causing a kernel stack
overflow.
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
To allow the fastest switching from the passive-standby to active mode, the
runtime metadata must be kept 100% synced with the metadata on the drive and in
the RAM thus recovery is required after each collision section update.
To avoid long-lasting recovering of all the cachelines each time the collision
section is being updated, the passive update procedure recovers only those
which have its MD entries on the updated pages.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Starting cache in a standby mode requires access to a valid cleaning policy
type. If the policy is stored only in the superblock, it may be overridden by
one of the metadata passive updates.
To prevent losing the information it should be stored in cache's runtime
metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Initializing cleaning policy is very time consuming. To reduce the time required
for activating cache instance the initialization sholud be done during passitve
start
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since part of the recovery is done during `standby init`, the correct shutdown
status has to be set
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The unsafe mode is useful if the metadata of added cores is incomplete.
Such scenario is possible when starting cache to standby mode from partially
vaild metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Make sure all the invalid cachelines have reset status bits. This allows to
recognize invalid cachelines easily during populate.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Recovery during passive start is based on the assuption that metadata collision
section stored on disk might be partially valid. Reseting this data would make
rebuilding metadata impossible.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Decide whether to promote sequential cutoff stream
to global structures when threshold is reached
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
GCC 11 static check finds an array size mismatch and prevents OCF from
compiling correctly. This fixes OpenCAS Linux issue #968
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Error for an invalid cache operation while in passive mode added
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Error name correction
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
API changes for passive cache mode
Moved the passive cache error return source to the api for flush and
set_param
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Further API changes for passive cache mode
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Passive api - review changes
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Management pipelines tend to consist of multiple asynchronous steps.
Allowing synchronous queue kick results in massive call stacks (e.g.
almost 500 functions deep in case of cache stop). Since async kick
is required anyway, it seems reasonable to switch to async kick
in pipeline implementation.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Cache name is needed for logging in passive mode, when config metadata
is still not accessible.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
To prevent deinitializing cleaner context (i.e. during switching policy) during
processing requests, access to cleaner should be protected with reference
counter
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Making the operation asynchronous will allow to use refcnt utility as an
synchronization mechanism between processing cachelines and deinitializing
cleaning policy.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The change should unify access to cleaning policy resources and facilitate
synchronization when switching cleaning policies
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
All remapped cachelines are write locked. If the operation fails cachelines has
to be unlocked during rollback
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
ocf_lru_get_list() now returs clean list for freelist partition to
provide common interface regardless of partition type.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Flushing metadata in WT is required only if at least of the request's cacheline
changed its state to clean.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
There's a possibility that WT write is performed to dirty cache line (i.e. after
switching WB->WT without flush) and status bits change from dirty to clean. If
power failure occurs it might happen that recovery would ignore recent data from
cache and would assume that data is clean while backend storage data is out of
date.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
In WB mode metadata should be updated only if the actuall data had been saved
on disk. Otherwise metadata might be flushed too early and consequently data
corruption might occur.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
src/eviction/lru.c -> src/ocf_lru.c
src/eviction/lru.h -> src/ocf_lru.h
src/eviction/lru_structs.h -> src/ocf_lru_structs.h
src/eviction/eviction.c -> src/ocf_space.c
src/eviction/eviction.h -> src/ocf_space.h
.. as well as corresponding UT files.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
... in UT as well
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
New structure ocf_part is added to contain all the data common for both
user partitions and freelist partition: part_runtime and part_id.
ocf_user_part now contains ocf_part structure as well as pointer to
cleaning partition runtime metadata (moved out from part_runtime) and
user partition config (no change here).
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Lookup is repeated after request is identified as miss and hash bucket
lock is upgraded (in order to map missing cachelines). At this point
cachelines status might change and the request might turn out to be
a hit after all. Adding check for this condition removes unnecessary
calls to remap logic.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This allows access to it in ctx_metadata_updater_init, which is
done in the same call stack during initalization.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
After detaching a core if user wanted to remove inactive cores the
cleaning policy data would not be initialized and would bug-out on next
core add.
This check was incorrect, as cleaning policy core metadata lifetime is
not bound to core volume being open or not.
Signed-off-by: Jan Musial <jan.musial@intel.com>
Eviction should decrement occupancy statistics for the
core from which a cacheline is being evicted rather than
from the I/O target core.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
update_req_info() should include REMAPPED cachelines
in repart stats (number of cachelines within request
belonging to other than the target partition).
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This check is incorrect as cacheline status may change
from dirty to clean at any point during cleaning, except for
when the hash bucket is locked.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This isn't strictly required in current implementation as
nodes are always re-initialized before inserting to LRU list.
However it seems to make sense to zero the flag anyway to
make the code easier to reason about.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Removing conditional early return from engine_map() function
in case of insufficient free cachelines. The reasons are:
1. current implementation does not treat unssufficient free
cachelines condition as an error,
2. the check is based on stale request info, so it is inaccurate,
3. it is easier to hit more paths with functional tests,
4. partially mapping request from the freelist becomes more common
rather than being a corner case dependent on racy timings between
threads
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Move ref counts to their own cacheline - otherwise they pollute and cause
false sharing to fields nearby and cause a lot of cache bouncing between
physical CPUs.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Grouping static fields together, while often changing ones get their own
cacheline, or some not used often/important fields.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
set_hot() depends on cacheline metadata status to determine
on which list the element is located (dirty cs clean list).
Thus at least hash bucket lock is required when calling
set_hot().
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
When issuing discard request over 512KiB OCF would trim this request and
overwrite req->core_line_count which would then cause this request to be
freed from wrong mpool.
This is fixed now by saving core_line_count that was set when allocating
this request that is never overwritten. This alloc_core_line_count is
then used to free the request from correct mpool.
Signed-off-by: Jan Musial <jan.musial@intel.com>
If user thread is preempted during tree/list update and another IO
is issued on the same CPU, the structure will be in undefined state.
This may result in hung tasks, if the tree stops being a tree and a loop exists -
tree search functions won't be able to end properly; or panics if a NULL value appears
suddenly in the preempted thread, after a null-check was already done.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
When allocation of request with map fails, we fallback to allocating
request with no map, and then allocate map separately. During request
put we need to distinguish between those two cases in order to deallocate
request properly.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Early return from engine_map() in case of insufficient free
cachelines on the freelist is opportunistic, as both request
map info and freelist count are not accurate. Map info is stale
as it is to be refreshed in engine_map() after hash bucket
lock had been upgraded. Freelist count on other hand is subject
to change asynchronously.
The implementation assumption however is that after engine_map()
request is fully traversed (engine_map() is equivalent to
engine_lookup() followed by an attempt to map missing cachelines).
So in case of early return we must take care of repeating the
lookup.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
At this point cacheline status in request map is stale,
as lookup was performed before upgrading hash bucket lock.
If indeed all cachelines are mapped, this will be determined
in the main loop of engine_map().
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
This assures that cacheline with LOOKUP_INSERTED status
is always present on the LRU list.
This fixes an ENV_BUG() caused by an attempt to remove
a cacheline from LRU list which was not there. This
happened when cacheline was mapped from freelist
(LOOKUP_INSERTED) but the entire request mapping failed
and generic cleanup routines attempted to invalidate cacheline,
including removing it from the LRU list. As engine_set_hot()
is called after successfull mapping, the inserted cacheline was
not yet present on the LRU list.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Number of cachelines to evcit can't be greater than the number of unmapped
entries in request.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>