Increasing the occupancy before the backfill request has completed leads
to incorrect statistics
test_lazy_engine_errors() detected the bug in the following scenario:
1. Rio submitted a read request
2. The read turned out to be a full miss. After a successful read from
the core device, engine_read set the metadata bits to valid and
submitted a backfill request
3. In the meantime, Rio submitted a write request to the same cache lines
4. Since the test uses an error device as cache (all sectors are
erroneous), the backfill request failed and was redirected to
engine_invalidate
5. engine_invalidate reset the metadata bits to invalid, but since cache
lines had waiters registered, the occupancy wasn't decremented as
there was an assumption that the new owner (in this case, the write
request) would do the bookkeeping
6. Upon receiving cache line locks, the write request detected that
the mapping changed so it was completed with an error without
decrementing the occupancy
7. The incorrect count of cached cache lines persisted for the whole
lifetime of the cache
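
The scenario above can be sketched with a toy model. This is not OCF code:
ToyCache, read_miss(), backfill_done() and run_scenario() are hypothetical
names used only for illustration, and the model assumes a single cache line.
It compares counting occupancy before the backfill completes with counting
it only after a successful backfill.

```python
class ToyCache:
    """Toy model of occupancy bookkeeping; not OCF code."""

    def __init__(self, count_before_backfill):
        self.count_before_backfill = count_before_backfill
        self.occupancy = 0   # statistic reported to the user
        self.valid = set()   # cache lines whose metadata bits are valid

    def read_miss(self, line):
        # Step 2: data was read from the core device, metadata set to valid.
        self.valid.add(line)
        if self.count_before_backfill:
            self.occupancy += 1  # counted before the backfill has completed

    def backfill_done(self, line, error, has_waiters):
        if not error:
            if not self.count_before_backfill:
                self.occupancy += 1  # counted only once the data is cached
            return
        # Steps 4-5: a failed backfill is redirected to invalidation.
        self.valid.discard(line)
        if self.count_before_backfill and not has_waiters:
            self.occupancy -= 1
        # With waiters, the decrement is left to the new owner, which
        # completes with an error (step 6) and never performs it.


def run_scenario(count_before_backfill):
    cache = ToyCache(count_before_backfill)
    cache.read_miss(line=0)
    cache.backfill_done(line=0, error=True, has_waiters=True)
    return cache.occupancy, len(cache.valid)
```

In this model, run_scenario(True) leaves an occupancy of 1 with no valid
lines, while run_scenario(False) keeps both values at 0.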
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
The next commit will move occupancy accounting to backfill, which makes
testing statistics values even more timing-dependent. Settling the cache
before cache.get_stats() prevents these error-inducing race conditions
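
A minimal sketch of why settling matters, assuming a toy cache whose
backfill completes on a background timer. SettlingToyCache and all of its
methods are hypothetical stand-ins and do not reproduce the PyOCF API; only
the idea of waiting for in-flight requests before reading statistics is
taken from this commit.

```python
import threading


class SettlingToyCache:
    """Toy cache whose occupancy is updated by asynchronous backfills."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0     # backfill requests still in flight
        self._occupancy = 0

    def submit_backfill(self, delay=0.05):
        with self._cond:
            self._pending += 1
        # The backfill finishes later, on another thread.
        threading.Timer(delay, self._complete_backfill).start()

    def _complete_backfill(self):
        with self._cond:
            self._occupancy += 1   # accounting happens at backfill completion
            self._pending -= 1
            self._cond.notify_all()

    def settle(self, timeout=5.0):
        # Block until no request is in flight; returns False on timeout.
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)

    def get_stats(self):
        with self._cond:
            return {"occupancy": self._occupancy}
```

Reading get_stats() right after submit_backfill() may observe any value
between 0 and the number of submitted requests; calling settle() first makes
the result deterministic.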
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
The cleaning metadata has been deinitialized in the previous pipeline step
together with other services
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
Flushing metadata has nothing to do with deinitializing services, so it
should be a separate step in the stop pipeline
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
Move flushing metadata outside cache_detinit_services(), so the function
can be shared between stop() and detach() without redundant ifs.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
The completion callback is called only in the cache stop scenario, after
flushing the metadata
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
GCC/Clang sanitizer can be used together with PyOCF to catch some errors during
testing.
CC was purposely removed from the Makefile. It always points to GCC on Linux
by default. This allows changing the compiler and its options when running
the script
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
This resets count_pages_variable on cache-detach, so during the following
cache-attach metadata size is calculated properly.
Signed-off-by: Daniel Madej <daniel.madej@huawei.com>
During core remove/detach ocf_cleaner_refcnt_freeze was called only
when cache was attached, but ocf_cleaner_refcnt_unfreeze was called
regardless of cache state.
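
The imbalance can be illustrated with a toy freeze counter. ToyRefcnt and
remove_core() are hypothetical and only model the pairing of the two calls;
they are not the OCF refcount implementation.

```python
class ToyRefcnt:
    """Toy freeze counter; an unpaired unfreeze is reported as an error."""

    def __init__(self):
        self.freeze_depth = 0

    def freeze(self):
        self.freeze_depth += 1

    def unfreeze(self):
        if self.freeze_depth == 0:
            raise RuntimeError("unfreeze without a matching freeze")
        self.freeze_depth -= 1


def remove_core(refcnt, cache_attached, balanced):
    # Freeze only when the cache is attached (behaviour described above).
    if cache_attached:
        refcnt.freeze()
    # ... core removal work would happen here ...
    # balanced=False models the old code: unfreeze regardless of cache state.
    if cache_attached or not balanced:
        refcnt.unfreeze()
```

With a detached cache, remove_core(..., balanced=False) raises because
nothing was frozen, while the balanced variant leaves the counter at zero.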
Signed-off-by: Daniel Madej <daniel.madej@huawei.com>
The cache attach operation is not supposed to complete until all the d2c
requests are completed, so it needs to be handled asynchronously.
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
After attaching a new cache device, handle all the IOs in Pass-Through mode
until all the d2c requests are completed.
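
A rough sketch of the routing rule, under the assumption that the number of
outstanding d2c requests is tracked with a simple counter. ToyAttach and its
methods are hypothetical and do not correspond to OCF structures.

```python
class ToyAttach:
    """Toy model: IO bypasses the cache while d2c requests are in flight."""

    def __init__(self, d2c_in_flight):
        self.d2c_in_flight = d2c_in_flight
        # Attach may complete at once only if nothing is outstanding.
        self.attach_completed = d2c_in_flight == 0

    def route(self):
        # New IO goes pass-through until the last d2c request completes.
        return "pass-through" if self.d2c_in_flight > 0 else "cache"

    def d2c_complete(self):
        self.d2c_in_flight -= 1
        if self.d2c_in_flight == 0:
            # The attach completion is deferred until this point.
            self.attach_completed = True
```

In this model the attach completion fires from the completion of the last
d2c request rather than from the attach call itself.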
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
The flag isn't reset before re-traversal, so it might still be true even if
the cache line was repartitioned in the meantime
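
A sketch of the stale-flag problem, assuming the flag marks a cache line as
still needing repartitioning. The dictionary layout, the "re_part" key and
the traverse() helper are hypothetical and only illustrate the pattern.

```python
def traverse(entries, needs_repart, reset_flags):
    """Mark entries that need repartitioning and return their line numbers."""
    for entry in entries:
        if reset_flags:
            entry["re_part"] = False  # clear the result of the previous pass
        if needs_repart(entry):
            entry["re_part"] = True
    return [entry["line"] for entry in entries if entry["re_part"]]


def make_entries():
    return [{"line": 7, "re_part": False}]


# First pass: the line needs repartitioning.
stale = make_entries()
first = traverse(stale, lambda e: True, reset_flags=False)

# Second pass without a reset: the line no longer needs it, but the flag
# set during the first pass is still there.
second_stale = traverse(stale, lambda e: False, reset_flags=False)

# Same two passes, resetting the flag before the second evaluation.
fresh = make_entries()
traverse(fresh, lambda e: True, reset_flags=True)
second_fresh = traverse(fresh, lambda e: False, reset_flags=True)
```

Without the reset, second_stale still reports line 7; with the reset,
second_fresh is empty.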
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>