Commit Graph

187 Commits

Author SHA1 Message Date
Michal Mielewczyk
f1e25c923b D2C: Prevent use after free
Request could be completed and freed before the statistics were updated

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2025-03-26 15:53:51 +01:00
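The ordering issue described in f1e25c923b can be illustrated with a minimal sketch. It assumes that completing a request may free it; every name below (`sketch_stats`, `sketch_request`, `sketch_complete`, `sketch_d2c_finish`) is hypothetical and not taken from the OCF sources.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not OCF structures. */
struct sketch_stats {
	uint64_t bytes_done;
};

struct sketch_request {
	uint64_t bytes;
	struct sketch_stats *stats;
};

/* Assumption: completion may release the request, so the caller must not
 * touch the request after this call returns. */
void sketch_complete(struct sketch_request *req)
{
	free(req);
}

/* Update the statistics while the request is still valid, then complete
 * (and possibly free) it as the very last step. */
void sketch_d2c_finish(struct sketch_request *req)
{
	req->stats->bytes_done += req->bytes;
	sketch_complete(req);
}
```

Swapping the two statements in `sketch_d2c_finish` would read `req` after it has been freed, which is the class of bug the commit message refers to.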
Michal Mielewczyk
a38341389a Microoptimization for resolving cache mode
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2025-03-25 09:32:39 +01:00
Michal Mielewczyk
295b3949bc backfill: Update occupancy only if BF succeeded
Increasing the occupancy before the backfill request has completed leads
to incorrect statistics

test_lazy_engine_errors() detected the bug in the following scenario:
1. Rio submitted a read request
2. The read turned out to be a full miss. After a successful read from
   the core device, engine_read set the metadata bits to valid and
   submitted a backfill request
3. In the meantime, Rio submitted a write request to the same cache lines
4. Since the test uses an error device as cache (all sectors are
   erroneous) the backfill request failed and was redirected to
   engine_invalidate
5. engine_invalidate reset the metadata bits to invalid, but since cache
   lines had waiters registered, the occupancy wasn't decremented as
   there was an assumption that the new owner (in this case, the write
   request) would do the bookkeeping
6. Upon receiving cache line locks, the write request detected that
   the mapping changed so it was completed with an error without
   decrementing the occupancy
7. The incorrect value of cached cache lines was persisted for the whole
   lifetime of the cache

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2025-03-24 12:22:25 +01:00
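The idea behind 295b3949bc, counting occupancy only once the backfill has succeeded, might look roughly like the sketch below. The structures and function names (`bf_stats`, `bf_request`, `bf_invalidate`, `bf_complete`) are hypothetical simplifications, not the OCF implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the OCF data structures. */
struct bf_stats {
	uint64_t occupancy;	/* number of cache lines counted as occupied */
};

struct bf_request {
	uint32_t line_count;	/* cache lines covered by the backfill */
	bool lines_valid;	/* metadata bits set by the read path */
};

/* Failure path: reset the metadata bits. No occupancy bookkeeping is
 * needed here because nothing has been counted yet. */
void bf_invalidate(struct bf_request *req)
{
	req->lines_valid = false;
}

/* Backfill completion: occupancy grows only after the write to the cache
 * device is known to have succeeded. */
void bf_complete(struct bf_stats *stats, struct bf_request *req, int error)
{
	if (error) {
		bf_invalidate(req);
		return;
	}
	stats->occupancy += req->line_count;
}
```

With this ordering a failed backfill never has to rely on a later lock owner to undo the increment.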
Adam Rutkowski
53ee7c1d3a Per-cpu refcounters
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Signed-off-by: Jan Musial <jan.musial@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@huawei.com>
2025-02-06 12:04:34 +01:00
Robert Baldyga
be068df400
Merge pull request #853 from mmichal10/repart
Repart
2025-02-04 16:39:49 +01:00
Robert Baldyga
0d06b3a597 Fix race condition during cache attach
After attaching a new cache device, handle all I/Os in pass-through mode
until all D2C requests have completed.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-11-21 21:26:00 +01:00
Michal Mielewczyk
b3fa1fc96a Reset repart flag during refreshing request status
The flag isn't reset before re-traversal, so it might still be true even if
the cache line was reparted in the meantime

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-10-16 18:37:42 +02:00
Robert Baldyga
1ab882aa61
Merge pull request #852 from robertbaldyga/forward_io-fix-error-accounting
Fix error accounting in forward_io
2024-10-15 08:44:47 +02:00
Robert Baldyga
a1af1809d8 Fix error accounting in forward_io
Resetting cache_error/core_error in the ocf_req_forward_* functions may
overwrite an already reported error if the forward is done in a loop.

To avoid this potential problem, introduce a set of forward init functions,
intended to be called before the entire forward operation, which reset
the error code and set the forward callback.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-10-14 16:54:51 +02:00
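The pattern described in a1af1809d8 (reset the error once up front, then forward in a loop without resetting) can be sketched as follows. The names `fwd_request`, `fwd_init`, `fwd_record` and `fwd_noop_cb` are hypothetical; the real ocf_req_forward_* signatures are not shown in this log and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical request model; not the OCF request structure. */
struct fwd_request {
	int error;				/* first error reported so far */
	void (*cb)(struct fwd_request *req);	/* forward completion callback */
};

/* Example callback that does nothing. */
void fwd_noop_cb(struct fwd_request *req)
{
	(void)req;
}

/* Called once before the whole forward operation: the only place where
 * the error code is reset. */
void fwd_init(struct fwd_request *req, void (*cb)(struct fwd_request *))
{
	req->error = 0;
	req->cb = cb;
}

/* Called for every forwarded piece: records an error but never clears one
 * that was reported by an earlier iteration. */
void fwd_record(struct fwd_request *req, int result)
{
	if (result != 0 && req->error == 0)
		req->error = result;
}
```

If `fwd_record` reset `error` on entry, a successful last iteration would hide a failure from an earlier one, which is the overwrite the commit message warns about.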
Roel Apfelbaum
1d1561649c Remove redundant fallback-PT counter accesses
The fewer atomic variable accesses on the I/O path, the better

Signed-off-by: Roel Apfelbaum <roel.apfelbaum@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-10-08 11:14:20 +02:00
Robert Baldyga
3e21b11703 Remove unnecessary references to req->req_remaining
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-09-24 20:06:24 +02:00
Robert Baldyga
fd508435d6 Fix double completion in engine_wi
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-09-24 20:04:09 +02:00
Robert Baldyga
3fbb75756e Consolidate ocf_request_io and ocf_request - io properties
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-20 13:59:46 +02:00
Robert Baldyga
322ae2687d Replace submit with forward in cleaner
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-20 13:59:46 +02:00
Robert Baldyga
7d53dd1e41 Handle D2C early and fast
Avoid unnecessary code execution in D2C mode.
Avoid multiple req->d2c checks in the normal I/O path.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-19 23:58:26 +02:00
Robert Baldyga
10098ccedd Remove unused functions
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-19 15:55:19 +02:00
Robert Baldyga
1ed707361f Modify engines to use forward API
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-19 15:55:19 +02:00
Robert Baldyga
7e73de0d51 volume: Introduce general IO forward mechanism
Allow the core volume IOs to be forwarded directly to backend volumes to
avoid unnecessary allocations.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-19 15:55:19 +02:00
Jan Musial
5fadec7e32 Clean dirty requests in WI
Signed-off-by: Jan Musial <jan.musial@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-19 14:58:56 +02:00
Michal Mielewczyk
9c65ec955f engine_rd: Ignore backfill buffer allocation error
It's OK to proceed with a read even if allocating a buffer for backfill failed

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-18 19:43:56 +02:00
Michal Mielewczyk
a3bccbba6c engine_rd: Refactor
Code beautification only, no functional changes.

Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-18 19:38:51 +02:00
Rafal Stefanowski
194e5a9172 Use cache_error and core_error flags only in WT
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-09-18 14:04:08 +02:00
Roel Apfelbaum
73387c8f26 Support set_data() with offset > 0 for core
Signed-off-by: Roel Apfelbaum <roel.apfelbaum@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-17 16:26:27 +02:00
Michal Mielewczyk
ca7f3651e9 discard engine: lookup without updating hotness
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-10 15:20:51 +02:00
Michal Mielewczyk
0df0eec7f0 Uncouple lookup() and set_hot()
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-10 15:20:51 +02:00
Rafal Stefanowski
7dfe70f69b Fix discard step callback refcount
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-10 15:20:51 +02:00
Robert Baldyga
1bcd949a89 Rename engine_ops to engine_flush
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-10 15:16:33 +02:00
Sara Merzel
835eb708b5 Introduce pass-through block stats
Signed-off-by: Sara Merzel <sara.merzel@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-06 14:47:02 +02:00
Michael Lyulko
470204ac70 Count deferred requests as full miss
Otherwise, deferred requests may increase the number of hits even though the
overall performance has not improved. Counting them as full misses makes the
hit rate correlate better with actual performance changes.

Signed-off-by: Michael Lyulko <michael.lyulko@huawei.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@huawei.com>
2024-09-02 09:21:28 +02:00
Ian Levine
ac1b6b774a Added a priority queue for the request instead of push front
Now the request can be pushed to a high priority queue (instead of ocf_queue_push_req_front)
and to a low priority queue (instead of ocf_queue_push_req_back).
Both functions were merged into one function (ocf_queue_push_req) and instead of the
allow_sync parameter there is now a flags parameter that can be an OR combination of
OCF_QUEUE_ALLOW_SYNC and OCF_QUEUE_PRIO_HIGH

Signed-off-by: Ian Levine <ian.levine@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-08-02 12:53:16 +02:00
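The single push function with a flags parameter described in ac1b6b774a could be modelled along these lines. This is a sketch only: `PQ_ALLOW_SYNC` and `PQ_PRIO_HIGH` stand in for OCF_QUEUE_ALLOW_SYNC and OCF_QUEUE_PRIO_HIGH with assumed bit values, the two FIFO lists are an assumed layout, and `pq_push_req`/`pq_pop_req` are hypothetical names rather than the real ocf_queue_push_req signature.

```c
#include <stddef.h>

/* Assumed bit values, chosen for illustration only. */
#define PQ_ALLOW_SYNC	(1u << 0)
#define PQ_PRIO_HIGH	(1u << 1)

struct pq_req {
	int id;
	struct pq_req *next;
};

struct pq_list {
	struct pq_req *head;
	struct pq_req *tail;
};

/* Assumption: one FIFO list per priority level. */
struct pq_queue {
	struct pq_list high;
	struct pq_list low;
};

/* Append a request at the tail of a list. */
void pq_list_push(struct pq_list *list, struct pq_req *req)
{
	req->next = NULL;
	if (list->tail)
		list->tail->next = req;
	else
		list->head = req;
	list->tail = req;
}

/* Remove and return the head of a list, or NULL if it is empty. */
struct pq_req *pq_list_pop(struct pq_list *list)
{
	struct pq_req *req = list->head;

	if (req) {
		list->head = req->next;
		if (!list->head)
			list->tail = NULL;
	}
	return req;
}

/* One push function; the priority is selected by a flag. PQ_ALLOW_SYNC is
 * accepted but not modelled in this sketch. */
void pq_push_req(struct pq_queue *q, struct pq_req *req, unsigned int flags)
{
	pq_list_push((flags & PQ_PRIO_HIGH) ? &q->high : &q->low, req);
}

/* High-priority requests are always served before low-priority ones. */
struct pq_req *pq_pop_req(struct pq_queue *q)
{
	struct pq_req *req = pq_list_pop(&q->high);

	return req ? req : pq_list_pop(&q->low);
}
```

A caller combines the flags with a bitwise OR, for example `pq_push_req(&q, &req, PQ_ALLOW_SYNC | PQ_PRIO_HIGH)`.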
Ian Levine
4f2d5c22d6 Move and rename ocf_engine_pop_req from cache_engine to ocf_queue_pop_req in ocf_queue
Signed-off-by: Ian Levine <ian.levine@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-08-02 12:53:16 +02:00
Ian Levine
038126e9ab Move and rename ocf_engine_push_req_* from engine_common to ocf_queue_push_req_* in ocf_queue
Signed-off-by: Ian Levine <ian.levine@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-08-02 12:53:16 +02:00
Ian Levine
de32a9649a Rename ocf_engine_cb to ocf_req_cb and move it from engine_common.h to ocf_request.h
Signed-off-by: Ian Levine <ian.levine@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-08-02 12:53:10 +02:00
Robert Baldyga
dfb2e1a8d5 cleaner: Check mapping after taking cache line lock
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-07-12 17:38:13 +02:00
Robert Baldyga
d7fe7c05f1 Add missing ocf_cache_mode_t to ocf_req_cache_mode_t conversions
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-07-05 16:59:05 +02:00
Robert Baldyga
168ecd0075 Add missing "static" to the local function
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-05-11 00:59:39 +02:00
Robert Baldyga
578f4b6591 Add missing headers
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-05-11 00:51:29 +02:00
Robert Baldyga
527e3deb74 Remove accidentally added .swp file
Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-05-11 00:35:59 +02:00
Robert Baldyga
5710ca8b4a Fix compilation
Signed-off-by: Robert Baldyga <robert.baldyga@open-cas.com>
2024-04-01 18:27:25 +00:00
Robert Baldyga
fd489e3a30 Fix potential deadlock in discard
The HB lock takes the inclusive metadata lock, which is also taken by metadata
flush, so calling metadata flush under the HB lock attempts to take this lock
recursively. If some other thread tries to take the exclusive metadata lock in
the meantime, the inner inclusive lock blocks (because the lock preserves
ordering) while the outer inclusive lock is still held, leading to a deadlock.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2024-03-20 23:35:46 +01:00
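The deadlock scenario from fd489e3a30 can be reproduced on a toy model of an order-preserving reader/writer lock. This is a single-threaded state model under stated assumptions, not the OCF metadata lock: `ord_rwlock`, `ord_try_inclusive` and `ord_request_exclusive` are hypothetical names, and "blocking" is represented by a `false` return value.

```c
#include <stdbool.h>

/* Toy model of a lock that keeps request order: once an exclusive request
 * is queued, no new inclusive request is granted ahead of it. */
struct ord_rwlock {
	int readers;		/* current inclusive holders */
	int writers_waiting;	/* queued exclusive requests */
	bool writer_active;	/* exclusive lock currently held */
};

/* Returns true if the inclusive lock is granted, false if the caller
 * would have to block. */
bool ord_try_inclusive(struct ord_rwlock *lock)
{
	if (lock->writer_active || lock->writers_waiting > 0)
		return false;
	lock->readers++;
	return true;
}

/* Returns true if the exclusive lock is granted immediately; otherwise
 * the request is queued behind the current holders. */
bool ord_request_exclusive(struct ord_rwlock *lock)
{
	if (!lock->writer_active && lock->readers == 0) {
		lock->writer_active = true;
		return true;
	}
	lock->writers_waiting++;
	return false;
}
```

Taking the inclusive lock once, queuing an exclusive request from another context, and then trying the inclusive lock again from the first context leaves both sides waiting on each other: the exclusive request waits for `readers` to drop to zero, and the inner inclusive attempt waits for the queued exclusive request.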
Robert Baldyga
d57c9bb51d Unlock request in PT using ocf_req_unlock()
There are situations when we can end up in engine_pt with cache lines
locked for write. One example is engine_rd falling back to engine_pt after
a failure during cache line preparation, where the write lock has already
been taken. To handle this situation properly, unlock the request using the
more general unlock function.

Signed-off-by: Robert Baldyga <robert.baldyga@huawei.com>
2023-09-13 17:04:06 +02:00
Damian Raczkowski
d2ea41cdbc remove ocf_io_start function
Signed-off-by: Damian Raczkowski <damian.raczkowski@intel.com>
2022-10-28 15:03:36 +02:00
Robert Baldyga
1c701e4101
Merge pull request #750 from robertbaldyga/remove-req-io-if
Get rid of req->io_if
2022-09-08 22:59:57 +02:00
Robert Baldyga
228c5fc891 Get rid of req->io_if
Remove one callback indirection level. I/O never changes its direction
so there is no point in storing both read and write callbacks for each
request.

Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
2022-09-07 23:07:04 +02:00
Adam Rutkowski
83b4455a0e unify cache write error stats accounting
In most (6/9) instances across engines, ocf_core_stats_cache_error_update
is called upon each cache volume I/O error, possibly multiple times
per user request in the case of multi-cacheline requests. The backfill,
fast and read engines are exceptions, incrementing error stats only
once per user request.

This commit unifies ocf_core_stats_cache_error_update usage so that
in all the engines the error statistic is incremented once for every
error.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2022-09-05 21:13:06 +02:00
Adam Rutkowski
df7ed6920c Fix ops(flush) engine
Flush I/O should be forwarded to core and cache device. In case of core
this is simple - just mirror the I/O from the top volume. Since
cache data is owned by OCF it makes sense to send a simple flush I/O
with 0 address and size.

The current implementation attempts to use the cache data I/O interface
(ocf_submit_cache_reqs function) instead of submitting an empty flush to
the underlying cache device. This function is designed to read/write
from mapped cachelines, while no traversal/mapping is
performed on flush I/O.

If request map allocation succeeds, this results in sending I/O to
address 0 with size and flags inherited from the top adapter I/O.
This doesn't make any sense, and can even result in invalid I/O if the
size is greater than the cache device size.

Even worse, if flush request map allocation fails (which always
happens for large flush requests), the erroneous call to
ocf_submit_cache_reqs results in a NULL pointer dereference.

Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
2022-06-01 22:33:35 +02:00
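The intended flush behaviour from df7ed6920c (mirror the top-level I/O to the core, send an empty flush to the cache) reduces to something like the sketch below. `fl_io`, `fl_core_flush` and `fl_cache_flush` are hypothetical names; flags and the actual submission calls are left out.

```c
#include <stdint.h>

/* Hypothetical minimal I/O descriptor; not an OCF type. */
struct fl_io {
	uint64_t addr;
	uint64_t bytes;
};

/* Core device: mirror the flush I/O received from the top volume. */
struct fl_io fl_core_flush(const struct fl_io *top)
{
	return *top;
}

/* Cache device: the cache data layout is owned by the cache itself, so a
 * plain flush with address 0 and size 0 is sent instead of anything
 * derived from the top-level I/O. */
struct fl_io fl_cache_flush(void)
{
	struct fl_io io = { 0, 0 };

	return io;
}
```

Because the cache flush does not depend on the top-level size, it cannot exceed the cache device size and needs no request map.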
Robert Baldyga
d5b2c65a39 Remove "metadata_layout" parameter of the cache
This feature is replaced with LRU list shuffling.

Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
2022-03-07 17:48:25 +01:00
Robert Baldyga
9a956f59cd
Merge pull request #654 from Open-CAS/fix-flapping-merge
Porting fix-flapping patches from v21.6.4 by arutk
2022-03-05 01:31:23 +01:00
Adam Rutkowski
866bba72bf Explicitly validate superblock after load
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>

Additional changes - load sb recovery CRC check

Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
2022-03-04 19:12:51 +01:00
Robert Baldyga
45cc56f40d Extend BF queue protection to cache device queue
So far the only resource protected by backfill queue blocking was the internal
OCF request queue. Move the unblock to backfill I/O completion to also protect
the queue of the underlying cache device.

Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
2022-03-02 20:59:51 +01:00