The purpose of this change is to avoid writing the superblock to the
cache drive until all other sections are initialized on disk in the
attach() path. Combined with superblock clearing at an earlier stage of
attach(), this ensures there are no residual mappings in the collision
section in case of a power failure during attach with pre-existing
metadata.
This is implemented by removing the ocf_metadata_flush_all_set_status()
step at the beginning of ocf_metadata_flush_all().
Apart from the attach() case described above, ocf_metadata_flush_all()
is called in two cases:
1. at the end of cache load - potentially after cache recovery
2. while detaching the cache drive during cache stop.
To make sure there are no regressions in the first case, an explicit
_ocf_mngt_attach_shutdown_status() is added to the load pipeline before
ocf_metadata_flush_all(). The second case always runs after the cache
drive is attached, so the dirty status bit must have already been
written to the disk.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Because of metadata flapping, it is much more complicated to capture those
sections in flight in standby mode, so we read them directly from the cache
volume during activation.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This feature provides double buffering of config sections to prevent a
situation in which a power failure during metadata flush leads to
partially updated metadata. The flapping mechanism makes it always
possible to perform a graceful rollback to the previous config metadata
content in such a situation.
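
The double buffering described above can be illustrated with a minimal
sketch. It is not the OCF implementation: the layout, the `flap_disk`
structure, the `active` selector and both function names are hypothetical,
and a power failure is simulated with a flag. The sketch assumes the
selector update is the single step that commits the new copy.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CONFIG_SIZE 64

/* Hypothetical on-disk layout: two copies of the config sections plus
 * a selector telling which copy is currently valid. */
struct flap_disk {
    uint8_t copy[2][CONFIG_SIZE];
    uint8_t active; /* index of the valid copy: 0 or 1 */
};

/* Flush new config content into the inactive copy. When fail_mid_write
 * is set, simulate a power failure halfway through the write. */
bool flap_flush(struct flap_disk *d, const uint8_t *cfg, bool fail_mid_write)
{
    uint8_t target = d->active ^ 1;

    if (fail_mid_write) {
        /* Only the inactive copy is damaged; the selector is untouched. */
        memcpy(d->copy[target], cfg, CONFIG_SIZE / 2);
        return false;
    }

    memcpy(d->copy[target], cfg, CONFIG_SIZE);
    d->active = target; /* the flip commits the new content */
    return true;
}

/* Load always returns the copy the selector points at. */
const uint8_t *flap_load(const struct flap_disk *d)
{
    return d->copy[d->active];
}
```

After an interrupted flush, flap_load() still returns the previous
content, which is the rollback behavior the commit message refers to.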
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
A request submitted in the fast path may be freed before the sequential
cutoff stats are updated. Increment the request reference counter to
prevent this.
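
A minimal sketch of the pattern, under stated assumptions: the `request`
structure, `req_get()`, `req_put()` and `fast_path_submit()` are
hypothetical stand-ins rather than OCF functions, and "freeing" is modeled
by setting a flag so the effect can be observed.

```c
struct request {
    int ref_count;
    int bytes;
    int freed; /* set instead of calling free(), to keep the sketch observable */
};

void req_get(struct request *r)
{
    r->ref_count++;
}

void req_put(struct request *r)
{
    if (--r->ref_count == 0)
        r->freed = 1;
}

/* Stand-in for the fast path: the request completes immediately and
 * the submitter's reference is dropped. */
void fast_path_submit(struct request *r)
{
    req_put(r);
}

/* Hold an extra reference across the fast path so the request is still
 * alive when the stats are updated. Returns 1 if the request was alive
 * at the time of the stats update. */
int submit_and_update_stats(struct request *r, int *stats_bytes)
{
    int alive;

    req_get(r);           /* extra reference taken before submission */
    fast_path_submit(r);  /* may drop the original reference */

    alive = !r->freed;
    *stats_bytes += r->bytes; /* safe: our reference keeps r alive */

    req_put(r);           /* release the extra reference */
    return alive;
}
```

Without the req_get() call, the reference dropped in the fast path would
be the last one and the stats update would touch a freed request.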
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Move the error print to where it belongs, preventing this message from
popping up when the same error code is reported elsewhere for another
reason.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
This patch fixes issue 988 (and 997), which caused a kernel stack
overflow.
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
To allow the fastest possible switch from passive-standby to active mode,
the runtime metadata must be kept fully in sync with the metadata on the
drive and in RAM, so recovery is required after each collision section
update.
To avoid a long-lasting recovery of all the cachelines each time the
collision section is updated, the passive update procedure recovers only
those cachelines whose metadata entries are on the updated pages.
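
One way to picture the selective recovery is to map the updated page range
to a cacheline range. The sketch below is illustrative only: it assumes a
fixed number of collision entries per page with no entry spanning two
pages, and the names `cl_range` and `cachelines_on_pages()` are
hypothetical. The actual on-disk layout may differ.

```c
#include <stdint.h>

struct cl_range {
    uint64_t first; /* first cacheline to recover */
    uint64_t count; /* number of cachelines to recover */
};

/* Translate a range of updated collision-section pages into the range
 * of cachelines whose entries live on those pages. */
struct cl_range cachelines_on_pages(uint64_t first_page, uint64_t page_count,
        uint64_t entries_per_page, uint64_t total_cachelines)
{
    struct cl_range r = { 0, 0 };
    uint64_t first = first_page * entries_per_page;
    uint64_t end = (first_page + page_count) * entries_per_page;

    if (first >= total_cachelines)
        return r; /* pages beyond the last cacheline: nothing to recover */

    if (end > total_cachelines)
        end = total_cachelines; /* the last page may be partially used */

    r.first = first;
    r.count = end - first;
    return r;
}
```

With 100 entries per page, an update of pages 2 to 4 would limit recovery
to cachelines 200 through 499 instead of the whole cache.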
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Starting a cache in standby mode requires access to a valid cleaning
policy type. If the policy is stored only in the superblock, it may be
overridden by one of the metadata passive updates.
To prevent losing this information, it should be stored in the cache's
runtime metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Initializing the cleaning policy is very time consuming. To reduce the time
required for activating a cache instance, the initialization should be done
during passive start.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since part of the recovery is done during `standby init`, the correct shutdown
status has to be set.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The unsafe mode is useful if the metadata of added cores is incomplete.
Such a scenario is possible when starting a cache in standby mode from
partially valid metadata.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Make sure all the invalid cachelines have their status bits reset. This
makes it easy to recognize invalid cachelines during populate.
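
A minimal sketch of the idea, assuming per-cacheline `valid` and `dirty`
bitmaps with one bit per sector. The `cl_status` structure and both
function names are hypothetical and do not reflect the OCF metadata
layout.

```c
#include <stddef.h>
#include <stdint.h>

struct cl_status {
    uint8_t valid; /* one bit per sector */
    uint8_t dirty; /* one bit per sector */
};

/* Clear leftover status bits on every cacheline with no valid sectors.
 * Returns the number of cachelines that had stale bits reset. */
size_t reset_invalid_status(struct cl_status *cl, size_t n)
{
    size_t reset = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (cl[i].valid == 0 && cl[i].dirty != 0) {
            cl[i].dirty = 0;
            reset++;
        }
    }
    return reset;
}

/* After the reset, an invalid cacheline is simply one with all bits clear. */
int is_invalid(const struct cl_status *cl)
{
    return cl->valid == 0 && cl->dirty == 0;
}
```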
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Recovery during passive start is based on the assumption that the metadata
collision section stored on disk might be partially valid. Resetting this
data would make rebuilding the metadata impossible.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>