The _cas_stop_thread() function synchronizes with the cleaner thread, so
after it returns we can be sure that there are no more ongoing cleaning
requests.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Changing cache mode to the same mode is a special case that in OCL is
handled on the kernel level, without calling an OCF API. As a result it
seemed to succeed even in standby mode, where it should return an error.
Explicitly check for standby to return an appropriate error code.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Instead of stopping the passive instance on every possible error, allow it
to remain in standby mode if the error can be handled.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Cache priv is allocated when a cache instance is started and is freed only
when the cache is stopped. This change allows rollback to be handled properly
if activate has failed. Without setting this flag, the management queue is not
stopped even though its cache no longer exists.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
When dir is ignored and 0 is passed instead, each flush request
appears as a READ request, which is not supported by some block device
drivers.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
One of the steps of discarding data in cache is invalidating OCF metadata.
If a cache line which is supposed to be discarded is dirty, invalidating
it will require flushing metadata. Unfortunately, OCF allocates flushing
requests with exactly the same flags as the original IO (in this case
the discard flag is set), so the page on the disk is discarded instead of
being flushed. If a power failure occurs before the metadata is flushed
to the disk, the data may be corrupted even if recovery succeeds.
Disabling propagation of original I/O flags for discard requests solves
this problem.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
The bio_alloc_bioset() function now calls BUG() when asked to allocate a
bio with more than BIO_MAX_VECS vectors.
A no-limit value (-1) is defined so that the behaviour on old kernels
does not change.
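
A minimal userspace sketch of the clamping idea, assuming hypothetical names
(`CAS_BIO_NO_LIMIT`, `cas_clamp_bio_vecs`) that are not taken from the patch;
the real change may be structured differently:

```c
#include <assert.h>

/* Hypothetical sentinel: kernels without the BUG() check impose no limit. */
#define CAS_BIO_NO_LIMIT (-1)

/*
 * Return the number of vectors that may be requested in one allocation.
 * `wanted` is what the caller needs; `limit` is either the kernel's
 * BIO_MAX_VECS or CAS_BIO_NO_LIMIT on kernels that have no such check.
 */
static int cas_clamp_bio_vecs(int wanted, int limit)
{
	assert(wanted >= 0);

	if (limit == CAS_BIO_NO_LIMIT)
		return wanted;	/* old kernels: behaviour unchanged */

	return wanted > limit ? limit : wanted;
}
```

A caller that needs more vectors than the limit would have to allocate
several bios; that part is not shown here.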
Signed-off-by: Gal Hammer <gal.hammer@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Move the cas_blk_get_part_count function to the configure section after
the disk's partition table was changed to an xarray.
Signed-off-by: Gal Hammer <gal.hammer@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
The module_mutex is internal to the module loader since kernel
commit 922f2a7c.
Signed-off-by: Gal Hammer <gal.hammer@huawei.com>
Signed-off-by: Shai Fultheim <shai.fultheim@huawei.com>
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
It is legal to call KCAS_IOCTL_INSERT_CORE against a non-existing cache
(in try_add mode); however, in that case core_id has to be provided.
Return an error code when the given cache id does not exist and core_id
is set to OCF_CORE_MAX.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Due to the nature of Linux thread scheduling, we prefer to promote streams
as early as we reasonably can. One way to achieve that is to set the
promotion count really low, which unfortunately significantly increases
the number of accesses to shared structures. The other way is to promote
streams which reach the cutoff threshold, as we can reasonably assume that
they are likely to be continued after the thread is rescheduled to another
CPU.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Don't print statistics for a cache in passive state
Passive cache - casadm set/get cache param disabled in passive state
Obsolete "cache_get_param" function removed
Error in layer_cache_management.c fixed
Flushing cache/core disabled with error for passive mode
Core addition disabled in passive mode
IO class setting disabled for passive mode
Counters reset disabled for passive mode
Ioctl handling changes to reflect OCF API changes
Signed-off-by: Krzysztof Majzerowicz-Jaszcz <krzysztof.majzerowicz-jaszcz@intel.com>
This is a suboptimal solution to CAS on top of an MD RAID1 device. If only
the submit_bio API were used, RAID1 would process all IOs in a single thread.
Plugging bypasses this thread and processes IOs in the blk_finish_plug
caller context, improving performance drastically.
Testing showed no negative impact on other use cases, and it's a thing
that Linux does in AIO, so it's vetted and proven to work.
Signed-off-by: Jan Musial <jan.musial@intel.com>
All the operations on `count` are performed under the lock, so it doesn't
need to be atomic.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Some helper threads are created at the very beginning of cache start/stop
operations, but they are used only after OCF start/stop finishes, which
may take a significant amount of time. By default the kernel creates threads
that wait for the first wake up in uninterruptible state, which may trigger
a hung task warning if the first wake up is called more than 120 seconds
after thread creation. To mitigate this problem we create a lazy thread
abstraction that waits for a wake up in interruptible state.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
OCF cannot allocate a request map bigger than 4MiB (due to kmalloc
limitations), thus we need to split bigger IOs into a series of smaller
ones to reduce the request map size.
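
The splitting itself can be sketched in plain C as below. This is only an
illustration: `struct cas_chunk`, `cas_split_io` and the `max_bytes` limit
are hypothetical names, and the mapping from the 4MiB request map limit to a
per-request byte limit is left to the caller:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one sub-request. */
struct cas_chunk {
	uint64_t offset;
	uint64_t bytes;
};

/*
 * Split [offset, offset + bytes) into consecutive chunks of at most
 * max_bytes each. Up to out_len chunks are stored in `out`; the return
 * value is the total number of chunks the IO requires.
 */
static size_t cas_split_io(uint64_t offset, uint64_t bytes,
		uint64_t max_bytes, struct cas_chunk *out, size_t out_len)
{
	size_t count = 0;

	assert(max_bytes > 0);

	while (bytes > 0) {
		uint64_t len = bytes < max_bytes ? bytes : max_bytes;

		if (count < out_len) {
			out[count].offset = offset;
			out[count].bytes = len;
		}

		offset += len;
		bytes -= len;
		count++;
	}

	return count;
}
```

Only the last chunk can be shorter than `max_bytes`, so every sub-request
stays within the limit.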
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
It's defined on every single supported kernel, so there is actually no need
for this define at all.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This patch fixes adding a core after a failed core addition.
Previously the queue was not cleaned up, so the following core addition
could not re-initialize the queue properly.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
If the initial flush fails, the stop is aborted. If the
second flush fails, an appropriate error message is
presented to the user.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Don't remove an inactive core if it has dirty cache lines assigned unless the
`force` flag is specified.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
The data->size field can be initialized to a lower value than bio->bi_vcnt,
if the bio is split. The bio_for_each_segment then iterates based on the original
indexes and the mismatch eventually causes a BUG_ON.
Fixes #714
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
If time is counted in jiffies, a machine reboot breaks the `dirty for`
statistic for caches loaded at boot. The counter overflows and
`dirty for` shows some huge values.
Cast ticks to unsigned long.
Add necessary header.
Move `env_msleep` to `TIME` subgroup of header.
Move `env_time_after` below time converting functions.
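
For illustration, wrap-safe arithmetic on an unsigned tick counter can look
like the userspace sketch below. `ticks_elapsed` and `ticks_after` are
hypothetical helpers (the latter modelled on the kernel's time_after()); this
shows the general technique, not necessarily the exact change made here:

```c
#include <assert.h>
#include <limits.h>

static_assert(sizeof(long) == sizeof(unsigned long),
		"signed and unsigned tick types must have the same width");

/*
 * Ticks elapsed between two readings of a free-running counter.
 * Unsigned subtraction is modular, so the result stays correct across
 * one wrap of the counter.
 */
static unsigned long ticks_elapsed(unsigned long start, unsigned long now)
{
	return now - start;
}

/*
 * Non-zero if `a` is later than `b`, even when the counter wrapped in
 * between. Relies on two's complement conversion to long, as GCC and
 * Clang provide.
 */
static int ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}
```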
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Create module-side handling of inactive core removal.
Extract core functionality of core removal that applies to inactive core
and copy it to `cache_mngt_remove_inactive_core` function.
Return proper error if core is active.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
The new error code allows issues caused by wrong usage of the
`remove inactive core` command to be handled properly.
It also allows meaningful error messages to be printed.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Change extended error message for `KCAS_ERR_REMOVED_DIRTY`.
Print informative messages when `remove core` command fails.
Make separate error messages for detaching.
Update help printouts.
Update documentation comments.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Flush only active core during core removal.
During core removal with `casadm -R` there's a flush triggered.
This flush shall be skipped for inactive cores.
Change return code when `casadm -R` is called with `force` flag.
There was no info about dirty data when core was removed without flush.
Do not destroy exported object while core is inactive.
Perform detach only on active cores.
Skip removing inactive core with command for active cores.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
Kernel adapter now returns is_cache_device=1 and newly added
metadata_compatible=0 in case of metadata detected with
differing version (instead of is_cache_device = 0).
This allows zero-superblock command to recognize old
cache instance and clear it.
casadm --script --check-cache-device still returns 'Is cache'='no'
in this case, as this layer only cares about metadata in the current
version to be able to detect dirty data status.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Ignore the interruption of the stop operation; it will finish asynchronously.
Remove redundant `ocf_queue_put`.
Move creating the `finish_thread` during the cache stop
from the `_cache_mngt_cache_stop_sync` to the `cache_mngt_exit_instance`
and give it a proper handling.
Signed-off-by: Slawomir Jankowski <slawomir.jankowski@intel.com>
This method produced a too optimistic free memory value, which in turn
led to OOM killer activation. This patch restores the more conservative
free memory calculation method.
This reverts commit 1e9b7a4262.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
Since kernel 5.7 kallsyms_on_each_symbol() is not available.
NOTE: This affects ability to perform upgrade in flight on kernels 5.7+.
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
This counter is not accurate (missing required memory barrier
to avoid unwanted behavior due to processor optimizations)
and performance gain is not clear - generally global
atomic variables are something we would like to avoid
going forward.
Signed-off-by: Adam Rutkowski <adam.j.rutkowski@intel.com>
Since commit 8b3238cabd50e27 in the Linux kernel removed the blk_bidi_rq()
macro, it has to be wrapped in the CAS `configure` script.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
Since there is no kernel-to-kernel API available to communicate
with the NVMe driver, it is more convenient to use some NVMe-dedicated
software (e.g. nvme-cli) to manage NVMe devices.
It is not even possible to format an NVMe device with CAS using the current
implementation on the newest kernels.
Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
To avoid logging the same message each time _cache_mngt_create_exported_object()
is called, print error message within it.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
To avoid logging the same message each time block_dev_activate_exported_object()
is called, print error message within it.
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>
In case of error `blk_mq_init_queue()` does not return NULL, but
`ERR_PTR(error_code)` instead.
`IS_ERR_OR_NULL()` should be used to check if `blk_mq_init_queue()` actually
failed.
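
The difference between the two checks can be shown with a userspace sketch.
The helpers below are stand-ins modelled on the kernel's <linux/err.h>, and
`fake_init_queue` is a hypothetical function that merely imitates the
failure convention described above:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins modelled on the kernel's error pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical allocator: returns an error pointer, never NULL. */
static void *fake_init_queue(int fail)
{
	static int queue;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&queue;
}
```

A plain `if (!queue)` test would treat the error pointer returned by
`fake_init_queue(1)` as a valid queue, while `IS_ERR_OR_NULL()` catches both
failure conventions.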
Signed-off-by: Michal Mielewczyk <michal.mielewczyk@intel.com>