mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-16 05:26:07 +00:00
4e54ea72ed
Suggested-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20240619154623.450048-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
96 lines
3.8 KiB
ReStructuredText
96 lines
3.8 KiB
ReStructuredText
==========================================
|
|
Explicit volatile write back cache control
|
|
==========================================
|
|
|
|
Introduction
|
|
------------
|
|
|
|
Many storage devices, especially in the consumer market, come with volatile
|
|
write back caches. That means the devices signal I/O completion to the
|
|
operating system before data actually has hit the non-volatile storage. This
|
|
behavior obviously speeds up various workloads, but it means the operating
|
|
system needs to force data out to the non-volatile storage when it performs
|
|
a data integrity operation like fsync, sync or an unmount.
|
|
|
|
The Linux block layer provides two simple mechanisms that let filesystems
|
|
control the caching behavior of the storage device. These mechanisms are
|
|
a forced cache flush, and the Force Unit Access (FUA) flag for requests.
|
|
|
|
|
|
Explicit cache flushes
|
|
----------------------
|
|
|
|
The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from
|
|
the filesystem and will make sure the volatile cache of the storage device
|
|
has been flushed before the actual I/O operation is started. This explicitly
|
|
guarantees that previously completed write requests are on non-volatile
|
|
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
|
|
set on an otherwise empty bio structure, which causes only an explicit cache
|
|
flush without any dependent I/O. It is recommend to use
|
|
the blkdev_issue_flush() helper for a pure cache flush.
|
|
|
|
|
|
Forced Unit Access
|
|
------------------
|
|
|
|
The REQ_FUA flag can be OR ed into the r/w flags of a bio submitted from the
|
|
filesystem and will make sure that I/O completion for this request is only
|
|
signaled after the data has been committed to non-volatile storage.
|
|
|
|
|
|
Implementation details for filesystems
|
|
--------------------------------------
|
|
|
|
Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
|
|
worry if the underlying devices need any explicit cache flushing and how
|
|
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
|
|
may both be set on a single bio.
|
|
|
|
Feature settings for block drivers
|
|
----------------------------------
|
|
|
|
For devices that do not support volatile write caches there is no driver
|
|
support required, the block layer completes empty REQ_PREFLUSH requests before
|
|
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
|
|
requests that have a payload.
|
|
|
|
For devices with volatile write caches the driver needs to tell the block layer
|
|
that it supports flushing caches by setting the
|
|
|
|
BLK_FEAT_WRITE_CACHE
|
|
|
|
flag in the queue_limits feature field. For devices that also support the FUA
|
|
bit the block layer needs to be told to pass on the REQ_FUA bit by also setting
|
|
the
|
|
|
|
BLK_FEAT_FUA
|
|
|
|
flag in the features field of the queue_limits structure.
|
|
|
|
Implementation details for bio based block drivers
|
|
--------------------------------------------------
|
|
|
|
For bio based drivers the REQ_PREFLUSH and REQ_FUA bit are simply passed on to
|
|
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag and the driver
|
|
needs to handle them.
|
|
|
|
*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flags is
|
|
_not_ set. Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
|
|
handle REQ_FUA.
|
|
|
|
For remapping drivers the REQ_FUA bits need to be propagated to underlying
|
|
devices, and a global flush needs to be implemented for bios with the
|
|
REQ_PREFLUSH bit set.
|
|
|
|
Implementation details for blk-mq drivers
|
|
-----------------------------------------
|
|
|
|
When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
|
|
with a payload are automatically turned into a sequence of a REQ_OP_FLUSH
|
|
request followed by the actual write by the block layer.
|
|
|
|
When the BLK_FEAT_FUA flags is set, the REQ_FUA bit is simply passed on for the
|
|
REQ_OP_WRITE request, else a REQ_OP_FLUSH request is sent by the block layer
|
|
after the completion of the write request for bio submissions with the REQ_FUA
|
|
bit set.
|