block: Hold invalidate_lock in BLKDISCARD ioctl

When a BLKDISCARD ioctl races with a data read, the read can leave
stale pages in the page cache. To avoid this, hold the invalidate_lock
of the block device file mapping across the cache truncation and the
discard. The stale page cache is observed when blktests test case
block/009 is repeated hundreds of times.
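
The race can be illustrated with a small single-threaded model. This is
only a sketch under stated assumptions: struct model and every model_*
function are hypothetical names rather than kernel APIs, and the model
assumes that discarded blocks read back as zero, which real devices need
not do. It replays three orderings of a buffered read against the two
steps of a discard (dropping cached pages, then discarding on the
device) and reports whether the cache ends up disagreeing with the
device.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define NBLOCKS 4

struct model {
	char disk[NBLOCKS];	/* device contents */
	char cache[NBLOCKS];	/* cached copies of the blocks */
	bool cached[NBLOCKS];	/* true if cache[i] is populated */
};

static void model_init(struct model *m)
{
	memset(m->disk, 'A', sizeof(m->disk));
	memset(m->cache, 0, sizeof(m->cache));
	memset(m->cached, 0, sizeof(m->cached));
}

/* Discard step 1: drop cached copies (the role of truncate_bdev_range()). */
static void model_drop_cache(struct model *m)
{
	memset(m->cached, 0, sizeof(m->cached));
}

/*
 * Discard step 2: the device forgets the data (the role of
 * blkdev_issue_discard()). Assumption: discarded blocks read as zero.
 */
static void model_device_discard(struct model *m)
{
	memset(m->disk, 0, sizeof(m->disk));
}

/* Buffered read: populate the cache on a miss, then serve from the cache. */
static char model_read(struct model *m, int blk)
{
	assert(blk >= 0 && blk < NBLOCKS);
	if (!m->cached[blk]) {
		m->cache[blk] = m->disk[blk];
		m->cached[blk] = true;
	}
	return m->cache[blk];
}

/* A block is stale if it is cached and differs from the device contents. */
static bool model_is_stale(const struct model *m, int blk)
{
	assert(blk >= 0 && blk < NBLOCKS);
	return m->cached[blk] && m->cache[blk] != m->disk[blk];
}

/* Unserialized: the read slips in between the two discard steps. */
static bool racy_interleaving_is_stale(void)
{
	struct model m;

	model_init(&m);
	model_drop_cache(&m);
	(void)model_read(&m, 0);	/* refills the cache with old data */
	model_device_discard(&m);
	return model_is_stale(&m, 0);
}

/* Serialized: the read completes before the discard starts. */
static bool read_before_discard_is_stale(void)
{
	struct model m;

	model_init(&m);
	(void)model_read(&m, 0);
	model_drop_cache(&m);
	model_device_discard(&m);
	return model_is_stale(&m, 0);
}

/* Serialized: the read starts only after the discard has finished. */
static bool read_after_discard_is_stale(void)
{
	struct model m;

	model_init(&m);
	model_drop_cache(&m);
	model_device_discard(&m);
	(void)model_read(&m, 0);
	return model_is_stale(&m, 0);
}
```

In this model only the interleaved ordering leaves a stale block. Holding
a lock exclusively across both discard steps restricts a reader that
takes the same lock to the two serialized orderings.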

This patch can be backported to the stable kernel v5.15.y with slight
edits. Older stable kernels require rework.

Fixes: 351499a172 ("block: Invalidate cache on discard v2")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.15
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211109104723.835533-2-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Shin'ichiro Kawasaki 2021-11-09 19:47:22 +09:00 committed by Jens Axboe
parent cb690f5238
commit 7607c44c15

@@ -113,6 +113,7 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
 	uint64_t range[2];
 	uint64_t start, len;
 	struct request_queue *q = bdev_get_queue(bdev);
+	struct inode *inode = bdev->bd_inode;
 	int err;
 
 	if (!(mode & FMODE_WRITE))
@@ -135,12 +136,17 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
 	if (start + len > bdev_nr_bytes(bdev))
 		return -EINVAL;
 
+	filemap_invalidate_lock(inode->i_mapping);
 	err = truncate_bdev_range(bdev, mode, start, start + len - 1);
 	if (err)
-		return err;
+		goto fail;
 
-	return blkdev_issue_discard(bdev, start >> 9, len >> 9,
-				    GFP_KERNEL, flags);
+	err = blkdev_issue_discard(bdev, start >> 9, len >> 9,
+				   GFP_KERNEL, flags);
+
+fail:
+	filemap_invalidate_unlock(inode->i_mapping);
+	return err;
 }
 
 static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
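
For context, a userspace caller reaches blk_ioctl_discard() through the
BLKDISCARD ioctl, passing the same two-element range (start and length
in bytes) that the function copies in. The sketch below is illustrative
only: check_discard_range() and discard_range() are hypothetical helper
names. The 512-byte alignment checks are an assumption inferred from the
`>> 9` sector conversion, since they are not visible in the hunks above,
and the overflow-safe bounds check is deliberately stricter than the
`start + len` comparison shown in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* BLKDISCARD */

/* The ioctl argument is two 64-bit values: { start, length } in bytes. */
static_assert(sizeof(uint64_t[2]) == 16, "range must be two 64-bit values");

/*
 * Hypothetical pre-check mirroring the range validation: both values
 * must be 512-byte aligned and the range must fit inside the device.
 * Returns 0 if the range looks valid, -EINVAL otherwise.
 */
static int check_discard_range(uint64_t start, uint64_t len,
			       uint64_t dev_bytes)
{
	if (start & 511)
		return -EINVAL;
	if (len & 511)
		return -EINVAL;
	/* Written so that start + len cannot wrap around. */
	if (len > dev_bytes || start > dev_bytes - len)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical wrapper: issue BLKDISCARD on an open block device.
 * Returns 0 on success or a negative errno value on failure.
 */
static int discard_range(int fd, uint64_t start, uint64_t len)
{
	uint64_t range[2] = { start, len };

	if (ioctl(fd, BLKDISCARD, range) < 0)
		return -errno;
	return 0;
}
```

A caller would typically open the device for writing, since the handler
rejects descriptors without FMODE_WRITE, and run the pre-check before
calling discard_range().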