linux-stable/block
Muchun Song 3e62d49f59 block: fix ordering between checking BLK_MQ_S_STOPPED request adding
commit 96a9fe64bf upstream.

Supposing first scenario with a virtio_blk driver.

CPU0                        CPU1

blk_mq_try_issue_directly()
  __blk_mq_issue_directly()
    q->mq_ops->queue_rq()
      virtio_queue_rq()
        blk_mq_stop_hw_queue()
                            virtblk_done()
  blk_mq_request_bypass_insert()  1) store
                              blk_mq_start_stopped_hw_queue()
                                clear_bit(BLK_MQ_S_STOPPED)       3) store
                                blk_mq_run_hw_queue()
                                  if (!blk_mq_hctx_has_pending()) 4) load
                                    return
                                  blk_mq_sched_dispatch_requests()
  blk_mq_run_hw_queue()
    if (!blk_mq_hctx_has_pending())
      return
    blk_mq_sched_dispatch_requests()
      if (blk_mq_hctx_stopped())  2) load
        return
      __blk_mq_sched_dispatch_requests()

Supposing another scenario.

CPU0                        CPU1

blk_mq_requeue_work()
  blk_mq_insert_request() 1) store
                            virtblk_done()
                              blk_mq_start_stopped_hw_queue()
  blk_mq_run_hw_queues()        clear_bit(BLK_MQ_S_STOPPED)       3) store
                                blk_mq_run_hw_queue()
                                  if (!blk_mq_hctx_has_pending()) 4) load
                                    return
                                  blk_mq_sched_dispatch_requests()
    if (blk_mq_hctx_stopped())  2) load
      continue
    blk_mq_run_hw_queue()

Both scenarios are similar, the full memory barrier should be inserted
between 1) and 2), as well as between 3) and 4) to make sure that either
CPU0 sees BLK_MQ_S_STOPPED is cleared or CPU1 sees dispatch list.
Otherwise, either CPU will not rerun the hardware queue causing
starvation of the request.

The easy way to fix it is to add the essential full memory barrier into
helper of blk_mq_hctx_stopped(). In order to not affect the fast path
(hardware queue is not stopped most of the time), we only insert the
barrier into the slow path. Actually, only slow path needs to care about
missing of dispatching the request to the low-level device driver.

Fixes: 320ae51fee ("blk-mq: new multi-queue block IO queueing mechanism")
Cc: stable@vger.kernel.org
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20241014092934.53630-4-songmuchun@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-05 10:59:40 +01:00
..
partitions block: fix signed int overflow in Amiga partition support 2023-08-30 16:31:46 +02:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() 2020-03-25 08:06:08 +01:00
bfq-iosched.c block, bfq: don't break merge chain in bfq_split_bfqq() 2024-11-08 16:19:04 +01:00
bfq-iosched.h block, bfq: fix use after free in bfq_bfqq_expire 2021-12-29 12:20:43 +01:00
bfq-wf2q.c block, bfq: fix use after free in bfq_bfqq_expire 2021-12-29 12:20:43 +01:00
bio-integrity.c block: initialize integrity buffer to zero before writing it to media 2024-09-12 11:02:50 +02:00
bio.c block: Remove special-casing of compound pages 2024-02-23 08:12:40 +01:00
blk-cgroup.c bdi: use bdi_dev_name() to get device name 2021-08-08 08:54:29 +02:00
blk-core.c block: fix and cleanup bio_check_ro 2023-02-06 07:49:41 +01:00
blk-exec.c blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request 2018-01-17 09:49:21 -07:00
blk-flush.c block: Fix fsync always failed if once failed 2022-03-08 19:04:08 +01:00
blk-integrity.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
blk-ioc.c block: Fix use-after-free issue accessing struct io_cq 2020-04-17 10:48:41 +02:00
blk-iolatency.c blk-iolatency: Fix inflight count imbalances and IO hangs on offline 2022-06-14 16:59:30 +02:00
blk-lib.c block: fix 32 bit overflow in __blkdev_issue_discard() 2020-02-01 09:37:12 +00:00
blk-map.c block: fix memleak when __blk_rq_map_user_iov() is failed 2020-01-12 12:17:22 +01:00
blk-merge.c block: don't merge across cgroup boundaries if blkcg is enabled 2022-04-15 14:14:41 +02:00
blk-mq-cpumap.c blk-mq: don't keep offline CPUs mapped to hctx 0 2018-04-10 08:38:46 -06:00
blk-mq-debugfs-zoned.c block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-debugfs.c block, scsi: Change the preempt-only flag into a counter 2019-08-04 09:30:57 +02:00
blk-mq-debugfs.h block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-pci.c blk-mq: code clean-up by adding an API to clear set->mq_map 2018-07-09 09:07:53 -06:00
blk-mq-rdma.c block: Add rdma affinity based queue mapping helper 2017-08-08 14:58:03 -04:00
blk-mq-sched.c blk-mq: remove stale comment for blk_mq_sched_mark_restart_hctx 2023-03-11 16:31:33 +01:00
blk-mq-sched.h block: mq-deadline: Fix write completion handling 2019-01-13 09:51:07 +01:00
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2023-01-18 11:30:37 +01:00
blk-mq-tag.c blk-mq: Allow blocking queue tag iter callbacks 2018-09-25 20:17:59 -06:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c block: fix ordering between checking BLK_MQ_S_STOPPED request adding 2024-12-05 10:59:40 +01:00
blk-mq.h block: fix ordering between checking BLK_MQ_S_STOPPED request adding 2024-12-05 10:59:40 +01:00
blk-rq-qos.c blk-wbt: fix performance regression in wbt scale_up/scale_down 2019-10-17 13:45:16 -07:00
blk-rq-qos.h blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-29 09:20:09 +01:00
blk-settings.c blk-settings: align max_sectors on "logical_block_size" boundary 2021-03-04 09:39:51 +01:00
blk-softirq.c block: fix timeout changes for legacy request drivers 2018-06-19 11:27:18 -06:00
blk-stat.c block: prevent division by zero in blk_rq_stat_sum() 2024-04-13 12:50:16 +02:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2019-01-13 09:51:06 +01:00
blk-sysfs.c block: don't delete queue kobject before its children 2022-04-15 14:14:43 +02:00
blk-tag.c for-linus-20180616 2018-06-17 05:37:55 +09:00
blk-throttle.c blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2023-12-20 15:38:02 +01:00
blk-timeout.c blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE 2018-06-23 10:25:45 -06:00
blk-wbt.c blk-wbt: make sure throttle is enabled properly 2021-07-20 16:15:48 +02:00
blk-wbt.h blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled() 2021-07-20 16:15:48 +02:00
blk-zoned.c blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN 2021-09-22 11:47:57 +02:00
blk.h block: split .sysfs_lock into two locks 2021-03-04 09:39:30 +01:00
bounce.c block: copy ioprio in __bio_clone_fast() and bounce 2018-12-01 09:37:32 +01:00
bsg-lib.c block/bsg-lib: use PTR_ERR_OR_ZERO to simplify the flow path 2018-08-01 09:13:03 -06:00
bsg.c block: bsg: move atomic_t ref_count variable to refcount API 2018-08-27 19:17:02 -06:00
cfq-iosched.c cfq: Suppress compiler warnings about comparisons 2018-08-07 17:57:13 -06:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c block/compat_ioctl: fix range check in BLKGETSIZE 2022-04-27 13:39:45 +02:00
deadline-iosched.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
elevator.c block/wbt: fix negative inflight counter when remove scsi device 2022-02-23 11:58:40 +01:00
genhd.c md: switch to ->check_events for media change notifications 2024-03-26 18:22:34 -04:00
ioctl.c block: add a new set_read_only method 2024-03-26 18:22:34 -04:00
ioprio.c block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2) 2021-12-14 10:18:07 +01:00
Kconfig block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
Kconfig.iosched License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kyber-iosched.c block: kyber: make kyber more friendly with merging 2018-05-30 10:47:40 -06:00
Makefile block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
mq-deadline.c block: mq-deadline: Fix queue restart handling 2019-10-07 18:57:19 +02:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:22:33 -04:00
partition-generic.c block: unhash blkdev part inode when the part is deleted 2023-01-18 11:29:59 +01:00
scsi_ioctl.c block: consistently use GFP_NOIO instead of __GFP_NORECLAIM 2018-05-14 08:55:18 -06:00
sed-opal.c block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:22:33 -04:00
t10-pi.c block: move dif_prepare/dif_complete functions to block layer 2018-07-30 08:27:02 -06:00