linux-next/block
Paolo Valente 9778369a2d block, bfq: split sync bfq_queues on a per-actuator basis
Single-LUN multi-actuator SCSI drives, as well as all multi-actuator
SATA drives appear as a single device to the I/O subsystem [1].  Yet
they address commands to different actuators internally, as a function
of Logical Block Addressing (LBAs). A given sector is reachable by
only one of the actuators. For example, Seagate’s Serial Advanced
Technology Attachment (SATA) version contains two actuators and maps
the lower half of the SATA LBA space to the lower actuator and the
upper half to the upper actuator.

Evidently, to fully utilize actuators, no actuator must be left idle
or underutilized while there is pending I/O for it. The block layer
must somehow control the load of each actuator individually. This
commit lays the ground for allowing BFQ to provide such a per-actuator
control.

BFQ associates an I/O-request sync bfq_queue with each process doing
synchronous I/O, or with a group of processes, in case of queue
merging. Then BFQ serves one bfq_queue at a time. While in service, a
bfq_queue is emptied in request-position order. Yet the same process,
or group of processes, may generate I/O for different actuators. In
this case, different streams of I/O (each for a different actuator)
get all inserted into the same sync bfq_queue. So there is basically
no individual control on when each stream is served, i.e., on when the
I/O requests of the stream are picked from the bfq_queue and
dispatched to the drive.

This commit enables BFQ to control the service of each actuator
individually for synchronous I/O, by simply splitting each sync
bfq_queue into N queues, one for each actuator. In other words, a sync
bfq_queue is now associated to a pair (process, actuator). As a
consequence of this split, the per-queue proportional-share policy
implemented by BFQ will guarantee that the sync I/O generated for each
actuator, by each process, receives its fair share of service.

This is just a preparatory patch. If the I/O of the same process
happens to be sent to different queues, then each of these queues may
undergo queue merging. To handle this event, the bfq_io_cq data
structure must be properly extended. In addition, stable merging must
be disabled to avoid loss of control on individual actuators. Finally,
also async queues must be split. These issues are described in detail
and addressed in next commits. As for this commit, although multiple
per-process bfq_queues are provided, the I/O of each process or group
of processes is still sent to only one queue, regardless of the
actuator the I/O is for. The forwarding to distinct bfq_queues will be
enabled after addressing the above issues.

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Gabriele Felici <felicigb@gmail.com>
Signed-off-by: Carmine Zaccagnino <carmine@carminezacc.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20230103145503.71712-2-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29 15:18:32 -07:00
..
partitions block: don't add partitions if GD_SUPPRESS_PART_SCAN is set 2022-09-03 11:29:03 -06:00
badblocks.c block/badblocks: Remove redundant assignments 2022-04-23 07:15:26 -06:00
bdev.c block: bdev & blktrace: use consistent function doc. notation 2022-12-01 09:16:46 -07:00
bfq-cgroup.c block, bfq: split sync bfq_queues on a per-actuator basis 2023-01-29 15:18:32 -07:00
bfq-iosched.c block, bfq: split sync bfq_queues on a per-actuator basis 2023-01-29 15:18:32 -07:00
bfq-iosched.h block, bfq: split sync bfq_queues on a per-actuator basis 2023-01-29 15:18:32 -07:00
bfq-wf2q.c block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED 2022-12-15 05:11:59 -07:00
bio-integrity.c block: pass struct queue_limits to the bio splitting helpers 2022-08-02 21:08:53 -06:00
bio.c Revert "block: bio_copy_data_iter" 2023-01-04 14:43:27 -07:00
blk-cgroup-fc-appid.c cgroup: Homogenize cgroup_get_from_id() return value 2022-08-26 10:57:41 -10:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-05 11:32:15 -07:00
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c blk-cgroup: fix missing pd_online_fn() while activating policy 2023-01-16 19:04:07 -07:00
blk-cgroup.h blk-cgroup: Optimize blkcg_rstat_flush() 2022-11-16 16:58:44 -07:00
blk-core.c block: Drop spurious might_sleep() from blk_put_queue() 2023-01-08 20:29:28 -07:00
blk-crypto-fallback.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
blk-crypto-internal.h blk-crypto: pass a gendisk to blk_crypto_sysfs_{,un}register 2022-11-30 11:09:00 -07:00
blk-crypto-profile.c blk-crypto: Add a missing include directive 2022-11-23 10:38:54 -07:00
blk-crypto-sysfs.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-crypto.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
blk-flush.c block: change request end_io handler to pass back a return value 2022-09-30 07:49:09 -06:00
blk-ia-ranges.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-integrity.c blk-crypto: remove blk_crypto_unregister() 2021-11-29 06:38:51 -07:00
blk-ioc.c block: fix default IO priority handling again 2022-06-27 06:29:12 -06:00
blk-iocost.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
blk-iolatency.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
blk-ioprio.c blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: fix blkdev_issue_secure_erase 2022-09-15 00:25:17 -06:00
blk-map.c block: set FOLL_PCI_P2PDMA in bio_map_user_iov() 2022-11-09 11:29:21 -07:00
blk-merge.c block: don't allow splitting of a REQ_NOWAIT bio 2023-01-04 13:24:44 -07:00
blk-mq-cpumap.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c for-6.1/block-2022-10-03 2022-10-07 09:19:14 -07:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-rdma.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-sched.c block: split elevator_switch 2022-11-01 09:12:24 -06:00
blk-mq-sched.h block: move blk_mq_sched_assign_ioc to blk-ioc.c 2021-11-29 06:41:29 -07:00
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2022-11-25 06:34:03 -07:00
blk-mq-tag.c sbitmap: fix batched wait_cnt accounting 2022-09-12 00:10:34 -06:00
blk-mq-tag.h blk-mq: blk_mq_tag_busy is no need to return a value 2022-06-27 06:29:12 -06:00
blk-mq-virtio.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq.c block: fix hctx checks for batch allocation 2023-01-17 09:56:52 -07:00
blk-mq.h blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2021-12-22 23:38:29 -05:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below 2022-07-12 14:38:52 -06:00
blk-rq-qos.h block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO 2022-09-21 19:50:53 -06:00
blk-settings.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
blk-stat.c block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-stat.h block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-sysfs.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-throttle.c blk-throttle: Use more suitable time_after check for update of slice_start 2022-12-05 13:45:31 -07:00
blk-throttle.h blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk-wbt: don't enable throttling if default elevator is bfq 2022-10-23 18:59:17 -06:00
blk-wbt.h blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled 2022-10-23 18:59:17 -06:00
blk-zoned.c block: adapt blk_mq_plug() to not plug for writes that require a zone lock 2022-09-29 07:45:47 -06:00
blk.h for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue 2022-10-25 08:25:10 -06:00
bsg.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
disk-events.c block: remove genhd.h 2022-02-02 07:49:59 -07:00
elevator.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
elevator.h block: add proper helpers for elevator_type module refcount management 2022-10-23 18:59:17 -06:00
fops.c block: remove blkdev_writepages 2022-11-16 11:32:53 -07:00
genhd.c block-2023-01-06 2023-01-06 13:12:42 -08:00
holder.c block: don't allow a disk link holder to itself 2022-11-16 15:19:56 -07:00
ioctl.c block: Do not reread partition table on exclusively open device 2022-12-01 07:44:03 -07:00
ioprio.c block: Fix handling of tasks without ioprio in ioprio_get(2) 2022-06-27 06:29:12 -06:00
Kconfig block: Remove "select SRCU" 2023-01-05 08:50:10 -07:00
Kconfig.iosched block: only build the icq tracking code when needed 2021-12-16 10:59:02 -07:00
kyber-iosched.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
Makefile blk-cgroup: move blkcg_{get,set}_fc_appid out of line 2022-05-02 14:06:20 -06:00
mq-deadline.c block: mq-deadline: Rename deadline_is_seq_writes() 2022-11-28 19:27:45 -07:00
opal_proto.h block: sed-opal: Add ioctl to return device status 2022-08-22 07:52:51 -06:00
sed-opal.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
t10-pi.c block: add pi for extended integrity 2022-03-07 12:48:35 -07:00