linux-stable/block
Yu Kuai 9a97008dbf block: support to account io_ticks precisely
[ Upstream commit 99dc422335 ]

Currently, io_ticks is accounted based on sampling, specifically
update_io_ticks() will always account io_ticks by 1 jiffies from
bdev_start_io_acct()/blk_account_io_start(), and the result can be
inaccurate, for example(HZ is 250):

Test script:
fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms

Test result: util is about 90%, while the disk is really idle.

This behaviour is introduced by commit 5b18b5a737 ("block: delete
part_round_stats and switch to less precise counting"), however, there
was a key point that is missed that this patch also improve performance
a lot:

Before the commit:
part_round_stats:
  if (part->stamp != now)
   stats |= 1;

  part_in_flight()
  -> there can be lots of task here in 1 jiffies.
  part_round_stats_single()
   __part_stat_add()
  part->stamp = now;

After the commit:
update_io_ticks:
  stamp = part->bd_stamp;
  if (time_after(now, stamp))
   if (try_cmpxchg())
    __part_stat_add()
    -> only one task can reach here in 1 jiffies.

Hence in order to account io_ticks precisely, we only need to know if
there are IO inflight at most once in one jiffies. Noted that for
rq-based device, iterating tags should not be used here because
'tags->lock' is grabbed in blk_mq_find_and_get_req(), hence
part_stat_lock_inc/dec() and part_in_flight() is used to trace inflight.
The additional overhead is quite little:

 - per cpu add/dec for each IO for rq-based device;
 - per cpu sum for each jiffies;

And it's verified by null-blk that there are no performance degration
under heavy IO pressure.

Fixes: 5b18b5a737 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20240509123717.3223892-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12 11:03:07 +02:00
..
partitions block: Move checking GENHD_FL_NO_PART to bdev_add_partition() 2024-01-31 16:17:11 -08:00
badblocks.c block/badblocks: Remove redundant assignments 2022-04-23 07:15:26 -06:00
bdev.c block: update the stable_writes flag in bdev_add 2024-01-10 17:10:32 +01:00
bfq-cgroup.c block, bfq: fix uaf for bfqq in bic_set_bfqq() 2023-02-09 11:28:06 +01:00
bfq-iosched.c block, bfq: Fix division by zero error on zero wsum 2023-05-24 17:32:38 +01:00
bfq-iosched.h block, bfq: remove unused variable for bfq_queue 2022-10-20 05:46:49 -07:00
bfq-wf2q.c block, bfq: remove useless parameter for bfq_add/del_bfqq_busy() 2022-08-22 10:07:56 -06:00
bio-integrity.c block: factor out a bvec_set_page helper 2023-09-23 11:11:08 +02:00
bio.c block: Fix page refcounts for unaligned buffers in __bio_release_pages() 2024-04-03 15:19:46 +02:00
blk-cgroup-fc-appid.c cgroup: Homogenize cgroup_get_from_id() return value 2022-08-26 10:57:41 -10:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-05 11:32:15 -07:00
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c blk-cgroup: bypass blkcg_deactivate_policy after destroying 2023-12-20 17:00:21 +01:00
blk-cgroup.h blk-cgroup: pass a gendisk to blkcg_init_queue and blkcg_exit_queue 2022-09-26 19:09:31 -06:00
blk-core.c block: support to account io_ticks precisely 2024-06-12 11:03:07 +02:00
blk-crypto-fallback.c blk-crypto: dynamically allocate fallback profile 2023-08-23 17:52:39 +02:00
blk-crypto-internal.h blk-mq: release crypto keyslot before reporting I/O complete 2023-05-11 23:03:00 +09:00
blk-crypto-profile.c blk-crypto: use dynamic lock class for blk_crypto_profile::lock 2023-07-23 13:49:21 +02:00
blk-crypto-sysfs.c blk-crypto: show crypto capabilities in sysfs 2022-02-28 06:40:23 -07:00
blk-crypto.c blk-crypto: make blk_crypto_evict_key() more robust 2023-05-11 23:03:01 +09:00
blk-flush.c block: change request end_io handler to pass back a return value 2022-09-30 07:49:09 -06:00
blk-ia-ranges.c block: simplify disk_set_independent_access_ranges 2022-06-29 08:36:46 -06:00
blk-integrity.c blk-crypto: remove blk_crypto_unregister() 2021-11-29 06:38:51 -07:00
blk-ioc.c block: fix default IO priority handling again 2022-06-27 06:29:12 -06:00
blk-iocost.c blk-iocost: avoid out of bounds shift 2024-05-17 11:56:07 +02:00
blk-iolatency.c blk-cgroup: pass a gendisk to blkcg_schedule_throttle 2022-09-26 19:17:28 -06:00
blk-ioprio.c blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: fix blkdev_issue_secure_erase 2022-09-15 00:25:17 -06:00
blk-map.c block: Fix WARNING in _copy_from_iter 2024-03-01 13:26:25 +01:00
blk-merge.c block: support to account io_ticks precisely 2024-06-12 11:03:07 +02:00
blk-mq-cpumap.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-rdma.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-sched.c blk-mq: correct stale comment of .get_budget 2023-03-10 09:32:44 +01:00
blk-mq-sched.h block: move blk_mq_sched_assign_ioc to blk-ioc.c 2021-11-29 06:41:29 -07:00
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2022-12-31 13:33:03 +01:00
blk-mq-tag.c blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-mq-tag.h blk-mq: blk_mq_tag_busy is no need to return a value 2022-06-27 06:29:12 -06:00
blk-mq-virtio.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq.c block: support to account io_ticks precisely 2024-06-12 11:03:07 +02:00
blk-mq.h blk-mq: fix potential io hang by wrong 'wake_batch' 2023-07-19 16:20:55 +02:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2021-12-22 23:38:29 -05:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below 2022-07-12 14:38:52 -06:00
blk-rq-qos.h block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO 2022-09-21 19:50:53 -06:00
blk-settings.c block: Clear zone limits for a non-zoned stacked queue 2024-04-03 15:19:27 +02:00
blk-stat.c block: prevent division by zero in blk_rq_stat_sum() 2024-04-13 13:05:12 +02:00
blk-stat.h block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-sysfs.c block: fix use-after-free of q->q_usage_counter 2023-10-10 22:00:37 +02:00
blk-throttle.c blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2023-12-20 17:00:21 +01:00
blk-throttle.h blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init() 2022-10-09 07:48:16 -06:00
blk-wbt.h blk-wbt: remove wbt_track stub 2022-03-31 12:58:38 -06:00
blk-zoned.c block: adapt blk_mq_plug() to not plug for writes that require a zone lock 2022-09-29 07:45:47 -06:00
blk.h block: support to account io_ticks precisely 2024-06-12 11:03:07 +02:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c blk-mq: Drop blk_mq_ops.timeout 'reserved' arg 2022-07-06 06:33:53 -06:00
bsg.c scsi: core: bsg: Remove usage of the deprecated ida_simple_xxx() API 2022-06-21 21:22:51 -04:00
disk-events.c block: increment diskseq on all media change events 2023-07-19 16:21:47 +02:00
elevator.c blk-mq: use quiesced elevator switch when reinitializing queues 2022-09-27 09:58:56 -06:00
elevator.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
fops.c block: Don't invalidate pagecache for invalid falloc modes 2024-01-10 17:10:20 +01:00
genhd.c block: support to account io_ticks precisely 2024-06-12 11:03:07 +02:00
holder.c block: remove WARN_ON() from bd_link_disk_holder 2022-06-23 07:48:05 -06:00
ioctl.c block: fix overflow in blk_ioctl_discard() 2024-05-17 11:56:05 +02:00
ioprio.c block: Fix handling of tasks without ioprio in ioprio_get(2) 2022-06-27 06:29:12 -06:00
Kconfig block: remove "select BLK_RQ_IO_DATA_LEN" from BLK_CGROUP_IOCOST dependency 2022-06-29 08:35:57 -06:00
Kconfig.iosched block: only build the icq tracking code when needed 2021-12-16 10:59:02 -07:00
kyber-iosched.c block/kyber: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
Makefile blk-cgroup: move blkcg_{get,set}_fc_appid out of line 2022-05-02 14:06:20 -06:00
mq-deadline.c Revert "block/mq-deadline: use correct way to throttling write requests" 2024-04-03 15:19:37 +02:00
opal_proto.h block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:20:26 -04:00
sed-opal.c block: sed-opal: handle empty atoms when parsing response 2024-03-26 18:20:26 -04:00
t10-pi.c block: add pi for extended integrity 2022-03-07 12:48:35 -07:00