linux-next/block
Waiman Long 3b8cc62987 blk-cgroup: Optimize blkcg_rstat_flush()
For a system with many CPUs and block devices, the time to do
blkcg_rstat_flush() from cgroup_rstat_flush() can be rather long. It
can be especially problematic as interrupt is disabled during the flush.
It was reported that it might take seconds to complete in some extreme
cases leading to hard lockup messages.

As it is likely that not all the percpu blkg_iostat_set's has been
updated since the last flush, those stale blkg_iostat_set's don't need
to be flushed in this case. This patch optimizes blkcg_rstat_flush()
by keeping a lockless list of recently updated blkg_iostat_set's in a
newly added percpu blkcg->lhead pointer.

The blkg_iostat_set is added to a lockless list on the update side
in blk_cgroup_bio_start(). It is removed from the lockless list when
flushed in blkcg_rstat_flush(). Due to racing, it is possible that
blk_iostat_set's in the lockless list may have no new IO stats to be
flushed, but that is OK.

To protect against destruction of blkg, a percpu reference is gotten
when putting into the lockless list and put back when removed.

When booting up an instrumented test kernel with this patch on a
2-socket 96-thread system with cgroup v2, out of the 2051 calls to
cgroup_rstat_flush() after bootup, 1788 of the calls were exited
immediately because of empty lockless list. After an all-cpu kernel
build, the ratio became 6295424/6340513. That was more than 99%.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20221105005902.407297-3-longman@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-16 16:58:44 -07:00
..
partitions block: don't add partitions if GD_SUPPRESS_PART_SCAN is set 2022-09-03 11:29:03 -06:00
badblocks.c block/badblocks: Remove redundant assignments 2022-04-23 07:15:26 -06:00
bdev.c vfs: support STATX_DIOALIGN on block devices 2022-09-11 19:47:12 -05:00
bfq-cgroup.c block, bfq: don't declare 'bfqd' as type 'void *' in bfq_group 2022-11-01 20:10:55 -06:00
bfq-iosched.c bfq: ignore oom_bfqq in bfq_check_waker 2022-11-09 12:42:26 -07:00
bfq-iosched.h block, bfq: don't declare 'bfqd' as type 'void *' in bfq_group 2022-11-01 20:10:55 -06:00
bfq-wf2q.c block, bfq: don't declare 'bfqd' as type 'void *' in bfq_group 2022-11-01 20:10:55 -06:00
bio-integrity.c block: pass struct queue_limits to the bio splitting helpers 2022-08-02 21:08:53 -06:00
bio.c bio: shrink max number of pcpu cached bios 2022-11-16 09:44:26 -07:00
blk-cgroup-fc-appid.c cgroup: Homogenize cgroup_get_from_id() return value 2022-08-26 10:57:41 -10:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-05 11:32:15 -07:00
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c blk-cgroup: Optimize blkcg_rstat_flush() 2022-11-16 16:58:44 -07:00
blk-cgroup.h blk-cgroup: Optimize blkcg_rstat_flush() 2022-11-16 16:58:44 -07:00
blk-core.c blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
blk-crypto-fallback.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
blk-crypto-internal.h blk-crypto: show crypto capabilities in sysfs 2022-02-28 06:40:23 -07:00
blk-crypto-profile.c blk-crypto: remove blk_crypto_unregister() 2021-11-29 06:38:51 -07:00
blk-crypto-sysfs.c blk-crypto: show crypto capabilities in sysfs 2022-02-28 06:40:23 -07:00
blk-crypto.c blk-crypto: show crypto capabilities in sysfs 2022-02-28 06:40:23 -07:00
blk-flush.c block: change request end_io handler to pass back a return value 2022-09-30 07:49:09 -06:00
blk-ia-ranges.c block: simplify disk_set_independent_access_ranges 2022-06-29 08:36:46 -06:00
blk-integrity.c blk-crypto: remove blk_crypto_unregister() 2021-11-29 06:38:51 -07:00
blk-ioc.c block: fix default IO priority handling again 2022-06-27 06:29:12 -06:00
blk-iocost.c blk-iocost: read 'ioc->params' inside 'ioc->lock' in ioc_timer_fn() 2022-10-23 18:59:17 -06:00
blk-iolatency.c block: Replace struct rq_depth with unsigned int in struct iolatency_grp 2022-11-01 09:12:24 -06:00
blk-ioprio.c blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: fix blkdev_issue_secure_erase 2022-09-15 00:25:17 -06:00
blk-map.c block: set FOLL_PCI_P2PDMA in bio_map_user_iov() 2022-11-09 11:29:21 -07:00
blk-merge.c block: Micro-optimize get_max_segment_size() 2022-10-25 13:41:20 -06:00
blk-mq-cpumap.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c for-6.1/block-2022-10-03 2022-10-07 09:19:14 -07:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-rdma.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-sched.c block: split elevator_switch 2022-11-01 09:12:24 -06:00
blk-mq-sched.h block: move blk_mq_sched_assign_ioc to blk-ioc.c 2021-11-29 06:41:29 -07:00
blk-mq-sysfs.c blk-mq: cleanup disk sysfs registration 2022-06-28 11:32:42 -06:00
blk-mq-tag.c sbitmap: fix batched wait_cnt accounting 2022-09-12 00:10:34 -06:00
blk-mq-tag.h blk-mq: blk_mq_tag_busy is no need to return a value 2022-06-27 06:29:12 -06:00
blk-mq-virtio.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq.c blk-mq: simplify blk_mq_realloc_tag_set_tags 2022-11-10 11:18:09 -07:00
blk-mq.h blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2021-12-22 23:38:29 -05:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below 2022-07-12 14:38:52 -06:00
blk-rq-qos.h block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO 2022-09-21 19:50:53 -06:00
blk-settings.c block: Constify most queue limits pointers 2022-10-25 13:41:17 -06:00
blk-stat.c block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-stat.h block: make queue stat accounting a reference 2021-12-14 17:23:05 -07:00
blk-sysfs.c blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
blk-throttle.c blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-throttle.h blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk-wbt: don't enable throttling if default elevator is bfq 2022-10-23 18:59:17 -06:00
blk-wbt.h blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled 2022-10-23 18:59:17 -06:00
blk-zoned.c block: adapt blk_mq_plug() to not plug for writes that require a zone lock 2022-09-29 07:45:47 -06:00
blk.h blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue 2022-10-25 08:25:10 -06:00
bsg.c scsi: core: bsg: Remove usage of the deprecated ida_simple_xxx() API 2022-06-21 21:22:51 -04:00
disk-events.c block: remove genhd.h 2022-02-02 07:49:59 -07:00
elevator.c block: Fix some kernel-doc comments 2022-11-07 07:21:45 -07:00
elevator.h block: add proper helpers for elevator_type module refcount management 2022-10-23 18:59:17 -06:00
fops.c block: remove blkdev_writepages 2022-11-16 11:32:53 -07:00
genhd.c block: remove delayed holder registration 2022-11-16 15:19:56 -07:00
holder.c block: don't allow a disk link holder to itself 2022-11-16 15:19:56 -07:00
ioctl.c block: replace blkdev_nr_zones with bdev_nr_zones 2022-07-06 06:46:26 -06:00
ioprio.c block: Fix handling of tasks without ioprio in ioprio_get(2) 2022-06-27 06:29:12 -06:00
Kconfig block: remove "select BLK_RQ_IO_DATA_LEN" from BLK_CGROUP_IOCOST dependency 2022-06-29 08:35:57 -06:00
Kconfig.iosched block: only build the icq tracking code when needed 2021-12-16 10:59:02 -07:00
kyber-iosched.c block/kyber: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
Makefile blk-cgroup: move blkcg_{get,set}_fc_appid out of line 2022-05-02 14:06:20 -06:00
mq-deadline.c block/mq-deadline: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
opal_proto.h block: sed-opal: Add ioctl to return device status 2022-08-22 07:52:51 -06:00
sed-opal.c block: sed-opal: Add ioctl to return device status 2022-08-22 07:52:51 -06:00
t10-pi.c block: add pi for extended integrity 2022-03-07 12:48:35 -07:00