linux-stable/drivers
Jens Axboe a9ce385344 dm: don't attempt to queue IO under RCU protection
dm looks up the table for IO based on the request type, with an
assumption that if the request is marked REQ_NOWAIT, it's fine to
attempt to submit that IO while under RCU read lock protection. This
is not OK, as REQ_NOWAIT just means that we should not be sleeping
waiting on other IO, it does not mean that we can't potentially
schedule.

A simple test case demonstrates this quite nicely:

int main(int argc, char *argv[])
{
        struct iovec iov;
        int fd;

        fd = open("/dev/dm-0", O_RDONLY | O_DIRECT);
        posix_memalign(&iov.iov_base, 4096, 4096);
        iov.iov_len = 4096;
        preadv2(fd, &iov, 1, 0, RWF_NOWAIT);
        return 0;
}

which will instantly spew:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5580, name: dm-nowait
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
INFO: lockdep is turned off.
CPU: 7 PID: 5580 Comm: dm-nowait Not tainted 6.6.0-rc1-g39956d2dcd81 #132
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x11d/0x1b0
 __might_resched+0x3c3/0x5e0
 ? preempt_count_sub+0x150/0x150
 mempool_alloc+0x1e2/0x390
 ? mempool_resize+0x7d0/0x7d0
 ? lock_sync+0x190/0x190
 ? lock_release+0x4b7/0x670
 ? internal_get_user_pages_fast+0x868/0x2d40
 bio_alloc_bioset+0x417/0x8c0
 ? bvec_alloc+0x200/0x200
 ? internal_get_user_pages_fast+0xb8c/0x2d40
 bio_alloc_clone+0x53/0x100
 dm_submit_bio+0x27f/0x1a20
 ? lock_release+0x4b7/0x670
 ? blk_try_enter_queue+0x1a0/0x4d0
 ? dm_dax_direct_access+0x260/0x260
 ? rcu_is_watching+0x12/0xb0
 ? blk_try_enter_queue+0x1cc/0x4d0
 __submit_bio+0x239/0x310
 ? __bio_queue_enter+0x700/0x700
 ? kvm_clock_get_cycles+0x40/0x60
 ? ktime_get+0x285/0x470
 submit_bio_noacct_nocheck+0x4d9/0xb80
 ? should_fail_request+0x80/0x80
 ? preempt_count_sub+0x150/0x150
 ? lock_release+0x4b7/0x670
 ? __bio_add_page+0x143/0x2d0
 ? iov_iter_revert+0x27/0x360
 submit_bio_noacct+0x53e/0x1b30
 submit_bio_wait+0x10a/0x230
 ? submit_bio_wait_endio+0x40/0x40
 __blkdev_direct_IO_simple+0x4f8/0x780
 ? blkdev_bio_end_io+0x4c0/0x4c0
 ? stack_trace_save+0x90/0xc0
 ? __bio_clone+0x3c0/0x3c0
 ? lock_release+0x4b7/0x670
 ? lock_sync+0x190/0x190
 ? atime_needs_update+0x3bf/0x7e0
 ? timestamp_truncate+0x21b/0x2d0
 ? inode_owner_or_capable+0x240/0x240
 blkdev_direct_IO.part.0+0x84a/0x1810
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 ? blkdev_read_iter+0x40d/0x530
 ? reacquire_held_locks+0x4e0/0x4e0
 ? __blkdev_direct_IO_simple+0x780/0x780
 ? rcu_is_watching+0x12/0xb0
 ? __mark_inode_dirty+0x297/0xd50
 ? preempt_count_add+0x72/0x140
 blkdev_read_iter+0x2a4/0x530
 do_iter_readv_writev+0x2f2/0x3c0
 ? generic_copy_file_range+0x1d0/0x1d0
 ? fsnotify_perm.part.0+0x25d/0x630
 ? security_file_permission+0xd8/0x100
 do_iter_read+0x31b/0x880
 ? import_iovec+0x10b/0x140
 vfs_readv+0x12d/0x1a0
 ? vfs_iter_read+0xb0/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 do_preadv+0x1b3/0x260
 ? do_readv+0x370/0x370
 __x64_sys_preadv2+0xef/0x150
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5af41ad806
Code: 41 54 41 89 fc 55 44 89 c5 53 48 89 cb 48 83 ec 18 80 3d e4 dd 0d 00 00 74 7a 45 89 c1 49 89 ca 45 31 c0 b8 47 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 be 00 00 00 48 85 c0 79 4a 48 8b 0d da 55
RSP: 002b:00007ffd3145c7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000147
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5af41ad806
RDX: 0000000000000001 RSI: 00007ffd3145c850 RDI: 0000000000000003
RBP: 0000000000000008 R08: 0000000000000000 R09: 0000000000000008
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ffd3145c850 R14: 000055f5f0431dd8 R15: 0000000000000001
 </TASK>

where in fact it is dm itself that attempts to allocate a bio clone with
GFP_NOIO under the rcu read lock, regardless of the request type.

Fix this by getting rid of the special casing for REQ_NOWAIT, and just
use the normal SRCU protected table lookup. Get rid of the bio based
table locking helpers at the same time, as they are now unused.

Cc: stable@vger.kernel.org
Fixes: 563a225c9f ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-09-15 15:39:59 -04:00
..
accel Short summary of fixes pull: 2023-09-08 06:36:36 +10:00
accessibility
acpi More thermal control updates for 6.6-rc1 2023-09-04 15:17:28 -07:00
amba amba: bus: fix refcount leak 2023-08-22 15:50:57 +02:00
android Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
ata SCSI misc on 20230909 2023-09-09 12:01:33 -07:00
atm
auxdisplay drm for 6.6-rc1 2023-08-30 13:34:34 -07:00
base Driver core changes for 6.6-rc1 2023-09-01 09:43:18 -07:00
bcma
block block-6.6-2023-09-08 2023-09-08 21:39:54 -07:00
bluetooth TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
bus Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
cache cache: Add L2 cache management for Andes AX45MP RISC-V core 2023-09-01 09:08:59 -07:00
cdrom
cdx
char tpm: Enable hwrng only for Pluton on AMD CPUs 2023-09-04 21:57:59 +03:00
clk This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
clocksource Updates for clocksource/clockevent drivers: 2023-09-04 13:15:57 -07:00
comedi
connector
counter - New Drivers 2023-09-04 13:47:59 -07:00
cpufreq cpufreq: Support per-policy performance boost 2023-08-29 20:51:40 +02:00
cpuidle powerpc updates for 6.6 2023-08-31 12:43:10 -07:00
crypto This update includes the following changes: 2023-08-29 11:23:29 -07:00
cxl
dax mm: remove enum page_entry_size 2023-08-24 16:20:30 -07:00
dca
devfreq
dio
dma dmaengine updates for v6.6 2023-09-03 10:49:42 -07:00
dma-buf drm for 6.6-rc1 2023-08-30 13:34:34 -07:00
edac Intel EDAC fixes: 2023-08-30 19:23:00 -07:00
eisa
extcon
firewire
firmware RISC-V Patches for the 6.6 Merge Window, Part 2 (try 2) 2023-09-09 14:25:11 -07:00
fpga
fsi fsi: i2cr: Switch to use struct i2c_driver's .probe() 2023-08-22 15:51:33 +02:00
genpd ARM: SoC drivers for 6.6 2023-08-30 16:42:21 -07:00
gnss
gpio gpio: zynq: restore zynq_gpio_irq_reqres/zynq_gpio_irq_relres callbacks 2023-09-06 17:08:51 +02:00
gpu drm ci for 6.6-rc1 2023-09-10 11:55:26 -07:00
greybus
hid for-linus-2023083101 2023-09-01 12:31:44 -07:00
hsi
hte hte: Explicitly include correct DT includes 2023-08-28 13:31:06 -05:00
hv hyperv-next for v6.6 2023-09-04 11:26:29 -07:00
hwmon Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
hwspinlock
hwtracing coresight: trbe: Fix TRBE potential sleep in atomic context 2023-08-18 16:42:26 +01:00
i2c I2C has mainly cleanups this time and a few driver improvements. Because 2023-09-04 13:44:11 -07:00
i3c i3c: master: svc: fix probe failure when no i3c device exist 2023-09-06 01:21:47 +02:00
idle Perf events changes for v6.6: 2023-08-28 16:35:01 -07:00
iio Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
infiniband SCSI misc on 20230902 2023-09-02 12:02:41 -07:00
input Input updates for 6.6 merge window: 2023-09-06 09:24:25 -07:00
interconnect This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
iommu IOMMU Updates for Linux v6.6 2023-09-01 16:54:25 -07:00
ipack
irqchip Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
isdn Merge commit b320441c04 ("Merge tag 'tty-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty") into tty-next 2023-08-20 14:29:37 +02:00
leds - Core Frameworks 2023-09-04 13:52:58 -07:00
macintosh powerpc updates for 6.6 2023-08-31 12:43:10 -07:00
mailbox mailbox: qcom-ipcc: fix incorrect num_chans counting 2023-09-05 10:11:01 -05:00
mcb
md dm: don't attempt to queue IO under RCU protection 2023-09-15 15:39:59 -04:00
media media: dvb: symbol fixup for dvb_attach() 2023-09-09 08:15:11 +01:00
memory
memstick
message
mfd spi: Updates for v6.6 2023-08-29 09:47:33 -07:00
misc Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
mmc TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
most
mtd - New Drivers 2023-09-04 13:47:59 -07:00
mux mux: Explicitly include correct DT includes 2023-08-28 13:36:24 -05:00
net Including fixes from netfilter and bpf. 2023-09-07 18:33:07 -07:00
nfc NFC: nxp: add NXP1002 2023-08-30 18:32:24 -07:00
ntb ntb: Check tx descriptors outstanding instead of head/tail for tx queue 2023-08-22 12:38:19 -04:00
nubus
nvdimm nvdimm changes for v6.6 merge window 2023-08-30 20:52:08 -07:00
nvme for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
nvmem nvmem: core: Notify when a new layout is registered 2023-08-23 16:34:02 +02:00
of Devicetree updates for v6.6: 2023-08-30 16:59:03 -07:00
opp OPP: Fix argument name in doc comment 2023-08-18 10:55:49 +05:30
parisc parisc: ccio-dma: Create private runway procfs root entry 2023-08-28 18:00:27 +02:00
parport TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
pci pci-v6.6-fixes-1 2023-09-09 11:35:28 -07:00
pcmcia
peci
perf arm64 fixes for -rc1 2023-09-08 12:48:37 -07:00
phy phy-for-6.6 2023-09-03 10:38:02 -07:00
pinctrl Pin control bulk changes for the v6.6 kernel cycle: 2023-08-30 19:36:19 -07:00
platform USB / Thunderbolt / PHY driver update for 6.6-rc1 2023-09-01 09:23:34 -07:00
pnp PNP: ACPI: Fix string truncation warning 2023-08-17 19:38:35 +02:00
power thermal: Use thermal_tripless_zone_device_register() 2023-09-05 21:42:18 +02:00
powercap powercap: intel_rapl: Fix invalid setting of Power Limit 4 2023-09-06 22:21:22 +02:00
pps
ps3
ptp
pwm pwm: Changes for v6.6-rc1 2023-09-07 18:05:58 -07:00
rapidio
ras
regulator regulator: Fixes for v6.6 2023-09-07 15:51:07 -07:00
remoteproc remoteproc updates for v6.6 2023-09-04 15:12:26 -07:00
reset This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
rpmsg rpmsg updates for v6.6 2023-09-04 15:08:52 -07:00
rtc RTC for 6.6 2023-09-07 16:07:35 -07:00
s390 block-6.6-2023-09-08 2023-09-08 21:39:54 -07:00
sbus sbus: Explicitly include correct DT includes 2023-08-28 13:36:24 -05:00
scsi SCSI misc on 20230909 2023-09-09 12:01:33 -07:00
sh
siox
slimbus
soc soc: renesas: Kconfig: For ARCH_R9A07G043 select the required configs if dependencies are met 2023-09-08 11:25:29 -07:00
soundwire soundwire updates for 6.6 2023-09-03 10:20:57 -07:00
spi spi: Fixes for v6.6 2023-09-07 15:49:20 -07:00
spmi
ssb
staging media: dvb: symbol fixup for dvb_attach() 2023-09-09 08:15:11 +01:00
target SCSI misc on 20230902 2023-09-02 12:02:41 -07:00
tc
tee
thermal thermal: core: Drop thermal_zone_device_register() 2023-09-05 21:42:18 +02:00
thunderbolt thunderbolt: Changes for v6.6 merge window 2023-08-22 14:22:35 +02:00
tty TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
ufs Merge branch 'fixes' into misc 2023-09-02 08:25:19 +01:00
uio uio: pruss: fix missing iounmap() in pruss_probe() 2023-08-22 13:41:55 +02:00
usb just cleanups and fixes 2023-09-07 10:35:14 -07:00
vdpa virtio: features 2023-09-04 10:43:44 -07:00
vfio iommufd for 6.6 2023-08-30 20:41:37 -07:00
vhost vdpa: add get_backend_features vdpa operation 2023-09-03 18:10:22 -04:00
video - New Functionality 2023-09-06 09:00:37 -07:00
virt minmax: add in_range() macro 2023-08-24 16:20:18 -07:00
virtio virtio_ring: fix avail_wrap_counter in virtqueue_add_packed 2023-09-03 18:10:24 -04:00
vlynq
w1
watchdog linux-watchdog 6.6-rc1 tag 2023-09-06 09:19:12 -07:00
xen dma-maping updates for Linux 6.6 2023-08-29 20:32:10 -07:00
zorro zorro: Include zorro.h in names.c 2023-08-21 13:27:44 +02:00
Kconfig Merge patch series "Add non-coherent DMA support for AX45MP" 2023-09-08 11:24:34 -07:00
Makefile Merge patch series "Add non-coherent DMA support for AX45MP" 2023-09-08 11:24:34 -07:00