linux/fs
Johannes Thumshirn 05b36b04d7 btrfs: fix use-after-free in btrfs_encoded_read_endio()
Shinichiro reported the following use-after-free, which sometimes occurs
in our CI system when running fstests' btrfs/284 on a TCMU runner
device:

  BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780
  Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219

  CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15
  Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   <TASK>
   dump_stack_lvl+0x6e/0xa0
   ? lock_release+0x708/0x780
   print_report+0x174/0x505
   ? lock_release+0x708/0x780
   ? __virt_addr_valid+0x224/0x410
   ? lock_release+0x708/0x780
   kasan_report+0xda/0x1b0
   ? lock_release+0x708/0x780
   ? __wake_up+0x44/0x60
   lock_release+0x708/0x780
   ? __pfx_lock_release+0x10/0x10
   ? __pfx_do_raw_spin_lock+0x10/0x10
   ? lock_is_held_type+0x9a/0x110
   _raw_spin_unlock_irqrestore+0x1f/0x60
   __wake_up+0x44/0x60
   btrfs_encoded_read_endio+0x14b/0x190 [btrfs]
   btrfs_check_read_bio+0x8d9/0x1360 [btrfs]
   ? lock_release+0x1b0/0x780
   ? trace_lock_acquire+0x12f/0x1a0
   ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs]
   ? process_one_work+0x7e3/0x1460
   ? lock_acquire+0x31/0xc0
   ? process_one_work+0x7e3/0x1460
   process_one_work+0x85c/0x1460
   ? __pfx_process_one_work+0x10/0x10
   ? assign_work+0x16c/0x240
   worker_thread+0x5e6/0xfc0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0x2c3/0x3a0
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x31/0x70
   ? __pfx_kthread+0x10/0x10
   ret_from_fork_asm+0x1a/0x30
   </TASK>

  Allocated by task 3661:
   kasan_save_stack+0x30/0x50
   kasan_save_track+0x14/0x30
   __kasan_kmalloc+0xaa/0xb0
   btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs]
   send_extent_data+0xf0f/0x24a0 [btrfs]
   process_extent+0x48a/0x1830 [btrfs]
   changed_cb+0x178b/0x2ea0 [btrfs]
   btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs]
   _btrfs_ioctl_send+0x117/0x330 [btrfs]
   btrfs_ioctl+0x184a/0x60a0 [btrfs]
   __x64_sys_ioctl+0x12e/0x1a0
   do_syscall_64+0x95/0x180
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

  Freed by task 3661:
   kasan_save_stack+0x30/0x50
   kasan_save_track+0x14/0x30
   kasan_save_free_info+0x3b/0x70
   __kasan_slab_free+0x4f/0x70
   kfree+0x143/0x490
   btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs]
   send_extent_data+0xf0f/0x24a0 [btrfs]
   process_extent+0x48a/0x1830 [btrfs]
   changed_cb+0x178b/0x2ea0 [btrfs]
   btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs]
   _btrfs_ioctl_send+0x117/0x330 [btrfs]
   btrfs_ioctl+0x184a/0x60a0 [btrfs]
   __x64_sys_ioctl+0x12e/0x1a0
   do_syscall_64+0x95/0x180
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

  The buggy address belongs to the object at ffff888106a83f00
   which belongs to the cache kmalloc-rnd-07-96 of size 96
  The buggy address is located 24 bytes inside of
   freed 96-byte region [ffff888106a83f00, ffff888106a83f60)

  The buggy address belongs to the physical page:
  page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83
  flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
  page_type: f5(slab)
  raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004
  raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
   ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
  >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                              ^
   ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
   ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ==================================================================

Further analysis of the trace and the crash dump's vmcore file shows that
btrfs_encoded_read_endio() is calling wake_up() on the wait_queue embedded
in the private data passed to the end_io handler.

Commit 4ff47df40447 ("btrfs: move priv off stack in
btrfs_encoded_read_regular_fill_pages()") moved 'struct
btrfs_encoded_read_private' off the stack.

Before that commit, the corruption of the private data is visible when
analyzing the vmcore after a crash:

*(struct btrfs_encoded_read_private *)0xffff88815626eec8 = {
	.wait = (wait_queue_head_t){
		.lock = (spinlock_t){
			.rlock = (struct raw_spinlock){
				.raw_lock = (arch_spinlock_t){
					.val = (atomic_t){
						.counter = (int)-2005885696,
					},
					.locked = (u8)0,
					.pending = (u8)157,
					.locked_pending = (u16)40192,
					.tail = (u16)34928,
				},
				.magic = (unsigned int)536325682,
				.owner_cpu = (unsigned int)29,
				.owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0,
				.dep_map = (struct lockdep_map){
					.key = (struct lock_class_key *)0xffff8881575a3b6c,
					.class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 },
					.name = (const char *)0xffff88815626f100 = "",
					.wait_type_outer = (u8)37,
					.wait_type_inner = (u8)178,
					.lock_type = (u8)154,
				},
			},
			.__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 },
			.dep_map = (struct lockdep_map){
				.key = (struct lock_class_key *)0xffff8881575a3b6c,
				.class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 },
				.name = (const char *)0xffff88815626f100 = "",
				.wait_type_outer = (u8)37,
				.wait_type_inner = (u8)178,
				.lock_type = (u8)154,
			},
		},
		.head = (struct list_head){
			.next = (struct list_head *)0x112cca,
			.prev = (struct list_head *)0x47,
		},
	},
	.pending = (atomic_t){
		.counter = (int)-1491499288,
	},
	.status = (blk_status_t)130,
}

Here we can see several indicators of in-memory data corruption, e.g. the
large negative atomic values of ->pending and
->wait->lock->rlock->raw_lock->val, the bogus spinlock magic 0x1ff7ae32
(decimal 536325682 above) instead of 0xdead4ead, and the bogus pointer
values for ->wait->head.

To fix the corruption, change atomic_dec_return() to
atomic_dec_and_test(): atomic_dec_return() is defined as two instructions
on x86_64, whereas atomic_dec_and_test() is defined as a single atomic
operation. With atomic_dec_return(), the counter value can already be
decremented while the if statement in btrfs_encoded_read_endio() is not
yet completely processed, i.e. the test for 0 has not completed. If
another thread continues executing btrfs_encoded_read_regular_fill_pages()
in the meantime, the atomic_dec_return() there can see the already updated
->pending counter and continue by freeing the private data. When the endio
handler then resumes, its test for 0 succeeds and the wait_queue is woken
up, resulting in a use-after-free.
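
For illustration only, here is a minimal user-space sketch of the
"decrement and test in one step" pattern described above. It uses C11
<stdatomic.h> instead of the kernel's atomic_t API, and the names
struct read_private, read_private_init(), dec_and_test() and read_endio()
are hypothetical stand-ins, not the code changed by this commit. The
sketch only shows that exactly one caller observes the transition to zero
and that every other caller leaves the shared object alone after its
decrement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the private data shared with the endio handler. */
struct read_private {
	atomic_int pending;	/* number of holders that still have to finish */
	int wakeups;		/* counts simulated wake_up() calls */
};

static void read_private_init(struct read_private *priv, int nr_holders)
{
	assert(nr_holders > 0);
	atomic_init(&priv->pending, nr_holders);
	priv->wakeups = 0;
}

/*
 * Assumed user-space analogue of atomic_dec_and_test(): the decrement and
 * the comparison are derived from one atomic read-modify-write, so only
 * the caller that moves the counter from 1 to 0 gets true.
 */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Simulated completion handler: only the last holder touches priv again. */
static bool read_endio(struct read_private *priv)
{
	if (!dec_and_test(&priv->pending))
		return false;	/* not last: priv may be freed by someone else */

	priv->wakeups++;	/* stands in for wake_up(&priv->wait) */
	return true;
}
```

With three holders, only the third call to read_endio() returns true, and
the simulated wake-up happens exactly once.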

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Fixes: 1881fba89b ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-28 20:45:43 +01:00
9p Revert patches causing inode collision problems 2024-10-25 15:25:02 -07:00
adfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
affs affs-for-6.12-tag 2024-09-16 13:07:59 +02:00
afs vfs-6.12-rc6.fixes 2024-11-01 07:37:10 -10:00
autofs autofs: fix thinko in validate_dev_ioctl() 2024-10-28 13:16:56 +01:00
bcachefs bcachefs: Fix UAF in __promote_alloc() error path 2024-11-07 16:48:21 -05:00
befs befs: Convert befs_symlink_read_folio() to use folio_end_read() 2024-05-31 12:31:39 +02:00
bfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
btrfs btrfs: fix use-after-free in btrfs_encoded_read_endio() 2024-11-28 20:45:43 +01:00
cachefiles cachefiles: fix dentry leak in cachefiles_open_file() 2024-09-27 18:29:19 +02:00
ceph A fix from Patrick for a variety of CephFS lockup scenarios caused by 2024-10-04 10:10:23 -07:00
coda coda: use param->file for FSCONFIG_SET_FD 2024-08-19 13:45:03 +02:00
configfs fs/configfs: Add a callback to determine attribute visibility 2024-06-17 20:42:57 +02:00
cramfs vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
crypto move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
debugfs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
devpts
dlm [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
ecryptfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
efivarfs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
efs vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
erofs erofs: use get_tree_bdev_flags() to avoid misleading messages 2024-10-21 14:30:27 +02:00
exfat move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
exportfs fhandle: relax open_by_handle_at() permission checks 2024-05-28 15:57:23 +02:00
ext2 vfs-6.12.file 2024-09-16 09:14:02 +02:00
ext4 ext4: fix off by one issue in alloc_flex_gd() 2024-10-04 17:36:28 -04:00
f2fs f2fs: allow parallel DIO reads 2024-10-11 15:12:07 +00:00
fat fat: fix uninitialized variable 2024-10-17 00:28:06 -07:00
freevxfs freevxfs: Convert freevxfs to the new mount API. 2024-03-26 09:04:53 +01:00
fuse fuse: remove stray debug line 2024-10-25 17:05:49 +02:00
gfs2 gfs2 changes 2024-09-23 11:55:17 -07:00
hfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
hfsplus move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
hostfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
hpfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
hugetlbfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
iomap vfs-6.12-rc6.iomap 2024-11-01 07:45:00 -10:00
isofs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
jbd2 jbd2: remove unneeded check of ret in jbd2_fc_get_buf 2024-08-26 23:49:15 -04:00
jffs2 jffs2: Use a folio in jffs2_garbage_collect_dnode() 2024-08-19 13:40:00 +02:00
jfs jfs: Fix sanity check in dbMount 2024-10-22 09:40:37 -05:00
kernfs kernfs: mount: Remove unnecessary ‘NULL’ values from knparent 2024-05-04 19:02:39 +02:00
lockd move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
minix buffer: Convert __block_write_begin() to take a folio 2024-08-07 11:33:36 +02:00
netfs netfs: Downgrade i_rwsem for a buffered write 2024-10-17 15:33:42 +02:00
nfs nfs: avoid i_lock contention in nfs_clear_invalid_mapping 2024-11-04 10:24:19 -05:00
nfs_common nfs_common: fix localio to cope with racing nfs_local_probe() 2024-11-04 10:24:19 -05:00
nfsd nfsd-6.12 fixes: 2024-11-09 13:18:07 -08:00
nilfs2 nilfs2: fix potential deadlock with newly created symlinks 2024-10-30 20:14:12 -07:00
nls move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
notify inotify: Fix possible deadlock in fsnotify_destroy_mark 2024-10-02 15:14:29 +02:00
ntfs3 Changes for 6.12-rc3 2024-10-08 10:53:06 -07:00
ocfs2 ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove() 2024-11-07 14:14:59 -08:00
omfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
openpromfs openpromfs: add missing MODULE_DESCRIPTION() macro 2024-06-20 09:46:01 +02:00
orangefs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
overlayfs fs: pass offset and result to backing_file end_write() callback 2024-10-16 13:17:45 +02:00
proc 20 hotfixes, 14 of which are cc:stable. 2024-11-10 09:04:27 -08:00
pstore drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
qnx4 qnx4: add MODULE_DESCRIPTION() 2024-05-28 11:52:53 +02:00
qnx6 qnx6: Convert directory handling to use kmap_local 2024-08-07 11:31:56 +02:00
quota \n 2024-09-23 10:49:28 -07:00
ramfs mm: switch mm->get_unmapped_area() to a flag 2024-04-25 20:56:25 -07:00
reiserfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
romfs romfs: fix romfs_read_folio() 2024-08-21 22:32:58 +02:00
smb fix net namespace refcount issue 2024-11-09 12:58:23 -08:00
squashfs Squashfs: fix variable overflow in squashfs_readpage_block 2024-10-30 20:14:12 -07:00
sysfs Merge 6.9-rc5 into driver-core-next 2024-04-23 13:27:43 +02:00
sysv buffer: Convert __block_write_begin() to take a folio 2024-08-07 11:33:36 +02:00
tests execve: Move KUnit tests to tests/ subdirectory 2024-07-22 18:25:47 -07:00
tracefs tracing: Fix tracefs mount options 2024-11-01 08:38:14 -04:00
ubifs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
udf udf: fix uninit-value use in udf_get_fileshortad 2024-10-02 14:32:37 +02:00
ufs ufs_rename(): fix bogus argument of folio_release_kmap() 2024-10-02 00:05:09 -04:00
unicode unicode: Don't special case ignorable code points 2024-10-09 13:34:01 -04:00
vboxsf fs: Convert aops->write_end to take a folio 2024-08-07 11:32:02 +02:00
verity fsverity: expose verified fsverity built-in signatures to LSMs 2024-08-20 14:03:18 -04:00
xfs XFS bug fies for 6.12-rc6 2024-11-02 09:22:16 -10:00
zonefs zonefs fixes for 6.12-rc2 2024-10-02 12:02:15 -07:00
aio.c fs/aio: Fix __percpu annotation of *cpu pointer in struct kioctx 2024-08-19 13:45:03 +02:00
anon_inodes.c fs: Create anon_inode_getfile_fmode() 2024-04-26 10:33:05 +02:00
attr.c nfsd-6.11 fixes: 2024-08-29 06:20:44 +12:00
backing-file.c fs: pass offset and result to backing_file end_write() callback 2024-10-16 13:17:45 +02:00
bad_inode.c
binfmt_elf_fdpic.c binfmt_elf_fdpic: fix AUXV size calculation when ELF_HWCAP2 is defined 2024-08-26 13:00:38 -07:00
binfmt_elf.c Revert "binfmt_elf, coredump: Log the reason of the failed core dumps" 2024-09-26 11:39:02 -07:00
binfmt_flat.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
binfmt_misc.c vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
binfmt_script.c fs: binfmt: add missing MODULE_DESCRIPTION() macros 2024-05-28 12:06:51 +02:00
bpf_fs_kfuncs.c bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry 2024-08-07 11:26:54 -07:00
buffer.c vfs-6.12.folio 2024-09-16 08:54:30 +02:00
char_dev.c
compat_binfmt_elf.c
coredump.c Revert "binfmt_elf, coredump: Log the reason of the failed core dumps" 2024-09-26 11:39:02 -07:00
d_path.c
dax.c fsdax: dax_unshare_iter needs to copy entire blocks 2024-10-07 13:51:47 +02:00
dcache.c vfs-6.12.misc 2024-09-16 08:35:09 +02:00
direct-io.c fs/direct-io: Remove linux/prefetch.h include 2024-08-19 13:45:02 +02:00
drop_caches.c sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
eventfd.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
eventpoll.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
exec.c ALong with the usual shower of singleton patches, notable patch series in 2024-09-21 07:29:05 -07:00
fcntl.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
fhandle.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
file_table.c slab updates for 6.12 2024-09-18 08:53:53 +02:00
file.c close_range(): fix the logics in descriptor table trimming 2024-09-29 21:52:29 -04:00
filesystems.c
fs_context.c
fs_parser.c fs_parse: add uid & gid option option parsing helpers 2024-07-02 06:20:49 +02:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c inode: port __I_SYNC to var event 2024-08-30 08:22:39 +02:00
fsopen.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
init.c
inode.c bcachefs: do not use PF_MEMALLOC_NORECLAIM 2024-10-09 12:47:18 -07:00
internal.h file: reclaim 24 bytes from f_owner 2024-08-28 13:05:39 +02:00
ioctl.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
Kconfig nfs_common: fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT 2024-10-03 16:19:51 -04:00
Kconfig.binfmt exec: Add KUnit test for bprm_stack_limits() 2024-06-19 13:13:55 -07:00
kernel_read_file.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
libfs.c vfs-6.12.folio 2024-09-16 08:54:30 +02:00
locks.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
Makefile bpf: introduce new VFS based BPF kfuncs 2024-08-06 09:01:41 -07:00
mbcache.c
mnt_idmapping.c fuse update for 6.12 2024-09-24 15:29:42 -07:00
mount.h vfs-6.12.mount 2024-09-16 11:15:26 +02:00
mpage.c buffer: Remove calls to set and clear the folio error flag 2024-05-31 12:31:43 +02:00
namei.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
namespace.c fs: don't try and remove empty rbtree node 2024-10-17 15:33:43 +02:00
nsfs.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
open.c openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) 2024-10-10 12:09:03 +02:00
pidfs.c pidfs: check for valid pid namespace 2024-09-27 18:29:19 +02:00
pipe.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
pnode.c
pnode.h
posix_acl.c fs: Use in_group_or_capable() helper to simplify the code 2024-08-30 08:22:37 +02:00
proc_namespace.c fs: rename show_mnt_opts -> show_vfsmnt_opts 2024-06-28 14:36:43 +02:00
read_write.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
readdir.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
remap_range.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
select.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
seq_file.c seq_file: Simplify __seq_puts() 2024-05-02 16:28:20 +02:00
signalfd.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
splice.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
stack.c
stat.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
statfs.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
super.c fs/super.c: introduce get_tree_bdev_flags() 2024-10-21 14:30:26 +02:00
sync.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
sysctls.c
timerfd.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
userfaultfd.c fork: do not invoke uffd on fork if error occurs 2024-10-28 21:40:38 -07:00
utimes.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
xattr.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00