[ Upstream commit 7e1b150fece033703a824df1bbc03df091ea53cc ]
With below script, redundant compress extension will be parsed and added
by parse_options(), because parse_options() doesn't check whether the
extension is existed or not, fix it.
1. mount -t f2fs -o compress_extension=so /dev/vdb /mnt/f2fs
2. mount -t f2fs -o remount,compress_extension=so /mnt/f2fs
3. mount|grep f2fs
/dev/vdb on /mnt/f2fs type f2fs (...,compress_extension=so,compress_extension=so,...)
Fixes: 4c8ff7095bef ("f2fs: support data compression")
Fixes: 151b1982be5d ("f2fs: compress: add nocompress extensions support")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b0327c84e91a0f4f0abced8cb83ec86a7083f086 ]
Call trace:
__memcpy+0x128/0x250
f2fs_read_multi_pages+0x940/0xf7c
f2fs_mpage_readpages+0x5a8/0x624
f2fs_readahead+0x5c/0x110
page_cache_ra_unbounded+0x1b8/0x590
do_sync_mmap_readahead+0x1dc/0x2e4
filemap_fault+0x254/0xa8c
f2fs_filemap_fault+0x2c/0x104
__do_fault+0x7c/0x238
do_handle_mm_fault+0x11bc/0x2d14
do_mem_abort+0x3a8/0x1004
el0_da+0x3c/0xa0
el0t_64_sync_handler+0xc4/0xec
el0t_64_sync+0x1b4/0x1b8
In f2fs_read_multi_pages(), once f2fs_decompress_cluster() was called if
we hit cached page in compress_inode's cache, dic may be released, it needs
break the loop rather than continuing it, in order to avoid accessing
invalid dic pointer.
Fixes: 6ce19aff0b8c ("f2fs: compress: add compress_inode to cache compressed blocks")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5d3f9b7649abb20aa5ab3ebff9421a171eaeb22 ]
With below mount option and testcase, it hangs kernel.
1. mount -t f2fs -o compress_log_size=5 /dev/vdb /mnt/f2fs
2. touch /mnt/f2fs/file
3. chattr +c /mnt/f2fs/file
4. dd if=/dev/zero of=/mnt/f2fs/file bs=1MB count=1
5. sync
6. dd if=/dev/zero of=/mnt/f2fs/file bs=111 count=11 conv=notrunc
7. sync
INFO: task sync:4788 blocked for more than 120 seconds.
Not tainted 6.5.0-rc1+ #322
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:sync state:D stack:0 pid:4788 ppid:509 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x335/0xf80
schedule+0x6f/0xf0
wb_wait_for_completion+0x5e/0x90
sync_inodes_sb+0xd8/0x2a0
sync_inodes_one_sb+0x1d/0x30
iterate_supers+0x99/0xf0
ksys_sync+0x46/0xb0
__do_sys_sync+0x12/0x20
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
The reason is f2fs_all_cluster_page_ready() assumes that pages array should
cover at least one cluster, otherwise, it will always return false, result
in deadloop.
By default, pages array size is 16, and it can cover the case cluster_size
is equal or less than 16, for the case cluster_size is larger than 16, let's
allocate memory of pages array dynamically.
Fixes: 4c8ff7095bef ("f2fs: support data compression")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1cd98ee747cff120ee9b93988ddb7315d8d8f8e7 ]
Convert the function to use a folio_batch instead of pagevec. This is in
preparation for the removal of find_get_pages_range_tag().
Also modified f2fs_all_cluster_page_ready to take in a folio_batch instead
of pagevec. This does NOT support large folios. The function currently
only utilizes folios of size 1 so this shouldn't cause any issues right
now.
This version of the patch limits the number of pages fetched to
F2FS_ONSTACK_PAGES. If that ever happens, update the start index here
since filemap_get_folios_tag() updates the index to be after the last
found folio, not necessarily the last used page.
Link: https://lkml.kernel.org/r/20230104211448.4804-15-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: c5d3f9b7649a ("f2fs: compress: fix deadloop in f2fs_write_cache_pages()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2d9bb3b6d2fbccb5b33d3a85a2830971625a4ea ]
https://bugzilla.kernel.org/show_bug.cgi?id=216050
Somehow we're getting a page which has a different mapping.
Let's avoid the infinite loop.
Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59237a21776f70ffb0420611c23e7158e1317037 ]
Wei Chen reports a kernel bug as blew:
INFO: task syz-executor.0:29056 blocked for more than 143 seconds.
Not tainted 5.15.0-rc5 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0 state:D stack:14632 pid:29056 ppid: 6574 flags:0x00000004
Call Trace:
__schedule+0x4a1/0x1720
schedule+0x36/0xe0
rwsem_down_write_slowpath+0x322/0x7a0
fscrypt_ioctl_set_policy+0x11f/0x2a0
__f2fs_ioctl+0x1a9f/0x5780
f2fs_ioctl+0x89/0x3a0
__x64_sys_ioctl+0xe8/0x140
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Eric did some investigation on this issue, quoted from reply of Eric:
"Well, the quality of this bug report has a lot to be desired (not on
upstream kernel, reproducer is full of totally irrelevant stuff, not
sent to the mailing list of the filesystem whose disk image is being
fuzzed, etc.). But what is going on is that f2fs_empty_dir() doesn't
consider the case of a directory with an extremely large i_size on a
malicious disk image.
Specifically, the reproducer mounts an f2fs image with a directory
that has an i_size of 14814520042850357248, then calls
FS_IOC_SET_ENCRYPTION_POLICY on it.
That results in a call to f2fs_empty_dir() to check whether the
directory is empty. f2fs_empty_dir() then iterates through all
3616826182336513 blocks the directory allegedly contains to check
whether any contain anything. i_rwsem is held during this, so
anything else that tries to take it will hang."
In order to solve this issue, let's use f2fs_get_next_page_offset()
to speed up iteration by skipping holes for all below functions:
- f2fs_empty_dir
- f2fs_readdir
- find_in_level
The way why we can speed up iteration was described in
'commit 3cf4574705b4 ("f2fs: introduce get_next_page_offset to speed
up SEEK_DATA")'.
Meanwhile, in f2fs_empty_dir(), let's use f2fs_find_data_page()
instead f2fs_get_lock_data_page(), due to i_rwsem was held in
caller of f2fs_empty_dir(), there shouldn't be any races, so it's
fine to not lock dentry page during lookuping dirents in the page.
Link: https://lore.kernel.org/lkml/536944df-a0ae-1dd8-148f-510b476e1347@kernel.org/T/
Reported-by: Wei Chen <harperchen1110@gmail.com>
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: d2d9bb3b6d2f ("f2fs: get out of a repeat loop when getting a locked data page")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 958ccbbf1ce716d77c7cfa79ace50a421c1eed73 ]
syzbot reports a f2fs bug as below:
UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19
index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]')
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:217 [inline]
__ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348
inline_data_addr fs/f2fs/f2fs.h:3275 [inline]
__recover_inline_status fs/f2fs/inode.c:113 [inline]
do_read_inode fs/f2fs/inode.c:480 [inline]
f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604
f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601
mount_bdev+0x276/0x3b0 fs/super.c:1391
legacy_get_tree+0xef/0x190 fs/fs_context.c:611
vfs_get_tree+0x8c/0x270 fs/super.c:1519
do_new_mount+0x28f/0xae0 fs/namespace.c:3335
do_mount fs/namespace.c:3675 [inline]
__do_sys_mount fs/namespace.c:3884 [inline]
__se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The issue was bisected to:
commit d48a7b3a72f121655d95b5157c32c7d555e44c05
Author: Chao Yu <chao@kernel.org>
Date: Mon Jan 9 03:49:20 2023 +0000
f2fs: fix to do sanity check on extent cache correctly
The root cause is we applied both v1 and v2 of the patch, v2 is the right
fix, so it needs to revert v1 in order to fix reported issue.
v1:
commit d48a7b3a72f1 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230109034920.492914-1-chao@kernel.org/
v2:
commit 269d11948100 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230207134808.1827869-1-chao@kernel.org/
Reported-by: syzbot+601018296973a481f302@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000fcf0690600e4d04d@google.com/
Fixes: d48a7b3a72f1 ("f2fs: fix to do sanity check on extent cache correctly")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5a711acab305e04278c136c841ba37c589c16a1 ]
The current logic, regardless of whether CONFIG_BLK_DEV_ZONED
is enabled or not, will judge whether discard_unit is SECTION,
when f2fs_sb_has_blkzoned.
In fact, when CONFIG_BLK_DEV_ZONED is not enabled, this judgment
is a path that will never be accessed. At this time, -EINVAL will
be returned in the parse_options function, accompanied by the
message "Zoned block device support is not enabled".
Let's wrap this discard_unit judgment with CONFIG_BLK_DEV_ZONED.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 2bd4df8fcbc7 ("f2fs: Only lfs mode is allowed with zoned block device feature")
Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commit a78a8bcdc26de5ef3a0ee27c9c6c512e54a6051c which is
commit a6ec83786ab9f13f25fb18166dee908845713a95 upstream.
Something is currently broken in the f2fs code, Guenter has reported
boot problems with it for a few releases now, so revert the most recent
f2fs changes in the hope to get this back to a working filesystem.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/b392e1a8-b987-4993-bd45-035db9415a6e@roeck-us.net
Cc: Chao Yu <chao@kernel.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 6ba0594a81f91d6fd8ca9bd4ad23aa1618635a0f which is
commit 967eaad1fed5f6335ea97a47d45214744dc57925 upstream.
Something is currently broken in the f2fs code, Guenter has reported
boot problems with it for a few releases now, so revert the most recent
f2fs changes in the hope to get this back to a working filesystem.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/b392e1a8-b987-4993-bd45-035db9415a6e@roeck-us.net
Cc: Chao Yu <chao@kernel.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Yangtao Li <frank.li@vivo.com>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit e2fb24ce37caeaecff08af4e9967c8462624312b which is
commit 458c15dfbce62c35fefd9ca637b20a051309c9f1 upstream.
Something is currently broken in the f2fs code, Guenter has reported
boot problems with it for a few releases now, so revert the most recent
f2fs changes in the hope to get this back to a working filesystem.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/b392e1a8-b987-4993-bd45-035db9415a6e@roeck-us.net
Cc: Chao Yu <chao@kernel.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 458c15dfbce62c35fefd9ca637b20a051309c9f1 ]
syzbot reports a bug as below:
general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] PREEMPT SMP KASAN
RIP: 0010:__lock_acquire+0x69/0x2000 kernel/locking/lockdep.c:4942
Call Trace:
lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5691
__raw_write_lock include/linux/rwlock_api_smp.h:209 [inline]
_raw_write_lock+0x2e/0x40 kernel/locking/spinlock.c:300
__drop_extent_tree+0x3ac/0x660 fs/f2fs/extent_cache.c:1100
f2fs_drop_extent_tree+0x17/0x30 fs/f2fs/extent_cache.c:1116
f2fs_insert_range+0x2d5/0x3c0 fs/f2fs/file.c:1664
f2fs_fallocate+0x4e4/0x6d0 fs/f2fs/file.c:1838
vfs_fallocate+0x54b/0x6b0 fs/open.c:324
ksys_fallocate fs/open.c:347 [inline]
__do_sys_fallocate fs/open.c:355 [inline]
__se_sys_fallocate fs/open.c:353 [inline]
__x64_sys_fallocate+0xbd/0x100 fs/open.c:353
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause is race condition as below:
- since it tries to remount rw filesystem, so that do_remount won't
call sb_prepare_remount_readonly to block fallocate, there may be race
condition in between remount and fallocate.
- in f2fs_remount(), default_options() will reset mount option to default
one, and then update it based on result of parse_options(), so there is
a hole which race condition can happen.
Thread A Thread B
- f2fs_fill_super
- parse_options
- clear_opt(READ_EXTENT_CACHE)
- f2fs_remount
- default_options
- set_opt(READ_EXTENT_CACHE)
- f2fs_fallocate
- f2fs_insert_range
- f2fs_drop_extent_tree
- __drop_extent_tree
- __may_extent_tree
- test_opt(READ_EXTENT_CACHE) return true
- write_lock(&et->lock) access NULL pointer
- parse_options
- clear_opt(READ_EXTENT_CACHE)
Cc: <stable@vger.kernel.org>
Reported-by: syzbot+d015b6c2fbb5c383bf08@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/20230522124203.3838360-1-chao@kernel.org
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 967eaad1fed5f6335ea97a47d45214744dc57925 ]
Some minor modifications to flush_merge and related parameters:
1.The FLUSH_MERGE opt is set by default only in non-ro mode.
2.When ro and merge are set at the same time, an error is reported.
3.Display noflush_merge mount opt.
Suggested-by: Chao Yu <chao@kernel.org>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 458c15dfbce6 ("f2fs: don't reset unchangable mount option in f2fs_remount()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a6ec83786ab9f13f25fb18166dee908845713a95 upstream.
syzbot reports below bug:
BUG: KASAN: slab-use-after-free in f2fs_truncate_data_blocks_range+0x122a/0x14c0 fs/f2fs/file.c:574
Read of size 4 at addr ffff88802a25c000 by task syz-executor148/5000
CPU: 1 PID: 5000 Comm: syz-executor148 Not tainted 6.4.0-rc7-syzkaller-00041-ge660abd551f1 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351
print_report mm/kasan/report.c:462 [inline]
kasan_report+0x11c/0x130 mm/kasan/report.c:572
f2fs_truncate_data_blocks_range+0x122a/0x14c0 fs/f2fs/file.c:574
truncate_dnode+0x229/0x2e0 fs/f2fs/node.c:944
f2fs_truncate_inode_blocks+0x64b/0xde0 fs/f2fs/node.c:1154
f2fs_do_truncate_blocks+0x4ac/0xf30 fs/f2fs/file.c:721
f2fs_truncate_blocks+0x7b/0x300 fs/f2fs/file.c:749
f2fs_truncate.part.0+0x4a5/0x630 fs/f2fs/file.c:799
f2fs_truncate include/linux/fs.h:825 [inline]
f2fs_setattr+0x1738/0x2090 fs/f2fs/file.c:1006
notify_change+0xb2c/0x1180 fs/attr.c:483
do_truncate+0x143/0x200 fs/open.c:66
handle_truncate fs/namei.c:3295 [inline]
do_open fs/namei.c:3640 [inline]
path_openat+0x2083/0x2750 fs/namei.c:3791
do_filp_open+0x1ba/0x410 fs/namei.c:3818
do_sys_openat2+0x16d/0x4c0 fs/open.c:1356
do_sys_open fs/open.c:1372 [inline]
__do_sys_creat fs/open.c:1448 [inline]
__se_sys_creat fs/open.c:1442 [inline]
__x64_sys_creat+0xcd/0x120 fs/open.c:1442
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The root cause is, inodeA references inodeB via inodeB's ino, once inodeA
is truncated, it calls truncate_dnode() to truncate data blocks in inodeB's
node page, it traverse mapping data from node->i.i_addr[0] to
node->i.i_addr[ADDRS_PER_BLOCK() - 1], result in out-of-boundary access.
This patch fixes to add sanity check on dnode page in truncate_dnode(),
so that, it can help to avoid triggering such issue, and once it encounters
such issue, it will record newly introduced ERROR_INVALID_NODE_REFERENCE
error into superblock, later fsck can detect such issue and try repairing.
Also, it removes f2fs_truncate_data_blocks() for cleanup due to the
function has only one caller, and uses f2fs_truncate_data_blocks_range()
instead.
Reported-and-tested-by: syzbot+12cb4425b22169b52036@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000f3038a05fef867f8@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cde3c9d7e2a359e337216855dcb333a19daaa436 upstream.
This reverts commit d94772154e524b329a168678836745d2773a6e02. The
locking is going to be provided by VFS.
CC: Jaegeuk Kim <jaegeuk@kernel.org>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230601105830.13168-3-jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0135c482fa97e2fd8245cb462784112a00ed1211 ]
If truncate_node() fails in truncate_dnode(), it missed to call
f2fs_put_page(), fix it.
Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8bec7dd1b3f7d7769d433d67bde404de948a2d95 ]
freeze_super() can fail, it needs to check its return value and do
error handling in f2fs_resize_fs().
Fixes: 04f0b2eaa3b3 ("f2fs: ioctl for removing a range from F2FS")
Fixes: b4b10061ef98 ("f2fs: refactor resize_fs to avoid meta updates in progress")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8189834d4348ae608083e1f1f53792cfcc2a9bc ]
butt3rflyh4ck reports a bug as below:
When a thread always calls F2FS_IOC_RESIZE_FS to resize fs, if resize fs is
failed, f2fs kernel thread would invoke callback function to update f2fs io
info, it would call f2fs_write_end_io and may trigger null-ptr-deref in
NODE_MAPPING.
general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
RIP: 0010:NODE_MAPPING fs/f2fs/f2fs.h:1972 [inline]
RIP: 0010:f2fs_write_end_io+0x727/0x1050 fs/f2fs/data.c:370
<TASK>
bio_endio+0x5af/0x6c0 block/bio.c:1608
req_bio_endio block/blk-mq.c:761 [inline]
blk_update_request+0x5cc/0x1690 block/blk-mq.c:906
blk_mq_end_request+0x59/0x4c0 block/blk-mq.c:1023
lo_complete_rq+0x1c6/0x280 drivers/block/loop.c:370
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1101
__do_softirq+0x1d4/0x8ef kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:939 [inline]
run_ksoftirqd+0x31/0x60 kernel/softirq.c:931
smpboot_thread_fn+0x659/0x9e0 kernel/smpboot.c:164
kthread+0x33e/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
The root cause is below race case can cause leaving dirty metadata
in f2fs after filesystem is remount as ro:
Thread A Thread B
- f2fs_ioc_resize_fs
- f2fs_readonly --- return false
- f2fs_resize_fs
- f2fs_remount
- write_checkpoint
- set f2fs as ro
- free_segment_range
- update meta_inode's data
Then, if f2fs_put_super() fails to write_checkpoint due to readonly
status, and meta_inode's dirty data will be writebacked after node_inode
is put, finally, f2fs_write_end_io will access NULL pointer on
sbi->node_inode.
Thread A IRQ context
- f2fs_put_super
- write_checkpoint fails
- iput(node_inode)
- node_inode = NULL
- iput(meta_inode)
- write_inode_now
- f2fs_write_meta_page
- f2fs_write_end_io
- NODE_MAPPING(sbi)
: access NULL pointer on node_inode
Fixes: b4b10061ef98 ("f2fs: refactor resize_fs to avoid meta updates in progress")
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Closes: https://lore.kernel.org/r/1684480657-2375-1-git-send-email-yangtiezhu@loongson.cn
Tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f082c6b205a06953f26c40bdc7621cc5a58ceb7c ]
If S_NOQUOTA is cleared from inode during data page writeback of quota
file, it may miss to unlock node_write lock, result in potential
deadlock, fix to use the lock in paired.
Kworker Thread
- writepage
if (IS_NOQUOTA())
f2fs_down_read(&sbi->node_write);
- vfs_cleanup_quota_inode
- inode->i_flags &= ~S_NOQUOTA;
if (IS_NOQUOTA())
f2fs_up_read(&sbi->node_write);
Fixes: 79963d967b49 ("f2fs: shrink node_write lock coverage")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7cd2e5f75b86a1befa99834f3ed1d735eeff69e6 ]
If a file has FI_COMPRESS_RELEASED, all writes for it should not be
allowed.
Fixes: 5fdb322ff2c2 ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE")
Signed-off-by: Qi Han <hanqi@vivo.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d78dfefcde9d311284434560d69c0478c55a657e ]
With below case, it can mount multi-device image w/ rw option, however
one of secondary device is set as ro, later update will cause panic, so
let's introduce f2fs_dev_is_readonly(), and check multi-devices rw status
in f2fs_remount() w/ it in order to avoid such inconsistent mount status.
mkfs.f2fs -c /dev/zram1 /dev/zram0 -f
blockdev --setro /dev/zram1
mount -t f2fs dev/zram0 /mnt/f2fs
mount: /mnt/f2fs: WARNING: source write-protected, mounted read-only.
mount -t f2fs -o remount,rw mnt/f2fs
dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=8192
kernel BUG at fs/f2fs/inline.c:258!
RIP: 0010:f2fs_write_inline_data+0x23e/0x2d0 [f2fs]
Call Trace:
f2fs_write_single_data_page+0x26b/0x9f0 [f2fs]
f2fs_write_cache_pages+0x389/0xa60 [f2fs]
__f2fs_write_data_pages+0x26b/0x2d0 [f2fs]
f2fs_write_data_pages+0x2e/0x40 [f2fs]
do_writepages+0xd3/0x1b0
__writeback_single_inode+0x5b/0x420
writeback_sb_inodes+0x236/0x5a0
__writeback_inodes_wb+0x56/0xf0
wb_writeback+0x2a3/0x490
wb_do_writeback+0x2b2/0x330
wb_workfn+0x6a/0x260
process_one_work+0x270/0x5e0
worker_thread+0x52/0x3e0
kthread+0xf4/0x120
ret_from_fork+0x29/0x50
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c9b3649a934d131151111354bcbb638076f03a30 ]
xfstest generic/361 reports a bug as below:
f2fs_bug_on(sbi, sbi->fsync_node_num);
kernel BUG at fs/f2fs/super.c:1627!
RIP: 0010:f2fs_put_super+0x3a8/0x3b0
Call Trace:
generic_shutdown_super+0x8c/0x1b0
kill_block_super+0x2b/0x60
kill_f2fs_super+0x87/0x110
deactivate_locked_super+0x39/0x80
deactivate_super+0x46/0x50
cleanup_mnt+0x109/0x170
__cleanup_mnt+0x16/0x20
task_work_run+0x65/0xa0
exit_to_user_mode_prepare+0x175/0x190
syscall_exit_to_user_mode+0x25/0x50
do_syscall_64+0x4c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
During umount(), if cp_error is set, f2fs_wait_on_all_pages() should
not stop waiting all F2FS_WB_CP_DATA pages to be writebacked, otherwise,
fsync_node_num can be non-zero after f2fs_wait_on_all_pages() causing
this bug.
In this case, to avoid deadloop in f2fs_wait_on_all_pages(), it needs
to drop all dirty pages rather than redirtying them.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 269d119481008cd725ce32553332593c0ecfc91c upstream.
In do_read_inode(), sanity check for extent cache should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d48a7b3a72f121655d95b5157c32c7d555e44c05 upstream.
In do_read_inode(), sanity_check_inode() should be called after
f2fs_init_read_extent_tree(), fix it.
Fixes: 72840cccc0a1 ("f2fs: allocate the extent_cache by default")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 043d2d00b44310f84c0593c63e51fae88c829cdd ]
Let's reduce the complexity of mixed use of rb_tree in victim_entry from
extent_cache and discard_cmd.
This should fix arm32 memory alignment issue caused by shared rb_entry.
[struct victim_entry] [struct rb_entry]
[0] struct rb_node rb_node; [0] struct rb_node rb_node;
union {
struct {
unsigned int ofs;
unsigned int len;
};
[16] unsigned long long mtime; [12] unsigned long long key;
} __packed;
Cc: <stable@vger.kernel.org>
Fixes: 093749e296e2 ("f2fs: support age threshold based garbage collection")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 72840cccc0a1a0a0dc1bb27b669a9111be6d0f6a ]
Let's allocate it to remove the runtime complexity.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 043d2d00b443 ("f2fs: factor out victim_entry usage from general rb_tree use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7547daccd6a37522f0af74ec4b5a3036f3dd328 ]
This patch prepares extent_cache to be ready for addition.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Stable-dep-of: 043d2d00b443 ("f2fs: factor out victim_entry usage from general rb_tree use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit d94772154e524b329a168678836745d2773a6e02 upstream.
F2FS has the same issue in ext4_rename causing crash revealed by
xfstests/generic/707.
See also commit 0813299c586b ("ext4: Fix possible corruption when moving a directory")
CC: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 935fc6fa6466cf18dd72dd1518ebc7bfc4cd58a4 ]
In __replace_atomic_write_block(), we missed to check return value
of inc_valid_block_count(), for extreme testcase that f2fs image is
run out of space, it may cause inconsistent status in between SIT
table and total valid block count.
Cc: Daeho Jeong <daehojeong@google.com>
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5cdb422c839134273866208dad5360835ddb9794 ]
xfstest generic/019 reports a bug:
kernel BUG at mm/filemap.c:1619!
RIP: 0010:folio_end_writeback+0x8a/0x90
Call Trace:
end_page_writeback+0x1c/0x60
f2fs_write_end_io+0x199/0x420
bio_endio+0x104/0x180
submit_bio_noacct+0xa5/0x510
submit_bio+0x48/0x80
f2fs_submit_write_bio+0x35/0x300
f2fs_submit_merged_ipu_write+0x2a0/0x2b0
f2fs_write_single_data_page+0x838/0x8b0
f2fs_write_cache_pages+0x379/0xa30
f2fs_write_data_pages+0x30c/0x340
do_writepages+0xd8/0x1b0
__writeback_single_inode+0x44/0x370
writeback_sb_inodes+0x233/0x4d0
__writeback_inodes_wb+0x56/0xf0
wb_writeback+0x1dd/0x2d0
wb_workfn+0x367/0x4a0
process_one_work+0x21d/0x430
worker_thread+0x4e/0x3c0
kthread+0x103/0x130
ret_from_fork+0x2c/0x50
The root cause is: after cp_error is set, f2fs_submit_merged_ipu_write()
in f2fs_write_single_data_page() tries to flush IPU bio in cache, however
f2fs_submit_merged_ipu_write() missed to check validity of @bio parameter,
result in submitting random cached bio which belong to other IO context,
then it will cause use-after-free issue, fix it by adding additional
validity check.
Fixes: 0b20fcec8651 ("f2fs: cache global IPU bio")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit babedcbac164cec970872b8097401ca913a80e61 ]
BUG_ON() will be triggered when writing files concurrently,
because the same page is writtenback multiple times.
1597 void folio_end_writeback(struct folio *folio)
1598 {
......
1618 if (!__folio_end_writeback(folio))
1619 BUG();
......
1625 }
kernel BUG at mm/filemap.c:1619!
Call Trace:
<TASK>
f2fs_write_end_io+0x1a0/0x370
blk_update_request+0x6c/0x410
blk_mq_end_request+0x15/0x130
blk_complete_reqs+0x3c/0x50
__do_softirq+0xb8/0x29b
? sort_range+0x20/0x20
run_ksoftirqd+0x19/0x20
smpboot_thread_fn+0x10b/0x1d0
kthread+0xde/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Below is the concurrency scenario:
[Process A] [Process B] [Process C]
f2fs_write_raw_pages()
- redirty_page_for_writepage()
- unlock page()
f2fs_do_write_data_page()
- lock_page()
- clear_page_dirty_for_io()
- set_page_writeback() [1st writeback]
.....
- unlock page()
generic_perform_write()
- f2fs_write_begin()
- wait_for_stable_page()
- f2fs_write_end()
- set_page_dirty()
- lock_page()
- f2fs_do_write_data_page()
- set_page_writeback() [2st writeback]
This problem was introduced by the previous commit 7377e853967b ("f2fs:
compress: fix potential deadlock of compress file"). All pagelocks were
released in f2fs_write_raw_pages(), but whether the page was
in the writeback state was ignored in the subsequent writing process.
Let's fix it by waiting for the page to writeback before writing.
Cc: Christoph Hellwig <hch@lst.de>
Fixes: 4c8ff7095bef ("f2fs: support data compression")
Fixes: 7377e853967b ("f2fs: compress: fix potential deadlock of compress file")
Signed-off-by: Qi Han <hanqi@vivo.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b37ed21e3367539b79284e0b0af2246ffcf0dca ]
If we manage the zone capacity per zone type, it'll break the GC assumption.
And, the current logic complains valid block count mismatch.
Let's apply zone capacity to all zone type, if specified.
Fixes: de881df97768 ("f2fs: support zone capacity less than zone size")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c17caf0ba3aa3411b96c71b4ce24be1040b8f3e8 ]
When f2fs skipped a gc round during victim migration, there was a bug which
would skip all upcoming gc rounds unconditionally because skipped_gc_rwsem
was not initialized. It fixes the bug by correctly initializing the
skipped_gc_rwsem inside the gc loop.
Fixes: 6f8d4455060d ("f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc")
Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8051692f5f23260215bfe9a72e712d93606acc5f ]
We should set the error code when dqget() failed.
Fixes: 2c1d03056991 ("f2fs: support F2FS_IOC_FS{GET,SET}XATTR")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a46bebd502fe1a3bd1d22f64cedd93e7e7702693 ]
To fix a race condition between atomic write aborts, I use the inode
lock and make COW inode to be re-usable thoroughout the whole
atomic file inode lifetime.
Reported-by: syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>