820 Commits

Author SHA1 Message Date
Linus Torvalds
79952bdcbc f2fs-6.12-rc1
In this series, the main changes include 1) converting major IO paths to use
 folio, and 2) adding various knobs to control GC more flexibly for Zoned
 devices. In addition, there are several patches to address corner cases of
 atomic file operations and better support for file pinning on zoned device.
 
 Enhancement:
  - add knobs to tune foreground/background GCs for Zoned devices
  - convert IO paths to use folio
  - reduce expensive checkpoint trigger frequency
  - allow F2FS_IPU_NOCACHE for pinned file
  - forcibly migrate to secure space for zoned device file pinning
  - get rid of buffer_head use
  - add write priority option based on zone UFS
  - get rid of online repair on corrupted directory
 
 Bug fix:
  - fix to don't panic system for no free segment fault injection
  - fix to don't set SB_RDONLY in f2fs_handle_critical_error()
  - avoid unused block when dio write in LFS mode
  - compress: don't redirty sparse cluster during {,de}compress
  - check discard support for conventional zones
  - atomic: prevent atomic file from being dirtied before commit
  - atomic: fix to check atomic_file in f2fs ioctl interfaces
  - atomic: fix to forbid dio in atomic_file
  - atomic: fix to truncate pagecache before on-disk metadata truncation
  - atomic: create COW inode from parent dentry
  - atomic: fix to avoid racing w/ GC
  - atomic: require FMODE_WRITE for atomic write ioctls
  - fix to wait page writeback before setting gcing flag
  - fix to avoid racing in between read and OPU dio write, dio completion
  - fix several potential integer overflows in file offsets and dir_block_index
  - fix to avoid use-after-free in f2fs_stop_gc_thread()
 
 As usual, there are several code clean-ups and refactorings.

Merge tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "The main changes include converting major IO paths to use folio, and
  adding various knobs to control GC more flexibly for Zoned devices.

  In addition, there are several patches to address corner cases of
  atomic file operations and better support for file pinning on zoned
  device.

  Enhancement:
   - add knobs to tune foreground/background GCs for Zoned devices
   - convert IO paths to use folio
   - reduce expensive checkpoint trigger frequency
   - allow F2FS_IPU_NOCACHE for pinned file
   - forcibly migrate to secure space for zoned device file pinning
   - get rid of buffer_head use
   - add write priority option based on zone UFS
   - get rid of online repair on corrupted directory

  Bug fixes:
   - fix to don't panic system for no free segment fault injection
   - fix to don't set SB_RDONLY in f2fs_handle_critical_error()
   - avoid unused block when dio write in LFS mode
   - compress: don't redirty sparse cluster during {,de}compress
   - check discard support for conventional zones
   - atomic: prevent atomic file from being dirtied before commit
   - atomic: fix to check atomic_file in f2fs ioctl interfaces
   - atomic: fix to forbid dio in atomic_file
   - atomic: fix to truncate pagecache before on-disk metadata truncation
   - atomic: create COW inode from parent dentry
   - atomic: fix to avoid racing w/ GC
   - atomic: require FMODE_WRITE for atomic write ioctls
   - fix to wait page writeback before setting gcing flag
   - fix to avoid racing in between read and OPU dio write, dio completion
   - fix several potential integer overflows in file offsets and dir_block_index
   - fix to avoid use-after-free in f2fs_stop_gc_thread()

  As usual, there are several code clean-ups and refactorings"

* tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
  f2fs: allow F2FS_IPU_NOCACHE for pinned file
  f2fs: forcibly migrate to secure space for zoned device file pinning
  f2fs: remove unused parameters
  f2fs: fix to don't panic system for no free segment fault injection
  f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error()
  f2fs: add valid block ratio not to do excessive GC for one time GC
  f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
  f2fs: do FG_GC when GC boosting is required for zoned devices
  f2fs: increase BG GC migration window granularity when boosted for zoned devices
  f2fs: add reserved_segments sysfs node
  f2fs: introduce migration_window_granularity
  f2fs: make BG GC more aggressive for zoned devices
  f2fs: avoid unused block when dio write in LFS mode
  f2fs: fix to check atomic_file in f2fs ioctl interfaces
  f2fs: get rid of online repair on corrupted directory
  f2fs: prevent atomic file from being dirtied before commit
  f2fs: get rid of page->index
  f2fs: convert read_node_page() to use folio
  f2fs: convert __write_node_page() to use folio
  f2fs: convert f2fs_write_data_page() to use folio
  ...
2024-09-24 15:12:38 -07:00
Chao Yu
930c6ab934 f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error()
syzbot reports a f2fs bug as below:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 58 at kernel/rcu/sync.c:177 rcu_sync_dtor+0xcd/0x180 kernel/rcu/sync.c:177
CPU: 1 UID: 0 PID: 58 Comm: kworker/1:2 Not tainted 6.10.0-syzkaller-12562-g1722389b0d86 #0
Workqueue: events destroy_super_work
RIP: 0010:rcu_sync_dtor+0xcd/0x180 kernel/rcu/sync.c:177
Call Trace:
 percpu_free_rwsem+0x41/0x80 kernel/locking/percpu-rwsem.c:42
 destroy_super_work+0xec/0x130 fs/super.c:282
 process_one_work kernel/workqueue.c:3231 [inline]
 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
 worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

As Christian Brauner pointed out [1], the root cause is that f2fs sets
the SB_RDONLY flag in an internal function rather than under the
sb->s_umount semaphore via the remount procedure. The race condition
below then triggers this bug:

- freeze_super()
 - sb_wait_write(sb, SB_FREEZE_WRITE)
 - sb_wait_write(sb, SB_FREEZE_PAGEFAULT)
 - sb_wait_write(sb, SB_FREEZE_FS)
					- f2fs_handle_critical_error
					 - sb->s_flags |= SB_RDONLY
- thaw_super
 - thaw_super_locked
  - sb_rdonly() is true, so it skips
    sb_freeze_unlock(sb, SB_FREEZE_FS)
  - deactivate_locked_super

f2fs uses almost the same logic as ext4 [2] when handling a critical
error on a filesystem mounted w/ the errors=remount-ro option:
- set the CP_ERROR_FLAG flag, which indicates the filesystem is stopped
- record errors to the superblock
- set the SB_RDONLY flag
Once CP_ERROR_FLAG is set, all writable interfaces can detect the flag
and stop any further updates to the filesystem. So it is safe not to set
the SB_RDONLY flag; let's remove that logic and stay in line w/ ext4 [3].

[1] https://lore.kernel.org/all/20240729-himbeeren-funknetz-96e62f9c7aee@brauner
[2] https://lore.kernel.org/all/20240729132721.hxih6ehigadqf7wx@quack3
[3] https://lore.kernel.org/linux-ext4/20240805201241.27286-1-jack@suse.cz
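
The freeze/thaw imbalance above can be illustrated with a small toy model.
This is only a sketch and not kernel code: ToySuperBlock, critical_error()
and run() are hypothetical names, and the model leaks every freeze level at
once, whereas the trace above is about SB_FREEZE_FS.

```python
# Toy model of the freeze/thaw imbalance described above; not kernel code.
FREEZE_LEVELS = ("SB_FREEZE_WRITE", "SB_FREEZE_PAGEFAULT", "SB_FREEZE_FS")


class ToySuperBlock:
    def __init__(self):
        self.rdonly = False      # stands in for SB_RDONLY
        self.cp_error = False    # stands in for CP_ERROR_FLAG
        self.held = []           # freeze levels currently held

    def freeze(self):
        # Models freeze_super(): take every freeze level in order.
        self.held = list(FREEZE_LEVELS)

    def thaw(self):
        # Models thaw_super_locked(): a read-only superblock skips the unlock.
        if self.rdonly:
            return
        self.held = []

    def write(self):
        # Writable interfaces check the error flag before updating anything.
        if self.cp_error:
            raise OSError("filesystem stopped")
        return "written"


def critical_error(sb, set_rdonly):
    sb.cp_error = True
    if set_rdonly:               # old behaviour, removed by this patch
        sb.rdonly = True


def run(set_rdonly):
    sb = ToySuperBlock()
    sb.freeze()
    critical_error(sb, set_rdonly)   # lands between freeze and thaw
    sb.thaw()
    return sb


old = run(set_rdonly=True)
new = run(set_rdonly=False)
print("old behaviour, levels still held:", old.held)
print("new behaviour, levels still held:", new.held)
```

In the model, writes are still rejected after the error even without the
read-only flag, because write() checks cp_error.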

Fixes: b62e71be2110 ("f2fs: support errors=remount-ro|continue|panic mountoption")
Reported-by: syzbot+20d7e439f76bbbd863a7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000b90a8e061e21d12f@google.com/
Cc: Jan Kara <jack@suse.cz>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:36:43 +00:00
Daeho Jeong
8c890c4c60 f2fs: introduce migration_window_granularity
We can control the scanning window granularity for GC migration. To
scan and GC more frequently on zoned devices, we need a fine-grained
control knob for it.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:54 +00:00
Daeho Jeong
5062b5bed4 f2fs: make BG GC more aggressive for zoned devices
Since zoned devices have no GC on the device side, we need more
aggressive BG GC. So, tune the parameters for that.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:44 +00:00
Chao Yu
c7f114d864 f2fs: fix to avoid use-after-free in f2fs_stop_gc_thread()
syzbot reports a f2fs bug as below:

 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
 print_report+0xe8/0x550 mm/kasan/report.c:491
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
 atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:252 [inline]
 __refcount_add include/linux/refcount.h:184 [inline]
 __refcount_inc include/linux/refcount.h:241 [inline]
 refcount_inc include/linux/refcount.h:258 [inline]
 get_task_struct include/linux/sched/task.h:118 [inline]
 kthread_stop+0xca/0x630 kernel/kthread.c:704
 f2fs_stop_gc_thread+0x65/0xb0 fs/f2fs/gc.c:210
 f2fs_do_shutdown+0x192/0x540 fs/f2fs/file.c:2283
 f2fs_ioc_shutdown fs/f2fs/file.c:2325 [inline]
 __f2fs_ioctl+0x443a/0xbe60 fs/f2fs/file.c:4325
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:907 [inline]
 __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The root cause is the race condition below, which may cause a
use-after-free issue in the sbi->gc_th pointer.

- remount
 - f2fs_remount
  - f2fs_stop_gc_thread
   - kfree(gc_th)
				- f2fs_ioc_shutdown
				 - f2fs_do_shutdown
				  - f2fs_stop_gc_thread
				   - kthread_stop(gc_th->f2fs_gc_task)
   : sbi->gc_thread = NULL;

f2fs_do_shutdown() is called from two paths:
- in the f2fs_ioc_shutdown() path, we should grab the sb->s_umount
semaphore to fix the race.
- the f2fs_shutdown() path is already safe, since the caller has grabbed
the sb->s_umount semaphore.
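
The effect of serializing both paths can be sketched in userspace. This is a
toy model under stated assumptions: ToySbi, remount() and ioc_shutdown() are
hypothetical stand-ins, a threading.Lock plays the role of sb->s_umount, and
Python cannot reproduce the use-after-free itself. It only shows that, with
both callers holding the lock, the GC thread is stopped exactly once.

```python
import threading


class ToySbi:
    """Hypothetical stand-in for f2fs_sb_info; only models gc_thread."""

    def __init__(self):
        self.gc_thread = object()          # pretend GC thread context
        self.s_umount = threading.Lock()   # stands in for sb->s_umount
        self.stop_calls = 0                # times the thread was really stopped


def stop_gc_thread(sbi):
    # Same shape as the racy sequence: read pointer, stop/free, then clear.
    gc_th = sbi.gc_thread
    if gc_th is None:
        return
    sbi.stop_calls += 1
    sbi.gc_thread = None


def remount(sbi):
    with sbi.s_umount:                     # remount already runs under the lock
        stop_gc_thread(sbi)


def ioc_shutdown(sbi):
    with sbi.s_umount:                     # the fix: take the lock here as well
        stop_gc_thread(sbi)


sbi = ToySbi()
threads = [threading.Thread(target=f, args=(sbi,)) for f in (remount, ioc_shutdown)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("stop calls:", sbi.stop_calls)
```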

Reported-by: syzbot+1a8e2b31f2ac9bd3d148@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/0000000000005c7ccb061e032b9b@google.com
Fixes: 7950e9ac638e ("f2fs: stop gc/discard thread after fs shutdown")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21 00:56:28 +00:00
Zhiguo Niu
8fb9f31984 f2fs: clean up val{>>,<<}F2FS_BLKSIZE_BITS
Use F2FS_BYTES_TO_BLK(bytes) and F2FS_BLK_TO_BYTES(blk) for cleanup
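
As a rough illustration of what the two helpers compute, here is a minimal
sketch. It assumes 4 KiB blocks (F2FS_BLKSIZE_BITS == 12), which is an
assumption and not stated in this commit; the Python function names are
hypothetical mirrors of the macros.

```python
F2FS_BLKSIZE_BITS = 12  # assumption: 4 KiB blocks


def bytes_to_blk(nbytes):
    # Mirrors F2FS_BYTES_TO_BLK(bytes): drop the in-block offset bits.
    return nbytes >> F2FS_BLKSIZE_BITS


def blk_to_bytes(blk):
    # Mirrors F2FS_BLK_TO_BYTES(blk): scale a block count to bytes.
    return blk << F2FS_BLKSIZE_BITS


print(bytes_to_blk(8192), blk_to_bytes(3))
```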

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21 00:56:27 +00:00
Chao Yu
5bcde45578 f2fs: get rid of buffer_head use
Convert to use folio and related functionality.

Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15 15:26:40 +00:00
Matthew Wilcox (Oracle)
1da86618bd fs: Convert aops->write_begin to take a folio
Convert all callers from working on a page to working on one page
of a folio (support for working on an entire folio can come later).
Removes a lot of folio->page->folio conversions.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07 11:33:21 +02:00
Matthew Wilcox (Oracle)
a225800f32 fs: Convert aops->write_end to take a folio
Most callers have a folio, and most implementations operate on a folio,
so remove the conversion from folio->page->folio to fit through this
interface.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07 11:32:02 +02:00
Liao Yuanhong
8444ce5249 f2fs: add write priority option based on zone UFS
Currently, we are using a mix of traditional UFS and zone UFS to support
some functionalities that cannot be achieved on zone UFS alone. However,
there are some issues with this approach. There exists a significant
performance difference between traditional UFS and zone UFS. Under normal
usage, we prioritize writes to zone UFS. However, in critical conditions
(such as when the entire UFS is almost full), we cannot determine whether
data will be written to traditional UFS or zone UFS. This can lead to
significant performance fluctuations, which is not conducive to
development and testing. To address this, we have added an option
zlu_io_enable under sys with the following three modes:
1) zlu_io_enable == 0: Normal mode, prioritize writing to zone UFS;
2) zlu_io_enable == 1: Zone UFS only mode, only allow writing to zone UFS;
3) zlu_io_enable == 2: Traditional UFS priority mode, prioritize writing to
traditional UFS.
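
The three modes can be sketched as a small selection function. This is only
an illustration of the policy as described here, not the kernel
implementation: ZluIoMode, pick_target() and the "has free space" booleans
are hypothetical, and "prioritize" is assumed to mean "try the preferred
area first, then fall back to the other one".

```python
from enum import IntEnum


class ZluIoMode(IntEnum):
    PRIOR_ZONE = 0         # normal mode: prefer zone UFS
    ONLY_ZONE = 1          # zone UFS only
    PRIOR_TRADITIONAL = 2  # prefer traditional UFS


def pick_target(mode, zone_free, traditional_free):
    """Return "zone", "traditional", or None when nothing is allowed."""
    mode = ZluIoMode(mode)
    if mode == ZluIoMode.ONLY_ZONE:
        return "zone" if zone_free else None
    if mode == ZluIoMode.PRIOR_ZONE:
        order = ("zone", "traditional")
    else:
        order = ("traditional", "zone")
    free = {"zone": zone_free, "traditional": traditional_free}
    for target in order:
        if free[target]:
            return target
    return None


print(pick_target(0, zone_free=False, traditional_free=True))
```

With mode 1, a full zone area yields None instead of silently falling back,
which is what makes the write target predictable in the near-full case.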

Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-05 20:18:35 +00:00
Nikita Zhandarovich
50438dbc48 f2fs: avoid potential int overflow in sanity_check_area_boundary()
While calculating the end addresses of main area and segment 0, u32
may not be wide enough to hold the result without the danger of
integer overflow.

Just in case, play it safe and cast one of the operands to a
wider type (u64).
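
The wrap-around can be reproduced with plain integer arithmetic. The sketch
below emulates u32 and u64 by masking; the function names and the sample
values are made up for illustration, and the expression shape (a start
address plus a segment count shifted by log_blocks_per_seg) is an assumption
about how the end address is computed.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF


def seg_end_u32(segment0_blkaddr, segment_count, log_blocks_per_seg):
    # Every intermediate result is truncated to 32 bits, as with u32 operands.
    shifted = (segment_count << log_blocks_per_seg) & U32_MASK
    return (segment0_blkaddr + shifted) & U32_MASK


def seg_end_u64(segment0_blkaddr, segment_count, log_blocks_per_seg):
    # One operand widened to 64 bits before the shift, so nothing is lost here.
    shifted = (segment_count << log_blocks_per_seg) & U64_MASK
    return (segment0_blkaddr + shifted) & U64_MASK


# Hypothetical values: 2**23 segments of 2**9 blocks each is 2**32 blocks.
narrow = seg_end_u32(512, 1 << 23, 9)
wide = seg_end_u64(512, 1 << 23, 9)
print(narrow, wide)
```

The 32-bit result collapses back to the start address, so a boundary check
built on it would compare against the wrong end address.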

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: fd694733d523 ("f2fs: cover large section in sanity check of super")
Cc: stable@vger.kernel.org
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-05 20:18:35 +00:00
Linus Torvalds
5ad7ff8738 f2fs update for 6.11-rc1
It's a pretty small update including mostly minor bug fixes in zoned storage
 along with the large section support.
 
 Enhancement:
  - add support for FS_IOC_GETFSSYSFSPATH
  - enable atgc dynamically if conditions are met
  - use new ioprio Macro to get ckpt thread ioprio level
  - remove unreachable lazytime mount option parsing
 
 Bug fix:
  - fix null reference error when checking end of zone
  - fix start segno of large section
  - fix to cover read extent cache access with lock
  - don't dirty inode for readonly filesystem
  - allocate a new section if curseg is not the first seg in its zone
  - only fragment segment in the same section
  - truncate preallocated blocks in f2fs_file_open()
  - fix to avoid use SSR allocate when do defragment
  - fix to force buffered IO on inline_data inode
 
 And, it includes some minor code clean-ups, and sanity checks.

Merge tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "A pretty small update including mostly minor bug fixes in zoned
  storage along with the large section support.

  Enhancements:
   - add support for FS_IOC_GETFSSYSFSPATH
   - enable atgc dynamically if conditions are met
   - use new ioprio Macro to get ckpt thread ioprio level
   - remove unreachable lazytime mount option parsing

  Bug fixes:
   - fix null reference error when checking end of zone
   - fix start segno of large section
   - fix to cover read extent cache access with lock
   - don't dirty inode for readonly filesystem
   - allocate a new section if curseg is not the first seg in its zone
   - only fragment segment in the same section
   - truncate preallocated blocks in f2fs_file_open()
   - fix to avoid use SSR allocate when do defragment
   - fix to force buffered IO on inline_data inode

  And some minor code clean-ups and sanity checks"

* tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (26 commits)
  f2fs: clean up addrs_per_{inode,block}()
  f2fs: clean up F2FS_I()
  f2fs: use meta inode for GC of COW file
  f2fs: use meta inode for GC of atomic file
  f2fs: only fragment segment in the same section
  f2fs: fix to update user block counts in block_operations()
  f2fs: remove unreachable lazytime mount option parsing
  f2fs: fix null reference error when checking end of zone
  f2fs: fix start segno of large section
  f2fs: remove redundant sanity check in sanity_check_inode()
  f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid
  f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_write
  f2fs: clean up set REQ_RAHEAD given rac
  f2fs: enable atgc dynamically if conditions are met
  f2fs: fix to truncate preallocated blocks in f2fs_file_open()
  f2fs: fix to cover read extent cache access with lock
  f2fs: fix return value of f2fs_convert_inline_inode()
  f2fs: use new ioprio Macro to get ckpt thread ioprio level
  f2fs: fix to don't dirty inode for readonly filesystem
  f2fs: fix to avoid use SSR allocate when do defragment
  ...
2024-07-23 15:21:19 -07:00
Eric Sandeen
54f43a10fa f2fs: remove unreachable lazytime mount option parsing
The lazytime/nolazytime options are now handled in the VFS, and are
never seen in filesystem parsers, so remove handling of these
options from f2fs.

Note: when lazytime support was added in 6d94c74ab85f it made
lazytime the default in default_options() - as a result, lazytime
cannot be disabled (because Opt_nolazytime is never seen in f2fs
parsing).

If lazytime is desired to be configurable, and default off is OK,
default_options() could be updated to stop setting it by default
and allow mount option control.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-07-10 22:47:57 +00:00
Chao Yu
cc260b66c4 f2fs: add support for FS_IOC_GETFSSYSFSPATH
The FS_IOC_GETFSSYSFSPATH ioctl reports the sysfs sub-path of a filesystem
in the format "$FSTYP/$SYSFS_IDENTIFIER" under /sys/fs, which helps to
standardize how sysfs data is exported across filesystems.

This patch wires up FS_IOC_GETFSSYSFSPATH for f2fs; it will output
"f2fs/<dev>".
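
For illustration, a consumer could turn the reported sub-path into a full
sysfs directory as sketched below. The helper name and the device name
"sda1" are hypothetical; the sketch only joins strings and does not issue
the ioctl itself.

```python
import posixpath


def sysfs_dir(subpath):
    # The sub-path is relative to /sys/fs, e.g. "f2fs/<dev>".
    return posixpath.join("/sys/fs", subpath)


print(sysfs_dir("f2fs/sda1"))
```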

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12 15:46:02 +00:00
Gabriel Krisman Bertazi
28add38d54 f2fs: Move CONFIG_UNICODE defguards into the code flow
Instead of a bunch of ifdefs, make the unicode built checks part of the
code flow where possible, as requested by Torvalds.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
[eugen.hristev@collabora.com: port to 6.10-rc1]
Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com>
Link: https://lore.kernel.org/r/20240606073353.47130-8-eugen.hristev@collabora.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-07 17:00:45 +02:00
Linus Torvalds
72ece20127 f2fs update for 6.10-rc1
In this round, we've tried to address some performance issues on zoned storage
 such as direct IO and write_hints. In addition, we've migrated some IO paths
 to use folio. Meanwhile, there are multiple bug fixes in the compression paths,
 sanity check conditions, and error handlers.
 
 Enhancement:
  - allow direct io of pinned files for zoned storage
  - assign the write hint per stream by default
  - convert read paths and test_writeback to folio
  - avoid allocating WARM_DATA segment for direct IO
 
 Bug fix:
  - fix false alarm on invalid block address
  - fix to add missing iput() in gc_data_segment()
  - fix to release node block count in error path of f2fs_new_node_page()
  - compress: don't allow unaligned truncation on released compress inode
  - compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
  - compress: fix error path of inc_valid_block_count()
  - compress: fix to update i_compr_blocks correctly
  - fix block migration when section is not aligned to pow2
  - don't trigger OPU on pinfile for direct IO
  - fix to do sanity check on i_xattr_nid in sanity_check_inode()
  - write missing last sum blk of file pinning section
  - clear writeback when compression failed
  - fix to adjust appropriate defragment pg_end
 
 As usual, there are several minor code clean-ups, and fixes to manage missing
 corner cases in the error paths.

Merge tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've tried to address some performance issues on zoned
  storage such as direct IO and write_hints. In addition, we've migrated
  some IO paths to use folio. Meanwhile, there are multiple bug fixes in
  the compression paths, sanity check conditions, and error handlers.

  Enhancements:
   - allow direct io of pinned files for zoned storage
   - assign the write hint per stream by default
   - convert read paths and test_writeback to folio
   - avoid allocating WARM_DATA segment for direct IO

  Bug fixes:
   - fix false alarm on invalid block address
   - fix to add missing iput() in gc_data_segment()
   - fix to release node block count in error path of
     f2fs_new_node_page()
   - compress:
       - don't allow unaligned truncation on released compress inode
       - cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
       - fix error path of inc_valid_block_count()
       - fix to update i_compr_blocks correctly
   - fix block migration when section is not aligned to pow2
   - don't trigger OPU on pinfile for direct IO
   - fix to do sanity check on i_xattr_nid in sanity_check_inode()
   - write missing last sum blk of file pinning section
   - clear writeback when compression failed
   - fix to adjust appropriate defragment pg_end

  As usual, there are several minor code clean-ups, and fixes to manage
  missing corner cases in the error paths"

* tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (50 commits)
  f2fs: initialize last_block_in_bio variable
  f2fs: Add inline to f2fs_build_fault_attr() stub
  f2fs: fix some ambiguous comments
  f2fs: fix to add missing iput() in gc_data_segment()
  f2fs: allow dirty sections with zero valid block for checkpoint disabled
  f2fs: compress: don't allow unaligned truncation on released compress inode
  f2fs: fix to release node block count in error path of f2fs_new_node_page()
  f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
  f2fs: compress: fix error path of inc_valid_block_count()
  f2fs: compress: fix typo in f2fs_reserve_compress_blocks()
  f2fs: compress: fix to update i_compr_blocks correctly
  f2fs: check validation of fault attrs in f2fs_build_fault_attr()
  f2fs: fix to limit gc_pin_file_threshold
  f2fs: remove unused GC_FAILURE_PIN
  f2fs: use f2fs_{err,info}_ratelimited() for cleanup
  f2fs: fix block migration when section is not aligned to pow2
  f2fs: zone: fix to don't trigger OPU on pinfile for direct IO
  f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode()
  f2fs: fix to avoid allocating WARM_DATA segment for direct IO
  f2fs: remove redundant parameter in is_next_segment_free()
  ...
2024-05-20 13:23:43 -07:00
Chao Yu
4ed886b187 f2fs: check validation of fault attrs in f2fs_build_fault_attr()
- parse_options() did not validate fault attrs; fix this by adding the
check condition to f2fs_build_fault_attr().
- Use f2fs_build_fault_attr() in __sbi_store() to clean up code.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-09 01:04:46 +00:00
Chao Yu
06b206d9e2 f2fs: remove unnecessary block size check in init_f2fs_fs()
After commit d7e9a9037de2 ("f2fs: Support Block Size == Page Size"),
F2FS_BLKSIZE equals PAGE_SIZE, so remove the unnecessary check condition.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-19 17:57:35 +00:00
Chao Yu
5bf624c012 f2fs: fix comment in sanity_check_raw_super()
Commit d7e9a9037de2 ("f2fs: Support Block Size == Page Size") did not
adjust the comment in sanity_check_raw_super(); fix it.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-19 17:57:22 +00:00
Jaegeuk Kim
3bdb7f1616 f2fs: don't set RO when shutting down f2fs
Shutdown does not check the error of thaw_super due to readonly, which
causes a deadlock like below.

f2fs_ioc_shutdown(F2FS_GOING_DOWN_FULLSYNC)        issue_discard_thread
 - bdev_freeze
  - freeze_super
 - f2fs_stop_checkpoint()
  - f2fs_handle_critical_error                     - sb_start_write
    - set RO                                         - waiting
 - bdev_thaw
  - thaw_super_locked
    - return -EINVAL, if sb_rdonly()
 - f2fs_stop_discard_thread
  -> wait for kthread_stop(discard_thread);

Reported-by: "Light Hsieh (謝明燈)" <Light.Hsieh@mediatek.com>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12 20:58:35 +00:00
Wenjie Qi
0f9b12142b f2fs: fix zoned block device information initialization
If the max open zones of zoned devices are less than
the active logs of F2FS, the device may error due to
insufficient zone resources when multiple active logs
are being written at the same time.
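
One way to guard against this at mount time is a simple capacity check, as
in the sketch below. This is an illustration of the condition only, not the
code added by this patch: the function name is hypothetical, and treating
max_open_zones == 0 as "no limit" is an assumption.

```python
def zone_resources_ok(max_open_zones, active_logs):
    """Return True if every active log can keep its own zone open.

    Assumption: max_open_zones == 0 means the device reports no limit.
    """
    if max_open_zones == 0:
        return True
    return max_open_zones >= active_logs


print(zone_resources_ok(4, 6))
```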

Signed-off-by: Wenjie Qi <qwjhust@gmail.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-09 16:17:48 +00:00
Yunlei He
ac5eecf481 f2fs: remove clear SB_INLINECRYPT flag in default_options
In f2fs_remount, the SB_INLINECRYPT flag is cleared and then set again.
Files created or opened during this gap will not use inlinecrypt. In
the worst case, this may lead to data corruption if wrappedkey_v0 is
enabled.

Thread A:                               Thread B:

-f2fs_remount				-f2fs_file_open or f2fs_new_inode
  -default_options
	<- clear SB_INLINECRYPT flag

                                          -fscrypt_select_encryption_impl

  -parse_options
	<- set SB_INLINECRYPT again

Signed-off-by: Yunlei He <heyunlei@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-29 23:54:42 +00:00
Christian Brauner
22650a9982 fs,block: yield devices early
Currently a device is only really released once the umount returns to
userspace due to how file closing works. That ultimately could cause
an old umount assumption to be violated that concurrent umount and mount
don't fail. So an exclusively held device with a temporary holder should
be yielded before the filesystem is gone. Add a helper that allows
callers to do that. This also allows us to remove the two holder ops
that Linus wasn't excited about.

Link: https://lore.kernel.org/r/20240326-vfs-bdev-end_holder-v1-1-20af85202918@kernel.org
Fixes: f3a608827d1f ("bdev: open block device as files") # mainline only
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-27 13:17:15 +01:00
Chao Yu
ee745e4736 f2fs: support .shutdown in f2fs_sops
Support .shutdown callback in f2fs_sops, then, it can be called to
shut down the file system when underlying block device is marked dead.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-26 02:13:08 +00:00
Linus Torvalds
c5d9ab85eb f2fs update for 6.9-rc1
In this round, there are a number of updates mainly in two areas: Zoned block
 device support and per-file compression. For example, we've found several issues
 in supporting Zoned block devices, especially those with large sections, regarding
 GC and file pinning used for Android devices. On the compression side, we've fixed
 many corner-case race conditions that had broken the design assumptions.
 
 Enhancement:
  - Support file pinning for Zoned block device having large section
  - Enhance the data recovery after sudden power cut on Zoned block device
  - Add more error injection cases to easily detect the kernel panics
  - add a proc entry to show the entire disk layout
  - Improve various error paths that panicked via BUG_ON in block allocation and GC
  - support SEEK_DATA and SEEK_HOLE for compression files
 
 Bug fix:
  - fix to avoid use-after-free issue in f2fs_filemap_fault
  - fix some race conditions that break the atomic write design assumption
  - fix to truncate meta inode pages forcibly
  - resolve various per-file compression issues wrt the space management and
    compression policies
  - fix some swap-related bugs
 
 In addition, we removed deprecated code such as io_bits and heap_allocation,
 and also fixed minor error handling routines with neat debugging messages.

Merge tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs update from Jaegeuk Kim:
 "In this round, there are a number of updates on mainly two areas:
  Zoned block device support and Per-file compression. For example,
  we've found several issues in supporting Zoned block devices,
  especially those with large sections, regarding GC and the file
  pinning used on Android devices. On the compression side, we've
  fixed many corner-case race conditions that had broken the design
  assumptions.

  Enhancements:
   - Support file pinning for Zoned block device having large section
   - Enhance the data recovery after sudden power cut on Zoned block
     device
   - Add more error injection cases to easily detect the kernel panics
   - add a proc entry to show the entire disk layout
   - Improve various error paths that panicked via BUG_ON in block allocation
     and GC
   - support SEEK_DATA and SEEK_HOLE for compression files

  Bug fixes:
   - avoid use-after-free issue in f2fs_filemap_fault
   - fix some race conditions that break the atomic write design
     assumption
   - fix to truncate meta inode pages forcibly
   - resolve various per-file compression issues wrt the space
     management and compression policies
   - fix some swap-related bugs

  In addition, we removed deprecated code such as io_bits and
  heap_allocation, and also fixed minor error handling routines with
  neat debugging messages"

* tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
  f2fs: fix to avoid use-after-free issue in f2fs_filemap_fault
  f2fs: truncate page cache before clearing flags when aborting atomic write
  f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag
  f2fs: prevent atomic write on pinned file
  f2fs: fix to handle error paths of {new,change}_curseg()
  f2fs: unify the error handling of f2fs_is_valid_blkaddr
  f2fs: zone: fix to remove pow2 check condition for zoned block device
  f2fs: fix to truncate meta inode pages forcely
  f2fs: compress: fix reserve_cblocks counting error when out of space
  f2fs: compress: relocate some judgments in f2fs_reserve_compress_blocks
  f2fs: add a proc entry show disk layout
  f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
  f2fs: fix to check return value of f2fs_gc_range
  f2fs: fix to check return value __allocate_new_segment
  f2fs: fix to do sanity check in update_sit_entry
  f2fs: fix to reset fields for unloaded curseg
  f2fs: clean up new_curseg()
  f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()
  f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()
  f2fs: ro: don't start discard thread for readonly image
  ...
2024-03-18 11:26:00 -07:00
Linus Torvalds
e5e038b7ae

Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull ext2, isofs, udf, and quota updates from Jan Kara:
 "A lot of material this time:

   - removal of a lot of GFP_NOFS usage from ext2, udf, quota (either it
     was legacy or replaced with scoped memalloc_nofs_*() API)

   - removal of BUG_ONs in quota code

   - conversion of UDF to the new mount API

   - tightening quota on disk format verification

   - fix some potentially unsafe use of RCU pointers in quota code and
     annotate everything properly to make sparse happy

   - a few other small quota, ext2, udf, and isofs fixes"

* tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (26 commits)
  udf: remove SLAB_MEM_SPREAD flag usage
  quota: remove SLAB_MEM_SPREAD flag usage
  isofs: remove SLAB_MEM_SPREAD flag usage
  ext2: remove SLAB_MEM_SPREAD flag usage
  ext2: mark as deprecated
  udf: convert to new mount API
  udf: convert novrs to an option flag
  MAINTAINERS: add missing git address for ext2 entry
  quota: Detect loops in quota tree
  quota: Properly annotate i_dquot arrays with __rcu
  quota: Fix rcu annotations of inode dquot pointers
  isofs: handle CDs with bad root inode but good Joliet root directory
  udf: Avoid invalid LVID used on mount
  quota: Fix potential NULL pointer dereference
  quota: Drop GFP_NOFS instances under dquot->dq_lock and dqio_sem
  quota: Set nofs allocation context when acquiring dqio_sem
  ext2: Remove GFP_NOFS use in ext2_xattr_cache_insert()
  ext2: Drop GFP_NOFS use in ext2_get_blocks()
  ext2: Drop GFP_NOFS allocation from ext2_init_block_alloc_info()
  udf: Remove GFP_NOFS allocation in udf_expand_file_adinicb()
  ...
2024-03-13 14:30:58 -07:00
Zhiguo Niu
245930617c f2fs: fix to handle error paths of {new,change}_curseg()
{new,change}_curseg() may return an error in some special cases.
The error should be handled in their callers, which will also
facilitate subsequent error path expansion in {new,change}_curseg().

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12 18:25:17 -07:00
Chao Yu
11bec96afb f2fs: zone: fix to remove pow2 check condition for zoned block device
Commit 2e2c6e9b72ce ("f2fs: remove power-of-two limitation of zoned
device") failed to remove the pow2 check condition in init_blkz_info();
fix it.

Fixes: 2e2c6e9b72ce ("f2fs: remove power-of-two limitation of zoned device")
Signed-off-by: Feng Song <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12 18:25:17 -07:00
Linus Torvalds
0f1a876682 vfs-6.9.uuid

Merge tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs uuid updates from Christian Brauner:
 "This adds two new ioctl()s for getting the filesystem uuid and
  retrieving the sysfs path based on the path of a mounted filesystem.
  Getting the filesystem uuid has been implemented in filesystem
  specific code for a while; it's now lifted into a generic ioctl"

* tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: add support for FS_IOC_GETFSSYSFSPATH
  fs: add FS_IOC_GETFSSYSFSPATH
  fat: Hook up sb->s_uuid
  fs: FS_IOC_GETUUID
  ovl: convert to super_set_uuid()
  fs: super_set_uuid()
2024-03-11 11:02:06 -07:00
Linus Torvalds
910202f00a vfs-6.9.super

Merge tag 'vfs-6.9.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull block handle updates from Christian Brauner:
 "Last cycle we changed opening of block devices, and opening a block
  device would return a bdev_handle. This allowed us to implement
  support for restricting and forbidding writes to mounted block
  devices. It was accompanied by converting and adding helpers to
  operate on bdev_handles instead of plain block devices.

  That was already a good step forward but ultimately it isn't necessary
  to have special purpose helpers for opening block devices internally
  that return a bdev_handle.

  Fundamentally, opening a block device internally should just be
  equivalent to opening files. So now all internal opens of block
  devices return files just as a userspace open would. Instead of
  introducing a separate indirection into bdev_open_by_*() via struct
  bdev_handle bdev_file_open_by_*() is made to just return a struct
  file. Opening and closing a block device just becomes equivalent to
  opening and closing a file.

  This all works well because internally we already have a pseudo fs for
  block devices and so opening block devices is simple. There are a few
  places where we needed to be careful such as during boot when the
  kernel is supposed to mount the rootfs directly without init doing it.
  Here we need to take care to ensure that we flush out any asynchronous
  file close. That's what we already do for opening, unpacking, and
  closing the initramfs. So nothing new here.

  The equivalence of opening and closing block devices to regular files
  is a win in and of itself. But it also has various other advantages.
  We can remove struct bdev_handle completely. Various low-level helpers
  are now private to the block layer. Other helpers were simply
  removable completely.

  A follow-up series that is already reviewed builds on this and makes it
  possible to remove bdev->bd_inode and allows various clean ups of the
  buffer head code as well. All places where we stashed a bdev_handle
  now just stash a file and use simple accessors to get to the actual
  block device which was already the case for bdev_handle"

* tag 'vfs-6.9.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
  block: remove bdev_handle completely
  block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
  bdev: remove bdev pointer from struct bdev_handle
  bdev: make struct bdev_handle private to the block layer
  bdev: make bdev_{release, open_by_dev}() private to block layer
  bdev: remove bdev_open_by_path()
  reiserfs: port block device access to file
  ocfs2: port block device access to file
  nfs: port block device access to files
  jfs: port block device access to file
  f2fs: port block device access to files
  ext4: port block device access to file
  erofs: port device access to file
  btrfs: port device access to file
  bcachefs: port block device access to file
  target: port block device access to file
  s390: port block device access to file
  nvme: port block device access to file
  block2mtd: port device access to files
  bcache: port block device access to files
  ...
2024-03-11 10:52:34 -07:00
Christian Brauner
09406ad8e5 case-insensitive updates for 6.9
- Patch case-insensitive lookup by trying the case-exact comparison
 first, before falling back to costly utf8 casefolded comparison.
 
 - Fix to forbid using a case-insensitive directory as part of an
 overlayfs mount.
 
 - Patchset to ensure d_op are set at d_alloc time for fscrypt and
 casefold volumes, ensuring filesystem dentries will all have the correct
 ops, whether they come from a lookup or not.

Merge tag 'for-next-6.9' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krisman/unicode into vfs.misc

Merge case-insensitive updates from Gabriel Krisman Bertazi:

- Patch case-insensitive lookup by trying the case-exact comparison
  first, before falling back to costly utf8 casefolded comparison.

- Fix to forbid using a case-insensitive directory as part of an
  overlayfs mount.

- Patchset to ensure d_op are set at d_alloc time for fscrypt and
  casefold volumes, ensuring filesystem dentries will all have the
  correct ops, whether they come from a lookup or not.
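
The first item above (try the case-exact comparison before the casefolded one) can be sketched roughly as follows. This is only an illustration in Python, not the libfs code: `casefold_key()` and `names_match()` are hypothetical names, and NFD normalization plus `str.casefold()` merely stands in for the kernel's utf8 casefolding.

```python
import unicodedata


def casefold_key(name: str) -> str:
    # Assumed stand-in for the costly utf8 casefolding step.
    return unicodedata.normalize("NFD", name).casefold()


def names_match(candidate: str, wanted: str) -> bool:
    # Fast path: identical names match without any normalization work.
    if candidate == wanted:
        return True
    # Slow path: fall back to the casefolded comparison.
    return casefold_key(candidate) == casefold_key(wanted)
```

The ordering matters only for cost: an exact match is also a casefolded match, so the fast path never changes the result.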

* tag 'for-next-6.9' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup

Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-07 11:55:41 +01:00
Chao Yu
45809cd3bd f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
Just cleanup, no functional change.
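
As a rough illustration of what such conversion helpers do, the sketch below assumes the conversion is a shift by the log2 of the number of blocks per segment. The lowercase function names and the value 9 (512 blocks per segment) are assumptions made for the example, not taken from this commit.

```python
# Assumed example value: 512 blocks per segment, i.e. a shift of 9.
LOG_BLOCKS_PER_SEG = 9


def segs_to_blks(segs: int, log_blocks_per_seg: int = LOG_BLOCKS_PER_SEG) -> int:
    # Segment count -> block count.
    return segs << log_blocks_per_seg


def blks_to_segs(blks: int, log_blocks_per_seg: int = LOG_BLOCKS_PER_SEG) -> int:
    # Block count -> number of whole segments (the remainder is dropped).
    return blks >> log_blocks_per_seg
```

Note that the block-to-segment direction rounds down, so a partially filled segment is not counted.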

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04 10:18:26 -08:00
Jaegeuk Kim
afbb8ff62b f2fs: print zone status in string and some log
No functional change, but add some more logs.

Note, it includes fixes for the spelling mistakes pointed out by Colin Ian King.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04 09:22:30 -08:00
Jaegeuk Kim
4d4c593893 f2fs: fix write pointers all the time
Even if the roll forward recovery stopped due to any error, we have to fix
the write pointers in order to mount the disk from the previous checkpoint.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-29 08:34:35 -08:00
Jaegeuk Kim
de25240756 f2fs: prevent an f2fs_gc loop during disable_checkpoint
Don't get stuck in the f2fs_gc loop while disabling checkpoint. Instead, use
time-based management.
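
The general idea of bounding a retry loop by time rather than by its own progress can be sketched as below. This is a generic Python illustration, not the f2fs code: `run_gc_with_deadline()`, `gc_step`, and the budget parameter are all hypothetical.

```python
import time


def run_gc_with_deadline(gc_step, budget_seconds, clock=time.monotonic):
    """Call gc_step() until it reports no more work or the time budget is spent.

    Returns True if the work finished, False if the budget ran out first.
    """
    deadline = clock() + budget_seconds
    while clock() < deadline:
        if not gc_step():  # gc_step() returns False when nothing is left to do
            return True
    return False
```

A caller that gets False back can give up cleanly instead of spinning forever.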

Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-29 08:34:34 -08:00
Chao Yu
8b10d36537 f2fs: introduce FAULT_NO_SEGMENT
Use it to simulate no free segment case during block allocation.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-29 08:34:34 -08:00
Gabriel Krisman Bertazi
be2760a703 f2fs: Configure dentry operations at dentry-creation time
This was already the case for case-insensitive before commit
bb9cd9106b22 ("fscrypt: Have filesystems handle their d_ops"), but it
was changed to set at lookup-time to facilitate the integration with
fscrypt.  But it's a problem because dentries that don't get created
through ->lookup() won't have any visibility of the operations.

Since fscrypt now also supports configuring dentry operations at
creation-time, do it for any encrypted and/or casefold volume,
simplifying the implementation across these features.

Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240221171412.10710-9-krisman@suse.de
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
2024-02-27 16:55:35 -05:00
Jaegeuk Kim
4e0197f993 f2fs: kill heap-based allocation
No one uses this feature. Let's kill it.

Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-27 09:41:15 -08:00
Chao Yu
e39602da75 f2fs: compress: fix to check zstd compress level correctly in mount option
f2fs only supports configuring the zstd compress level with a positive
number due to its layout design. However, since commit e0c1b49f5b67 ("lib:
zstd: Upgrade to latest upstream zstd version 1.4.10"), zstd supports
negative compress levels, so zstd_min_clevel() may return a negative number.
With the mount option below, .compress_level can then be configured with a
negative number, which f2fs does not allow. Add a check condition to avoid it.

mount -o compress_algorithm=zstd:4294967295 /dev/sdx /mnt/f2fs
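
To see why 4294967295 is a problematic level, the sketch below reinterprets it as a signed 32-bit value and then applies a positive-only range check. It is an illustration only: `as_signed32()` and `check_zstd_level()` are hypothetical helpers, the 32-bit reinterpretation is an assumption about how the value ends up negative, and the upper bound of 22 is assumed as zstd's maximum level.

```python
def as_signed32(value: int) -> int:
    # Reinterpret the low 32 bits as a two's-complement signed integer.
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def check_zstd_level(raw: str, max_level: int = 22) -> int:
    # max_level=22 is an assumed upper bound for this sketch.
    level = as_signed32(int(raw, 10))
    if level < 0:
        raise ValueError(f"negative compress level not supported: {level}")
    if not 1 <= level <= max_level:
        raise ValueError(f"compress level out of range: {level}")
    return level
```

Under these assumptions the value from the mount option above maps to -1 and is rejected, while an ordinary level such as 3 passes.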

Fixes: 00e120b5e4b5 ("f2fs: assign default compression level")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-27 09:41:15 -08:00
Jaegeuk Kim
a60108f7df f2fs: use BLKS_PER_SEG, BLKS_PER_SEC, and SEGS_PER_SEC
No functional change.

Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-27 09:41:12 -08:00
Christian Brauner
512383ae49 f2fs: port block device access to files
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-22-adbd023e19cc@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-25 12:05:26 +01:00
Christian Brauner
f3a608827d bdev: open block device as files
Add two new helpers to allow opening block devices as files.
This is not the final infrastructure. This still opens the block device
before opening a struct file. Until we have removed all references to
struct bdev_handle we can't switch the order:

* Introduce blk_to_file_flags() to translate from block specific to
  flags usable to open a new file.
* Introduce bdev_file_open_by_{dev,path}().
* Introduce temporary sb_bdev_handle() helper to retrieve a struct
  bdev_handle from a block device file and update places that directly
  reference struct bdev_handle to rely on it.
* Don't count block device opens against the number of open files. A
  bdev_file_open_by_{dev,path}() file is never installed into any
  file descriptor table.

One idea that came to mind was to use kernel_tmpfile_open() which
would require us to pass a path and it would then call do_dentry_open()
going through the regular fops->open::blkdev_open() path. But then we're
back to the problem of routing block specific flags such as
BLK_OPEN_RESTRICT_WRITES through the open path and would have to waste
FMODE_* flags every time we add a new one. With this we can avoid using
a flag bit and we have more leeway in how we open block devices from
bdev_open_by_{dev,path}().

Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-1-adbd023e19cc@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-25 12:05:21 +01:00
Jaegeuk Kim
87161a2b0a f2fs: deprecate io_bits
Let's deprecate an unused io_bits feature to save CPU cycles and memory.

Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-20 11:08:57 -08:00
Kent Overstreet
a4af51ce22 fs: super_set_uuid()
Some weird old filesystems have UUID-like things that we wish to expose
as UUIDs, but are smaller; add a length field so that the new
FS_IOC_(GET|SET)UUID ioctls can handle them in generic code.

And add a helper super_set_uuid(), for setting nonstandard length uuids.

Helper is now required for the new FS_IOC_GETUUID ioctl; if
super_set_uuid() hasn't been called, the ioctl won't be supported.
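
A toy model of the behaviour described above, in Python rather than kernel C: a UUID shorter than 16 bytes is stored together with its length, and the getter reports "unsupported" when no UUID was ever set. The `Super` class, the `s_uuid_len` field name, and `get_fs_uuid()` are assumptions made for this sketch.

```python
UUID_SIZE = 16  # size of the UUID buffer in this model


class Super:
    def __init__(self):
        self.s_uuid = bytes(UUID_SIZE)
        self.s_uuid_len = 0  # 0 means "never set" (assumed field name)


def super_set_uuid(sb: Super, uuid: bytes) -> None:
    # Accept nonstandard, shorter identifiers and remember their real length.
    if len(uuid) > UUID_SIZE:
        raise ValueError("uuid longer than the buffer")
    sb.s_uuid = uuid.ljust(UUID_SIZE, b"\0")
    sb.s_uuid_len = len(uuid)


def get_fs_uuid(sb: Super) -> bytes:
    # Without a prior super_set_uuid() call, the query is not supported.
    if sb.s_uuid_len == 0:
        raise NotImplementedError("filesystem did not set a UUID")
    return sb.s_uuid[:sb.s_uuid_len]
```

Storing the length alongside the buffer is what lets generic code return a 4-byte identifier without padding it out to 16 bytes.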

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20240207025624.1019754-2-kent.overstreet@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-08 21:19:59 +01:00
Jan Kara
ccb49011bb quota: Properly annotate i_dquot arrays with __rcu
Dquots pointed to from i_dquot arrays in inodes are protected by
dquot_srcu. Annotate them as such and change .get_dquots callback to
return properly annotated pointer to make sparse happy.

Fixes: b9ba6f94b238 ("quota: remove dqptr_sem")
Signed-off-by: Jan Kara <jack@suse.cz>
2024-02-08 12:04:59 +01:00
Chao Yu
0b8eb814e0 f2fs: use f2fs_err_ratelimited() to avoid redundant logs
Use f2fs_err_ratelimited() instead of f2fs_err() in
f2fs_record_stop_reason() and f2fs_record_errors() to
avoid redundant logs.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-05 18:58:40 -08:00
Chao Yu
b1c9d3f833 f2fs: support printk_ratelimited() in f2fs_printk()
This patch supports using printk_ratelimited() in f2fs_printk(), and
wraps the ratelimited f2fs_printk() into f2fs_{err,warn,info}_ratelimited(),
then uses these new helpers to clean up the code.
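
For readers unfamiliar with rate-limited logging, the sketch below shows one common scheme: allow up to a burst of messages per time window and drop the rest. It is a simplified Python model, not the kernel's implementation; the `RateLimit` class, `err_ratelimited()`, and the interval and burst values are hypothetical.

```python
import time


class RateLimit:
    """Allow at most `burst` events per `interval` seconds."""

    def __init__(self, interval, burst, clock=time.monotonic):
        self.interval = interval
        self.burst = burst
        self.clock = clock
        self.window_start = None
        self.printed = 0
        self.missed = 0  # number of suppressed messages

    def allow(self) -> bool:
        now = self.clock()
        # Start a new window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.missed += 1
        return False


def err_ratelimited(limit: RateLimit, log: list, msg: str) -> None:
    # Wrapper in the spirit of a *_ratelimited() helper: log only if allowed.
    if limit.allow():
        log.append(msg)
```

With an interval of 5 seconds and a burst of 2, five identical errors in quick succession produce two log lines and three suppressed ones.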

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-05 18:58:40 -08:00
Chao Yu
c7115e094c f2fs: introduce FAULT_BLKADDR_CONSISTENCE
We will encounter the inconsistent status below when FAULT_BLKADDR type
fault injection is on.

Info: checkpoint state = d6 :  nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1c100 has i_blocks: 000000c0, but has 191 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1c100] i_blocks=0x000000c0 -> 0xbf
[FIX] (fsck_chk_inode_blk:1269)  --> [0x1c100] i_compr_blocks=0x00000026 -> 0x27
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1cadb has i_blocks: 0000002f, but has 46 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1cadb] i_blocks=0x0000002f -> 0x2e
[FIX] (fsck_chk_inode_blk:1269)  --> [0x1cadb] i_compr_blocks=0x00000011 -> 0x12
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1c62c has i_blocks: 00000002, but has 1 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1c62c] i_blocks=0x00000002 -> 0x1

After we inject a fault into f2fs_is_valid_blkaddr() during truncation,
a) it fails to increase @nr_free or @valid_blocks
b) it can cause a blkaddr leak in the truncated dnode
which may cause inconsistent status.

This patch separates FAULT_BLKADDR_CONSISTENCE from FAULT_BLKADDR,
and renames FAULT_BLKADDR to FAULT_BLKADDR_VALIDITY
so that:
a) we can use FAULT_BLKADDR_CONSISTENCE in f2fs_truncate_data_blocks_range()
to simulate the inconsistency independently and then verify the fsck
repair flow.
b) the FAULT_BLKADDR_VALIDITY fault will not cause any inconsistent status,
so we can use it just to check error path handling on the kernel side.
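
The benefit of splitting one fault type into two can be sketched with a small model in which each type is a separate bit and only enabled types ever fire. This is a Python illustration under assumptions: the `Fault` values, the `FaultInjector` class, and the "every Nth call" rate policy are hypothetical and are not the kernel's definitions.

```python
from enum import IntFlag


class Fault(IntFlag):
    # Illustrative bit values only.
    BLKADDR_VALIDITY = 1
    BLKADDR_CONSISTENCE = 2


class FaultInjector:
    def __init__(self, enabled: Fault, rate: int):
        self.enabled = enabled  # bitmask of fault types that may fire
        self.rate = rate        # fire on every `rate`-th eligible call
        self.calls = 0

    def should_inject(self, fault: Fault) -> bool:
        if not (self.enabled & fault) or self.rate <= 0:
            return False
        self.calls += 1
        if self.calls >= self.rate:
            self.calls = 0
            return True
        return False
```

With only the consistency type enabled, call sites that check the validity type never see an injected failure, so the two scenarios can be tested independently.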

Reviewed-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-05 18:58:39 -08:00
Eric Biggers
c919330dd5 f2fs: fix double free of f2fs_sb_info
kill_f2fs_super() is called even if f2fs_fill_super() fails.
f2fs_fill_super() frees the struct f2fs_sb_info, so it must set
sb->s_fs_info to NULL to prevent it from being freed again.
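
The pattern described above (clear the pointer after freeing so that a later teardown does not free it again) can be modelled in a few lines. This is a toy Python model, not the f2fs code: `Heap`, `SuperBlock`, `fill_super()`, `kill_super()`, and the `clear_on_error` switch are hypothetical and exist only to show both behaviours.

```python
class Heap:
    """Toy allocator that raises on a double free."""

    def __init__(self):
        self.live = set()
        self.next_id = 1

    def alloc(self) -> int:
        handle = self.next_id
        self.next_id += 1
        self.live.add(handle)
        return handle

    def free(self, handle: int) -> None:
        if handle not in self.live:
            raise RuntimeError("double free")
        self.live.remove(handle)


class SuperBlock:
    def __init__(self):
        self.s_fs_info = None


def fill_super(sb, heap, fail, clear_on_error=True) -> int:
    sb.s_fs_info = heap.alloc()
    if fail:
        heap.free(sb.s_fs_info)
        if clear_on_error:
            sb.s_fs_info = None  # prevents kill_super() from freeing it again
        return -1
    return 0


def kill_super(sb, heap) -> None:
    # Runs even if fill_super() failed, so it must tolerate a cleared pointer.
    if sb.s_fs_info is not None:
        heap.free(sb.s_fs_info)
        sb.s_fs_info = None
```

Running the failing path with `clear_on_error=False` makes `kill_super()` raise, which is the model's equivalent of the double free.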

Fixes: 275dca4630c1 ("f2fs: move release of block devices to after kill_block_super()")
Reported-by:  <syzbot+8f477ac014ff5b32d81f@syzkaller.appspotmail.com>
Closes: https://lore.kernel.org/lkml/0000000000006cb174060ec34502@google.com
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/linux-f2fs-devel/20240113005747.38887-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-01-12 18:55:09 -08:00
Linus Torvalds
70d201a408 f2fs update for 6.8-rc1
In this series, we've made some progress in supporting Zoned block devices regarding
 the power-cut recovery flow and enabling the checkpoint=disable feature, which is
 essential for Android OTA. Other than that, some patches touched sysfs entries
 and tracepoints, which are minor, while several bug fixes on error handlers and
 compression flows should improve the overall stability.
 
 Enhancement:
  - enable checkpoint=disable for zoned block device
  - sysfs entries such as discard status, discard_io_aware, dir_level
  - tracepoints such as f2fs_vm_page_mkwrite(), f2fs_rename(), f2fs_new_inode()
  - use shared inode lock during f2fs_fiemap() and f2fs_seek_block()
 
 Bug fix:
  - address some power-cut recovery issues on zoned block device
  - handle errors and logics on do_garbage_collect(), f2fs_reserve_new_block(),
    f2fs_move_file_range(), f2fs_recover_xattr_data()
  - don't set FI_PREALLOCATED_ALL for partial write
  - fix to update iostat correctly in f2fs_filemap_fault()
  - fix to wait on block writeback for post_read case
  - fix to tag gcing flag on page during block migration
  - restrict max filesize for 16K f2fs
  - fix to avoid dirent corruption
  - explicitly null-terminate the xattr list
 
 There are also several clean-up patches to remove dead code and improve
 readability.

Merge tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs update from Jaegeuk Kim:
 "In this series, we've some progress to support Zoned block device
  regarding the power-cut recovery flow and enabling the
  checkpoint=disable feature, which is essential for Android OTA.

  Other than that, some patches touched sysfs entries and tracepoints
  which are minor, while several bug fixes on error handlers and
  compression flows are good to improve the overall stability.

  Enhancements:
   - enable checkpoint=disable for zoned block device
   - sysfs entries such as discard status, discard_io_aware, dir_level
   - tracepoints such as f2fs_vm_page_mkwrite(), f2fs_rename(),
     f2fs_new_inode()
   - use shared inode lock during f2fs_fiemap() and f2fs_seek_block()

  Bug fixes:
   - address some power-cut recovery issues on zoned block device
   - handle errors and logics on do_garbage_collect(),
     f2fs_reserve_new_block(), f2fs_move_file_range(),
     f2fs_recover_xattr_data()
   - don't set FI_PREALLOCATED_ALL for partial write
   - fix to update iostat correctly in f2fs_filemap_fault()
   - fix to wait on block writeback for post_read case
   - fix to tag gcing flag on page during block migration
   - restrict max filesize for 16K f2fs
   - fix to avoid dirent corruption
   - explicitly null-terminate the xattr list

  There are also several clean-up patches to remove dead code and
  improve readability"

* tag 'f2fs-for-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits)
  f2fs: show more discard status by sysfs
  f2fs: Add error handling for negative returns from do_garbage_collect
  f2fs: Constrain the modification range of dir_level in the sysfs
  f2fs: Use wait_event_freezable_timeout() for freezable kthread
  f2fs: fix to check return value of f2fs_recover_xattr_data
  f2fs: don't set FI_PREALLOCATED_ALL for partial write
  f2fs: fix to update iostat correctly in f2fs_filemap_fault()
  f2fs: fix to check compress file in f2fs_move_file_range()
  f2fs: fix to wait on block writeback for post_read case
  f2fs: fix to tag gcing flag on page during block migration
  f2fs: add tracepoint for f2fs_vm_page_mkwrite()
  f2fs: introduce f2fs_invalidate_internal_cache() for cleanup
  f2fs: update blkaddr in __set_data_blkaddr() for cleanup
  f2fs: introduce get_dnode_addr() to clean up codes
  f2fs: delete obsolete FI_DROP_CACHE
  f2fs: delete obsolete FI_FIRST_BLOCK_WRITTEN
  f2fs: Restrict max filesize for 16K f2fs
  f2fs: let's finish or reset zones all the time
  f2fs: check write pointers when checkpoint=disable
  f2fs: fix write pointers on zoned device after roll forward
  ...
2024-01-11 20:39:15 -08:00