9248 Commits

Author SHA1 Message Date
Stephen Rothwell
854cbc3cb1 Merge branch 'fs-next' of linux-next 2025-01-15 11:11:23 +11:00
Stephen Rothwell
fdc335b8f6 Merge branch 'mm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm 2025-01-15 10:07:52 +11:00
Stephen Rothwell
bcf9ccab61 Merge branch 'vfs.all' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
# Conflicts:
#	fs/nfsd/filecache.c
2025-01-15 09:23:09 +11:00
Stephen Rothwell
56feeeb631 Merge branch 'for-next' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git 2025-01-15 09:23:06 +11:00
Stephen Rothwell
6651719273 Merge branch 'for_next' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git 2025-01-15 09:22:59 +11:00
Luiz Capitulino
8c3cbdcf4d mm: alloc_pages_bulk: rename API
The previous commit removed the page_list argument from
alloc_pages_bulk_noprof() along with the alloc_pages_bulk_list() function.

Now that only the *_array() flavour of the API remains, we can do the
following renaming (along with the _noprof() ones):

  alloc_pages_bulk_array -> alloc_pages_bulk
  alloc_pages_bulk_array_mempolicy -> alloc_pages_bulk_mempolicy
  alloc_pages_bulk_array_node -> alloc_pages_bulk_node

Link: https://lkml.kernel.org/r/275a3bbc0be20fbe9002297d60045e67ab3d4ada.1734991165.git.luizcap@redhat.com
Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13 22:41:24 -08:00
Mirsad Todorovac
9d9b724726 xfs/libxfs: replace kmalloc() and memcpy() with kmemdup()
The source static analysis tool gave the following advice:

./fs/xfs/libxfs/xfs_dir2.c:382:15-22: WARNING opportunity for kmemdup

 → 382         args->value = kmalloc(len,
   383                          GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_RETRY_MAYFAIL);
   384         if (!args->value)
   385                 return -ENOMEM;
   386
 → 387         memcpy(args->value, name, len);
   388         args->valuelen = len;
   389         return -EEXIST;

Replacing kmalloc() + memcpy() with kmemdump() doesn't change semantics.
Original code works without fault, so this is not a bug fix but proposed improvement.

Link: https://lwn.net/Articles/198928/
Fixes: 94a69db2367ef ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS")
Fixes: 384f3ced07efd ("[XFS] Return case-insensitive match for dentry cache")
Fixes: 2451337dd0439 ("xfs: global error sign conversion")
Cc: Carlos Maiolino <cem@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Chandan Babu R <chandanbabu@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: linux-xfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:58:04 +01:00
Christoph Hellwig
183d988ae9 xfs: constify feature checks
They will eventually be needed to be const for zoned growfs, but even
now having such simpler helpers as const as possible is a good thing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:57:08 +01:00
Christoph Hellwig
dd324cb79e xfs: refactor xfs_fs_statfs
Split out helpers for data, rt data and inode related information, and
assigning f_bavail once instead of in three places.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:30 +01:00
Christoph Hellwig
72843ca624 xfs: don't take m_sb_lock in xfs_fs_statfs
The only non-constant value read under m_sb_lock in xfs_fs_statfs is
sb_dblocks, and it could become stale right after dropping the lock
anyway.  Remove the thus pointless lock section.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:24 +01:00
Christoph Hellwig
f4752daf47 xfs: fix the comment above xfs_discard_endio
pagb_lock has been replaced with eb_lock.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:20 +01:00
Long Li
09f7680dea xfs: remove bp->b_error check in xfs_attr3_root_inactive
The b_error check right after xfs_trans_get_buf() is redundant:

1) If the buffer is found in transaction via xfs_trans_buf_item_match(),
   any corrupted metadata error would have already been exposed during
   previous reads like xfs_da3_node_read().

2) If the buffer is obtained via xfs_buf_get_map():
   - It's called without XBF_READ flag, so won't return buffer with
     b_error set, since xfs_buf_get_map() will clear it anyway.
   - Buffer found in cache normally won't have error since previous reads
     had checked it, unless someone corrupts the buffer and the AIL
     pushes it out to disk while the buffer's unlocked. But in this case,
     AIL will shut down the log.

Remove this redundant check to simplify the code, make the code consistent
with most other xfs_trans_get_buf() callers in XFS.

Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:15 +01:00
Long Li
adcaff355b xfs: remove redundant update for ticket->t_curr_res in xfs_log_ticket_regrant
The current reservation of the log ticket has already been updated a few
lines above in xfs_log_ticket_regrant(), so there is no need to update it
again. This is just a code cleanup with no functional changes.

Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:09 +01:00
Long Li
99fc33d16b xfs: clean up xfs_end_ioend() to reuse local variables
Use already initialized local variables 'offset' and 'size' instead
of accessing ioend members directly in xfs_setfilesize() call.

This is just a code cleanup with no functional changes.

Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:56:02 +01:00
Long Li
efebe42d95 xfs: fix mount hang during primary superblock recovery failure
When mounting an image containing a log with sb modifications that require
log replay, the mount process hang all the time and stack as follows:

  [root@localhost ~]# cat /proc/557/stack
  [<0>] xfs_buftarg_wait+0x31/0x70
  [<0>] xfs_buftarg_drain+0x54/0x350
  [<0>] xfs_mountfs+0x66e/0xe80
  [<0>] xfs_fs_fill_super+0x7f1/0xec0
  [<0>] get_tree_bdev_flags+0x186/0x280
  [<0>] get_tree_bdev+0x18/0x30
  [<0>] xfs_fs_get_tree+0x1d/0x30
  [<0>] vfs_get_tree+0x2d/0x110
  [<0>] path_mount+0xb59/0xfc0
  [<0>] do_mount+0x92/0xc0
  [<0>] __x64_sys_mount+0xc2/0x160
  [<0>] x64_sys_call+0x2de4/0x45c0
  [<0>] do_syscall_64+0xa7/0x240
  [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

During log recovery, while updating the in-memory superblock from the
primary SB buffer, if an error is encountered, such as superblock
corruption occurs or some other reasons, we will proceed to out_release
and release the xfs_buf. However, this is insufficient because the
xfs_buf's log item has already been initialized and the xfs_buf is held
by the buffer log item as follows, the xfs_buf will not be released,
causing the mount thread to hang.

  xlog_recover_do_primary_sb_buffer
    xlog_recover_do_reg_buffer
      xlog_recover_validate_buf_type
        xfs_buf_item_init(bp, mp)

The solution is straightforward, we simply need to allow it to be
handled by the normal buffer write process. The filesystem will be
shutdown before the submission of buffer_list in xlog_do_recovery_pass(),
ensuring the correct release of the xfs_buf as follows:

  xlog_do_recovery_pass
    error = xlog_recover_process
      xlog_recover_process_data
        xlog_recover_process_ophdr
          xlog_recovery_process_trans
            ...
              xlog_recover_buf_commit_pass2
                error = xlog_recover_do_primary_sb_buffer
                  //Encounter error and return
                if (error)
                  goto out_writebuf
                ...
              out_writebuf:
                xfs_buf_delwri_queue(bp, buffer_list) //add bp to list
                return  error
            ...
    if (!list_empty(&buffer_list))
      if (error)
        xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR); //shutdown first
      xfs_buf_delwri_submit(&buffer_list); //submit buffers in list
        __xfs_buf_submit
          if (bp->b_mount->m_log && xlog_is_shutdown(bp->b_mount->m_log))
            xfs_buf_ioend_fail(bp)  //release bp correctly

Fixes: 6a18765b54e2 ("xfs: update the file system geometry after recoverying superblock buffers")
Cc: stable@vger.kernel.org # v6.12
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:55:37 +01:00
Christoph Hellwig
471511d6ef xfs: remove the t_magic field in struct xfs_trans
The t_magic field is only ever assigned to, but never read.  Remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:55:19 +01:00
Christoph Hellwig
415dee1e06 xfs: remove XFS_ILOG_NONCORE
XFS_ILOG_NONCORE is not used in the kernel code or xfsprogs, remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:55:14 +01:00
Christoph Hellwig
23ebf63925 xfs: mark xfs_dir_isempty static
And return bool instead of a boolean condition as int.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-13 14:55:06 +01:00
Carlos Maiolino
156d1c389c xfs: reflink on the realtime device [v6.2 05/14]
This patchset enables use of the file data block sharing feature (i.e.
 reflink) on the realtime device.  It follows the same basic sequence as
 the realtime rmap series -- first a few cleanups; then  introduction of
 the new btree format and inode fork format.  Next comes enabling CoW and
 remapping for the rt device; new scrub, repair, and health reporting
 code; and at the end we implement some code to lengthen write requests
 so that rt extents are always CoWed fully.
 
 This has been running on the djcloud for months with no problems.  Enjoy!
 
 Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZ2nQ6gAKCRBKO3ySh0YR
 phxFAQDsng0DbICm7bQPkjOxCG8SGMv0DRIABYe7kmk4NJuZkQEA/mHiDSA6CLg+
 bmqS9dfMRMSG/hl+9+Vp3e3CsZgMxQ4=
 =dWPC
 -----END PGP SIGNATURE-----

Merge tag 'realtime-reflink_2024-12-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into for-next

xfs: reflink on the realtime device [v6.2 05/14]

This patchset enables use of the file data block sharing feature (i.e.
reflink) on the realtime device.  It follows the same basic sequence as
the realtime rmap series -- first a few cleanups; then  introduction of
the new btree format and inode fork format.  Next comes enabling CoW and
remapping for the rt device; new scrub, repair, and health reporting
code; and at the end we implement some code to lengthen write requests
so that rt extents are always CoWed fully.

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
2025-01-13 14:54:52 +01:00
Carlos Maiolino
a938bbe473 xfs: realtime reverse-mapping support [v6.2 04/14]
This is the latest revision of a patchset that adds to XFS kernel
 support for reverse mapping for the realtime device.  This time around
 I've fixed some of the bitrot that I've noticed over the past few
 months, and most notably have converted rtrmapbt to use the metadata
 inode directory feature instead of burning more space in the superblock.
 
 At the beginning of the set are patches to implement storing B+tree
 leaves in an inode root, since the realtime rmapbt is rooted in an
 inode, unlike the regular rmapbt which is rooted in an AG block.
 Prior to this, the only btree that could be rooted in the inode fork
 was the block mapping btree; if all the extent records fit in the
 inode, format would be switched from 'btree' to 'extents'.
 
 The next few patches enhance the reverse mapping routines to handle
 the parts that are specific to rtgroups -- adding the new btree type,
 adding a new log intent item type, and wiring up the metadata directory
 tree entries.
 
 Finally, implement GETFSMAP with the rtrmapbt and scrub functionality
 for the rtrmapbt and rtbitmap and online fsck functionality.
 
 This has been running on the djcloud for months with no problems.  Enjoy!
 
 Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZ2nQ6gAKCRBKO3ySh0YR
 puNFAP4j44Z3j+WEMSONKXLgl3Goo7Xahm/teGTzSN42l7lUtQEA5zhKcvDgiNIe
 KXxtJcRmHCqq8qaxqcnM6vlcHok92wM=
 =XjiL
 -----END PGP SIGNATURE-----

Merge tag 'realtime-rmap_2024-12-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into for-next

xfs: realtime reverse-mapping support [v6.2 04/14]

This is the latest revision of a patchset that adds to XFS kernel
support for reverse mapping for the realtime device.  This time around
I've fixed some of the bitrot that I've noticed over the past few
months, and most notably have converted rtrmapbt to use the metadata
inode directory feature instead of burning more space in the superblock.

At the beginning of the set are patches to implement storing B+tree
leaves in an inode root, since the realtime rmapbt is rooted in an
inode, unlike the regular rmapbt which is rooted in an AG block.
Prior to this, the only btree that could be rooted in the inode fork
was the block mapping btree; if all the extent records fit in the
inode, format would be switched from 'btree' to 'extents'.

The next few patches enhance the reverse mapping routines to handle
the parts that are specific to rtgroups -- adding the new btree type,
adding a new log intent item type, and wiring up the metadata directory
tree entries.

Finally, implement GETFSMAP with the rtrmapbt and scrub functionality
for the rtrmapbt and rtbitmap and online fsck functionality.

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
2025-01-13 14:54:41 +01:00
Carlos Maiolino
8a092f440e xfs: enable in-core block reservation for rt metadata [v6.2 03/14]
In preparation for adding reverse mapping and refcounting to the
 realtime device, enhance the metadir code to reserve free space for
 btree shape changes as delayed allocation blocks.
 
 This enables us to pre-allocate space for the rmap and refcount btrees
 in the same manner as we do for the data device counterparts, which is
 how we avoid ENOSPC failures when space is low but we've already
 committed to a COW operation.
 
 This has been running on the djcloud for months with no problems.  Enjoy!
 
 Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZ2nQ6gAKCRBKO3ySh0YR
 pluAAP9u4A74HNTi7Df4C70diZIpXynSnyzAxVirQzda/WlUzAD+OjZEo7VkLgG4
 AJIqspcX632tFBFhBAhf3XzqWlM+zws=
 =hOw3
 -----END PGP SIGNATURE-----

Merge tag 'reserve-rt-metadata-space_2024-12-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into for-next

xfs: enable in-core block reservation for rt metadata [v6.2 03/14]

In preparation for adding reverse mapping and refcounting to the
realtime device, enhance the metadir code to reserve free space for
btree shape changes as delayed allocation blocks.

This enables us to pre-allocate space for the rmap and refcount btrees
in the same manner as we do for the data device counterparts, which is
how we avoid ENOSPC failures when space is low but we've already
committed to a COW operation.

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
2025-01-13 14:54:33 +01:00
Carlos Maiolino
9a2ce7254c xfs: refactor btrees to support records in inode root [v6.2 02/14]
Amend the btree code to support storing btree rcords in the inode root,
 because the current bmbt code does not support this.
 
 This has been running on the djcloud for months with no problems.  Enjoy!
 
 Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZ2nQ6QAKCRBKO3ySh0YR
 piSBAP0dmPfWwKtdIx1AQUmfQ4FeNQbSk5PTKLuXIjyfUug9QwEAu7L616xtrgt6
 6/9CBlIUNp3gJriZzLbCjS72F7hssgQ=
 =Jk+2
 -----END PGP SIGNATURE-----

Merge tag 'btree-ifork-records_2024-12-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into for-next

xfs: refactor btrees to support records in inode root [v6.2 02/14]

Amend the btree code to support storing btree rcords in the inode root,
because the current bmbt code does not support this.

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
2025-01-13 14:54:23 +01:00
Carlos Maiolino
69bf6cd7f3 xfs: bug fixes for 6.13 [01/14]
Bug fixes for 6.13.
 
 This has been running on the djcloud for months with no problems.  Enjoy!
 
 Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZ2nQ6QAKCRBKO3ySh0YR
 pjVrAQD9ty5VSpobCR0HRKImR1zCvjreaTzDty+XXnCrYVka1AEA1NbC53RxAPoy
 bW5GJ0ZEvQhamkx9YuBfiJXoqYw1jgI=
 =ZmFb
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.13-fixes_2024-12-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into for-next

xfs: bug fixes for 6.13 [01/14]

Bug fixes for 6.13.

This has been running on the djcloud for months with no problems.  Enjoy!

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
2025-01-13 14:54:14 +01:00
Christian Brauner
7ce932d426
Merge branch 'vfs-6.14.statx.dio' into vfs.all 2025-01-10 16:18:24 +01:00
Darrick J. Wong
111d36d627 xfs: lock dquot buffer before detaching dquot from b_li_list
We have to lock the buffer before we can delete the dquot log item from
the buffer's log item list.

Cc: stable@vger.kernel.org # v6.13-rc3
Fixes: acc8f8628c3737 ("xfs: attach dquot buffer to dquot log item buffer")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-10 10:12:48 +01:00
Christoph Hellwig
468210ec76
xfs: report larger dio alignment for COW inodes
For I/O to reflinked blocks we always need to write an entire new file
system block, and the code enforces the file system block alignment for
the entire file if it has any reflinked blocks.  Mirror the larger
value reported in the statx in the dio_offset_align in the xfs-specific
XFS_IOC_DIOINFO ioctl for the same reason.

Don't bother adding a new field for the read alignment to this legacy
ioctl as all new users should use statx instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250109083109.1441561-6-hch@lst.de
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:23:18 +01:00
Christoph Hellwig
7422bbd030
xfs: report the correct read/write dio alignment for reflinked inodes
For I/O to reflinked blocks we always need to write an entire new
file system block, and the code enforces the file system block alignment
for the entire file if it has any reflinked blocks.

Use the new STATX_DIO_READ_ALIGN flag to report the asymmetric read
vs write alignments for reflinked files.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250109083109.1441561-5-hch@lst.de
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:23:18 +01:00
Christoph Hellwig
7e17483c7b
xfs: cleanup xfs_vn_getattr
Split the two bits of optional statx reporting into their own helpers
so that they are self-contained instead of deeply indented in the main
getattr handler.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250109083109.1441561-4-hch@lst.de
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:23:17 +01:00
Christoph Hellwig
7ee7c9b39e xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RT
Non-rtg file systems have a fake RT group even if they do not have a RT
device, and thus an rgcount of 1.  Ensure xfs_update_last_rtgroup_size
doesn't fail when called for !XFS_RT to handle this case.

Fixes: 87fe4c34a383 ("xfs: create incore realtime group structures")
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-08 15:04:40 +01:00
Darrick J. Wong
155debbe7e xfs: enable realtime reflink
Enable reflink for realtime devices, as long as the realtime allocation
unit is a single fsblock.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:17 -08:00
Darrick J. Wong
fd97fe1112 xfs: fix CoW forks for realtime files
Port the copy on write fork repair to realtime files.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:17 -08:00
Darrick J. Wong
12f4d20328 xfs: check for shared rt extents when rebuilding rt file's data fork
When we're rebuilding the data fork of a realtime file, we need to
cross-reference each mapping with the rt refcount btree to ensure that
the reflink flag is set if there are any shared extents found.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
92b2019493 xfs: repair inodes that have a refcount btree in the data fork
Plumb knowledge of refcount btrees into the inode core repair code.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
83ccffc489 xfs: online repair of the realtime refcount btree
Port the data device's refcount btree repair code to the realtime
refcount btree.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
fe2efe9508 xfs: capture realtime CoW staging extents when rebuilding rt rmapbt
Walk the realtime refcount btree to find the CoW staging extents when
we're rebuilding the realtime rmap btree.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
477493082f xfs: walk the rt reference count tree when rebuilding rmap
When we're rebuilding the data device rmap, if we encounter a "refcount"
format fork, we have to walk the (realtime) refcount btree inode to
build the appropriate mappings.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
6470ceef32 xfs: check new rtbitmap records against rt refcount btree
When we're rebuilding the realtime bitmap, check the proposed free
extents against the rt refcount btree to make sure we don't commit any
grievous errors.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:16 -08:00
Darrick J. Wong
cca34a3054 xfs: don't flag quota rt block usage on rtreflink filesystems
Quota space usage is allowed to exceed the size of the physical storage
when reflink is enabled.  Now that we have reflink for the realtime
volume, apply this same logic to the rtb repair logic.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
ca757af07f xfs: scrub the metadir path of rt refcount btree files
Add a new XFS_SCRUB_METAPATH subtype so that we can scrub the metadata
directory tree path to the refcount btree file for each rt group.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
a9600db96f xfs: detect and repair misaligned rtinherit directory cowextsize hints
If we encounter a directory that has been configured to pass on a CoW
extent size hint to a new realtime file and the hint isn't an integer
multiple of the rt extent size, we should flag the hint for
administrative review and/or turn it off because that is a
misconfiguration.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
48bc170f2c xfs: allow dquot rt block count to exceed rt blocks on reflink fs
Update the quota scrubber to allow dquots where the realtime block count
exceeds the block count of the rt volume if reflink is enabled.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
30f47950dc xfs: check reference counts of gaps between rt refcount records
If there's a gap between records in the rt refcount btree, we ought to
cross-reference the gap with the rtrmap records to make sure that there
aren't any overlapping records for a region that doesn't have any shared
ownership.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
2d9a3e9805 xfs: allow overlapping rtrmapbt records for shared data extents
Allow overlapping realtime reverse mapping records if they both describe
shared data extents and the fs supports reflink on the realtime volume.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:15 -08:00
Darrick J. Wong
91683bb3f2 xfs: cross-reference checks with the rt refcount btree
Use the realtime refcount btree to implement cross-reference checks in
other data structures.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
c27929670d xfs: scrub the realtime refcount btree
Add code to scrub realtime refcount btrees.  Similar to the refcount
btree checking code for the data device, we walk the rmap btree for each
refcount record to confirm that the reference counts are correct.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
026c8ed8d4 xfs: report realtime refcount btree corruption errors to the health system
Whenever we encounter corrupt realtime refcount btree blocks, we should
report that to the health monitoring system for later reporting.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
88a70768df xfs: check that the rtrefcount maxlevels doesn't increase when growing fs
The size of filesystem transaction reservations depends on the maximum
height (maxlevels) of the realtime btrees.  Since we don't want a grow
operation to increase the reservation size enough that we'll fail the
minimum log size checks on the next mount, constrain growfs operations
if they would cause an increase in the rt refcount btree maxlevels.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
8e84e8052b xfs: enable extent size hints for CoW operations
Wire up the copy-on-write extent size hint for realtime files, and
connect it to the rt allocator so that we avoid fragmentation on rt
filesystems.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
4de1a7ba41 xfs: apply rt extent alignment constraints to CoW extsize hint
The copy-on-write extent size hint is subject to the same alignment
constraints as the regular extent size hint.  Since we're in the process
of adding reflink (and therefore CoW) to the realtime device, we must
apply the same scattered rextsize alignment validation strategies to
both hints to deal with the possibility of rextsize changing.

Therefore, fix the inode validator to perform rextsize alignment checks
on regular realtime files, and to remove misaligned directory hints.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:14 -08:00
Darrick J. Wong
6853d23bad xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files
Currently, we (ab)use xfs_get_extsz_hint so that it always returns a
nonzero value for realtime files.  This apparently was done to disable
delayed allocation for realtime files.

However, once we enable realtime reflink, we can also turn on the
alwayscow flag to force CoW writes to realtime files.  In this case, the
logic will incorrectly send the write through the delalloc write path.

Fix this by adjusting the logic slightly.

Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-12-23 13:06:13 -08:00