4144 Commits

Author SHA1 Message Date
Linus Torvalds
8f602276d3 bcachefs fixes for 6.12-rc2
A lot of little fixes, bigger ones include:
 
 - bcachefs's __wait_on_freeing_inode() was broken in rc1 due to vfs
   changes, now fixed along with another lost wakeup
 - fragmentation LRU fixes; fsck now repairs successfully (this is the
   data structure copygc uses); along with some nice simplification.
 - Rework logged op error handling, so that if logged op replay errors
   (due to another filesystem error) we delete the logged op instead of
   going into an infinite loop)
 - Various small filesystem connectivitity repair fixes
 
 The final part of this patch series, fixing snapshots + unlinked file
 handling, is now out on the list - I'm giving that part of the series
 more time for user testing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmcBhkIACgkQE6szbY3K
 bnYt8RAAqZo6RcN91sgz6xGsJkUvE6DS4Rtj1J4vlVAmuiIa5NUhRhqFnS6j8V9A
 AWZw63JwTizrglbLk4Z4knfiViT4GOeiKX4sttaJk7cLW7bxwCUddlho1G5Q7q0I
 PFurYevqG1ltcl5oZpD6LhZiqEhndQI3XnkpEvKsmoXy9TSB4KEqaU8Y+cewjq4q
 KCFuxTBhbmatxP9eTGuDhd6uWw5h0EVDGQyMitEcSutIaernGlSsBQ8gZ5n9dWSd
 lP91qFT5iypmCMo9Arf8Fq1YBvOpV6P91eq8YPa4A3sKDfzHn3CCzsSyjUiGK0RM
 Wcl+kNwqYJa7Fwtb7aGgTVhaMkqLzPTI+XYye3FXrXjJ6B0JKpl2QvvDoFhDxop9
 ZPb57QyRgRBtOvofvFz8fWQOr67n+HNvaMbeG1iwGvqm6/MrgdSLsN6OaRh80uAE
 5P0qX7rwTTOfJj5T6dKLxr3KuXKXNrM5AAIG0MjOMsha232+XUAZvofYNmqx7BMi
 juJvqZc9/GXrcXqdPTYDyBs4UXDkwHsKdr744ooZ64VNiIYFs6eTvXp7V0XuajYH
 ExLrEEjhO2UGPM5N9R9jw9AMsEhJstexgylHQsiiADtdi+jY4LKa/NZAJSJQQC+C
 QQyE3Q7ZCpzRPiGPkkpIY/D7IRoIHL2H+LhbXV/K3oMGdbA7hS4=
 =XnG4
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "A lot of little fixes, bigger ones include:

   - bcachefs's __wait_on_freeing_inode() was broken in rc1 due to vfs
     changes, now fixed along with another lost wakeup

   - fragmentation LRU fixes; fsck now repairs successfully (this is the
     data structure copygc uses); along with some nice simplification.

   - Rework logged op error handling, so that if logged op replay errors
     (due to another filesystem error) we delete the logged op instead
     of going into an infinite loop)

   - Various small filesystem connectivitity repair fixes"

* tag 'bcachefs-2024-10-05' of git://evilpiepirate.org/bcachefs:
  bcachefs: Rework logged op error handling
  bcachefs: Add warn param to subvol_get_snapshot, peek_inode
  bcachefs: Kill snapshot arg to fsck_write_inode()
  bcachefs: Check for unlinked, non-empty dirs in check_inode()
  bcachefs: Check for unlinked inodes with dirents
  bcachefs: Check for directories with no backpointers
  bcachefs: Kill alloc_v4.fragmentation_lru
  bcachefs: minor lru fsck fixes
  bcachefs: Mark more errors AUTOFIX
  bcachefs: Make sure we print error that causes fsck to bail out
  bcachefs: bkey errors are only AUTOFIX during read
  bcachefs: Create lost+found in correct snapshot
  bcachefs: Fix reattach_inode()
  bcachefs: Add missing wakeup to bch2_inode_hash_remove()
  bcachefs: Fix trans_commit disk accounting revert
  bcachefs: Fix bch2_inode_is_open() check
  bcachefs: Fix return type of dirent_points_to_inode_nowarn()
  bcachefs: Fix bad shift in bch2_read_flag_list()
2024-10-05 15:18:04 -07:00
Kent Overstreet
0f25eb4b60 bcachefs: Rework logged op error handling
Initially it was thought that we just wanted to ignore errors from
logged op replay, but it turns out we do need to catch -EROFS, or we'll
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
1f73cb4d34 bcachefs: Add warn param to subvol_get_snapshot, peek_inode
These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
72350ee0ea bcachefs: Kill snapshot arg to fsck_write_inode()
It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.

This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
c9306a91c3 bcachefs: Check for unlinked, non-empty dirs in check_inode()
We want to check for this early so it can be reattached if necessary in
check_unreachable_inodes(); better than letting it be deleted and having
the children reattached, losing their filenames.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
c7da5ee2e5 bcachefs: Check for unlinked inodes with dirents
link count works differently in bcachefs - it's only nonzero for files
with multiple hardlinks, which means we can also avoid checking it
except for files that are known to have hardlinks.

That means we need a few different checks instead; in particular, we
don't want fsck to delet a file that has a dirent pointing to it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
1c6051bbd7 bcachefs: Check for directories with no backpointers
It's legal for regular files to have missing backpointers (due to
hardlinks), and fsck should automatically add them, but for directories
this is an error that should be flagged.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
260af1562e bcachefs: Kill alloc_v4.fragmentation_lru
The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.

It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.

This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
01bf5e3bd2 bcachefs: minor lru fsck fixes
check_lru_key() wasn't using write buffer updates for deleting bad lru
entries - dating from before the lru btree used the btree write buffer.

And when possibly flushing the btree write buffer (to make sure we're
seeing a real inconsistency), we need to be using the modern
bch2_btree_write_buffer_maybe_flush().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
1bea714c53 bcachefs: Mark more errors AUTOFIX
Errors are getting marked as AUTOFIX once they've been (re)-tested and
audited.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
492e24d760 bcachefs: Make sure we print error that causes fsck to bail out
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
658c82f41e bcachefs: bkey errors are only AUTOFIX during read
Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.

Fixes: 5612daafb764 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
fda7b1ffde bcachefs: Create lost+found in correct snapshot
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
20826fe6b8 bcachefs: Fix reattach_inode()
Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
6b63a948a7 bcachefs: Add missing wakeup to bch2_inode_hash_remove()
This fixes two different bugs:

- Looser locking with the rhashtable means we need to recheck if the
  inode is still hashed after prepare_to_wait(), and add a corresponding
  wakeup after removing from the hash table.

- da18ecbf0fb6 ("fs: add i_state helpers") changed the bit waitqueues
  used for inodes, and bcachefs wasn't updated and thus broke; this
  updates bcachefs to the new helper.

Fixes: 112d21fd1a12 ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:31 -04:00
Kent Overstreet
d28786606a bcachefs: Fix trans_commit disk accounting revert
We only are applying JSET_ENTRY_TYPE_write_buffer_keys, revert path was
missed.

Fixes: a3581ca35d2b ("bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:37:42 -04:00
Kent Overstreet
3b1425a4eb bcachefs: Fix bch2_inode_is_open() check
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:31:31 -04:00
Kent Overstreet
abaa6d4f6a bcachefs: Fix return type of dirent_points_to_inode_nowarn()
we're returning an error code now, not a bool

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-02 21:30:55 -04:00
Linus Torvalds
7ec462100e Getting rid of asm/unaligned.h includes
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZv3NAgAKCRBZ7Krx/gZQ
 68kbAP0YzQxUgl0/o7Soda8XwKSPZTM9ls6kRk1UHTTG/i4ZigEA/G+i/mBQctL0
 AB911kK8mxfXppfOXzstFBjoJSqiigQ=
 =IE7D
 -----END PGP SIGNATURE-----

Merge tag 'pull-work.unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull generic unaligned.h cleanups from Al Viro:
 "Get rid of architecture-specific <asm/unaligned.h> includes, replacing
  them with a single generic <linux/unaligned.h> header file.

  It's the second largest (after asm/io.h) class of asm/* includes, and
  all but two architectures actually end up using exact same file.

  Massage the remaining two (arc and parisc) to do the same and just
  move the thing to from asm-generic/unaligned.h to linux/unaligned.h"

[ This is one of those things that we're better off doing outside the
  merge window, and would only cause extra conflict noise if it was in
  linux-next for the next release due to all the trivial #include line
  updates.  Rip off the band-aid.   - Linus ]

* tag 'pull-work.unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  move asm/unaligned.h to linux/unaligned.h
  arc: get rid of private asm/unaligned.h
  parisc: get rid of private asm/unaligned.h
2024-10-02 16:42:28 -07:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Kent Overstreet
e764e68103 bcachefs: Fix bad shift in bch2_read_flag_list()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-01 17:20:24 -04:00
Guenter Roeck
2007d28ec0 bcachefs: rename version -> bversion for big endian builds
Builds on big endian systems fail as follows.

fs/bcachefs/bkey.h: In function 'bch2_bkey_format_add_key':
fs/bcachefs/bkey.h:557:41: error:
	'const struct bkey' has no member named 'bversion'

The original commit only renamed the variable for little endian builds.
Rename it for big endian builds as well to fix the problem.

Fixes: cf49f8a8c277 ("bcachefs: rename version -> bversion")
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-29 23:55:52 -04:00
Linus Torvalds
9f9a534724 bcachefs fixes for 6.11-rc1
Assorted minor syzbot fixes, and for bigger stuff:
 
 - Fix two disk accounting rewrite bugs
  - Disk accounting keys use the version field of bkey so that journal
    replay can tell which updates have been applied to the btree. This is
    set in the transaction commit path, after we've gotten our journal
    reservation (and our time ordering), but the
    BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay uses
    was incorrectly skipping this for new updates generated prior to
    journal replay.
 
    This fixes the underlying cause of an assertion pop in
    disk_accounting_read.
 
  - A couple fixes for disk accounting + device removal. Checking if
    acocunting replicas entries were marked in the superblock was being
    done at the wrong point, when deltas in the journal could still zero
    them out, and then additionally we'd try to add a missing replicas
    entry to the superblock without checking if it referred to an invalid
    (removed) device.
 
 - A whole slew of repair fixes
  - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
    an infinite loop when repairing a filesystem with many snapshots
  - fix incorrect transaction restart handling leading to occasional
    "fsck counted ..." warnings"
  - fix warning in __bch2_fsck_err() for bkey fsck errors
  - check_inode() in fsck now correctly checks if the filesystem was
    clean
  - there shouldn't be pending logged ops if the fs was clean, we now
    check for this
  - remove_backpointer() doesn't remove a dirent that doesn't actually
    point to the inode
  - many more fsck errors are AUTOFIX
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmb4QtsACgkQE6szbY3K
 bnYx4A//bhGgZYgP55FxduuxUH8XjX2eOnXwuPv/MmYO/4oCok5VBa9bRDTVXhIK
 PtY4pP2IJZ3+u963mwbwJAawsPA01AEEty9tE+AdXbltDRQ03I33OEuIy0HFIso2
 s8VBkVPbru6yU4RCCvYNIVvRG/9GOL+J0GgrR1t05zHVyKXe1FuS00Yq5+z3niNP
 HtuGTsD273Nnhikz47bqyD+M6VizU+uzSUFLgnB3zrzpb+gPSGETSwgc4ggajlM4
 2P10Vc4L/Nb3KYV9RW+C3WpRfUR/o8BZA3wjJfNo0JeA4iDaUbltSjpCA07EcAnA
 3D6Omzqkm4aobL2WlvioT0UhZx4t8X/8x5t5F9HyX52i1k+g87oMT9/KIKec1Dzd
 8vQCwCdXFfWaLSZoOJsHyIljip7BuRLKhWwKosdzzLIAnRQy5StxAhsG99fNStu6
 JOWICPNCn1b6SkktnoKou1unL+K5RczeNfAxMAjcJjTD7IIAmytLe4mdRbP9q+Oa
 x8no7pttbb4JnoRvfo42GVz8KWQR07oN/Zy7mH3K4Y0Ix+xDOrLqlfLIDLGpxMNv
 HZz+UPchdlfpYJO+nTLoAOGXZWnKDqg70SAEcWKDc82Ri4vNOhraYDZvXrzl9qE+
 63RPzqDbg3uXGxLYMvujjPe610QkPxS9zKKyDvUZZx0ZiUX4CjI=
 =cdrz
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Assorted minor syzbot fixes, and for bigger stuff:

  Fix two disk accounting rewrite bugs:

   - Disk accounting keys use the version field of bkey so that journal
     replay can tell which updates have been applied to the btree.

     This is set in the transaction commit path, after we've gotten our
     journal reservation (and our time ordering), but the
     BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
     uses was incorrectly skipping this for new updates generated prior
     to journal replay.

     This fixes the underlying cause of an assertion pop in
     disk_accounting_read.

   - A couple of fixes for disk accounting + device removal.

     Checking if acocunting replicas entries were marked in the
     superblock was being done at the wrong point, when deltas in the
     journal could still zero them out, and then additionally we'd try
     to add a missing replicas entry to the superblock without checking
     if it referred to an invalid (removed) device.

  A whole slew of repair fixes:

   - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
     an infinite loop when repairing a filesystem with many snapshots

   - fix incorrect transaction restart handling leading to occasional
     "fsck counted ..." warnings

   - fix warning in __bch2_fsck_err() for bkey fsck errors

   - check_inode() in fsck now correctly checks if the filesystem was
     clean

   - there shouldn't be pending logged ops if the fs was clean, we now
     check for this

   - remove_backpointer() doesn't remove a dirent that doesn't actually
     point to the inode

   - many more fsck errors are AUTOFIX"

* tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
  bcachefs: check_subvol_path() now prints subvol root inode
  bcachefs: remove_backpointer() now checks if dirent points to inode
  bcachefs: dirent_points_to_inode() now warns on mismatch
  bcachefs: Fix lost wake up
  bcachefs: Check for logged ops when clean
  bcachefs: BCH_FS_clean_recovery
  bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
  bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
  bcachefs: Check for accounting keys with bversion=0
  bcachefs: rename version -> bversion
  bcachefs: Don't delete unlinked inodes before logged op resume
  bcachefs: Fix BCH_SB_ERRS() so we can reorder
  bcachefs: Fix fsck warnings from bkey validation
  bcachefs: Move transaction commit path validation to as late as possible
  bcachefs: Fix disk accounting attempting to mark invalid replicas entry
  bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
  bcachefs: Fix accounting read + device removal
  bcachefs: bch_accounting_mode
  bcachefs: fix transaction restart handling in check_extents(), check_dirents()
  bcachefs: kill inode_walker_entry.seen_this_pos
  ...
2024-09-29 09:17:44 -07:00
Kent Overstreet
3a5895e3ac bcachefs: check_subvol_path() now prints subvol root inode
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:23 -04:00
Kent Overstreet
0b0f0ad93c bcachefs: remove_backpointer() now checks if dirent points to inode
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:23 -04:00
Kent Overstreet
a6508079b1 bcachefs: dirent_points_to_inode() now warns on mismatch
if an inode backpointer points to a dirent that doesn't point back,
that's an error we should warn about.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:23 -04:00
Alan Huang
e057a290ef bcachefs: Fix lost wake up
If the reader acquires the read lock and then the writer enters the slow
path, while the reader proceeds to the unlock path, the following scenario
can occur without the change:

writer: pcpu_read_count(lock) return 1 (so __do_six_trylock will return 0)
reader: this_cpu_dec(*lock->readers)
reader: smp_mb()
reader: state = atomic_read(&lock->state) (there is no waiting flag set)
writer: six_set_bitmask()

then the writer will sleep forever.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:23 -04:00
Kent Overstreet
d50d7a5fa4 bcachefs: Check for logged ops when clean
If we shut down successfully, there shouldn't be any logged ops to
resume.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:22 -04:00
Kent Overstreet
1c0ee43b2c bcachefs: BCH_FS_clean_recovery
Add a filesystem flag to indicate whether we did a clean recovery -
using c->sb.clean after we've got rw is incorrect, since c->sb is
updated whenever we write the superblock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:22 -04:00
Kent Overstreet
9773547b16 bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
We had a bug where disk accounting keys didn't always have their version
field set in journal replay; change the BUG_ON() to a WARN(), and
exclude this case since it's now checked for elsewhere (in the bkey
validate function).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:22 -04:00
Kent Overstreet
a3581ca35d bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
This was added to avoid double-counting accounting keys in journal
replay. But applied incorrectly (easily done since it applies to the
transaction commit, not a particular update), it leads to skipping
in-mem accounting for real accounting updates, and failure to give them
a version number - which leads to journal replay becoming very confused
the next time around.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 22:32:20 -04:00
Kent Overstreet
f8911ad88d bcachefs: Check for accounting keys with bversion=0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
cf49f8a8c2 bcachefs: rename version -> bversion
give bversions a more distinct name, to aid in grepping

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
fd65378db9 bcachefs: Don't delete unlinked inodes before logged op resume
Previously, check_inode() would delete unlinked inodes if they weren't
on the deleted list - this code dating from before there was a deleted
list.

But, if we crash during a logged op (truncate or finsert/fcollapse) of
an unlinked file, logged op resume will get confused if the inode has
already been deleted - instead, just add it to the deleted list if it
needs to be there; delete_dead_inodes runs after logged op resume.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
8d65b15f8d bcachefs: Fix BCH_SB_ERRS() so we can reorder
BCH_SB_ERRS() has a field for the actual enum val so that we can reorder
to reorganize, but the way BCH_SB_ERR_MAX was defined didn't allow for
this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
5612daafb7 bcachefs: Fix fsck warnings from bkey validation
__bch2_fsck_err() warns if the current task has a btree_trans object and
it wasn't passed in, because if it has to prompt for user input it has
to be able to unlock it.

But plumbing the btree_trans through bkey_validate(), as well as
transaction restarts, is problematic - so instead make bkey fsck errors
FSCK_AUTOFIX, which doesn't need to warn.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
7c980a43e9 bcachefs: Move transaction commit path validation to as late as possible
In order to check for accounting keys with version=0, we need to run
validation after they've been assigned version numbers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
431312b59c bcachefs: Fix disk accounting attempting to mark invalid replicas entry
This fixes the following bug, where a disk accounting key has an invalid
replicas entry, and we attempt to add it to the superblock:

bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): starting version 1.12: rebalance_work_acct_fix opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,nopromote_whole_extents,verbose,fsck,fix_errors=yes
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): recovering from clean shutdown, journal seq 15211644
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): accounting_read...
accounting not marked in superblock replicas
  replicas cached: 1/1 [0], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 88):
user: 2 [3 5] user: 2 [1 4] cached: 1 [2] btree: 2 [1 2] user: 2 [2 5] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 2 [1 2] user: 2 [2 3] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 3] user: 2 [1 5] user: 2 [2 4]

bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): inconsistency detected - emergency read only at journal seq 15211644
accounting not marked in superblock replicas
  replicas user: 1/1 [3], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 96):
user: 2 [3 5] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5]

accounting not marked in superblock replicas
  replicas user: 1/2 [3 7], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 7 in entry user: 1/2 [3 7]
replicas_v0 (size 96):
user: 2 [3 7] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5] user: 2 [3 5]

 done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): alloc_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): stripes_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): snapshots_read... done

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
49fd90b2cc bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
9104fc1928 bcachefs: Fix accounting read + device removal
accounting read was checking if accounting replicas entries were marked
in the superblock prior to applying accounting from the journal,
which meant that a recently removed device could spuriously trigger a
"not marked in superblocked" error (when journal entries zero out the
offending counter).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
1e0272ef47 bcachefs: bch_accounting_mode
Minor refactoring - replace multiple bool arguments with an enum; prep
work for fixing a bug in accounting read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
3672bda8f5 bcachefs: fix transaction restart handling in check_extents(), check_dirents()
Dealing with outside state within a btree transaction is always tricky.

check_extents() and check_dirents() have to accumulate counters for
i_sectors and i_nlink (for subdirectories). There were two bugs:

- transaction commit may return a restart; therefore we have to commit
  before accumulating to those counters
- get_inode_all_snapshots() may return a transaction restart, before
  updating w->last_pos; then, on the restart,
  check_i_sectors()/check_subdir_count() would see inodes that were not
  for w->last_pos

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:35 -04:00
Kent Overstreet
22a507d68e bcachefs: kill inode_walker_entry.seen_this_pos
dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Kent Overstreet
b29c30ab48 bcachefs: Fix incorrect IS_ERR_OR_NULL usage
Returning a positive integer instead of an error code causes error paths
to become very confused.

Closes: syzbot+c0360e8367d6d8d04a66@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Hongbo Li
dc5bfdf8ea bcachefs: fix the memory leak in exception case
The pointer clean points the memory allocated by kmemdup, when the
return value of bch2_sb_clean_validate_late is not zero. The memory
pointed by clean is leaked. So we should free it in this case.

Fixes: a37ad1a3aba9 ("bcachefs: sb-clean.c")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Hongbo Li
3125c95ea6 bcachefs: fast exit when darray_make_room failed
In downgrade_table_extra, the return value is needed. When it
return failed, we should exit immediately.

Fixes: 7773df19c35f ("bcachefs: metadata version bucket_stripe_sectors")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Kent Overstreet
951dd86e7c bcachefs: Fix iterator leak in check_subvol()
A couple small error handling fixes

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Kent Overstreet
2a1df87346 bcachefs: Add snapshot to bch_inode_unpacked
this allows for various cleanups in fsck

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Diogo Jahchan Koike
40d40c6bea bcachefs: assign return error when iterating through layout
syzbot reported a null ptr deref in __copy_user [0]

In __bch2_read_super, when a corrupt backup superblock matches the
default opts offset, no error is assigned to ret and the freed superblock
gets through, possibly being assigned as the best sb in bch2_fs_open and
being later dereferenced, causing a fault. Assign EINVALID to ret when
iterating through layout.

[0]: https://syzkaller.appspot.com/bug?extid=18a5c5e8a9c856944876

Reported-by: syzbot+18a5c5e8a9c856944876@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=18a5c5e8a9c856944876
Signed-off-by: Diogo Jahchan Koike <djahchankoike@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Kent Overstreet
c6040447c5 bcachefs: Fix srcu warning in check_topology
check_topology doesn't need the srcu lock and doesn't use normal btree
transactions - we can just drop the srcu lock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00