linux-next

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git synced 2025-01-04 04:02:26 +00:00

Author	SHA1	Message	Date
Kent Overstreet	3ac87fa03f	bcachefs: logged ops only use inum 0 of logged ops btree we wish to use the logged ops btree for other items that aren't strictly logged ops: cursors for inode allocation There's no reason to create another cached btree for inode allocator cursors - so reserve different parts of the keyspace for different purposes. Older versions will ignore or delete the cursors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	354ae858ba	bcachefs: rcu_pending now works in userspace Introduce a typedef to handle the difference between unsigned long/struct urcu_gp_poll_state. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Geert Uytterhoeven	8a2713582e	bcachefs: BCACHEFS_PATH_TRACEPOINTS should depend on TRACING When tracing is disabled, there is no point in asking the user about enabling extra btree_path tracepoints in bcachefs. Fixes: `32ed4a620c` ("bcachefs: Btree path tracepoints") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	e37f4286d4	bcachefs: Fix allocating too big journal entry The "journal space available" calculations didn't take into account mismatched bucket sizes; we need to take the minimum space available out of our devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	030d6ebb78	bcachefs: Improve "unable to allocate journal write" message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	a5b377f773	bcachefs: fix bch2_journal_key_insert_take() seq Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	0b5819b73c	bcachefs: bch2_async_btree_node_rewrites_flush() Add a method to flush btree node rewrites at the end of recovery, to ensure that corrected errors are persisted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	247a12f3a2	bcachefs: If we did repair on a btree node, make sure we rewrite it Ensure that "invalid bkey" repair gets persisted, so that it doesn't repeatedly spam the logs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	28d5570cd2	bcachefs: bkey_fsck_err now respects errors_silent Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	5e41519938	bcachefs: list_pop_entry() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	67434cd4b7	bcachefs: Convert write path errors to inum_to_path() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-14 22:46:14 -05:00
Kent Overstreet	2ab8d31989	bcachefs: bch2_inum_to_path() Add a function for walking backpointers to find a path from a given inode number, and convert various error messages to use it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Kent Overstreet	428a2c2d6b	bcachefs: Fix fsck.c build in userspace Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Yang Li	9a956407c2	bcachefs: Add missing parameter description to bch2_bucket_alloc_trans() The function bch2_bucket_alloc_trans() lacked a description for the nowait parameter in its documentation comment block. This patch adds the missing description to ensure all parameters are properly documented. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=12179 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Kent Overstreet	5bb89aa54d	bcachefs: Don't recurse in check_discard_freespace_key When calling check_discard_freespace_key from the allocator, we can't repair without recursing - run it asynchronously instead. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Kent Overstreet	12b2baa0b5	bcachefs: Check for extent crc uncompressed/compressed size mismatch When not compressed, these must be equal - this fixes an assertion pop in bch2_rechecksum_bio(). Reported-by: syzbot+50d3544c9b8db9c99fd2@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Kent Overstreet	4271983e79	bcachefs: bch2_trans_relock() is trylock for lockdep fix some spurious lockdep splats Reported-by: syzbot+e088be3c2d5c05aaac35@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:04 -05:00
Kent Overstreet	1b1f8623fb	bcachefs: cryptographic MACs on superblock are not (yet?) supported We should add support for cryptographic MACs on the superblock - and it won't be hard, but it'll need an incompatible feature bit (and we have a new incompatible feature versioning scheme coming). For now, just add a guard to avoid a null ptr deref in gen_poly_key(). Reported-by: syzbot+dd3d9835055dacb66f35@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	9d59eb0be2	bcachefs: Check for inode journal seq in the future More check and repair code: this fixes a warning in bch2_journal_flush_seq_async() Reported-by: syzbot+d119b445ec739e7f3068@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	6efc86ff29	bcachefs: Check for bucket journal seq in the future This fixes an assertion pop in bch2_journal_noflush_seq() - log the error to the superblock and continue instead. Reported-by: syzbot+85700120f75fc10d4e18@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	422310542e	bcachefs: do_fsck_ask_yn() __bch2_fsck_err() is huge, and badly needs more refactoring Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	ca04ac9a4a	bcachefs: Don't error out when logging fsck error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	b7b7f5ab55	bcachefs: mark more errors AUTOFIX mark errors as autofix where syzbot has hit the repair paths Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	1a7e03622b	bcachefs: add missing printbuf_reset() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	5a1b4c8d17	bcachefs: Fix journal_iter list corruption Fix exiting an iterator that wasn't initialized. Reported-by: syzbot+2f7c2225ed8a5cb24af1@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	fbb140ee45	bcachefs: Guard against backpointers to unknown btrees Reported-by: syzbot+997f0573004dcb964555@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	6248d420a9	bcachefs: Issue a transaction restart after commit in repair transaction commits invalidate pointers to btree values, and they also downgrade intent locks. This breaks the interior btree update path, which takes intent locks and then calls into the allocator. This isn't an ideal solution: we can't unconditionally issue a restart after a transaction commit, because that would break other codepaths. Reported-by: syzbot+78d82470c16a49702682@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	a5d7cf3466	bcachefs: Guard against journal seq overflow Wraparound is impractical to handle since in various places we use 0 as a sentinel value - but 64 bits (or 56, because the btree write buffer steals a few bits) is enough for all practical purposes. Reported-by: syzbot+73ed43fbe826227bd4e0@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	8afb03592f	bcachefs: BCH_FS_recovery_running If we're autofixing topology errors, we shouldn't shutdown if we're still in recovery. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	502a010a6c	bcachefs: Make topology errors autofix These repair paths are well tested, so we can repair them without explicit user intervention. This also tweaks bch2_topology_error() so that we run topology repair if we're in recovery, not just fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	4d13c89412	bcachefs: struct bkey_validate_context Add a new parameter to bkey validate functions, and use it to improve invalid bkey error messages: we can now print the btree and depth it came from, or if it came from the journal, or is a btree root. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	aa492d5318	bcachefs: Ignore empty btree root journal entries There's no reason to treat them as errors: just ignore them, and go with a previous btree root if we had one. Reported-by: syzbot+e22007d6acb9c87c2362@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	33213a5be1	bcachefs: Fix null ptr deref in btree_path_lock_root() Historically, we required that all btree node roots point to a valid (possibly fake) node, but we're improving our ability to continue in the presence of errors. Reported-by: syzbot+e22007d6acb9c87c2362@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	839c29d574	bcachefs: Go RW earlier, for normal rw mount Previously, when mounting read-write after a clean shutdown, we wouldn't go read-write until after all the recovery passes completed. Now, go RW early in recovery, the same as any other situation we'll need to go read-write. This fixes a bug where we discover unlinked inodes after a clean shutdown: repair fails because we're read only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	d6dd534eb3	bcachefs: Fix bch2_btree_node_update_key_early() Fix an assertion pop from the recent btree cache freelist fixes. Fixes: `baefd3f849` ("bcachefs: btree_cache.freeable list fixes") Reported-by: Tyler <th020394@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	72177d492d	bcachefs: Change "disk accounting version 0" check to commit only 6.11 had a bug where we'd sometimes create disk accounting keys with version 0, which causes issues for journal replay - but we don't need to delete existing accounting keys with version 0. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	ec3ca7c9e0	bcachefs: Don't try to en/decrypt when encryption not available If a btree node says it's encrypted, but the superblock never had an encryption key - whoops, that needs to be handled. Reported-by: syzbot+026f1857b12f5eb3f9e9@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	2d66d3160d	bcachefs: Fix dup/misordered check in btree node read We were checking for out of order keys, but not duplicate keys. Reported-by: syzbot+dedbd67513939979f84f@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:03 -05:00
Kent Overstreet	46522a75a4	bcachefs: Bad btree roots are now autofix Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	873a885d1a	bcachefs: Kill bch2_bucket_alloc_new_fs() The early-early allocation path, bch2_bucket_alloc_new_fs(), is no longer needed - and inconsistencies around new_fs_bucket_idx have been a frequent source of bugs. Reported-by: syzbot+592425844580a6598410@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	3307caf863	bcachefs: Fix btree node scan when unknown btree IDs are present btree_root entries for unknown btree IDs are created during recovery, before reading those btree roots. But btree_node_scan may find btree nodes with unknown btree IDs when we haven't seen roots for those btrees. Reported-by: syzbot+1f202d4da221ec6ebf8e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	658ca21817	bcachefs: backpointer_to_missing_ptr is now autofix Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	ba91f39cd4	bcachefs: Fix accounting_read when we rewind If we rewind recovery to run topology repair, that causes accounting_read to run twice. This fixes accounting being double counted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	8b53739160	bcachefs: disk_accounting: bch2_dev_rcu -> bch2_dev_rcu_noerror Accounting keys that reference invalid devices are corrected by fsck, they shouldn't cause an emergency shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	f3542deaa9	bcachefs: errcode cleanup: journal errors Instead of throwing standard error codes, we should be throwing dedicated private error codes; this greatly improves debuggability. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	3ed349d91e	bcachefs: Use separate rhltable for bch2_inode_or_descendents_is_open() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	2ae6c5e05d	bcachefs: BCH_ERR_btree_node_read_error_cached Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	0e796cf804	bcachefs: btree_write_buffer_flush_seq() no longer closes journal Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	4e378cabba	bcachefs: discard fastpath now uses bch2_discard_one_bucket() The discard bucket fastpath previously was using its own code for discarding buckets and clearing them in the need_discard btree, which didn't have any of the consistency checks of the main discard path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	2c9a60bc31	bcachefs: Bias reads more in favor of faster device Per reports of performance issues on mixed multi device filesystems where we're issuing too much IO to the spinning rust - tweak this algorithm. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	709336f96d	bcachefs: trivial btree write buffer refactoring Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	01d8d04564	bcachefs: Can now block journal activity without closing cur entry Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	1f3c4ab3fb	bcachefs: New backpointers helpers - bch2_backpointer_del() - bch2_backpointer_maybe_flush() Kill a bit of open coding and make sure we're properly handling the btree write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	da89857b5f	bcachefs: kill bch_backpointer.bucket_offset usage bch_backpointer.bucket_offset is going away - it's no longer needed since we no longer store backpointers in alloc keys, the same information is in the key position itself. And we'll be reclaiming the space in bch_backpointer for the bucket generation number. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	283dcbb80c	bcachefs: Fix check_backpointers_to_extents range limiting bch2_get_btree_in_memory_pos() will return positions that refer directly to the btree it's checking will fit in memory - i.e. backpointer positions, not buckets. This also means check_bp_exists() no longer has to refer to the device, and we can delete some code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	ad5834890f	bcachefs: bch_backpointer -> bkey_i_backpointer Since we no longer store backpointers in alloc keys, there's no reason not to pass around bkey_i_backpointers; this means we don't have to pass the bucket pos separately. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	165ca83f55	bcachefs: Drop swab code for backpointers in alloc keys Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	bbc2ccccfd	bcachefs: bucket_pos_to_bp_end() Better helpers for iterating over backpointers within a specific bucket Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:02 -05:00
Kent Overstreet	3f2e467845	bcachefs: check for backpointers to invalid device Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:01 -05:00
Kent Overstreet	62b185571a	bcachefs: fix bp_pos_to_bucket_nodev_noerror _noerror means don't produce inconsistent errors, so it should be using bch2_dev_rcu_noerror(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:19:01 -05:00
Kent Overstreet	16de129896	bcachefs: Fix evacuate_bucket tracepoint `86a494c8ee` ("bcachefs: Kill bch2_get_next_backpointer()") dropped some things the tracepoint emitted because bch2_evacuate_bucket() no longer looks at the alloc key - but we did want at least some of that. We still no longer look at the alloc key so we can't report on the fragmentation number, but that's a direct function of dirty_sectors and a copygc concern anyways - copygc should get its own tracepoint that includes information from the fragmentation LRU. But we can report on the number of sectors we moved and the bucket size. Co-developed-by: Piotr Zalewski <pZ010001011111@proton.me> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-09 06:18:49 -05:00
Kent Overstreet	92084feca4	bcachefs: fix O(n^2) issue with whiteouts in journal keys The journal_keys array can't be substantially modified after we go RW, because lookups need to be able to check it locklessly - thus we're limited on what we can do when a key in the journal has been overwritten. This is a problem when there's many overwrites to skip over for peek() operations. To fix this, add tracking of ranges of overwrites: we create a range entry when there's more than one contiguous whiteout. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:19 -05:00
Kent Overstreet	1a8f5adc20	bcachefs: btree_and_journal_iter: don't iterate over too many whiteouts when prefetching To help ameliorate issues with peek operations having to skip over deletions in the journal - just bail out if all we're doing is prefetching btree nodes. Since btree node prefetching runs every time we iterate to a new node, and has to sequentially scan ahead, this avoids another O(n^2). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:19 -05:00
Kent Overstreet	1d1374a083	bcachefs: journal keys: sort keys for interior nodes first There's an unavoidable issue with btree lookups when we're overlaying journal keys and the journal has many deletions for keys present in the btree - peek operations will have to iterate over all those deletions to find the next live key to return. This is mainly a problem for lookups in interior nodes, if we have to traverse to a leaf. Looking up an insert position in a leaf (for journal replay) doesn't have to find the next live key, but walking down the btree does. So to ameliorate this, change journal key sort ordering so that we replay keys from roots and interior nodes first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:19 -05:00
Kent Overstreet	ed144047ef	bcachefs: kill bch2_journal_entries_free() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:19 -05:00
Kent Overstreet	b287adb628	bcachefs: Don't BUG_ON() when superblock feature wasn't set for compressed data We don't allocate the mempools for compression/decompression unless we need them - but that means there's an inconsistency to check for. Reported-by: syzbot+cb3fbcfb417448cfd278@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:19 -05:00
Kent Overstreet	3a1897837a	bcachefs: Don't use a shared decompress workspace mempool gzip and zstd require different decompress workspace sizes, and if we start with one and then start using the other at runtime we may not get the correct size. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	3c0fc088af	bcachefs: compression workspaces should be indexed by opt, not type type includes lz4 and lz4_old, which do not get different compression workspaces, and incompressible, a fake type - BCH_COMPRESSION_OPTS() is the correct enum to use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	d5b149f310	bcachefs: add missing BTREE_ITER_intent this fixes excessive transaction restarts due to trans_commit having to upgrade Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	7fdfb0cbea	bcachefs: Kill bch2_get_next_backpointer() Since for quite some time backpointers have only been stored in the backpointers btree, not alloc keys (an aborted experiment, support for which has been removed) - we can replace get_next_backpointer() with simple btree iteration. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	59fad23c7a	bcachefs: Delete backpointers check in try_alloc_bucket() try_alloc_bucket() has a "safety" check, which avoids allocating a bucket if there's any backpointers present. But backpointers are not the source of truth for live data in a bucket, the bucket sector counts are; this check was fairly useless, and we're also deferring backpointers checks from fsck to runtime in the near future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	3140d0052a	bcachefs: peek_prev_min(): Search forwards for extents, snapshots With extents and snapshots, for slightly different reasons, we may have to search forwards to find a key that compares equal to iter->pos (i.e. a key that peek_prev() should return, as it returns keys <= iter->pos). peek_slot() does this, and is an easy way to fix this case. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	632bcf3865	bcachefs: Implement bch2_btree_iter_prev_min() A user contributed a filesystem dump, where the dump was actually corrupted (due to being taken while the filesystem was online), but which exposed an interesting bug in fsck - reconstruct_inode(). When iterating in BTREE_ITER_filter_snapshots mode, it's required to give an end position for the iteration and it can't span inode numbers; continuing into the next inode might mean we start seeing keys from a different snapshot tree, that the is_ancestor() checks always filter, thus we're never able to return a key and stop iterating. Backwards iteration never implemented the end position because nothing else needed it - except for reconstruct_inode(). Additionally, backwards iteration is now able to overlay keys from the journal, which will be useful if we ever decide to start doing journal replay in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	a7df326af0	bcachefs: discard_one_bucket() now uses need_discard_or_freespace_err() More conversion of inconsistent errors to fsck errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	cbc079bcff	bcachefs: bch2_bucket_do_index(): inconsistent_err -> fsck_err Factor out a common helper, need_discard_or_freespace_err(), which is now used by both fsck and the runtime checks, and can repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	c51b601907	bcachefs: try_alloc_bucket() now uses bch2_check_discard_freespace_key() check_discard_freespace_key() was doing all the same checks as try_alloc_bucket(), but with repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	ecadaf9ae3	bcachefs: rework bch2_bucket_alloc_freelist() freelist iteration Prep work for converting try_alloc_bucket() to use bch2_check_discard_freespace_key(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	3de116ce17	bcachefs: kill inconsistent err in invalidate_one_bucket() Change it to a normal fsck_err() - meaning it'll get repaired at runtime when that's flipped on. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	df4270ccd3	bcachefs: Don't delete reflink pointers to missing indirect extents To avoid tragic loss in the event of transient errors (i.e., a btree node topology error that was later corrected by btree node scan), we can't delete reflink pointers to correct errors. This adds a new error bit to bch_reflink_p, indicating that it is known to point to a missing indirect extent, and the error has already been reported. Indirect extent lookups now use bch2_lookup_indirect_extent(), which on error reports it as a fsck_err() and sets the error bit, and clears it if necessary on successful lookup. This also gets rid of the bch2_inconsistent_error() call in __bch2_read_indirect_extent, and in the reflink_p trigger: part of the online self healing project. An on disk format change isn't necessary here: setting the error bit will be interpreted by older versions as pointing to a different index, which will also be missing - which is fine. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:18 -05:00
Kent Overstreet	6cf666ffb5	bcachefs: Reorganize reflink.c a bit Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	b36ff5dc08	bcachefs: Reserve 8 bits in bch_reflink_p Better repair for reflink pointers, as well as propagating new inode options to indirect extents, are going to require a few extra bits in bch_reflink_p: so claim a few from the high end of the destination index. Also add some missing bounds checking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	4a370320dc	bcachefs: Kill FSCK_NEED_FSCK If we find an error that indicates that we need to run fsck, we can specify that directly with run_explicit_recovery_pass(). These are now log_fsck_err() calls: we're just logging in the superblock that an error occurred - and possibly doing an emergency shutdown, depending on policy. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	d6ad842cf7	bcachefs: lru errors are expected when reconstructing alloc Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	7ad0ba0e18	bcachefs: Delete dead code from bch2_discard_one_bucket() alloc key validation ensures that if a bucket is in need_discard state the sector counts are all zero - we don't have to check for that. The NEED_INC_GEN check appears to be dead code, as well: we only see buckets in the need_discard btree, and it's an error if they aren't in the need_discard state. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	7fda5f1508	bcachefs: bch2_btree_bit_mod_iter() factor out a new helper, make it handle extents bitset btrees (freespace). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	2228901e43	bcachefs: delete dead code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	a0678f9c85	bcachefs: Fix shutdown message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	53f02a6929	bcachefs: Don't use page allocator for sb_read_scratch Kill another unnecessary dependency on PAGE_SIZE Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Youling Tang	c03828056d	bcachefs: Simplify code in bch2_dev_alloc() - Remove unnecessary variable 'ret'. - Remove unnecessary bch2_dev_free() operations. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Youling Tang	db6b114bd5	bcachefs: Remove redundant initialization in bch2_vfs_inode_init() `inode->v.i_ino` has been initialized to `inum.inum`. If `inum.inum` and `bi->bi_inum` are not equal, BUG_ON() is triggered in bch2_inode_update_after_write(). Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Youling Tang	0fe251fd82	bcachefs: Removes NULL pointer checks for __filemap_get_folio return values The return value of __filemap_get_folio() cannot be NULL, so unnecessary checks are removed. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:17 -05:00
Kent Overstreet	797a14eb7d	bcachefs: Add support for FS_IOC_GETFSSYSFSPATH [TEST]: ``` $ cat ioctl_getsysfspath.c #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/fs.h> #include <unistd.h> int main(int argc, char *argv[]) { int fd; struct fs_sysfs_path sysfs_path = {}; if (argc != 2) { fprintf(stderr, "Usage: %s <path_to_file_or_directory>\n", argv[0]); exit(EXIT_FAILURE); } fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("open"); exit(EXIT_FAILURE); } if (ioctl(fd, FS_IOC_GETFSSYSFSPATH, &sysfs_path) == -1) { perror("ioctl FS_IOC_GETFSSYSFSPATH"); close(fd); exit(EXIT_FAILURE); } printf("FS_IOC_GETFSSYSFSPATH: %s\n", sysfs_path.name); close(fd); return 0; } $ gcc ioctl_getsysfspath.c $ sudo bcachefs format /dev/sda $ sudo mount.bcachefs /dev/sda /mnt $ sudo ./a.out /mnt FS_IOC_GETFSSYSFSPATH: bcachefs/c380b4ab-fbb6-41d2-b805-7a89cae9cadb ``` Original patch link: [1]: https://lore.kernel.org/all/20240207025624.1019754-8-kent.overstreet@linux.dev/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Youling Tang <youling.tang@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Kent Overstreet	a57150fe53	bcachefs: Add support for FS_IOC_GETFSUUID Use super_set_uuid() to set `sb->s_uuid_len` to avoid returning `-ENOTTY` with sb->s_uuid_len being 0. Original patch link: [1]: https://lore.kernel.org/all/20240207025624.1019754-2-kent.overstreet@linux.dev/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Youling Tang	19128c53d4	bcachefs: Correct the description of the '--bucket=size' options Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Integral	88c2aa59c3	bcachefs: add support for true/false & yes/no in bool-type options Here is the patch which uses existing constant table: Currently, when using bcachefs-tools to set options, bool-type options can only accept 1 or 0. Add support for accepting true/false and yes/no for these options. Signed-off-by: Integral <integral@murena.io> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Acked-by: David Howells <dhowells@redhat.com>	2024-12-08 23:56:16 -05:00
Kent Overstreet	45f667488e	bcachefs: Move fsck ioctl code to fsck.c chardev.c and fs-ioctl.c are not organized by subject; let's try to fix this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Kent Overstreet	2cb00966dd	bcachefs: Kill unnecessary iter_rewind() in bkey_get_empty_slot() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Kent Overstreet	0ad36d94fe	bcachefs: Simplify btree_iter_peek() filter_snapshots Collapse all the BTREE_ITER_filter_snapshots handling down into a single block; btree iteration is much simpler in the !filter_snapshots case. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Kent Overstreet	95918915a6	bcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max() We'll be introducing btree_iter_peek_prev_min(), so rename for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
Kent Overstreet	f97b3e7fd8	bcachefs: Assert that we're not violating key cache coherency rules We're not allowed to have a dirty key in the key cache if the key doesn't exist at all in the btree - creation has to bypass the key cache, so that iteration over the btree can check if the key is present in the key cache. Things break in subtle ways if cache coherency is broken, so this needs an assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-12-08 23:56:16 -05:00
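
Commit 88c2aa59c3 in the list above describes letting bool-type options accept true/false and yes/no in addition to 1 and 0. A minimal sketch of that idea follows. It is not the code from the patch: the table `bool_names`, the struct `bool_name`, and the function `parse_bool_opt()` are hypothetical names chosen for illustration, and the commit message only says that an existing constant table is reused.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table of accepted spellings (an assumption,
 * not the table used by bcachefs). */
struct bool_name {
	const char *name;
	bool value;
};

static const struct bool_name bool_names[] = {
	{ "0",     false }, { "1",    true },
	{ "false", false }, { "true", true },
	{ "no",    false }, { "yes",  true },
};

/* Returns 0 and stores the parsed value in *res,
 * or -1 if the string is not a recognized spelling. */
static int parse_bool_opt(const char *s, bool *res)
{
	size_t n = sizeof(bool_names) / sizeof(bool_names[0]);

	for (size_t i = 0; i < n; i++) {
		if (!strcmp(s, bool_names[i].name)) {
			*res = bool_names[i].value;
			return 0;
		}
	}
	return -1;
}
```

A caller would pass the option string and a `bool` to fill in, and treat a return value of -1 as an invalid option. Matching here is case-sensitive, which is a choice made for the sketch only.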

94808 Commits