linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-01-09 14:43:16 +00:00

Author	SHA1	Message	Date
Kent Overstreet	1f93726e63	bcachefs: Tracepoint improvements Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	e1b8f5f5ca	bcachefs: Plumb btree_id & level to trans_mark For backpointers, we'll need the full key location - that means btree_id and btree level. This patch plumbs it through. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	0b09032653	bcachefs: Improve bch2_lru_delete() error messages When we detect a filesystem inconsistency, we should include the relevent keys in the error message. This patch adds a parameter to pass the key with the lru entry to bch2_lru_delete(), so that it can be printed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	9b93596c33	bcachefs: Improve error message when alloc key doesn't match lru entry Error messages should always print out the full key when available - this gives us a starting point when looking through the journal to debug what went wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	7003589dab	bcachefs: Ensure buckets have io_time[READ] set It's an error if a bucket is in state BCH_DATA_cached but not on the LRU btree - i.e io_time[READ] == 0 - so, make sure it's set before adding it. Also, make some of the LRU code a bit clearer and more direct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	84befe8ef9	bcachefs: Use bch2_trans_inconsistent_on() in more places This gets us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	a9c0a4cbf1	bcachefs: Minor device removal fixes - We weren't clearing the LRU btree - bch2_alloc_read() runs before bch2_check_alloc_key() deletes alloc keys for devices/buckets that don't exists, so it needs to check for that - bch2_check_lrus() needs to check that buckets exists - improve some error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	aae29082c6	bcachefs: bch2_btree_delete_extent_at() New helper, for deleting extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	822835ffea	bcachefs: Fold bucket_state in to BCH_DATA_TYPES() Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be conuted when checking current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	62491956f4	bcachefs: Move alloc assertion to .key_invalid() .key_invalid is a better place for this assertion. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	11c7d3e817	bcachefs: Check for read_time == 0 in bch2_alloc_v4_invalid() We've been seeing this error in fsck and we weren't able to track down where it came from - but now that .key_invalid methods take a rw argument, we can safely check for this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	275c8426fb	bcachefs: Add rw to .key_invalid() This adds a new parameter to .key_invalid() methods for whether the key is being read or written; the idea being that methods can do more aggressive checks when a key is newly created and being written, when we wouldn't want to delete the key because of those checks. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	e1effd42a1	bcachefs: More improvements for alloc info checks - Move checks for whether the device & bucket are valid from the .key_invalid method to bch2_check_alloc_key(). This is because .key_invalid() is called on keys that may no longer exist (post journal replay), which is a problem when removing/resizing devices. - We weren't checking the need_discard btree to ensure that every set bucket has a corresponding alloc key. This refactors the code for checking the freespace btree, so that it now checks both. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	f0ac7df23d	bcachefs: Convert .key_invalid methods to printbufs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	5735608c14	bcachefs: Kill main in-memory bucket array All code using the in-memory bucket array, excluding GC, has now been converted to use the alloc btree directly - so we can finally delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5add07d56a	bcachefs: Fsck for need_discard & freespace btrees Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	caece7fe3f	bcachefs: New bucket invalidate path In the old allocator code, preparing an existing empty bucket was part of the same code path that invalidated buckets containing cached data. In the new allocator code this is no longer the case: the main allocator path finds empty buckets (via the new freespace btree), and can't allocate buckets that contain cached data. We now need a separate code path to invalidate buckets containing cached data when we're low on empty buckets, which this patch implements. When the number of free buckets decreases that triggers the new invalidate path to run, which uses the LRU btree to pick cached data buckets to invalidate until we're above our watermark. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	59cc38b8d4	bcachefs: New discard implementation In the old allocator code, buckets would be discarded just prior to being used - this made sense in bcache where we were discarding buckets just after invalidating the cached data they contain, but in a filesystem where we typically have more free space we want to be discarding buckets when they become empty. This patch implements the new behaviour - it checks the need_discard btree for buckets awaiting discards, and then clears the appropriate bit in the alloc btree, which moves the buckets to the freespace btree. Additionally, discards are now enabled by default. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	f25d8215f4	bcachefs: Kill allocator threads & freelists Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	c6b2826cd1	bcachefs: Freespace, need_discard btrees This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3d48a7f85f	bcachefs: KEY_TYPE_alloc_v4 This introduces a new alloc key which doesn't use varints. Soon we'll be adding backpointers and storing them in alloc keys, which means our pack/unpack workflow for alloc keys won't really work - we'll need to be mutating alloc keys in place. Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that converts older types of alloc keys to v4 if needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	31f63fd124	bcachefs: Introduce a separate journal watermark for copygc Since journal reclaim -> btree key cache flushing may require the allocation of new btree nodes, it has an implicit dependency on copygc in order to make forward progress - so we should avoid blocking copygc unless the journal is really close to full. This introduces watermarks to replace our single MAY_GET_UNRESERVED bit in the journal, and adds a watermark for copygc and plumbs it through. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3e1547116f	bcachefs: x-macroize alloc_reserve enum This makes an array of strings available, like our other enums. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3117db99f3	bcachefs: Don't issue discards when in nochanges mode When the nochanges option is selected, we're supposed to never issue writes. Unfortunately, it seems discards were missed when implemnting this, leading to some painful filesystem corruption. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:24 -04:00
Kent Overstreet	ec061b215d	bcachefs: btree_gc no longer uses main in-memory bucket array This changes the btree_gc code to only use the second bucket array, the one dedicated to GC. On completion, it compares what's in its in memory bucket array to the allocation information in the btree and writes it directly, instead of updating the main in-memory bucket array and writing that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:23 -04:00
Kent Overstreet	12ce5b7df1	bcachefs: Btree key cache coherency - Updates to non key cache iterators will now be transparently redirected to the key cache for cached btrees. - Except when creating new keys: then the update goes to underlying btree For for iterating over a cached btree to work, we need to ensure that if a key exists in the key cache, it also exists in the btree - otherwise the iterator code will skip past it and not check the key cache. Otherwise, for consistency, all updates should go to the same place - the key cache. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:23 -04:00
Kent Overstreet	0678cbe2cb	bcachefs: Ignore cached data when calculating fragmentation Previously, bucket fragmentation was considered to be bucket size - total amount of live data, both dirty and cached. This meant that if a bucket was full but only a small amount of data in it was dirty - the rest cached, we'd get stuck: copygc wouldn't move the dirty data out of the bucket and the allocator wouldn't be able to invalidate and drop the cached data. This changes fragmentation to exclude cached data, so that copygc will evacuate these buckets and copygc/the allocator will always be able to make forward progress. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	3763cb9566	bcachefs: Don't use in-memory bucket array for alloc updates More prep work for getting rid of the in-memory bucket array: now that we have BTREE_ITER_WITH_JOURNAL, the allocator code can do ntree lookups before journal replay is finished, and there's no longer any need for it to get allocation information from the in-memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	1f5f52bd03	bcachefs: Kill allocator short-circuit invalidate The allocator thread invalidates buckets (increments their generation number) prior to discarding them and putting them on freelists. We've had a short circuit path for some time to only update the in-memory bucket mark when doing the invalidate if we're not invalidating cached data, but that short-circuit path hasn't really been needed for quite some time (likely since the btree key cache code was added). We're deleting it now as part of deleting/converting code that uses the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	21aec962df	bcachefs: New data structure for buckets waiting on journal commit Implement a hash table, using cuckoo hashing, for empty buckets that are waiting on a journal commit before they can be reused. This replaces the journal_seq field of bucket_mark, and is part of eventually getting rid of the in memory bucket array. We may need to make bch2_bucket_needs_journal_commit() lockless, pending profiling and testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	d8601afca8	bcachefs: Simplify journal replay With BTREE_ITER_WITH_JOURNAL, there's no longer any restrictions on the order we have to replay keys from the journal in, and we can also start up journal reclaim right away - and delete a bunch of code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:21 -04:00
Kent Overstreet	5222a4607c	bcachefs: BTREE_ITER_WITH_JOURNAL This adds a new btree iterator flag, BTREE_ITER_WITH_JOURNAL, that is automatically enabled when initializing a btree iterator before journal replay has completed - it overlays the contents of the journal with the btree. This lets us delete bch2_btree_and_journal_walk() and just use the normal btree iterator interface instead - which also lets us delete a significant amount of duplicated code. Note that BTREE_ITER_WITH_JOURNAL is still unoptimized in this patch - we're redoing the binary search over keys in the journal every time we call bch2_btree_iter_peek(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:21 -04:00
Kent Overstreet	36f035e908	bcachefs: Fix allocator + journal interaction The allocator needs to wait until the last update touching a bucket has been commited before writing to it again. However, the code was checking against the last dirty journal sequence number, not the last flushed journal sequence number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	a786087744	bcachefs: New in-memory array for bucket gens The main in-memory bucket array is going away, but we'll still need to keep bucket generations in memory, at least for now - ptr_stale() needs to be an efficient operation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	abe19d458e	bcachefs: Refactor open_bucket code Prep work for adding a hash table of open buckets - instead of embedding a bch_extent_ptr, we need to refer to the bucket directly so that we're not calling sector_to_bucket() in the hash table lookup code, which has an expensive divide. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	c64740ef27	bcachefs: Don't start allocator threads too early If the allocator threads start before journal replay has finished replaying alloc keys, journal replay might overwrite the allocator's btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	09943313d7	bcachefs: Rewrite bch2_bucket_alloc_new_fs() This changes bch2_bucket_alloc_new_fs() to a simple bump allocator that doesn't need to use the in memory bucket array, part of a larger patch series to entirely get rid of the in memory bucket array, except for gc/fsck. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	7243498de7	bcachefs: Kill non-lru cache replacement policies Prep work for persistent LRUs and getting rid of the in memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	20572300dc	bcachefs: Improve alloc_mem_to_key() This moves some common code into alloc_mem_to_key(), which translates from the in-memory format for a bucket to the btree key format. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	fb0e480872	bcachefs: bch2_alloc_write() This adds a new helper that much like the one we have for inode updates, that allocates the packed alloc key, packs it and calls bch2_trans_update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:18 -04:00
Kent Overstreet	b547d005d5	bcachefs: Erasure coding fixes When we added the stripe and stripe_redundancy fields to alloc keys, we neglected to add them to the functions that convert back and forth with the in-memory types. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	3e52c22255	bcachefs: Add journal_seq to inode & alloc keys Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	904823de49	bcachefs: Convert bch2_mark_key() to take a btree_trans * This helps to unify the interface between bch2_mark_key() and bch2_trans_mark_key() - and it also gives access to the journal reservation and journal seq in the mark_key path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	b0d1b70af8	bcachefs: Must check for errors from bch2_trans_cond_resched() But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	69294246b7	bcachefs: Fix allocator shutdown error message We return 1 to indicate kthread_should_stop() returned true - we shouldn't be printing an error. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	8b3e9bd65f	bcachefs: Always check for transaction restarts On transaction restart iterators won't be locked anymore - make sure we're always checking for errors. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	8d34458781	bcachefs: Add safe versions of varint encode/decode This adds safe versions of bch2_varint_(encode\|decode) that don't read or write past the end of the buffer, or varint being encoded. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	2e655e6de2	bcachefs: Add open_buckets to sysfs This is to help debug a rare shutdown deadlock in the allocator code - the btree code is leaking open_buckets. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	bc3f8b25f3	bcachefs: Check for errors from bch2_trans_update() Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00

1 2 3 4

157 Commits