If an extent ends up with one replica that is encrypted and one that
isn't (due to the user changing options), and then
copygc/rebalance moves one of the replicas by reading from the
unencrypted replica, we had a bug where we wouldn't correctly initialize
op->nonce - for each crc field in an extent, crc.offset + crc.nonce must
be equal.
This patch fixes that by moving op.nonce initialization to
bch2_migrate_write_init.
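To illustrate the invariant this preserves (the struct and field names below
are simplified stand-ins, not the real bcachefs extent format):

  /*
   * Sketch only: every crc entry of an extent must agree on
   * crc.offset + crc.nonce, so the nonce for a rewritten replica has to
   * be derived from the crc it was read with, not left at zero.
   */
  struct crc_sketch {
      unsigned offset;    /* offset of the checksummed region in the extent */
      unsigned nonce;     /* encryption nonce for that region */
  };

  static bool extent_nonces_consistent(const struct crc_sketch *crc, unsigned nr)
  {
      for (unsigned i = 1; i < nr; i++)
          if (crc[i].offset + crc[i].nonce != crc[0].offset + crc[0].nonce)
              return false;
      return true;
  }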
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When we detect an invalid key being inserted, we should print what code
was doing the update.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We really only need to distinguish between btree iterators and btree key
cache iterators - this is more prep work for btree_path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This was used for an optimization that hasn't existed in quite a while
- iter->uptodate will probably be going away as well.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
These utility functions are for managing btree node state within a
btree_trans - rename them for consistency, and drop some unneeded
arguments.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is prep work for splitting btree_path out from btree_iter -
btree_path will not have a pointer to btree_trans.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
BTREE_ITER_SET_POS_AFTER_COMMIT is used internally to automagically
advance extent btree iterators on successful commit.
But with the upcoming btree_path patch it's getting more awkward to
support; it adds overhead to core data structures, is only used in a
few places, and can easily be done by the caller instead.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This consolidates the code for doing extent updates, and makes the btree
iterator usage a bit cleaner and more efficient.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This factors out bch2_dump_trans_iters_updates() from the iter alloc
overflow path, and makes some small improvements to what it prints.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
iter->real_pos needs to match the key returned or bad things will happen
when we go to update the key at that position. When we returned a
pending update from btree_trans_peek_updates(), this wasn't necessarily
the case.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds progress stats to sysfs for copygc, rebalance, recovery, and the
cmd_job ioctls.
Signed-off-by: Brett Holman <bholman.devel@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fix replaces multiple 64-bit divisions with do_div() equivalents.
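A minimal sketch of the pattern, with made-up variable and function names: on
32-bit architectures a plain '/' on a u64 emits a call to a libgcc helper the
kernel doesn't provide, so do_div() is used to divide in place (it returns the
remainder and leaves the quotient in its first argument).

  #include <linux/types.h>
  #include <asm/div64.h>

  /* Hypothetical example, not a function from this patch. */
  static u64 sectors_to_chunks(u64 sectors, u32 chunk_sectors)
  {
      do_div(sectors, chunk_sectors);   /* quotient left in 'sectors'        */
      return sectors;                   /* (do_div() returned the remainder) */
  }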
Signed-off-by: Brett Holman <bholman.devel@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
DIV_ROUND_UP() wasn't doing what we wanted when passed negative
numbers - fix it by just not passing it negative numbers anymore.
Also, no need to do the scaling by compression ratio for incompressible
data.
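For reference, the kernel defines DIV_ROUND_UP(n, d) as ((n) + (d) - 1) / (d),
which only rounds up correctly for non-negative numerators; a standalone
illustration (not bcachefs code):

  #include <stdio.h>

  #define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

  int main(void)
  {
      printf("%d\n", DIV_ROUND_UP(5, 4));   /* 2 - rounds up as intended      */
      printf("%d\n", DIV_ROUND_UP(-5, 4));  /* 0 - C truncates toward zero,
                                               so this isn't -1 (ceiling) or
                                               -2 (rounding away from zero)   */
      return 0;
  }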
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Valgrind was complaining about a jump depending on uninitialized memory
- we weren't actually branching on anything uninitialized, but this
change makes the code less confusing for valgrind to follow.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This makes the flow control in bch2_btree_iter_peek() and
bch2_btree_iter_peek_prev() a bit cleaner.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
__bch2_read() -> __bch2_read_extent() -> bch2_bucket_io_time_reset() may
cause a transaction restart, which we don't return an error for because
it doesn't prevent us from making forward progress on the read we're
submitting.
Instead, change __bch2_read() and bchfs_read() to check for transaction
restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Inode creation is done with non-cached btree iterators, but then in the
same transaction the inode may be updated again with a cached iterator -
it makes cache coherency easier if new inodes always land in the
underlying btree.
This patch adds a check to bch2_trans_update() - if the same key is
updated multiple times in the same transaction with both cached and
non-cached iterators, use the non-cached iterator.
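A rough sketch of the rule only - the structure and helper below are
hypothetical, not the actual bch2_trans_update() internals:

  /*
   * Hypothetical illustration: when a cached and a non-cached update to
   * the same key land in one transaction, the merged update goes through
   * the btree (non-cached) path so new inodes reach the underlying btree.
   */
  struct update_sketch {
      bool cached;        /* queued via a btree key cache iterator? */
      int  key, val;      /* stand-ins for the bkey being updated   */
  };

  static void merge_updates(struct update_sketch *old, const struct update_sketch *new)
  {
      old->val    = new->val;
      old->cached = old->cached && new->cached;  /* any non-cached update wins */
  }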
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This will be used to make other operations on btree iterators within a
transaction more efficient, and enable some other improvements to how we
manage btree iterators.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes a bad ptr deref on recovery from unclean shutdown in
bch2_btree_node_get_noiter().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With the recent transaction restart changes, it's no longer needed - all
transaction commits have BTREE_INSERT_NOUNLOCK semantics.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
iter->should_be_locked means that if bch2_btree_iter_relock() fails, we
need to restart the transaction.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
If there's more than one iterator in the btree_trans, it's required to
call bch2_trans_begin() to handle transaction restarts.
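A usage sketch of the required pattern (names other than the bch2_trans_*()
calls are placeholders, and -EINTR is assumed here to be the transaction
restart error code at this point in the series):

  static int example_op(struct bch_fs *c)
  {
      struct btree_trans trans;
      int ret;

      bch2_trans_init(&trans, c, 0, 0);
  retry:
      bch2_trans_begin(&trans);      /* must run before retrying after a restart   */

      ret = do_work(&trans);         /* placeholder: iterators + bch2_trans_commit() */
      if (ret == -EINTR)             /* transaction restarted - start over          */
          goto retry;

      bch2_trans_exit(&trans);
      return ret;
  }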
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Start tracking when btree transactions have been restarted - and assert
that we're always calling bch2_trans_begin() immediately after
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Btree node merging now happens prior to transaction commit, not after,
so we don't need to pay attention to BTREE_INSERT_NOUNLOCK.
Also, foreground_maybe_merge shouldn't be calling
bch2_btree_iter_traverse_all() - this is becoming private to the btree
iterator code and should only be called by bch2_trans_begin().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
An upcoming patch will require that a transaction restart is always
immediately followed by bch2_trans_begin().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
On transaction restart iterators won't be locked anymore - make sure
we're always checking for errors.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_btree_iter_traverse_all() may loop, and it needs to clear
iter->should_be_locked on every iteration.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
They should already be traversed, and we've been asserting that since
the introduction of iter->should_be_locked.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_btree_node_ptr_v2 has a field for stashing a pointer to the
in-memory btree node; this is safe because we clear this field when
reading in nodes from disk and we never free in-memory btree nodes - but we
have bug reports that indicate something might be faulty with this
optimization, so let's add an option for it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds a new helper for btree_cache.c that does what we want where
the iterator is still being traversed - and also eliminates some
unnecessary transaction restarts.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>