This option was useful when the replicas mechism was new and still being
debugged, but hasn't been used in ages - let's delete it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
In the old allocator code, buckets would be discarded just prior to
being used - this made sense in bcache where we were discarding buckets
just after invalidating the cached data they contain, but in a
filesystem where we typically have more free space we want to be
discarding buckets when they become empty.
This patch implements the new behaviour - it checks the need_discard
btree for buckets awaiting discards, and then clears the appropriate
bit in the alloc btree, which moves the buckets to the freespace btree.
Additionally, discards are now enabled by default.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We're seeing a very strange bug where journal_flush_delay sometimes gets
set to 0 in the superblock. Together with the preceding patch, this
should help us track it down.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This moves validation of superblock options to bch2_sb_validate(), so
they'll be checked in the write path as well.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Now we've got strings for metadata versions - this changes
bch2_sb_to_text() and our mount log message to use it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Options no longer have to be manually added to bch2_sb_to_text() - it
now uses the master list of options in opts.h. Also, improve some of the
formatting by converting it to tabstops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add an option that tells recovery to only read the journal, to be used
by the list_journal command.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds a _to_text() pretty printer for journal entries - including
every subtype - which will shortly be used by the 'bcachefs
list_journal' subcommand.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Add a journal entry type for logging messages, and add an option to use
it to log the transaction name - this makes for a very handy debugging
tool, as with it we can use the 'bcachefs list_journal' command to see
not only what updates were done, but what was doing them.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It'll now be handled at format time and in sysfs like other options - it
still can only be set at format time, though.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds flags for options that must be a power of two (block size and
btree node size), and options that are stored in the superblock as a
power of two (encoded extent max).
Also: options are now stored in memory in the same units they're
displayed in (bytes): we now convert when getting and setting from the
superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds more latency/event measurements and breaks some apart into
more events. Journal writes are broken apart into flush writes and
noflush writes, btree compactions are broken out from btree splits,
btree mergers are added, as well as btree_interior_updates - foreground
and total.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We've got three types of options now - filesystem, device and inode, and
a given option may belong to more than one of those types.
This patch changes the options to specify explicitly when they're a
filesystem option - in the future we'll probably be adding more device
options.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This converts journal_write_delay, journal_flush_disabled, and
journal_reclaim_delay to normal filesystems options, and also adds them
to the superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Quota support was disabled when snapshots were released, because of some
tricky interactions with snpashots. We're sidestepping that for now -
we're simply disabling quota accounting on snapshot subvolumes.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This patch converts more enums in the on disk format to our standard
x-macro-with-strings deal - to enable better pretty-printing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The fsck code has been handling transaction restarts locally, to avoid
calling fsck_err() multiple times (and asking the user/logging the error
multiple times) on transaction restart.
However, with our improving assertions about iterator validity, this
isn't working anymore - the code wasn't entirely correct, in ways that
are fine for now but are going to matter once we start wanting online
fsck.
This code converts much of the fsck code to handle transaction restarts
in a more rigorously correct way - moving restart handling up to the top
level of check_dirent, check_xattr and others - at the cost of logging
errors multiple times on transaction restart.
Fixing the issues with logging errors multiple times is probably going
to require memoizing calls to fsck_err() - we'll leave that for future
improvements.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Existing quota support breaks badly with snapshots. We're not deleting
the code because some of it will be needed when we reimplement quotas
along the lines of btrfs subvolume quotas.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_btree_node_ptr_v2 has a field for stashing a pointer to the in
memory btree node; this is safe because we clear this field when reading
in nodes from disk and we never free in memory btree nodes - but, we
have bug reports that indicate something might be faulty with this
optimization, so let's add an option for it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We probably don't ever want to flip this off in production, but it may
be useful for certain kinds of testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We're seeing a bug where inode creates end up spinning in
bch2_inode_create - disabling sharding will simplify what we're testing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
An option was added to control whether reflink support was on or off
because for a long time, reflink + inline data extent support was
missing - but that's since been fixed, so we can drop the option now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch standardizes all the enums that have associated string tables
(probably more enums should have string tables).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This is to generate strings for them, so that we can print them out.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When the replicas mechanism was added, for tracking data by which drives
it's replicated on, the check for whether we have sufficient devices was
never updated to make use of it. This patch finally does that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Also, make journal writes obey foreground_target and metadata_target.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When inline data extents were added, reflink was forgotten about - we
need indirect inline data extents for reflink + inline data to work
correctly.
This patch adds them, and a new feature bit that's flipped when they're
used.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Some options can't be parsed until the filesystem initialized;
previously, passing these options to mount or remount would cause mount
to fail.
This changes the mount path so that we parse the options passed in
twice, and just ignore any options that can't be parsed the first time.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
There is a bug where we cnan end up clearing the data_has field in the
superblock members section, which causes us to skip reading the journal
and thus journal replay fails. This option tells the recovery path to
not trust those fields.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
To be used the debug tool that dumps the contents of the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reflink might be buggy, so we're adding an option so users can help
bisect what's going on.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This will be used by the userspace debug tools.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Helps for preventing things from getting out of sync.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Inline data extents + reflink is still broken
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
With the refactoring that's coming to add fuse support, we want
bch2_hash_info_init() to be cheaper so we don't have to rely on anything
cached besides the inode in the btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add helptext to option definitions - so we can unify the option
handling with the format command
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>