The function pointer release in struct block_device_operations
should point to functions declared as void.
Sparse warnings:
drivers/md/bcache/super.c:656:27: warning:
incorrect type in initializer (different base types)
drivers/md/bcache/super.c:656:27:
expected void ( *release )( ... )
drivers/md/bcache/super.c:656:27:
got int ( static [toplevel] *<noident> )( ... )
drivers/md/bcache/super.c:656:2: warning:
initialization from incompatible pointer type [enabled by default]
drivers/md/bcache/super.c:656:2: warning:
(near initialization for ‘bcache_ops.release’) [enabled by default]
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Kent writes:
Hey Jens, this is everything I've got ready for 3.10 - there's _still_
one more bug I'm trying to track down.
Andrew - I've got patches that rip out the pkey() and pbtree() macros,
but they're somewhat tied up with some other nontrivial refactorings so
I think I'm going to wait a bit on those.
The main fix is that bch_allocator_thread() wasn't waiting on
garbage collection to finish (if invalidate_buckets had set
ca->invalidate_needs_gc); we need that to make sure the allocator
doesn't spin and potentially block gc from finishing.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
By default the cciss driver supports all "older" HP Smart Array
controllers and hpsa supports all controllers starting with the G6 family.
There are module parameters that allow a user to override those defaults
and use hpsa for any HP Smart Array controller.
If the user does override the default behavior and uses hpsa for older
controllers it is possible that cciss may try to load in a kdump crash
kernel. This may happen if cciss is loaded first from the kdump initrd
image. If cciss does load rather than hpsa and reset_devices is true we
immediately call cciss_hard_reset_controller. This will result in a
kernel panic and the core file cannot be created. This patch prevents
cciss from trying to load in this scenario.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add the cciss_allow_hpsa modules parameter. This allows users to use the
hpsa driver instead of cciss for older controllers.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add CONFIG_PM_SLEEP to suspend/resume functions to fix the following build
warning when CONFIG_PM_SLEEP is not selected. This is because sleep PM
callbacks defined by SIMPLE_DEV_PM_OPS are only used when the
CONFIG_PM_SLEEP is enabled.
drivers/block/mg_disk.c:783:12: warning: 'mg_suspend' defined but not used [-Wunused-function]
drivers/block/mg_disk.c:807:12: warning: 'mg_resume' defined but not used [-Wunused-function]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Workaround for handling unaligned writes: limit number of outstanding
unaligned writes
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stacked md devices reuse the bvm for the subordinate device, causing
problems...
Reported-by: Michael Balser <michael.balser@profitbricks.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
bch_bio_max_sectors() was checking against BIO_MAX_PAGES as if the limit
was for the total bytes in the bio, not the number of segments.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Add a new superblock version, and consolidate related defines.
Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Temporarily disabling TRIM support until TRIM related issues
are addressed in the firmware.
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reported smatch warning:
drivers/block/mtip32xx/mtip32xx.c:4163 mtip_block_shutdown() warn: variable dereferenced before check 'dd->disk' (see line 4159)
dd->disk->disk_name accessed before the check if dd->disk is NULL. Fixed this
and access of dd->queue/dd->disk->queue.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
m68k/allmodconfig:
drivers/md/bcache/bset.c: In function ‘bset_search_tree’:
drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’
drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’:
drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Recently introduced al_begin_io_nonblock() was returning -EBUSY,
even when it should return -EWOULDBLOCK.
Impact:
A few spurious wake_up() calls in prepare_al_transaction_nonblock().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It was unnoticed for some time that assigning to current->policy is
no longer sufficient to set a real time priority for a kernel thread.
Reported-by: Charlie Suffin <Charlie.Suffin@stratus.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With an automatic after split-brain recovery policy of
"after-sb-1pri call-pri-lost-after-sb",
when trying to drbd_set_role() to R_SECONDARY,
we run into a deadlock.
This was first recognized and supposedly fixed by
2009-06-10 "Fixed a deadlock when using automatic split brain recovery when both nodes are"
replacing drbd_set_role() with drbd_change_state() in that code-path,
but the first hunk of that patch forgets to remove the drbd_set_role().
We apparently only ever tested the "two primaries" case.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If single_open() fails in drbd_proc_open(), module refcount is left incremented.
The patch adds module_put() on the error path.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The sanity check when receiving P_BARRIER_ACK does expect all write
requests with a given req->epoch to have been either all replicated,
or all not replicated.
Because req->epoch was assigned before calling maybe_pull_ahead(),
this expectation was not met, leading to an off-by-one in the sanity
check, and further to a "Protocol Error".
Fix: move the call to maybe_pull_ahead() a few lines up,
and assign req->epoch only after that.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We validated resync_after dependencies, if changed via disk-options.
But we did not validate them when first created via attach.
We also did not check or cleanup dependencies that used to be correct,
but now point to meanwhile removed minor devices.
If the drbd_resync_after_valid() validation in disk-options tried to
follow a dependency chain in this way, this could lead to NULL pointer
dereference.
Validate resync_after settings in drbd_adm_attach() already, as well as
in drbd_adm_disk_opts(), and and only reject dependency loops.
Depending on non-existing disks is allowed and equivalent to no dependency.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We forgot to free the disk_conf,
so for each attach/detach cycle we leaked 336 bytes.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We completed empty flushes (blkdev_issue_flush()) with IO error
if we lost the local disk, even if we still have an established
replication link to a healthy remote disk.
Fix this to only report errors to upper layers,
if neither local nor remote data is reachable.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The issue was that if the connection broke while we did the
gracefull state change to C_DISCONNECTING (C_TEARDOWN), then
we returned a success code from the state engine. (SS_CW_NO_NEED)
The result of that is that we missed to call the fence-peer
script in such a case.
Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN).
This one should never reach back into user space.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Introduced in drbd: always write bitmap on detach,
the bitmap bulk writeout on detach was indicating
it expected exclusive bitmap access.
Where I meant to say: expect no more modifications,
but testing/counting is still allowed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Patch best viewed with git diff --ignore-space-change.
Now that we attempt the fallback to local bitmap operation
only when disconnected, we can safely drop the extra "silent"
state request from both invalidate and invalidate-remote.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Since commit
drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
trying to invalidate a disconnected Primary returned an error code
that did not really match the situation:
"Refusing to be Outdated while Connected"
Insert two more specific conditions into is_valid_state(),
changing that to "Need access to UpToDate data",
respectively "Need a connection to start verify or resync".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To avoid other state change requests, after passing through
sanitize_state(), to be mistaken for an invalidate,
move the "set all bits as out-of-sync" into the invalidate path.
Make invalidate and invalidate-remote behave consistently wrt.
current connection state (need either an established replication link,
or really be disconnected). Also mention that in the documentation.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We've seen a spurious full resync, because a connection breakage
raced with drbd_start_resync(, C_SYNC_TARGET),
and the resulting state change request intended to start the resync
ended up looking like a local invalidate.
Fix:
Double check the state inside the lock,
and don't even request that state change,
if we had connection or IO problems.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
And some manual common subexpression elimination which may help the
compiler produce smaller code.
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 82a84eaf7e51ba3da0c36cbc401034a4e943492d left a return 0 in
closure_debug_init(). Whoops.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Took out some nested functions, and fixed some more checkpatch
complaints.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
config: make ARCH=i386 allmodconfig
All error/warnings:
drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
>> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/debug.c: In function 'bch_pbtree':
>> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
>> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/closure.o: In function `closure_debug_init':
>> (.init.text+0x0): multiple definition of `init_module'
>> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.
See the wiki at http://bcache.evilpiepirate.org for more.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Hack, but bcache needs a way around lockdep for locking during garbage
collection - we need to keep multiple btree nodes locked for coalescing
and rw_lock_nested() isn't really sufficient or appropriate here.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>
Exported so it can be used by bcache's tracepoints
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>
Needed for bcache - need a cheap source of random numbers for perturbing
IO sizes, for rate limiting IO to the SSD.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: "Theodore Ts'o" <tytso@mit.edu>