linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-01-14 17:35:42 +00:00

Author	SHA1	Message	Date
NeilBrown	43a705076e	md: support updating bitmap parameters via sysfs. A new attribute directory 'bitmap' in 'md' is created which contains files for configuring the bitmap. 'location' identifies where the bitmap is, either 'none', or 'file' or 'sector offset from metadata'. Writing 'location' can create or remove a bitmap. Adding a 'file' bitmap this way is not yet supported. 'chunksize' and 'time_base' must be set before 'location' can be set. 'chunksize' can be set before creating a bitmap, but is currently always over-ridden by the bitmap superblock. 'time_base' and 'backlog' can be updated at any time. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>	2009-12-14 12:51:41 +11:00
NeilBrown	72e02075a3	md: factor out parsing of fixed-point numbers safe_delay_store can parse fixed point numbers (for fractions of a second). We will want to do that for another sysfs file soon, so factor out the code. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	f6af949c56	md: support bitmap offset appropriate for external-metadata arrays. For md arrays were metadata is managed externally, the kernel does not know about a superblock so the superblock offset is 0. If we want to have a write-intent-bitmap near the end of the devices of such an array, we should support sector_t sized offset. We need offset be possibly negative for when the bitmap is before the metadata, so use loff_t instead. Also add sanity check that bitmap does not overlap with data. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	9cd30fdc33	md: remove needless setting of thread->timeout in raid10_quiesce As bitmap_create and bitmap_destroy already set thread->timeout as appropriate, there is no need to do it in raid10_quiesce. There is a possible need to wake the thread after the timeout has been set low, but it is better to do that where the timeout is actually set low, in bitmap_create. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	1b04be96f6	md: change daemon_sleep to be in 'jiffies' rather than 'seconds'. This removes a lot of multiplications by HZ. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	42a04b5078	md: move offset, daemon_sleep and chunksize out of bitmap structure ... and into bitmap_info. These are all configuration parameters that need to be set before the bitmap is created. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	c3d9714e88	md: collect bitmap-specific fields into one structure. In preparation for making bitmap fields configurable via sysfs, start tidying up by making a single structure to contain the configuration fields. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	709ae4879a	md/raid1: add takeover support for raid5->raid1 A 2-device raid5 array can now be converted to raid1. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:41 +11:00
NeilBrown	6eef4b21ff	md: add honouring of suspend_{lo,hi} to raid1. This will allow us to stop writeout to portions of the array while they are resynced by someone else - e.g. another node in a cluster. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:40 +11:00
NeilBrown	729a18663a	md/raid5: don't complete make_request on barrier until writes are scheduled The post-barrier-flush is sent by md as soon as make_request on the barrier write completes. For raid5, the data might not be in the per-device queues yet. So for barrier requests, wait for any pre-reading to be done so that the request will be in the per-device queues. We use the 'preread_active' count to check that nothing is still in the preread phase, and delay the decrement of this count until after write requests have been submitted to the underlying devices. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:51:40 +11:00
NeilBrown	a2826aa92e	md: support barrier requests on all personalities. Previously barriers were only supported on RAID1. This is because other levels requires synchronisation across all devices and so needed a different approach. Here is that approach. When a barrier arrives, we send a zero-length barrier to every active device. When that completes - and if the original request was not empty - we submit the barrier request itself (with the barrier flag cleared) and then submit a fresh load of zero length barriers. The barrier request itself is asynchronous, but any subsequent request will block until the barrier completes. The reason for clearing the barrier flag is that a barrier request is allowed to fail. If we pass a non-empty barrier through a striping raid level it is conceivable that part of it could succeed and part could fail. That would be way too hard to deal with. So if the first run of zero length barriers succeed, we assume all is sufficiently well that we send the request and ignore errors in the second run of barriers. RAID5 needs extra care as write requests may not have been submitted to the underlying devices yet. So we flush the stripe cache before proceeding with the barrier. Note that the second set of zero-length barriers are submitted immediately after the original request is submitted. Thus when a personality finds mddev->barrier to be set during make_request, it should not return from make_request until the corresponding per-device request(s) have been queued. That will be done in later patches. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Andre Noll <maan@systemlinux.org>	2009-12-14 12:49:49 +11:00
NeilBrown	efa593390e	md: don't reset curr_resync_completed after an interrupted resync If a resync/recovery/check/repair is interrupted for some reason, it can be useful to know exactly where it got up to. So in that case, do not clear curr_resync_completed. Initialise it when starting a resync/recovery/... instead. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:49:49 +11:00
NeilBrown	c07b70ad32	md: adjust resync_min usefully when resync aborts. When a 'check' or 'repair' finished we should clear resync_min so that a future check/repair will cover the whole array (by default). However if it is interrupted, we should update resync_min to where we got up to, so that when the check/repair continues it just does the remainder of the array. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:49:48 +11:00
NeilBrown	7820f9e1dd	md: remove sparse warning:symbol XXX was not declared. Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:49:47 +11:00
NeilBrown	8553fe7ec7	md/raid5: remove some sparse warnings. qd_idx is previously declared and given exactly the same value! Signed-off-by: NeilBrown <neilb@suse.de>	2009-12-14 12:49:47 +11:00
NeilBrown	aa5cbd1038	md/bitmap: protect against bitmap removal while being updated. A write intent bitmap can be removed from an array while the array is active. When this happens, all IO is suspended and flushed before the bitmap is removed. However it is possible that bitmap_daemon_work is still running to clear old bits from the bitmap. If it is, it can dereference the bitmap after it has been freed. So introduce a new mutex to protect bitmap_daemon_work and get it before destroying a bitmap. This is suitable for any current -stable kernel. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org	2009-12-14 12:49:46 +11:00
Mikulas Patocka	d2fdb776e0	dm snapshot: use merge origin if snapshot invalid If the snapshot we are merging became invalid (e.g. it ran out of space) redirect all I/O directly to the origin device. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:36 +00:00
Mike Snitzer	d8ddb1cfff	dm snapshot: report merge failure in status Set 'merge_failed' flag if a snapshot fails to merge. Update snapshot_status() to report "Merge failed" if 'merge_failed' is set. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:35 +00:00
Mike Snitzer	8a2d528620	dm snapshot: merge consecutive chunks together s->store->type->prepare_merge returns the number of chunks that can be copied linearly working backwards from the returned chunk number. For example, if it returns 3 chunks with old_chunk == 10 and new_chunk == 20, then chunk 20 can be copied to 10, chunk 19 to 9 and 18 to 8. Until now kcopyd only copied one chunk at a time. This patch now copies the full set at once. Consequently, snapshot_merge_process() needs to delay the merging of all chunks if any have writes in progress, not just the first chunk in the region that is to be merged. snapshot-merge's performance is now comparable to the original snapshot-origin target. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:34 +00:00
Mikulas Patocka	73dfd078cf	dm snapshot: trigger exceptions in remaining snapshots during merge When there is one merging snapshot and other non-merging snapshots, snapshot_merge_process() must make exceptions in the non-merging snapshots. Use a sequence count to resolve the race between I/O to chunks that are about to be merged. The count increases each time an exception reallocation finishes. Use wait_event() to wait until the count changes. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:34 +00:00
Mikulas Patocka	17aa03326d	dm snapshot: delay merging a chunk until writes to it complete Track writes to chunks that are currently being merged and delay merging a chunk until all writes to that chunk finish. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:33 +00:00
Mikulas Patocka	9fe8625488	dm snapshot: queue writes to chunks being merged While a set of chunks is being merged, any overlapping writes need to be queued. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:33 +00:00
Mikulas Patocka	1e03f97e43	dm snapshot: add merging Merging is started when origin is resumed and it is stopped when origin is suspended or when the merging snapshot is destroyed or errors are detected. Merging is not yet interlocked with writes: this will be handled in subsequent patches. The code relies on callbacks from a private kcopyd thread. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:32 +00:00
Mikulas Patocka	9d3b15c4c7	dm snapshot: permit only one merge at once Merging more than one snapshot is not supported, so prevent this happening. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:32 +00:00
Mike Snitzer	10b8106a70	dm snapshot: support barriers in snapshot merge target Sets num_flush_requests=2 to support flushing both the origin and cow devices used by the snapshot-merge target. Also, snapshot_ctr() now gets the origin device using FMODE_WRITE if the target is snapshot-merge (which writes to the origin device). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:31 +00:00
Mikulas Patocka	3452c2a1eb	dm snapshot: avoid allocating exceptions in merge The snapshot-merge target should not allocate new exceptions because the intent is to merge all of its exceptions as quickly and safely as possible. This patch introduces the snapshot-merge mapping function and updates __origin_write() so that it doesn't allocate exceptions on any snapshots that are being merged. If a write request to a merging snapshot device is to be dispatched directly to the origin (because the chunk is not remapped or was already merged), snapshot_merge_map() must make exceptions in other snapshots so calls do_origin(). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:31 +00:00
Mikulas Patocka	515ad66cc4	dm snapshot: rework writing to origin To track the completion of exceptions relating to the same location on the device, the current code selects one exception as primary_pe, links the other exceptions to it and uses reference counting to wait until all the reallocations are complete. It is considered too complicated to extend this code to handle the new snapshot-merge target, where sets of non-overlapping chunks would also need to become linked. Instead, a simpler (but less efficient) approach is taken. Bios are linked to one exception. When it completes, bios are simply retried, and if other related exceptions are still outstanding, they'll get queued again to wait for another one. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:30 +00:00
Mikulas Patocka	d698aa4500	dm snapshot: add merge target The snapshot-merge target allows a snapshot to be merged back into the snapshot's origin device. One anticipated use of snapshot merging is the rollback of filesystems to back out problematic system upgrades. This patch adds snapshot-merge target management to both dm_snapshot_init() and dm_snapshot_exit(). As an initial place-holder, snapshot-merge is identical to the snapshot target. Documentation is provided. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:30 +00:00
Mikulas Patocka	4454a6216f	dm exception store: add merge specific methods Add functions that decide how many consecutive chunks of snapshot to merge back into the origin next and to update the metadata afterwards. prepare_merge provides a pointer to the most recent still-to-be-merged chunk and returns how many previous ones are consecutive and can be processed together. commit_merge removes the nr_merged most-recent chunks permanently from the exception store. The number must not exceed that returned by prepare_merge. Introduce NUM_SNAPSHOT_HDR_CHUNKS to show where the snapshot header chunk is accounted for. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:29 +00:00
Mike Snitzer	615d1eb9ca	dm snapshot: create function for chunk_is_tracked wait Move the __chunk_is_tracked() loop into a separate function as we will also need to call it from the write path in the rare case of conflicting writes to the same chunk. Originally introduced in commit a8d41b59f3f5a7ac19452ef442a7fc1b5fa17366 ("dm snapshot: fix race during exception creation"). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:29 +00:00
Mikulas Patocka	9eaae8ffbc	dm snapshot: make bio optional in __origin_write To support the merging of snapshots back into their origin we need to trigger exceptions in other snapshots not being merged without any incoming bio on the origin device. The bio parameter to __origin_write() becomes optional and the sector needs supplying separately. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:28 +00:00
Kiyoshi Ueda	c2f3d24b78	dm mpath: reject messages when device is suspended This patch rejects messages that can generate I/O while the device itself is suspended. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:27 +00:00
Kiyoshi Ueda	64dbce580d	dm: export suspended state to targets This patch adds the exported dm_suspended() function so that targets can check whether or not they are suspended. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:27 +00:00
Kiyoshi Ueda	4f186f8bbf	dm: rename dm_suspended to dm_suspended_md This patch renames dm_suspended() to dm_suspended_md() and keeps it internal to dm. No functional change. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:26 +00:00
Kiyoshi Ueda	4d4471cb5c	dm: swap target postsuspend call and setting suspended flag This patch moves DMF_SUSPENDED flag set before postsuspend. No one should care about the ordering, because the flag set and the postsuspend are protected by a single lock, md->suspend_lock, and all strict flag-checkers take the lock. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:26 +00:00
Milan Broz	61afef614b	dm crypt: add plain64 iv The default plain IV is 32-bit only. This plain64 IV provides a compatible mode for encrypted devices bigger than 4TB. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:25 +00:00
Jun'ichi Nomura	6db4ccd635	dm: trace request based remapping This patch adds a remapping trace to request-based dm. BIO-based dm already has the equivalent tracepoint. For example, under this dm stack (linear LV on multipath): # dmsetup ls --tree -o ascii vg-lv0 (253:1) `-mpath0 (253:0) \|- (8:160) \|- (66:80) \|- (65:176) `- (65:160) Trace of 'dd of=/dev/vg/lv0 bs=128k count=1 oflag=direct' looks like this: without the patch: dd-6674 [000] 539.727384: block_bio_queue: 253,1 WS 0 + 256 [dd] dd-6674 [000] 539.727392: block_remap: 253,0 WS 384 + 256 <- (253,1) 0 dd-6674 [000] 539.727394: block_bio_queue: 253,0 WS 384 + 256 [dd] dd-6674 [000] 539.727405: block_getrq: 253,0 WS 384 + 256 [dd] dd-6674 [000] 539.727409: block_plug: [dd] dd-6674 [000] 539.727410: block_rq_insert: 253,0 W 0 () 384 + 256 [dd] dd-6674 [000] 539.727416: block_rq_issue: 253,0 W 0 () 384 + 256 [dd] dd-6674 [000] 539.727426: block_rq_insert: 65,176 W 0 () 384 + 256 [dd] dd-6674 [000] 539.727427: block_rq_issue: 65,176 W 0 () 384 + 256 [dd] ... and with the patch: (the line with '' is the trace added by this patch) dd-6617 [002] 162.914301: block_bio_queue: 253,1 WS 0 + 256 [dd] dd-6617 [002] 162.914314: block_remap: 253,0 WS 384 + 256 <- (253,1) 0 dd-6617 [002] 162.914316: block_bio_queue: 253,0 WS 384 + 256 [dd] dd-6617 [002] 162.914331: block_getrq: 253,0 WS 384 + 256 [dd] dd-6617 [002] 162.914335: block_plug: [dd] dd-6617 [002] 162.914337: block_rq_insert: 253,0 W 0 () 384 + 256 [dd] dd-6617 [002] 162.914347: block_rq_issue: 253,0 W 0 () 384 + 256 [dd] dd-6617 [002] 162.914356: block_rq_remap: 65,176 W 384 + 256 <- (253,0) 384 dd-6617 [002] 162.914358: block_rq_insert: 65,176 W 0 () 384 + 256 [dd] dd-6617 [002] 162.914359: block_rq_issue: 65,176 W 0 () 384 + 256 [dd] ... Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:25 +00:00
Mike Snitzer	c1f0c183f6	dm snapshot: allow live exception store handover between tables Permit in-use snapshot exception data to be 'handed over' from one snapshot instance to another. This is a pre-requisite for patches that allow the changes made in a snapshot device to be merged back into its origin device and also allows device resizing. The basic call sequence is: dmsetup load new_snapshot (referencing the existing in-use cow device) - the ctr code detects that the cow is already in use and allows the two snapshot target instances to be linked together dmsetup suspend original_snapshot dmsetup resume new_snapshot - the new_snapshot becomes live, and if anything now tries to access the original one it will receive -EIO dmsetup remove original_snapshot (There can only be two snapshot targets referencing the same cow device simultaneously.) Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:24 +00:00
Alasdair G Kergon	042d2a9bcd	dm: keep old table until after resume succeeded When swapping a new table into place, retain the old table until its replacement is in place. An old check for an empty table is removed because this is enforced in populate_table(). __unbind() becomes redundant when followed by __bind(). Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:24 +00:00
Alasdair G Kergon	a794015597	dm: bind new table before destroying old When replacing a mapped device's table during a 'resume', delay the destruction of the old table until the new one is successfully in place. This will make it easier for a later patch to transfer internal state information from the old table to the new one (something we do not currently support) while giving us more options for reversion if a later part of the operation fails. Devices are always in the suspended state during dm_swap_table(). This patch reinforces the requirement that all I/O must have been flushed from the table targets while in this state (including any in workqueues). In the case of 'noflush' suspending, unprocessed I/O should have been 'pushed back' to the dm core prior to this point, for resubmission after the new table is in place. Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:23 +00:00
Mike Snitzer	1d0f3ce832	dm ioctl: retrieve status from inactive table Add the flag DM_QUERY_INACTIVE_TABLE_FLAG to the ioctls to return infomation about the loaded-but-not-yet-active table instead of the live table. Prior to this patch it was impossible to obtain this information until the device had been 'resumed'. Userspace dmsetup and libdevmapper support the flag as of version 1.02.40. e.g. dmsetup info --inactive vg1-lv1 Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:22 +00:00
Mikulas Patocka	12fc0f49dc	dm io: handle empty barriers Accept empty barriers in dm-io. dm-io will process empty write barrier requests just like the other read/write requests. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:22 +00:00
Mike Anderson	67a46dad25	dm mpath: prevent io from work queue while suspended Reject messages that can generate I/O while the device itself is suspended. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:21 +00:00
Mike Anderson	6380f26f04	dm mpath: add mutex to synchronize adding and flushing work Add a mutex to allow possible creators of new work to synchronize with flushing work queues. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Acked-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:21 +00:00
Mike Anderson	c50abeb380	dm ioctl: forbid messages to devices being deleted Once we begin deleting a device, prevent any further messages being sent to targets of its table (to avoid races). Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:20 +00:00
Mike Anderson	432a212c0d	dm: add dm_deleting_md function Add dm_deleting_md to check whether or not a given mapped device is currently being deleted. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:20 +00:00
Kiyoshi Ueda	6df400ab64	dm mpath: flush workqueues before suspend completes This patch stops the remaining dm-mpath activity during the suspend sequence by flushing workqueues in postsuspend function. The current dm-mpath target may not be quiet even after suspend completes because some workqueues (e.g. device_handler's work, event handling) are not flushed during the suspend sequence, even though suspended devices/targets are supposed to be quiet in this state. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:19 +00:00
Alasdair G Kergon	7c6664114b	dm: rename dm_get_table to dm_get_live_table Rename dm_get_table to dm_get_live_table. Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:19 +00:00
Kiyoshi Ueda	d0bcb87865	dm: add request based barrier support This patch adds barrier support for request-based dm. CORE DESIGN The design is basically same as bio-based dm, which emulates barrier by mapping empty barrier bios before/after a barrier I/O. But request-based dm has been using struct request_queue for I/O queueing, so the block-layer's barrier mechanism can be used. o Summary of the block-layer's behavior (which is depended by dm-core) Request-based dm uses QUEUE_ORDERED_DRAIN_FLUSH ordered mode for I/O barrier. It means that when an I/O requiring barrier is found in the request_queue, the block-layer makes pre-flush request and post-flush request just before and just after the I/O respectively. After the ordered sequence starts, the block-layer waits for all in-flight I/Os to complete, then gives drivers the pre-flush request, the barrier I/O and the post-flush request one by one. It means that the request_queue is stopped automatically by the block-layer until drivers complete each sequence. o dm-core For the barrier I/O, treats it as a normal I/O, so no additional code is needed. For the pre/post-flush request, flushes caches by the followings: 1. Make the number of empty barrier requests required by target's num_flush_requests, and map them (dm_rq_barrier()). 2. Waits for the mapped barriers to complete (dm_rq_barrier()). If error has occurred, save the error value to md->barrier_error (dm_end_request()). (*) Basically, the first reported error is taken. But -EOPNOTSUPP supersedes any error and DM_ENDIO_REQUEUE follows. 3. Requeue the pre/post-flush request if the error value is DM_ENDIO_REQUEUE. Otherwise, completes with the error value (dm_rq_barrier_work()). The pre/post-flush work above is done in the kernel thread (kdmflush) context, since memory allocation which might sleep is needed in dm_rq_barrier() but sleep is not allowed in dm_request_fn(), which is an irq-disabled context. Also, clones of the pre/post-flush request share an original, so such clones can't be completed using the softirq context. Instead, complete them in the context of underlying device drivers. It should be safe since there is no I/O dispatching during the completion of such clones. For suspend, the workqueue of kdmflush needs to be flushed after the request_queue has been stopped. Otherwise, the next flush work can be kicked even after the suspend completes. TARGET INTERFACE No new interface is added. Just use the existing num_flush_requests in struct target_type as same as bio-based dm. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:18 +00:00
Kiyoshi Ueda	980691e5f3	dm: move dm_end_request This patch moves dm_end_request() to make the next patch more readable. No functional change. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2009-12-10 23:52:17 +00:00

1 2 3 4 5 ...

1609 Commits