linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-01-09 14:43:16 +00:00

Author	SHA1	Message	Date
Tao Ma	c901fb0073	ocfs2: Make ocfs2_extend_trans() really extend. In ocfs2, we use ocfs2_extend_trans() to extend a journal handle's blocks. But if jbd2_journal_extend() fails, it will only restart with the the new number of blocks. This tends to be awkward since in most cases we want additional reserved blocks. It makes our code harder to mantain since the caller can't be sure all the original blocks will not be accessed and dirtied again. There are 15 callers of ocfs2_extend_trans() in fs/ocfs2, and 12 of them have to add h_buffer_credits before they call ocfs2_extend_trans(). This makes ocfs2_extend_trans() really extend atop the original block count. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:09 -07:00
Tao Ma	3e4218df31	ocfs2/trivial: Code cleanup for allocation reservation. Two tiny cleanup for allocation reservation. 1. Remove some extra codes in ocfs2_local_alloc_find_clear_bits. 2. Remove an unuseful variables in ocfs2_find_resv_lhs. Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:09 -07:00
Tao Ma	b065556a7d	ocfs2: make ocfs2_adjust_resv_from_alloc simple. When we allocate some bits from the reservation, we always allocate from the r_start(see ocfs2_resmap_resv_bits). So there should be no reason to check between r_start and start. And I don't think we will change this behaviour later by allocating from some bits after r_start. Why not make ocfs2_adjust_resv_from_alloc simple for now? The only chance we have to adjust the reservation is when we haven't reached the end. With this patch, the function is more readable. Note: btw, this patch also fixes an original bug in the function which I haven't found before. if (end < ocfs2_resv_end(resv)) rhs = end - ocfs2_resv_end(resv); This code is of course buggy. ;) Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:09 -07:00
Sunil Mushran	4b37fcb7d4	ocfs2: Make nointr a default mount option OCFS2 has never really supported intr. This patch acknowledges this reality and makes nointr the default mount option. In a later patch, we intend to support intr. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:08 -07:00
Sunil Mushran	5c80d4c9e5	ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE o2dlm join and leave messages are more than informational as they are required for debugging locking issues. This patch changes them from KERN_INFO to KERN_NOTICE. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:08 -07:00
Srinivas Eeda	23fd9abdc8	o2net: log socket state changes This patch logs socket state changes that lead to socket shutdown. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:08 -07:00
Wengang Wang	a5196ec5ef	ocfs2: print node # when tcp fails Print the node number of a peer node if sending it a message failed. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:08 -07:00
Mark Fasheh	83f92318fa	ocfs2: Add dir_resv_level mount option The default behavior for directory reservations stays the same, but we add a mount option so people can tweak the size of directory reservations according to their workloads. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:07 -07:00
Mark Fasheh	b07f8f24df	ocfs2: change default reservation window sizes The default reservation size of 4 (32-bit windows) is a bit too ambitious. Scale it back to 16 bits (resv_level=2). I have been testing various sizes on a 4-node cluster which runs a mixed workload that is heavily threaded. With a 256MB local alloc, I get roughly the following levels of average file fragmentation: resv_level=0 70% resv_level=1 21% resv_level=2 23% resv_level=3 24% resv_level=4 60% resv_level=5 did not test resv_level=6 60% resv_level=2 seemed like a good compromise between not letting windows be too small, but not so big that heavier workloads will immediately suffer without tuning. This patch also change the behavior of directory reservations - they now track file reservations. The previous compromise of giving directory windows only 8 bits wound up fragmenting more at some window sizes because file allocations had smaller unused windows to poach from. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:07 -07:00
Mark Fasheh	6b82021b9e	ocfs2: increase the default size of local alloc windows I have observed that the current size of 8M gives us pretty poor fragmentation on multi-threaded workloads which do lots of writes. Generally, I can increase the size of local alloc windows and observe a marked decrease in fragmentation, even up and beyond window sizes of 512 megabytes. This makes sense for a couple reasons - larger local alloc means more room for reservation windows. On multi-node workloads the larger local alloc helps as well because we don't have to do window slides as often. Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no longer used and the comment above it was out of date. To test fragmentation, I used a workload which launched 4 threads that did 4k writes into a series of about 140 alternating files. With resv_level=2, and a 4k/4k file system I observed the following average fragmentation for various localalloc= parameters: localalloc= avg. fragmentation 8 48 32 16 64 10 120 7 On larger cluster sizes, the difference is more dramatic. The new default size top out at 256M, which we'll only get for cluster sizes of 32K and above. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:07 -07:00
Mark Fasheh	73c8a80003	ocfs2: clean up localalloc mount option size parsing This patch pulls the local alloc sizing code into localalloc.c and provides a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged except that I correctly calculate the maximum local alloc size. The old code in ocfs2_parse_options() calculated the max size as: ocfs2_local_alloc_size(sb) * 8 which is correct, in bits. Unfortunately though the option passed in is in megabytes. Ultimately, this bug made no real difference - the shrink code would catch a too-large size and bring it down to something reasonable. Still, it's less than efficient as-is. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:18:06 -07:00
Mark Fasheh	a57c8fd2ad	ocfs2: remove ocfs2_local_alloc_in_range() Inodes are always allocated from the global bitmap now so we don't need this any more. Also, the existing implementation bounces reservations around needlessly. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-05-05 18:17:31 -07:00
Mark Fasheh	33d5d380d6	ocfs2: allocate btree internal block groups from the global bitmap Otherwise, the need for a very large contiguous allocation tends to wreak havoc on many inode allocation reservations on the local alloc, thus ruining any chances for contiguousness. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-05-05 18:17:31 -07:00
Mark Fasheh	e3b4a97dbe	ocfs2: use allocation reservations for directory data Use the reservations system for unindexed dir tree allocations. We don't bother with the indexed tree as reads from it are mostly random anyway. Directory reservations are marked seperately, to allow the reservations code a chance to optimize their window sizes. This patch allocates only 8 bits for directory windows as they generally are not expected to grow as quickly as file data. Future improvements to dir window sizing can trivially be made. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-05-05 18:17:30 -07:00
Mark Fasheh	4fe370afaa	ocfs2: use allocation reservations during file write Add a per-inode reservations structure and pass it through to the reservations code. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-05-05 18:17:30 -07:00
Mark Fasheh	d02f00cc05	ocfs2: allocation reservations This patch improves Ocfs2 allocation policy by allowing an inode to reserve a portion of the local alloc bitmap for itself. The reserved portion (allocation window) is advisory in that other allocation windows might steal it if the local alloc bitmap becomes full. Otherwise, the reservations are honored and guaranteed to be free. When the local alloc window is moved to a different portion of the bitmap, existing reservations are discarded. Reservation windows are represented internally by a red-black tree. Within that tree, each node represents the reservation window of one inode. An LRU of active reservations is also maintained. When new data is written, we allocate it from the inodes window. When all bits in a window are exhausted, we allocate a new one as close to the previous one as possible. Should we not find free space, an existing reservation is pulled off the LRU and cannibalized. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-05-05 18:17:30 -07:00
Joel Becker	ec20cec7a3	ocfs2: Make ocfs2_journal_dirty() void. jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0 since before the kernel moved to git. There is no point in checking this error. ocfs2_journal_dirty() has been faithfully returning the status since the beginning. All over ocfs2, we have blocks of code checking this can't fail status. In the past few years, we've tried to avoid adding these checks, because they are pointless. But anyone who looks at our code assumes they are needed. Finally, ocfs2_journal_dirty() is made a void function. All error checking is removed from other files. We'll BUG_ON() the status of jbd2_journal_dirty_metadata() just in case they change it someday. They won't. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:17:29 -07:00
Joel Becker	d577632e65	ocfs2: Avoid a gcc warning in ocfs2_wipe_inode(). gcc warns that a variable is uninitialized. It's actually handled, but an early return fools gcc. Let's just initialize the variable to a garbage value that will crash if the usage is ever broken. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-03 19:15:49 -07:00
Li Dongyang	6b933c8e6f	ocfs2: Avoid direct write if we fall back to buffered I/O when we fall back to buffered write from direct write, we call __generic_file_aio_write() but that will end up doing direct write even we are only prepared to do buffered write because the file has the O_DIRECT flag set. This is a fix for https://bugzilla.novell.com/show_bug.cgi?id=591039 revised with Joel's comments. Signed-off-by: Li Dongyang <lidongyang@novell.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-04-30 13:45:13 -07:00
Joel Becker	f9221fd803	Merge branch 'skip_delete_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2-mark into ocfs2-fixes	2010-04-30 13:37:29 -07:00
Joel Becker	a36d515c7a	ocfs2_dlmfs: Fix math error when reading LVB. When asked for a partial read of the LVB in a dlmfs file, we can accidentally calculate a negative count. Reported-by: Dan Carpenter <error27@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-04-23 15:24:59 -07:00
Tao Ma	c21a534e2f	ocfs2: Update VFS inode's id info after reflink. In reflink we update the id info on the disk but forgot to update the corresponding information in the VFS inode. Update them accordingly when we want to preserve the attributes. Reported-by: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-04-23 14:43:22 -07:00
Dan Carpenter	0350cb078f	ocfs2: potential ERR_PTR dereference on error paths If "handle" is non null at the end of the function then we assume it's a valid pointer and pass it to ocfs2_commit_trans(); Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-04-23 14:42:06 -07:00
Mark Fasheh	a9743fcdc0	ocfs2: Add directory entry later in ocfs2_symlink() and ocfs2_mknod() If we get a failure during creation of an inode we'll allow the orphan code to remove the inode, which is correct. However, we need to ensure that we don't get any errors after the call to ocfs2_add_entry(), otherwise we could leave a dangling directory reference. The solution is simple - in both cases, all I had to do was move ocfs2_dentry_attach_lock() above the ocfs2_add_entry() call. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:42:22 -07:00
Li Dongyang	062d340384	ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_mknod error path Mark the inode with flag OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_mknod, so we can kill the inode in case of error. [ Fixed up comment style -Mark ] Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:05:00 -07:00
Li Dongyang	ab41fdc8fd	ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_symlink error path Mark the inode with flag OCFS2_INODE_SKIP_ORPHAN_DIR when we get an error after allocating one, so that we can kill the inode. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:05:00 -07:00
Li Dongyang	d4cd1871cf	ocfs2: add OCFS2_INODE_SKIP_ORPHAN_DIR flag and honor it in the inode wipe code Currently in the error path of ocfs2_symlink and ocfs2_mknod, we just call iput with the inode we failed with, but the inode wipe code will complain because we don't add the inode to orphan dir. One solution would be to lock the orphan dir during the entire transaction, but that's too heavy for a rare error path. Instead, we add a flag, OCFS2_INODE_SKIP_ORPHAN_DIR which tells the inode wipe code that it won't find this inode in the orphan dir. [ Merge fixes and comment style cleanups -Mark ] Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:03:49 -07:00
Tao Ma	79681842e1	ocfs2: Reset status if we want to restart file extension. In __ocfs2_extend_allocation, we will restart our file extension if ((!status) && restart_func). But there is a bug that the status is still left as -EGAIN. This is really an old bug, but it is masked by the return value of ocfs2_journal_dirty. So it show up when we make ocfs2_journal_dirty void. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-04-16 03:10:54 -07:00
Joel Becker	a42ab8e1a3	ocfs2: Compute metaecc for superblocks during online resize. Online resize writes out the new superblock and its backups directly. The metaecc data wasn't being recomputed. Let's do that directly. Signed-off-by: Joel Becker <joel.becker@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com>[ Cc: stable@kernel.org	2010-03-31 18:39:08 -07:00
Wengang Wang	428257f887	ocfs2: Check the owner of a lockres inside the spinlock The checking of lockres owner in dlm_update_lvb() is not inside spinlock protection. I don't see problem in current call path of dlm_update_lvb(). But just for code robustness. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-30 12:55:55 -07:00
Coly Li	a03ab788d0	ocfs2: one more warning fix in ocfs2_file_aio_write(), v2 This patch fixes another compiling warning in ocfs2_file_aio_write() like this, fs/ocfs2/file.c: In function ‘ocfs2_file_aio_write’: fs/ocfs2/file.c:2026: warning: suggest parentheses around ‘&&’ within ‘\|\|’ As Joel suggested, '!ret' is unary, this version removes the wrap from '!ret'. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-30 12:52:13 -07:00
Tao Ma	efd647f744	ocfs2_dlmfs: User DLM_* when decoding file open flags. In commit 0016eedc4185a3cd7e578b027a6e69001b85d6c4, we have changed dlmfs to use stackglue. So when use DLM* when we decode dlm flags from open level. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-30 12:45:56 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Srinivas Eeda	14741472a0	ocfs2: Fix a race in o2dlm lockres mastery In o2dlm, the master of a lock resource keeps a map of all interested nodes. This prevents the master from purging the resource before an interested node can create a lock. A race between the mastery thread and the mastery handler allowed an interested node to discover who the master is without informing the master directly. This is easily fixed by holding the dlm spinlock a little longer in the mastery handler. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-23 18:22:59 -07:00
Tristan Ye	b54c2ca475	Ocfs2: Handle deletion of reflinked oprhan inodes correctly. The rule is that all inodes in the orphan dir have ORPHANED_FL, otherwise we treated it as an ERROR. This rule works well except for some rare cases of reflink operation: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1215 The problem is caused by how reflink and our orphan_scan thread interact. * The orphan scan pulls the orphans into a queue first, then runs the queue at a later time. We only hold the orphan_dir's lock during scanning. * Reflink create a oprhaned target in orphan_dir as its first step. It removes the target and clears the flag as the final step. These two steps take the orphan_dir's lock, but it is not held for the duration. Based on the above semantics, a reflink inode can be moved out of the orphan dir and have its ORPHANED_FL cleared before the queue of orphans is run. This leads to a ERROR in ocfs2_query_wipde_inode(). This patch teaches ocfs2_query_wipe_inode() to detect previously orphaned reflink targets. If a reflink fails or a crash occurs during the relfink operation, the inode will retain ORPHANED_FL and will be properly wiped. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-23 18:22:55 -07:00
Tristan Ye	3939fda4b3	Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir. Currently, some callers were missing to journal the dirty inode after adding it to orphan dir. Now we're going to journal such modifications within the ocfs2_orphan_add() itself, It's safe to do so, though some existing caller may duplicate this, and it makes the logic look more straightforward anyway. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-23 18:22:51 -07:00
Mark Fasheh	b4414eea0e	ocfs2: Clear undo bits when local alloc is freed When the local alloc file changes windows, unused bits are freed back to the global bitmap. By defnition, those bits can not be in use by any file. Also, the local alloc will never have been able to allocate those bits if they were part of a previous truncate. Therefore it makes sense that we should clear unused local alloc bits in the undo buffer so that they can be used immediatly. [ Modified to call it ocfs2_release_clusters() -- Joel ] Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-23 18:22:40 -07:00
Tao Ma	b23179681c	ocfs2: Init meta_ac properly in ocfs2_create_empty_xattr_block. You can't store a pointer that you haven't filled in yet and expect it to work. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-19 14:53:52 -07:00
Tao Ma	dfe4d3d6a6	ocfs2: Fix the update of name_offset when removing xattrs When replacing a xattr's value, in some case we wipe its name/value first and then re-add it. The wipe is done by ocfs2_xa_block_wipe_namevalue() when the xattr is in the inode or block. We currently adjust name_offset for all the entries which have (offset < name_offset). This does not adjust the entrie we're replacing. Since we are replacing the entry, we don't adjust the total entry count. When we calculate a new namevalue location, we trust the entries now-wrong offset in ocfs2_xa_get_free_start(). The solution is to also adjust the name_offset for the replaced entry, allowing ocfs2_xa_get_free_start() to calculate the new namevalue location correctly. The following script can trigger a kernel panic easily. echo 'y'\|mkfs.ocfs2 --fs-features=local,xattr -b 4K $DEVICE mount -t ocfs2 $DEVICE $MNT_DIR FILE=$MNT_DIR/$RANDOM for((i=0;i<76;i++)) do string_76="a$string_76" done string_78="aa$string_76" string_82="aaaa$string_78" touch $FILE setfattr -n 'user.test1234567890' -v $string_76 $FILE setfattr -n 'user.test1234567890' -v $string_78 $FILE setfattr -n 'user.test1234567890' -v $string_82 $FILE Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-19 14:53:51 -07:00
Mark Fasheh	b22b63ebaf	ocfs2: Always try for maximum bits with new local alloc windows What we were doing before was to ask for the current window size as the maximum allocation. This had the effect of limiting the amount of allocation we could get for the local alloc during times when the window size was shrunk due to fragmentation. In some cases, that could actually increase fragmentation by artificially limiting the number of bits we can accept. So while we still want to ask for a minimum number of bits equal to window size, there is no reason why we should limit the number of bits the local alloc should accept. Hence always allow the maximum number of local alloc bits. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-18 13:22:42 -07:00
Tao Ma	1a934c3e57	ocfs2: enable discontig block group support. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-03-18 15:54:22 +08:00
Tao Ma	abf1b3cb5b	ocfs2: Set ac_last_group properly with discontig group. ac_last_group is used to record the last block group we used during allocation. But the initialization process only calls ocfs2_which_suballoc_group and fails to use suballoc_loc properly. So let us do it. Another function ocfs2_test_suballoc_bit also needs fix. I have searched all the callers of ocfs2_which_suballoc_group, and all the callers notices suballoc_loc now. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-04-27 08:30:36 +08:00
Tao Ma	74380c479a	ocfs2: Free block to the right block group. In case the block we are going to free is allocated from a discontiguous block group, we have to use suballoc_loc to be the right group. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-03-22 14:20:18 +08:00
Tao Ma	af2bf0d860	ocfs2: Add ocfs2_gd_is_discontig. Add ocfs2_gd_is_discontig so that we can test whether a group descriptor is discontiguous or not. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-05-17 15:14:17 +08:00
Tao Ma	8571882c21	ocfs2: ocfs2_group_bitmap_size has to handle old volume. ocfs2_group_bitmap_size has to handle the case when the volume don't have discontiguous block group support. So pass the feature_incompat in and check it. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-04-13 14:38:06 +08:00
Tao Ma	4711954eaa	ocfs2: Some tiny bug fixes for discontiguous block allocation. The fixes include: 1. some endian problems. 2. we should use bit/bpc in ocfs2_block_group_grow_discontig to allocate clusters. 3. set num_clusters properly in __ocfs2_claim_clusters. 4. change name from ocfs2_supports_discontig_bh to ocfs2_supports_discontig_bg. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-04-22 14:09:15 +08:00
Joel Becker	95ec0adf0b	ocfs2: Don't relink cluster groups when allocating discontig block groups We don't have enough credits, and the filesystem is in a full state anyway. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-26 10:10:08 +08:00
Joel Becker	8b06bc592e	ocfs2: Grow discontig block groups in one transaction. Rather than extending the transaction every time we add an extent to a discontiguous block group, we grab enough credits to fill the extent list up front. This means we can free the bits in the same transaction if we end up not getting enough space. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-26 10:09:29 +08:00
Joel Becker	2b6cb576aa	ocfs2: Set suballoc_loc on allocated metadata. Get the suballoc_loc from ocfs2_claim_new_inode() or ocfs2_claim_metadata(). Store it on the appropriate field of the block we just allocated. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-26 10:09:15 +08:00
Joel Becker	ba2066351b	ocfs2: Return allocated metadata blknos on the ocfs2_suballoc_result. Rather than calculating the resulting block number, return it on the ocfs2_suballoc_result structure. This way we can calculate block numbers for discontiguous block groups. Cluster groups keep doing it the old way. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-26 10:08:59 +08:00

... 2 3 4 5 6 ...

1370 Commits