This patch adds another debugfs knob which switches UBIFS to R/O mode.
I needed it while trying to reproduce the 'first log node is not CS node'
bug. Without this debugfs knob you have to perform a power cut to reproduce
the bug. The knob is named 'ro_error' and all it does is set the
'ro_error' UBIFS flag, which makes UBIFS disallow any further writes - even
write-back will fail with -EROFS. Useful for debugging.
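For illustration only, here is a minimal userspace sketch of the behaviour
described above: one sticky flag that every write path checks first. The
names 'toy_fs', 'toy_set_ro_error()' and 'toy_write()' are hypothetical and
are not UBIFS code; the only things taken from the description are the flag
name and the -EROFS result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-filesystem state; only the flag matters here. */
struct toy_fs {
	bool ro_error;        /* sticky: set once, never cleared in this model */
	size_t bytes_written; /* counts bytes accepted before the switch */
};

/* What the knob amounts to in this model: flip the flag. */
void toy_set_ro_error(struct toy_fs *fs)
{
	assert(fs != NULL);
	fs->ro_error = true;
}

/* Every write path checks the flag first and refuses with -EROFS. */
int toy_write(struct toy_fs *fs, size_t len)
{
	assert(fs != NULL);
	if (fs->ro_error)
		return -EROFS;
	fs->bytes_written += len;
	return 0;
}
```

In this model a write issued before toy_set_ro_error() succeeds, and every
write after it returns -EROFS without touching the byte counter.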
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
Fix the following compilation warning:
fs/ubifs/dir.c: In function 'ubifs_rename':
fs/ubifs/dir.c:972:15: warning: 'saved_nlink' may be used uninitialized
in this function
Use the 'uninitialized_var()' macro to get rid of this false-positive.
Artem: massaged the patch a bit.
Signed-off-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
UBIFS has a feature called "empty space fix-up", which is a quirk to work around
limitations of dumb flasher programs, namely those flashers that are unable
to skip NAND pages full of 0xFFs while flashing. With such flashers, the empty
space at the end of half-filled eraseblocks ends up unusable for UBIFS. This
feature is relatively new (introduced in v3.0).
The fix-up routine (fixup_free_space()) is executed only once at the very first
mount if the superblock has the 'space_fixup' flag set (can be done with -F
option of mkfs.ubifs). It basically reads all the UBIFS data and metadata and
writes it back to the same LEB. The routine assumes the image is pristine and
does not have anything in the journal.
There was a bug in 'fixup_free_space()' where it fixed up the log incorrectly.
All but one LEB of the log of a pristine file-system are empty. And one
contains just a commit start node. And 'fixup_free_space()' just unmapped this
LEB, which resulted in wiping the commit start node. As a result, some users
were unable to mount the file-system next time with the following symptom:
UBIFS error (pid 1): replay_log_leb: first log node at LEB 3:0 is not CS node
UBIFS error (pid 1): replay_log_leb: log error detected while replaying the log at LEB 3:0
The root-cause of this bug was that 'fixup_free_space()' wrongly assumed
that the beginning of empty space in the log head (c->lhead_offs) was known
on mount. However, this is not the case - it was always 0. UBIFS does not store
it in the master node and finds it out by scanning the log on every mount.
The fix is simple - just pass commit start node size instead of 0 to
'fixup_leb()'.
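A toy userspace model of why the offset matters, assuming (as the text above
implies) that a length of 0 means "nothing to keep, unmap the LEB". The
'toy_*' names, the 64-byte LEB and the 32-byte node size are all hypothetical
values chosen for the sketch; this is not the UBIFS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LEB_SIZE   64 /* hypothetical eraseblock size, bytes */
#define TOY_CS_NODE_SZ 32 /* hypothetical commit start node size, bytes */

struct toy_leb {
	unsigned char data[TOY_LEB_SIZE];
	int mapped; /* 0 once the LEB has been unmapped */
};

/*
 * Model of the fix-up step: 'used' is how many bytes at the start of the
 * LEB hold live data. Zero means there is nothing to keep, so the LEB is
 * unmapped; otherwise the live bytes stay and the tail is reset to 0xFF.
 */
void toy_fixup_leb(struct toy_leb *leb, size_t used)
{
	assert(leb != NULL && used <= TOY_LEB_SIZE);
	if (used == 0) {
		memset(leb->data, 0xFF, TOY_LEB_SIZE);
		leb->mapped = 0;
		return;
	}
	memset(leb->data + used, 0xFF, TOY_LEB_SIZE - used);
}

/* A pristine log head: one commit start node followed by empty space. */
void toy_init_log_head(struct toy_leb *leb)
{
	assert(leb != NULL);
	memset(leb->data, 0xFF, TOY_LEB_SIZE);
	memset(leb->data, 0xC5, TOY_CS_NODE_SZ); /* placeholder node bytes */
	leb->mapped = 1;
}
```

Calling toy_fixup_leb() with 0 on the log head wipes the placeholder node,
which corresponds to the mount failure quoted above; calling it with
TOY_CS_NODE_SZ keeps the node and only resets the empty tail.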
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
Cc: stable@vger.kernel.org [v3.0+]
Reported-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>
Reported-by: James Nute <newten82@gmail.com>
Pull CIFS fixes from Steve French.
* git://git.samba.org/sfrench/cifs-2.6:
cifs: always update the inode cache with the results from a FIND_*
cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
cifs: on CONFIG_HIGHMEM machines, limit the rsize/wsize to the kmap space
Initialise mid_q_entry before putting it on the pending queue
This renames CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND to encourage future
reuse of the capability in question in related cases.
Merge tag 'pm-post-3.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull a last-minute PM update from Rafael J. Wysocki:
"This renames CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND to encourage future
reuse of the capability in question in related cases."
* tag 'pm-post-3.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND
As discussed in
http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990,
the capability introduced in 4d7e30d98939a0340022ccd49325a3d70f7e0238
to govern EPOLLWAKEUP seems misnamed: this capability is about governing
the ability to suspend the system, not using a particular API flag
(EPOLLWAKEUP). We should make the name of the capability more general
to encourage reuse in related cases. (Whether or not this capability
should also be used to govern the use of /sys/power/wake_lock is a
question that needs to be separately resolved.)
This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure
that the old capability name doesn't make it out into the wild, could you
please apply and push up the tree to ensure that it is incorporated
for the 3.5 release.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
When we get back a FIND_FIRST/NEXT result, we have some info about the
dentry that we use to instantiate a new inode. We were ignoring and
discarding that info when we had an existing dentry in the cache.
Fix this by updating the inode in place when we find an existing dentry
and the uniqueid is the same.
Cc: <stable@vger.kernel.org> # .31.x
Reported-and-Tested-by: Andrew Bartlett <abartlet@samba.org>
Reported-by: Bill Robertson <bill_robertson@debortoli.com.au>
Reported-by: Dion Edwards <dion_edwards@debortoli.com.au>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:
crash> bt
PID: 2789 TASK: f02edaa0 CPU: 3 COMMAND: "fsx"
#0 [eed63cbc] schedule at c083c5b3
#1 [eed63d80] kmap_high at c0500ec8
#2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
#3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
#4 [eed63e50] do_writepages at c04f3e32
#5 [eed63e54] __filemap_fdatawrite_range at c04e152a
#6 [eed63ea4] filemap_fdatawrite at c04e1b3e
#7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
#8 [eed63ecc] do_sync_write at c052d202
#9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
EAX: 00000004 EBX: 00000003 ECX: abd73b73 EDX: 012a65c6
DS: 007b ESI: 012a65c6 ES: 007b EDI: 00000000
SS: 007b ESP: bf8db178 EBP: bf8db1f8 GS: 0033
CS: 0073 EIP: 40000424 ERR: 00000004 EFLAGS: 00000246
Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.
This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.
There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.
Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
We currently rely on being able to kmap all of the pages in an async
read or write request. If you're on a machine that has CONFIG_HIGHMEM
set then that kmap space is limited, sometimes to as low as 512 slots.
With 512 slots, we can only support up to a 2M r/wsize, and that's
assuming that we can get our greedy little hands on all of them. There
are other users however, so with a size that large it's possible we'll
end up stuck.
Since we can't handle a rsize or wsize larger than that currently, cap
those options at the number of kmap slots we have. We could consider
capping it even lower, but we currently default to a max of 1M. Might as
well allow those luddites on 32 bit arches enough rope to hang
themselves.
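The arithmetic behind the cap can be sketched as follows. This assumes one
page per kmap slot and a 4096-byte page, which is what "512 slots -> 2M"
implies; the function names are hypothetical and are not the names used in
the cifs code.

```c
#include <assert.h>
#include <stddef.h>

/* Largest request that fits if every kmap slot maps exactly one page. */
size_t toy_kmap_size_limit(size_t kmap_slots, size_t page_size)
{
	/* Guard against a zero page size and against multiplication overflow. */
	assert(page_size != 0 && kmap_slots <= (size_t)-1 / page_size);
	return kmap_slots * page_size;
}

/* Clamp a requested rsize/wsize to that limit. */
size_t toy_cap_io_size(size_t requested, size_t kmap_slots, size_t page_size)
{
	size_t limit = toy_kmap_size_limit(kmap_slots, page_size);

	return requested < limit ? requested : limit;
}
```

With 512 slots and 4096-byte pages the limit is 2097152 bytes (2M), so a
requested 4M wsize is clamped to 2M while the 1M default passes through
unchanged.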
A more robust fix would be to teach the send and receive routines how
to contend with an array of pages so we don't need to marshal up a kvec
array at all. That's a fairly significant overhaul though, so we'll need
this limit in place until that's ready.
Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
A user reported a crash in cifs_demultiplex_thread() caused by an
incorrectly set mid_q_entry->callback() function. It appears that the
callback assignment made in cifs_call_async() was not flushed back to
memory suggesting that a memory barrier was required here. Changing the
code to make sure that the mid_q_entry structure was completely
initialised before it was added to the pending queue fixes the problem.
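The ordering described above, sketched in a single-threaded toy model. The
'toy_*' types and functions are hypothetical; real concurrent code would
additionally publish the entry under a lock (or with a suitable barrier),
which this sketch does not attempt to show.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_callback_t)(void *);

struct toy_mid {
	toy_callback_t callback;
	void *callback_data;
	struct toy_mid *next;
};

struct toy_queue {
	struct toy_mid *head;
};

/* A do-nothing callback so the example has something to install. */
void toy_noop_callback(void *data)
{
	(void)data;
}

/* Fill in every field first; only then link the entry where others see it. */
void toy_queue_mid(struct toy_queue *q, struct toy_mid *mid,
		   toy_callback_t cb, void *data)
{
	assert(q != NULL && mid != NULL && cb != NULL);
	mid->callback = cb;
	mid->callback_data = data;
	mid->next = q->head;
	q->head = mid; /* publication point: the entry is complete by now */
}
```

Anything that walks the queue after toy_queue_mid() returns finds an entry
whose callback is already set, which is the property the fix relies on.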
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
- Really fix a cursor leak in xfs_alloc_ag_vextent_near
- Fix a performance regression related to doing allocation in workqueues
- Prevent recursion in xfs_buf_iorequest which is causing stack overflows
- Don't call xfs_bdstrat_cb in xfs_buf_iodone callbacks
Merge tag 'for-linus-v3.5-rc7' of git://oss.sgi.com/xfs/xfs
Pull xfs regression fixes from Ben Myers:
- Really fix a cursor leak in xfs_alloc_ag_vextent_near
- Fix a performance regression related to doing allocation in
workqueues
- Prevent recursion in xfs_buf_iorequest which is causing stack
overflows
- Don't call xfs_bdstrat_cb in xfs_buf_iodone callbacks
* tag 'for-linus-v3.5-rc7' of git://oss.sgi.com/xfs/xfs:
xfs: do not call xfs_bdstrat_cb in xfs_buf_iodone_callbacks
xfs: prevent recursion in xfs_buf_iorequest
xfs: don't defer metadata allocation to the workqueue
xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near
If a parent and child process open the two ends of a fifo, and the
child immediately exits, the parent may receive a SIGCHLD before its
open() returns. In that case, we need to make sure that open() will
return successfully after the SIGCHLD handler returns, instead of
throwing EINTR or being restarted. Otherwise, the restarted open()
would incorrectly wait for a second partner on the other end.
The following test demonstrates the EINTR that was wrongly thrown from
the parent’s open(). Change .sa_flags = 0 to .sa_flags = SA_RESTART
to see a deadlock instead, in which the restarted open() waits for a
second reader that will never come. (On my systems, this happens
pretty reliably within about 5 to 500 iterations. Others report that
it manages to loop ~forever sometimes; YMMV.)
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define CHECK(x) do if ((x) == -1) {perror(#x); abort();} while(0)

    void handler(int signum) {}

    int main()
    {
        struct sigaction act = {.sa_handler = handler, .sa_flags = 0};
        CHECK(sigaction(SIGCHLD, &act, NULL));
        CHECK(mknod("fifo", S_IFIFO | S_IRWXU, 0));
        for (;;) {
            int fd;
            pid_t pid;
            putc('.', stderr);
            CHECK(pid = fork());
            if (pid == 0) {
                CHECK(fd = open("fifo", O_RDONLY));
                _exit(0);
            }
            CHECK(fd = open("fifo", O_WRONLY));
            CHECK(close(fd));
            CHECK(waitpid(pid, NULL, 0));
        }
    }

This is what I suspect was causing the Git test suite to fail in
t9010-svn-fe.sh:
http://bugs.debian.org/678852
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Split inode_permission() into inode- and superblock-dependent parts.
This is aimed at unionmounts where the superblock from the upper layer has to
be checked rather than the superblock from the lower layer as the upper layer
may be writable, thus allowing an unwritable file from the lower layer to be
copied up and modified.
Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com> (Further development)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pass mount flags to sget() so that it can use them in initialising a new
superblock before the set function is called. They could also be passed to the
compare function.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add comments to the VFS mount-following family of functions describing what
the directions "up" and "down" mean and how ref counts are handled.
Signed-off-by: Valerie Aurora <vaurora@redhat.com> (Original author)
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
copy_tree() can theoretically fail in a case other than ENOMEM, but always
returns NULL which is interpreted by callers as -ENOMEM. Change it to return
an explicit error.
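For readers unfamiliar with the idiom, here is a userspace re-creation of the
pointer-encoded error convention that makes an explicit error possible where
NULL could only mean one thing. The 'toy_*' helpers mirror the general shape
of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are written from scratch for
this sketch, and toy_copy_tree() with its 'simulate_errno' parameter is purely
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the negative errno from an encoded pointer. */
long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top TOY_MAX_ERRNO values of the address space. */
int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_mount {
	int id;
};

/* 'simulate_errno' forces a specific failure; 0 means try to succeed. */
struct toy_mount *toy_copy_tree(int simulate_errno)
{
	struct toy_mount *m;

	assert(simulate_errno >= 0 && simulate_errno <= TOY_MAX_ERRNO);
	if (simulate_errno != 0)
		return toy_err_ptr(-simulate_errno);
	m = malloc(sizeof(*m));
	if (m == NULL)
		return toy_err_ptr(-ENOMEM);
	m->id = 1;
	return m;
}
```

A caller can now tell -EINVAL from -ENOMEM by checking toy_is_err() and then
toy_ptr_err(), instead of mapping every NULL to -ENOMEM.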
Also change clone_mnt() for consistency and because union mounts will add new
error cases.
Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
[AV: folded braino fix by Dan Carpenter]
Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Valerie Aurora <valerie.aurora@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Make the chown() and lchown() syscalls jump to the fchownat() syscall with the
appropriate extra arguments.
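The same relationship is visible from userspace, where POSIX defines chown()
and lchown() in terms of fchownat(). The sketch below is a userspace analogue
only, not the kernel change itself; the 'toy_*' wrapper names are
hypothetical, while fchownat(), AT_FDCWD and AT_SYMLINK_NOFOLLOW are the
standard POSIX interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* chown(): resolve relative to the current directory, follow symlinks. */
int toy_chown(const char *path, uid_t owner, gid_t group)
{
	assert(path != NULL);
	return fchownat(AT_FDCWD, path, owner, group, 0);
}

/* lchown(): same, but operate on a symlink itself rather than its target. */
int toy_lchown(const char *path, uid_t owner, gid_t group)
{
	assert(path != NULL);
	return fchownat(AT_FDCWD, path, owner, group, AT_SYMLINK_NOFOLLOW);
}
```

Both wrappers return -1 and set errno on failure, exactly as fchownat() does,
so a path that does not exist yields ENOENT from either of them.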
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
we want to take it out of mark_files_ro() reach *before* we start
checking if we ought to drop write access.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add a helper that abstracts out the jump to an already parsed struct path
from ->follow_link operation from procfs. Not only does this clean up
the code by moving the two sides of this game into a single helper, but
it also prepares for making struct nameidata private to namei.c
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Currently the non-nd_set_link based versions of ->follow_link are expected
to do a path_put(&nd->path) on failure. This calling convention is unexpected,
undocumented and doesn't match what the nd_set_link-based instances do.
Move the path_put out of the only non-nd_set_link based ->follow_link
instance into the caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) ->d_iput() is wrong here - what we do to inode is completely usual, it's
dentry->d_fsdata that we want to drop. Just use ->d_release().
b) switch to ->s_d_op - no need to play with d_set_d_op()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
all callers want the same thing, actually - a kinda-sorta analog of
kern_path_create(). I.e. they want parent vfsmount/dentry (with
->i_mutex held, to make sure the child dentry is still their child)
+ the child dentry.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Since commit 197e37d9, the banner comment on lookup_open() no longer matches
what the function returns. It used to return a struct file pointer or NULL and
now it returns an integer and is passed the struct file pointer it is to use
amongst its arguments. Update the comment to reflect this.
Also add a banner comment to atomic_open().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just pass struct file *. Methods are happier that way...
There's no need to return struct file * from finish_open() now,
so let it return int. Next: saner prototypes for parts in
namei.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Change of calling conventions:
	old		new
	NULL		1
	file		0
	ERR_PTR(-ve)	-ve
Caller *knows* that struct file *; no need to return it.
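The mapping in the table can be written out as a small userspace function,
purely as an illustration. toy_new_from_old() is a hypothetical name, and the
error-pointer test is a from-scratch stand-in for the kernel's IS_ERR(), using
the usual convention that the top 4095 pointer values encode negative errnos.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Translate an old-style pointer result into the new integer result. */
int toy_new_from_old(const void *old)
{
	uintptr_t v = (uintptr_t)old;

	if (old == NULL)
		return 1;                /* old NULL          -> new 1   */
	if (v >= (uintptr_t)-TOY_MAX_ERRNO)
		return (int)(intptr_t)v; /* old ERR_PTR(-ve)  -> new -ve */
	return 0;                        /* old valid file    -> new 0   */
}
```

The valid-pointer case collapses to 0 because the caller already holds the
struct file pointer and does not need it handed back.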
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>