linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-01-11 15:40:50 +00:00

Author	SHA1	Message	Date
Chuck Lever	b6a85258d8	NFS: Fix warning introduced by NFSv4.0 transport blocking patches When CONFIG_NFS_V4_1 is not enabled, gcc emits this warning: linux/fs/nfs/nfs4state.c:255:12: warning: ‘nfs4_begin_drain_session’ defined but not used [-Wunused-function] static int nfs4_begin_drain_session(struct nfs_client *clp) ^ Eventually NFSv4.0 migration recovery will invoke this function, but that has not yet been merged. Hide nfs4_begin_drain_session() behind CONFIG_NFS_V4_1 for now. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-04 12:26:30 -04:00
Chuck Lever	1cec16abf2	When CONFIG_NFS_V4_1 is not enabled, "make C=2" emits this warning: linux/fs/nfs/nfs4session.c:337:6: warning: symbol 'nfs41_set_target_slotid' was not declared. Should it be static? Move nfs41_set_target_slotid() and nfs41_update_target_slotid() back behind CONFIG_NFS_V4_1, since, in the final revision of this work, they are used only in NFSv4.1 and later. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-04 12:26:30 -04:00
Dong Fang	05726acabe	fuse: use list_for_each_entry() for list traversing Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2013-09-04 17:42:42 +02:00
Bob Peterson	068213f7d3	GFS2: Remove unnecessary memory barrier Function test_and_clear_bit implies a memory barrier, so subsequent memory barriers are unnecessary. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2013-09-04 15:58:21 +01:00
Christoph Hellwig	02afc27fae	direct-io: Handle O_(D)SYNC AIO Call generic_write_sync() from the deferred I/O completion handler if O_DSYNC is set for a write request. Also make sure various callers don't call generic_write_sync if the direct I/O code returns -EIOCBQUEUED. Based on an earlier patch from Jan Kara <jack@suse.cz> with updates from Jeff Moyer <jmoyer@redhat.com> and Darrick J. Wong <darrick.wong@oracle.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-04 09:23:46 -04:00
Christoph Hellwig	7b7a8665ed	direct-io: Implement generic deferred AIO completions Add support to the core direct-io code to defer AIO completions to user context using a workqueue. This replaces opencoded and less efficient code in XFS and ext4 (we save a memory allocation for each direct IO) and will be needed to properly support O_(D)SYNC for AIO. The communication between the filesystem and the direct I/O code requires a new buffer head flag, which is a bit ugly but not avoidable until the direct I/O code stops abusing the buffer_head structure for communicating with the filesystems. Currently this creates a per-superblock unbound workqueue for these completions, which is taken from an earlier patch by Jan Kara. I'm not really convinced about this use and would prefer a "normal" global workqueue with a high concurrency limit, but this needs further discussion. JK: Fixed ext4 part, dynamic allocation of the workqueue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-04 09:23:46 -04:00
Linus Torvalds	f83b0a4e4c	Big part of this is the addition of compression to the generic pstore layer so that all backends can use the pitiful amounts of storage they control more effectively. Three other small fixes/cleanups too. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJSJhMJAAoJEKurIx+X31iBcpsP/1DfQDD9IHEt7E+tTObIdKOx 1yygmQu3IL2pvdUIkKRfblDP0PJRuwJmRPdp48Rb5CV+8i4WQuTCueOvNomIMZ4G 1DkAzJPD2W4C+dSBD3rvWk4totXCsqLwCkaWp/FzWM7OW+cJN3Kj3Vdh3k01CVxn PqfCoZB5y8CVVd6EQiUmAU8z/LoGE0psxxO/CL7lBC1IsAlVqSO93hqHwE9cm/f4 7EN069sHbKV5zOT1QsV7agn9TfHAHuj0fr3t8aDs/cC0rgz19VAsxg0sRkMHzKJD MGLqGCTHyQvYmo2xfHZLb1NjhL6EmHsQqhM9KQnkdDyQ3QgYOCSsglcYygSdZcN1 ZjynLbD/OSkLABefyw/ZjW1GGzIqU6jWzj+0fmbkUa0Pe1B1aDuf/OiAcYMRJdmg clWE7RbnisSlrLYl3MI36T4XvyzzHVfumOVET6foWcnVb8/t/gHhC6Yf96FSUWVj jISE8ECfjkk7SQTYHE0ThvBmqv8M/IwUEg6wA9tnsQByLMAO+VqGoXcBMKOxz+hD 7TPoQpa04Sqnplwx7HefbhErKH9TCQWUOgT5kjB7+bzbPpEG1cB3pb7izg3VFd7k l7d6vfXtAukeuYmU8PgY/ea6/Nnw7iMcDZCegzv+A4Ms/kWM9oH2ctvV6V/edYyb 0x7BgSOu69HztW5IveB6 =6H8r -----END PGP SIGNATURE----- Merge tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore changes from Tony Luck: "A big part of this is the addition of compression to the generic pstore layer so that all backends can use the pitiful amounts of storage they control more effectively. Three other small fixes/cleanups too. * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore/ram: (really) fix undefined usage of rounddown_pow_of_two pstore/ram: Read and write to the 'compressed' flag of pstore efi-pstore: Read and write to the 'compressed' flag of pstore erst: Read and write to the 'compressed' flag of pstore powerpc/pseries: Read and write to the 'compressed' flag of pstore pstore: Add file extension to pstore file if compressed pstore: Add decompression support to pstore pstore: Introduce new argument 'compressed' in the read callback pstore: Add compression support to pstore pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected pstore: Add new argument 'compressed' in pstore write callback powerpc/pseries: Remove (de)compression in nvram with pstore enabled pstore: d_alloc_name() doesn't return an ERR_PTR acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()	2013-09-03 21:14:06 -07:00
Al Viro	173c84012a	switch fchmod() to fdget Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 23:04:45 -04:00
Al Viro	7e3fb5842e	switch epoll_ctl() to fdget Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 23:04:44 -04:00
Al Viro	e95c311e17	git simplify nilfs check for busy subtree Reviewed-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:52:50 -04:00
Al Viro	badcf2b7b8	constify touch_atime() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:52:45 -04:00
Jeff Layton	8033426e6b	vfs: allow umount to handle mountpoints without revalidating them Christopher reported a regression where he was unable to unmount a NFS filesystem where the root had gone stale. The problem is that d_revalidate handles the root of the filesystem differently from other dentries, but d_weak_revalidate does not. We could simply fix this by making d_weak_revalidate return success on IS_ROOT dentries, but there are cases where we do want to revalidate the root of the fs. A umount is really a special case. We generally aren't interested in anything but the dentry and vfsmount that's attached at that point. If the inode turns out to be stale we just don't care since the intent is to stop using it anyway. Try to handle this situation better by treating umount as a special case in the lookup code. Have it resolve the parent using normal means, and then do a lookup of the final dentry without revalidating it. In most cases, the final lookup will come out of the dcache, but the case where there's a trailing symlink or !LAST_NORM entry on the end complicates things a bit. Cc: Neil Brown <neilb@suse.de> Reported-by: Christopher T Vogan <cvogan@us.ibm.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:50:29 -04:00
Yan, Zheng	590fb51f1c	vfs: call d_op->d_prune() before unhashing dentry The d_prune dentry operation is used to notify filesystem when VFS about to prune a hashed dentry from the dcache. There are three code paths that prune dentries: shrink_dcache_for_umount_subtree(), prune_dcache_sb() and d_prune_aliases(). For the d_prune_aliases() case, VFS unhashes the dentry first, then call the d_prune dentry operation. This confuses ceph_d_prune() (ceph uses the d_prune dentry operation to maintain a flag indicating whether the complete contents of a directory are in the dcache, pruning unhashed dentry does not affect dir's completeness) This patch fixes the issue by calling the d_prune dentry operation in d_prune_aliases(), before unhashing the dentry. Also make VFS only call the d_prune dentry operation for hashed dentry, to avoid calling the d_prune dentry operation twice when dentry is pruned by d_prune_aliases(). Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:50:28 -04:00
Al Viro	184cacabe2	only regular files with FMODE_WRITE need to be on s_files Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:50:28 -04:00
Al Viro	301f0268b6	nfsd: racy access to ->d_name in nsfd4_encode_path() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:50:28 -04:00
Linus Torvalds	32dad03d16	Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on the cgroup front. Most changes aren't visible to userland at all at this point and are laying foundation for the planned unified hierarchy. - The biggest change is decoupling the lifetime management of css (cgroup_subsys_state) from that of cgroup's. Because controllers (cpu, memory, block and so on) will need to be dynamically enabled and disabled, css which is the association point between a cgroup and a controller may come and go dynamically across the lifetime of a cgroup. Till now, css's were created when the associated cgroup was created and stayed till the cgroup got destroyed. Assumptions around this tight coupling permeated through cgroup core and controllers. These assumptions are gradually removed, which consists bulk of patches, and css destruction path is completely decoupled from cgroup destruction path. Note that decoupling of creation path is relatively easy on top of these changes and the patchset is pending for the next window. - cgroup has its own event mechanism cgroup.event_control, which is only used by memcg. It is overly complex trying to achieve high flexibility whose benefits seem dubious at best. Going forward, new events will simply generate file modified event and the existing mechanism is being made specific to memcg. This pull request contains prepatory patches for such change. - Various fixes and cleanups" Fixed up conflict in kernel/cgroup.c as per Tejun. * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits) cgroup: fix cgroup_css() invocation in css_from_id() cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp() cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup cgroup: implement CFTYPE_NO_PREFIX cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax cgroup: fix cgroup_write_event_control() cgroup: fix subsystem file accesses on the root cgroup cgroup: change cgroup_from_id() to css_from_id() cgroup: use css_get() in cgroup_create() to check CSS_ROOT cpuset: remove an unncessary forward declaration cgroup: RCU protect each cgroup_subsys_state release cgroup: move subsys file removal to kill_css() cgroup: factor out kill_css() cgroup: decouple cgroup_subsys_state destruction from cgroup destruction cgroup: replace cgroup->css_kill_cnt with ->nr_css cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item cgroup: move cgroup->subsys[] assignment to online_css() cgroup: reorganize css init / exit paths cgroup: add __rcu modifier to cgroup->subsys[] ...	2013-09-03 18:25:03 -07:00
Dave Chinner	1d03c6fa88	xfs: XFS_MOUNT_QUOTA_ALL needed by userspace So move it to a header file shared with userspace. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-09-03 15:00:06 -05:00
Dave Chinner	50fc5f7acc	xfs: dtype changed xfs_dir2_sfe_put_ino to xfs_dir3_sfe_put_ino So fix up the export in xfs_dir2.h that is needed by userspace. <sigh> Now xfs_dir3_sfe_put_ino has been made static. Revert 98f7462 ("xfs: xfs_dir3_sfe_put_ino can be static") to being non static so that the code shared with userspace is identical again. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>	2013-09-03 14:51:16 -05:00
Chuck Lever	2cf8bca8b9	NFS: Update session draining barriers for NFSv4.0 transport blocking Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:38 -04:00
Chuck Lever	be05c860d7	NFS: Add nfs4_sequence calls for OPEN_CONFIRM Ensure OPEN_CONFIRM is not emitted while the transport is plugged. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:37 -04:00
Chuck Lever	fbd4bfd1d9	NFS: Add nfs4_sequence calls for RELEASE_LOCKOWNER Ensure RELEASE_LOCKOWNER is not emitted while the transport is plugged. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:37 -04:00
Chuck Lever	160881e33d	NFS: Enable nfs4_setup_sequence() for DELEGRETURN When CONFIG_NFS_V4_1 is disabled, the calls to nfs4_setup_sequence() and nfs4_sequence_done() are compiled out for the DELEGRETURN operation. To allow NFSv4.0 transport blocking to work for DELEGRETURN, these call sites have to be present all the time. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:36 -04:00
Chuck Lever	3bd2384a77	NFS: NFSv4.0 transport blocking Plumb in a mechanism for plugging an NFSv4.0 mount, using the same infrastructure as NFSv4.1 sessions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:35 -04:00
Chuck Lever	abf79bb341	NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking Anchor an nfs4_slot_table in the nfs_client for use with NFSv4.0 transport blocking. It is initialized only for NFSv4.0 nfs_client's. Introduce appropriate minor version ops to handle nfs_client initialization and shutdown requirements that differ for each minor version. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:35 -04:00
Chuck Lever	eb2a1cd3c9	NFS: Add global helper for releasing slot table resources The nfs4_destroy_slot_tables() function is renamed to avoid confusion with the new helper. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:34 -04:00
Chuck Lever	744aa52530	NFS: Add global helper to set up a stand-along nfs4_slot_table Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:34 -04:00
Chuck Lever	9d33059c1b	NFS: Enable slot table helpers for NFSv4.0 I'd like to re-use NFSv4.1's slot table machinery for NFSv4.0 transport blocking. Re-organize some of nfs4session.c so the slot table code is built even when NFS_V4_1 is disabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:33 -04:00
Chuck Lever	220e09ccd3	NFS: Remove unused call_sync minor version op Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:33 -04:00
Chuck Lever	9915ea7e0a	NFS: Add RPC callouts to start NFSv4.0 synchronous requests Refactor nfs4_call_sync_sequence() so it is used for NFSv4.0 now. The RPC callouts will house transport blocking logic similar to NFSv4.1 sessions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:32 -04:00
Chuck Lever	a9c92d6b85	NFS: Common versions of sequence helper functions NFSv4.0 will have need for this functionality when I add the ability to block NFSv4.0 traffic before migration recovery. I'm not really clear on why nfs4_set_sequence_privileged() gets a generic name, but nfs41_init_sequence() gets a minor version-specific name. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:32 -04:00
Chuck Lever	5a580e0ae2	NFS: Clean up nfs4_setup_sequence() Clean up: Both the NFSv4.0 and NFSv4.1 version of nfs4_setup_sequence() are used only in fs/nfs/nfs4proc.c. No need to keep global header declarations for either version. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:31 -04:00
Chuck Lever	2a3eb2b97b	NFS: Rename nfs41_call_sync_data as a common data structure Clean up: rename nfs41_call_sync_data for use as a data structure common to all NFSv4 minor versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:31 -04:00
Chuck Lever	e8d92382dd	NFS: When displaying session slot numbers, use "%u" consistently Clean up, since slot and sequence numbers are all unsigned anyway. Among other things, squelch compiler warnings: linux/fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’: linux/fs/nfs/nfs4proc.c:703:2: warning: signed and unsigned type in conditional expression [-Wsign-compare] and linux/fs/nfs/nfs4session.c: In function ‘nfs4_alloc_slot’: linux/fs/nfs/nfs4session.c:151:31: warning: signed and unsigned type in conditional expression [-Wsign-compare] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:30 -04:00
Trond Myklebust	ba6c05928d	NFS: Ensure that rmdir() waits for sillyrenames to complete If an NFS client does mkdir("dir"); fd = open("dir/file"); unlink("dir/file"); close(fd); rmdir("dir"); then the asynchronous nature of the sillyrename operation means that we can end up getting EBUSY for the rmdir() in the above test. Fix that by ensuring that we wait for any in-progress sillyrenames before sending the rmdir() to the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:26:29 -04:00
Weston Andros Adamson	a5250def7c	NFSv4: use the mach cred for SECINFO w/ integrity Commit 5ec16a8500d339b0e7a0cc76b785d18daad354d4 introduced a regression that causes SECINFO to fail without actualy sending an RPC if: 1) the nfs_client's rpc_client was using KRB5i/p (now tried by default) 2) the current user doesn't have valid kerberos credentials This situation is quite common - as of now a sec=sys mount would use krb5i for the nfs_client's rpc_client and a user would hardly be faulted for not having run kinit. The solution is to use the machine cred when trying to use an integrity protected auth flavor for SECINFO. Older servers may not support using the machine cred or an integrity protected auth flavor for SECINFO in every circumstance, so we fall back to using the user's cred and the filesystem's auth flavor in this case. We run into another problem when running against linux nfs servers - they return NFS4ERR_WRONGSEC when using integrity auth flavor (unless the mount is also that flavor) even though that is not a valid error for SECINFO*. Even though it's against spec, handle WRONGSEC errors on SECINFO by falling back to using the user cred and the filesystem's auth flavor. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:25:10 -04:00
Andy Adamson	dc24826bfc	NFS avoid expired credential keys for buffered writes We must avoid buffering a WRITE that is using a credential key (e.g. a GSS context key) that is about to expire or has expired. We currently will paint ourselves into a corner by returning success to the applciation for such a buffered WRITE, only to discover that we do not have permission when we attempt to flush the WRITE (and potentially associated COMMIT) to disk. Use the RPC layer credential key timeout and expire routines which use a a watermark, gss_key_expire_timeo. We test the key in nfs_file_write. If a WRITE is using a credential with a key that will expire within watermark seconds, flush the inode in nfs_write_end and send only NFS_FILE_SYNC WRITEs by adding nfs_ctx_key_to_expire to nfs_need_sync_write. Note that this results in single page NFS_FILE_SYNC WRITEs. Signed-off-by: Andy Adamson <andros@netapp.com> [Trond: removed a pr_warn_ratelimited() for now] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-03 15:25:09 -04:00
Linus Torvalds	542a086ac7	Driver core patches for 3.12-rc1 Here's the big driver core pull request for 3.12-rc1. Lots of tiny changes here fixing up the way sysfs attributes are created, to try to make drivers simpler, and fix a whole class race conditions with creations of device attributes after the device was announced to userspace. All the various pieces are acked by the different subsystem maintainers. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (GNU/Linux) iEYEABECAAYFAlIlIPcACgkQMUfUDdst+ynUMwCaAnITsxyDXYQ4DqEsz8EcOtMk 718AoLrgnUZs3B+70AT34DVktg4HSThk =USl9 -----END PGP SIGNATURE----- Merge tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg KH: "Here's the big driver core pull request for 3.12-rc1. Lots of tiny changes here fixing up the way sysfs attributes are created, to try to make drivers simpler, and fix a whole class race conditions with creations of device attributes after the device was announced to userspace. All the various pieces are acked by the different subsystem maintainers" * tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits) firmware loader: fix pending_fw_head list corruption drivers/base/memory.c: introduce help macro to_memory_block dynamic debug: line queries failing due to uninitialized local variable sysfs: sysfs_create_groups returns a value. debugfs: provide debugfs_create_x64() when disabled rbd: convert bus code to use bus_groups firmware: dcdbas: use binary attribute groups sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled driver core: add #include <linux/sysfs.h> to core files. HID: convert bus code to use dev_groups Input: serio: convert bus code to use drv_groups Input: gameport: convert bus code to use drv_groups driver core: firmware: use __ATTR_RW() driver core: core: use DEVICE_ATTR_RO driver core: bus: use DRIVER_ATTR_WO() driver core: create write-only attribute macros for devices and drivers sysfs: create __ATTR_WO() driver-core: platform: convert bus code to use dev_groups workqueue: convert bus code to use dev_groups MEI: convert bus code to use dev_groups ...	2013-09-03 11:37:15 -07:00
Linus Torvalds	fc6d0b0376	Merge branch 'lockref' (locked reference counts) Merge lockref infrastructure code by me and Waiman Long. I already merged some of the preparatory patches that didn't actually do any semantic changes earlier, but this merges the actual _reason_ for those preparatory patches. The "lockref" structure is a combination "spinlock and reference count" that allows optimized reference count accesses. In particular, it guarantees that the reference count will be updated AS IF the spinlock was held, but using atomic accesses that cover both the reference count and the spinlock words, we can often do the update without actually having to take the lock. This allows us to avoid the nastiest cases of spinlock contention on large machines under heavy pathname lookup loads. When updating the dentry reference counts on a large system, we'll still end up with the cache line bouncing around, but that's much less noticeable than actually having to spin waiting for the lock. * lockref: lockref: implement lockless reference count updates using cmpxchg() lockref: uninline lockref helper functions vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock() vfs: use lockref_get_not_zero() for optimistic lockless dget_parent() lockref: add 'lockref_get_or_lock() helper	2013-09-03 08:08:21 -07:00
Miklos Szeredi	efeb9e60d4	fuse: readdir: check for slash in names Userspace can add names containing a slash character to the directory listing. Don't allow this as it could cause all sorts of trouble. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org	2013-09-03 14:28:38 +02:00
Maxim Patlasov	06a7c3c278	fuse: hotfix truncate_pagecache() issue The way how fuse calls truncate_pagecache() from fuse_change_attributes() is completely wrong. Because, w/o i_mutex held, we never sure whether 'oldsize' and 'attr->size' are valid by the time of execution of truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we released fc->lock in the middle of fuse_change_attributes(), we completely loose control of actions which may happen with given inode until we reach truncate_pagecache. The list of potentially dangerous actions includes mmap-ed reads and writes, ftruncate(2) and write(2) extending file size. The typical outcome of doing truncate_pagecache() with outdated arguments is data corruption from user point of view. This is (in some sense) acceptable in cases when the issue is triggered by a change of the file on the server (i.e. externally wrt fuse operation), but it is absolutely intolerable in scenarios when a single fuse client modifies a file without any external intervention. A real life case I discovered by fsx-linux looked like this: 1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends FUSE_SETATTR to the server synchronously, but before getting fc->lock ... 2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP to the server synchronously, then calls fuse_change_attributes(). The latter updates i_size, releases fc->lock, but before comparing oldsize vs attr->size.. 3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and updating attributes and i_size, but now oldsize is equal to outarg.attr.size because i_size has just been updated (step 2). Hence, fuse_do_setattr() returns w/o calling truncate_pagecache(). 4. As soon as ftruncate(2) completes, the user extends file size by write(2) making a hole in the middle of file, then reads data from the hole either by read(2) or mmap-ed read. The user expects to get zero data from the hole, but gets stale data because truncate_pagecache() is not executed yet. The scenario above illustrates one side of the problem: not truncating the page cache even though we should. Another side corresponds to truncating page cache too late, when the state of inode changed significantly. Theoretically, the following is possible: 1. As in the previous scenario fuse_dentry_revalidate() discovered that i_size changed (due to our own fuse_do_setattr()) and is going to call truncate_pagecache() for some 'new_size' it believes valid right now. But by the time that particular truncate_pagecache() is called ... 2. fuse_do_setattr() returns (either having called truncate_pagecache() or not -- it doesn't matter). 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2). 4. mmap-ed write makes a page in the extended region dirty. The result will be the lost of data user wrote on the fourth step. The patch is a hotfix resolving the issue in a simplistic way: let's skip dangerous i_size update and truncate_pagecache if an operation changing file size is in progress. This simplistic approach looks correct for the cases w/o external changes. And to handle them properly, more sophisticated and intrusive techniques (e.g. NFS-like one) would be required. I'd like to postpone it until the issue is well discussed on the mailing list(s). Changed in v2: - improved patch description to cover both sides of the issue. Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org	2013-09-03 13:41:58 +02:00
Anand Avati	d331a415ae	fuse: invalidate inode attributes on xattr modification Calls like setxattr and removexattr result in updation of ctime. Therefore invalidate inode attributes to force a refresh. Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org	2013-09-03 13:41:58 +02:00
Maxim Patlasov	4a4ac4eba1	fuse: postpone end_page_writeback() in fuse_writepage_locked() The patch fixes a race between ftruncate(2), mmap-ed write and write(2): 1) An user makes a page dirty via mmap-ed write. 2) The user performs shrinking truncate(2) intended to purge the page. 3) Before fuse_do_setattr calls truncate_pagecache, the page goes to writeback. fuse_writepage_locked fills FUSE_WRITE request and releases the original page by end_page_writeback. 4) fuse_do_setattr() completes and successfully returns. Since now, i_mutex is free. 5) Ordinary write(2) extends i_size back to cover the page. Note that fuse_send_write_pages do wait for fuse writeback, but for another page->index. 6) fuse_writepage_locked proceeds by queueing FUSE_WRITE request. fuse_send_writepage is supposed to crop inarg->size of the request, but it doesn't because i_size has already been extended back. Moving end_page_writeback to the end of fuse_writepage_locked fixes the race because now the fact that truncate_pagecache is successfully returned infers that fuse_writepage_locked has already called end_page_writeback. And this, in turn, infers that fuse_flush_writepages has already called fuse_send_writepage, and the latter used valid (shrunk) i_size. write(2) could not extend it because of i_mutex held by ftruncate(2). Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org	2013-09-03 13:41:57 +02:00
Jaegeuk Kim	222cbdc483	f2fs: avoid an overflow during utilization calculation The current f2fs uses all the block counts with 32 bit numbers, which is able to cover about 15TB volume. But in calculation of utilization, f2fs multiplies the count by 100 which can induce overflow. This patch fixes this. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-09-03 13:41:37 +09:00
Jaegeuk Kim	c34e333fd5	f2fs: trigger GC when there are prefree segments Previously, f2fs conducts SSR when free_sections() < overprovision_sections. But, even though there are a lot of prefree segments, it can consider SSR only. So, let's consider the number of prefree segments too for triggering SSR. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-09-03 10:11:20 +09:00
Linus Torvalds	15570086b5	vfs: reimplement d_rcu_to_refcount() using lockref_get_or_lock() This moves __d_rcu_to_refcount() from <linux/dcache.h> into fs/namei.c and re-implements it using the lockref infrastructure instead. It also adds a lot of comments about what is actually going on, because turning a dentry that was looked up using RCU into a long-lived reference counted entry is one of the more subtle parts of the rcu walk. We also used to be _particularly_ subtle in unlazy_walk() where we re-validate both the dentry and its parent using the same sequence count. We used to do it by nesting the locks and then verifying the sequence count just once. That was silly, because nested locking is expensive, but the sequence count check is not. So this just re-validates the dentry and the parent separately, avoiding the nested locking, and making the lockref lookup possible. Acked-by: Waiman Long <waiman.long@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-02 11:38:06 -07:00
Waiman Long	df3d0bbcdb	vfs: use lockref_get_not_zero() for optimistic lockless dget_parent() A valid parent pointer is always going to have a non-zero reference count, but if we look up the parent optimistically without locking, we have to protect against the (very unlikely) race against renaming changing the parent from under us. We do that by using lockref_get_not_zero(), and then re-checking the parent pointer after getting a valid reference. [ This is a re-implementation of a chunk from the original patch by Waiman Long: "dcache: Enable lockless update of dentry's refcount". I've completely rewritten the patch-series and split it up, but I'm attributing this part to Waiman as it's close enough to his earlier patch - Linus ] Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-02 11:29:22 -07:00
Trond Myklebust	2127d82af3	NFSv4: Convert idmapper to use the new framework for pipefs dentries Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-09-01 11:12:42 -04:00
Eric W. Biederman	c7b96acf14	userns: Kill nsown_capable it makes the wrong thing easy nsown_capable is a special case of ns_capable essentially for just CAP_SETUID and CAP_SETGID. For the existing users it doesn't noticably simplify things and from the suggested patches I have seen it encourages people to do the wrong thing. So remove nsown_capable. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-08-30 23:44:11 -07:00
Maxime Bizon	3bd11cf56e	pstore/ram: (really) fix undefined usage of rounddown_pow_of_two Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170 Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is false (documented behaviour). Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-08-30 15:57:01 -07:00
J. Bruce Fields	248f807b47	nfsd4: nfsd4_create_clid_dir prints uninitialized data Take the easy way out and just remove the printk. Reported-by: David Howells <dhowells@redhat.com>	2013-08-30 17:30:52 -04:00

1 2 3 4 5 ...

33155 Commits