33432 Commits

Author SHA1 Message Date
Steve French
77993be3f3 [CIFS] quiet sparse compile warning
Jeff's patchset introduced trivial sparse warning on new cifs toupper routine

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Jeff Layton <jlayton@redhat.com>
2013-09-08 14:54:24 -05:00
Shirish Pargaonkar
32811d242f cifs: Start using per session key for smb2/3 for signature generation
Switch smb2 code to use per session session key and smb3 code to
    use per session signing key instead of per connection key to
    generate signatures.

    For that, we need to find a session to fetch the session key to
    generate signature to match for every request and response packet.

    We also forgo checking signature for a session setup response
    from the server.

Acked-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:47:50 -05:00
Shirish Pargaonkar
5c234aa5e3 cifs: Add a variable specific to NTLMSSP for key exchange.
Add a variable specific to NTLMSSP authentication to determine
whether to exchange keys during negotiation and authentication phases.

Since session key for smb1 is per smb connection, once a very first
sesion is established, there is no need for key exchange during
subsequent session setups. As a result, smb1 session setup code sets this
variable as false.

Since session key for smb2 and smb3 is per smb connection, we need to
exchange keys to generate session key for every sesion being established.
As a result, smb2/3 session setup code sets this variable as true.

Acked-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:47:49 -05:00
Shirish Pargaonkar
d4e63bd6e4 cifs: Process post session setup code in respective dialect functions.
Move the post (successful) session setup code to respective dialect routines.

For smb1, session key is per smb connection.
For smb2/smb3, session key is per smb session.

If client and server do not require signing, free session key for smb1/2/3.

If client and server require signing
  smb1 - Copy (kmemdup) session key for the first session to connection.
         Free session key of that and subsequent sessions on this connection.
  smb2 - For every session, keep the session key and free it when the
         session is being shutdown.
  smb3 - For every session, generate the smb3 signing key using the session key
         and then free the session key.

There are two unrelated line formatting changes as well.

Reviewed-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:47:47 -05:00
Wei Yongjun
31f92e9a87 CIFS: convert to use le32_add_cpu()
Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:47:43 -05:00
Pavel Shilovsky
933d4b3657 CIFS: Fix missing lease break
If a server sends a lease break to a connection that doesn't have
opens with a lease key specified in the server response, we can't
find an open file to send an ack. Fix this by walking through
all connections we have.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:41:43 -05:00
Pavel Shilovsky
1a05096de8 CIFS: Fix a memory leak when a lease break comes
This happens when we receive a lease break from a server, then
find an appropriate lease key in opened files and schedule the
oplock_break slow work. lw pointer isn't freed in this case.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:41:40 -05:00
Jeff Layton
ec71e0e159 cifs: convert case-insensitive dentry ops to use new case conversion routines
Have the case-insensitive d_compare and d_hash routines convert each
character in the filenames to wchar_t's and then use the new
cifs_toupper routine to convert those into uppercase.

With this scheme we should more closely emulate the case conversion that
the servers will do.

Reported-and-Tested-by: Jan-Marek Glogowski <glogow@fbihome.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:38:08 -05:00
Jeff Layton
c2ccf53dd0 cifs: add new case-insensitive conversion routines that are based on wchar_t's
The existing NLS case conversion routines do not appropriately handle
the (now common) case where the local host is using UTF8. This is
because nls_utf8 has no support at all for converting a utf8 string
between cases and the NLS infrastructure in general cannot handle
a multibyte input character.

In any case, what we really need for cifs is to emulate how we expect
the server to convert the character to upper or lowercase. Thus, even
if we had routines that could handle utf8 case conversion, we likely
would end up with the wrong result if the name ends up being in the
upper planes.

This patch adds a new scheme for doing unicode case conversion. The
case conversion tables that Microsoft has published for Windows 8
have been converted to a set of lookup tables, and a routine is
added to convert a wchar_t from lower to uppercase using those
tables.

Reported-and-Tested-by: Jan-Marek Glogowski <glogow@fbihome.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:38:05 -05:00
Scott Lovenberg
cdf1246ffb cifs: Move and expand MAX_SERVER_SIZE definition
MAX_SERVER_SIZE has been moved to cifs_mount.h and renamed
CIFS_NI_MAXHOST for clarity.  It has been expanded to 1024 as the
previous value of 16 was very short.

Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:34:22 -05:00
Scott Lovenberg
8c3a2b4c42 cifs: Move string length definitions to uapi
The max string length definitions for user name, domain name, password,
and share name have been moved into their own header file in uapi so the
mount helper can use autoconf to define them instead of keeping the
kernel side and userland side definitions in sync manually.  The names
have also been standardized with a "CIFS" prefix and "LEN" suffix.

Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:34:11 -05:00
Pavel Shilovsky
d244bf2dfb CIFS: Implement follow_link for nounix CIFS mounts
by using a query reparse ioctl request.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:27:41 -05:00
Pavel Shilovsky
b42bf88828 CIFS: Implement follow_link for SMB2
that allows to access files through symlink created on a server.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:27:34 -05:00
Jeff Layton
3ae35cde67 cifs: display iocharset= option in /proc/mounts
...but only if it's not the default charset.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:24:30 -05:00
Jeff Layton
30706a5454 cifs: create a new Documentation/ directory and move docfiles into it
Currently, we have a number of documentation files that live under
fs/cifs/. Generally, these don't get picked up by distro packagers,
since they're in a non-standard location. Move them to a new spot
under Documentation/ instead.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:24:10 -05:00
Jeff Layton
73e216a8a4 cifs: ensure that srv_mutex is held when dealing with ssocket pointer
Oleksii reported that he had seen an oops similar to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ipt_MASQUERADE xt_REDIRECT xt_tcpudp iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables carl9170 ath usb_storage f2fs nfnetlink_log nfnetlink md4 cifs dns_resolver hid_generic usbhid hid af_packet uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev rfcomm btusb bnep bluetooth qmi_wwan qcserial cdc_wdm usb_wwan usbnet usbserial mii snd_hda_codec_hdmi snd_hda_codec_realtek iwldvm mac80211 coretemp intel_powerclamp kvm_intel kvm iwlwifi snd_hda_intel cfg80211 snd_hda_codec xhci_hcd e1000e ehci_pci snd_hwdep sdhci_pci snd_pcm ehci_hcd microcode psmouse sdhci thinkpad_acpi mmc_core i2c_i801 pcspkr usbcore hwmon snd_timer snd_page_alloc snd ptp rfkill pps_core soundcore evdev usb_common vboxnetflt(O) vboxdrv(O)Oops#2 Part8
 loop tun binfmt_misc fuse msr acpi_call(O) ipv6 autofs4
CPU: 0 PID: 21612 Comm: kworker/0:1 Tainted: G        W  O 3.10.1SIGN #28
Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ET92WW (2.52 ) 02/22/2013
Workqueue: cifsiod cifs_echo_request [cifs]
task: ffff8801e1f416f0 ti: ffff880148744000 task.ti: ffff880148744000
RIP: 0010:[<ffffffff814dcc13>]  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
RSP: 0000:ffff880148745b00  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880148745b78 RCX: 0000000000000048
RDX: ffff880148745c90 RSI: ffff880181864a00 RDI: ffff880148745b78
RBP: ffff880148745c48 R08: 0000000000000048 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880181864a00
R13: ffff880148745c90 R14: 0000000000000048 R15: 0000000000000048
FS:  0000000000000000(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000088 CR3: 000000020c42c000 CR4: 00000000001407b0
Oops#2 Part7
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff880148745b30 ffffffff810c4af9 0000004848745b30 ffff880181864a00
 ffffffff81ffbc40 0000000000000000 ffff880148745c90 ffffffff810a5aab
 ffff880148745bc0 ffffffff81ffbc40 ffff880148745b60 ffffffff815a9fb8
Call Trace:
 [<ffffffff810c4af9>] ? finish_task_switch+0x49/0xe0
 [<ffffffff810a5aab>] ? lock_timer_base.isra.36+0x2b/0x50
 [<ffffffff815a9fb8>] ? _raw_spin_unlock_irqrestore+0x18/0x40
 [<ffffffff810a673f>] ? try_to_del_timer_sync+0x4f/0x70
 [<ffffffff815aa38f>] ? _raw_spin_unlock_bh+0x1f/0x30
 [<ffffffff814dcc87>] kernel_sendmsg+0x37/0x50
 [<ffffffffa081a0e0>] smb_send_kvec+0xd0/0x1d0 [cifs]
 [<ffffffffa081a263>] smb_send_rqst+0x83/0x1f0 [cifs]
 [<ffffffffa081ab6c>] cifs_call_async+0xec/0x1b0 [cifs]
 [<ffffffffa08245e0>] ? free_rsp_buf+0x40/0x40 [cifs]
Oops#2 Part6
 [<ffffffffa082606e>] SMB2_echo+0x8e/0xb0 [cifs]
 [<ffffffffa0808789>] cifs_echo_request+0x79/0xa0 [cifs]
 [<ffffffff810b45b3>] process_one_work+0x173/0x4a0
 [<ffffffff810b52a1>] worker_thread+0x121/0x3a0
 [<ffffffff810b5180>] ? manage_workers.isra.27+0x2b0/0x2b0
 [<ffffffff810bae00>] kthread+0xc0/0xd0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff815b199c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810bad40>] ? kthread_create_on_node+0x120/0x120
Code: 84 24 b8 00 00 00 4c 89 f1 4c 89 ea 4c 89 e6 48 89 df 4c 89 60 18 48 c7 40 28 00 00 00 00 4c 89 68 30 44 89 70 14 49 8b 44 24 28 <ff> 90 88 00 00 00 3d ef fd ff ff 74 10 48 8d 65 e0 5b 41 5c 41
 RIP  [<ffffffff814dcc13>] sock_sendmsg+0x93/0xd0
 RSP <ffff880148745b00>
CR2: 0000000000000088

The client was in the middle of trying to send a frame when the
server->ssocket pointer got zeroed out. In most places, that we access
that pointer, the srv_mutex is held. There's only one spot that I see
that the server->ssocket pointer gets set and the srv_mutex isn't held.
This patch corrects that.

The upstream bug report was here:

    https://bugzilla.kernel.org/show_bug.cgi?id=60557

Cc: <stable@vger.kernel.org>
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2013-09-08 14:24:07 -05:00
Al Viro
d040790391 prune_super(): sb->s_op is never NULL
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:56 -04:00
Al Viro
dfc59e2c90 exportfs: don't assume that ->iterate() won't feed us too long entries
On some filesystems it's impossible even with fs corruption, but we'd
better not rely on that, what with memcpy() into on-stack array we
are doing there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:55 -04:00
Al Viro
5d8943b04b afs: get rid of redundant ->d_name.len checks
No dentry can get to directory modification methods without
having passed either ->lookup() or ->atomic_open(); if name is
rejected by those two (or by ->d_hash()) with an error, it won't
be seen by anything else.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-07 19:54:55 -04:00
Weston Andros Adamson
b1b3e13694 NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
Commit 97431204ea005ec8070ac94bc3251e836daa7ca7 introduced a regression
that causes SECINFO_NO_NAME to fail without sending an RPC if:

 1) the nfs_client's rpc_client is using krb5i/p (now tried by default)
 2) the current user doesn't have valid kerberos credentials

This situation is quite common - as of now a sec=sys mount would use
krb5i for the nfs_client's rpc_client and a user would hardly be faulted
for not having run kinit.

The solution is to use the machine cred when trying to use an integrity
protected auth flavor for SECINFO_NO_NAME.

Older servers may not support using the machine cred or an integrity
protected auth flavor for SECINFO_NO_NAME in every circumstance, so we fall
back to using the user's cred and the filesystem's auth flavor in this case.

We run into another problem when running against linux nfs servers -
they return NFS4ERR_WRONGSEC when using integrity auth flavor (unless the
mount is also that flavor) even though that is not a valid error for
SECINFO*.  Even though it's against spec, handle WRONGSEC errors on
SECINFO_NO_NAME by falling back to using the user cred and the
filesystem's auth flavor.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 18:39:25 -04:00
Trond Myklebust
0aea92bf67 NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set
Also don't worry about obsolete mount flags...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 18:38:04 -04:00
Trond Myklebust
47040da3c7 NFSv4: Allow security autonegotiation for submounts
In cases where the parent super block was not mounted with a 'sec=' line,
allow autonegotiation of security for the submounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 17:52:42 -04:00
Trond Myklebust
41d058c3ba NFSv4: Disallow security negotiation for lookups when 'sec=' is specified
Ensure that nfs4_proc_lookup_common respects the NFS_MOUNT_SECFLAVOUR
flag.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 17:52:42 -04:00
Linus Torvalds
dc0755cdb1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 2 (of many) from Al Viro:
 "Mostly Miklos' series this time"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  constify dcache.c inlined helpers where possible
  fuse: drop dentry on failed revalidate
  fuse: clean up return in fuse_dentry_revalidate()
  fuse: use d_materialise_unique()
  sysfs: use check_submounts_and_drop()
  nfs: use check_submounts_and_drop()
  gfs2: use check_submounts_and_drop()
  afs: use check_submounts_and_drop()
  vfs: check unlinked ancestors before mount
  vfs: check submounts and drop atomically
  vfs: add d_walk()
  vfs: restructure d_genocide()
2013-09-07 14:36:57 -07:00
Linus Torvalds
c7c4591db6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace changes from Eric Biederman:
 "This is an assorted mishmash of small cleanups, enhancements and bug
  fixes.

  The major theme is user namespace mount restrictions.  nsown_capable
  is killed as it encourages not thinking about details that need to be
  considered.  A very hard to hit pid namespace exiting bug was finally
  tracked and fixed.  A couple of cleanups to the basic namespace
  infrastructure.

  Finally there is an enhancement that makes per user namespace
  capabilities usable as capabilities, and an enhancement that allows
  the per userns root to nice other processes in the user namespace"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  userns:  Kill nsown_capable it makes the wrong thing easy
  capabilities: allow nice if we are privileged
  pidns: Don't have unshare(CLONE_NEWPID) imply CLONE_THREAD
  userns: Allow PR_CAPBSET_DROP in a user namespace.
  namespaces: Simplify copy_namespaces so it is clear what is going on.
  pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup
  sysfs: Restrict mounting sysfs
  userns: Better restrictions on when proc and sysfs can be mounted
  vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces
  kernel/nsproxy.c: Improving a snippet of code.
  proc: Restrict mounting the proc filesystem
  vfs: Lock in place mounts from more privileged users
2013-09-07 14:35:32 -07:00
Linus Torvalds
11c7b03d42 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Nothing major for this kernel, just maintenance updates"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
  apparmor: add the ability to report a sha1 hash of loaded policy
  apparmor: export set of capabilities supported by the apparmor module
  apparmor: add the profile introspection file to interface
  apparmor: add an optional profile attachment string for profiles
  apparmor: add interface files for profiles and namespaces
  apparmor: allow setting any profile into the unconfined state
  apparmor: make free_profile available outside of policy.c
  apparmor: rework namespace free path
  apparmor: update how unconfined is handled
  apparmor: change how profile replacement update is done
  apparmor: convert profile lists to RCU based locking
  apparmor: provide base for multiple profiles to be replaced at once
  apparmor: add a features/policy dir to interface
  apparmor: enable users to query whether apparmor is enabled
  apparmor: remove minimum size check for vmalloc()
  Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
  Smack: network label match fix
  security: smack: add a hash table to quicken smk_find_entry()
  security: smack: fix memleak in smk_write_rules_list()
  xattr: Constify ->name member of "struct xattr".
  ...
2013-09-07 14:34:07 -07:00
Trond Myklebust
5e6b19901b NFSv4: Fix security auto-negotiation
NFSv4 security auto-negotiation has been broken since
commit 4580a92d44e2b21c2254fa5fef0f1bfb43c82318 (NFS:
Use server-recommended security flavor by default (NFSv3))
because nfs4_try_mount() will automatically select AUTH_SYS
if it sees no auth flavours.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
2013-09-07 16:18:30 -04:00
Trond Myklebust
19e7b8d240 NFS: Clean up nfs_parse_security_flavors()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 16:12:45 -04:00
Trond Myklebust
74c9881162 NFS: Clean up the auth flavour array mess
What is the point of having a 'auth_flavor_len' field, if it is
always set to 1, and can't be used to determine if the user has
selected an auth flavour?
This cleanup goes back to using auth_flavor_len for its original
intended purpose, and gets rid of the ad-hoc replacements.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-07 16:11:24 -04:00
Richard Weinberger
65984ff9d2 um: hostfs: Fix writeback
We have to implement ->release() and trigger writeback from it.
Otherwise we might lose dirty pages at munmap().

Signed-off-by: Richard Weinberger <richard@nod.at>
2013-09-07 10:38:29 +02:00
Kees Cook
cb69f36ba1 ecryptfs: avoid ctx initialization race
It might be possible for two callers to race the mutex lock after the
NULL ctx check. Instead, move the lock above the check so there isn't
the possibility of leaking a crypto ctx. Additionally, report the full
algo name when failing.

Signed-off-by: Kees Cook <keescook@chromium.org>
[tyhicks: remove out label, which is no longer used]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2013-09-06 16:58:18 -07:00
Dan Carpenter
e6cbd6a44d ecryptfs: remove check for if an array is NULL
It doesn't make sense to check if an array is NULL.  The compiler just
removes the check.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
2013-09-06 16:51:56 -07:00
Yan, Zheng
a8d436f015 ceph: use d_invalidate() to invalidate aliases
d_invalidate() is the standard VFS method to invalidate dentry.
compare to d_delete(), it also try shrinking children dentries.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-09-06 12:55:29 -07:00
Yan, Zheng
ed284c49f6 ceph: remove ceph_lookup_inode()
commit 6f60f889 (ceph: fix freeing inode vs removing session caps race)
introduced ceph_lookup_inode(). But there is already a ceph_find_inode()
which provides similar function. So remove ceph_lookup_inode(), use
ceph_find_inode() instead.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Alex Elder <alex.elder@linary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
2013-09-06 12:55:09 -07:00
Andy Adamson
0e20162ed1 NFSv4.1 Use MDS auth flavor for data server connection
Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all state management operations, and
will use krb5i or auth_sys with no regard to the mount command authflavor
choice.

The MDS, as any NFSv4.1 mount point, uses the nfs_server rpc client for all
non-state management operations with a different nfs_server for each fsid
encountered traversing the mount point, each with a potentially different
auth flavor.

pNFS data servers are not mounted in the normal sense as there is no associated
nfs_server structure. Data servers can also export multiple fsids, each with
a potentially different auth flavor.

Data servers need to use the same authflavor as the MDS server rpc client for
non-state management operations. Populate a list of rpc clients with the MDS
server rpc client auth flavor for the DS to use.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-06 14:49:16 -04:00
Milosz Tanski
971f0bdeaa ceph: trivial buildbot warnings fix
The linux-next build bot found a three of warnings, this addresses all of them.

 * non-ANSI function declaration of function 'ceph_fscache_register' and
   'ceph_fscache_unregister'
 * symbol 'ceph_cache_netfs' was not declared, now it's extern in the header.
 * warning: "pr_fmt" redefined

Signed-off-by: Milosz Tanski <milosz@adfin.com>
2013-09-06 16:50:12 +00:00
Milosz Tanski
e81568eb18 ceph: Do not do invalidate if the filesystem is mounted nofsc
Previously we would always try to enqueue work even if the filesystem is not
mounted with fscache enabled (or the file has no cookie). In the case of the
filesystem mouned nofsc (but with fscache compiled in) this would lead to a
crash.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
2013-09-06 16:50:12 +00:00
Milosz Tanski
d4d3aa38d6 ceph: page still marked private_2
Previous patch that allowed us to cleanup most of the issues with pages marked
as private_2 when calling ceph_readpages. However, there seams to be a case in
the error case clean up in start read that still trigers this from time to
time. I've only seen this one a couple times.

BUG: Bad page state in process petabucket  pfn:335b82
page:ffffea000cd6e080 count:0 mapcount:0 mapping:          (null) index:0x0
page flags: 0x200000000001000(private_2)
Call Trace:
 [<ffffffff81563442>] dump_stack+0x46/0x58
 [<ffffffff8112c7f7>] bad_page+0xc7/0x120
 [<ffffffff8112cd9e>] free_pages_prepare+0x10e/0x120
 [<ffffffff8112e580>] free_hot_cold_page+0x40/0x160
 [<ffffffff81132427>] __put_single_page+0x27/0x30
 [<ffffffff81132d95>] put_page+0x25/0x40
 [<ffffffffa02cb409>] ceph_readpages+0x2e9/0x6f0 [ceph]
 [<ffffffff811313cf>] __do_page_cache_readahead+0x1af/0x260

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-06 16:50:12 +00:00
Milosz Tanski
9b8dd1e8a5 ceph: ceph_readpage_to_fscache didn't check if marked
Previously ceph_readpage_to_fscache did not call if page was marked as cached
before calling fscache_write_page resulting in a BUG inside of fscache.

FS-Cache: Assertion failed
------------[ cut here ]------------
kernel BUG at fs/fscache/page.c:874!
invalid opcode: 0000 [#1] SMP
Call Trace:
 [<ffffffffa02e6566>] __ceph_readpage_to_fscache+0x66/0x80 [ceph]
 [<ffffffffa02caf84>] readpage_nounlock+0x124/0x210 [ceph]
 [<ffffffffa02cb08d>] ceph_readpage+0x1d/0x40 [ceph]
 [<ffffffff81126db6>] generic_file_aio_read+0x1f6/0x700
 [<ffffffffa02c6fcc>] ceph_aio_read+0x5fc/0xab0 [ceph]

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-06 16:50:12 +00:00
Milosz Tanski
76be778b3a ceph: clean PgPrivate2 on returning from readpages
In some cases the ceph readapages code code bails without filling all the pages
already marked by fscache. When we return back to readahead code this causes
a BUG.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
2013-09-06 16:50:11 +00:00
Milosz Tanski
99ccbd229c ceph: use fscache as a local presisent cache
Adding support for fscache to the Ceph filesystem. This would bring it to on
par with some of the other network filesystems in Linux (like NFS, AFS, etc...)

In order to mount the filesystem with fscache the 'fsc' mount option must be
passed.

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: Sage Weil <sage@inktank.com>
2013-09-06 16:50:11 +00:00
Milosz Tanski
cd0a2df681 Patches for Ceph FS-Cache support
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIVAwUAUimQLxOxKuMESys7AQJc+Q/+N3BN+ZWRhfqSKANFyEuXIsUmzueCmmYc
 ZOgdRGTorlYKefqFHTFOHLvPbbxXaT2/HKUhq38yuA8UuIkoPL3rRQpjNcvWR+TX
 s/H17MRqPeZOfaDC/p9y2uL5MbuUvzhlCZ/GTi5w2ZwiNuWBo10gxeyCrXQSOFtH
 YFq0dVuG0AXzWdZuWcM3MtgY0llcMmfnZpIDjF4JDXJidgXY+wtjNUF3ByYZ+33+
 +CmzXrnCaY+3N44Ji2Dn+ci8tym8uht4dnbTZFkQ0I6B+k93V2RkZeHWnDQWqW+c
 THjyG9c+LSf0m8FIh43DNNJSkywbh5dxsBgnqxhQTJMij0dV1ne8wjKptJMgz+0b
 HFUi4rE6oRQtbLdTtJhdjdFFORBGdFj71gW8foBdFAZTP4Amf/fbiAfHeK+33oDt
 s5PMJyfA3BDM90eBFoxDWjCEe+o6YcccHC0SVWM1ZJPQ/U0hXL99O6NSHBrr9iNP
 spBgM+fNTgtUMf6P6MwjJfTQHov5xevBNaLB3boUPAI6/yK8KQt9xoevJ7t902uQ
 /19bXoNgMmwAti1Gd6T5UnWlAHsOWdnIASUu8LVqEoh2PY42T2I4NTWhgs4NFntu
 91MYysF93sTx0sMvGvWdCUSl7zMMeKXCUSUnPvMld+BMqyuK3XzsO3xkMn97t2/U
 p6kQZXZDlwU=
 =s35L
 -----END PGP SIGNATURE-----

Merge tag 'fscache-fixes-for-ceph' into wip-fscache

Patches for Ceph FS-Cache support
2013-09-06 16:41:20 +00:00
Linus Torvalds
2e515bf096 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "The usual trivial updates all over the tree -- mostly typo fixes and
  documentation updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
  doc: Documentation/cputopology.txt fix typo
  treewide: Convert retrun typos to return
  Fix comment typo for init_cma_reserved_pageblock
  Documentation/trace: Correcting and extending tracepoint documentation
  mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
  power: Documentation: Update s2ram link
  doc: fix a typo in Documentation/00-INDEX
  Documentation/printk-formats.txt: No casts needed for u64/s64
  doc: Fix typo "is is" in Documentations
  treewide: Fix printks with 0x%#
  zram: doc fixes
  Documentation/kmemcheck: update kmemcheck documentation
  doc: documentation/hwspinlock.txt fix typo
  PM / Hibernate: add section for resume options
  doc: filesystems : Fix typo in Documentations/filesystems
  scsi/megaraid fixed several typos in comments
  ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
  treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
  page_isolation: Fix a comment typo in test_pages_isolated()
  doc: fix a typo about irq affinity
  ...
2013-09-06 09:36:28 -07:00
Linus Torvalds
ec0ad73080 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext3, reiserfs, udf & isofs fixes from Jan Kara:
 "The contains a bunch of ext3 cleanups and minor improvements, major
  reiserfs locking changes which should hopefully fix deadlocks
  introduced by BKL removal, and udf/isofs changes to refuse mounting fs
  rw instead of mounting it ro automatically which makes eject button
  work as expected for all media (see the changelog for why userspace
  should be ok with this change)"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  jbd: use a single printk for jbd_debug()
  reiserfs: locking, release lock around quota operations
  reiserfs: locking, handle nested locks properly
  reiserfs: locking, push write lock out of xattr code
  jbd: relocate assert after state lock in journal_commit_transaction()
  udf: Refuse RW mount of the filesystem instead of making it RO
  udf: Standardize return values in mount sequence
  isofs: Refuse RW mount of the filesystem instead of making it RO
  ext3: allow specifying external journal by pathname mount option
  jbd: remove unneeded semicolon
2013-09-06 09:06:02 -07:00
Linus Torvalds
eb97a784f0 f2fs updates for v3.12
This patch-set includes the following major enhancement patches.
 o support inline xattrs
 o add sysfs support to control GCs explicitly
 o add proc entry to show the current segment usage information
 o improve the GC/SSR performance
 
 The other bug fixes are as follows.
 o avoid the overflow on status calculation
 o fix some error handling routines
 o fix inconsistent xattr states after power-off-recovery
 o fix incorrect xattr node offset definition
 o fix deadlock condition in fsync
 o fix the fdatasync routine for power-off-recovery
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJSKDoaAAoJEEAUqH6CSFDSovoQAJSWnvRfeu4olkKe7LblVXA5
 NFYsjtdtnWsmSY1kq2j541SLo8Kw2UibozbrN6BaJ9MOKnTz1+x0R9U0vpewmCO4
 FkxlGX/3i3k/4tR0AvD4U56xgqh+IhYi18nBN8kOTwhLqjFtx5JFKAHBnGwjbB4T
 YpEaitNY6dL8l+DUxs11KnPmNazbck6iNGOYXpvfhTS4DNSJTT0L/fLqugDhFJNI
 7e3f6vVORRwC5UdtJk6B6HXxv1pHv4uGeLki0W4jgGp7AdxpawbfeDrDcrECjoc+
 0s/QQTsjoeIKeCfojSEgLGSSl8PZpx2VVCxri+nMPjLzY81QUXbpsAlhB2RW9Uz/
 E9ESAPpzL9ykh35THALic7N0ATXGlepnu0EGU6+fjWGUIyHeV+2yoswz599VliRO
 GunHgwrfNMyXWHw9zw6SPIJvN3caPn3wlDhffei9wOl92YkleBuHA7ojIzfRc2vz
 YQ7jKmZNZ/CM2qiw350XIfSaa+3iszlxwoWK7DLWQZm3um0MpYme9RmadnPvxsRM
 gnUYiovPwR+om3zAnURMvq/LNKi6NjflRgu2OAU/0CpJMEX9vaVe/xOKdjCs19je
 dxinQGuOS5P+J141SkM3jJ1eyZLC4zCyp42xQSZ1+Zg+6BU4PVY3+i4hCQFopHDt
 fyeav8SM/fk9HrKU3npq
 =YMk3
 -----END PGP SIGNATURE-----

Merge tag 'for-f2fs-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This patch-set includes the following major enhancement patches:
   - support inline xattrs
   - add sysfs support to control GCs explicitly
   - add proc entry to show the current segment usage information
   - improve the GC/SSR performance

  The other bug fixes are as follows:
   - avoid the overflow on status calculation
   - fix some error handling routines
   - fix inconsistent xattr states after power-off-recovery
   - fix incorrect xattr node offset definition
   - fix deadlock condition in fsync
   - fix the fdatasync routine for power-off-recovery"

* tag 'for-f2fs-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits)
  f2fs: optimize gc for better performance
  f2fs: merge more bios of node block writes
  f2fs: avoid an overflow during utilization calculation
  f2fs: trigger GC when there are prefree segments
  f2fs: use strncasecmp() simplify the string comparison
  f2fs: fix omitting to update inode page
  f2fs: support the inline xattrs
  f2fs: add the truncate_xattr_node function
  f2fs: introduce __find_xattr for readability
  f2fs: reserve the xattr space dynamically
  f2fs: add flags for inline xattrs
  f2fs: fix error return code in init_f2fs_fs()
  f2fs: fix wrong BUG_ON condition
  f2fs: fix memory leak when init f2fs filesystem fail
  f2fs: fix a compound statement label error
  f2fs: avoid writing inode redundantly when creating a file
  f2fs: alloc_page() doesn't return an ERR_PTR
  f2fs: should cover i_xattr_nid with its xattr node page lock
  f2fs: check the free space first in new_node_page
  f2fs: clean up the needless end 'return' of void function
  ...
2013-09-06 09:04:34 -07:00
Trond Myklebust
4109bb7496 NFS: Don't check lock owner compatability unless file is locked (part 2)
When coalescing requests into a single READ or WRITE RPC call, and there
is no file locking involved, we don't have to refuse coalescing for
requests where the lock owner information doesn't match.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-09-06 11:27:41 -04:00
Milosz Tanski
5a6f282a20 fscache: Netfs function for cleanup post readpages
Currently the fscache code expect the netfs to call fscache_readpages_or_alloc
inside the aops readpages callback.  It marks all the pages in the list
provided by readahead with PG_private_2.  In the cases that the netfs fails to
read all the pages (which is legal) it ends up returning to the readahead and
triggering a BUG.  This happens because the page list still contains marked
pages.

This patch implements a simple fscache_readpages_cancel function that the netfs
should call before returning from readpages.  It will revoke the pages from the
underlying cache backend and unmark them.

The problem was originally worked out in the Ceph devel tree, but it also
occurs in CIFS.  It appears that NFS, AFS and 9P are okay as read_cache_pages()
will clean up the unprocessed pages in the case of an error.

This can be used to address the following oops:

[12410647.597278] BUG: Bad page state in process petabucket  pfn:3d504e
[12410647.597292] page:ffffea000f541380 count:0 mapcount:0 mapping:
	(null) index:0x0
[12410647.597298] page flags: 0x200000000001000(private_2)

...

[12410647.597334] Call Trace:
[12410647.597345]  [<ffffffff815523f2>] dump_stack+0x19/0x1b
[12410647.597356]  [<ffffffff8111def7>] bad_page+0xc7/0x120
[12410647.597359]  [<ffffffff8111e49e>] free_pages_prepare+0x10e/0x120
[12410647.597361]  [<ffffffff8111fc80>] free_hot_cold_page+0x40/0x170
[12410647.597363]  [<ffffffff81123507>] __put_single_page+0x27/0x30
[12410647.597365]  [<ffffffff81123df5>] put_page+0x25/0x40
[12410647.597376]  [<ffffffffa02bdcf9>] ceph_readpages+0x2e9/0x6e0 [ceph]
[12410647.597379]  [<ffffffff81122a8f>] __do_page_cache_readahead+0x1af/0x260
[12410647.597382]  [<ffffffff81122ea1>] ra_submit+0x21/0x30
[12410647.597384]  [<ffffffff81118f64>] filemap_fault+0x254/0x490
[12410647.597387]  [<ffffffff8113a74f>] __do_fault+0x6f/0x4e0
[12410647.597391]  [<ffffffff810125bd>] ? __switch_to+0x16d/0x4a0
[12410647.597395]  [<ffffffff810865ba>] ? finish_task_switch+0x5a/0xc0
[12410647.597398]  [<ffffffff8113d856>] handle_pte_fault+0xf6/0x930
[12410647.597401]  [<ffffffff81008c33>] ? pte_mfn_to_pfn+0x93/0x110
[12410647.597403]  [<ffffffff81008cce>] ? xen_pmd_val+0xe/0x10
[12410647.597405]  [<ffffffff81005469>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[12410647.597407]  [<ffffffff8113f361>] handle_mm_fault+0x251/0x370
[12410647.597411]  [<ffffffff812b0ac4>] ? call_rwsem_down_read_failed+0x14/0x30
[12410647.597414]  [<ffffffff8155bffa>] __do_page_fault+0x1aa/0x550
[12410647.597418]  [<ffffffff8108011d>] ? up_write+0x1d/0x20
[12410647.597422]  [<ffffffff8113141c>] ? vm_mmap_pgoff+0xbc/0xe0
[12410647.597425]  [<ffffffff81143bb8>] ? SyS_mmap_pgoff+0xd8/0x240
[12410647.597427]  [<ffffffff8155c3ae>] do_page_fault+0xe/0x10
[12410647.597431]  [<ffffffff81558818>] page_fault+0x28/0x30

Signed-off-by: Milosz Tanski <milosz@adfin.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-06 09:17:30 +01:00
David Howells
5002d7bef8 CacheFiles: Implement interface to check cache consistency
Implement the FS-Cache interface to check the consistency of a cache object in
CacheFiles.

Original-author: Hongyi Jia <jiayisuse@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hongyi Jia <jiayisuse@gmail.com>
cc: Milosz Tanski <milosz@adfin.com>
2013-09-06 09:17:30 +01:00
David Howells
da9803bc88 FS-Cache: Add interface to check consistency of a cached object
Extend the fscache netfs API so that the netfs can ask as to whether a cache
object is up to date with respect to its corresponding netfs object:

	int fscache_check_consistency(struct fscache_cookie *cookie)

This will call back to the netfs to check whether the auxiliary data associated
with a cookie is correct.  It returns 0 if it is and -ESTALE if it isn't; it
may also return -ENOMEM and -ERESTARTSYS.

The backends now have to implement a mandatory operation pointer:

	int (*check_consistency)(struct fscache_object *object)

that corresponds to the above API call.  FS-Cache takes care of pinning the
object and the cookie in memory and managing this call with respect to the
object state.

Original-author: Hongyi Jia <jiayisuse@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hongyi Jia <jiayisuse@gmail.com>
cc: Milosz Tanski <milosz@adfin.com>
2013-09-06 09:17:30 +01:00
Phillip Lougher
9e01242386 Squashfs: add corruption check for type in squashfs_readdir()
We read the type field from disk.  This value should be sanity
checked for correctness to avoid an out of bounds access when
reading the squashfs_filetype_table array.

Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
2013-09-06 04:57:54 +01:00