linux-stable/fs/nfs
Josef Bacik 31db25e314 nfs: fix panic when nfs4_ff_layout_prepare_ds() fails
[ Upstream commit 719fcafe07 ]

We've been seeing the following panic in production

BUG: kernel NULL pointer dereference, address: 0000000000000065
PGD 2f485f067 P4D 2f485f067 PUD 2cc5d8067 PMD 0
RIP: 0010:ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
Call Trace:
 <TASK>
 ? __die+0x78/0xc0
 ? page_fault_oops+0x286/0x380
 ? __rpc_execute+0x2c3/0x470 [sunrpc]
 ? rpc_new_task+0x42/0x1c0 [sunrpc]
 ? exc_page_fault+0x5d/0x110
 ? asm_exc_page_fault+0x22/0x30
 ? ff_layout_free_layoutreturn+0x110/0x110 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
 ? ff_layout_cancel_io+0x6f/0x90 [nfs_layout_flexfiles]
 pnfs_mark_matching_lsegs_return+0x1b0/0x360 [nfsv4]
 pnfs_error_mark_layout_for_return+0x9e/0x110 [nfsv4]
 ? ff_layout_send_layouterror+0x50/0x160 [nfs_layout_flexfiles]
 nfs4_ff_layout_prepare_ds+0x11f/0x290 [nfs_layout_flexfiles]
 ff_layout_pg_init_write+0xf0/0x1f0 [nfs_layout_flexfiles]
 __nfs_pageio_add_request+0x154/0x6c0 [nfs]
 nfs_pageio_add_request+0x26b/0x380 [nfs]
 nfs_do_writepage+0x111/0x1e0 [nfs]
 nfs_writepages_callback+0xf/0x30 [nfs]
 write_cache_pages+0x17f/0x380
 ? nfs_pageio_init_write+0x50/0x50 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 ? nfs_writepages+0x6d/0x210 [nfs]
 nfs_writepages+0x125/0x210 [nfs]
 do_writepages+0x67/0x220
 ? generic_perform_write+0x14b/0x210
 filemap_fdatawrite_wbc+0x5b/0x80
 file_write_and_wait_range+0x6d/0xc0
 nfs_file_fsync+0x81/0x170 [nfs]
 ? nfs_file_mmap+0x60/0x60 [nfs]
 __x64_sys_fsync+0x53/0x90
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Inspecting the core with drgn I was able to pull this

  >>> prog.crashed_thread().stack_trace()[0]
  #0 at 0xffffffffa079657a (ff_layout_cancel_io+0x3a/0x84) in ff_layout_cancel_io at fs/nfs/flexfilelayout/flexfilelayout.c:2021:27
  >>> prog.crashed_thread().stack_trace()[0]['idx']
  (u32)1
  >>> prog.crashed_thread().stack_trace()[0]['flseg'].mirror_array[1].mirror_ds
  (struct nfs4_ff_layout_ds *)0xffffffffffffffed

This is clear from the stack trace, we call nfs4_ff_layout_prepare_ds()
which could error out initializing the mirror_ds, and then we go to
clean it all up and our check is only for if (!mirror->mirror_ds).  This
is inconsistent with the rest of the users of mirror_ds, which have

  if (IS_ERR_OR_NULL(mirror_ds))

to keep from tripping over this exact scenario.  Fix this up in
ff_layout_cancel_io() to make sure we don't panic when we get an error.
I also spot checked all the other instances of checking mirror_ds and we
appear to be doing the correct checks everywhere, only unconditionally
dereferencing mirror_ds when we know it would be valid.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Fixes: b739a5bd9d ("NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26 18:20:57 -04:00
..
blocklayout pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
filelayout pNFS/filelayout: Fix coalescing test for single DS 2023-01-24 07:24:30 +01:00
flexfilelayout nfs: fix panic when nfs4_ff_layout_prepare_ds() fails 2024-03-26 18:20:57 -04:00
cache_lib.c NFS client updates for Linux 4.15 2017-11-17 14:18:00 -08:00
cache_lib.h NFS client updates for Linux 4.15 2017-11-17 14:18:00 -08:00
callback_proc.c pNFS: Avoid a live lock condition in pnfs_update_layout() 2022-06-06 11:53:55 -04:00
callback_xdr.c SUNRPC: Parametrize how much of argsize should be zeroed 2022-09-26 14:02:42 -04:00
callback.c NFSD: Move svc_serv_ops::svo_function into struct svc_serv 2022-02-28 10:26:40 -05:00
callback.h NFSv4.1: Fix uninitialised variable in devicenotify 2022-01-06 14:00:20 -05:00
client.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
delegation.c NFSv4: Fix a potential state reclaim deadlock 2022-10-27 15:52:10 -04:00
delegation.h NFSv4: Fix delegation return in cases where we have to retry 2021-06-13 19:36:27 -04:00
dir.c nfs: Remove redundant null checks before kfree 2022-10-27 15:52:10 -04:00
direct.c pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
dns_resolve.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
dns_resolve.h NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
export.c nfsd: allow reaping files still under writeback 2024-03-26 18:20:23 -04:00
file.c NFS Client Updates for Linux 6.1 2022-10-13 09:58:42 -07:00
fs_context.c nfs: fix possible null-ptr-deref when parsing param 2022-12-31 13:33:04 +01:00
fscache.c mm, netfs, fscache: stop read optimisation when folio removed from pagecache 2024-01-10 17:10:31 +01:00
fscache.h nfs: Convert to release_folio 2022-05-09 23:12:33 -04:00
getroot.c NFS: Remove the nfs4_label argument from nfs_setsecurity 2021-11-05 14:54:40 -04:00
inode.c nfs: use vfs setgid helper 2023-08-30 16:11:10 +02:00
internal.h pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
io.c NFS: Fix up incorrect documentation 2021-04-05 09:04:20 -04:00
iostat.h
Kconfig NFS: Replace readdir's use of xxhash() with hash_64() 2022-04-07 16:19:47 -04:00
Makefile nfs: Convert to new fscache volume/cookie API 2022-01-10 11:53:25 +00:00
mount_clnt.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
namespace.c NFS: Fix an Oops in nfs_d_automount() 2022-12-31 13:32:18 +01:00
netns.h NFS: Add sysfs support for per-container identifier 2019-07-06 14:54:49 -04:00
nfs2super.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nfs2xdr.c NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN 2023-09-13 09:42:49 +02:00
nfs3_fs.h vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
nfs3acl.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
nfs3client.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs3proc.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
nfs3super.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nfs3xdr.c NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN 2023-09-13 09:42:49 +02:00
nfs4_fs.h NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs4client.c NFSv4.1: fix pnfs MDS=DS session trunking 2023-10-06 14:56:31 +02:00
nfs4file.c NFSv4.2 fix problems with __nfs42_ssc_open 2022-08-19 20:31:57 -04:00
nfs4getroot.c
nfs4idmap.c nfs: remove unnecessary (void*) conversions. 2022-10-03 11:26:36 -04:00
nfs4idmap.h
nfs4namespace.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs4proc.c NFSv4.2: fix nfs4_listxattr kernel BUG at mm/usercopy.c:102 2024-03-26 18:20:56 -04:00
nfs4renewd.c treewide: remove editor modelines and cruft 2021-05-07 00:26:34 -07:00
nfs4session.c NFSv4: Sanity check the parameters in nfs41_update_target_slotid() 2021-11-07 09:23:14 -05:00
nfs4session.h NFSv4: Sanity check the parameters in nfs41_update_target_slotid() 2021-11-07 09:23:14 -05:00
nfs4state.c NFSv4: Fix a nfs4_state_manager() race 2023-10-10 22:00:41 +02:00
nfs4super.c NFS: Adjust fs_context error logging 2021-01-10 13:32:39 -05:00
nfs4sysctl.c nfs: Do not convert nfs_idmap_cache_timeout to jiffies 2018-01-18 15:10:47 -05:00
nfs4trace.c pNFS/flexfiles: Add tracing for layout errors 2020-01-15 10:54:33 -05:00
nfs4trace.h trace: Relocate event helper files 2024-03-06 14:45:17 +00:00
nfs4xdr.c NFSv4.2: Fix a memory stomp in decode_attr_security_label 2022-12-31 13:32:18 +01:00
nfs42.h NFSv4.2: fix listxattr maximum XDR buffer size 2024-03-26 18:20:56 -04:00
nfs42proc.c nfs42: client needs to strip file mode's suid/sgid bit after ALLOCATE op 2023-10-25 12:03:14 +02:00
nfs42xattr.c NFSv4.2: fix wrong shrinker_id 2023-07-19 16:21:43 +02:00
nfs42xdr.c NFSv4.2: Rework scratch handling for READ_PLUS (again) 2023-09-13 09:43:05 +02:00
nfs.h
nfsroot.c NFS: Fix an off by one in root_nfs_cat() 2024-03-26 18:20:56 -04:00
nfstrace.c NFSv4: Catch and trace server filehandle encoding errors 2021-04-14 09:36:29 -04:00
nfstrace.h trace: Relocate event helper files 2024-03-06 14:45:17 +00:00
pagelist.c NFSv4.1 mark qualified async operations as MOVEABLE tasks 2022-05-31 17:09:30 -04:00
pnfs_dev.c NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info 2023-09-19 12:27:58 +02:00
pnfs_nfs.c pNFS: Fix assignment of xprtdata.cred 2023-09-13 09:42:49 +02:00
pnfs.c pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
pnfs.h NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked 2022-10-06 09:52:09 -04:00
proc.c NFS: NFSv2/v3 clients should never be setting NFS_CAP_XATTR 2022-02-25 18:50:13 -05:00
read.c NFSv4.2: Rework scratch handling for READ_PLUS (again) 2023-09-13 09:43:05 +02:00
super.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
symlink.c fs: Change the type of filler_t 2022-05-09 16:36:48 -04:00
sysctl.c
sysfs.c NFS: rename nfs_client_kset to nfs_kset 2023-10-10 22:00:35 +02:00
sysfs.h NFSv4: Fix up RCU annotations for struct nfs_netns_client 2020-10-15 13:31:08 -04:00
unlink.c NFSv4.1 mark qualified async operations as MOVEABLE tasks 2022-05-31 17:09:30 -04:00
write.c NFS: Fix data corruption caused by congestion. 2024-03-06 14:45:14 +00:00