linux-stable/fs/nfs
Yanjun Zhang 632344b9ef NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies()
[ Upstream commit a848c29e34 ]

On the node of an NFS client, some files saved in the mountpoint of the
NFS server were copied to another location of the same NFS server.
Accidentally, the nfs42_complete_copies() got a NULL-pointer dereference
crash with the following syslog:

[232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
[232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
[232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
[232066.588586] Mem abort info:
[232066.588701]   ESR = 0x0000000096000007
[232066.588862]   EC = 0x25: DABT (current EL), IL = 32 bits
[232066.589084]   SET = 0, FnV = 0
[232066.589216]   EA = 0, S1PTW = 0
[232066.589340]   FSC = 0x07: level 3 translation fault
[232066.589559] Data abort info:
[232066.589683]   ISV = 0, ISS = 0x00000007
[232066.589842]   CM = 0, WnR = 0
[232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400
[232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000
[232066.590757] Internal error: Oops: 96000007 [#1] SMP
[232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2
[232066.591052]  vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs
[232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1
[232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06
[232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
[232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4]
[232066.598595] sp : ffff8000f568fc70
[232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000
[232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001
[232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050
[232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000
[232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000
[232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6
[232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828
[232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a
[232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058
[232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000
[232066.601636] Call trace:
[232066.601749]  nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
[232066.601998]  nfs4_do_reclaim+0x1b8/0x28c [nfsv4]
[232066.602218]  nfs4_state_manager+0x928/0x10f0 [nfsv4]
[232066.602455]  nfs4_run_state_manager+0x78/0x1b0 [nfsv4]
[232066.602690]  kthread+0x110/0x114
[232066.602830]  ret_from_fork+0x10/0x20
[232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00)
[232066.603284] SMP: stopping secondary CPUs
[232066.606936] Starting crashdump kernel...
[232066.607146] Bye!

Analysing the vmcore, we know that nfs4_copy_state listed by destination
nfs_server->ss_copies was added by the field copies in handle_async_copy(),
and we found a waiting copy process with the stack as:
PID: 3511963  TASK: ffff710028b47e00  CPU: 0   COMMAND: "cp"
 #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4
 #1 [ffff8001116ef760] __schedule at ffff800008dd0650
 #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00
 #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0
 #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c
 #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898
 #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4]
 #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4]
 #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4]
 #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4]

The NULL-pointer dereference was due to nfs42_complete_copies() listed
the nfs_server->ss_copies by the field ss_copies of nfs4_copy_state.
So the nfs4_copy_state address ffff0100f98fa3f0 was offset by 0x10 and
the data accessed through this pointer was also incorrect. Generally,
the ordered list nfs4_state_owner->so_states indicate open(O_RDWR) or
open(O_WRITE) states are reclaimed firstly by nfs4_reclaim_open_state().
When destination state reclaim is failed with NFS_STATE_RECOVERY_FAILED
and copies are not deleted in nfs_server->ss_copies, the source state
may be passed to the nfs42_complete_copies() process earlier, resulting
in this crash scene finally. To solve this issue, we add a list_head
nfs_server->ss_src_copies for a server-to-server copy specially.

Fixes: 0e65a32c8a ("NFS: handle source server reboot")
Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn>
Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17 15:22:19 +02:00
..
blocklayout pNFS: Fix the pnfs block driver's calculation of layoutget size 2024-01-25 15:27:23 -08:00
filelayout pNFS/filelayout: fixup pNfs allocation modes 2024-06-12 11:03:51 +02:00
flexfilelayout nfs: fix panic when nfs4_ff_layout_prepare_ds() fails 2024-03-26 18:20:57 -04:00
cache_lib.c NFS client updates for Linux 4.15 2017-11-17 14:18:00 -08:00
cache_lib.h NFS client updates for Linux 4.15 2017-11-17 14:18:00 -08:00
callback_proc.c pNFS: Avoid a live lock condition in pnfs_update_layout() 2022-06-06 11:53:55 -04:00
callback_xdr.c SUNRPC: Fix integer overflow in decode_rc_list() 2024-10-17 15:22:18 +02:00
callback.c nfsd: stop setting ->pg_stats for unused stats 2024-08-19 06:00:04 +02:00
callback.h NFSv4.1: Fix uninitialised variable in devicenotify 2022-01-06 14:00:20 -05:00
client.c NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() 2024-10-17 15:22:19 +02:00
delegation.c NFS: Avoid unnecessary rescanning of the per-server delegation list 2024-09-18 19:23:03 +02:00
delegation.h NFSv4: Fix delegation return in cases where we have to retry 2021-06-13 19:36:27 -04:00
dir.c nfs: don't invalidate dentries on transient errors 2024-07-25 09:49:12 +02:00
direct.c nfs: drop the incorrect assertion in nfs_swap_rw() 2024-07-05 09:31:52 +02:00
dns_resolve.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
dns_resolve.h NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
export.c nfsd: allow reaping files still under writeback 2024-03-26 18:20:23 -04:00
file.c NFS Client Updates for Linux 6.1 2022-10-13 09:58:42 -07:00
fs_context.c nfs: keep server info for remounts 2024-06-12 11:03:50 +02:00
fscache.c mm, netfs, fscache: stop read optimisation when folio removed from pagecache 2024-01-10 17:10:31 +01:00
fscache.h nfs: Convert to release_folio 2022-05-09 23:12:33 -04:00
getroot.c NFS: Remove the nfs4_label argument from nfs_setsecurity 2021-11-05 14:54:40 -04:00
inode.c nfs: Handle error of rpc_proc_register() in nfs_net_init(). 2024-05-17 11:55:54 +02:00
internal.h nfs: fix undefined behavior in nfs_block_bits() 2024-06-16 13:41:41 +02:00
io.c NFS: Fix up incorrect documentation 2021-04-05 09:04:20 -04:00
iostat.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig NFS: Replace readdir's use of xxhash() with hash_64() 2022-04-07 16:19:47 -04:00
Makefile nfs: Convert to new fscache volume/cookie API 2022-01-10 11:53:25 +00:00
mount_clnt.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
namespace.c NFS: Fix an Oops in nfs_d_automount() 2022-12-31 13:32:18 +01:00
netns.h nfs: make the rpc_stat per net namespace 2024-05-17 11:55:54 +02:00
nfs2super.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nfs2xdr.c NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN 2023-09-13 09:42:49 +02:00
nfs3_fs.h vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
nfs3acl.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
nfs3client.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs3proc.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
nfs3super.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
nfs3xdr.c NFS: Guard against READDIR loop when entry names exceed MAXNAMELEN 2023-09-13 09:42:49 +02:00
nfs4_fs.h NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs4client.c NFSv4.1 another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server 2024-08-03 08:49:16 +02:00
nfs4file.c NFSv4.2 fix problems with __nfs42_ssc_open 2022-08-19 20:31:57 -04:00
nfs4getroot.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nfs4idmap.c nfs: remove unnecessary (void*) conversions. 2022-10-03 11:26:36 -04:00
nfs4idmap.h
nfs4namespace.c NFS: Avoid memcpy() run-time warning for struct sockaddr overflows 2022-10-27 15:52:10 -04:00
nfs4proc.c NFSv4: Fix clearing of layout segments in layoutreturn 2024-09-18 19:23:03 +02:00
nfs4renewd.c treewide: remove editor modelines and cruft 2021-05-07 00:26:34 -07:00
nfs4session.c NFSv4: Sanity check the parameters in nfs41_update_target_slotid() 2021-11-07 09:23:14 -05:00
nfs4session.h NFSv4: Sanity check the parameters in nfs41_update_target_slotid() 2021-11-07 09:23:14 -05:00
nfs4state.c NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() 2024-10-17 15:22:19 +02:00
nfs4super.c NFS: Adjust fs_context error logging 2021-01-10 13:32:39 -05:00
nfs4sysctl.c nfs: Do not convert nfs_idmap_cache_timeout to jiffies 2018-01-18 15:10:47 -05:00
nfs4trace.c pNFS/flexfiles: Add tracing for layout errors 2020-01-15 10:54:33 -05:00
nfs4trace.h trace: Relocate event helper files 2024-03-06 14:45:17 +00:00
nfs4xdr.c NFSv4.2: Fix a memory stomp in decode_attr_security_label 2022-12-31 13:32:18 +01:00
nfs42.h NFSv4.2: fix listxattr maximum XDR buffer size 2024-03-26 18:20:56 -04:00
nfs42proc.c NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() 2024-10-17 15:22:19 +02:00
nfs42xattr.c NFSv4.2: fix wrong shrinker_id 2023-07-19 16:21:43 +02:00
nfs42xdr.c NFSv4.2: Rework scratch handling for READ_PLUS (again) 2023-09-13 09:43:05 +02:00
nfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nfsroot.c NFS: Fix an off by one in root_nfs_cat() 2024-03-26 18:20:56 -04:00
nfstrace.c NFSv4: Catch and trace server filehandle encoding errors 2021-04-14 09:36:29 -04:00
nfstrace.h trace: Relocate event helper files 2024-03-06 14:45:17 +00:00
pagelist.c NFSv4.1 mark qualified async operations as MOVEABLE tasks 2022-05-31 17:09:30 -04:00
pnfs_dev.c NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info 2023-09-19 12:27:58 +02:00
pnfs_nfs.c pNFS: Fix assignment of xprtdata.cred 2023-09-13 09:42:49 +02:00
pnfs.c NFSv4: Fix clearing of layout segments in layoutreturn 2024-09-18 19:23:03 +02:00
pnfs.h NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked 2022-10-06 09:52:09 -04:00
proc.c NFS: NFSv2/v3 clients should never be setting NFS_CAP_XATTR 2022-02-25 18:50:13 -05:00
read.c NFSv4.2: Rework scratch handling for READ_PLUS (again) 2023-09-13 09:43:05 +02:00
super.c NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations 2024-09-12 11:10:25 +02:00
symlink.c nfs: propagate readlink errors in nfs_symlink_filler 2024-07-25 09:49:11 +02:00
sysctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sysfs.c NFS: rename nfs_client_kset to nfs_kset 2023-10-10 22:00:35 +02:00
sysfs.h NFSv4: Fix up RCU annotations for struct nfs_netns_client 2020-10-15 13:31:08 -04:00
unlink.c NFSv4.1 mark qualified async operations as MOVEABLE tasks 2022-05-31 17:09:30 -04:00
write.c nfs: fix UAF in direct writes 2024-04-03 15:19:34 +02:00